DeployedModel¶
- class verta.deployment.DeployedModel(prediction_url, token=None, creds=None)¶
Object for interacting with deployed models.
This class provides functionality for sending predictions to a deployed model on the Verta backend.
Authentication credentials will be picked up from environment variables if they are not supplied explicitly in the creds parameter.
Changed in version 0.23.0: The from_url method has been removed in favor of directly instantiating DeployedModel.
- Parameters:
prediction_url (str) – URL of the prediction endpoint
token (str, optional) – Prediction token. Can be copied and pasted directly from the Verta Web App.
creds (Credentials, optional) – Authentication credentials to attach to each prediction request.
- Variables:
prediction_url (str) – Full prediction endpoint URL. Can be copied and pasted directly from the Verta Web App.
Examples
# Preferred method for instantiating an object.
endpoint = client.get_or_create_endpoint('endpoint_name')
deployed_model = endpoint.get_deployed_model()
deployed_model.predict(['here is a prediction'])

# Instantiating directly is also possible.
DeployedModel(
    "https://app.verta.ai/api/v1/predict/01234567-0123-0123-0123-012345678901",
    token="abcdefgh-abcd-abcd-abcd-abcdefghijkl",
)
# <DeployedModel at https://app.verta.ai/api/v1/predict/01234567-0123-0123-0123-012345678901>
- batch_predict(df, batch_size: int = 100, compress: bool = False, max_retries: int = 13, retry_status: Set[int] = {404, 429, 500, 503, 504}, backoff_factor: float = 0.3, prediction_id: str = None)¶
Makes a prediction using input df of type pandas.DataFrame.
New in version 0.22.2.
- Parameters:
df (pd.DataFrame) – A batch of inputs for the model. The dataframe must have an index (note that most pandas dataframes are created with an automatically-generated index).
compress (bool, default False) – Whether to compress the request body.
batch_size (int, default 100) – The number of rows to send in each request.
max_retries (int, default 13) – Maximum number of retries on status codes listed in retry_status.
retry_status (set, default {404, 429, 500, 503, 504}) – Set of status codes, as integers, for which retry attempts should be made. Overwrites the default value; expand the set to include more. For example, to add status code 409 to the existing set, use: retry_status={404, 429, 500, 503, 504, 409}
backoff_factor (float, default 0.3) – A backoff factor to apply between retry attempts. Uses the standard urllib3 sleep pattern: {backoff factor} * (2 ** ({number of total retries} - 1)), with a maximum sleep time between requests of 120 seconds.
prediction_id (str, default None) – A custom string to use as the ID for the prediction request. Defaults to a randomly generated UUID.
- Returns:
prediction (pd.DataFrame) – Output returned by the deployed model for input df.
- Raises:
RuntimeError – If the deployed model encounters an error while running the prediction.
requests.HTTPError – If the server encounters an error while handling the HTTP request.
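To illustrate how batch_size chunks the input, here is a minimal sketch (the split_batches helper is hypothetical, not part of the verta API): each slice of batch_size rows would be sent as one request, and the per-batch outputs reassembled in order.

```python
def split_batches(rows, batch_size=100):
    # Hypothetical helper: yield successive batch_size-sized slices of rows,
    # mirroring how batch_predict sends the dataframe batch_size rows per request.
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]

# 250 rows with the default batch_size=100 -> three requests of 100, 100, and 50 rows.
batches = list(split_batches(list(range(250)), batch_size=100))
```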
- get_curl()¶
Gets a valid cURL command.
- Returns:
str
- headers()¶
Returns a copy of the headers attached to prediction requests.
- predict(x: List[Any], compress=False, max_retries: int = 13, retry_status: Set[int] = {404, 429, 500, 503, 504}, backoff_factor: float = 0.3, prediction_id: str = None) → Dict[str, Any]¶
Makes a prediction using input x.
Changed in version 0.23.0: The always_retry_404 and always_retry_429 parameters have been removed. Status codes 404 and 429, among others, are included by default in the retry_status parameter. Default is 13 retries over 10 minutes. This behavior can be changed by adjusting max_retries and backoff_factor.
New in version 0.22.0: The retry_status parameter.
New in version 0.22.0: The backoff_factor parameter.
New in version 0.22.0: The prediction_id parameter.
- Parameters:
x (list) – A batch of inputs for the model.
compress (bool, default False) – Whether to compress the request body.
max_retries (int, default 13) – Maximum number of retries on status codes listed in retry_status.
retry_status (set, default {404, 429, 500, 503, 504}) – Set of status codes, as integers, for which retry attempts should be made. Overwrites the default value; expand the set to include more. For example, to add status code 409 to the existing set, use: retry_status={404, 429, 500, 503, 504, 409}
backoff_factor (float, default 0.3) – A backoff factor to apply between retry attempts. Uses the standard urllib3 sleep pattern: {backoff factor} * (2 ** ({number of total retries} - 1)), with a maximum sleep time between requests of 120 seconds.
prediction_id (str, default None) – A custom string to use as the ID for the prediction request. Defaults to a randomly generated UUID.
- Returns:
prediction (list) – Output returned by the deployed model for x.
- Raises:
RuntimeError – If the deployed model encounters an error while running the prediction.
requests.HTTPError – If the server encounters an error while handling the HTTP request.
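The backoff formula above is easy to check numerically. A small sketch, assuming the quoted urllib3 pattern with the 120-second cap, shows that the defaults backoff_factor=0.3 and max_retries=13 give a worst-case total sleep of about ten minutes, consistent with the version note above:

```python
def backoff_schedule(backoff_factor=0.3, max_retries=13, cap=120.0):
    # Worst-case sleep before each retry, per the quoted pattern:
    # {backoff factor} * (2 ** ({number of total retries} - 1)), capped at 120 s.
    return [min(backoff_factor * 2 ** (n - 1), cap) for n in range(1, max_retries + 1)]

sleeps = backoff_schedule()  # [0.3, 0.6, 1.2, ..., 120.0, 120.0]
total = sum(sleeps)          # about 633 seconds, i.e. roughly ten minutes
```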
- predict_with_id(x: List[Any], compress=False, max_retries: int = 13, retry_status: Set[int] = {404, 429, 500, 503, 504}, backoff_factor: float = 0.3, prediction_id: str = None) → Tuple[str, List[Any]]¶
Makes a prediction using input x, just as predict does, but returns a tuple containing the ID of the prediction request along with the prediction results.
New in version 0.22.0: The prediction_id parameter.
New in version 0.22.0: The retry_status parameter.
New in version 0.22.0: The backoff_factor parameter.
- Parameters:
x (list) – A batch of inputs for the model.
compress (bool, default False) – Whether to compress the request body.
max_retries (int, default 13) – Maximum number of retries on status codes listed in retry_status.
retry_status (set, default {404, 429, 500, 503, 504}) – Set of status codes, as integers, for which retry attempts should be made. Overwrites the default value; expand the set to include more. For example, to add status code 409 to the existing set, use: retry_status={404, 429, 500, 503, 504, 409}
backoff_factor (float, default 0.3) – A backoff factor to apply between retry attempts. Uses the standard urllib3 sleep pattern: {backoff factor} * (2 ** ({number of total retries} - 1)), with a maximum sleep time between requests of 120 seconds.
prediction_id (str, optional) – A custom string to use as the ID for the prediction request. Defaults to a randomly generated UUID.
- Returns:
id (str) – The prediction ID.
prediction (List[Any]) – The output returned by the deployed model for x.
- Raises:
RuntimeError – If the deployed model encounters an error while running the prediction.
requests.HTTPError – If the server encounters an error while handling the HTTP request.
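Because retry_status overwrites the default set rather than extending it, a set union is a convenient way to add a status code without retyping the defaults. A minimal sketch (the DEFAULT_RETRY_STATUS name is made up here; its value is copied from the signatures above):

```python
DEFAULT_RETRY_STATUS = {404, 429, 500, 503, 504}  # defaults from the signatures above

# Add 409 (Conflict) to the codes that trigger a retry.
retry_status = DEFAULT_RETRY_STATUS | {409}
# This set would then be passed along, e.g. predict(x, retry_status=retry_status).
```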