DeployedModel

class verta.deployment.DeployedModel(prediction_url, token=None, creds=None)

Object for interacting with deployed models.

This class provides functionality for sending predictions to a deployed model on the Verta backend.

Authentication credentials will be picked up from environment variables if they are not supplied explicitly in the creds parameter.

Changed in version 0.23.0: The from_url method has been removed in favor of directly instantiating DeployedModel.

Parameters:
  • prediction_url (str) – URL of the prediction endpoint

  • token (str, optional) – Prediction token. Can be copied and pasted directly from the Verta Web App.

  • creds (Credentials, optional) – Authentication credentials to attach to each prediction request.

Variables:

prediction_url (str) – Full prediction endpoint URL. Can be copied and pasted directly from the Verta Web App.

Examples

 # Preferred method for instantiating an object.
 endpoint = client.get_or_create_endpoint('endpoint_name')
 deployed_model = endpoint.get_deployed_model()
 deployed_model.predict(['here is a prediction'])

 # Instantiating directly is also possible.
 from verta.deployment import DeployedModel

 DeployedModel(
     "https://app.verta.ai/api/v1/predict/01234567-0123-0123-0123-012345678901",
     token="abcdefgh-abcd-abcd-abcd-abcdefghijkl",
 )
 # <DeployedModel at https://app.verta.ai/api/v1/predict/01234567-0123-0123-0123-012345678901>

batch_predict(df, batch_size: int = 100, compress: bool = False, max_retries: int = 13, retry_status: Set[int] = {404, 429, 500, 503, 504}, backoff_factor: float = 0.3, prediction_id: str = None)

Makes a prediction using input df of type pandas.DataFrame.

New in version 0.22.2.

Parameters:
  • df (pd.DataFrame) – A batch of inputs for the model. The dataframe must have an index (note that most pandas dataframes are created with an automatically-generated index).

  • batch_size (int, default 100) – The number of rows to send in each request.

  • compress (bool, default False) – Whether to compress the request body.

  • max_retries (int, default 13) – Maximum number of retries on status codes listed in retry_status.

  • retry_status (set, default {404, 429, 500, 503, 504}) – Set of status codes, as integers, for which retry attempts should be made. Passing a value overrides the default set, so include the defaults explicitly if you want to keep them. For example, to add status code 409 to the existing set, use: retry_status={404, 429, 500, 503, 504, 409}

  • backoff_factor (float, default 0.3) – A backoff factor to apply between retry attempts. Uses standard urllib3 sleep pattern: {backoff factor} * (2 ** ({number of total retries} - 1)) with a maximum sleep time between requests of 120 seconds.

  • prediction_id (str, default None) – A custom string to use as the ID for the prediction request. Defaults to a randomly generated UUID.

Returns:

prediction (pd.DataFrame) – Output returned by the deployed model for input df.

Raises:
  • RuntimeError – If the deployed model encounters an error while running the prediction.

  • requests.HTTPError – If the server encounters an error while handling the HTTP request.
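As an illustrative sketch only (not the library's internal implementation), the batch_size parameter can be thought of as slicing the DataFrame into row chunks, each of which is sent as its own prediction request:

```python
import pandas as pd

def iter_batches(df: pd.DataFrame, batch_size: int = 100):
    """Yield successive row chunks of df, preserving the original index."""
    for start in range(0, len(df), batch_size):
        yield df.iloc[start:start + batch_size]

df = pd.DataFrame({"feature": range(250)})
chunks = list(iter_batches(df, batch_size=100))
# 250 rows with batch_size=100 -> 3 requests of 100, 100, and 50 rows
print([len(c) for c in chunks])  # [100, 100, 50]
```

Because each chunk keeps its slice of the index, the per-batch results can be reassembled into a single output DataFrame aligned with the input, which is why batch_predict requires the input DataFrame to have an index.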

get_curl()

Gets a valid cURL command.

Returns:

str

headers()

Returns a copy of the headers attached to prediction requests.

predict(x: List[Any], compress=False, max_retries: int = 13, retry_status: Set[int] = {404, 429, 500, 503, 504}, backoff_factor: float = 0.3, prediction_id: str = None) → Dict[str, Any]

Makes a prediction using input x.

Changed in version 0.23.0: The always_retry_404 and always_retry_429 parameters have been removed. Status codes 404 and 429, among others, are included by default in the retry_status parameter. Default is 13 retries over 10 minutes. This behavior can be changed by adjusting max_retries and backoff_factor.

New in version 0.22.0: The retry_status parameter.

New in version 0.22.0: The backoff_factor parameter.

New in version 0.22.0: The prediction_id parameter.

Parameters:
  • x (list) – A batch of inputs for the model.

  • compress (bool, default False) – Whether to compress the request body.

  • max_retries (int, default 13) – Maximum number of retries on status codes listed in retry_status.

  • retry_status (set, default {404, 429, 500, 503, 504}) – Set of status codes, as integers, for which retry attempts should be made. Passing a value overrides the default set, so include the defaults explicitly if you want to keep them. For example, to add status code 409 to the existing set, use: retry_status={404, 429, 500, 503, 504, 409}

  • backoff_factor (float, default 0.3) – A backoff factor to apply between retry attempts. Uses standard urllib3 sleep pattern: {backoff factor} * (2 ** ({number of total retries} - 1)) with a maximum sleep time between requests of 120 seconds.

  • prediction_id (str, default None) – A custom string to use as the ID for the prediction request. Defaults to a randomly generated UUID.

Returns:

prediction (list) – Output returned by the deployed model for x.

Raises:
  • RuntimeError – If the deployed model encounters an error while running the prediction.

  • requests.HTTPError – If the server encounters an error while handling the HTTP request.
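The retry schedule implied by the defaults can be computed directly from the sleep formula quoted in the backoff_factor description. A minimal sketch, assuming the 120-second cap described above:

```python
def retry_sleeps(max_retries: int = 13, backoff_factor: float = 0.3,
                 cap: float = 120.0) -> list:
    """Sleep before retry n: backoff_factor * 2**(n - 1), capped at `cap` seconds."""
    return [min(backoff_factor * 2 ** (n - 1), cap)
            for n in range(1, max_retries + 1)]

sleeps = retry_sleeps()
print(sleeps[:5])          # [0.3, 0.6, 1.2, 2.4, 4.8]
print(round(sum(sleeps)))  # 633 -- roughly 10.5 minutes across 13 retries
```

The total of roughly 10 minutes matches the "13 retries over 10 minutes" default behavior noted in the changelog above; raising backoff_factor or max_retries lengthens the window accordingly.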

predict_with_id(x: List[Any], compress=False, max_retries: int = 13, retry_status: Set[int] = {404, 429, 500, 503, 504}, backoff_factor: float = 0.3, prediction_id: str = None) → Tuple[str, List[Any]]

Makes a prediction using input x, in the same manner as predict, but returns a tuple containing the ID of the prediction request along with the prediction results.

New in version 0.22.0: The prediction_id parameter.

New in version 0.22.0: The retry_status parameter.

New in version 0.22.0: The backoff_factor parameter.

Parameters:
  • x (list) – A batch of inputs for the model.

  • compress (bool, default False) – Whether to compress the request body.

  • max_retries (int, default 13) – Maximum number of retries on status codes listed in retry_status.

  • retry_status (set, default {404, 429, 500, 503, 504}) – Set of status codes, as integers, for which retry attempts should be made. Passing a value overrides the default set, so include the defaults explicitly if you want to keep them. For example, to add status code 409 to the existing set, use: retry_status={404, 429, 500, 503, 504, 409}

  • backoff_factor (float, default 0.3) – A backoff factor to apply between retry attempts. Uses standard urllib3 sleep pattern: {backoff factor} * (2 ** ({number of total retries} - 1)) with a maximum sleep time between requests of 120 seconds.

  • prediction_id (str, default None) – A custom string to use as the ID for the prediction request. Defaults to a randomly generated UUID.

Returns:

  • id (str) – The prediction ID.

  • prediction (List[Any]) – The output returned by the deployed model for x.

Raises:
  • RuntimeError – If the deployed model encounters an error while running the prediction.

  • requests.HTTPError – If the server encounters an error while handling the HTTP request.
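When tracking predictions yourself rather than relying on the auto-generated UUID, a custom prediction_id can be built with the standard library. This is a hypothetical sketch; the prefix convention shown is an assumption for illustration, not a library requirement:

```python
import uuid

def make_prediction_id(prefix: str) -> str:
    """Build a traceable prediction ID: a caller-chosen prefix plus a random UUID."""
    return f"{prefix}-{uuid.uuid4()}"

pid = make_prediction_id("nightly-run")
print(pid)  # e.g. "nightly-run-<random uuid>"
```

Such an ID could then be passed as prediction_id to predict_with_id, whose returned tuple echoes the ID back alongside the results so requests and responses can be correlated in logs.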