RegisteredModelVersion

class verta.registry.entities.RegisteredModelVersion(conn, conf, msg)

Object representing a version of a registered model.

New in version 0.24.0: experiment_run_id attribute.

There should not be a need to instantiate this class directly; please use RegisteredModel.get_or_create_version().

Variables:
  • id (int) – ID of this model version.

  • name (str) – Name of this model version.

  • has_environment (bool) – Whether there is an environment associated with this model version.

  • has_model (bool) – Whether there is a model associated with this model version.

  • registered_model_id (int) – ID of this version’s registered model.

  • experiment_run_id (str or None) – ID of this version’s source experiment run, if it was created via RegisteredModel.create_version_from_run().

  • stage (str) – Model version stage.

  • url (str) – Verta web app URL.

add_attribute(key, value, overwrite=False)

Adds an attribute to this model version.

Parameters:
  • key (str) – Name of the attribute.

  • value (one of {None, bool, float, int, str, list, dict}) – Value of the attribute.

  • overwrite (bool, default False) – Whether to allow overwriting an existing attribute with key key.

add_attributes(attrs, overwrite=False)

Adds potentially multiple attributes to this model version.

Parameters:
  • attrs (dict of str to {None, bool, float, int, str, list, dict}) – Attributes.

  • overwrite (bool, default False) – Whether to allow overwriting existing attributes that share a key with an entry in attrs.
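A minimal sketch of checking values against the documented types before a batched call. The helper names invalid_attribute_keys and add_checked_attributes are hypothetical and not part of the Verta client; model_ver is assumed to be an existing RegisteredModelVersion. Only top-level value types are checked, not the contents of lists or dicts.

```python
# Value types documented as accepted by add_attribute() / add_attributes().
ALLOWED_TYPES = (type(None), bool, float, int, str, list, dict)


def invalid_attribute_keys(attrs):
    """Return the keys whose values are not of a documented attribute type."""
    return [key for key, value in attrs.items()
            if not isinstance(value, ALLOWED_TYPES)]


def add_checked_attributes(model_ver, attrs, overwrite=False):
    """Hypothetical helper: validate locally, then log everything in one call."""
    bad_keys = invalid_attribute_keys(attrs)
    if bad_keys:
        raise TypeError(f"unsupported attribute value types for keys: {bad_keys}")
    # add_attributes() and its overwrite flag are the documented API.
    model_ver.add_attributes(attrs, overwrite=overwrite)
```

Validating first means a single bad value is reported before any request is sent, rather than partway through a batch.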

add_label(label)

Adds a label to this model version.

Parameters:

label (str) – Label to add.

add_labels(labels)

Adds multiple labels to this model version.

Parameters:

labels (list of str) – Labels to add.

change_stage(stage_change)

Change this model version’s stage, bypassing the approval cycle.

New in version 0.19.2.

Note

User must have read-write permissions.

Parameters:

stage_change (stage_change) – Desired stage change.

Returns:

str – This model version’s new stage.

Examples

See documentation for individual stage change objects for usage examples.

create_external_build(location: str, requires_root: Optional[bool] = None, scan_external: Optional[bool] = None, self_contained: Optional[bool] = None) -> Build

(alpha) Creates a new external build for this model version.

New in version 0.24.1.

Parameters:
  • location (str) – The location of the build.

  • requires_root (bool, optional) – Whether the build requires root access.

  • scan_external (bool, optional) – Whether to scan the build for vulnerabilities using the external provider.

  • self_contained (bool, optional) – Whether the build is self-contained.

Returns:

Build

del_artifact(key)

Deletes the artifact with name key from this model version.

Parameters:

key (str) – Name of the artifact.

del_attribute(key)

Deletes the attribute with name key from this model version.

Parameters:

key (str) – Name of the attribute.

del_dataset_version(key)

Deletes the DatasetVersion with name key from this model version.

New in version 0.21.1.

Parameters:

key (str) – Name of dataset version.

del_environment()

Deletes the environment of this model version.

del_label(label)

Deletes a label from this model version.

Parameters:

label (str) – Label to delete.

del_model()

Deletes the model of this model version.

delete()

Deletes this model version.

New in version 0.17.3.

download_artifact(key, download_to_path)

Downloads the artifact with name key to path download_to_path.

Parameters:
  • key (str) – Name of the artifact.

  • download_to_path (str) – Path to download to.

Returns:

downloaded_to_path (str) – Absolute path where artifact was downloaded to. Matches download_to_path.

download_docker_context(download_to_path, self_contained=False)

Downloads this model version’s Docker context tgz.

Parameters:
  • download_to_path (str) – Path to download Docker context to.

  • self_contained (bool, default False) – Whether the downloaded Docker context should be self-contained.

Returns:

downloaded_to_path (str) – Absolute path where Docker context was downloaded to. Matches download_to_path.

download_model(download_to_path)

Downloads the model logged with log_model() to path download_to_path.

Parameters:

download_to_path (str) – Path to download to.

Returns:

downloaded_to_path (str) – Absolute path where the model was downloaded to. Matches download_to_path.

fetch_artifacts(keys)

Downloads artifacts that are associated with a Standard Verta Model.

Parameters:

keys (list of str) – Keys of artifacts to download.

Returns:

dict of str to str – Map of artifacts’ keys to their cache filepaths—for use as the artifacts parameter to a Standard Verta Model.

Examples

run.log_artifact("weights", open("weights.npz", 'rb'))
# upload complete (weights)
run.log_artifact("text_embeddings", open("embedding.csv", 'rb'))
# upload complete (text_embeddings)
artifact_keys = ["weights", "text_embeddings"]
artifacts = run.fetch_artifacts(artifact_keys)
artifacts
# {'weights': '/Users/convoliution/.verta/cache/artifacts/50a9726b3666d99aea8af006cf224a7637d0c0b5febb3b0051192ce1e8615f47/weights.npz',
#  'text_embeddings': '/Users/convoliution/.verta/cache/artifacts/2d2d1d809e9bce229f0a766126ae75df14cadd1e8f182561ceae5ad5457a3c38/embedding.csv'}
ModelClass(artifacts=artifacts).predict(["Good book.", "Bad book!"])
# [0.955998517288053, 0.09809996313422353]
run.log_model(ModelClass, artifacts=artifact_keys)
# upload complete (custom_modules.zip)
# upload complete (model.pkl)
# upload complete (model_api.json)

finetune(destination_registered_model: Union[str, RegisteredModel], train_dataset: DatasetVersion, eval_dataset: Optional[DatasetVersion] = None, test_dataset: Optional[DatasetVersion] = None, name: Optional[str] = None, finetuning_config: Optional[_FinetuningConfig] = None) -> RegisteredModelVersion

Fine-tune this model version using the provided dataset(s).

Parameters:
  • destination_registered_model (str or RegisteredModel) – Registered model (or simply its name) in which to create the new fine-tuned model version.

  • train_dataset (DatasetVersion) – Dataset version to use for training. The content passed to Dataset.create_version() must have enable_mdb_versioning=True.

  • eval_dataset (DatasetVersion, optional) – Dataset version to use for evaluation. The content passed to Dataset.create_version() must have enable_mdb_versioning=True.

  • test_dataset (DatasetVersion, optional) – Dataset version to use for final testing at the end of fine-tuning. The content passed to Dataset.create_version() must have enable_mdb_versioning=True.

  • name (str, optional) – Name for the new fine-tuned model version. If no name is provided, one will be generated.

  • finetuning_config (fine-tuning configuration, default LoraConfig) – Fine-tuning algorithm and configuration.

Returns:

RegisteredModelVersion – New fine-tuned model version.

get_artifact(key)

Gets the artifact with name key from this model version.

If the artifact was originally logged as just a filesystem path, that path will be returned. Otherwise, bytes representing the artifact object will be returned.

Parameters:

key (str) – Name of the artifact.

Returns:

str or object or bytes – Path of the artifact, the artifact object, or a bytestream representing the artifact.

get_artifact_keys()

Gets the artifact keys of this model version.

Returns:

list of str – List of artifact keys of this model version.
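As a sketch, get_artifact_keys() might be combined with download_artifact() to pull every artifact into one directory. The helper name download_all_artifacts is hypothetical, and the sketch assumes each artifact key is usable as a file name.

```python
import os


def download_all_artifacts(model_ver, dest_dir):
    """Download every artifact of `model_ver` into `dest_dir`.

    Returns a dict mapping each artifact key to the path reported by
    download_artifact().
    """
    os.makedirs(dest_dir, exist_ok=True)
    paths = {}
    for key in model_ver.get_artifact_keys():
        # Assumption: the key is a valid file name on this filesystem.
        target = os.path.join(dest_dir, key)
        paths[key] = model_ver.download_artifact(key, target)
    return paths
```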

get_attribute(key)

Gets the attribute with name key from this model version.

Parameters:

key (str) – Name of the attribute.

Returns:

one of {None, bool, float, int, str} – Value of the attribute.

get_attributes()

Gets all attributes from this model version.

Returns:

dict of str to {None, bool, float, int, str} – Names and values of all attributes.

get_code_version(key)

Get a code version snapshot.

New in version 0.19.0.

Parameters:

key (str) – Name of the code version.

Returns:

code – Code version.

Examples

model_ver.get_code_version("training")
# Git Version
#     commit 52f3d22
#     in repo git@github.com:VertaAI/models.git

get_code_versions()

Get all code version snapshots.

New in version 0.19.0.

Returns:

dict of str to code – Code versions mapped to names.

Examples

model_ver.get_code_versions()
# {'training': Git Version
#      commit 52f3d22
#      in repo git@github.com:VertaAI/models.git,
#  'inference_code': Git Version
#      commit 26f9787
#      in repo git@github.com:VertaAI/data-processing.git}

get_dataset_version(key)

Gets the DatasetVersion with name key from this model version.

New in version 0.21.1.

Parameters:

key (str) – Name of the dataset version.

Returns:

DatasetVersion – DatasetVersion associated with the given key.

get_dataset_versions()

Gets all DatasetVersions associated with this model version.

New in version 0.24.0.

Returns:

list of DatasetVersion – DatasetVersions associated with this model version.

get_docker()

Get logged Docker image information.

Returns:

DockerImage

get_environment()

Get the logged environment.

Returns:

environment – Logged environment.

get_experiment_run() -> Optional[ExperimentRun]

Gets this model version’s source experiment run, if it was created via RegisteredModel.create_version_from_run().

Returns:

ExperimentRun or None

get_hide_input_label()

Gets whether to hide the model version’s input label on the preview.

Returns:

hide (bool)

get_hide_output_label()

Gets whether to hide the model version’s output label on the preview.

Returns:

hide (bool)

get_input_description()

Gets the description of the model version’s input.

This field gives users a quick view of what type of data will be used as input for a model. It also helps non-technical users understand model behavior at a glance.

Returns:

desc (str)

get_labels()

Gets all labels of this model version.

Returns:

list of str – List of all labels of this model version.
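A minimal sketch of using get_labels() together with add_labels() and del_label() to bring a version's labels in line with a desired set. The helper names label_diff and sync_labels are hypothetical.

```python
def label_diff(current, desired):
    """Return (labels_to_add, labels_to_delete) as sorted lists."""
    current, desired = set(current), set(desired)
    return sorted(desired - current), sorted(current - desired)


def sync_labels(model_ver, desired):
    """Hypothetical helper: make the version's labels equal to `desired`."""
    to_add, to_delete = label_diff(model_ver.get_labels(), desired)
    if to_add:
        model_ver.add_labels(to_add)   # one call for all new labels
    for label in to_delete:
        model_ver.del_label(label)     # del_label() takes a single label
```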

get_lock_level()

Gets this model version’s lock level.

Returns:

lock_level (lock) – This model version’s lock level.

get_model()

Gets the model of this model version.

If the model was originally logged as just a filesystem path, that path will be returned. Otherwise, bytes representing the model object will be returned.

Returns:

str or object or bytes – Path of the model, the model object, or a bytestream representing the model.

get_output_description()

Gets the description of the model version’s output.

This field gives users a quick view of what type of data will be produced as a result of executing a model. It also helps non-technical users understand model behavior at a glance.

Returns:

desc (str)

get_schema() -> Dict[str, dict]

Gets the input and output JSON schemas, in the format:

{
    "input": <input schema>,
    "output": <output schema>
}

New in version 0.24.0.

If no output schema was provided, output will not be included in the returned dict.

Returns:

dict of str to dict – Input and output JSON schemas.
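Because the output key may be absent, callers might read it defensively. The helper name split_schema below is hypothetical; it is a sketch that only relies on the dict layout described above.

```python
def split_schema(schema):
    """Split the dict returned by get_schema() into (input, output).

    "output" is absent when no output schema was logged, so it is read
    with dict.get() and comes back as None in that case.
    """
    return schema["input"], schema.get("output")


# Usage, assuming model_ver is an existing RegisteredModelVersion:
# input_schema, output_schema = split_schema(model_ver.get_schema())
```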

list_builds() -> List[Build]

Gets this model version’s past and present builds.

New in version 0.23.0.

Builds are returned in order of creation (most recent first).

Returns:

list of Build

Examples

To fetch builds that have passed their scans:

passed_builds = list(filter(
    lambda build: build.get_scan().passed,
    model_ver.list_builds(),
))

To fetch builds that haven’t been scanned in a while:

from datetime import datetime, timedelta

past_builds = list(filter(
    lambda build: build.get_scan().date_updated < datetime.now().astimezone() - timedelta(days=30),
    model_ver.list_builds(),
))

log_artifact(key, artifact, overwrite=False, _extension=None)

Logs an artifact to this model version.

Note

The following artifact keys are reserved for internal use within the Verta system:

  • "custom_modules"

  • "model"

  • "model.pkl"

  • "model_api.json"

  • "requirements.txt"

  • "train_data"

  • "tf_saved_model"

  • "setup_script"

Parameters:
  • key (str) – Name of the artifact.

  • artifact (str or file-like or object) –

    Artifact or some representation thereof.
    • If str, then it will be interpreted as a filesystem path, its contents read as bytes, and uploaded as an artifact. If it is a directory path, its contents will be zipped.

    • If file-like, then the contents will be read as bytes and uploaded as an artifact.

    • Otherwise, the object will be serialized and uploaded as an artifact.

  • overwrite (bool, default False) – Whether to allow overwriting an existing artifact with key key.
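A sketch of guarding against the reserved keys listed above before calling log_artifact(). The helper name log_user_artifact is hypothetical, and how the client itself reacts to a reserved key is not covered here; the check is purely local.

```python
# Keys listed in this documentation as reserved for internal use.
RESERVED_ARTIFACT_KEYS = frozenset({
    "custom_modules",
    "model",
    "model.pkl",
    "model_api.json",
    "requirements.txt",
    "train_data",
    "tf_saved_model",
    "setup_script",
})


def log_user_artifact(model_ver, key, artifact, overwrite=False):
    """Hypothetical helper: refuse reserved keys, then log the artifact."""
    if key in RESERVED_ARTIFACT_KEYS:
        raise ValueError(f"artifact key {key!r} is reserved for internal use")
    model_ver.log_artifact(key, artifact, overwrite=overwrite)
```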

log_code_version(key, code_version)

Log a code version snapshot.

New in version 0.19.0.

Parameters:
  • key (str) – Name for the code version.

  • code_version (code) – Code version.

Examples

from verta.code import Git

training_code = Git(
    repo_url="git@github.com:VertaAI/models.git",
    commit_hash="52f3d22",
    autocapture=False,
)
inference_code = Git(
    repo_url="git@github.com:VertaAI/data-processing.git",
    commit_hash="26f9787",
    autocapture=False,
)

model_ver.log_code_version("training", training_code)
model_ver.log_code_version("inference_code", inference_code)

log_code_versions(code_versions)

Log multiple code version snapshots in a batched request.

New in version 0.19.0.

Parameters:

code_versions (dict of str to code) – Code versions mapped to names.

Examples

from verta.code import Git

code_versions = {
    "training": Git(
        repo_url="git@github.com:VertaAI/models.git",
        commit_hash="52f3d22",
        autocapture=False,
    ),
    "inference_code": Git(
        repo_url="git@github.com:VertaAI/data-processing.git",
        commit_hash="26f9787",
        autocapture=False,
    ),
}

model_ver.log_code_versions(code_versions)

log_dataset_version(key, dataset_version)

Logs a Verta DatasetVersion to this model version with the given key.

New in version 0.21.1.

Parameters:
  • key (str) – Name of the dataset version.

  • dataset_version (DatasetVersion) – Dataset version.

log_docker(docker_image, model_api=None, overwrite=False)

Log Docker image information for deployment.

New in version 0.20.0.

Note

This feature may require an upgrade to the platform; please reach out to the Verta team at support@verta.ai to verify its availability in your version of the system.

Note

This method cannot be used alongside log_environment().

Parameters:
  • docker_image (DockerImage) – Docker image information.

  • model_api (ModelAPI, optional) – Model API specifying the model’s expected input and output.

  • overwrite (bool, default False) – Whether to allow overwriting existing Docker image information.

Examples

from verta.registry import DockerImage

model_ver.log_docker(
    DockerImage(
        port=5000,
        request_path="/predict_json",
        health_path="/health",

        repository="012345678901.dkr.ecr.apne2-az1.amazonaws.com/models/example",
        tag="example",

        env_vars={"CUDA_VISIBLE_DEVICES": "0,1"},
    )
)

log_environment(env, overwrite=False)

Log an environment.

Parameters:
  • env (environment) – Environment to log.

  • overwrite (bool, default False) – Whether to allow overwriting an existing environment.

log_model(model, custom_modules=None, model_api=None, artifacts=None, overwrite=False)

Logs a model and associated code dependencies.

Note

If using an XGBoost model from their scikit-learn API, "scikit-learn" must also be specified in log_environment() (in addition to "xgboost").

Parameters:
  • model (str or object) –

    Model. For deployment, this parameter can be one of the following types:
    For more general model logging, the following types are also supported:
    • str path to a file or directory

    • arbitrary pickleable object

  • custom_modules (list of str, optional) –

    Paths to local Python modules and other files that the deployed model depends on. Modules from the standard library should not be included here.
    • If directories are provided, all files within—excluding virtual environments—will be included.

    • If module names are provided, all files within the corresponding module inside a folder in sys.path will be included.

    • If not provided, all Python files located within sys.path—excluding virtual environments—will be included.

    • If an empty list is provided, no local files will be included at all. This can be useful for decreasing upload times or resolving certain types of package conflicts when a model has no local dependencies.

  • model_api (ModelAPI, optional) – Model API specifying details about the model and its deployment.

  • artifacts (list of str, optional) – Keys of logged artifacts to be used by a Standard Verta Model.

  • overwrite (bool, default False) – Whether to allow overwriting existing model artifacts.

log_reference_data(X, Y, overwrite=False)

Log tabular reference data.

Parameters:
  • X (pd.DataFrame) – Reference data inputs.

  • Y (pd.DataFrame) – Reference data outputs.

  • overwrite (bool, default False) – Whether to allow overwriting existing reference data.

log_schema(input: dict, output: Optional[dict] = None) -> None

Sets the input and output schemas, which are stored as model artifacts.

New in version 0.24.0.

To propagate this change to any live endpoints, you must redeploy the model by calling update().

The output schema is optional.

To validate a prediction’s input and output against these schemas, use the validate_schema() decorator on your model’s predict() function.

Parameters:
  • input (dict) – Input schema as an OpenAPI-compatible JSON dict. Easiest to create using pydantic.BaseModel.schema() [1].

  • output (dict, optional) – Output schema as an OpenAPI-compatible JSON dict. Easiest to create using pydantic.BaseModel.schema().
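As a sketch, the schemas can also be written by hand as plain dicts, which avoids a pydantic dependency. The field names text and score and the helper name log_schemas are illustrative assumptions, not part of the Verta client.

```python
# Hand-written JSON Schema dicts; pydantic could generate equivalent ones.
input_schema = {
    "type": "object",
    "properties": {"text": {"type": "string"}},
    "required": ["text"],
}
output_schema = {
    "type": "object",
    "properties": {"score": {"type": "number"}},
    "required": ["score"],
}


def log_schemas(model_ver, input_schema, output_schema=None):
    """Hypothetical helper: log the input schema and, optionally, the output."""
    model_ver.log_schema(input=input_schema, output=output_schema)
```

Passing output_schema=None matches the documented default, so the same helper covers the input-only case.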

References

log_setup_script(script, overwrite=False)

Associate a model deployment setup script with this model version.

New in version 0.13.8.

Parameters:
  • script (str) – String composed of valid Python code for executing setup steps at the beginning of model deployment. An on-disk file can be passed in using open("path/to/file.py", 'r').read().

  • overwrite (bool, default False) – Whether to allow overwriting an existing setup script.

Raises:

SyntaxError – If script contains invalid Python.
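A sketch of a local pre-check that surfaces a SyntaxError before anything is sent. It uses Python's built-in compile(), which parses the source without executing it. The helper name check_setup_script is hypothetical, and how the client validates the script internally is not assumed here.

```python
def check_setup_script(script):
    """Raise SyntaxError early if `script` is not valid Python source."""
    # compile() only parses the source; the setup steps are not executed.
    compile(script, "<setup_script>", "exec")
    return script


# Usage, assuming model_ver is an existing RegisteredModelVersion:
# model_ver.log_setup_script(check_setup_script("import nltk\n"))
```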

log_training_data(train_features, train_targets, overwrite=False)

Associate training data with this model version.

Changed in version 0.14.4: Instead of uploading the data itself as a CSV artifact 'train_data', this method now generates a histogram for internal use by our deployment data monitoring system.

Deprecated since version 0.18.0: This method is no longer supported. Please see our documentation for information about our platform’s data monitoring features.

Parameters:
  • train_features (pd.DataFrame) – pandas DataFrame representing features of the training data.

  • train_targets (pd.DataFrame or pd.Series) – pandas DataFrame or Series representing targets of the training data.

  • overwrite (bool, default False) – Whether to allow overwriting existing training data.

set_hide_input_label(hide)

Sets whether to hide the model version’s input label on the preview.

Parameters:

hide (bool) – Whether to hide the input label.

set_hide_output_label(hide)

Sets whether to hide the model version’s output label on the preview.

Parameters:

hide (bool) – Whether to hide the output label.

set_input_description(desc)

Sets the description of the model version’s input.

This field gives users a quick view of what type of data will be used as input for a model. It also helps non-technical users understand model behavior at a glance.

Parameters:

desc (str) – Description of the model version’s input.

set_lock_level(lock_level)

Sets this model version’s lock level.

Parameters:

lock_level (lock) – Lock level to set.

set_output_description(desc)

Sets the description of the model version’s output.

This field gives users a quick view of what type of data will be produced as a result of executing a model. It also helps non-technical users understand model behavior at a glance.

Parameters:

desc (str) – Description of the model version’s output.