unionml.model.Model#

class unionml.model.Model(name='model', init=None, *, dataset, hyperparameter_config=None)#

Initialize a UnionML Model.

The term UnionML Model refers to the specification of a model, which the user defines through the functional entrypoints, e.g. unionml.model.Model.trainer(). The term model object refers to an instance of a model from a machine learning framework, such as a subclass of BaseEstimator in sklearn or Module in pytorch.

Parameters:
  • name (str) – name of the model

  • init (Union[Type, Callable, None]) – a class or callable that produces a model object (e.g. an sklearn estimator) when invoked.

  • dataset (Dataset) – a UnionML Dataset object to bind to the model.

  • hyperparameter_config (Optional[Dict[str, Type]]) –

    A dictionary mapping hyperparameter names to types. This is used to determine the hyperparameter names and types associated with the model object produced by the init argument. For example:

    >>> {
    ...    "hyperparameter1": int,
    ...    "hyperparameter2": str,
    ...    "hyperparameter3": float,
    ... }
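As a rough illustration of how a name-to-type mapping like the one above can be turned into a typed container, the sketch below builds a dataclass with the standard library. The helper build_hyperparameter_type and the class name Hyperparameters are hypothetical and not part of the UnionML API; how UnionML constructs its own hyperparameter type internally is not shown in this reference.

```python
from dataclasses import asdict, fields, make_dataclass

# Same shape as the hyperparameter_config example: name -> type.
hyperparameter_config = {
    "hyperparameter1": int,
    "hyperparameter2": str,
    "hyperparameter3": float,
}


def build_hyperparameter_type(config):
    """Hypothetical helper: create a dataclass with one typed field per hyperparameter."""
    return make_dataclass("Hyperparameters", list(config.items()))


Hyperparameters = build_hyperparameter_type(hyperparameter_config)

# Instantiate the generated type with concrete values.
hp = Hyperparameters(hyperparameter1=10, hyperparameter2="l2", hyperparameter3=0.5)

# Inspect the generated fields and convert the instance back to a plain dict.
field_types = {f.name: f.type for f in fields(Hyperparameters)}
hp_dict = asdict(hp)
```

A typed container like this makes it straightforward to check that the values passed as hyperparameters match the declared names.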
    

Methods

add_predictor_schedule

Add a prediction schedule to the model.

add_trainer_schedule

Add a training schedule to the model.

evaluator

Register a function for producing metrics for a given model object.

init

Register a function for initializing a model object.

load

Load a model object from disk.

load_from_env

Load a model object from an environment variable pointing to the model file.

loader

Register a function for deserializing a model object from disk.

predict

Generate predictions locally.

predict_from_features_task

Create a Flyte task for generating predictions from a model object.

predict_from_features_workflow

Create a Flyte prediction workflow using raw features.

predict_task

Create a Flyte task for generating predictions from a model object.

predict_workflow

Create a Flyte prediction workflow using features from the dataset.reader as the data source.

predictor

Register a function that generates predictions from a model object.

remote

Configure the unionml.Model for remote backend deployment.

remote_activate_schedules

Activate deployed schedules.

remote_deactivate_schedules

Deactivate deployed schedules.

remote_deploy

Deploy model services to a Flyte backend.

remote_fetch_model

Fetch a model artifact from a Flyte execution.

remote_fetch_predictions

Fetch predictions from a Flyte execution.

remote_list_model_versions

Lists all the model versions of this UnionML app, in reverse chronological order.

remote_list_prediction_ids

Lists all the prediction ids of this UnionML app, in reverse chronological order.

remote_list_scheduled_prediction_runs

Lists executions associated with a prediction schedule, sorted from most to least recent.

remote_list_scheduled_training_runs

Lists executions associated with a training schedule, sorted from most to least recent.

remote_load

Load a ModelArtifact based on the provided Flyte execution.

remote_predict

Generate predictions on a remote Flyte backend.

remote_train

Train a model object on a remote Flyte backend.

remote_wait

Wait for a FlyteWorkflowExecution to complete and return the execution's output.

resolve_model_artifact

Get a ModelArtifact from multiple sources.

save

Save the model object to disk.

saver

Register a function for serializing a model object to disk.

schedule_prediction

Schedule the prediction service when the UnionML app is deployed.

schedule_training

Schedule the training service when the UnionML app is deployed.

serve

Create a FastAPI serving app.

train

Train a model object locally.

train_task

Create a Flyte task for training a model object.

train_workflow

Create a Flyte training workflow for model object training.

trainer

Register a function for training a model object.

Attributes

artifact

Model artifact associated with the unionml.Model.

config_file

Path to the config file associated with the Flyte backend.

dataset

Exposes the unionml.Dataset associated with the model.

dockerfile

Path to the Dockerfile used to package the UnionML app.

hyperparameter_type

Hyperparameter type of the model object based on the init function signature.

model_type

predict_callbacks

predict_from_features_workflow_name

Name of the prediction workflow used to generate predictions from raw features.

predict_workflow_name

Name of the prediction workflow used to generate predictions from the dataset.reader.

prediction_schedule_names

Names of all the prediction schedules.

prediction_schedules

Scheduled prediction jobs.

prediction_type

registry

Docker registry used to push UnionML app.

train_workflow_name

Name of the training workflow.

trainer_params

Parameters used to create a Flyte workflow for model object training.

training_schedule_names

Names of all the training schedules.

training_schedules

Scheduled training jobs.

_default_loader(file, *args, **kwargs)#

Default model loader.

Supports sklearn estimators, pytorch modules, and keras models.

Parameters:
  • file (Union[str, PathLike, IO]) – str, path-like, or file-like object to read from.

  • args – additional args to forward to the underlying deserialization function.

  • kwargs – additional kwargs to forward to the underlying deserialization function.

Return type:

Any

The underlying deserialization function depends on the type of the model object (sklearn, pytorch, or keras).

_default_saver(model_obj, hyperparameters, file, *args, **kwargs)#

Default model saver.

Supports sklearn estimators, pytorch modules, and keras models.

Parameters:
  • model_obj (Any) – model object to serialize.

  • hyperparameters (Union[dict, BaseHyperparameters, None]) – hyperparameters associated with the model object.

  • file (Union[str, PathLike, IO]) – str, path-like, or file-like object to write the contents of the model object to.

  • args – additional args to forward to the underlying serialization function.

  • kwargs – additional kwargs to forward to the underlying serialization function.

Return type:

Any

The underlying serialization function depends on the type of the model object (sklearn, pytorch, or keras).

add_predictor_schedule(schedule)#

Add a prediction schedule to the model.

Parameters:

schedule (Schedule) – schedule specification to add

add_trainer_schedule(schedule)#

Add a training schedule to the model.

Parameters:

schedule (Schedule) – schedule specification to add

property artifact: Optional[ModelArtifact]#

Model artifact associated with the unionml.Model.

This attribute is set when calling the following methods:

  • train

  • load

  • load_from_env

  • remote_load

  • remote_train with the wait=True kwarg.

property config_file: Optional[str]#

Path to the config file associated with the Flyte backend.

property dataset: Dataset#

Exposes the unionml.Dataset associated with the model.

property dockerfile: Optional[str]#

Path to the Dockerfile used to package the UnionML app.

evaluator(fn)#

Register a function for producing metrics for a given model object.

property hyperparameter_type: Type#

Hyperparameter type of the model object based on the init function signature.

init(fn)#

Register a function for initializing a model object.

load(file, *args, **kwargs)#

Load a model object from disk.

Parameters:
  • file – a string or path-like object to load a model from.

  • args – positional arguments forwarded to unionml.Model.loader().

  • kwargs – keyword arguments forwarded to unionml.Model.loader().

load_from_env(env_var='UNIONML_MODEL_PATH', *args, **kwargs)#

Load a model object from an environment variable pointing to the model file.

Parameters:
  • env_var – environment variable referencing a path to load a model from.

  • args – positional arguments forwarded to unionml.Model.loader().

  • kwargs – keyword arguments forwarded to unionml.Model.loader().
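The lookup described above can be sketched with the standard library alone. This is a minimal illustration, not UnionML's implementation: load_from_env_sketch is a hypothetical helper, pickle stands in for whatever loader is registered, and raising ValueError for a missing variable is an assumption.

```python
import os
import pickle
import tempfile


def load_from_env_sketch(env_var="UNIONML_MODEL_PATH", loader=pickle.load):
    """Hypothetical helper: read a file path from env_var and deserialize the file."""
    path = os.environ.get(env_var)
    if path is None:
        # Assumption: a missing variable is treated as an error.
        raise ValueError(f"environment variable {env_var} is not set")
    with open(path, "rb") as f:
        return loader(f)


# Usage: write a stand-in "model object" to disk, point the variable at it, load it back.
with tempfile.TemporaryDirectory() as tmp:
    model_path = os.path.join(tmp, "model.pkl")
    with open(model_path, "wb") as f:
        pickle.dump({"weights": [0.1, 0.2]}, f)
    os.environ["UNIONML_MODEL_PATH"] = model_path
    loaded = load_from_env_sketch()

# Clean up the variable so the example leaves no trace in the environment.
os.environ.pop("UNIONML_MODEL_PATH", None)
```

Reading the path from the environment lets the same code load different model files in different deployments without code changes.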

loader(fn)#

Register a function for deserializing a model object from disk.

predict(features=None, **reader_kwargs)#

Generate predictions locally.

You can either pass this function raw features via the features argument or you can pass in keyword arguments that will be forwarded to the unionml.Dataset.reader() method as the feature source.

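Parameters:
  • features – raw features to generate predictions from.

  • reader_kwargs – keyword arguments forwarded to the unionml.Dataset.reader() method as the feature source.
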
predict_from_features_task()#

Create a Flyte task for generating predictions from a model object.

This is used in the Flyte workflow produced by predict_from_features_workflow.

predict_from_features_workflow()#

Create a Flyte prediction workflow using raw features.

property predict_from_features_workflow_name#

Name of the prediction workflow used to generate predictions from raw features.

predict_task()#

Create a Flyte task for generating predictions from a model object.

This is used in the Flyte workflow produced by predict_workflow.

predict_workflow()#

Create a Flyte prediction workflow using features from the dataset.reader as the data source.

property predict_workflow_name#

Name of the prediction workflow used to generate predictions from the dataset.reader.

property prediction_schedule_names: List[str]#

Names of all the prediction schedules.

property prediction_schedules: List[Schedule]#

Scheduled prediction jobs.

predictor(fn=None, callbacks=None, **predict_task_kwargs)#

Register a function that generates predictions from a model object.

This function is the primary entrypoint for defining your application’s prediction behavior.

See the User Guide for more.

Parameters:
  • fn – function to use as the predictor.

  • predict_task_kwargs – keyword arguments to pass into the flytekit task that will be composed of the input fn and functions defined in the bound Dataset.

property registry: Optional[str]#

Docker registry used to push UnionML app.

remote(registry=None, image_name=None, dockerfile='Dockerfile', patch_destination_dir='/root', config_file=None, project=None, domain=None)#

Configure the unionml.Model for remote backend deployment.

Parameters:
  • registry (Optional[str]) – Docker registry used to push UnionML app.

  • image_name (Optional[str]) – image name to give to the Docker image associated with the UnionML app.

  • dockerfile (str) – path to the Dockerfile used to package the UnionML app.

  • patch_destination_dir (str) – path where the UnionML source is installed within the Docker image. During patch registration, the source at this path is replaced. In Flyte terms, patch registration is often called fast registration.

  • config_file (Optional[str]) – path to the flytectl config to use for deploying your UnionML app to a Flyte backend.

  • project (Optional[str]) – deploy your app to this Flyte project name.

  • domain (Optional[str]) – deploy your app to this Flyte domain name.

remote_activate_schedules(app_version=None, schedule_names=None)#

Activate deployed schedules.

Parameters:
  • app_version (Optional[str]) – the version to use to fetch the scheduled launchplans. If None, uses gitsha as the version.

  • schedule_names (Optional[List[str]]) – names of schedules to activate.

remote_deactivate_schedules(app_version=None, schedule_names=None)#

Deactivate deployed schedules.

Parameters:
  • app_version (Optional[str]) – the version to use to fetch the scheduled launchplans. If None, uses gitsha as the version.

  • schedule_names (Optional[List[str]]) – names of schedules to deactivate.

remote_deploy(app_version=None, allow_uncommitted=False, patch=False, schedule=True)#

Deploy model services to a Flyte backend.

Parameters:
  • app_version (Optional[str]) – the version to use to deploy the UnionML app. If None, uses gitsha as the version.

  • allow_uncommitted (bool) – if True, deploys uncommitted changes in the unionml project. Otherwise, raises a unionml.remote.VersionFetchError.

  • patch (bool) – if True, this bypasses the Docker build process and only updates the UnionML app source code using the latest available image.

  • schedule (bool) – indicates whether or not to deploy the training and prediction schedules.

Return type:

str

Returns:

app version string

remote_fetch_model(execution)#

Fetch a model artifact from a Flyte execution.

Parameters:

execution (FlyteWorkflowExecution) – a Flyte workflow execution, which is the output of remote_train(..., wait=False).

Return type:

ModelArtifact

remote_fetch_predictions(execution)#

Fetch predictions from a Flyte execution.

Parameters:

execution (FlyteWorkflowExecution) – a Flyte workflow execution, which is the output of remote_predict(..., wait=False).

Return type:

Any

remote_list_model_versions(app_version=None, limit=10)#

Lists all the model versions of this UnionML app, in reverse chronological order.

Parameters:
  • app_version (Optional[str]) – if provided, lists the model versions associated with this app version. By default, this uses the current git sha of the repo, which versions your UnionML app.

  • limit (int) – limit the number of results to fetch.

Return type:

List[str]

remote_list_prediction_ids(app_version=None, limit=10)#

Lists all the prediction ids of this UnionML app, in reverse chronological order.

Prediction ids are unique identifiers given to each batch prediction run that’s executed remotely on a Flyte cluster.

Parameters:
  • app_version (Optional[str]) – if provided, lists the prediction ids associated with this app version. By default, this uses the current git sha of the repo, which versions your UnionML app.

  • limit (int) – limit the number of results to fetch.

Return type:

List[str]

remote_list_scheduled_prediction_runs(schedule_name, app_version=None, limit=5)#

Lists executions associated with a prediction schedule, sorted from most to least recent.

Parameters:
  • schedule_name (str) – fetch runs for this prediction schedule.

  • limit (int) – number of executions to list.

Return type:

List[FlyteWorkflowExecution]

Returns:

a list of FlyteWorkflowExecution objects

remote_list_scheduled_training_runs(schedule_name, app_version=None, limit=5)#

Lists executions associated with a training schedule, sorted from most to least recent.

Parameters:
  • schedule_name (str) – fetch runs for this training schedule.

  • limit (int) – number of executions to list.

Return type:

List[FlyteWorkflowExecution]

Returns:

a list of FlyteWorkflowExecution objects

remote_load(execution)#

Load a ModelArtifact based on the provided Flyte execution.

Parameters:

execution (FlyteWorkflowExecution) – a Flyte workflow execution, which is the output of remote_train(..., wait=False).

remote_predict(app_version=None, model_version=None, wait=True, *, features=None, **reader_kwargs)#

Generate predictions on a remote Flyte backend.

You can either pass this function raw features via the features argument or you can pass in keyword arguments that will be forwarded to the unionml.Dataset.reader() method as the feature source.

Parameters:
  • app_version (Optional[str]) – if provided, executes a prediction job using the specified UnionML app version. By default, this uses the latest app version deployed on the Flyte remote cluster.

  • model_version (Optional[str]) – if provided, executes a prediction job using the specified model version. By default, this uses the latest Flyte execution id as the model version.

  • wait (bool) – if True, this is a synchronous operation, returning the predictions. Otherwise, this function returns a FlyteWorkflowExecution.

  • features (Optional[Any]) –

    Raw features that are pre-processed by the unionml.Dataset methods before prediction.

  • reader_kwargs – keyword arguments that correspond to the unionml.Dataset.reader() method signature.

Return type:

Union[Any, FlyteWorkflowExecution]

Returns:

the predictions if wait is True, or a FlyteWorkflowExecution object if wait is False.

remote_train(app_version=None, wait=True, *, hyperparameters=None, loader_kwargs=None, splitter_kwargs=None, parser_kwargs=None, trainer_kwargs=None, **reader_kwargs)#

Train a model object on a remote Flyte backend.

Parameters:
  • app_version (Optional[str]) – if provided, executes a training job using the specified UnionML app version. By default, this uses the latest app version deployed on the Flyte remote cluster.

  • wait (bool) – if True, this is a synchronous operation, returning a ModelArtifact. Otherwise, this function returns a FlyteWorkflowExecution.

  • hyperparameters (Optional[Dict[str, Any]]) – a dictionary mapping hyperparameter names to values. This is passed into the init callable to initialize a model object.

  • loader_kwargs (Optional[Dict[str, Any]]) – keyword arguments to pass to the registered unionml.Dataset.loader() function. This will override any defaults set in the function definition.

  • splitter_kwargs (Optional[Dict[str, Any]]) – keyword arguments to pass to the registered unionml.Dataset.splitter() function. This will override any defaults set in the function definition.

  • parser_kwargs (Optional[Dict[str, Any]]) – keyword arguments to pass to the registered unionml.Dataset.parser() function. This will override any defaults set in the function definition.

  • trainer_kwargs (Optional[Dict[str, Any]]) – a dictionary mapping training parameter names to values. These training parameters are determined by the keyword-only arguments of the model.trainer function.

  • reader_kwargs – keyword arguments that correspond to the unionml.Dataset.reader() method signature.

Return type:

Union[ModelArtifact, FlyteWorkflowExecution]

Returns:

the trained model if wait is True, or a FlyteWorkflowExecution object if wait is False.

remote_wait(execution, **kwargs)#

Wait for a FlyteWorkflowExecution to complete and return the execution’s output.

Return type:

FlyteWorkflowExecution

resolve_model_artifact(model_object=None, model_version=None, app_version=None, model_file=None, loader_kwargs=None)#

Get a ModelArtifact from multiple sources.

This method produces a model artifact from the following sources:

  • an in-memory model_object

  • a Flyte cluster execution via model_version and app_version

  • a serialized model object via model_file and loader_kwargs

If no arguments are provided, this method simply returns the artifact.

Parameters:
  • model – Model to use for resolving a model object.

  • model_object (Optional[Any]) – model object to use for prediction.

  • model_version (Optional[str]) – model version identifier to use for prediction.

  • app_version (Optional[str]) – if model_version is specified, this argument indicates the app version to use for fetching the model artifact.

  • model_file (Union[str, Path, None]) – a filepath to a serialized model object.

  • loader_kwargs (Optional[dict]) – additional keyword arguments to be forwarded to the unionml.model.Model.loader() function.

Return type:

ModelArtifact
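The source selection described for this method can be sketched as a simple fall-through. This is an illustration under stated assumptions, not UnionML's code: resolve_sketch, its current_artifact and fetch_remote parameters, and the order in which sources are checked are all hypothetical, and pickle stands in for the registered loader.

```python
import pickle


def resolve_sketch(
    current_artifact=None,
    model_object=None,
    model_version=None,
    app_version=None,
    model_file=None,
    loader_kwargs=None,
    fetch_remote=None,
):
    """Hypothetical helper: pick a model from one of several sources."""
    if model_object is not None:
        # Source 1: an in-memory model object.
        return model_object
    if model_version is not None:
        # Source 2: a remote execution, looked up by model and app version.
        if fetch_remote is None:
            raise ValueError("fetch_remote is required when model_version is given")
        return fetch_remote(model_version, app_version)
    if model_file is not None:
        # Source 3: a serialized model object on disk.
        with open(model_file, "rb") as f:
            return pickle.load(f, **(loader_kwargs or {}))
    # No arguments: fall back to the artifact that is already attached.
    return current_artifact


in_memory = resolve_sketch(model_object={"slope": 1.0})
fallback = resolve_sketch(current_artifact="existing-artifact")
remote = resolve_sketch(
    model_version="v1",
    app_version="abc",
    fetch_remote=lambda mv, av: f"{av}:{mv}",  # stand-in for a Flyte lookup
)
```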

save(file, *args, **kwargs)#

Save the model object to disk.

saver(fn)#

Register a function for serializing a model object to disk.
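A custom saver and loader pair might look like the sketch below, which follows the (model_obj, hyperparameters, file) argument order of the default saver signature on this page. The function names, the use of pickle, and the choice to store hyperparameters next to the model object are assumptions for illustration; the functions are shown standalone rather than registered on a Model.

```python
import io
import pickle


def saver_sketch(model_obj, hyperparameters, file):
    """Hypothetical saver: write the model object and hyperparameters to file."""
    payload = {"model_obj": model_obj, "hyperparameters": hyperparameters}
    if hasattr(file, "write"):
        # File-like object: write directly.
        pickle.dump(payload, file)
    else:
        # str or path-like: open the file first.
        with open(file, "wb") as f:
            pickle.dump(payload, f)
    return file


def loader_sketch(file):
    """Hypothetical loader: read back only the model object."""
    if hasattr(file, "read"):
        payload = pickle.load(file)
    else:
        with open(file, "rb") as f:
            payload = pickle.load(f)
    return payload["model_obj"]


# Round trip through an in-memory buffer.
buffer = io.BytesIO()
saver_sketch({"slope": 2.0}, {"learning_rate": 0.1}, buffer)
buffer.seek(0)
restored = loader_sketch(buffer)
```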

schedule_prediction(name, *, expression=None, offset=None, fixed_rate=None, reader_time_arg=None, activate_on_deploy=True, launchplan_kwargs=None, model_object=None, model_version=None, app_version=None, model_file=None, loader_kwargs=None, **reader_kwargs)#

Schedule the prediction service when the UnionML app is deployed.

The model used for prediction must be from one of the following sources:

  • An in-memory model object specified via the model_object argument.

  • A model version associated with a Flyte cluster execution specified via the model_version and app_version arguments.

  • A serialized model object specified via the model_file argument.

Parameters:
  • name (str) – unique name of the launch plan

  • expression (Optional[str]) – a cron expression or a valid croniter schedule, e.g. @hourly, @daily, @weekly, @monthly, @yearly.

  • offset (Optional[str]) – duration to offset the schedule; must be a valid ISO 8601 duration (https://en.wikipedia.org/wiki/ISO_8601). Only used if expression is specified.

  • fixed_rate (Optional[timedelta]) – a timedelta object representing fixed rate with which to run the workflow.

  • reader_time_arg (Optional[str]) – if not None, the name of the reader() argument that will receive the kickoff datetime of the scheduled launchplan.

  • activate_on_deploy (bool) – Whether or not to automatically activate this schedule on app deployment.

  • launchplan_kwargs (Optional[dict]) – additional keyword arguments to pass to LaunchPlan

  • model_object (Optional[Any]) – model object to use for prediction.

  • model_version (Optional[str]) – model version identifier to use for prediction.

  • app_version (Optional[str]) – if model_version is specified, this argument indicates the app version to use for fetching the model artifact.

  • model_file (Union[str, Path, None]) – a filepath to a serialized model object.

  • loader_kwargs (Optional[dict]) – additional keyword arguments to be forwarded to the unionml.model.Model.loader() function.

  • reader_kwargs – keyword arguments that correspond to the unionml.dataset.Dataset.reader() method signature.
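To make the interaction between the scheduling arguments concrete, the sketch below normalizes them with a small stand-alone helper. The helper check_schedule_args is hypothetical, and treating expression and fixed_rate as mutually exclusive is an assumption that this reference does not state. Dropping offset when no expression is given mirrors the note that offset is only used with expression.

```python
from datetime import timedelta


def check_schedule_args(name, expression=None, offset=None, fixed_rate=None):
    """Hypothetical helper: validate and normalize schedule arguments."""
    # Assumption: exactly one of expression or fixed_rate defines the schedule.
    if (expression is None) == (fixed_rate is None):
        raise ValueError("specify exactly one of expression or fixed_rate")
    if expression is None:
        # offset only applies to cron-style schedules, so it is dropped here.
        offset = None
    return {
        "name": name,
        "expression": expression,
        "offset": offset,
        "fixed_rate": fixed_rate,
    }


# A cron expression that fires every day at midnight.
daily = check_schedule_args("daily-predictions", expression="0 0 * * *")

# A fixed-rate schedule that fires every six hours; the offset is ignored.
six_hourly = check_schedule_args(
    "six-hourly-predictions", fixed_rate=timedelta(hours=6), offset="P1D"
)
```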

schedule_training(name, *, expression=None, offset=None, fixed_rate=None, reader_time_arg=None, activate_on_deploy=True, launchplan_kwargs=None, hyperparameters=None, loader_kwargs=None, splitter_kwargs=None, parser_kwargs=None, trainer_kwargs=None, **reader_kwargs)#

Schedule the training service when the UnionML app is deployed.

Parameters:
  • name (str) – unique name of the launch plan

  • expression (Optional[str]) – a cron expression or a valid croniter schedule, e.g. @daily, @hourly, @weekly, @yearly.

  • offset (Optional[str]) – duration to offset the schedule; must be a valid ISO 8601 duration (https://en.wikipedia.org/wiki/ISO_8601). Only used if expression is specified.

  • fixed_rate (Optional[timedelta]) – a timedelta object representing fixed rate with which to run the workflow.

  • reader_time_arg (Optional[str]) – if not None, the name of the reader() argument that will receive the kickoff datetime of the scheduled launchplan.

  • activate_on_deploy (bool) – Whether or not to automatically activate this schedule on app deployment.

  • launchplan_kwargs (Optional[dict]) – additional keyword arguments to pass to LaunchPlan

  • hyperparameters (Optional[Dict[str, Any]]) – a dictionary mapping hyperparameter names to values. This is passed into the init callable to initialize a model object.

  • loader_kwargs (Optional[Dict[str, Any]]) – keyword arguments to pass to the registered unionml.Dataset.loader() function. This will override any defaults set in the function definition.

  • splitter_kwargs (Optional[Dict[str, Any]]) – keyword arguments to pass to the registered unionml.Dataset.splitter() function. This will override any defaults set in the function definition.

  • parser_kwargs (Optional[Dict[str, Any]]) – keyword arguments to pass to the registered unionml.Dataset.parser() function. This will override any defaults set in the function definition.

  • trainer_kwargs (Optional[Dict[str, Any]]) – a dictionary mapping training parameter names to values. These training parameters are determined by the keyword-only arguments of the model.trainer function.

  • reader_kwargs – keyword arguments that correspond to the unionml.dataset.Dataset.reader() method signature.

serve(app, remote=False, app_version=None, model_version='latest')#

Create a FastAPI serving app.

Parameters:

app (FastAPI) – A FastAPI app to use for model serving.

train(hyperparameters=None, loader_kwargs=None, splitter_kwargs=None, parser_kwargs=None, trainer_kwargs=None, **reader_kwargs)#

Train a model object locally.

Parameters:
  • hyperparameters (Optional[Dict[str, Any]]) – a dictionary mapping hyperparameter names to values. This is passed into the init callable to initialize a model object.

  • loader_kwargs (Optional[Dict[str, Any]]) – keyword arguments to pass to the registered unionml.dataset.Dataset.loader() function. This will override any defaults set in the function definition.

  • splitter_kwargs (Optional[Dict[str, Any]]) – keyword arguments to pass to the registered unionml.dataset.Dataset.splitter() function. This will override any defaults set in the function definition.

  • parser_kwargs (Optional[Dict[str, Any]]) – keyword arguments to pass to the registered unionml.dataset.Dataset.parser() function. This will override any defaults set in the function definition.

  • trainer_kwargs (Optional[Dict[str, Any]]) – a dictionary mapping training parameter names to values. These training parameters are determined by the keyword-only arguments of the model.trainer function.

  • reader_kwargs – keyword arguments that correspond to the unionml.Dataset.reader() method signature.

Return type:

Tuple[Any, Any]

The train method invokes an execution graph that composes together the following functions to train and evaluate a model:

  • unionml.Dataset.reader()

  • unionml.Dataset.loader()

  • unionml.Dataset.splitter()

  • unionml.Dataset.parser()

  • unionml.Model.trainer()

  • unionml.Model.predictor()

  • unionml.Model.evaluator()
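The composition listed above can be pictured with plain Python functions. The sketch below is a toy stand-in, not UnionML code: each function is a hypothetical placeholder named after the corresponding step, the data and the one-parameter "model" are invented for illustration, and the exact way UnionML wires the steps together is an assumption. It returns a pair of model object and metrics, in the spirit of the Tuple[Any, Any] return type.

```python
def reader(n_samples):
    """Produce raw data: (x, y) pairs with y = 2x."""
    return [(x, 2 * x) for x in range(n_samples)]


def loader(raw):
    """Load raw data into an in-memory structure."""
    return list(raw)


def splitter(data, test_size):
    """Split the data into train and test partitions."""
    n_test = round(len(data) * test_size)
    n_train = len(data) - n_test
    return data[:n_train], data[n_train:]


def parser(partition):
    """Separate features from targets."""
    features = [x for x, _ in partition]
    targets = [y for _, y in partition]
    return features, targets


def trainer(model_obj, features, targets):
    """Fit a slope through the origin by least squares."""
    numerator = sum(x * y for x, y in zip(features, targets))
    denominator = sum(x * x for x in features)
    model_obj["slope"] = numerator / denominator
    return model_obj


def predictor(model_obj, features):
    """Generate predictions from the model object."""
    return [model_obj["slope"] * x for x in features]


def evaluator(model_obj, features, targets):
    """Mean absolute error of the predictions."""
    predictions = predictor(model_obj, features)
    return sum(abs(p - t) for p, t in zip(predictions, targets)) / len(targets)


def train_sketch(n_samples, test_size):
    """Chain the steps in the listed order and return (model object, metrics)."""
    train_part, test_part = splitter(loader(reader(n_samples)), test_size)
    train_features, train_targets = parser(train_part)
    test_features, test_targets = parser(test_part)
    model_obj = trainer({}, train_features, train_targets)
    metrics = {
        "train": evaluator(model_obj, train_features, train_targets),
        "test": evaluator(model_obj, test_features, test_targets),
    }
    return model_obj, metrics


model_obj, metrics = train_sketch(n_samples=10, test_size=0.2)
```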

train_task()#

Create a Flyte task for training a model object.

This is used in the Flyte workflow produced by train_workflow.

train_workflow()#

Create a Flyte training workflow for model object training.

property train_workflow_name#

Name of the training workflow.

trainer(fn=None, **train_task_kwargs)#

Register a function for training a model object.

This function is the primary entrypoint for defining your application’s model-training behavior.

See the User Guide for more.

Parameters:
  • fn (Optional[Callable]) – function to use as the trainer.

  • train_task_kwargs – keyword arguments to pass into the flytekit task that will be composed of the input fn and functions defined in the bound Dataset.
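The signature trainer(fn=None, **train_task_kwargs) suggests the decorator can be applied both bare and with keyword arguments. The sketch below shows one common way to write such a decorator; RegistrySketch is a hypothetical class, not UnionML's implementation, and the keyword values passed in the second usage are illustrative only.

```python
import functools


class RegistrySketch:
    """Hypothetical stand-in for an object that registers a trainer function."""

    def __init__(self):
        self._trainer = None
        self._train_task_kwargs = {}

    def trainer(self, fn=None, **train_task_kwargs):
        if fn is None:
            # Called with keyword arguments only: return a decorator
            # that remembers them and waits for the function.
            return functools.partial(self.trainer, **train_task_kwargs)
        # Called with the function: register it together with the kwargs.
        self._trainer = fn
        self._train_task_kwargs = train_task_kwargs
        return fn


registry = RegistrySketch()


@registry.trainer
def fit_plain(model_obj, features, target):
    return model_obj


# Snapshot taken after the bare form registered fit_plain.
bare_kwargs = dict(registry._train_task_kwargs)


@registry.trainer(cache=True, cache_version="1")
def fit_cached(model_obj, features, target):
    return model_obj
```

Returning fn unchanged keeps the decorated function callable as an ordinary Python function, which is convenient for testing.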

property trainer_params: Dict[str, Parameter]#

Parameters used to create a Flyte workflow for model object training.
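Elsewhere on this page, training parameters are described as being determined by the keyword-only arguments of the trainer function. The sketch below shows how such arguments can be extracted with the standard inspect module. The trainer function and the keyword_only_params helper are hypothetical, and whether trainer_params is built exactly this way is an assumption.

```python
import inspect


def trainer(model_obj, features, target, *, n_epochs: int = 10, learning_rate: float = 0.01):
    """Hypothetical trainer: everything after * is keyword-only."""
    return model_obj


def keyword_only_params(fn):
    """Collect the keyword-only parameters of fn, keyed by name."""
    signature = inspect.signature(fn)
    return {
        name: param
        for name, param in signature.parameters.items()
        if param.kind is inspect.Parameter.KEYWORD_ONLY
    }


params = keyword_only_params(trainer)

# Each value is an inspect.Parameter carrying the annotation and default.
defaults = {name: param.default for name, param in params.items()}
```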

property training_schedule_names: List[str]#

Names of all the training schedules.

property training_schedules: List[Schedule]#

Scheduled training jobs.