Deployment Guides

Currently, UnionML apps support two core types of machine learning microservices: model training and model serving.

Production Training and Batch Predictions

UnionML uses flytekit under the hood to execute your training and prediction workflows locally. To take advantage of the reproducibility and scalability that UnionML offers, deploy your workflows to a production-grade Flyte cluster.

Deploy to a Flyte Cluster: Deploy training and prediction services to a Flyte cluster.

Serving Online Predictions

Once you have a trained model object that you want to serve in production, you can:

Serve with FastAPI: Stand up an online prediction service with FastAPI.
Serve with AWS Lambda: Create an online prediction serverless endpoint with AWS Lambda.
Serve with BentoML: Leverage BentoML to deploy a prediction service to a wide variety of cloud platforms.

Serving Reactive Predictions

Some predictive applications need to react to events that occur in an external system:

Reacting to S3 Events: Generate predictions in response to files being uploaded to a specified S3 path.
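
To give a rough idea of what an event-driven prediction service looks like, here is a minimal sketch of a handler that reads an S3 event notification and runs a predictor once per uploaded file. It is not UnionML's API: the names `extract_s3_objects`, `handler`, the `predict` callable, and the `features/` prefix are all hypothetical and chosen for illustration. The only thing taken from S3 itself is the shape of the event payload (`Records[].s3.bucket.name` and `Records[].s3.object.key`, with URL-encoded keys). See the linked guide for the actual setup.

```python
from urllib.parse import unquote_plus


def extract_s3_objects(event, prefix=""):
    """Return (bucket, key) pairs for S3 records whose key starts with `prefix`."""
    objects = []
    for record in event.get("Records", []):
        s3_info = record.get("s3")
        if s3_info is None:
            continue  # skip records that did not come from S3
        bucket = s3_info["bucket"]["name"]
        # Object keys are URL-encoded in S3 event notifications.
        key = unquote_plus(s3_info["object"]["key"])
        if key.startswith(prefix):
            objects.append((bucket, key))
    return objects


def handler(event, context=None, predict=None, prefix="features/"):
    """Hypothetical entry point: call `predict` once per newly uploaded file.

    `predict` is an assumed callable taking (bucket, key); in a real app it
    would load the file and call the trained model's prediction function.
    """
    if predict is None:
        # Placeholder predictor so the sketch runs without a model.
        def predict(bucket, key):
            return None

    return {
        f"s3://{bucket}/{key}": predict(bucket, key)
        for bucket, key in extract_s3_objects(event, prefix)
    }
```

Filtering on a key prefix inside the handler is a defensive choice in this sketch: it keeps the function from generating predictions for unrelated objects even if the bucket notification is configured more broadly than intended.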