UnionML

The easiest way to build and deploy machine learning microservices



UnionML is an open-source MLOps framework that reduces the boilerplate, complexity, and friction that come with building models and deploying them to production. Taking inspiration from web protocols, UnionML asks the question:

Is it possible to define a standard set of functions/methods for machine learning that can be reused in many different contexts, from model training to prediction?

UnionML aims to unify the ever-evolving ecosystem of machine learning and data tools into a single interface for expressing microservices as Python functions.

You can create UnionML Apps by defining a few core methods that are automatically bundled into ML microservices, starting with model training and offline/online prediction:

%%{init: {'theme':'default'}}%%
flowchart LR
    A[UnionML App]
    subgraph methods
        Rm[reader]
        Tm[trainer]
        Pm[predictor]
        Em[...]
    end
    subgraph microservices
        T[train]
        Pb[batch predict]
        Po[online predict]
        E[...]
    end
    Rm --> A
    Tm --> A
    Pm --> A
    Em --> A
    A --> T
    A --> Pb
    A --> Po
    A --> E
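To make the idea of bundling a few registered functions into services concrete, here is a minimal sketch of the decorator-registration pattern in plain Python. The `MiniApp` class and every name in it are hypothetical and exist only for illustration; this is not UnionML's implementation or API.

```python
from typing import Any, Callable, List, Optional


class MiniApp:
    """Toy stand-in for an app object (hypothetical, NOT the UnionML API)."""

    def __init__(self) -> None:
        self._reader: Optional[Callable[[], List[dict]]] = None
        self._trainer: Optional[Callable[[List[dict]], Any]] = None
        self._predictor: Optional[Callable[[Any, List[dict]], List[Any]]] = None
        self.artifact: Any = None

    # Each decorator stores the user function and returns it unchanged.
    def reader(self, fn):
        self._reader = fn
        return fn

    def trainer(self, fn):
        self._trainer = fn
        return fn

    def predictor(self, fn):
        self._predictor = fn
        return fn

    # "Services" are composed from whatever functions were registered.
    def train(self) -> Any:
        self.artifact = self._trainer(self._reader())
        return self.artifact

    def predict(self, features: List[dict]) -> List[Any]:
        return self._predictor(self.artifact, features)


app = MiniApp()


@app.reader
def read_rows() -> List[dict]:
    return [{"x": 1.0, "y": 2.0}, {"x": 2.0, "y": 4.0}, {"x": 3.0, "y": 6.0}]


@app.trainer
def fit_slope(rows: List[dict]) -> float:
    # Least-squares slope through the origin: sum(x*y) / sum(x*x).
    return sum(r["x"] * r["y"] for r in rows) / sum(r["x"] ** 2 for r in rows)


@app.predictor
def apply_slope(slope: float, features: List[dict]) -> List[float]:
    return [slope * f["x"] for f in features]


slope = app.train()
predictions = app.predict([{"x": 10.0}])
print(slope, predictions)
```

The user writes only the three small functions; the object that owns the decorators decides how to wire them into training and prediction entry points.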

Brought to you by the Union.ai team, UnionML is built on top of Flyte to provide a high-level interface for productionizing your ML models so that you can focus on curating a better dataset and improving your models.

Installation

pip install unionml

Quickstart

A UnionML app is composed of two core classes: a Dataset and a Model.

In this example, we’ll build a minimal UnionML app that classifies images of handwritten digits into their corresponding digit labels. The same app can be written with sklearn, pytorch, or keras; the code shown here uses sklearn.

Create a Python file called app.py, import the app dependencies, and define the dataset and model objects.

from typing import List

import pandas as pd
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

from unionml import Dataset, Model

dataset = Dataset(name="digits_dataset", test_size=0.2, shuffle=True, targets=["target"])
model = Model(name="digits_classifier", init=LogisticRegression, dataset=dataset)
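The `test_size=0.2` and `shuffle=True` arguments describe a held-out split of the data. As a rough illustration of what such a split means, the sketch below implements one in plain Python. The helper `split_rows`, its `seed` parameter, and its rounding rule are assumptions made for this example; how UnionML performs the split internally is not shown on this page.

```python
import random
from typing import List, Sequence, Tuple


def split_rows(
    rows: Sequence[int], test_size: float, shuffle: bool, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Return (train, test) partitions of rows (hypothetical helper)."""
    indices = list(range(len(rows)))
    if shuffle:
        # A seeded generator keeps the example reproducible.
        random.Random(seed).shuffle(indices)
    n_test = int(round(len(rows) * test_size))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return [rows[i] for i in train_idx], [rows[i] for i in test_idx]


rows = list(range(10))
train_rows, test_rows = split_rows(rows, test_size=0.2, shuffle=True)
print(len(train_rows), len(test_rows))
```

With ten rows and `test_size=0.2`, two rows are held out for evaluation and the remaining eight are used for training.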

Define App Methods

Specify the core functions for training and prediction with the decorators exposed by the dataset and model objects:

@dataset.reader
def reader() -> pd.DataFrame:
    return load_digits(as_frame=True).frame


@model.trainer
def trainer(estimator: LogisticRegression, features: pd.DataFrame, target: pd.DataFrame) -> LogisticRegression:
    return estimator.fit(features, target.squeeze())


@model.predictor
def predictor(estimator: LogisticRegression, features: pd.DataFrame) -> List[float]:
    return [float(x) for x in estimator.predict(features)]


@model.evaluator
def evaluator(estimator: LogisticRegression, features: pd.DataFrame, target: pd.DataFrame) -> float:
    return float(accuracy_score(target.squeeze(), predictor(estimator, features)))
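The evaluator above reports accuracy, which is the fraction of predictions that equal the true label. The standalone sketch below computes the same quantity without sklearn, to show that the float-valued predictions returned by the predictor still compare equal to integer labels. The function name `accuracy` is made up for this example.

```python
from typing import Sequence


def accuracy(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Fraction of positions where the prediction equals the label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


# Integer labels compared against float predictions: 3 of 4 match.
score = accuracy([0, 1, 2, 3], [0.0, 1.0, 2.0, 8.0])
print(score)
```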

Train and Predict Locally

Invoke train() to train a model and predict() to generate predictions.

if __name__ == "__main__":
    model_object, metrics = model.train(hyperparameters={"C": 1.0, "max_iter": 10000})
    predictions = model.predict(features=load_digits(as_frame=True).frame.sample(5, random_state=42))
    print(model_object, metrics, predictions, sep="\n")

    # save model to a file, using joblib as the default serialization format
    model.save("/tmp/model_object.joblib")

Serve Seamlessly with FastAPI

UnionML integrates with FastAPI to automatically create /train/ and /predict/ endpoints.

Install unionml with the fastapi extra:

pip install unionml[fastapi]

Start a server with unionml serve and call the app endpoints with the requests library.

Bind a FastAPI app to the model object with model.serve:

from fastapi import FastAPI

app = FastAPI()
model.serve(app)

Start the server, assuming the UnionML app is in an app.py script:

unionml serve app:app --reload --model-path /tmp/model_object.joblib

Important

The first argument to unionml serve is a :-separated string where the first part is the module name of the app script, and the second part is the variable name of the FastAPI app.
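Strings of the form `module:variable` are a common convention among Python server tools. The sketch below shows one way such a string can be resolved to an object with `importlib`. The `resolve_app` helper is hypothetical and is not taken from UnionML's source; it is demonstrated on the standard-library `json` module so that it runs without an app.py file.

```python
import importlib
import json
from typing import Any


def resolve_app(spec: str) -> Any:
    """Resolve a 'module:variable' string to the named object (hypothetical helper)."""
    module_name, sep, attr_name = spec.partition(":")
    if not sep or not module_name or not attr_name:
        raise ValueError(f"expected 'module:variable', got {spec!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


# "json" is the module name, "dumps" is the variable name inside it.
resolved = resolve_app("json:dumps")
print(resolved is json.dumps)
```

Under this convention, `app:app` would mean "import the module `app` (from app.py) and look up the variable `app` inside it", which is why the script name and the FastAPI variable name both appear in the command.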

Then you can invoke the endpoints using the requests library, e.g. in a separate client.py script:

import requests
from sklearn.datasets import load_digits

digits = load_digits(as_frame=True)
features = digits.frame[digits.feature_names]


prediction_response = requests.post(
    "http://127.0.0.1:8000/predict",
    json={"features": features.sample(5, random_state=42).to_dict(orient="records")},
)

print(prediction_response.text)
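The request body is built with `to_dict(orient="records")`, which turns each DataFrame row into a dictionary keyed by column name. The sketch below builds a payload of the same shape by hand, using only the standard library. The two rows and the three-column subset are made-up values for illustration; the real digits features have 64 pixel columns.

```python
import json

# Illustrative subset of column names; values are invented for the example.
feature_names = ["pixel_0_0", "pixel_0_1", "pixel_0_2"]
rows = [[0.0, 5.0, 13.0], [0.0, 0.0, 12.0]]

# One dict per row, mirroring the shape of to_dict(orient="records").
records = [dict(zip(feature_names, row)) for row in rows]
payload = {"features": records}

# This is the JSON text that would be sent as the request body.
body = json.dumps(payload)
print(body)
```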

What Next?

Learn how to leverage the full power of UnionML 🦾 in the Basics guide.

Want to contribute?

Check out the Contributing Guide.