Local Training and Prediction#

In Binding a Model and Dataset together, we defined a Model and Dataset object, bound them together, and defined the core functions needed for model training and prediction.

In this guide, we’ll learn how to interact with these objects locally to ensure that our code is working as expected.


Local interaction with Model objects is mainly useful for local development, debugging, and unit testing of your UnionML app.

Here’s our complete UnionML app for digit classification in an app.py script:

from typing import List

import pandas as pd
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

from unionml import Dataset, Model

dataset = Dataset(name="digits_dataset", test_size=0.2, shuffle=True, targets=["target"])
model = Model(name="digits_classifier", init=LogisticRegression, dataset=dataset)

@dataset.reader
def reader(sample_frac: float = 1.0, random_state: int = 12345) -> pd.DataFrame:
    data = load_digits(as_frame=True).frame
    return data.sample(frac=sample_frac, random_state=random_state)

@model.trainer
def trainer(estimator: LogisticRegression, features: pd.DataFrame, target: pd.DataFrame) -> LogisticRegression:
    return estimator.fit(features, target.squeeze())

@model.predictor
def predictor(estimator: LogisticRegression, features: pd.DataFrame) -> List[float]:
    return [float(x) for x in estimator.predict(features)]

@model.evaluator
def evaluator(estimator: LogisticRegression, features: pd.DataFrame, target: pd.DataFrame) -> float:
    return float(accuracy_score(target.squeeze(), predictor(estimator, features)))

Execute as a Python Module#

We can then invoke the model.train method to train the sklearn estimator and model.predict to generate predictions, then run the app script with python app.py:

if __name__ == "__main__":
    model_object, metrics = model.train(
        hyperparameters={"C": 1.0, "max_iter": 10000},
        sample_frac=1.0,
        random_state=12345,
    )
    predictions = model.predict(
        features=load_digits(as_frame=True).frame.sample(5, random_state=42)
    )

    print(f"model object: {model_object}")
    print(f"training metrics: {metrics}")
    print(f"predictions: {predictions}")

    # save model to a file, using joblib as the default serialization format
    model.save("/tmp/model_object.joblib")

Running python app.py should print something like:
model object: LogisticRegression(max_iter=10000.0)
training metrics: {'train': 1.0, 'test': 0.9694444444444444}
predictions: [6.0, 9.0, 3.0, 7.0, 2.0]


You may notice a few things about the code example above:

  • The model.train method takes the dataset.reader arguments as keyword-only arguments. In this case, the sample_frac and random_state values are passed into model.train and forwarded to dataset.reader.

  • model.train returns a model instance and a dictionary of metrics.

    • The model instance is the same type as the return annotation of the model.trainer function, which in this case is LogisticRegression.

    • The metrics dictionary maps dataset split keys {"train", "test"} to metrics of the same type as the return annotation of the model.evaluator, which in this case is a float.

  • The model.predict method accepts a features keyword argument containing features of the same type as the features argument defined in the model.predictor function.

  • At the end of the script, we save the model object via model.save to a file called /tmp/model_object.joblib. This is simply the sklearn estimator that you know and love!
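Because the saved artifact is just a joblib-serialized sklearn estimator, you can sanity-check it outside of UnionML entirely. Here’s a minimal sketch (assuming you’ve already run python app.py so that /tmp/model_object.joblib exists):

import joblib

from sklearn.datasets import load_digits

# load the raw sklearn estimator that UnionML serialized with joblib
estimator = joblib.load("/tmp/model_object.joblib")

# build a small feature frame (dropping the target column) and predict with sklearn directly
features = load_digits(as_frame=True).frame.drop(columns=["target"]).sample(5, random_state=42)
predictions = estimator.predict(features)

# quick smoke test: we sampled five rows, so we expect five predictions
assert len(predictions) == 5
print(predictions)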

Serve with FastAPI#

UnionML integrates with FastAPI to make model serving super easy. Simply create a FastAPI app and pass it into model.serve in the app.py script:

from fastapi import FastAPI

# dataset and model definition

app = FastAPI()
model.serve(app)


model.serve will take the FastAPI app and automatically create a /predict/ endpoint that you can invoke with HTTP requests.

Start the server with the unionml serve command:

unionml serve app:app --model-path /tmp/model_object.joblib --reload


The --model-path option points to a local file containing the serialized model object that we created above when we executed the UnionML app script.

Once the server’s started, you can use the Python requests library or any other HTTP library to get predictions from input features. For example, you can copy the following code into a client.py script to generate predictions from the endpoint:

import requests

from sklearn.datasets import load_digits

# generate predictions, assuming the server is running locally on the default port 8000
prediction_response = requests.post(
    "http://127.0.0.1:8000/predict",
    json={"features": load_digits(as_frame=True).frame.sample(5, random_state=42).to_dict(orient="records")},
)

print(prediction_response.json())
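Assuming the server is up and has loaded the model artifact, prediction_response.json() should contain five floating point values, matching the List[float] return annotation of the model.predictor function.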


The /predict endpoint computation happens on the app server itself, which, in this case, is probably your laptop 💻. You’ll need to ensure that your prediction server has the resources needed to load the model into memory and generate predictions.


We’ve run our training and prediction code by invoking our UnionML app as a Python module and starting a local FastAPI server, but how do we deploy it as a suite of integrated machine learning services in the ☁️ cloud?

UnionML is tightly coupled with Flyte, a scalable, reliable, and robust orchestration platform for data processing and machine learning. But before we deploy to the cloud, it’s important to understand what a Flyte cluster is by spinning up a Flyte Cluster locally.