Local Training and Prediction#
In Binding a Model and Dataset together, we defined a Model and Dataset object,
bound them together, and defined the core functions needed for model training and prediction.
In this guide, we’ll learn how to interact with these objects locally to ensure that our code is working as expected.
Note
Local interaction with Model objects is mainly useful for local development, debugging, and
unit testing of your UnionML app.
Here’s our complete UnionML app for digit classification in an app.py script:
from typing import List

import pandas as pd
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

from unionml import Dataset, Model

dataset = Dataset(name="digits_dataset", test_size=0.2, shuffle=True, targets=["target"])
model = Model(name="digits_classifier", init=LogisticRegression, dataset=dataset)


@dataset.reader
def reader(sample_frac: float = 1.0, random_state: int = 12345) -> pd.DataFrame:
    data = load_digits(as_frame=True).frame
    return data.sample(frac=sample_frac, random_state=random_state)


@model.trainer
def trainer(estimator: LogisticRegression, features: pd.DataFrame, target: pd.DataFrame) -> LogisticRegression:
    return estimator.fit(features, target.squeeze())


@model.predictor
def predictor(estimator: LogisticRegression, features: pd.DataFrame) -> List[float]:
    return [float(x) for x in estimator.predict(features)]


@model.evaluator
def evaluator(estimator: LogisticRegression, features: pd.DataFrame, target: pd.DataFrame) -> float:
    return float(accuracy_score(target.squeeze(), predictor(estimator, features)))
Execute as a Python Module#
We can invoke the model.train method to train the sklearn estimator and the model.predict
method to generate predictions. Add the following block to the end of app.py, then run the script with python app.py:
if __name__ == "__main__":
    model_object, metrics = model.train(
        hyperparameters={"C": 1.0, "max_iter": 10000},
        sample_frac=1.0,
        random_state=12345,
    )
    predictions = model.predict(
        features=load_digits(as_frame=True).frame.sample(5, random_state=42)
    )

    print(f"model object: {model_object}")
    print(f"training metrics: {metrics}")
    print(f"predictions: {predictions}")

    # save model to a file, using joblib as the default serialization format
    model.save("/tmp/model_object.joblib")
model object: LogisticRegression(max_iter=10000)
training metrics: {'train': 1.0, 'test': 0.9694444444444444}
predictions: [6.0, 9.0, 3.0, 7.0, 2.0]
Note
You may notice a few things about the code example above:
- The model.train method takes the dataset.reader arguments as keyword-only arguments. In this case, the sample_frac and random_state values are passed into model.train and forwarded to dataset.reader.
- model.train returns a model instance and a dictionary of metrics.
  - The model instance is the same type as the return annotation of the model.trainer function, which in this case is LogisticRegression.
  - The metrics dictionary maps dataset split keys {"train", "test"} to metrics of the same type as the return annotation of the model.evaluator, which in this case is a float.
- The model.predict method accepts a features keyword argument containing the features of the same type defined in the model.predictor function.
- At the end of the file we save the model object to a file called /tmp/model_object.joblib. This is simply an sklearn base estimator that you know and love!
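To make the keyword forwarding and the return values described above concrete, here is a minimal sketch that uses only the standard library. It is not UnionML's implementation: toy_reader and toy_train are hypothetical stand-ins, the "model" is just the mean of the training rows, and the metric is mean absolute error rather than accuracy.

```python
import inspect
import random
from typing import Any, Dict, List, Tuple


def toy_reader(sample_frac: float = 1.0, random_state: int = 12345) -> List[int]:
    """Hypothetical stand-in for a dataset reader: samples a fraction of 100 rows."""
    rows = list(range(100))
    rng = random.Random(random_state)
    return rng.sample(rows, k=int(len(rows) * sample_frac))


def toy_train(hyperparameters: Dict[str, Any], **reader_kwargs: Any) -> Tuple[Dict[str, Any], Dict[str, float]]:
    """Hypothetical stand-in for a train method that forwards extra keywords to the reader."""
    # Reject keywords the reader does not declare, so typos fail loudly.
    allowed = set(inspect.signature(toy_reader).parameters)
    unknown = set(reader_kwargs) - allowed
    if unknown:
        raise TypeError(f"unexpected reader arguments: {sorted(unknown)}")

    rows = toy_reader(**reader_kwargs)
    split = int(len(rows) * 0.8)  # 80/20 split, mirroring test_size=0.2
    splits = {"train": rows[:split], "test": rows[split:]}

    # The "model" is the training mean; the metric is mean absolute error per split.
    mean = sum(splits["train"]) / len(splits["train"])
    model_object = {"mean": mean, **hyperparameters}
    metrics = {name: sum(abs(x - mean) for x in part) / len(part) for name, part in splits.items()}
    return model_object, metrics


# sample_frac and random_state are not parameters of toy_train itself;
# they travel through **reader_kwargs to toy_reader.
model_object, metrics = toy_train({"C": 1.0}, sample_frac=0.5, random_state=42)
print(sorted(metrics))  # ['test', 'train']
```

Passing a keyword that the reader does not declare, such as sample_fraction, raises a TypeError in this sketch. How UnionML itself handles that case is not covered in this guide.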
Serve with FastAPI#
UnionML integrates with FastAPI to make model serving super easy. Simply
create a FastAPI app and pass it into model.serve in the app.py script:
from fastapi import FastAPI
# dataset and model definition
...
app = FastAPI()
model.serve(app)
model.serve takes the FastAPI app and automatically creates a /predict endpoint that you can
invoke with HTTP requests.
Start the server with unionml serve:
unionml serve app:app --model-path /tmp/model_object.joblib --reload
Note
The --model-path option points to a local file containing the serialized model object that
we created above when we executed the UnionML app script.
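Because the saved file is described above as a joblib-serialized sklearn estimator, one way to sanity-check such a file before serving it is to round-trip it with joblib. The sketch below assumes scikit-learn and joblib are installed. To stay self-contained it trains and dumps its own small estimator into a temporary directory instead of reading /tmp/model_object.joblib, and it calls joblib directly rather than model.save.

```python
import os
import tempfile

import joblib
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression

# Train on the first 500 rows only, to keep the sketch fast.
X, y = load_digits(return_X_y=True)
X, y = X[:500], y[:500]
estimator = LogisticRegression(max_iter=10000).fit(X, y)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model_object.joblib")
    joblib.dump(estimator, path)    # write the estimator to disk
    restored = joblib.load(path)    # read it back as a plain sklearn object

# The restored estimator should behave exactly like the original one.
same = bool((restored.predict(X[:5]) == estimator.predict(X[:5])).all())
print(type(restored).__name__, same)
```

If the loaded object is not the estimator type you expect, that is a sign that --model-path points to the wrong file.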
Once the server’s started, you can use the Python requests library or any other HTTP library
to get predictions from input features. For example, you can copy the following code into a client.py
script to generate predictions from the endpoint:
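import requests
from sklearn.datasets import load_digits

# sample five rows to send as a list of records in the JSON payload
features = load_digits(as_frame=True).frame.sample(5, random_state=42)
# generate predictions
response = requests.post(
    "http://127.0.0.1:8000/predict",
    json={"features": features.to_dict(orient="records")},
)
print(response.text)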
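Since any HTTP library works, the same request can also be put together with the standard library alone. This is a sketch under assumptions: build_predict_request is a hypothetical helper, the URL and the {"features": [...]} body shape are copied from the example above, and the two rows are made-up values with only two columns. The request is built but not sent, so the snippet runs without a server.

```python
import json
import urllib.request
from typing import Any, Dict, List


def build_predict_request(
    records: List[Dict[str, Any]],
    url: str = "http://127.0.0.1:8000/predict",
) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the prediction endpoint."""
    body = json.dumps({"features": records}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Illustrative rows in the list-of-records shape that
# DataFrame.to_dict(orient="records") produces.
records = [{"pixel_0_0": 0.0, "pixel_0_1": 5.0}, {"pixel_0_0": 1.0, "pixel_0_1": 3.0}]
request = build_predict_request(records)

# With the server running, the request could be sent like this:
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(response.read().decode("utf-8"))
```

Building the request separately from sending it makes the payload easy to inspect or unit test before the server is involved.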
Note
Predictions from the /predict endpoint are computed on the app server itself, which, in this case, is probably
your laptop 💻. You’ll need to ensure that your prediction server has the resources needed to load the model
into memory and generate predictions.
Next#
We’ve run our training and prediction code by invoking our UnionML app as a Python module and starting a local FastAPI server, but how do we deploy it as a suite of integrated machine learning services in the ☁️ cloud?
UnionML is coupled with Flyte, which is a scalable, reliable, and robust orchestration platform for data processing and machine learning. But before we deploy to the cloud, it’s important to understand what a Flyte cluster is by spinning one up locally.