
Scikit Learn: Quick Start

In this tutorial, we are going to train a small scikit-learn classifier on the iris dataset and deploy it to Model Zoo to make predictions via HTTP.

You can follow along with this tutorial in any Python environment you’re comfortable with, such as a Python IDE, a Jupyter notebook, or a Python terminal. The easiest option is to open this tutorial directly in Colab:

Open In Colab

Installation

Install the Model Zoo client library via pip:

!pip install modelzoo-client[sklearn]

To deploy and use your own models, you’ll need to create an account and configure an API key. You can do so from the command line:

!modelzoo auth

Train

First, we need to train a model. For the sake of this quickstart, we’ll train a simple logistic regression model on the iris dataset.

import sklearn.datasets
import sklearn.linear_model
import sklearn.model_selection

# Load and split data into train and test sets.
iris = sklearn.datasets.load_iris()
X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(
    iris.data, iris.target, test_size=0.1
)

# Train the logistic regression model.
estimator = sklearn.linear_model.LogisticRegression()
estimator.fit(X_train, y_train)
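
Before deploying, it can be useful to check locally that the trained model behaves sensibly. The following is a minimal, self-contained sketch of such a check using only scikit-learn. The random_state and max_iter values are assumptions added here for reproducibility and to avoid a possible convergence warning; they are not part of the original tutorial code.

```python
import sklearn.datasets
import sklearn.linear_model
import sklearn.metrics
import sklearn.model_selection

# Load and split the data. random_state is an assumption added so the split
# is reproducible; the tutorial's own split is random.
iris = sklearn.datasets.load_iris()
X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(
    iris.data, iris.target, test_size=0.1, random_state=0
)

# max_iter is raised (also an assumption) to give the solver room to converge.
estimator = sklearn.linear_model.LogisticRegression(max_iter=1000)
estimator.fit(X_train, y_train)

# Predict on the held-out test set and compare against the true labels.
local_prediction = estimator.predict(X_test)
local_accuracy = sklearn.metrics.accuracy_score(y_test, local_prediction)
print("Local test accuracy:", local_accuracy)
```

A local score like this gives a baseline to compare against the predictions returned by the deployed model later on.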

Deploy

To deploy this model to a production-ready HTTP endpoint, use the modelzoo.sklearn.deploy() function, which accepts any scikit-learn estimator object directly.

import modelzoo.sklearn

model_name = modelzoo.sklearn.deploy(estimator)

That’s all there is to it! Behind the scenes, Model Zoo serialized your model via joblib, uploaded it to object storage, and deployed a serverless lambda function to serve requests to this model. If you’d like, take some time to explore the model via the Web UI link. There you’ll be able to modify documentation, test the model with raw or visual inputs, and monitor metrics and logs. By default, only your account (or anybody you share your API key with) will be able to access this model.
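
To make the serialization step more concrete, here is a rough sketch of a joblib round trip for a fitted estimator. It only illustrates the general technique of dumping a model to bytes and loading it back. How Model Zoo actually packages and uploads the model is not shown in this tutorial, so the in-memory buffer and variable names below are assumptions for illustration.

```python
import io

import joblib
import numpy as np
import sklearn.datasets
import sklearn.linear_model

# Train a small model to serialize (max_iter is an illustrative choice).
iris = sklearn.datasets.load_iris()
estimator = sklearn.linear_model.LogisticRegression(max_iter=1000)
estimator.fit(iris.data, iris.target)

# Serialize the fitted estimator into an in-memory buffer.
# A hosted service would presumably upload bytes like these to storage.
buffer = io.BytesIO()
joblib.dump(estimator, buffer)
serialized_size = len(buffer.getvalue())

# Rewind the buffer and load the estimator back.
buffer.seek(0)
restored = joblib.load(buffer)

# The restored estimator should produce the same predictions as the original.
same_predictions = np.array_equal(
    estimator.predict(iris.data), restored.predict(iris.data)
)
print("Serialized size in bytes:", serialized_size)
print("Predictions match after round trip:", same_predictions)
```

Note that joblib files are pickle-based, so a serialized model should only be loaded from sources you trust, and ideally with the same scikit-learn version that produced it.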

You can specify the name of the model you’d like to deploy via a model_name argument. If a name is omitted, Model Zoo will choose a unique one for you. Model names must be unique to your account.

Predict

Next, we’ll use the Python client library to query the model for predictions. modelzoo.sklearn.predict() requires the model_name and an input NumPy array. Under the hood, the client library queries the model endpoint for a result. Let’s use the test set we split off earlier and compare the predictions to the ground truth labels.

import sklearn.metrics

result = modelzoo.sklearn.predict(model_name, X_test)
prediction = result.get("prediction")

print("Test label ground truth: ", y_test)
print("Model prediction: ", prediction)
print("Test accuracy: ", sklearn.metrics.accuracy_score(y_test, prediction))

You can also request a list of the raw probabilities for class labels on each input instance by using the return_probabilities flag:

result = modelzoo.sklearn.predict(model_name, X_test, return_probabilities=True)
print(result["probabilities"])
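
To get a feel for what per-class probabilities look like, the sketch below computes them locally with scikit-learn's predict_proba. It assumes the hosted model returns values of the same kind, which this tutorial does not state explicitly. Each row holds one probability per class and sums to 1, and the most likely class in each row matches the label returned by predict.

```python
import numpy as np
import sklearn.datasets
import sklearn.linear_model

# Train a small model locally (max_iter is an illustrative choice).
iris = sklearn.datasets.load_iris()
estimator = sklearn.linear_model.LogisticRegression(max_iter=1000)
estimator.fit(iris.data, iris.target)

# One row per input instance, one column per class label.
samples = iris.data[:5]
probabilities = estimator.predict_proba(samples)

# Each row is a probability distribution over the classes.
row_sums = probabilities.sum(axis=1)

# The class with the highest probability is the predicted label.
most_likely = estimator.classes_[np.argmax(probabilities, axis=1)]

print("Probabilities shape:", probabilities.shape)
print("Most likely classes:", most_likely)
```

Raw probabilities are useful when you want to apply your own decision threshold or report the model's confidence alongside each prediction.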

Great! At this point, we’ve successfully used our deployed scikit-learn classifier to make predictions on test data.

Interested in what you’ve seen and want to test drive an unlimited version of Model Zoo? Apply to our private beta and reach out at contact@modelzoo.dev to learn more.