October 8, 2024
OpenAI’s Python API is quickly becoming one of the most-downloaded Python packages. With…
Recently I have been enjoying using Comet for my experiments and I am always surprised by the new features I discover. Today I would like to talk to you about the possibility provided by Comet to keep track of the Machine Learning model to send into production.
Suppose we run many different experiments to solve a certain problem. After several tests, we understand that the X model is the best and we want to choose it as the production model. Well, Comet allows us to do this thanks to the functionality provided by the Registry.
A Comet Registry is a place that stores all the registered models. A registered model is a model saved in a Comet project. There are at least two advantages of registering models in Comet:
To make a model available in the Comet Registry, firstly we need to register it. We can follow two strategies to register a model:
experiment.log_model(name, file_name)
— this method of the Experiment()
class logs the model as an artifact and then, manually, we need to add it to the Registry.experiment.register_model(MODEL_NAME)
— this method of Experiment()
class registers the full experiment and adds it to the Registry.In this article, I will describe how to add a model to the Registry through the log_model()
method. The article is organized as follows:
In this example, we will model the same dataset with two different Machine Learning models, and we will use Comet to select the best one. As a sample dataset, we will use the classical diabetes dataset provided by the scikit-learn Python package.
Firstly, we import the dataset:
from sklearn.datasets import load_diabetes diabetes_dataset = load_diabetes() X = diabetes_dataset.data y = diabetes_dataset.target
Then, we split it into training and test sets, through the train_test_split() function provided by scikit-learn:
from sklearn.model_selection import train_test_split X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=42)
Now we define a function, named run_experiment()
, that receives the name and the model object as input:
import numpy as np import pickle from comet_ml import Experiment from sklearn.linear_model import LinearRegression, LogisticRegression from sklearn import metrics def run_experiment(name, model): experiment = Experiment(api_key="MY_API_KEY", project_name="MY_PROJECT_NAME", workspace="MY_WORKSPACE") model.fit(X_train,y_train) file_name = name + '.pkl' with open(file_name, 'wb') as file: pickle.dump(model, file) y_pred = model.predict(X_test) RMSE = np.sqrt(metrics.mean_squared_error(y_test, y_pred)) experiment.log_metric("RMSE", RMSE) experiment.log_model(name, file_name)
The previous function performs the following operations:
Note that we have used the log_model()
method to log the model in Comet. In this case, we have logged only the model. If we wanted to log the whole experiment, we would have to use the register_model()
method.
Finally, we call the defined function to run two experiments, as follows:
model = LinearRegression() run_experiment('LinearRegression', model) model = LogisticRegression() run_experiment('LogisticRegression', model)
We have built a Linear Regression model and a Logistic Regression model. We run the code and access the results directly in Comet.
Which of the two will perform better? Let’s find out together!
We can compare the output of the two experiments directly in the Comet dashboard. We can select both the experiments and then compare the respective RMSE:
Video by Author
The second experiment (that corresponds to the Linear Regression) outperforms the first one.
Want to see more of Comet in action? Check out working sessions, demo videos, and more on our YouTube channel.
Under the Experiment section we have two experiments, one for the Linear Regression and the other for the Logistic Regression, as shown in the following figure:
Image by Author
We click on the first experiment, and we select the Assets & Artifacts tab.
Under the models directory, we can find the specific model file, as shown in the following figure:
Image by Author
We can download the model if we want. On the right part of the screen, there is a button, named Register. We can click it to add the model to the Registry. The following window opens:
Image by Author
We can add the model to an existing Registry or we can register a new model. In our case, we register a new model.
We can repeat the same procedure for the second experiment, but when we need to add the model to the Registry, we save it to the existing model, i.e. the previous one. In this case, we need to change the model version, e.g. 1.0.1.
We can now access the Model Registry from the Comet main dashboard. We need to exit the current project. We should have a view similar to the following one:
Image by Author
We click the View model button. We have the two models:
Image by Author
We can now set the stage of version 1.0.1, which corresponds to the linear regression, to production by clicking the arrow on the left, as shown in the following short video:
Video by Author
Congratulations! You have just learned how to use the Comet Registry to keep track of your best Machine Learning Model!
The use of the Comet Registry can help you to maintain your code well organized and ordered.
If you want to learn more about Comet, you can read my previous articles: