Integrate with MLflow¶
Comet has extensive support for users of MLflow.
Comet supports the use of MLflow through two different methods:
- Built-in, core Comet support for MLflow
- Comet for MLflow extension.
The following sections provide details of both methods.
Built-in, core Comet support for MLflow¶
If you're already using MLflow, then Comet will work with MLflow with no further configuration.
Run any MLflow script from the console, as follows:
comet python mlflow_script.py
Alternatively, you can add this one line of code to the top of your MLflow training script and run the script as you normally would:
import comet_ml
How it works¶
Comet's built-in, core support for MLflow attempts to create a live, online Experiment if a Comet API Key is configured. If a Comet API Key cannot be found, you will see the following log message:
No Comet API Key was found, creating an OfflineExperiment.
Set up your API Key to get the full Comet experience:
https://www.comet.com/docs/guides/experiment-management/configure-sdk/.
If no API key is found, the Comet SDK still creates an OfflineExperiment, so you still get all of Comet's additional tracking data. Just remember to upload the offline experiment archive later. At the end of the run, the script prints the exact command to run, similar to the following:
comet upload /path/to/archive.zip
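To have future runs tracked online from the start, you can set your API key as an environment variable before launching the script. A typical shell setup might look like the following; the key value is a placeholder:

```shell
# Make the Comet API key available to the SDK (placeholder value)
export COMET_API_KEY="YOUR-API-KEY"

# Run the MLflow script with live, online experiment tracking
comet python mlflow_script.py
```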
Any future Experiment runs created with this script automatically include Comet's extended Experiment tracking for MLflow.
Log automatically¶
When you run MLflow by importing comet_ml or by using the command-line comet python script.py, you automatically log all of the following items to Comet's [single experiment page](/docs/v2/guides/comet-ui/experiment-management/experiments-page/#the-single-experiment-page):
- Metrics: Logged to the metrics tab
- Hyperparameters: Logged to the hyperparameters tab
- Models: Logged to the assets tab
- Assets: Logged to the assets tab
- Source code: Logged to the code tab
- Git repo and patch info: Available by clicking the reproduce button
- System metrics
- CPU and GPU usage: Logged to the system metrics tab
- Python packages: Logged to the installed packages tab
- Command-line arguments
- Standard Output: Logged to the output tab
- Installed OS Packages: Available through the get_os_packages method
For more information about using environment parameters in Comet, see Configure Comet.
Info
For more information on using Comet in the console, see Comet Command-Line Utilities.
Comet for MLflow extension¶
If you would like to see your previously run MLflow Experiments in Comet, try the comet_for_mlflow extension. To do this, first install the open-source Python extension, then run the command-line interface (CLI) command:
pip install comet-for-mlflow
comet_for_mlflow
The Comet for MLflow Extension finds any existing MLflow runs in your current folder and makes them available for analysis in Comet. For more options, use comet_for_mlflow --help
and see the following section.
The Comet for MLflow Extension is an open-source project and can be found at: github.com/comet-ml/comet-for-mlflow/
We welcome any questions, bug fixes, and comments in that Git repo.
Advanced CLI usage for Comet for MLflow Extension¶
The comet_for_mlflow
command offers several options to help you get the most out of previous MLflow runs with Comet:
- --upload - automatically upload the prepared Experiments to Comet.
- --no-upload - do not upload the prepared Experiments to Comet.
- --api-key API_KEY - set the Comet API key.
- --mlflow-store-uri MLFLOW_STORE_URI - set the MLflow store URI.
- --output-dir OUTPUT_DIR - set the directory to store prepared runs.
- --force-reupload - force re-upload of prepared Experiments.
- -y, --yes - answer all yes/no questions automatically with 'yes'.
- -n, --no - answer all yes/no questions automatically with 'no'.
- --email EMAIL - set the email address, if needed, for creating an account.
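For example, the flags above can be combined to prepare runs from a specific MLflow store without uploading them. The store URI and output directory below are hypothetical placeholders:

```shell
# Prepare runs from a local SQLite MLflow store, answer 'yes' to all prompts,
# and keep the prepared archives in ./prepared instead of uploading them
comet_for_mlflow --mlflow-store-uri sqlite:///mlruns.db \
                 --output-dir ./prepared \
                 --no-upload -y
```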
For more information, use comet_for_mlflow --help
or see github.com/comet-ml/comet-for-mlflow.
Configure Comet for MLflow¶
Calling mlflow.start_run()
in your code will create an Experiment object. The auto-logging features of this Experiment object can be configured through either environment variables or the .comet.config
file.
Item | Experiment Parameter | Environment Setting | Configuration Setting |
---|---|---|---|
metrics | auto_metric_logging | COMET_AUTO_LOG_METRICS | comet.auto_log.metrics |
metric logging rate | auto_metric_step_rate | COMET_AUTO_LOG_METRIC_STEP_RATE | comet.auto_log.metric_step_rate |
hyperparameters | auto_param_logging | COMET_AUTO_LOG_PARAMETERS | comet.auto_log.parameters |
command-line arguments | parse_args | COMET_AUTO_LOG_CLI_ARGUMENTS | comet.auto_log.cli_arguments |
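As a rough sketch of how these layers interact, the snippet below resolves one setting (metric auto-logging) following the usual configuration precedence: an explicit Experiment parameter wins over the environment variable, which wins over the config file. This is an illustration only, not Comet's actual implementation; the helper name resolve_auto_log_metrics and the exact .comet.config section layout are assumptions.

```python
import configparser
import os


def resolve_auto_log_metrics(explicit=None, config_path=".comet.config"):
    """Illustrative lookup order: parameter > environment > config file > default."""
    # 1. An explicit auto_metric_logging argument takes precedence
    if explicit is not None:
        return explicit
    # 2. Next, the COMET_AUTO_LOG_METRICS environment variable
    env = os.environ.get("COMET_AUTO_LOG_METRICS")
    if env is not None:
        return env.lower() in ("1", "true", "yes")
    # 3. Finally, the comet.auto_log.metrics key in the config file
    parser = configparser.ConfigParser()
    if parser.read(config_path) and parser.has_option("comet.auto_log", "metrics"):
        return parser.getboolean("comet.auto_log", "metrics")
    # Default: auto-logging is enabled
    return True


os.environ["COMET_AUTO_LOG_METRICS"] = "false"
print(resolve_auto_log_metrics())      # environment variable wins -> False
print(resolve_auto_log_metrics(True))  # explicit parameter wins -> True
```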
As mentioned, Comet supports MLflow users through two different approaches:
- Built-in, core Comet support for MLflow
- Comet for MLflow Extension
The first is useful for running new Experiments, and requires you to use import comet_ml
or comet python script.py
. The second is useful for previously run MLflow Experiments and requires the comet-for-mlflow
extension.
There are some differences in the way these two methods operate. Specifically:
Item logged? | Comet built-in, core support | Comet Extension for MLflow |
---|---|---|
Metrics | Yes | Yes |
Hyperparameters | Yes | Yes |
Models | Yes | Yes |
Assets | Yes | Yes |
Source code | Yes | No |
git repo and patch info | Yes | No |
System metrics | Yes | No |
CPU and GPU usage | Yes | No |
Python packages | Yes | No |
Command-line arguments | Yes | No |
Standard output | Yes | No |
Installed OS packages | Yes | No |
Limitations in Comet support for MLflow¶
When running the built-in, core Comet support, there are two limitations:
- It does not support nested MLflow runs.
- It does not support continuing a previous MLflow run; in that case, the integration creates a new Comet Experiment instead.
End-to-end example¶
For more examples using MLflow, see our examples GitHub repository.
import comet_ml
import keras
# The following import and function call are the only additions to code required
# to automatically log metrics and parameters to MLflow.
import mlflow
import mlflow.keras
import numpy as np
from keras.datasets import reuters
from keras.layers import Activation, Dense, Dropout
from keras.models import Sequential
from keras.preprocessing.text import Tokenizer
# The sqlite store is needed for the model registry
mlflow.set_tracking_uri("sqlite:///db.sqlite")
# We need to create a run before calling keras or MLflow will end the run by itself
mlflow.start_run()
mlflow.keras.autolog()
max_words = 1000
batch_size = 32
epochs = 5
print("Loading data...")
(x_train, y_train), (x_test, y_test) = reuters.load_data(
num_words=max_words, test_split=0.2
)
print(len(x_train), "train sequences")
print(len(x_test), "test sequences")
num_classes = np.max(y_train) + 1
print(num_classes, "classes")
print("Vectorizing sequence data...")
tokenizer = Tokenizer(num_words=max_words)
x_train = tokenizer.sequences_to_matrix(x_train, mode="binary")
x_test = tokenizer.sequences_to_matrix(x_test, mode="binary")
print("x_train shape:", x_train.shape)
print("x_test shape:", x_test.shape)
print(
"Convert class vector to binary class matrix "
"(for use with categorical_crossentropy)"
)
y_train = keras.utils.np_utils.to_categorical(y_train, num_classes)
y_test = keras.utils.np_utils.to_categorical(y_test, num_classes)
print("y_train shape:", y_train.shape)
print("y_test shape:", y_test.shape)
print("Building model...")
model = Sequential()
model.add(Dense(512, input_shape=(max_words,)))
model.add(Activation("relu"))
model.add(Dropout(0.5))
model.add(Dense(num_classes))
model.add(Activation("softmax"))
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
history = model.fit(
x_train,
y_train,
batch_size=batch_size,
epochs=epochs,
verbose=1,
validation_split=0.1,
)
score = model.evaluate(x_test, y_test, batch_size=batch_size, verbose=1)
print("Test score:", score[0])
print("Test accuracy:", score[1])
mlflow.keras.log_model(model, "model", registered_model_name="Test Model")
mlflow.end_run()
Try it out¶
Here is an example Colab Notebook for using Comet with MLflow.