Comet Support for MLFlow

Comet supports MLFlow users in two different ways:

  1. Comet built-in, core support for MLFlow
  2. Comet for MLFlow Extension

We'll explore these two methods in the sections below.

1. Comet built-in, core support for MLFlow

If you're already using MLFlow, Comet works with it out of the box. First, install the Comet Python SDK (comet_ml) and set your API key.

Once you have comet_ml installed, you can simply run any MLFlow script from the console as follows:

comet python mlflow_script.py

Alternatively, you can add this one line of code to the top of your MLFlow training script:

import comet_ml

and run your MLFlow script as you normally would.
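As a minimal sketch of this second approach, a training script might look like the following. The file name, the train function, and the toy loss values are hypothetical. The try/except guard is only there so the sketch can still be imported on a machine without comet_ml or mlflow installed; a real script would normally import both unconditionally.

```python
# Hypothetical file name: mlflow_script.py
try:
    import comet_ml  # noqa: F401  (kept at the top, before mlflow is imported)
    import mlflow
except ImportError:
    # Illustration-only fallback so the sketch runs without the packages.
    mlflow = None


def train(learning_rate=0.01, epochs=3):
    """Log one parameter and a toy loss per epoch; return the losses."""
    # Toy values standing in for a real training loop.
    losses = [1.0 / (epoch + 1) for epoch in range(epochs)]
    if mlflow is None:
        return losses
    with mlflow.start_run():
        mlflow.log_param("learning_rate", learning_rate)
        for epoch, loss in enumerate(losses):
            mlflow.log_metric("loss", loss, step=epoch)
    return losses


if __name__ == "__main__":
    train()
```

Apart from the import line at the top, the MLFlow calls are unchanged from what the script would contain without Comet.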

With either method, Comet captures considerably more data about your run than MLFlow records on its own. The additional items are detailed below.

How it Works

Comet's built-in, core support for MLFlow will attempt to create a live, online Experiment if a Comet API Key is configured. If a Comet API Key cannot be found, you will see the following log message:

No Comet API Key was found, creating an OfflineExperiment. Set up your API Key to get the full Comet experience: https://www.comet.ml/docs/python-sdk/advanced/#python-configuration

If no API key is found, the Comet SDK still creates an OfflineExperiment, so you still get all the additional tracking data from Comet. Just remember to upload the offline experiment archive later. At the end of the run, the script prints the exact command to run, similar to the following:

comet upload /path/to/archive.zip
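The decision described above can be approximated in a few lines of Python. This sketch assumes the key is supplied only through the COMET_API_KEY environment variable; the SDK can also read the key from its configuration file, which the sketch ignores. Both function names are hypothetical helpers and are not part of comet_ml.

```python
import os


def expected_experiment_kind(environ=None):
    """Return "online" if COMET_API_KEY is set and non-empty, else "offline".

    Simplification: only the environment variable is checked here.
    """
    env = os.environ if environ is None else environ
    return "online" if env.get("COMET_API_KEY") else "offline"


def upload_command(archive_path):
    """Return the argument list for uploading an offline experiment archive."""
    return ["comet", "upload", archive_path]


if __name__ == "__main__":
    kind = expected_experiment_kind()
    print(f"Expected experiment kind: {kind}")
    if kind == "offline":
        # The real archive path is printed by the script at the end of the run.
        print(" ".join(upload_command("/path/to/archive.zip")))
```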

Any future experiment runs created with this script will automatically include Comet's extended experiment tracking for MLFlow.

When you run an MLFlow script after importing comet_ml, or from the command line with comet python script.py, Comet automatically logs to comet.ml every item listed in the built-in, core support column of the table in the MLFlow Logging section below.

For more information on using comet in the console, see Comet Command-Line Utilities.

Now, we'll explore the other support method for MLFlow users.

2. Comet for MLFlow Extension

If you would like to see your previously run MLFlow experiments in Comet, try out the comet_for_mlflow extension. To do this, first install the open-source Python extension, which provides the command-line interface (CLI) command:

pip install comet-for-mlflow

Then execute the command at the console:

comet_for_mlflow

The Comet for MLFlow Extension will find any existing MLFlow runs in your current folder and make those available for analysis in Comet. For more options, use comet_for_mlflow --help and see the following section.

The Comet for MLFlow Extension is an open-source project and can be found at: github.com/comet-ml/comet-for-mlflow/

We welcome any questions, bug fixes, and comments there.

Advanced Comet for MLFlow Extension CLI Usage

comet_for_mlflow has a variety of options to help you get the most out of previously run MLFlow experiments with Comet. You can use the following flags with comet_for_mlflow:

  • --upload - automatically upload the prepared experiments to comet.ml
  • --no-upload - do not upload the prepared experiments to comet.ml
  • --api-key API_KEY - set the Comet API key
  • --mlflow-store-uri MLFLOW_STORE_URI - set the MLFlow store URI
  • --output-dir OUTPUT_DIR - set the directory to store prepared runs
  • --force-reupload - force re-upload of prepared experiments
  • -y, --yes - answer all yes/no questions automatically with 'yes'
  • -n, --no - answer all yes/no questions automatically with 'no'
  • --email EMAIL - set email address if needed for creating an account

For more information, use comet_for_mlflow --help or see github.com/comet-ml/comet-for-mlflow.
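When the extension is driven from a script, it can help to assemble the flags programmatically. The following is a sketch only: build_comet_for_mlflow_command is a hypothetical helper, the store URI and output directory are placeholder values, and the flag names are taken from the list above.

```python
import shlex


def build_comet_for_mlflow_command(
    upload=None,
    api_key=None,
    mlflow_store_uri=None,
    output_dir=None,
    force_reupload=False,
    answer=None,
    email=None,
):
    """Return the comet_for_mlflow argument list for the chosen options."""
    command = ["comet_for_mlflow"]
    if upload is True:
        command.append("--upload")
    elif upload is False:
        command.append("--no-upload")
    if api_key is not None:
        command += ["--api-key", api_key]
    if mlflow_store_uri is not None:
        command += ["--mlflow-store-uri", mlflow_store_uri]
    if output_dir is not None:
        command += ["--output-dir", output_dir]
    if force_reupload:
        command.append("--force-reupload")
    if answer == "yes":
        command.append("--yes")
    elif answer == "no":
        command.append("--no")
    elif answer is not None:
        raise ValueError("answer must be 'yes', 'no', or None")
    if email is not None:
        command += ["--email", email]
    return command


command = build_comet_for_mlflow_command(
    upload=False,
    mlflow_store_uri="./mlruns",    # placeholder store location
    output_dir="./comet-prepared",  # placeholder output directory
    answer="yes",
)
print(shlex.join(command))
# To execute it: subprocess.run(command, check=True)
```

Returning a list rather than a single string keeps the arguments safe to pass to subprocess.run without shell quoting.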

MLFlow Logging

As mentioned, Comet supports MLFlow users through two different approaches:

  1. Comet built-in, core support for MLFlow
  2. Comet for MLFlow Extension

The first is useful for running new experiments, and requires that you import comet_ml or use comet python script.py. The second is useful for previously run MLFlow experiments, and requires the comet-for-mlflow extension.

There are some differences between the way these two methods operate. Specifically:

Item logged?               Comet built-in, core support    Comet for MLFlow Extension
Metrics                    Yes                             Yes
Hyperparameters            Yes                             Yes
Models                     Yes                             Yes
Assets                     Yes                             Yes
Source code                Yes                             No
Git repo and patch info    Yes                             No
System metrics             Yes                             No
CPU and GPU usage          Yes                             No
Python packages            Yes                             No
Command-line arguments     Yes                             No
Standard output            Yes                             No
Installed OS packages      Yes                             No

Comet for MLFlow Support Limitations

When running Comet's built-in, core support for MLFlow, there are two limitations:

  • It does not support MLFlow nested runs.
  • It does not support continuing a previous MLFlow run. A new Comet Experiment is created in that case.