
Send data to MPM

Start logging model predictions, as well as new models, simply by sending your model's input features and predictions to Comet. No additional configuration is required.

There are two main integration paths for MPM:

  • Sending data from the application making the predictions
  • Using log forwarding

If the inference service is written in Python and predictions are not already tracked, we recommend sending events straight from the application using our MPM Python SDK.

If you are using a managed inference service like SageMaker or Seldon, you can simply enable Data Capture or request/response logging and forward these events to MPM using our REST API.
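
For illustration, here is a minimal log-forwarding sketch in Python: it reads captured request/response records from a JSON-lines file and relays them to the MPM batch endpoint documented later on this page. The capture file name and the parse_record mapping are hypothetical placeholders; adapt them to whatever your inference service actually records.

import json
import time

import requests

MPM_ENDPOINT = "https://www.comet.com/mpm/v2/events/batch"
api_key = "<Your API key>"

def parse_record(record):
    # Hypothetical mapping from a captured request/response pair to an MPM event
    return {
        "workspaceName": "<Your workspace name>",
        "modelName": "demo-mpm-model",
        "modelVersion": "1.2.0",
        "predictionId": record["request_id"],
        "timestamp": int(time.time() * 1000),
        "features": record["request"],
        "predictions": record["response"],
    }

with open("captured_requests.jsonl") as capture_file:
    events = [parse_record(json.loads(line)) for line in capture_file]

response = requests.post(
    MPM_ENDPOINT,
    json={"data": events},
    headers={"Authorization": api_key},
)
response.raise_for_status()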

Integration with Experiment Management¶

Using MPM in conjunction with Experiment Management and Comet's Model Registry allows you to track models from development all the way to production! To do this, simply ensure that the model name used when sending MPM events matches the model name in the Model Registry.
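
For example, if the model was logged and registered during training with comet_ml (the model name and file path below are illustrative), reusing the same name in CometMPM ties production events to that registry entry:

from comet_ml import Experiment
from comet_mpm import CometMPM

# During training: log and register the model under the name 'demo-mpm-model'
experiment = Experiment(api_key="<Your API key>", workspace="<Your workspace name>")
experiment.log_model("demo-mpm-model", "path/to/model.pkl")
experiment.register_model("demo-mpm-model")
experiment.end()

# In production: send MPM events using the same model name
MPM = CometMPM(
    workspace_name="<Your workspace name>",
    model_name="demo-mpm-model",
    model_version="1.2.0",
    api_key="<Your API key>",
)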

MPM can also be used standalone, in which case models are automatically created in the Comet Model Registry whenever you send the first event for a new model.

MPM Python SDK¶

The MPM Python SDK has been developed with production inference services in mind and includes a number of optimizations to keep logging overhead to a minimum.

The MPM SDK can be installed using:

pip install comet-mpm

Note

If you are logging MPM events to a local MPM deployment, you will need to set the following environment variable before running your script:

export COMET_URL=https://<path_to_your_deployment>/

Sending prediction events¶

Once installed, logging events to MPM takes just a few lines of code:

from comet_mpm import CometMPM

api_key = "<Your API key>"
workspace_name = "<Your workspace name>"

# Create the MPM client
MPM = CometMPM(
    workspace_name=workspace_name,
    model_name='demo-mpm-model',
    model_version='1.2.0',
    api_key=api_key
)

# Log a single prediction: its input features, the model's output
# and, optionally, the ground-truth label
MPM.log_event(
    prediction_id="1",
    input_features={'age': 29, 'country': 'UK'},
    output_features={'value': True, 'probability': 0.5},
    labels={'value': False}
)

Note

Feature names should only contain letters, numbers, and underscores. If they contain any other characters, functionality such as input drift calculation, filters, and custom metrics might not behave as expected.
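
If your raw feature names can contain other characters, a small helper along these lines (not part of the SDK, just a sketch) can normalize them before logging:

import re

def sanitize_feature_names(features):
    # Replace every character that is not a letter, number or underscore
    return {re.sub(r"[^A-Za-z0-9_]", "_", name): value
            for name, value in features.items()}

MPM.log_event(
    prediction_id="2",
    input_features=sanitize_feature_names({"age (years)": 29, "country-code": "UK"}),
    output_features={'value': True, 'probability': 0.5},
)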

You can check that individual events are correctly ingested by navigating to the debugger tab in MPM. If an event has been ingested correctly, it appears in the predictions tab; if there was an error ingesting the data, it appears in the Ingestion Errors section with an error message explaining the issue. Note that it can take up to 5 minutes for events to show up in the debugger tab.

If you are using FastAPI, you will also need to implement a shutdown event so that queued events are flushed before the application exits, as in the sketch below.
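
A minimal sketch is shown here. It assumes the SDK exposes an awaitable end() method that flushes queued events; check the SDK reference for the exact shutdown call.

from fastapi import FastAPI
from comet_mpm import CometMPM

app = FastAPI()

MPM = CometMPM(
    workspace_name="<Your workspace name>",
    model_name='demo-mpm-model',
    model_version='1.2.0',
    api_key="<Your API key>",
)

@app.post("/predict")
async def predict(payload: dict):
    prediction = {'value': True, 'probability': 0.5}  # replace with your model call
    MPM.log_event(
        prediction_id=payload["prediction_id"],
        input_features=payload["features"],
        output_features=prediction,
    )
    return prediction

@app.on_event("shutdown")
async def shutdown_mpm():
    # Assumption: end() flushes any queued MPM events before the worker exits
    await MPM.end()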

Sending labels¶

You can also send labels using the SDK:

from comet_mpm import CometMPM

api_key = "<Your API key>"
workspace_name = "<Your workspace name>"

MPM = CometMPM(
    workspace_name=workspace_name,
    model_name='demo-mpm-model',
    model_version='1.2.0',
    api_key=api_key
)


MPM.log_label(
    prediction_id="1",
    labels={"value": False}
)

Sending batches of events¶

To send a batch of events to MPM, you can use the log_dataframe method to log a pandas DataFrame:

from comet_mpm import CometMPM
import pandas as pd

api_key = "<Your API key>"
workspace_name = "<Your workspace name>"

MPM = CometMPM(
    workspace_name=workspace_name,
    model_name='demo-mpm-model',
    model_version='1.2.0',
    api_key=api_key,
)

df = pd.DataFrame({
    'age': [29, 30, 31],
    'prediction_id': ['100', '200', '300'],
    'country': ['UK', 'US', 'UK'],
    'label': [True, False, True],
    'predicted_value': [True, False, True]
})

MPM.log_dataframe(
    dataframe=df,
    prediction_id_column='prediction_id',
    feature_columns=['age', 'country'],
    output_features_columns=['predicted_value'],
    labels_columns=['label']
)

Sending training distributions¶

If you would like to compute feature drift relative to a training distribution, you will need to upload that distribution to the MPM platform. You can do this with the upload_dataset_csv method in the MPM SDK:

from comet_mpm import CometMPM

api_key = "<Your API key>"
workspace_name = "<Your workspace name>"

MPM = CometMPM(
    workspace_name=workspace_name,
    model_name='demo-mpm-model',
    model_version='1.2.0',
    api_key=api_key,
)

MPM.upload_dataset_csv(
    file_path='path/to/your/training/distribution.csv',
    dataset_type='training-events', # This is an important field as it marks this CSV as training data
    dataset_name='training-distribution',
)

MPM REST API¶

To track model performance, MPM needs access to a model's input features, output features, and labels. In addition to the Python SDK, this data can be sent via a REST API.

The MPM REST API includes the following endpoints for uploading MPM events:

  • https://www.comet.com/mpm/v2/events/batch: Used to upload batches of events that contain input and output features
  • https://www.comet.com/mpm/v2/labels/batch: Used to upload labels

Sending input and output features to MPM¶

The JSON payload for the https://www.comet.com/mpm/v2/events/batch POST endpoint should contain the following attributes:

| Name | Type | Description | Required | Example |
|---|---|---|---|---|
| workspaceName | string | Comet workspace in which to log the model | ✓ | object-detection |
| modelName | string | Name of the model | ✓ | Demo model |
| modelVersion | string | Version of the model | ✓ | "1.0.0" |
| predictionId | string | Used to identify a single prediction | ✓ | 1 |
| timestamp | int | Timestamp in milliseconds | ✓ | 1615922560000 |
| features | object | Input features to the model | ✓ | See example below. |
| predictions | object | Output features of the model | ✓ | See example below. |
| labels | object | Labels for the prediction | ✗ | See example below. |

Here is an example payload:

{
    "data": [{
        "workspaceName": "...",
        "modelName": "Credit Scoring",
        "modelVersion": "1.0.0",
        "timestamp": 1615922560000,
        "predictionId": "000001",
        "features": {
            "feature_1":0.34,
            "feature_2": "dog",
        },
        "predictions": {
            "value": "true",
            "probability": 0.86
        },
        "labels": {
            "value": "true"
        }
    }]
}

To test, you can use:

workspaceName=<workspace_name>
api_key=<api_key>

# Offset the timestamp by one hour (BSD/macOS date syntax;
# on GNU/Linux use: current_timestamp=$(date -d '1 hour ago' +%s000))
current_timestamp=$(date -v-1H +%s000)
payload='{"data": [{"features": {"categorical_feature_0": "value_1", "numerical_feature_0": 0.5841119597210334}, "modelName": "test-model", "modelVersion": "1.0.0", "predictions": {"predicted_value": "true"}, "predictionId": "e347539b-a1df-432e-aa4a-3fe93805d3be", "timestamp": '$current_timestamp', "workspaceName": "'$workspaceName'"}]}'

curl -s -d "$payload" \
    -H "Content-Type: application/json"\
    -H "Authorization: $api_key"\
    -X POST https://www.comet.com/mpm/v2/events/batch

Note

In the example above, the timestamp is offset by an hour so that the data appears in the MPM charts right away.

If you use the current timestamp, you will need to wait up to an hour for the data to appear in the MPM charts, or check the debugger tab for any ingestion errors.
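
The same test can also be written in Python with the requests package, using the payload format documented above:

import time

import requests

api_key = "<Your API key>"
workspace_name = "<Your workspace name>"

# Offset the timestamp by one hour so the event shows up in the charts right away
timestamp_ms = int((time.time() - 3600) * 1000)

payload = {
    "data": [{
        "workspaceName": workspace_name,
        "modelName": "test-model",
        "modelVersion": "1.0.0",
        "predictionId": "e347539b-a1df-432e-aa4a-3fe93805d3be",
        "timestamp": timestamp_ms,
        "features": {"categorical_feature_0": "value_1", "numerical_feature_0": 0.58},
        "predictions": {"predicted_value": "true"},
    }]
}

response = requests.post(
    "https://www.comet.com/mpm/v2/events/batch",
    json=payload,
    headers={"Authorization": api_key},
)
print(response.status_code, response.text)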

Sending labels to MPM¶

Ground truth labels can be sent to Comet and are used to compute accuracy-related metrics (accuracy, precision, recall, F1-score, etc.). You can send labels either together with the input features and predictions, or at any time after that; Comet takes care of updating the relevant metrics.

If you wish to send labels to Comet independently of the prediction, you can use the https://www.comet.com/mpm/v2/labels/batch POST endpoint with a JSON payload containing the following attributes:

| Name | Type | Description | Required | Example |
|---|---|---|---|---|
| workspaceName | string | Comet workspace in which to log the model | ✓ | object-detection |
| modelName | string | Name of the model | ✓ | Demo model |
| modelVersion | string | Version of the model | ✓ | "1.0.0" |
| predictionId | string | Used to identify a single prediction | ✓ | 1 |
| value | object | Value of the label | ✓ | {"prediction": "true"} |
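
For example, a label-only request might look like the sketch below. The attribute names come from the table above; the data batch envelope is an assumption mirroring the events/batch endpoint.

import requests

api_key = "<Your API key>"

payload = {
    "data": [{
        "workspaceName": "<Your workspace name>",
        "modelName": "Credit Scoring",
        "modelVersion": "1.0.0",
        "predictionId": "000001",
        "value": {"prediction": "true"},  # ground truth label for this prediction
    }]
}

requests.post(
    "https://www.comet.com/mpm/v2/labels/batch",
    json=payload,
    headers={"Authorization": api_key},
)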

Note

Label events are automatically linked to prediction events based on the supplied predictionId. This merge job runs daily, at around 8am UTC.

If you want accuracy metrics to update as soon as a label is available, we recommend sending labels as part of your prediction events.
