December 18, 2024
Machine learning is experimental in nature. It’s more like research in a lab than like building traditional software. Regardless of team size, it pays off to experiment quickly, fail fast, and invest only in the best-performing models. Over time, as the team and business mature, they accumulate more knowledge than any team can manually capture.
The iterative nature of machine learning works best when the systems and tools of record, the MLOps tech stack, scale well, are quick to set up, easy to use, and ultimately human-friendly.
ML requires omniscient note-taking skills and infrastructure. When thousands of experiments are run, it’s impossible for humans to do all the note-taking. What most ML engineers need is to automate as much of that record-keeping as possible. Clearing out the manual tasks helps them stay closer to the business problem and get more value out of their ML investments.
The landscape of MLOps tools is vast. Most enterprise customers incorporate just a few vendors in their MLOps tech stack, along with their home-grown tools. The fewer tools in a tech stack, the easier it is to integrate various systems and stay agile as the business matures. Choosing an ML tech stack that scales is as much about evaluating the vendor as it is about the tool itself.
Together, Comet and Metaflow enable a team of one, or a team of 300 ML engineers, to easily build, train, and deploy models to production.
Metaflow provides a simple Python API for defining the business logic of your ML workflow and how and where it should get executed. Metaflow also helps to version all code, data, and models. Metaflow has been battle-hardened at Netflix, supported by the wonderful team at Outerbounds, and has been used to power thousands of applications across hundreds of companies, such as 23andMe, CNN, and Realtor.com.
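To make that concrete, here is a minimal sketch of a flow (TrainFlow, its steps, and its values are illustrative, not taken from Metaflow’s docs): steps are plain Python methods, artifacts assigned to self are versioned automatically, and decorators such as @resources declare how and where a step should run.

from metaflow import FlowSpec, step, resources


class TrainFlow(FlowSpec):

    @step
    def start(self):
        # Artifacts assigned to self are versioned automatically by Metaflow.
        self.learning_rate = 0.01
        self.next(self.train)

    @resources(memory=16000, cpu=4)  # ask Metaflow for a larger machine for this step
    @step
    def train(self):
        # Train a model here; the result is versioned like any other artifact.
        self.model = f"model trained with lr={self.learning_rate}"
        self.next(self.end)

    @step
    def end(self):
        print("Trained:", self.model)


if __name__ == "__main__":
    TrainFlow()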
Comet brings clarity and visibility into what is happening inside every workflow execution, allowing you to track and compare experiments in a user-friendly UI. See how enterprise companies like Uber, WorkFusion and The RealReal scale up ML with Comet.
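As a minimal sketch of what that looks like in code (the project name, parameter, and metric values below are made up for illustration), you create an experiment, log values against it, and they appear in the Comet UI:

from comet_ml import Experiment

# Reads COMET_API_KEY from the environment; the project name is illustrative.
experiment = Experiment(project_name="demo-project")

experiment.log_parameter("learning_rate", 0.01)
for epoch in range(5):
    # Log a made-up, decreasing loss so something shows up in the UI.
    experiment.log_metric("loss", 1.0 / (epoch + 1), step=epoch)

experiment.end()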
Together, these complementary tools help make ML workflows more robust, reproducible, and observable, both during prototyping and production.
With just a few lines of code, Comet automatically gathers and logs experiment data, such as hyperparameters, metrics, and source code, for every run.
Start by importing the Metaflow integration and decorating your flow class with @comet_flow:
from comet_ml.integration.metaflow import comet_flow
from metaflow import FlowSpec, step


@comet_flow
class HelloFlow(FlowSpec):
    """
    A flow where Metaflow prints 'Hi'.

    Run this flow to validate that Metaflow is installed correctly.
    """

    @step
    def start(self):
        """
        This is the 'start' step. All flows must have a step named 'start'
        that is the first step in the flow.
        """
        print("HelloFlow is starting.")
        self.next(self.hello)

    @step
    def hello(self):
        """
        A step for metaflow to introduce itself.
        """
        print("Metaflow says: Hi!")
        self.next(self.end)

    @step
    def end(self):
        """
        This is the 'end' step. All flows must have an 'end' step, which is
        the last step in the flow.
        """
        print("HelloFlow is all done.")


if __name__ == "__main__":
    HelloFlow()
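Save the file (for example as hello_flow.py, an illustrative name), set your COMET_API_KEY environment variable so the integration can authenticate, and launch the flow with Metaflow’s CLI: python hello_flow.py run.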
Once you’ve run some experiments, the Comet-Metaflow integration tracks both the individual tasks and the state of the flow as a whole, giving you a consistent vocabulary and visibility into your data across both Metaflow and Comet.
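You can also log your own values from inside a step. As a sketch, assuming the integration exposes each task’s experiment on the flow as self.comet_experiment (check the integration docs for your version), extra metrics can be logged like this; the flow name and accuracy value are illustrative:

from comet_ml.integration.metaflow import comet_flow
from metaflow import FlowSpec, step


@comet_flow
class TrainMetricFlow(FlowSpec):

    @step
    def start(self):
        accuracy = 0.92  # placeholder value for illustration
        # @comet_flow attaches a Comet experiment to each task; anything
        # logged here lands in that task's experiment in the Comet UI.
        self.comet_experiment.log_metric("accuracy", accuracy)
        self.next(self.end)

    @step
    def end(self):
        print("Done.")


if __name__ == "__main__":
    TrainMetricFlow()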
Once you’ve logged a Metaflow pipeline with Comet, you can add panels in the Comet UI to visualize and compare results across runs.
To get started, take a look at the Comet documentation for this integration and the Metaflow documentation.
If you have any questions, our teams are available in the Comet and Metaflow Slack communities.
We are working hard to make it easier to use both Comet and Metaflow together. Stay tuned for part 2 of this announcement later this year.