August 30, 2024
A guest post from Fabrício Ceolin, DevOps Engineer at Comet. Inspired by the growing demand…
As data science and machine learning teams become less siloed, more cross-functional, and increasingly interdisciplinary, the need for sustainable tools and processes for collaboration across the ML model building lifecycle has only increased.
Given this trend—and the fact that one of Comet’s core values (both internally and with our product stack) is to enable more effective and efficient collaboration—we thought collaboration would be a perfect topic to restart our Industry Q&A series.
Luckily, we were able to host two industry leaders at the forefront of creating tooling for ML collaboration at different points in the development lifecycle—Abubakar Abid of Gradio, and Jakub Jurovych of Deepnote.
While there was plenty of great conversation and many compelling insights (we’ve collected the best, along with the full session, in this YouTube playlist), I wanted to take a few moments to recap and share a few of my favorites.
To help frame the conversation around collaboration in ML (which admittedly could go in many different directions), Jakub shared his thoughts about the nature of collaboration more broadly.
Specifically, he discussed his understanding of collaboration as something that, organizationally speaking, happens both in stages and in relative proportions. I really appreciated his use of general writing as a frame through which to explore collaboration’s basic stages, from creating and tweaking an initial draft in isolation to seeking line edits or deeper peer review.
The same basic concept holds true for collaboration in ML. Much of the modeling work will still need to be done at the individual level. But to capture value from the 20-30% of the work that requires close and efficient collaboration, teams will need to adopt cross-functional tools and approaches so that collaboration scales as the team grows.
One of the overarching themes of the Q&A was domain (or subject matter) expertise: that is, the importance of building, testing, and deploying models with the support of experts in the field where your model will be used in the real world (e.g., healthcare, finance, or retail).
Abubakar was able to share some really interesting thoughts on this dynamic, especially as it relates to testing whether or not ML models actually work as intended. His primary contention is that we need to change the model validation paradigm—from just measuring accuracy and loss on validation sets, to actually measuring effectiveness by putting models in the hands of domain experts as quickly as possible.
In addition to discussing the dynamics that led them to building tools for ML collaboration, as well as the current problems they’re attempting to solve, both Jakub and Abubakar also offered some interesting perspective on what they’d like to see from collaborative tools and processes moving forward.
Specifically, Jakub discussed how the future of data-focused organizations will likely include data scientists and analysts across different teams and business areas (product, sales, marketing, etc.). Abubakar largely agreed, adding that he hopes to see a future where teams, using intuitive tools for collaboration, can accomplish a whole lot more with just a single data scientist than they previously could with a full team.
Check out our complete playlist on YouTube for more awesome content from this event, including the full session!
And stay tuned! We’ll be announcing our August Industry Q&A soon, and we’d love to see you all there.
We recently launched The Comet Newsletter, which offers a weekly inside look at all things data science and ML, featuring expert takes and perspective from our team. We have big things planned for both Office Hours and the newsletter, so be sure to subscribe if you haven’t already!