Welcome to another recap of the Comet ML Office Hours, powered by The Artists of Data Science! This week we’re covering Session 4 of our new series. This session took place Jan. 26th, 2022 and we were joined by Jimmy Whitaker of Pachyderm, Dr. Abe Gong of Great Expectations, and Data Governance Analyst and Heartbeat contributor Matt Blasa.
As a reminder, we’d love to see any and all of you at these fifty-minute sessions, so feel free to register for upcoming Office Hours sessions here! As always, there’s a lot more in the full session (which you can find on our YouTube channel), so be sure to check it out, alongside clips from roundtables, webinars, and previous Office Hours.
All About Data
This session was all about data: understanding, validating, versioning, and engineering it. Without good data, your model won’t perform optimally and could disappoint down the road.
Matt Blasa of Brinks Home Security discussed his role as a Data Governance Analyst and how that plays into the larger ML experiment pipeline.
Data Curation
One thing that did come up in the course of the data discussion was the idea of “Data Curators.” This was a new term for Harpreet and inspired Dr. Abe Gong to bring up Emilie Schario’s suggestion of taking the term “Data Scientist” and exploding it into multiple job titles.
Hear more from Dr. Gong in the clip below.
Cautionary Tales – And Plenty of Them
Throughout the discussion a number of stories came up about “upside down models” and what happens if you don’t properly version your data.
One story from Jimmy Whitaker was about digits in NLP models: a transcription model was producing inconsistent output for dates, sometimes written as digits and sometimes written out as words.
As always, there’s more to be discussed and discovered. Check out Comet’s contributor-led publication Heartbeat as well as our YouTube, Twitter, and LinkedIn for more great information.
We run these virtual Office Hours every Wednesday at 11am EST (New York, NY). They are completely free to attend and participate in, and we’d love to see any and all of you there! We’ve got a great series planned and welcome questions for Harpreet or any of our guests via email to emilie@comet.ml.