
Issue 17: Graph Neural Nets, Concept Drift Deep-dive, Rethinking Large Language Models

Welcome to issue #17 of The Comet Newsletter!

In this week’s issue, we share a two-part deep dive on graph neural networks (part 1; part 2), as well as a re-evaluation of the idea that bigger = more complex when it comes to language models.

Additionally, we explore an introduction to ML optimizers and compilers and an in-depth guide to the ins and outs of concept drift once a model is deployed.

Like what you’re reading? Subscribe here.

And be sure to follow us on Twitter and LinkedIn — drop us a note if you have something we should cover in an upcoming issue!

Happy Reading,

Austin

Head of Community, Comet

——————————–


Large language models aren’t always more complex

Whenever a new state-of-the-art language model arrives (think GPT-3), it’s generally accompanied by media coverage and analysis touting its parameter count (these days, in the 150+ billion range), as well as its complexity.

But as Kyle Wiggers writes for VentureBeat, industry experts are becoming increasingly skeptical that the size of the models and the datasets they’re trained on correspond directly to improved performance.

On the training side, one counterpoint to simply scaling up parameter counts is explicit “instruction tuning”: fine-tuning a model on specific NLP tasks rephrased as natural-language instructions (e.g. translation, sentiment analysis). Interestingly, models trained this way are still often able to generalize to other tasks, much like GPT-3. Wiggers cites Google’s FLAN (“Finetuned Language Net”) as an example of this dynamic.
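
To make “instruction tuning” a bit more concrete, here is a toy sketch of the general idea (our own illustration, not FLAN’s actual training data): each NLP task is rephrased as a natural-language instruction paired with a target output, and the model is fine-tuned on many such pairs across tasks.

```python
# Toy, hypothetical instruction-tuning pairs; FLAN's actual templates and data differ.
instruction_examples = [
    {
        "instruction": "Translate to French: 'The weather is nice today.'",
        "target": "Il fait beau aujourd'hui.",
    },
    {
        "instruction": "Is the sentiment of this movie review positive or negative? "
                       "'The plot dragged and the acting was flat.'",
        "target": "negative",
    },
]

# A model fine-tuned on many (instruction, target) pairs spanning many tasks can
# often follow instructions for tasks it was never explicitly tuned on.
for example in instruction_examples:
    print(example["instruction"], "->", example["target"])
```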

When it comes to dataset problems, Wiggers notes that “over-filtering” training data for examples that achieve high classifier scores can actually lead to worse model performance. This kind of excessive optimization can lead to a “misalignment between proxy and true objective(s)”: the model ends up tuned for the filtering metric rather than for what it is actually designed to achieve, according to Leo Gao, data scientist at EleutherAI.

Read the full story in VentureBeat 

——————————–


Inferring Concept Drift Without Labeled Data

In this incredibly thorough guide from Andrew Reed and Nisha Muktewar of Fast Forward Labs, the authors offer a deeply researched survey of the background, the challenges, and the approaches to addressing what’s known as “concept drift” in production ML models. Put simply, concept drift refers to the fact that the data a model was trained on often stops reflecting dynamic, shifting real-world conditions, so a model left alone after deployment won’t be able to adapt and generalize as those conditions change.
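
To give a flavor of what inferring drift without labels can look like in practice, here is a minimal sketch (our own illustration, not code from the report): compare the model’s prediction-confidence distribution on a reference window against a recent production window using a two-sample Kolmogorov-Smirnov test, and flag drift when the two distributions diverge.

```python
# A minimal, label-free drift check (our own sketch, not code from the report):
# compare the model's prediction-confidence distribution on a reference window
# against a recent production window with a two-sample Kolmogorov-Smirnov test.
import numpy as np
from scipy.stats import ks_2samp


def confidence_drift(reference_scores, production_scores, alpha=0.05):
    """Flag drift when the two confidence distributions differ significantly."""
    statistic, p_value = ks_2samp(reference_scores, production_scores)
    return {"ks_statistic": statistic, "p_value": p_value, "drift_detected": p_value < alpha}


# Synthetic confidences for illustration: the production window skews lower,
# mimicking a model growing less certain as real-world conditions shift.
rng = np.random.default_rng(0)
reference = rng.beta(8, 2, size=5_000)
production = rng.beta(5, 3, size=5_000)
print(confidence_drift(reference, production))
```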

This impressive guide hammers home the point that an ML team’s work isn’t done once a model has been deployed; in most cases, the work is just beginning. Not only does the guide do a great job of clearly defining concept drift, it also provides practical approaches to addressing it, a sample use case, and a number of additional concerns ML teams should keep in mind, ranging from ethics to handling concept drift within “big data” systems.

Read the full report from Fast Forward Labs

——————————–


From Distill Pub: Graph Neural Networks Deep Dive

While we recently reported on Distill’s hiatus (which they are still on, apparently), we were treated to a bit of a surprise last week when they shared a two-part series on graph neural networks, authored by several researchers at Google.

Part 1 provides an overview of graph neural networks, including how they’re structured, how they work, and the kinds of problems they’re fit to solve.

Part 2 zooms in on convolutions on graphs, which form the “building blocks and design choices of graph neural networks.”
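
If you’d like a one-screen picture of what a convolution on a graph actually computes before diving in, here is a minimal NumPy sketch (our own illustration, not code from the Distill articles) of a single message-passing layer.

```python
# A one-layer "convolution on a graph" in plain NumPy (our own illustration,
# not code from the Distill series): average each node's neighborhood
# (with self-loops), then apply a learned linear transform and a ReLU.
import numpy as np


def graph_conv_layer(adjacency, features, weights):
    a_hat = adjacency + np.eye(adjacency.shape[0])   # add self-loops
    degree = a_hat.sum(axis=1, keepdims=True)
    messages = (a_hat / degree) @ features            # mean over each neighborhood
    return np.maximum(messages @ weights, 0.0)        # ReLU nonlinearity


# Toy example: a 4-node path graph with 3-dimensional node features
# and a hypothetical hidden size of 8.
adjacency = np.array([[0, 1, 0, 0],
                      [1, 0, 1, 0],
                      [0, 1, 0, 1],
                      [0, 0, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 3))
weights = rng.normal(size=(3, 8))
print(graph_conv_layer(adjacency, features, weights).shape)  # (4, 8)
```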

——————————–


A friendly introduction to machine learning compilers and optimizers

In this helpful guide, Stanford instructor and ML engineer Chip Huyen focuses on deploying ML models on the edge, drilling down into topics like model compatibility and performance, the differences between deploying in the cloud vs. on the edge, best practices for optimizing models for this kind of deployment, a survey of the different kinds of compilers and optimization methods, and more.
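
As a small taste of the kind of optimization Huyen covers, here is a hedged sketch (our own example, not taken from the article) of post-training quantization with TensorFlow Lite, one common way to shrink a model for edge deployment.

```python
# Post-training quantization with the TensorFlow Lite converter; the SavedModel
# path and output filename below are placeholders, not anything from the article.
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model("path/to/saved_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # default post-training quantization
tflite_model = converter.convert()

with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```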

Read Huyen’s full article here

Austin Kodra

Austin Kodra is the Head of Community at Comet, where he works with Comet's talented community of Data Scientists and Machine Learners.