July 29, 2024
Welcome to issue #18 of The Comet Newsletter!
In this week’s issue, we share a report that examines how big tech players are working to self-regulate their approaches to AI ethics.
Additionally, we share details on several new projects, including a textless NLP model from Facebook, a novel approach to 3D dance generation from Google, and a new TensorFlow Python package that makes it easier than ever to train similarity models.
And be sure to follow us on Twitter and LinkedIn — drop us a note if you have something we should cover in an upcoming issue!
Happy Reading,
Austin
Head of Community, Comet
INDUSTRY | PROJECTS
While we’ve previously covered issues of bias and ethics in the industry, this report from Paresh Dave and Jeffrey Dastin for Reuters unearths some of the ways the big tech players are debating the ethical concerns of AI internally.
The authors cite examples like Google pulling back from AI features that predict who should receive loans, and Microsoft weighing speech tech that could help those with vocal impairments but could also give rise to troubling political deepfakes.
Beyond these examples, this report touches on a deeper point that is less frequently discussed—should these companies be tasked with creating their own ethics regulations, committees, and policies to begin with? Or should governments and regulatory agencies step in and provide oversight that would standardize the way the industry must tackle these difficult challenges and questions?
Read the full report in Reuters here.
INDUSTRY | PROJECTS
This new Python package from the TensorFlow team is intended to make training similarity models (often used in search applications) much faster and easier. The TF team tackled this problem by relying on contrastive deep learning approaches, which teach models how to learn a given embedding space “in which similar examples are close together while dissimilar ones are far apart.”
While not necessarily a new technical approach to the problem of identifying similarity, this new library allows ML engineers to quickly and efficiently access a new Keras model that natively supports indexing and querying for these embeddings. To help the community get started with this new library, the TensorFlow team has provided a Hello World Tutorial, as well as more information and code resources on the project’s GitHub repo.
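To make the "close together, far apart" idea concrete, here is a minimal, library-free sketch in plain Python. It does not use the TensorFlow package's API. The embeddings, labels, margin value, and function names are all hypothetical and chosen only for illustration, and the loss shown is the classic pairwise contrastive loss, which may differ from the losses the library actually ships.

```python
import math


def cosine_distance(a, b):
    """Cosine distance: 0 when two vectors point the same way, larger as they diverge."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def contrastive_loss(a, b, is_similar, margin=0.5):
    """Pairwise contrastive loss (margin is an assumed hyperparameter).

    Similar pairs are penalized for any distance between them;
    dissimilar pairs are penalized only if they sit closer than `margin`.
    """
    d = cosine_distance(a, b)
    if is_similar:
        return d ** 2
    return max(0.0, margin - d) ** 2


def nearest(index, query, k=1):
    """Brute-force query: return the labels of the k embeddings closest to `query`."""
    ranked = sorted(index, key=lambda item: cosine_distance(item[1], query))
    return [label for label, _ in ranked[:k]]


# Hypothetical 2-D embeddings standing in for the output of a trained model.
index = [
    ("cat", [0.9, 0.1]),
    ("kitten", [0.8, 0.2]),
    ("truck", [0.1, 0.9]),
]

closest = nearest(index, [1.0, 0.0], k=2)
print(closest)
```

In a real system, the embeddings would come from a trained model, and the brute-force sort would typically be replaced by an approximate nearest-neighbor index.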
Read the full blog post announcement here.
INDUSTRY | PROJECTS
For many of us, learning to dance is…difficult. For those of us who aren’t naturally inclined to the art form, matching our movement patterns to musical rhythms can be a daunting task. But developing a machine learning system that can learn to dance is even more difficult: the movements an ML system generates need to be continuous and kinematically complex, and they have to reflect the often non-linear relationship between movement and a given piece of music.
In this technical blog post from the Google AI team, researchers set out to tackle this exact problem. At a high level, the team had to construct a complicated, extensively annotated dataset of computer-generated movements mapped onto different pieces of music. They then used this dataset to train a “Full Attention Cross-Modal Transformer (FACT) Model”, which is designed to generate novel 3D dance from music inputs.
It’s an impressive and unique approach, and their full write-up goes into a whole lot more detail on the technical specifics, as well as how they’ve evaluated the model’s performance.
Read Google AI’s full blog post here.
INDUSTRY | PROJECTS
Without a doubt, large language models represent one of the most important advancements in the ML industry in recent years. But while these models have spurred incredible and creative use cases, they also have a number of significant limitations. One of the biggest is that applications are generally limited to languages for which there are massive datasets suitable for training these models.
Facebook AI is looking to address this problem with its latest NLP architecture: Generative Spoken Language Model (GSLM). This novel architecture centers on new representation learning mechanisms, which allow it to learn directly from raw audio signals, without any labels or text.
The deep dive on Facebook AI’s blog covers the benefits of a textless NLP approach, their team’s development of a baseline model, a discussion about the model’s limitations, and what they plan to do next.