Structured Generation for LLM-as-a-Judge Evaluations
For the past few months, I’ve been working on LLM-based evaluations (“LLM-as-a-Judge” metrics) for language models. The results have so…
Today, we’re excited to release version 2.0 of Kangas, our open-source platform for exploring, analyzing, and visualizing multi-media data. Whether…
Thousands of data scientists use Comet panels, histograms, and reports to visualize data from experiments every day. While we’re proud…