Major Releases: MCP Server & Google Agent Dev Kit Support
We’ve just rolled out two major updates in Opik, Comet’s open-source LLM evaluation platform, that make it easier than ever…
Six years ago, I decided to open-source my Python code for a personal project I was working on, which led…
Large language models (LLMs) have revolutionized natural language processing, yet most development and evaluation efforts have historically centered around Latin-script…
Detecting hallucinations in language models is challenging. There are three general approaches: Measuring token-level probability distributions for indications that a…
Even ChatGPT knows it’s not always right. When prompted, “Are large language models (LLMs) always accurate?” ChatGPT says no and…
Spring is in the air, and we’re excited to bring you four fresh releases in the Comet platform to make…
As teams work on complex AI agents and expand what LLM-powered applications can achieve, a variety of LLM evaluation frameworks…