SelfCheckGPT for LLM Evaluation
Detecting hallucinations in language models is challenging. There are three general approaches: Measuring token-level probability distributions for indications that a…
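The sampling-based idea behind SelfCheckGPT can be sketched in miniature: generate several independent responses to the same prompt, then score each sentence of the main answer by how often those samples fail to support it. The sketch below is a toy stand-in, not the published algorithm: the real SelfCheckGPT variants score support with NLI, BERTScore, or question answering, while here simple word overlap and a 0.5 threshold are used purely for illustration.

```python
def selfcheck_score(sentence, sampled_responses):
    """Toy SelfCheckGPT-style inconsistency score: the fraction of
    sampled responses that do NOT appear to support the sentence.
    Word overlap is an illustrative stand-in for the NLI/BERTScore/QA
    scorers used by the actual method; 0.5 is an arbitrary threshold."""
    # Content words of the sentence under test (short words dropped).
    words = {w.lower().strip(".,") for w in sentence.split() if len(w) > 3}
    unsupported = 0
    for resp in sampled_responses:
        resp_words = {w.lower().strip(".,") for w in resp.split()}
        overlap = len(words & resp_words) / max(len(words), 1)
        if overlap < 0.5:  # too little overlap: sample doesn't back the claim
            unsupported += 1
    return unsupported / max(len(sampled_responses), 1)

# Higher scores suggest the sentence is less consistently reproduced
# across samples, i.e. more likely hallucinated.
samples = [
    "The capital of France is Paris.",
    "France's capital city is Paris.",
    "Berlin is the capital of Germany.",
]
score = selfcheck_score("Paris is the capital of France", samples)
print(score)  # ≈ 0.33: one of three samples fails to support the sentence
```

Because the score needs no access to model internals, this style of check also works with black-box APIs, which is a key selling point of the sampling-based approach over token-probability methods.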