SelfCheckGPT for LLM Evaluation
Detecting hallucinations in language models is challenging. There are three general approaches: Measuring token-level probability distributions for indications that a…
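The core idea behind SelfCheckGPT is that if a model has genuinely grounded knowledge of a fact, stochastically re-sampled responses should agree with it, while hallucinated content will vary across samples. Below is a minimal, hypothetical sketch of that consistency check: it scores each sentence of a response by its lexical overlap with other sampled responses. The real method uses stronger support scorers (e.g. NLI models or BERTScore), and all function names here are illustrative, not from a specific library.

```python
# Simplified sketch of the SelfCheckGPT consistency idea: a sentence that
# is well supported by other stochastically sampled responses is unlikely
# to be a hallucination. Support is approximated here by unigram overlap;
# the actual method uses stronger scorers (e.g. NLI or BERTScore).

def _tokens(text):
    """Lowercased unigram set for a crude lexical comparison."""
    return set(text.lower().split())

def sentence_support(sentence, samples):
    """Average fraction of the sentence's words found in each sample."""
    words = _tokens(sentence)
    if not words or not samples:
        return 0.0
    overlaps = [len(words & _tokens(s)) / len(words) for s in samples]
    return sum(overlaps) / len(overlaps)

def hallucination_scores(response_sentences, sampled_responses):
    """Higher score = less support from the samples = more suspect."""
    return [1.0 - sentence_support(s, sampled_responses)
            for s in response_sentences]
```

For example, a sentence repeated verbatim across samples scores 0.0 (fully supported), while a sentence sharing no words with any sample scores 1.0 (maximally suspect). Swapping the overlap function for an NLI entailment probability recovers something much closer to the published method.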