LLM Juries for Evaluation
Evaluating the correctness of generated responses is an inherently challenging task. LLM-as-a-Judge evaluators have gained popularity for their ability to…
LLM-as-a-judge evaluators have gained widespread adoption due to their flexibility, scalability, and close alignment with human judgment. They excel at…
BERTScore represents a pivotal shift in LLM evaluation, moving beyond traditional heuristic-based metrics like BLEU and ROUGE to a…
While LLM usage is soaring, productionizing an LLM-powered application or software product presents new and different challenges compared to traditional…
Follow the evolution of my personal AI project and discover how to integrate image analysis, LLMs, and LLM-as-a-judge evaluation…
Perplexity is, historically speaking, one of the "standard" evaluation metrics for language models. And while recent years have seen a…
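As a quick refresher (not part of the linked article's excerpt): perplexity is the exponentiated average negative log-likelihood per token. A minimal Python sketch, using hypothetical log-probability values:

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp of the average negative log-likelihood per token."""
    n = len(token_logprobs)
    return math.exp(-sum(token_logprobs) / n)

# Hypothetical per-token log-probs for a 4-token sequence.
print(perplexity([-1.2, -0.4, -2.1, -0.7]))  # ≈ 3.0 (lower is better)
```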
A guest post from Fabrício Ceolin, DevOps Engineer at Comet. Inspired by the growing demand for large-scale language models, Fabrício…