Comet Community Hub Archives

March 27, 2025

Vincent Koc

LLM Evaluation Complexities for Non-Latin Languages

Large language models (LLMs) have revolutionized natural language processing, yet most development and evaluation efforts have historically centered around Latin-script…

Tutorials LLMOps Comet Community Hub

March 26, 2025

Abby Morgan

SelfCheckGPT for LLM Evaluation

Detecting hallucinations in language models is challenging. There are three general approaches: Measuring token-level probability distributions for indications that a…

Tutorials LLMOps Comet Community Hub

February 24, 2025

Abby Morgan

LLM Juries for Evaluation

Evaluating the correctness of generated responses is an inherently challenging task. LLM-as-a-Judge evaluators have gained popularity for their ability to…

LLMOps Comet Community Hub

February 5, 2025

Stéphan André

LLM Monitoring & Maintenance in Production Applications

Generative AI has become a transformative force, revolutionizing how businesses engage with users through chatbots, content creation, and personalized recommendations.…

Product Tutorials Machine Learning LLMOps Comet Community Hub

January 28, 2025

Abby Morgan

G-Eval for LLM Evaluation

LLM-as-a-judge evaluators have gained widespread adoption due to their flexibility, scalability, and close alignment with human judgment. They excel at…

Tutorials LLMOps Comet Community Hub

December 19, 2024

Abby Morgan

BERTScore For LLM Evaluation

BERTScore represents a pivotal shift in LLM evaluation, moving beyond traditional heuristic-based metrics like BLEU and ROUGE to a…

Tutorials LLMOps Comet Community Hub

December 9, 2024

Claire Longo

Building ClaireBot, an AI Personal Stylist Chatbot

Follow the evolution of my personal AI project and discover how to integrate image analysis, LLMs, and LLM-as-a-judge evaluation…
