
Opik

Opik is a platform for evaluating, testing, and monitoring LLM applications. It provides observability tools for inspecting and tuning language model outputs during both development and production, so you can assess an application before deploying it and keep watching it afterwards.

Opik can be used for:

  • Observability: Log all your LLM calls and chains during development and in production
  • Evaluation: Store your evaluation datasets in Opik and easily evaluate the performance of your LLM applications using Opik's built-in evaluation metrics (Hallucination, Context Relevance, and more) or using custom metrics
  • Testing: Use Opik's integration with pytest to automate the testing of your LLM application before it is deployed to production
  • Production: Monitor and debug your LLM applications in production
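
To make the ideas in the list above concrete, here is a minimal sketch of call logging, a custom metric, and a pytest-style check. It uses only the Python standard library and does not use Opik's API: `log_call`, `TRACES`, `fake_llm`, `exact_match`, `evaluate`, and `DATASET` are all hypothetical names invented for illustration, and the in-memory list stands in for whatever backend a real tool would write to.

```python
import functools
import time

# Hypothetical in-memory trace store (an assumption for this sketch;
# a real observability tool would send records to its own backend).
TRACES = []


def log_call(func):
    """Record the inputs, output, and latency of each call to `func`."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        TRACES.append({
            "name": func.__name__,
            "args": args,
            "kwargs": kwargs,
            "output": result,
            "seconds": time.perf_counter() - start,
        })
        return result
    return wrapper


@log_call
def fake_llm(prompt: str) -> str:
    # Stand-in for a real model call, so the sketch runs offline.
    return "Paris" if "France" in prompt else "I don't know"


def exact_match(output: str, expected: str) -> float:
    """Custom metric: 1.0 if the strings match after normalization, else 0.0."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0


# Tiny evaluation dataset (illustrative data only).
DATASET = [
    {"input": "What is the capital of France?", "expected": "Paris"},
    {"input": "What is the capital of Peru?", "expected": "Lima"},
]


def evaluate(task, dataset, metric) -> float:
    """Run `task` on every item and return the mean metric score."""
    scores = [metric(task(item["input"]), item["expected"]) for item in dataset]
    return sum(scores) / len(scores)


def test_capital_accuracy():
    # pytest-style check: fail the build if accuracy drops below a threshold.
    assert evaluate(fake_llm, DATASET, exact_match) >= 0.5
```

Calling `evaluate(fake_llm, DATASET, exact_match)` scores the stand-in model on the dataset and, as a side effect, appends one trace record per call to `TRACES`. The same three pieces (a logged call, a metric, a thresholded test) map onto the Observability, Evaluation, and Testing items above; consult the Opik documentation for the library's actual decorators and metric classes.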

Learn more

The full Opik documentation is available here.

Getting started

Explore the following guides to get started with Opik:

Sep. 17, 2024