January 13, 2025

Opik Dashboard:

  • Datasets are now supported in the playground, allowing you to quickly evaluate prompts on multiple samples
  • Updated the models supported in the playground
  • Updated the quickstart guides to include all the supported integrations
  • Fixed an issue that prevented traces with text inputs from being added to datasets
  • Added the ability to edit dataset descriptions in the UI
  • Released online evaluation rules: you can now define LLM as a Judge metrics that automatically score all, or a subset, of your production traces

SDK:

  • New integration with CrewAI
  • Released a new evaluate_prompt method that simplifies the evaluation of simple prompt templates
  • Added Sentry to the Python SDK so we can more easily