Changelog

Opik Dashboard:

  • We have revamped the traces table; the header row is now sticky at the top of the page when scrolling

  • As part of this revamp, we also made rows clickable to make it easier to open the traces sidebar

  • Added visualizations in the experiment comparison page to help you analyze your experiments

  • You can now filter traces by empty feedback scores in the traces table

  • Added support for Gemini options in the playground

  • Updated the experiment creation code

  • Many performance improvements

Python and JS / TS SDK:

  • Added support for Anthropic cost tracking when using the LangChain integration
  • Added support for images in google.genai calls
  • The LangFlow integration has now been merged

Opik Dashboard:

  • Added CSV export for the experiment comparison page

  • Added a pretty mode for rendering trace and span input / output fields

  • Improved pretty mode to support new line characters and tabs

  • Added time support for the Opik datetime filter

  • Improved tooltips for long text

  • Added a reason field for feedback scores to JSON downloads

Python and JS / TS SDK:

  • Day 0 integration with OpenAI Agents
  • Fixed issue with get_experiment_by_name method
  • Added cost tracking for Anthropic integration
  • Sped up the import time of the Opik library from ~5 seconds to less than 1 second
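Deferring heavy imports until first use is one common way to cut a library's import time; whether Opik's speedup used exactly this technique is an assumption on our part. The Python standard library documents a lazy-loading recipe built on `importlib.util.LazyLoader`:

```python
import importlib.util
import sys

def lazy_import(name):
    """Return a module whose real import is deferred until first attribute access."""
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # registers the module but does not run its body yet
    return module

# "fractions" stands in for any expensive dependency here.
fractions = lazy_import("fractions")
half = fractions.Fraction(1, 2)  # first attribute access triggers the real import
```

With this pattern, `import mylib` stays cheap and the cost of heavy submodules is only paid by the code paths that actually touch them.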

Opik Dashboard:

  • Chat conversations can now be reviewed in the platform
  • Added the ability to leave comments on experiments
  • You can now leave reasons on feedback scores, see Annotating Traces
  • Added support for Gemini in the playground
  • A thumbs up / down feedback score definition is now added to all projects by default to make it easier to annotate traces.

JS / TS SDK:

  • The AnswerRelevanceMetric can now be run without providing a context field
  • Made some updates to how metrics are uploaded to optimize data ingestion

Opik Dashboard:

  • You can now add comments to your traces, allowing for better collaboration
  • Added support for OpenRouter in the playground - you can now use over 300 different models in the playground!

JS / TS SDK:

  • Added support for JSON data format in our OpenTelemetry endpoints
  • Added a new opik healthcheck command in the Python SDK which simplifies the debugging of connectivity issues

Opik Dashboard:

  • Improved the UX when navigating between the project list page and the traces page

Python SDK:

  • Made the logging of spans and traces optional when using Opik LLM metrics
  • Added a new integration with the genai library

JS / TS SDK:

  • Added logs and better error handling

Opik Dashboard:

  • Added support for local models in the Opik playground

Python SDK:

  • Improved the @track decorator to better support nested generators.
  • Added a new Opik.copy_traces(project_name, destination_project_name) method to copy traces from one project to another.
  • Added support for searching for traces that have feedback scores with spaces in their name.
  • Improved the LangChain and LangGraph integrations
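The nested-generator case is subtle because a tracing decorator must keep its span open until the generator is exhausted, not just until the decorated call returns the generator object. A minimal conceptual sketch of that idea (not Opik's actual @track implementation; the SPANS list stands in for real span logging):

```python
import functools
import inspect

SPANS = []  # (event, function name) records standing in for real span logging

def track(fn):
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        if inspect.isgeneratorfunction(fn):
            # Wrap the generator so the span only closes once it is exhausted.
            def traced():
                SPANS.append(("start", fn.__name__))
                try:
                    yield from fn(*args, **kwargs)
                finally:
                    SPANS.append(("end", fn.__name__))
            return traced()
        SPANS.append(("start", fn.__name__))
        try:
            return fn(*args, **kwargs)
        finally:
            SPANS.append(("end", fn.__name__))
    return wrapper

@track
def inner():
    yield 1
    yield 2

@track
def outer():
    yield from inner()

values = list(outer())
# SPANS now records properly nested spans:
# start outer, start inner, end inner, end outer
```

Without the generator branch, the span for `outer` would close as soon as the generator object was created, before a single value had been produced.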

JS / TS SDK:

  • Released the Vercel AI integration
  • Added support for logging feedback scores

Opik Dashboard:

  • You can now view feedback scores for your projects in the Opik home page
  • Added line highlights in the quickstart page
  • Allow users to download experiments as CSV and JSON files for further analysis

Python SDK:

  • Updated the evaluate_* methods so feedback scores are logged as soon as they are computed, rather than at the end of the experiment as before
  • Released a new usefulness metric
  • Warning messages about a missing API key are no longer displayed when Opik logging is disabled
  • Added a method to list the datasets in a workspace
  • Added a method to list the experiments linked to a dataset

JS / TS SDK:

  • Official release of the first version of the SDK
  • Support logging traces using the low-level Opik client and an experimental decorator.

Opik Dashboard:

  • Performance improvements for workspaces with hundreds of millions of traces
  • Added support for cost tracking when using Gemini models
  • Added the ability to diff prompts

SDK:

  • Fixed the evaluate and evaluate_* functions to better support event loops, particularly useful when using Ragas metrics
  • Added support for Bedrock invoke_agent API
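Calling async metric code from synchronous evaluate functions breaks when an event loop is already running, as it is in Jupyter notebooks where Ragas metrics are commonly used. A hypothetical sketch of the general pattern behind such fixes (not Opik's actual code; `score` stands in for an async metric):

```python
import asyncio
import threading

def run_coroutine(coro):
    """Run a coroutine from sync code, whether or not a loop is already running."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop in this thread: safe to create one.
        return asyncio.run(coro)
    # A loop is already running (e.g. a notebook); asyncio.run() would raise
    # here, so execute the coroutine on a fresh loop in a worker thread.
    # Error propagation is omitted for brevity.
    result = {}
    def worker():
        result["value"] = asyncio.run(coro)
    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    return result["value"]

async def score(answer):
    # Stand-in for an async metric such as a Ragas scorer.
    await asyncio.sleep(0)
    return len(answer) / 10

sync_result = run_coroutine(score("hello"))   # called with no loop running

async def from_inside_a_loop():
    return run_coroutine(score("hi"))         # called while a loop is running

nested_result = asyncio.run(from_inside_a_loop())
```

The worker-thread fallback matters because `asyncio.run()` refuses to start when the current thread already has a running loop, which is exactly the notebook situation the bullet above describes.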

Opik Dashboard:

  • Added logs for online evaluation rules so that you can more easily ensure your online evaluation metrics are working as expected
  • Added auto-complete support in the variable mapping section of the online evaluation rules modal
  • Added support for Anthropic models in the playground
  • Experiments are now created when using datasets in the playground
  • Improved the Opik home page
  • Updated the code snippets in the quickstart to make them easier to understand

SDK:

  • Improved support for litellm completion kwargs
  • Relaxed the required LiteLLM version to avoid conflicts with other Python packages

Opik Dashboard:

  • Datasets are now supported in the playground allowing you to quickly evaluate prompts on multiple samples
  • Updated the models supported in the playground
  • Updated the quickstart guides to include all the supported integrations
  • Fixed an issue that prevented traces with text inputs from being added to datasets
  • Added the ability to edit dataset descriptions in the UI
  • Released online evaluation rules - you can now define LLM-as-a-Judge metrics that will automatically score all, or a subset, of your production traces

Online evaluation

SDK:

  • New integration with CrewAI
  • Released a new evaluate_prompt method that simplifies the evaluation of simple prompt templates
  • Added Sentry to the Python SDK so we can more easily track and fix errors