Opik Dashboard:
- We have revamped the traces table: the header row is now sticky at the top of the page when scrolling
- As part of this revamp, we also made rows clickable to make it easier to open the traces sidebar
- Added visualizations in the experiment comparison page to help you analyze your experiments
- You can now filter traces by empty feedback scores in the traces table
- Added support for Gemini options in the playground
- Updated the experiment creation code
- Many performance improvements
Python and JS / TS SDK:
- Add support for Anthropic cost tracking when using the LangChain integration
- Add support for images in google.genai calls
- LangFlow integration has now been merged
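Under the hood, provider cost tracking reduces to multiplying the token usage reported with each response by per-token rates. A minimal sketch of the arithmetic, using made-up placeholder rates that are not Anthropic's actual pricing:

```python
# Illustrative sketch of cost tracking: multiply reported token usage by
# per-million-token rates. The rates below are placeholders, NOT Anthropic's
# actual pricing -- real tracking uses the provider's current price sheet.
PLACEHOLDER_RATES = {
    "claude-example": {"input": 3.00, "output": 15.00},  # USD per 1M tokens (made up)
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return an estimated USD cost for one LLM call."""
    rates = PLACEHOLDER_RATES[model]
    return (input_tokens * rates["input"] + output_tokens * rates["output"]) / 1_000_000

cost = estimate_cost("claude-example", input_tokens=1_000, output_tokens=500)
print(round(cost, 6))  # → 0.0105
```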
Opik Dashboard:
- Added CSV export for the experiment comparison page
- Added a pretty mode for rendering trace and span input / output fields
- Improved pretty mode to support new line characters and tabs
- Added time support for the Opik datetime filter
- Improved tooltips for long text
- Added a `reason` field for feedback scores to JSON downloads
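With the `reason` field included, each exported feedback score carries its justification alongside the value. A minimal sketch of reading it back, with a hypothetical payload (the exact export schema may differ):

```python
import json

# Hypothetical excerpt of a downloaded trace export; the real schema may
# differ. The point: each feedback score can now carry a "reason" string
# alongside its numeric value.
exported = json.loads("""
{
  "feedback_scores": [
    {"name": "relevance", "value": 0.9, "reason": "Answer cites the source"},
    {"name": "thumbs_up_down", "value": 0.0, "reason": "Missed the second question"}
  ]
}
""")

for score in exported["feedback_scores"]:
    # .get() keeps older exports without a reason from raising KeyError
    print(f'{score["name"]}={score["value"]} ({score.get("reason", "no reason")})')
```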
Python and JS / TS SDK:
- Day 0 integration with OpenAI Agents
- Fixed an issue with the `get_experiment_by_name` method
- Added cost tracking for the Anthropic integration
- Sped up the import time of the Opik library from ~5 seconds to less than 1 second
Opik Dashboard:
- Chat conversations can now be reviewed in the platform

- Added the ability to leave comments on experiments
- You can now leave reasons on feedback scores; see Annotating Traces
- Added support for Gemini in the playground
- A thumbs up / down feedback score definition is now added to all projects by default to make it easier to annotate traces.
JS / TS SDK:
- The AnswerRelevanceMetric can now be run without providing a context field
- Made some updates to how metrics are uploaded to optimize data ingestion
Opik Dashboard:
- You can now add comments to your traces, allowing for better collaboration

- Added support for OpenRouter in the playground: you can now use over 300 different models in the playground!

JS / TS SDK:
- Added support for JSON data format in our OpenTelemetry endpoints
- Added a new `opik healthcheck` command in the Python SDK, which simplifies debugging connectivity issues
Opik Dashboard:
- Improved the UX when navigating between the project list page and the traces page
Python SDK:
- Make the logging of spans and traces optional when using Opik LLM metrics
- New integration with genai library
JS / TS SDK:
- Added logs and better error handling
Opik Dashboard:
- Added support for local models in the Opik playground

Python SDK:
- Improved the `@track` decorator to better support nested generators
- Added a new `Opik.copy_traces(project_name, destination_project_name)` method to copy traces from one project to another
- Added support for searching for traces that have feedback scores with spaces in their name
- Improved the LangChain and LangGraph integrations
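The nested-generator case means a decorated generator function that delegates to another decorated generator. A stand-in decorator (not Opik's actual implementation) sketches the shape that now works: the wrapper must pass yielded items through transparently at every level of nesting.

```python
import functools

def track(fn):
    """Stand-in for a tracing decorator (NOT Opik's implementation). When the
    wrapped function returns a generator, we must re-yield every item so the
    caller sees the same stream, even when one tracked generator delegates
    to another tracked generator."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        result = fn(*args, **kwargs)
        if hasattr(result, "__iter__") and hasattr(result, "__next__"):
            def traced_gen():
                for item in result:  # a real tracer would record each item here
                    yield item
            return traced_gen()
        return result
    return wrapper

@track
def inner():
    yield from ("a", "b")

@track
def outer():
    yield from inner()  # nested tracked generator
    yield "c"

print(list(outer()))  # → ['a', 'b', 'c']
```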
JS / TS SDK:
- Released the Vercel AI integration
- Added support for logging feedback scores
Opik Dashboard:
- You can now view feedback scores for your projects in the Opik home page
- Added line highlights in the quickstart page
- Allow users to download experiments as CSV and JSON files for further analysis
Python SDK:
- Updated the `evaluate_*` methods so feedback scores are logged after they are computed, rather than at the end of the experiment as previously
- Released a new usefulness metric
- Do not display warning messages about missing API key when Opik logging is disabled
- Add method to list datasets in a workspace
- Add method to list experiments linked to a dataset
JS / TS SDK:
- Official release of the first version of the SDK
- Support logging traces using the low-level Opik client and an experimental decorator.
Opik Dashboard:
- Performance improvements for workspaces with hundreds of millions of traces
- Added support for cost tracking when using Gemini models
- Allow users to diff prompts
SDK:
- Fixed the `evaluate` and `evaluate_*` functions to better support event loops, particularly useful when using Ragas metrics
- Added support for the Bedrock `invoke_agent` API
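Event-loop support matters because async metrics (such as Ragas scorers) cannot be started with `asyncio.run()` when a loop is already running, e.g. inside a notebook. A general sketch of the pattern (not Opik's exact fix): detect a running loop and, if there is one, run the coroutine on a fresh loop in a worker thread.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

async def async_metric(answer: str) -> float:
    """Stand-in for an async metric (e.g. a Ragas scorer)."""
    await asyncio.sleep(0)  # pretend to await an LLM call
    return float(len(answer) > 0)

def score_sync(answer: str) -> float:
    """Run an async metric from synchronous code. If an event loop is already
    running (e.g. in a notebook), asyncio.run() would raise RuntimeError, so
    fall back to running the coroutine on a fresh loop in a worker thread."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        return asyncio.run(async_metric(answer))  # no loop running: simple case
    with ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(asyncio.run, async_metric(answer)).result()

print(score_sync("hello"))  # → 1.0
```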
Opik Dashboard:
- Added logs for online evaluation rules so that you can more easily ensure your online evaluation metrics are working as expected
- Added auto-complete support in the variable mapping section of the online evaluation rules modal
- Added support for Anthropic models in the playground
- Experiments are now created when using datasets in the playground
- Improved the Opik home page
- Updated the code snippets in the quickstart to make them easier to understand
SDK:
- Improved support for litellm completion kwargs
- LiteLLM required version is now relaxed to avoid conflicts with other Python packages
Opik Dashboard:
- Datasets are now supported in the playground allowing you to quickly evaluate prompts on multiple samples
- Updated the models supported in the playground
- Updated the quickstart guides to include all the supported integrations
- Fixed an issue that prevented traces with text inputs from being added to datasets
- Add the ability to edit dataset descriptions in the UI
- Released online evaluation rules: you can now define LLM-as-a-Judge metrics that automatically score all, or a subset, of your production traces
SDK:
- New integration with CrewAI
- Released a new `evaluate_prompt` method that simplifies the evaluation of simple prompt templates
- Added Sentry to the Python SDK so we can more easily
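Evaluating a prompt template amounts to rendering it against each dataset item, calling the model, and scoring the outputs. A hypothetical sketch of that loop (the function and parameter names here are illustrative, not the `evaluate_prompt` API):

```python
# Illustrative sketch of prompt-template evaluation, NOT the evaluate_prompt
# API: render one template per dataset item, call the model, score the output.
def evaluate_prompt_sketch(template: str, dataset: list, llm, metric) -> list:
    scores = []
    for item in dataset:
        prompt = template.format(**item)  # fill template variables from the item
        output = llm(prompt)
        scores.append(metric(output, item))
    return scores

dataset = [
    {"question": "2+2?", "expected": "4"},
    {"question": "Capital of France?", "expected": "Paris"},
]
# Deterministic stand-ins so the sketch runs without a real model
fake_llm = lambda prompt: "4" if "2+2" in prompt else "Paris"
exact_match = lambda out, item: 1.0 if out == item["expected"] else 0.0

print(evaluate_prompt_sketch("Answer concisely: {question}", dataset, fake_llm, exact_match))  # → [1.0, 1.0]
```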