Observability for LlamaIndex with Opik

LlamaIndex is a flexible “data framework” for building LLM applications. It provides the following tools:

  • Data connectors to ingest your existing data sources and data formats (APIs, PDFs, docs, SQL, etc.).
  • Ways to structure your data (indices, graphs) so that it can be easily used with LLMs.
  • An advanced retrieval/query interface over your data: feed in any LLM input prompt and get back retrieved context and a knowledge-augmented output.
  • Easy integrations with your outer application framework (e.g. LangChain, Flask, Docker, ChatGPT, anything else).

Account Setup

Comet provides a hosted version of the Opik platform: simply create an account and grab your API key.

You can also run the Opik platform locally; see the installation guide for more information.

Getting Started

Installation

To use the Opik integration with LlamaIndex, you’ll need to have both the opik and llama_index packages installed. You can install them using pip:

$ pip install opik llama-index llama-index-agent-openai llama-index-llms-openai llama-index-callbacks-opik

Configuring Opik

Configure the Opik Python SDK for your deployment type. See the Python SDK Configuration guide for detailed instructions on:

  • CLI configuration: opik configure
  • Code configuration: opik.configure()
  • Self-hosted vs Cloud vs Enterprise setup
  • Configuration files and environment variables
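
For example, configuring the SDK in code for the hosted (Comet) deployment might look like the following sketch; the values below are placeholders, and the full set of options is documented in the Python SDK Configuration guide:

import opik

# Point the SDK at your Comet workspace and API key (placeholder values shown).
# For a self-hosted deployment, see the Python SDK Configuration guide instead.
opik.configure(
    api_key="YOUR_OPIK_API_KEY",
    workspace="YOUR_WORKSPACE",
)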

Configuring LlamaIndex

In order to use LlamaIndex, you will need to configure your LLM provider API keys. For this example, we’ll use OpenAI; you can find or create your API key in your OpenAI account settings.

You can set them as environment variables:

$ export OPENAI_API_KEY="YOUR_API_KEY"

Or set them programmatically:

import os
import getpass

if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter your OpenAI API key: ")

Using the Opik integration

To use the Opik integration with LlamaIndex, you can use the set_global_handler function from the LlamaIndex package to set the global tracer:

from llama_index.core import global_handler, set_global_handler

set_global_handler("opik")
opik_callback_handler = global_handler

Now that the integration is set up, all the LlamaIndex runs will be traced and logged to Opik.

Alternatively, you can configure the callback handler directly for more control:

from llama_index.core import Settings
from llama_index.core.callbacks import CallbackManager
from opik.integrations.llama_index import LlamaIndexCallbackHandler

# Basic setup
opik_callback = LlamaIndexCallbackHandler()

# Or with optional parameters
opik_callback = LlamaIndexCallbackHandler(
    project_name="my-llamaindex-project",  # Set custom project name
    skip_index_construction_trace=True,  # Skip tracking index construction
)

Settings.callback_manager = CallbackManager([opik_callback])

The skip_index_construction_trace parameter is useful when you want to track only query operations and not the index construction phase (particularly for large document sets or pre-built indexes), as shown in the sketch below.
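
For instance, assuming an index was previously persisted to a local ./storage directory (both the directory and the question below are placeholders), you can load it with the handler configured so that only the query shows up in Opik:

from llama_index.core import Settings, StorageContext, load_index_from_storage
from llama_index.core.callbacks import CallbackManager
from opik.integrations.llama_index import LlamaIndexCallbackHandler

# Skip index-construction traces; only query-time spans are of interest here
opik_callback = LlamaIndexCallbackHandler(skip_index_construction_trace=True)
Settings.callback_manager = CallbackManager([opik_callback])

# Load an index previously saved with index.storage_context.persist("./storage")
storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context)

query_engine = index.as_query_engine()
print(query_engine.query("What did the author do growing up?"))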

Example

To showcase the integration, we will create a query engine that uses Paul Graham’s essay as the data source.

First step: Configure the Opik integration:

import os
from llama_index.core import global_handler, set_global_handler

# Set project name for better organization
os.environ["OPIK_PROJECT_NAME"] = "llamaindex-integration-demo"

set_global_handler("opik")
opik_callback_handler = global_handler

Second step: Download the example data:

import os
import requests

# Create directory if it doesn't exist
os.makedirs('./data/paul_graham/', exist_ok=True)

# Download the file using requests
url = 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt'
response = requests.get(url)
with open('./data/paul_graham/paul_graham_essay.txt', 'wb') as f:
    f.write(response.content)

Third step: Configure the OpenAI API key:

import os
import getpass

if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter your OpenAI API key: ")

Fourth step: Load the data, then create an index and a query engine:

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("./data/paul_graham").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()

response = query_engine.query("What did the author do growing up?")
print(response)

Since the Opik integration has been set up, all of these traces are logged to the Opik platform.

Using with the @track Decorator

The LlamaIndex integration seamlessly works with Opik’s @track decorator. When you call LlamaIndex operations inside a tracked function, the LlamaIndex traces will automatically be attached as child spans to your existing trace.

import opik
from llama_index.core import global_handler, set_global_handler
from llama_index.llms.openai import OpenAI
from llama_index.core.llms import ChatMessage

# Configure Opik integration
set_global_handler("opik")
opik_callback_handler = global_handler

@opik.track()
def my_llm_application(user_query: str):
    """Process user query with LlamaIndex"""
    llm = OpenAI(model="gpt-3.5-turbo")
    messages = [
        ChatMessage(role="system", content="You are a helpful assistant."),
        ChatMessage(role="user", content=user_query),
    ]

    response = llm.chat(messages)
    return response.message.content

# Call the tracked function
result = my_llm_application("What is the capital of France?")
print(result)

In this example, Opik will create a trace for the my_llm_application function, and all LlamaIndex operations (like the LLM chat call) will appear as nested spans within this trace, giving you a complete view of your application’s execution.

Using with Manual Trace Creation

You can also manually create traces using opik.start_as_current_trace() and have LlamaIndex operations nested within:

import opik
from llama_index.core import global_handler, set_global_handler
from llama_index.llms.openai import OpenAI
from llama_index.core.llms import ChatMessage

# Configure Opik integration
set_global_handler("opik")
opik_callback_handler = global_handler

# Create a manual trace
with opik.start_as_current_trace(name="user_query_processing"):
    llm = OpenAI(model="gpt-3.5-turbo")
    messages = [
        ChatMessage(role="user", content="Explain quantum computing in simple terms"),
    ]

    response = llm.chat(messages)
    print(response.message.content)

This approach is useful when you want more control over trace naming and want to group multiple LlamaIndex operations under a single trace.

Tracking LlamaIndex Workflows

LlamaIndex workflows are multi-step processing pipelines for LLM applications. To track workflow executions in Opik, you can manually decorate your workflow steps and use opik.start_as_current_span() to wrap the workflow execution.

Basic Workflow Tracking

You can use @opik.track() to decorate your workflow steps and opik.start_as_current_span() to track the workflow execution:

import opik
from llama_index.core.workflow import Workflow, StartEvent, StopEvent, step, Event
from llama_index.core import set_global_handler

# Configure Opik integration for LLM calls within steps
set_global_handler("opik")

class QueryEvent(Event):
    """Event for passing query through workflow."""
    query: str

class MyRAGWorkflow(Workflow):
    """Simple RAG workflow with tracked steps."""

    @step
    @opik.track()
    async def retrieve_context(self, ev: StartEvent) -> QueryEvent:
        """Retrieve relevant context for the query."""
        query = ev.get("query", "")
        # Your retrieval logic here
        context = f"Context for: {query}"
        return QueryEvent(query=f"{context} | {query}")

    @step
    @opik.track()
    async def generate_response(self, ev: QueryEvent) -> StopEvent:
        """Generate final response using the context."""
        # Your generation logic here
        result = f"Response based on: {ev.query}"
        return StopEvent(result=result)

# Create workflow instance
workflow = MyRAGWorkflow()

# Use start_as_current_span to track workflow execution
# (run this inside an async function, or in a notebook that supports top-level await)
with opik.start_as_current_span(
    name="rag_workflow_execution",
    input={"query": "What are the key features?"},
    project_name="llama-index-workflows",
) as span:
    result = await workflow.run(query="What are the key features?")
    span.update(output={"result": result})

print(result)
opik.flush_tracker()  # Ensure all traces are sent

In this example:

  • Each workflow step is decorated with @opik.track() to create spans
  • The @step decorator is placed before @opik.track() to ensure LlamaIndex can properly discover the workflow steps
  • opik.start_as_current_span() tracks the overall workflow execution
  • LLM calls within steps are automatically tracked via the global Opik handler
  • All workflow steps appear as nested spans within the workflow trace

If you’re certain the workflow is a top-level call and want to create only a trace without an additional span, you can use opik.start_as_current_trace() instead of opik.start_as_current_span(). However, start_as_current_span() is more flexible as it works in both standalone and nested contexts.

Best Practices

  1. Decorate all workflow steps with @opik.track() to capture each step as a span
  2. Decorator order matters: Place @step before @opik.track() so LlamaIndex’s workflow engine can properly discover and execute steps
  3. Use opik.start_as_current_span() to wrap workflow execution - it works in both standalone and nested contexts
  4. Configure the global handler to automatically track LLM calls within steps
  5. Use descriptive names for spans to make debugging easier
  6. Always call opik.flush_tracker() at the end to ensure all traces are sent
  7. Include input/output in span updates for better debugging

Token Usage in Streaming Responses

When using streaming chat responses with OpenAI models (e.g., llm.stream_chat()), you need to explicitly enable token usage tracking by configuring the stream_options parameter:

from llama_index.llms.openai import OpenAI
from llama_index.core.llms import ChatMessage
from llama_index.core import set_global_handler

# Configure Opik integration
set_global_handler("opik")

# Configure OpenAI LLM with stream_options to include usage information
llm = OpenAI(
    model="gpt-3.5-turbo",
    additional_kwargs={
        "stream_options": {"include_usage": True}
    },
)

messages = [
    ChatMessage(role="user", content="Tell me a short joke")
]

# Token usage will now be tracked in streaming responses
response = llm.stream_chat(messages)
for chunk in response:
    print(chunk.delta, end="", flush=True)

Without setting stream_options={'include_usage': True}, streaming responses from OpenAI models will not include token usage information in Opik traces. This is a requirement of OpenAI’s streaming API.

Cost Tracking

The Opik integration with LlamaIndex automatically tracks token usage and cost for all supported LLM models used within LlamaIndex applications.

Cost information is automatically captured and displayed in the Opik UI, including:

  • Token usage details
  • Cost per request based on model pricing
  • Total trace cost

View the complete list of supported models and providers on the Supported Models page.