Opik: Open-Source LLM Observability, Evaluation, and Optimization

Summary
Opik is an open-source platform by Comet designed to streamline the lifecycle of LLM applications. It provides comprehensive tools for debugging, evaluating, and monitoring RAG systems and agentic workflows. Developers can leverage its tracing, automated evaluations, and production-ready dashboards to build and optimize generative AI applications.
Introduction
Opik, by Comet, is a robust open-source platform designed to debug, evaluate, and monitor LLM applications, RAG systems, and agentic workflows. It offers comprehensive tracing, automated evaluations, and production-ready dashboards, streamlining the generative AI development lifecycle from prototype to production. With Opik, developers can optimize prompts and agents, ensure full observability of LLM calls, and implement safe, responsible AI practices.
Installation
Getting started with Opik is straightforward, with options for cloud deployment or self-hosting.
Option 1: Comet.com Cloud (Recommended)
The fastest way to get started: create a free account at comet.com and use the managed Opik instance, with no infrastructure to run yourself.
Option 2: Self-Host for Full Control
Deploy Opik in your own environment, choosing between Docker for local setups or Kubernetes for scalability.
Self-Hosting with Docker Compose (Local Development)
For a local Opik instance, use the installation scripts:
On Linux or Mac:
# Clone the Opik repository
git clone https://github.com/comet-ml/opik.git
# Navigate to the repository
cd opik
# Start the Opik platform
./opik.sh
On Windows:
# Clone the Opik repository
git clone https://github.com/comet-ml/opik.git
# Navigate to the repository
cd opik
# Start the Opik platform
powershell -ExecutionPolicy ByPass -c ".\opik.ps1"
For detailed instructions, refer to the Local Deployment Guide.
Self-Hosting with Kubernetes & Helm (Scalable Deployments)
For production or larger-scale self-hosted deployments, Opik can be installed on a Kubernetes cluster using its Helm chart.
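As a sketch, the installation follows the standard Helm workflow. The repository URL, chart name, and namespace below are illustrative assumptions; consult the Kubernetes deployment documentation for the authoritative values:

```shell
# Add the Opik Helm repository (URL is illustrative; verify against the official docs)
helm repo add opik https://comet-ml.github.io/opik/
helm repo update

# Install (or upgrade) Opik into its own namespace
helm upgrade --install opik opik/opik \
  --namespace opik --create-namespace
```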
Examples
Opik offers a comprehensive Python SDK and integrations to facilitate tracing and evaluation.
Python SDK Quick Start
Install the package and configure it:
# install using pip
pip install opik
# or install with uv
uv pip install opik
Configure the SDK:
opik configure
You can also configure programmatically, for example, opik.configure(use_local=True). Refer to the Python SDK documentation for more options.
Logging LLM Traces
The easiest way to log traces is to use one of Opik's many direct integrations, which cover frameworks and providers such as LangChain, LlamaIndex, OpenAI, AutoGen, and many others.
Alternatively, use the opik.track decorator:
import opik

opik.configure(use_local=True)  # run against a local Opik instance

@opik.track
def my_llm_function(user_question: str) -> str:
    # Your LLM call goes here
    return "Hello"
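The direct integrations follow a similar pattern: wrap the provider's client once, and every call through it is logged automatically. A minimal sketch using the OpenAI integration (the model name and prompt are illustrative, and running it requires the openai package plus a configured API key):

```python
from openai import OpenAI
from opik.integrations.openai import track_openai

# Wrap the client once; subsequent calls are traced to Opik
client = track_openai(OpenAI())

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[{"role": "user", "content": "What is Opik?"}],
)
print(response.choices[0].message.content)
```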
LLM as a Judge Metrics
Opik's Python SDK includes several LLM as a judge metrics to help evaluate your LLM application, such as hallucination detection.
from opik.evaluation.metrics import Hallucination

metric = Hallucination()
score = metric.score(
    input="What is the capital of France?",
    output="Paris",
    context=["France is a country in Europe."],
)
print(score)
Explore more metrics in the metrics documentation.
Why Use Opik?
Opik is an essential tool for any generative AI developer, offering:
- Comprehensive Observability: Deep tracing of LLM calls, conversation logging, and agent activity.
- Advanced Evaluation: Robust prompt evaluation, LLM-as-a-judge metrics, and experiment management.
- Production-Ready: Scalable monitoring dashboards and online evaluation rules to identify production issues, supporting over 40 million traces per day.
- Optimization and Safety: Tools like Opik Agent Optimizer and Opik Guardrails to continuously improve and secure your LLM applications.
- Flexible Integration: Support for a wide range of frameworks and integration with CI/CD pipelines via PyTest.