LangWatch: The Platform for LLM Evaluations and AI Agent Testing

This repository profile is provided by osrepos.com, an open source repository discovery platform.

Summary

LangWatch is an open-source platform for end-to-end LLM evaluations and AI agent testing. It helps teams test, simulate, evaluate, and monitor LLM-powered agents both before release and in production. Built for regression testing, simulations, and production observability, LangWatch aims to remove the need for custom tooling.

Repository Information

Analyzed by OSRepos on April 28, 2026

Use at your own risk

OSRepos shares public repositories for knowledge and discovery only. Any installation, execution, configuration, or use of code from these repositories is the user's own responsibility. Always review the repository, source code, dependencies, licenses, and security implications before running or installing anything. OSRepos is not responsible for issues, damages, or losses resulting from third-party repositories.

Introduction

LangWatch is a platform for LLM evaluations and AI agent testing. Teams use it to test, simulate, evaluate, and monitor LLM-powered agents from development through production. It targets teams that need regression testing, detailed simulations, and production observability in one tool rather than a collection of separate custom tools.

Key features include end-to-end agent simulations, a combined loop for evaluation, observability, and prompt optimization, and an AI Gateway for governance and cost control. The platform is built on open standards like OpenTelemetry, ensuring flexibility and preventing vendor lock-in.
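To make the idea of an end-to-end agent simulation concrete, here is a minimal sketch in plain Python. It is not the LangWatch SDK: the `Scenario` type, `run_scenario`, `run_suite`, and `toy_agent` are hypothetical names invented for illustration. The sketch only shows the general shape of the technique, which is to replay scripted user turns against an agent and judge the replies.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical types for illustration; these are not LangWatch SDK classes.
Agent = Callable[[List[str]], str]  # takes the conversation so far, returns a reply


@dataclass
class Scenario:
    name: str
    user_turns: List[str]               # scripted user messages, in order
    check: Callable[[List[str]], bool]  # judges the list of agent replies


def run_scenario(agent: Agent, scenario: Scenario) -> bool:
    """Play the scripted turns against the agent and apply the check."""
    history: List[str] = []
    replies: List[str] = []
    for turn in scenario.user_turns:
        history.append(f"user: {turn}")
        reply = agent(history)
        history.append(f"agent: {reply}")
        replies.append(reply)
    return scenario.check(replies)


def run_suite(agent: Agent, scenarios: List[Scenario]) -> Dict[str, bool]:
    """Return a name -> passed mapping, usable as a regression gate."""
    return {s.name: run_scenario(agent, s) for s in scenarios}


def toy_agent(history: List[str]) -> str:
    """Stand-in agent; a real one would call an LLM here."""
    last = history[-1].lower()
    return "Your refund has been issued." if "refund" in last else "How can I help?"


scenarios = [
    Scenario("refund", ["Hi", "I want a refund"], lambda r: "refund" in r[-1].lower()),
    Scenario("greeting", ["Hello"], lambda r: r[0] == "How can I help?"),
]
results = run_suite(toy_agent, scenarios)
print(results)
```

Running the same suite before and after a prompt or model change turns it into a simple regression test: any scenario that flips from passing to failing points at a behavior change worth reviewing.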

Installation

Getting started with LangWatch is straightforward, with options for cloud, local, and self-hosted deployments.

Cloud

The easiest way to begin is by creating a free account on the LangWatch cloud platform:

  1. Create a free account.
  2. Create a project and obtain your API key.

Local Setup

Using npx (Node.js required):

npx @langwatch/server

This command installs the necessary components (uv, postgres, redis, clickhouse, AI gateway) into ~/.langwatch/, sets up environment variables, and starts all services. LangWatch will then be available at http://localhost:5560.

Using Docker Compose:

git clone https://github.com/langwatch/langwatch.git
cd langwatch
cp langwatch/.env.example langwatch/.env
docker compose up -d --wait --build

After running, LangWatch will be accessible at http://localhost:5560.
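Because the services can take a while to start, a script that depends on the local instance may want to wait until it answers. The following is a minimal sketch using only the Python standard library. The helper names `is_reachable` and `wait_until_reachable` are hypothetical, and the sketch assumes only that something responds over HTTP at the given URL; it does not rely on any LangWatch-specific health endpoint.

```python
import time
import urllib.error
import urllib.request


def is_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if the server answers at all, even with an HTTP error status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded (for example with 401 or 404), so it is up.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout.
        return False


def wait_until_reachable(url: str, attempts: int = 30, delay: float = 2.0) -> bool:
    """Poll the URL until it responds or the attempts run out."""
    for _ in range(attempts):
        if is_reachable(url):
            return True
        time.sleep(delay)
    return False
```

Calling `wait_until_reachable("http://localhost:5560")` after either setup command should return `True` once the instance responds, and `False` if it never comes up within the allotted attempts.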

Deployment Options

LangWatch also supports self-hosting on your own infrastructure.

Examples

To ship safer agents quickly, start with the project's getting-started guides after creating a free account.

LangWatch also offers extensive integrations with popular frameworks and model providers, including LangChain, LangGraph, OpenAI, Anthropic, and many more, thanks to its OpenTelemetry-based tracing platform.
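Since the tracing side is described as OpenTelemetry-based, an instrumented application can in principle be pointed at it with the standard OpenTelemetry exporter environment variables. The sketch below only builds those variables. The variable names come from the OpenTelemetry specification, but the endpoint URL, the bearer-token header, and the `otlp_env` helper itself are assumptions for illustration; check the LangWatch documentation for the actual OTLP endpoint and authentication scheme.

```python
from urllib.parse import quote


def otlp_env(endpoint: str, api_key: str, service_name: str) -> dict:
    """Build standard OpenTelemetry exporter environment variables.

    The endpoint and the Authorization header format are assumptions,
    not confirmed LangWatch settings.
    """
    # Header values are percent-encoded, so the space in "Bearer <key>" becomes %20.
    auth = quote(f"Bearer {api_key}", safe="")
    return {
        "OTEL_EXPORTER_OTLP_ENDPOINT": endpoint,
        "OTEL_EXPORTER_OTLP_HEADERS": f"Authorization={auth}",
        "OTEL_SERVICE_NAME": service_name,
    }


# Placeholder values; replace them with your real endpoint and project API key.
env = otlp_env("http://localhost:5560", "my-api-key", "support-agent")
for name, value in env.items():
    print(f"export {name}={value}")
```

Keeping the configuration in standard environment variables means the application code stays free of vendor-specific setup, which is the practical benefit of building on an open standard.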

Why Use LangWatch?

LangWatch provides visibility into agent behavior and the tools to systematically improve reliability, performance, and cost efficiency, while you keep control over your AI system. Its main strengths are:

  • End-to-end Agent Simulations: Run realistic scenarios against your full stack to pinpoint agent failures.
  • Integrated Workflow: Combine tracing, dataset creation, evaluation, and prompt optimization in one seamless loop.
  • Open Standards: Built on OpenTelemetry, ensuring no vendor lock-in and compatibility across frameworks and LLM providers.
  • AI Gateway: An OpenAI/Anthropic-compatible proxy offering virtual keys, hierarchical budgets, inline guardrails, and automatic fallback.
  • Enhanced Collaboration: Features like run reviews, failure annotations, and GitHub integration for prompt management streamline team collaboration and accelerate fixes.
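Two of the gateway ideas listed above, hierarchical budgets and automatic fallback, can be sketched conceptually in a few lines. This is not LangWatch's implementation or API: the `Budget` class, `call_with_fallback`, and the toy providers are hypothetical, and the sketch assumes a simple rule in which a request is charged only if it fits within the budget at every level of the hierarchy.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Budget:
    """One node in a budget tree, e.g. organization -> team -> virtual key."""
    name: str
    limit: float
    spent: float = 0.0
    parent: Optional["Budget"] = None

    def chain(self) -> List["Budget"]:
        """This budget followed by all of its ancestors."""
        out: List["Budget"] = []
        node: Optional["Budget"] = self
        while node is not None:
            out.append(node)
            node = node.parent
        return out

    def try_spend(self, cost: float) -> bool:
        """Charge every level, but only if the cost fits at every level."""
        nodes = self.chain()
        if any(n.spent + cost > n.limit for n in nodes):
            return False
        for n in nodes:
            n.spent += cost
        return True


def call_with_fallback(providers: List[Callable[[str], str]], prompt: str) -> str:
    """Try each provider in order and return the first successful reply."""
    errors = []
    for provider in providers:
        try:
            return provider(prompt)
        except Exception as exc:  # a real gateway would only retry on selected errors
            errors.append(exc)
    raise RuntimeError(f"all providers failed: {errors}")


def primary(prompt: str) -> str:
    raise TimeoutError("primary provider unavailable")


def secondary(prompt: str) -> str:
    return f"echo: {prompt}"


org = Budget("org", limit=10.0)
team = Budget("team", limit=4.0, parent=org)
key = Budget("virtual-key", limit=3.0, parent=team)

first = key.try_spend(2.5)   # fits at every level
second = key.try_spend(1.0)  # would exceed the key's own limit of 3.0
reply = call_with_fallback([primary, secondary], "hello")
print(first, second, reply)
```

Checking the whole chain before charging any level keeps the tree consistent: a rejected request leaves every budget untouched, so a limit hit on one virtual key does not distort the team or organization totals.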

Related repositories

Similar repositories that may be relevant next.

agentmemory: Persistent Memory for AI Coding Agents

May 27, 2026

agentmemory provides persistent memory for AI coding agents, ensuring they remember past interactions and project context across sessions. This eliminates the need for re-explaining, significantly boosting agent efficiency and reducing token costs. Built on the `iii engine`, it offers high retrieval accuracy and multi-agent support without external databases.

Tags: agentmemory, agents, ai

AI Website Cloner Template: Clone Websites with AI Coding Agents

May 26, 2026

The AI Website Cloner Template is an innovative open-source project that leverages AI coding agents to reverse-engineer any website into a clean, modern Next.js codebase. It enables users to clone entire websites with a single command, extracting design tokens, assets, and reconstructing sections in parallel. This tool is ideal for platform migration, recovering lost source code, or learning web development by deconstructing live sites.

Tags: ai, ai-agents, nextjs

claude-code-webui: A Web Interface for Claude CLI with Streaming Responses

April 30, 2026

claude-code-webui transforms the command-line Claude CLI experience into an intuitive web-based chat interface. It offers real-time streaming responses, visual project selection, and mobile-responsive design. This tool enhances productivity by providing a rich, visual environment for interacting with Claude Code locally.

Tags: claude, claude-cli, web-ui

Index: The SOTA Open-Source Browser Agent for Autonomous Web Tasks

April 27, 2026

Index is a cutting-edge open-source browser agent designed to autonomously execute complex tasks on the web. It transforms any website into an accessible API, enabling seamless integration and automation. Leveraging powerful reasoning LLMs with vision capabilities, Index simplifies web interactions and data extraction for developers.

Tags: ai, ai-agent, browser-agent
