{"name":"LangWatch: The Platform for LLM Evaluations and AI Agent Testing","description":"LangWatch is an open-source platform designed for end-to-end LLM evaluations and AI agent testing. It helps teams test, simulate, evaluate, and monitor LLM-powered agents both before release and in production. Built for robust regression testing, simulations, and production observability, LangWatch eliminates the need for custom tooling.","github":"https://github.com/langwatch/langwatch","url":"https://osrepos.com/repo/langwatch-langwatch","source":"osrepos.com","sourceDescription":"This repository profile is provided by osrepos.com, an open source repository discovery platform.","repositoryProfile":"https://osrepos.com/repo/langwatch-langwatch","generatedFor":"open source discovery and AI-assisted research","markdown":"https://osrepos.com/repo/langwatch-langwatch.md","json":"https://osrepos.com/repo/langwatch-langwatch.json","topics":["ai","analytics","datasets","dspy","evaluation","llm","llm-ops","observability"],"keywords":["ai","analytics","datasets","dspy","evaluation","llm","llm-ops","observability"],"stars":null,"summary":"LangWatch is an open-source platform designed for end-to-end LLM evaluations and AI agent testing. It helps teams test, simulate, evaluate, and monitor LLM-powered agents both before release and in production. Built for robust regression testing, simulations, and production observability, LangWatch eliminates the need for custom tooling.","content":"## Introduction\n\nLangWatch is a comprehensive platform for LLM evaluations and AI agent testing. It empowers teams to rigorously test, simulate, evaluate, and monitor their LLM-powered agents from development through to production. 
Designed for teams requiring robust regression testing, detailed simulations, and production observability, LangWatch offers a unified solution without the need for fragmented custom tools.\n\nKey features include end-to-end agent simulations; a combined loop for evaluation, observability, and prompt optimization; and an AI Gateway for governance and cost control. The platform is built on open standards like OpenTelemetry, ensuring flexibility and preventing vendor lock-in.\n\n## Installation\n\nGetting started with LangWatch is straightforward, with options for cloud, local, and self-hosted deployments.\n\n### Cloud\n\nThe easiest way to begin is by creating a free account on the LangWatch cloud platform:\n\n1.  [Create a free account](https://app.langwatch.ai \"Create a free account\")\n2.  Create a project and obtain your API key.\n\n### Local Setup\n\n**Using npx (Node.js required):**\n\n```bash\nnpx @langwatch/server\n```\n\nThis command installs the necessary components (uv, postgres, redis, clickhouse, AI gateway) into `~/.langwatch/`, sets up environment variables, and starts all services. 
LangWatch will be available at `http://localhost:5560`.\n\n**Using Docker Compose:**\n\n```bash\ngit clone https://github.com/langwatch/langwatch.git\ncd langwatch\ncp langwatch/.env.example langwatch/.env\ndocker compose up -d --wait --build\n```\n\nAfter running, LangWatch will be accessible at `http://localhost:5560`.\n\n### Deployment Options\n\nFor self-hosting on your own infrastructure, LangWatch supports:\n\n*   [Docker Compose](https://docs.langwatch.ai/self-hosting/open-source#docker-compose \"Docker Compose\")\n*   [Kubernetes (Helm)](https://docs.langwatch.ai/self-hosting/open-source#helm-chart-for-langwatch \"Kubernetes (Helm)\")\n*   [OnPrem](https://docs.langwatch.ai/self-hosting/onprem \"OnPrem\") for cloud-specific setups (AWS, Google Cloud, Azure).\n\n## Examples\n\nTo quickly ship safer agents, start with these guides after creating a free account:\n\n*   **[Run your first agent simulation](https://langwatch.ai/scenario/introduction/getting-started \"Run your first agent simulation\")**: Test agents against realistic scenarios before production.\n*   **[Set up evaluations](https://docs.langwatch.ai/llm-evaluation/offline-evaluation \"Set up evaluations\")**: Measure quality, performance, and reliability.\n*   **[Send your first traces](https://docs.langwatch.ai/integration/overview \"Send your first traces\")**: Integrate LangWatch with your existing stack.\n\nLangWatch also offers extensive integrations with popular frameworks and model providers, including LangChain, LangGraph, OpenAI, Anthropic, and many more, thanks to its OpenTelemetry-based tracing platform.\n\n## Why Use LangWatch?\n\nLangWatch provides full visibility into agent behavior and the necessary tools to systematically enhance reliability, performance, and cost efficiency, all while maintaining control over your AI system. 
Its unique value proposition includes:\n\n*   **End-to-end Agent Simulations**: Run realistic scenarios against your full stack to pinpoint agent failures.\n*   **Integrated Workflow**: Combine tracing, dataset creation, evaluation, and prompt optimization in one seamless loop.\n*   **Open Standards**: Built on OpenTelemetry, ensuring no vendor lock-in and compatibility across frameworks and LLM providers.\n*   **AI Gateway**: An OpenAI/Anthropic-compatible proxy offering virtual keys, hierarchical budgets, inline guardrails, and automatic fallback.\n*   **Enhanced Collaboration**: Features like run reviews, failure annotations, and GitHub integration for prompt management streamline team collaboration and accelerate fixes.\n\n## Links\n\n*   **Website**: [https://langwatch.ai](https://langwatch.ai \"LangWatch Website\")\n*   **Documentation**: [https://docs.langwatch.ai](https://docs.langwatch.ai \"LangWatch Documentation\")\n*   **GitHub Repository**: [https://github.com/langwatch/langwatch](https://github.com/langwatch/langwatch \"LangWatch GitHub Repository\")\n*   **Discord Community**: [https://discord.gg/kT4PhDS2gH](https://discord.gg/kT4PhDS2gH \"LangWatch Discord\")","metrics":{"detailViews":1,"githubClicks":2},"dates":{"published":null,"modified":"2026-04-28T00:25:22.000Z"}}