kotaemon: An Open-Source RAG Tool for Document Chat

This repository profile is provided by osrepos.com, an open source repository discovery platform.

Summary

kotaemon is an open-source tool for chatting with your documents using retrieval-augmented generation (RAG). It provides a clean, customizable UI for end-users who want document Q&A and for developers building RAG pipelines.

Repository Information

Analyzed by OSRepos on November 22, 2025

Use at your own risk

OSRepos shares public repositories for knowledge and discovery only. Any installation, execution, configuration, or use of code from these repositories is the user's own responsibility. Always review the repository, source code, dependencies, licenses, and security implications before running or installing anything. OSRepos is not responsible for issues, damages, or losses resulting from third-party repositories.

Introduction

kotaemon serves two audiences: end-users who want to run question answering (QA) over their own documents, and developers who want to build their own RAG pipelines. The project aims to provide a functional RAG UI that supports various LLMs and installs easily, together with a framework for RAG pipeline development.

For end-users, kotaemon provides a clean, minimalistic UI for RAG-based QA and easy installation. It works with LLM API providers (OpenAI, AzureOpenAI, Cohere, etc.) and with local LLMs (via ollama and llama-cpp-python). For developers, it provides a framework for building RAG pipelines, a customizable UI built with Gradio, and a dedicated Gradio theme.

Key features include hosting your own document QA web UI with multi-user login and file organization, support for various LLM and embedding models, a hybrid RAG pipeline that combines full-text and vector retrieval with re-ranking, multi-modal QA support for documents with figures and tables, and advanced citations with in-browser document preview. The system also supports complex reasoning methods such as question decomposition and agent-based reasoning (ReAct, ReWOO), and offers a configurable settings UI.

Installation

To get started with kotaemon, ensure you meet the system requirements: Python >= 3.10. Docker is optional but recommended for an easier setup. For processing additional file types beyond .pdf, .html, .mhtml, and .xlsx, you may need to install Unstructured.
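
The Python requirement above can be checked before installing anything. The following is a minimal sketch, not part of kotaemon itself; the helper name meets_requirement is hypothetical, and the only fact taken from the text is the minimum version 3.10.

```python
import sys

# Minimum version taken from the system requirements stated above.
MIN_PYTHON = (3, 10)


def meets_requirement(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True when the (major, minor) version is at least `minimum`.

    Hypothetical helper for illustration; kotaemon does not ship this function.
    """
    return tuple(version_info[:2]) >= minimum


# Report on the interpreter that is running this script.
if meets_requirement():
    print("Python version OK")
else:
    found = ".".join(str(part) for part in sys.version_info[:3])
    print(f"Python >= 3.10 required, found {found}")
```

Running this with the interpreter you plan to use for the install shows early whether the non-Docker route is viable on your machine.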

With Docker (Recommended)

kotaemon provides lite, full, and ollama Docker images. The full version includes unstructured for broader file type support, while ollama bundles Ollama for local RAG.

docker run \
    -e GRADIO_SERVER_NAME=0.0.0.0 \
    -e GRADIO_SERVER_PORT=7860 \
    -v ./ktem_app_data:/app/ktem_app_data \
    -p 7860:7860 -it --rm \
    ghcr.io/cinnamon/kotaemon:main-lite

Access the WebUI at http://localhost:7860/. If needed, specify the platform by adding a flag such as --platform linux/arm64 to the docker run command.

Without Docker

  1. Clone the repository and install required packages:
git clone https://github.com/Cinnamon/kotaemon
cd kotaemon
pip install -e "libs/kotaemon[all]"
pip install -e "libs/ktem"
  2. Create a .env file based on .env.example in the project root for initial model configuration.
  3. (Optional) For the in-browser PDF viewer, download and extract PDF_JS_DIST to libs/ktem/ktem/assets/prebuilt.
  4. Start the web server:
python app.py

The app will launch in your browser. The default username and password are both admin.
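
The .env step above amounts to copying .env.example and then editing the copy. As a rough sketch, assuming both files sit in the project root as described, the copy could be scripted like this; the function name ensure_env_file is hypothetical and is not provided by kotaemon.

```python
import shutil
from pathlib import Path


def ensure_env_file(project_root="."):
    """Copy .env.example to .env unless a .env already exists.

    Returns True when a new .env was created and False when one was
    already present. Hypothetical helper, not part of kotaemon.
    """
    root = Path(project_root)
    example = root / ".env.example"
    target = root / ".env"
    if target.exists():
        return False  # never overwrite settings the user already edited
    if not example.exists():
        raise FileNotFoundError(f"{example} not found; check the project root")
    shutil.copyfile(example, target)
    return True
```

Calling ensure_env_file("path/to/kotaemon") once before python app.py gives you a starting .env; the model settings inside it still need to be filled in by hand.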

Examples

kotaemon offers several ways to experience its capabilities:

  • Live Demos: Explore interactive versions on Hugging Face Spaces.

  • Local RAG with Colab: Try a local RAG setup using the provided Colab Notebook.

  • Visual Previews: The repository includes preview images showcasing the user interface, such as the chat tab and advanced citation features with in-browser PDF viewer.

Why Use kotaemon?

Several characteristics make kotaemon a flexible option for document Q&A:

  • Open-Source & Community-Driven: Being open-source, it fosters transparency, community contributions, and continuous improvement.

  • Comprehensive RAG Capabilities: It provides a robust, hybrid RAG pipeline with full-text and vector retrieval, re-ranking, and support for complex reasoning methods like question decomposition and agent-based approaches (ReAct, ReWOO).

  • Versatile LLM Support: Seamlessly integrate with popular LLM API providers (OpenAI, AzureOpenAI, Cohere) or leverage local models via Ollama and llama-cpp-python for private RAG solutions.

  • User-Friendly & Customizable UI: The clean and minimalistic Gradio-based UI is intuitive for end-users, while its extensibility allows developers to customize or add new UI elements and integrate custom RAG pipelines.

  • Advanced Features: Benefit from multi-modal QA support (figures, tables), detailed citations with in-browser PDF preview and relevance scoring, and configurable settings to fine-tune retrieval and generation processes.

  • Easy Deployment: With Docker support and clear installation guides, setting up your own document QA web-UI is straightforward.
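
To make the hybrid retrieval idea from the list above more concrete, here is a small illustrative sketch of one common way to merge a full-text result list with a vector-search result list: reciprocal rank fusion (RRF). This is an assumption chosen for illustration; the text does not say how kotaemon combines its retrievers, and the document ids and the function name below are hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists that contain
    it, with rank starting at 1. k=60 is a commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from the two retrievers (best match first).
full_text_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([full_text_hits, vector_hits])
print(fused)  # ['doc_a', 'doc_c', 'doc_b', 'doc_d']
```

Documents that both retrievers rank highly (doc_a, doc_c) rise to the top, while documents found by only one retriever follow. In a full pipeline, a re-ranking model would typically re-score these fused candidates before they are passed to the LLM.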

Related repositories

Similar repositories that may be relevant next.

Hexabot: Open-Source AI Chatbot and Agent Builder

March 19, 2026

Hexabot is an open-source AI chatbot and agent builder designed for creating and managing multi-channel and multilingual conversational agents with ease. It offers extensive customization, powerful text-to-action capabilities, and supports integration with various LLM models, making it a flexible solution for developers. This project simplifies the deployment and management of sophisticated AI-powered interactions across different platforms.

Topics: ai, chatbot, agent

Gurubase: AI-Powered Q&A Assistant Issue Tracker

January 17, 2026

Gurubase is a platform designed to transform your content into a 24/7 AI support assistant. This GitHub repository serves as the official hub for managing issues, feature requests, and bug reports related to the Gurubase product. It provides a centralized place for the community to contribute to the product's improvement.

Topics: ai, chatbot, rag

AstrBot: Agentic IM ChatBot Infrastructure for Multi-Platform AI

December 31, 2025

AstrBot is an open-source, agentic IM chatbot infrastructure designed for seamless integration across multiple messaging platforms. It offers a powerful and user-friendly plugin system, supporting a wide range of advanced AI models and LLM platforms. This makes it an ideal solution for building reliable and scalable conversational AI applications, from personal AI companions to enterprise knowledge bases.

Topics: agent, ai, chatbot

Typebot.io: A Powerful Self-Hostable Chatbot Builder

December 6, 2025

Typebot.io is an open-source, powerful chatbot builder that allows users to create advanced conversational bots visually. It supports embedding chatbots into web/mobile apps and collecting results in real-time, offering a flexible solution for various business needs.

Topics: chatbot, chat-application, form-builder
