KAG: Knowledge Augmented Generation for LLM Reasoning in Professional Domains

Introduction

KAG (Knowledge Augmented Generation) is a framework from OpenSPG that adds logical reasoning and retrieval capabilities to Large Language Models (LLMs). Built on the OpenSPG engine, it is designed for building Q&A solutions over professional domain knowledge bases. It addresses two shortcomings of earlier approaches: the ambiguity of vector similarity calculations in traditional RAG (Retrieval Augmented Generation), and the noise that OpenIE introduces in GraphRAG.

The primary goal of KAG is to establish a knowledge-enhanced LLM service framework for professional domains, supporting complex logical reasoning and factual multi-hop Q&A. KAG deeply integrates the logical and factual characteristics of Knowledge Graphs (KGs) through several core features:

Knowledge and Chunk Mutual Indexing structure: Integrates more complete contextual text information.
Knowledge alignment using conceptual semantic reasoning: Alleviates noise problems caused by OpenIE.
Schema-constrained knowledge construction: Supports the representation and construction of domain expert knowledge.
Logical form-guided hybrid reasoning and retrieval: Enables logical reasoning and multi-hop reasoning Q&A.
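
To make the mutual-indexing idea more concrete, here is a minimal Python sketch. It is not KAG's implementation or API: the `MutualIndex` class, its method names, and the sample sentences are hypothetical. It only illustrates keeping links in both directions between graph entities and the text chunks they were extracted from, so that a graph hit can be expanded back into its original context.

```python
from collections import defaultdict


class MutualIndex:
    """Toy two-way index between entities and text chunks (illustrative only)."""

    def __init__(self):
        self.chunks = {}                           # chunk_id -> chunk text
        self.entity_to_chunks = defaultdict(set)   # entity -> ids of chunks mentioning it
        self.chunk_to_entities = defaultdict(set)  # chunk_id -> entities found in it

    def add_chunk(self, chunk_id, text, entities):
        """Store a chunk and record links in both directions."""
        self.chunks[chunk_id] = text
        for entity in entities:
            self.entity_to_chunks[entity].add(chunk_id)
            self.chunk_to_entities[chunk_id].add(entity)

    def context_for(self, entity):
        """Return the original text of every chunk linked to an entity."""
        chunk_ids = sorted(self.entity_to_chunks.get(entity, ()))
        return [self.chunks[cid] for cid in chunk_ids]

    def related_entities(self, entity):
        """Entities that co-occur with `entity` in at least one chunk."""
        related = set()
        for cid in self.entity_to_chunks.get(entity, ()):
            related |= self.chunk_to_entities[cid]
        related.discard(entity)
        return related


# Made-up example data
index = MutualIndex()
index.add_chunk("c1", "Aspirin inhibits the COX-1 enzyme.", {"Aspirin", "COX-1"})
index.add_chunk("c2", "Aspirin may cause stomach irritation.", {"Aspirin", "stomach irritation"})
```

With this structure, `index.context_for("COX-1")` returns the full sentence the entity came from, and `index.related_entities("Aspirin")` collects the entities that share a chunk with it.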

For the technical details, see the associated research paper: KAG: Boosting LLMs in Professional Domains via Knowledge Augmented Generation.

Installation

KAG offers both product-based and toolkit-based installation methods to cater to different user needs.

Product-based (for ordinary users)

This method is ideal for users who want to quickly deploy and use the KAG product.

Recommended System Version:

macOS users: macOS Monterey 12.6 or later
Linux users: CentOS 7 / Ubuntu 20.04 or later
Windows users: Windows 10 LTSC 2021 or later

Software Requirements:

macOS / Linux users: Docker, Docker Compose
Windows users: WSL 2 / Hyper-V, Docker, Docker Compose

Use the following commands to download the docker-compose-west.yml file and launch the services with Docker Compose:

# Windows users only: set the HOME environment variable first (uncomment the next line)
# set HOME=%USERPROFILE%

curl -sSL https://raw.githubusercontent.com/OpenSPG/openspg/refs/heads/master/dev/release/docker-compose-west.yml -o docker-compose-west.yml
docker compose -f docker-compose-west.yml up -d

Once the services are running, open the KAG product in your browser at its default URL: http://127.0.0.1:8887.

Default username: openspg
Default password: openspg@kag
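
Before opening the browser, it can help to confirm that something is actually listening at the default URL. The following is a minimal sketch using only the Python standard library. The helper name `is_reachable` is hypothetical and not part of KAG; it only checks that an HTTP server answers, not that KAG itself is healthy.

```python
import urllib.error
import urllib.request


def is_reachable(url, timeout=3.0):
    """Return True if an HTTP server answers at `url`, False if the connection fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered with an error status (e.g. 404), so it is up.
        return True
    except OSError:
        # Covers URLError, refused connections, and timeouts.
        return False


if __name__ == "__main__":
    url = "http://127.0.0.1:8887"  # default URL given in this guide
    print(f"{url} reachable: {is_reachable(url)}")
```

If the check reports that the URL is not reachable shortly after `docker compose ... up -d`, the containers may still be starting; retrying after a short wait is a reasonable first step.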

For detailed instructions, refer to the KAG usage (product mode) guide.

Toolkit-based (for developers)

This method is suitable for developers who wish to integrate KAG components into their projects.

Engine & Dependent Image Installation:

Refer to the product-based installation section above to complete the installation of the engine and dependent images.

Installation of KAG:

macOS / Linux developers:

# Create and activate a conda environment
conda create -n kag-demo python=3.10 && conda activate kag-demo

# Clone the code
git clone https://github.com/OpenSPG/KAG.git

# Install KAG in editable mode
cd KAG && pip install -e .

Windows developers:

# Prerequisites: install the official Python 3.10 or later, and Git.

# Create and activate a Python venv
py -m venv kag-demo && kag-demo\Scripts\activate

# Clone the code
git clone https://github.com/OpenSPG/KAG.git

# Install KAG in editable mode
cd KAG && pip install -e .

Examples

Once KAG is installed, developers can use the toolkit to reproduce the performance results on the built-in datasets and apply its components to new business scenarios. In product mode, users interact with the framework directly through the browser-based interface.

For comprehensive guidance on using the toolkit and exploring its capabilities, please refer to the KAG usage (developer mode) guide.

Why Use KAG?

KAG targets knowledge-intensive LLM applications, particularly in professional domains. Its key advantages are:

Overcoming RAG Limitations: KAG addresses the ambiguity of vector similarity calculations in traditional RAG, aiming for more precise and relevant retrieval.
Mitigating GraphRAG Noise: It alleviates the noise that OpenIE introduces in GraphRAG, which improves the quality of the integrated knowledge.
Advanced Logical Reasoning: The framework supports complex logical reasoning and multi-hop factual Q&A, so LLMs can answer questions that require chaining multiple pieces of information.
Schema-Constrained Knowledge: KAG supports building domain expert knowledge under schema constraints, which helps keep it accurate and consistent.
Unified Knowledge Representation: It provides a mutual index between graph structures and the original text chunks, which supports both retrieval and reasoning.
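
As a rough illustration of what multi-hop factual Q&A means, the sketch below chains two lookups over a tiny in-memory triple store. It is not KAG's reasoning engine: the triples, the `follow` and `answer` helpers, and the list-of-predicates stand-in for a logical form are all hypothetical simplifications.

```python
# Toy knowledge graph as (subject, predicate, object) triples; the data is made up.
TRIPLES = [
    ("DrugA", "treats", "Hypertension"),
    ("DrugA", "manufactured_by", "PharmaCo"),
    ("PharmaCo", "headquartered_in", "Zurich"),
]


def follow(entities, predicate, triples=TRIPLES):
    """One hop: all objects reachable from any of `entities` via `predicate`."""
    return {o for s, p, o in triples if s in entities and p == predicate}


def answer(start, relation_path, triples=TRIPLES):
    """Execute a multi-hop query given as an ordered list of predicates."""
    frontier = {start}
    for predicate in relation_path:
        frontier = follow(frontier, predicate, triples)
        if not frontier:  # the chain broke: no fact supports this hop
            break
    return frontier


# "Where is the manufacturer of DrugA headquartered?" needs two hops.
result = answer("DrugA", ["manufactured_by", "headquartered_in"])
print(result)  # {'Zurich'}
```

A single similarity search over text would have to find one passage that mentions both the drug and the city. Chaining explicit relations lets each hop rest on a separate fact.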
