Gitingest: Transform GitHub Repositories into LLM-Friendly Code Extracts

Summary
Gitingest converts Git repositories into prompt-friendly text for Large Language Models (LLMs). It produces structured code extracts that make it simpler to feed codebase context into AI applications. You can generate digests from GitHub URLs or local directories.
Introduction
Gitingest extracts code context from Git repositories in a form that Large Language Models (LLMs) can consume directly. It transforms any Git repository, whether local or remote, into a clean, prompt-friendly text digest. You can also replace "hub" with "ingest" in any GitHub URL (so that github.com becomes gitingest.com) to open a structured extract of that codebase.
Key features include:
- Easy Code Context: Generate a text digest from a Git repository URL or a local directory.
- Smart Formatting: Output is optimized for LLM prompts, ensuring clarity and relevance.
- Comprehensive Statistics: Get insights into file and directory structure, extract size, and token count.
- Versatile Access: Available as a CLI tool, a Python package, and convenient browser extensions.
Installation
Gitingest requires Python 3.8 or newer. For private repositories, a GitHub Personal Access Token (PAT) is necessary.
You can install Gitingest via pip:
pip install gitingest
To include server dependencies for self-hosting, use the server extra (the quotes keep shells such as zsh from interpreting the square brackets):
pip install "gitingest[server]"
Alternatively, pipx is recommended for installing Python applications:
pipx install gitingest
Examples
Browser Extension Usage
The Gitingest browser extensions for Chrome, Firefox, and Edge provide access to code digests directly from GitHub. You can also replace "hub" with "ingest" in the URL of any GitHub repository page to view the prompt-friendly extract.
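The URL substitution can also be scripted. The following is a minimal sketch, not part of Gitingest itself: the helper name to_gitingest_url is hypothetical, and it assumes that swapping the github.com host for gitingest.com is all the rewrite requires.

```python
from urllib.parse import urlsplit, urlunsplit


def to_gitingest_url(github_url: str) -> str:
    """Swap the github.com host for gitingest.com, keeping path and query intact.

    Hypothetical helper for illustration; it only rewrites the URL string
    and does not contact either site.
    """
    parts = urlsplit(github_url)
    if parts.netloc.lower() not in ("github.com", "www.github.com"):
        raise ValueError(f"not a GitHub URL: {github_url!r}")
    return urlunsplit(parts._replace(netloc="gitingest.com"))


print(to_gitingest_url("https://github.com/coderamp-labs/gitingest"))
# prints: https://gitingest.com/coderamp-labs/gitingest
```

Parsing the URL instead of calling str.replace avoids touching an accidental "hub" elsewhere in the path.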
Command Line Usage
The gitingest CLI generates a digest from a local directory or a repository URL.
Basic usage, writing to digest.txt by default:
gitingest /path/to/directory
From a GitHub URL:
gitingest https://github.com/coderamp-labs/gitingest
From a specific subdirectory within a repository:
gitingest https://github.com/coderamp-labs/gitingest/tree/main/src/gitingest/utils
For private repositories, use the --token option or set the GITHUB_TOKEN environment variable:
gitingest https://github.com/username/private-repo --token github_pat_...
Include repository submodules or gitignored files:
gitingest https://github.com/username/repo-with-submodules --include-submodules
gitingest /path/to/directory --include-gitignored
Output to a specific file or STDOUT:
gitingest /path/to/directory --output my_digest.txt
gitingest /path/to/directory --output -
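When the CLI is driven from a script, it can help to assemble the argument list in one place. The sketch below uses only the flags shown above; the function build_gitingest_command is a hypothetical helper, and the example only prints the command without running it.

```python
import shlex
from typing import List, Optional


def build_gitingest_command(
    source: str,
    output: Optional[str] = None,
    token: Optional[str] = None,
    include_submodules: bool = False,
    include_gitignored: bool = False,
) -> List[str]:
    """Build an argv list for the gitingest CLI (hypothetical helper)."""
    cmd = ["gitingest", source]
    if output is not None:
        cmd += ["--output", output]  # "-" means STDOUT
    if token is not None:
        cmd += ["--token", token]
    if include_submodules:
        cmd.append("--include-submodules")
    if include_gitignored:
        cmd.append("--include-gitignored")
    return cmd


cmd = build_gitingest_command("/path/to/directory", output="-", include_gitignored=True)
print(shlex.join(cmd))
# prints: gitingest /path/to/directory --output - --include-gitignored
# To execute it, pass the list to subprocess.run(cmd, check=True).
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with paths that contain spaces. For private repositories, setting GITHUB_TOKEN in the environment keeps the token out of the argument list.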
Python Package Usage
Integrate Gitingest directly into your Python applications.
Synchronous usage:
from gitingest import ingest
summary, tree, content = ingest("path/to/directory")
# or from URL
summary, tree, content = ingest("https://github.com/coderamp-labs/gitingest")
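The three returned values can then be combined into a single prompt. This is a rough sketch that assumes summary, tree, and content are plain strings; build_prompt, its section labels, and the max_chars budget are illustrative choices, not part of the Gitingest API.

```python
def build_prompt(
    summary: str,
    tree: str,
    content: str,
    question: str,
    max_chars: int = 100_000,
) -> str:
    """Assemble an LLM prompt, truncating file contents to a character budget.

    Hypothetical helper: the labels and the budget are arbitrary and should
    be adapted to the model being used.
    """
    if len(content) > max_chars:
        content = content[:max_chars] + "\n[... truncated ...]"
    sections = [
        "Repository summary:\n" + summary.strip(),
        "Directory structure:\n" + tree.strip(),
        "File contents:\n" + content.strip(),
        "Question:\n" + question.strip(),
    ]
    return "\n\n".join(sections)


# Placeholder strings stand in for the values returned by ingest().
prompt = build_prompt("2 files", "src/\n  main.py", "print('hi')", "What does this code do?")
print(prompt)
```

A character budget is only a crude proxy for a token limit; the token count reported in the Gitingest statistics is a better guide when it is available.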
Asynchronous usage:
from gitingest import ingest_async
import asyncio
result = asyncio.run(ingest_async("path/to/directory"))
In Jupyter notebooks, you can use await directly:
from gitingest import ingest_async
summary, tree, content = await ingest_async("path/to/directory")
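One reason to prefer the asynchronous form is processing several repositories concurrently. The sketch below assumes ingest_async can be awaited for several sources at once; ingest_many is a hypothetical helper, and fake_ingest is a stand-in so the example runs without Gitingest installed. In real use, pass ingest_async as ingest_fn.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Tuple

Digest = Tuple[str, str, str]  # (summary, tree, content)


async def ingest_many(
    sources: List[str],
    ingest_fn: Callable[[str], Awaitable[Digest]],
    max_concurrency: int = 4,
) -> Dict[str, Digest]:
    """Ingest several sources concurrently, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def one(source: str) -> Digest:
        async with semaphore:
            return await ingest_fn(source)

    # gather returns results in the same order as the inputs.
    results = await asyncio.gather(*(one(s) for s in sources))
    return dict(zip(sources, results))


async def fake_ingest(source: str) -> Digest:
    """Stand-in for ingest_async, used only to make this sketch runnable."""
    await asyncio.sleep(0)
    return (f"summary of {source}", "tree", "content")


digests = asyncio.run(ingest_many(["repo-a", "repo-b"], fake_ingest))
print(sorted(digests))
# prints: ['repo-a', 'repo-b']
```

The semaphore caps the number of simultaneous clones or downloads, which matters when the sources are remote URLs.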
Why Use Gitingest?
Gitingest simplifies preparing a codebase for an LLM. A structured, prompt-friendly text digest reduces the effort of feeding relevant code context into a model, which helps with AI-assisted code generation, analysis, documentation, and debugging. Because it is available as a CLI, a Python library, and browser extensions, it fits into a range of developer workflows, whether you need a quick overview or detailed code insights.