Repository History

5 repositories tagged with Document Processing

Topic: Document Processing
Article-Assistant--RAG-Telegram-Bot: AI-Powered Knowledge Base via Telegram

The Article Assistant is a RAG (Retrieval-Augmented Generation) Telegram bot that builds interactive knowledge bases from documents. Users upload PDFs or provide URLs, and the bot returns AI-generated answers with source citations. This turns static content into a queryable resource.

Analyzed Apr 24, 2026
RAG-Anything: The All-in-One Multimodal RAG Framework

RAG-Anything is an all-in-one Retrieval-Augmented Generation (RAG) framework for processing and querying multimodal content. It handles text, images, tables, and equations within a single integrated system, removing the need for multiple specialized tools. It is built on LightRAG and provides multimodal retrieval for complex documents.

Analyzed Jan 31, 2026
docling-api: Scalable Document to Markdown Conversion Server

docling-api is a scalable backend server that converts document formats, including PDF, DOCX, and images, into Markdown. Built with FastAPI, Celery, and Redis, it supports both CPU and GPU processing, which suits large-scale workflows that need text, table, and image extraction along with OCR. The service exposes synchronous and asynchronous API endpoints for single and batch document conversions.

Analyzed Jan 30, 2026
gptpdf: Effortlessly Parse PDFs into Markdown with GPT-4o

gptpdf is a Python library that uses large visual models such as GPT-4o to parse PDF documents into clean Markdown. In just 293 lines of code, it preserves typography, math formulas, tables, and images. It is presented as an efficient, cost-effective way to convert complex PDFs.

Analyzed Oct 24, 2025
Ollama-OCR: Advanced OCR with Vision Language Models via Ollama

Ollama-OCR is a Python package and Streamlit application for Optical Character Recognition (OCR). It uses vision language models, accessed through Ollama, to extract text from both images and PDF documents. Features include support for multiple models, various output formats, and batch processing.

Analyzed Oct 12, 2025
OSRepos

Analysis and discovery of open source repositories. Find interesting projects and follow their updates.

OSRepos shares public repositories for knowledge and discovery only. Any installation, execution, configuration, or use of third-party repository code is at your own risk. Always review source code, dependencies, licenses, and security implications before running anything.

© 2025 OSRepos. Built with Nuxt 3 and lots of ❤️