Repository History
Explore all analyzed open-source repositories

GLM-OCR: Accurate, Fast, and Comprehensive Multimodal OCR Model
GLM-OCR is a multimodal OCR model for complex document understanding, built on the GLM-V encoder-decoder architecture. According to the project, it achieves state-of-the-art performance on several benchmarks while offering efficient inference and easy integration. This open-source model is optimized for real-world business scenarios that need robust, high-quality OCR.

OmniParse: Ingest, Parse, and Optimize Data for GenAI Frameworks
OmniParse is a platform that ingests and parses unstructured data, from documents to multimedia, and converts it into structured, actionable formats. The output is optimized for compatibility with GenAI frameworks, preparing data for applications such as RAG and fine-tuning. The tool aims to make data preparation for AI simpler and more efficient.

docling-api: Scalable Document to Markdown Conversion Server
docling-api is a scalable backend server that converts a wide range of document formats, including PDF, DOCX, and images, into Markdown. Built with FastAPI, Celery, and Redis, it supports both CPU and GPU processing, which suits large-scale workflows that need text, table, and image extraction along with OCR. The service exposes synchronous and asynchronous API endpoints for both single and batch document conversions.

Papermerge DMS: Open Source Document Management for Digital Archives
Papermerge DMS is an open-source document management system designed for scanned documents and digital archives. It uses OCR to extract and index text, enabling full-text search and efficient organization. Its modern web UI provides a desktop-like experience for managing various document formats.

Kreuzberg: A Polyglot Document Intelligence Framework with a Rust Core
Kreuzberg is a polyglot document intelligence framework built on a high-performance Rust core. It extracts text, metadata, and structured information from over 50 file formats, including PDFs, Office documents, and images. Developers can use Kreuzberg from several language ecosystems, including Rust, Python, Ruby, Go, and Node.js, or through a CLI, REST API, or MCP server.

Marker: High-Accuracy Document Conversion to Markdown and JSON
Marker is an open-source Python tool for high-accuracy conversion of documents such as PDFs, images, and office files into Markdown, JSON, and HTML. It preserves complex formatting, extracts images, and can optionally use LLMs to improve accuracy further, which makes it well suited to structured document intelligence.

Ollama-OCR: Advanced OCR with Vision Language Models via Ollama
Ollama-OCR is a Python package and Streamlit application for optical character recognition. It uses vision language models served through Ollama to extract text from images and PDF documents. Features include support for multiple models, several output formats, and batch processing.