Repository History

Explore all analyzed open source repositories

Topic: OCR
GLM-OCR: Accurate, Fast, and Comprehensive Multimodal OCR Model

GLM-OCR is a multimodal OCR model for complex document understanding, built on the GLM-V encoder-decoder architecture. It is described as achieving state-of-the-art performance across various benchmarks while offering efficient inference and easy integration. The open-source model is optimized for real-world business scenarios.

May 28, 2026
OmniParse: Ingest, Parse, and Optimize Data for GenAI Frameworks

OmniParse is a platform that ingests, parses, and optimizes unstructured data, from documents to multimedia, into structured, actionable formats. It improves compatibility with GenAI frameworks by preparing data for applications such as RAG and fine-tuning.

Apr 7, 2026
docling-api: Scalable Document to Markdown Conversion Server

docling-api is a scalable backend server that converts a wide range of document formats, including PDF, DOCX, and images, into Markdown. Built with FastAPI, Celery, and Redis, it supports both CPU and GPU processing and suits large-scale workflows that need text, table, and image extraction along with OCR. It exposes synchronous and asynchronous API endpoints for single and batch document conversions.

Jan 30, 2026
Papermerge DMS: Open Source Document Management for Digital Archives

Papermerge DMS is an open-source document management system designed for scanned documents and digital archives. It uses OCR to extract and index text, which enables full-text search and helps keep documents organized. Its modern web UI provides a desktop-like experience for managing various document formats.

Jan 8, 2026
Kreuzberg: A Polyglot Document Intelligence Framework with a Rust Core

Kreuzberg is a polyglot document intelligence framework built on a high-performance Rust core. It extracts text, metadata, and structured information from over 50 file formats, including PDFs, Office documents, and images. It can be used from languages such as Rust, Python, Ruby, Go, and Node.js, or through a CLI, REST API, or MCP server.

Dec 30, 2025
Marker: High-Accuracy Document Conversion to Markdown and JSON

Marker is an open-source Python tool for high-accuracy conversion of documents such as PDFs, images, and office files into Markdown, JSON, and HTML. It preserves complex formatting, extracts images, and can use LLMs to improve precision further.

Nov 9, 2025
Ollama-OCR: Advanced OCR with Vision Language Models via Ollama

Ollama-OCR is a Python package and Streamlit application for optical character recognition. It uses vision language models, accessed through Ollama, to extract text from images and PDF documents. Features include support for multiple models, several output formats, and batch processing.

Oct 12, 2025