4 repositories tagged with Multimodal

Topic: Multimodal
PixelRAG: Pixel-Native Search for Visual Retrieval-Augmented Generation

PixelRAG performs pixel-native retrieval instead of relying on traditional text parsing. It renders documents as screenshots, preserving visual context such as tables and charts that reader models need to answer accurately. Documents can therefore be searched by their visual appearance, not just their textual content.

Analyzed Jun 22, 2026
Qwen3-VL: A Powerful Multimodal Large Language Model Series

Qwen3-VL is a multimodal large language model series from Alibaba Cloud's Qwen team. It targets improved visual and text understanding, extended context length, and enhanced agent capabilities. The series is designed for flexible deployment, scaling from edge to cloud.

Analyzed Jun 15, 2026
VideoSDK AI Agents: Build Real-time Multimodal Conversational AI

VideoSDK AI Agents is an open-source Python framework for building real-time, multimodal conversational AI agents. It enables natural voice and media interactions between users and agents within VideoSDK rooms, and it supports integration with various AI models and tools.

Analyzed Dec 11, 2025
Podcastfy: Transform Multimodal Content into AI-Generated Multilingual Podcasts

Podcastfy is an open-source Python package that transforms multimodal content, such as text, images, and videos, into multilingual audio conversations. Using generative AI, it offers a programmatic alternative to tools like NotebookLM, with a focus on customization and scalability. It is aimed at content creators, educators, and researchers who want to broaden their audience reach and improve content accessibility.

Analyzed Nov 9, 2025
OSRepos

Analysis and discovery of open source repositories. Find interesting projects and follow their updates.

OSRepos shares public repositories for knowledge and discovery only. Any installation, execution, configuration, or use of third-party repository code is at your own risk. Always review source code, dependencies, licenses, and security implications before running anything.