Repository History

Explore all analyzed open source repositories

Topic: Multimodal AI
Kimi-k1.5: Scaling Reinforcement Learning with LLMs and Multimodality

Kimi-k1.5 introduces an o1-level multimodal model that significantly advances reinforcement learning with Large Language Models. It demonstrates state-of-the-art performance on short-CoT tasks, outperforming leading models such as GPT-4o and Claude 3.5 Sonnet, and matches o1 performance in long-CoT scenarios across modalities. The project's key innovations are long-context scaling and improved policy optimization.

Apr 17, 2026
fast-agent: Build and Orchestrate Multimodal AI Agents and Workflows

fast-agent is a Python framework for creating and interacting with sophisticated multimodal AI agents and workflows. It offers a simple, declarative syntax for defining agents, broad model support, and end-to-end tested MCP (Model Context Protocol) integration. Developers can rapidly build, test, and deploy complex agent applications with capabilities such as structured outputs, vision, and a range of orchestration patterns.
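The "declarative syntax for defining agents" mentioned above can be illustrated with a minimal sketch of the pattern: a decorator registers a function as a named agent with an instruction. Note that the `agent` decorator, `Agent` class, and `REGISTRY` below are hypothetical illustrations of the style, not fast-agent's actual API.

```python
# Hypothetical sketch of a declarative agent-definition pattern.
# Not fast-agent's real API -- just the registration idea it describes.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Agent:
    name: str
    instruction: str
    handler: Callable[[str], str]

REGISTRY: dict[str, Agent] = {}

def agent(name: str, instruction: str):
    """Register a function as a named agent (declarative style)."""
    def wrap(fn: Callable[[str], str]) -> Callable[[str], str]:
        REGISTRY[name] = Agent(name, instruction, fn)
        return fn
    return wrap

@agent("summarizer", instruction="Summarize the input in one sentence.")
def summarize(text: str) -> str:
    # Stand-in for a model call: return the first sentence as a "summary".
    return text.split(".")[0] + "."

print(REGISTRY["summarizer"].handler("Agents compose. Workflows orchestrate."))
# → Agents compose.
```

In a real framework the handler would call a model, and an orchestrator would route messages between registered agents; the decorator keeps each agent's definition colocated with its behavior.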

Jan 9, 2026
Attachments: The Python Funnel for LLM Context and Multimodal Data

Attachments simplifies providing context to Large Language Models by transforming various file types into model-ready text and images. This Python library acts as a universal funnel, enabling developers to integrate diverse data sources like PDFs, images, web content, and even entire code repositories with just a few lines of code. It supports popular LLM APIs and frameworks, making multimodal AI development more accessible.
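The "universal funnel" idea above amounts to dispatching on file type to a converter that yields model-ready text or images. Here is a minimal sketch of that pattern, assuming a suffix-based converter registry; the `converter` decorator and `funnel` function are hypothetical illustrations, not the Attachments library's actual API.

```python
# Hypothetical sketch of a file-to-context "funnel": map a file suffix
# to a converter producing model-ready text. Not the Attachments API.
from pathlib import Path
from typing import Callable

CONVERTERS: dict[str, Callable[[Path], str]] = {}

def converter(*suffixes: str):
    """Register a converter for one or more file suffixes."""
    def wrap(fn: Callable[[Path], str]) -> Callable[[Path], str]:
        for s in suffixes:
            CONVERTERS[s] = fn
        return fn
    return wrap

@converter(".txt", ".md")
def plain_text(path: Path) -> str:
    return path.read_text(encoding="utf-8")

@converter(".png", ".jpg")
def image_stub(path: Path) -> str:
    # Real code would base64-encode the bytes for a vision-capable model.
    return f"[image: {path.name}]"

def funnel(path: str) -> str:
    """Turn any supported file into model-ready text."""
    p = Path(path)
    fn = CONVERTERS.get(p.suffix.lower())
    if fn is None:
        raise ValueError(f"no converter for {p.suffix!r}")
    return fn(p)
```

A real library would add converters for PDFs, web pages, and repositories, and return images alongside text; the single entry point is what keeps the developer-facing surface to "a few lines of code."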

Nov 24, 2025