Repository History

Explore all analyzed open source repositories

Topic: Machine Learning
FlashVideo: Efficient High-Resolution Video Generation with Flowing Fidelity

FlashVideo is a GitHub repository that introduces an approach to efficient high-resolution video generation. It uses a two-stage diffusion model to produce detailed videos, scaling from 270p to 1080p. The project focuses on preserving fine detail while significantly improving the efficiency of video generation.

Nov 5, 2025
Weave by Weights & Biases: A Toolkit for AI-Powered Applications

Weave is an open-source toolkit developed by Weights & Biases designed for building and managing AI-powered applications. It provides robust features for logging, debugging, and evaluating language model inputs and outputs, streamlining the development workflow for generative AI. Weave aims to bring rigor and best practices to the experimental process of AI software development.

Nov 3, 2025
Lance: Modern Columnar Data Format for ML and LLMs

Lance is a modern columnar data format, implemented in Rust, designed for machine learning and large language model workflows. It offers significant performance improvements over Parquet for random access, includes vector indexing, and supports data versioning. Compatible with popular tools like Pandas, DuckDB, and PyTorch, Lance streamlines data management for ML applications.

Nov 1, 2025
Gradio: Build and Share Machine Learning Apps in Python

Gradio is an open-source Python library that simplifies the creation and sharing of interactive web applications for machine learning models, APIs, or any Python function. It allows developers to quickly build user interfaces without needing JavaScript, CSS, or web hosting expertise, offering a straightforward way to demo AI projects. With Gradio, you can transform your Python functions into shareable web demos in just a few lines of code.

Oct 31, 2025
Step-Video-T2V: State-of-the-Art Text-to-Video Generation Model

Step-Video-T2V is a state-of-the-art pre-trained text-to-video model with 30 billion parameters, capable of generating videos of up to 204 frames. It achieves high efficiency through a deep-compression Video-VAE and enhances visual quality using Direct Preference Optimization (DPO). The model is evaluated on a new benchmark, Step-Video-T2V-Eval, where it demonstrates superior text-to-video quality.

Oct 29, 2025
Plexe: Build Machine Learning Models from Natural Language Prompts

Plexe is a Python library for building machine learning models from natural language descriptions. It automates the model creation process, from intent to deployment, through a multi-agent architecture. This supports rapid development and experimentation and makes ML more accessible.

Oct 14, 2025
Fluxgym: Simple FLUX LoRA Training UI with Low VRAM Support

Fluxgym offers a user-friendly web interface for training FLUX LoRA models, designed to support GPUs with low VRAM (12 GB, 16 GB, and 20 GB). It combines the simplicity of a Gradio UI, forked from AI-Toolkit, with the powerful and flexible training capabilities of Kohya sd-scripts. Users can train custom LoRAs and use advanced features such as automatic sample image generation and direct publishing to Hugging Face.

Oct 12, 2025
LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control

This repository is the official PyTorch implementation of LivePortrait, an efficient portrait animation method that brings still images and videos to life with stitching and retargeting control. It supports both human and animal subjects, with features such as image-driven mode, regional control, and precise editing. Widely adopted by major video platforms, LivePortrait provides a robust solution for generating animated portraits.

Oct 12, 2025
Ollama-OCR: Advanced OCR with Vision Language Models via Ollama

Ollama-OCR is a robust Python package and Streamlit application for Optical Character Recognition. It leverages state-of-the-art vision language models, accessible through Ollama, to accurately extract text from both images and PDF documents. The tool offers extensive features including support for multiple models, various output formats, and batch processing capabilities.

Oct 12, 2025
AI Samples for .NET: Integrating AI into Your .NET Applications

The AI Samples for .NET repository provides a comprehensive collection of samples demonstrating how to integrate artificial intelligence into .NET applications. It features examples using Microsoft.Extensions.AI for unified API access to AI services and Microsoft.Extensions.AI.Evaluation for assessing LLM response quality. This resource is ideal for .NET developers looking to leverage AI, including large language models, in their projects.

Oct 11, 2025
chatterbox-vllm: Accelerating Chatterbox TTS with vLLM for Enhanced Performance

chatterbox-vllm is a high-performance port of the Chatterbox Text-to-Speech (TTS) model to vLLM, designed to significantly improve generation speed and GPU memory efficiency. This personal project aims to provide a more efficient and more easily integrated solution for speech synthesis, offering substantial speedups over the original implementation. It is currently usable and demonstrates benchmark-topping throughput, but it relies on internal vLLM APIs and hacky workarounds, and further refactoring is planned.

Oct 11, 2025