5 repositories tagged with Diffusion Models

Topic: Diffusion Models
Transformer Lab App: An Open Source Platform for Frontier AI/ML Workflows

Transformer Lab App is an open-source machine learning research platform for frontier AI/ML workflows. It provides a toolkit for large language models that lets users train, fine-tune, and chat with models on their own hardware, whether local, on-premises, or in the cloud. Backed by Mozilla, the cross-platform application simplifies experimentation with a wide range of models.

Analyzed Dec 31, 2025
HunyuanVideo-Avatar: High-Fidelity Audio-Driven Human Animation

HunyuanVideo-Avatar is a Tencent-Hunyuan project for high-fidelity, audio-driven human animation. It uses a multimodal diffusion transformer to generate dynamic, emotion-controllable, multi-character dialogue videos. The system targets three challenges: character consistency, emotion alignment, and multi-character animation. Suggested applications include e-commerce and social media.

Analyzed Dec 30, 2025
StreamDiffusion: Real-Time Interactive Generation with Diffusion Pipelines

StreamDiffusion is a diffusion pipeline designed for real-time interactive generation that significantly improves the performance of existing diffusion-based image generation techniques. It offers a pipeline-level solution for high-speed image and text-to-image generation, making interactive AI experiences more accessible. The project introduces several features that optimize computational efficiency and GPU utilization.

Analyzed Dec 13, 2025
Step-Video-T2V: State-of-the-Art Text-to-Video Generation Model

Step-Video-T2V is a state-of-the-art pre-trained text-to-video model with 30 billion parameters, capable of generating videos up to 204 frames long. It achieves high efficiency through a deep-compression Video-VAE and improves visual quality using Direct Preference Optimization (DPO). The model is evaluated on a new benchmark introduced alongside it, Step-Video-T2V-Eval, where it demonstrates superior text-to-video quality.

Analyzed Oct 29, 2025
Leffa: Controllable Person Image Generation with Flow Fields in Attention

Leffa is a unified framework for controllable person image generation that enables precise control of appearance (virtual try-on) and pose (pose transfer). It addresses the common problem of distorted fine-grained textural detail by learning flow fields in attention, which guide each target query to attend to the correct reference key. It achieves state-of-the-art performance, maintaining high image quality while significantly reducing detail distortion.

Analyzed Oct 12, 2025
