Repository History

41 repositories tagged with Deep Learning

Topic: Deep Learning

InfiniteTalk: Unlimited-Length AI Video Generation from Audio or Images

InfiniteTalk is an innovative AI model for generating unlimited-length talking videos. It excels at creating realistic video content from audio, supporting both image-to-video and video-to-video generation. This framework ensures accurate lip synchronization and consistent identity preservation, aligning head movements, body posture, and facial expressions with the input audio.

Analyzed Nov 13, 2025

View Details

Gradio: Build and Share Machine Learning Apps in Python

Gradio is an open-source Python library that simplifies the creation and sharing of interactive web applications for machine learning models, APIs, or any Python function. It allows developers to quickly build user interfaces without needing JavaScript, CSS, or web hosting expertise, offering a straightforward way to demo AI projects. With Gradio, you can transform your Python functions into shareable web demos in just a few lines of code.

Analyzed Oct 31, 2025

View Details

LitServe: Build Custom Inference Engines for AI Models

LitServe is a powerful framework from Lightning AI designed to help developers build custom inference engines for a wide range of AI models and systems. It provides expert control over serving, supporting agents, multi-modal systems, RAG, and pipelines without the typical MLOps overhead. This framework offers a flexible and efficient solution for deploying AI models, whether self-hosted or managed on the Lightning AI platform.

Analyzed Oct 29, 2025

View Details

Leffa: Controllable Person Image Generation with Flow Fields in Attention

Leffa is a unified framework for controllable person image generation, enabling precise manipulation of appearance through virtual try-on and pose via pose transfer. This project addresses the common issue of fine-grained textural detail distortion by learning flow fields in attention, guiding target queries to correct reference keys. It achieves state-of-the-art performance, maintaining high image quality while significantly reducing detail distortion.

Analyzed Oct 12, 2025

View Details

chatterbox-vllm: Accelerating Chatterbox TTS with vLLM for Enhanced Performance

chatterbox-vllm is a high-performance port of the Chatterbox Text-to-Speech (TTS) model to vLLM, designed to significantly improve generation speed and GPU memory efficiency. This personal project aims to provide a more efficient and easily integratable solution for speech synthesis, offering substantial speedups compared to the original implementation. While currently usable and demonstrating benchmark-topping throughput, it leverages internal vLLM APIs and hacky workarounds, with ongoing refactoring planned.

Analyzed Oct 11, 2025

View Details

Previous Page 4 Next