Repository History
Explore all analyzed open source repositories

CSM: A Conversational Speech Generation Model by SesameAILabs
CSM (Conversational Speech Model) is a speech generation model from SesameAILabs that generates RVQ audio codes from text and audio inputs. It uses a Llama backbone and a smaller audio decoder that produces Mimi audio codes, so the generated speech can take prior conversational context into account. The model is now natively available in Hugging Face Transformers, making it accessible to researchers and developers.

VideoSDK AI Agents: Build Real-time Multimodal Conversational AI
VideoSDK AI Agents is an open-source Python framework for developing real-time, multimodal conversational AI agents. It enables natural voice and media interactions between users and agents within VideoSDK rooms. The framework supports integration with various AI models and tools.