Repository History

1 repository tagged with Speech Generation

Topic: Speech Generation

CSM: A Conversational Speech Generation Model by SesameAILabs

CSM (Conversational Speech Model) is a speech generation model from SesameAILabs that generates RVQ (residual vector quantization) audio codes from text and audio inputs. It pairs a Llama backbone with a smaller audio decoder that produces Mimi audio codes, so the synthesized speech can take prior conversational context into account. The model is natively supported in Hugging Face Transformers, which makes it readily accessible to researchers and developers.

Analyzed Jan 11, 2026

