4 repositories tagged with Conversational AI
Intervo: Open-Source Conversational AI Platform for Voice and Chat
Intervo is an open-source platform for building, deploying, and managing goal-oriented AI agents for voice and chat. Users can create multi-step conversational workflows that understand user intent, perform tasks, and integrate with existing systems. The platform supports multimodal interactions, from real-time voice calls to web chat.

TEN VAD: Low-Latency, High-Performance Voice Activity Detector
TEN VAD is a low-latency, lightweight Voice Activity Detector (VAD) designed for real-time enterprise use. It provides frame-level speech activity detection and, according to the project, is more accurate than common alternatives such as WebRTC VAD and Silero VAD. In conversational AI pipelines, it helps reduce end-to-end latency and improves speech segment extraction.

CSM: A Conversational Speech Generation Model by SesameAILabs
CSM (Conversational Speech Model) is a speech generation model from SesameAILabs that generates RVQ audio codes from text and audio inputs. It uses a Llama backbone and a smaller audio decoder that produces Mimi audio codes, enabling context-aware speech synthesis. The model is now natively available in Hugging Face Transformers.

VideoSDK AI Agents: Build Real-time Multimodal Conversational AI
VideoSDK AI Agents is an open-source Python framework for developing real-time, multimodal conversational AI agents. It enables natural voice and media interactions between users and agents within VideoSDK rooms, and it supports integration with various AI models and tools.