Repository History
Explore all analyzed open source repositories

VoxCPM: Tokenizer-Free TTS for Multilingual Speech, Voice Design, and Cloning
VoxCPM2 is a tokenizer-free text-to-speech system that synthesizes natural, expressive speech in 30 languages. It supports voice design from natural-language descriptions and controllable voice cloning. The 2B-parameter model outputs 48kHz studio-quality audio.

MOSS-TTS Family: Open-Source High-Fidelity Speech and Sound Generation
The MOSS-TTS Family is an open-source suite of models for high-fidelity, expressive speech and sound generation. Designed for complex real-world scenarios, it covers stable long-form speech, multi-speaker dialogue, voice design, environmental sound effects, and real-time streaming TTS. The models are developed by MOSI.AI and the OpenMOSS team.

tortoise.cpp: Local Text-to-Speech with GGML and C++
tortoise.cpp is a C++ re-implementation of the popular Tortoise-TTS model, leveraging the efficient GGML library. This project enables high-quality, local text-to-speech generation without the need for Python dependencies. It aims to make advanced speech synthesis more accessible and performant on various hardware configurations.

StreamingKokoroJS: Unlimited, Local Text-to-Speech in Your Browser
StreamingKokoroJS provides unlimited text-to-speech capabilities directly within your browser, ensuring 100% local processing and complete privacy. This open-source project leverages the Kokoro-JS model and WebGPU acceleration to deliver high-quality, streaming audio generation without server-side interaction.

TTSFM: OpenAI-Compatible Text-to-Speech API Service (Project Notice)
TTSFM was a project designed to mirror OpenAI's TTS service, offering a compatible API for free text-to-speech conversion with multiple voice options. Built on the openai.fm backend, it provided a Python SDK, RESTful API, and a web playground for easy testing and integration. Note that the project is no longer functional because the openai.fm demo website has been shut down.

GPT-SoVITS: Few-Shot Voice Cloning and Text-to-Speech WebUI
GPT-SoVITS is a powerful web-based tool for few-shot voice conversion and text-to-speech. It allows users to train a high-quality TTS model with as little as one minute of voice data. This project offers robust voice cloning capabilities and cross-lingual support, making advanced voice synthesis accessible.

ChatTTS: A Generative Speech Model for Natural Dialogue and LLM Assistants
ChatTTS is a text-to-speech model designed for dialogue scenarios such as LLM assistants. It produces natural, expressive speech with fine-grained control over prosodic features such as laughter, pauses, and interjections. The Python-based project supports both English and Chinese.