Repository History
Explore all analyzed open-source repositories

Chatterbox: State-of-the-Art Open-Source Text-to-Speech by Resemble AI
Chatterbox is a family of open-source text-to-speech (TTS) models developed by Resemble AI for high-quality speech generation. It includes Chatterbox-Turbo, an efficient model that supports paralinguistic tags for added realism, alongside multilingual and general-purpose TTS models. The models target voice agents, narration, and creative workflows, and include responsible-AI features such as built-in watermarking.

Spark-TTS: Efficient LLM-Based Text-to-Speech with Zero-Shot Voice Cloning
Spark-TTS is a text-to-speech system that uses a large language model (LLM) for accurate, natural-sounding voice synthesis. Built on Qwen2.5, it offers a streamlined, efficient design, high-quality zero-shot voice cloning, bilingual support for Chinese and English, and controllable speech generation, which makes it suitable for both research and production.

sherpa-onnx: Offline Speech AI for Any Platform and Language
sherpa-onnx is an open-source library for offline speech processing, including speech-to-text, text-to-speech, and speaker diarization. Built on next-gen Kaldi with ONNX Runtime, it runs on embedded systems, mobile devices, and desktop platforms. It supports 12 programming languages and works without an internet connection.

YouTube Summarizer: AI-Powered Summaries for YouTube Videos and Playlists
YouTube Summarizer is a Flask web application that generates concise, AI-powered summaries of YouTube videos and entire playlists. It extracts transcripts, summarizes them with AI models such as Google Gemini and OpenAI GPT, and can convert the summaries into audio using Google's Text-to-Speech API.

Open NotebookLM: Convert PDFs into Personalized Podcast Episodes
Open NotebookLM is an open-source project that converts any PDF document into a podcast episode. Inspired by NotebookLM, it uses LLMs and text-to-speech models to generate natural dialogue from your documents, offering a more accessible and enjoyable way to learn from and absorb their content.