tortoise.cpp: Local Text-to-Speech with GGML and C++

Summary

tortoise.cpp is a C++ re-implementation of the Tortoise-TTS model, built on the GGML library. It generates speech locally without Python dependencies and aims to make speech synthesis more accessible and performant on CPUs, CUDA-enabled GPUs, and Apple Metal.

Repository Info

Updated on May 12, 2026

Introduction

tortoise.cpp ports the Tortoise-TTS text-to-speech model to C++ using the GGML library. Users can generate speech directly on their own machines, which avoids the Python environment that the original model requires. By building on GGML, the project aims for good performance on several backends: CPU, CUDA-enabled GPUs, and Apple Metal.

Installation

To get started with tortoise.cpp, you'll first need to clone the repository and then compile it for your specific system. Ensure you clone recursively to fetch all necessary submodules.

Downloading

git clone --recursive https://github.com/balisujohn/tortoise.cpp.git
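
If the repository was already cloned without --recursive, the submodules can usually be fetched afterwards with standard git commands. The sketch below is illustrative only: the helper name uninitialized_submodules is hypothetical, and it relies on git submodule status marking entries that are not yet initialized with a leading "-".

```shell
# Hypothetical helper: reads `git submodule status` output on stdin and
# prints how many submodules are not yet initialized (lines starting with "-").
uninitialized_submodules() {
    grep -c '^-' || true
}

# Usage inside an existing clone (run manually):
#   git submodule status | uninitialized_submodules
# If the count is non-zero, fetch the missing submodules:
#   git submodule update --init --recursive
```

A count of 0 suggests the submodules are already in place and the build can proceed.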

Compiling

For CPU (Linux x86 and Mac ARM):

mkdir build
cd build
cmake ..
make

For CUDA-enabled GPUs (e.g., Ubuntu 22.04 with CUDA 12.0):

mkdir build
cd build
cmake .. -DGGML_CUBLAS=ON
make

For macOS with Metal (work in progress):

mkdir build
cd build
cmake .. -DGGML_METAL=ON
make
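
The three build variants above differ only in the flag passed to cmake. As a minimal sketch, a small shell function can map a backend name to that flag. The function name backend_flag and the labels cpu, cuda, and metal are hypothetical; the flags themselves are the ones listed above.

```shell
# Hypothetical helper: maps a backend label to the cmake flag shown above.
# Prints an empty line for the plain CPU build, which needs no extra flag.
backend_flag() {
    case "$1" in
        cpu)   printf '%s\n' "" ;;
        cuda)  printf '%s\n' "-DGGML_CUBLAS=ON" ;;
        metal) printf '%s\n' "-DGGML_METAL=ON" ;;
        *)     printf 'unknown backend: %s\n' "$1" >&2; return 1 ;;
    esac
}

# Usage from the repository root (run manually):
#   mkdir -p build && cd build
#   cmake .. $(backend_flag cuda)
#   make
```

Leaving the command substitution unquoted is deliberate here, so that the CPU case expands to no argument at all.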

Examples

Before running, you must download the necessary ggml-model.bin, ggml-vocoder-model.bin, and ggml-diffusion-model.bin files. These models can be found on Hugging Face.

Download Models: https://huggingface.co/balisujohn/tortoise-ggml

Place these files in the models directory within your tortoise.cpp project.
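
Before the first run, it can help to confirm that all three files are in place. The following is a sketch, not part of tortoise.cpp: the function name missing_models is hypothetical, and the default directory name models is taken from the instructions above.

```shell
# Hypothetical helper: prints the name of each required model file that is
# missing from the given directory (default: ./models).
# Returns non-zero if at least one file is missing.
missing_models() {
    models_dir="${1:-models}"
    missing=0
    for model_file in ggml-model.bin ggml-vocoder-model.bin ggml-diffusion-model.bin; do
        if [ ! -f "$models_dir/$model_file" ]; then
            printf '%s\n' "$model_file"
            missing=1
        fi
    done
    return "$missing"
}

# Usage from the build directory (run manually):
#   missing_models ../models && echo "all model files present"
```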

Basic Run Command (from the build directory):

./tortoise

Example with custom message, voice, and output:

./tortoise --message "based... dr freeman?" --voice "../models/mouse.bin" --seed 0 --output "based?.wav"

Note that only lowercase letters, spaces, and punctuation are supported in the prompt message.
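
Because of this restriction, a prompt can be checked and lowercased before it is passed to --message. The sketch below is an approximation under stated assumptions: the helper names normalize_message and is_supported_message are hypothetical, and the exact punctuation set accepted by the tokenizer is not documented here, so the POSIX [:punct:] class stands in for it.

```shell
# Hypothetical helper: lowercases ASCII letters in the message.
normalize_message() {
    printf '%s' "$1" | LC_ALL=C tr 'A-Z' 'a-z'
}

# Hypothetical helper: succeeds only if the message contains nothing but
# lowercase ASCII letters, spaces, and ASCII punctuation.
# Assumption: [:punct:] approximates the punctuation the model accepts.
is_supported_message() {
    ! printf '%s' "$1" | LC_ALL=C grep -q '[^a-z [:punct:]]'
}

# Usage from the build directory (run manually):
#   msg=$(normalize_message "Based... Dr Freeman?")
#   is_supported_message "$msg" && ./tortoise --message "$msg"
```

Digits and non-ASCII characters fail the check, so numbers would need to be spelled out by hand first.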

Why Use It

tortoise.cpp may appeal to developers and users interested in text-to-speech for several reasons:

  • Local Execution: Perform speech synthesis entirely on your local machine, ensuring privacy and reducing reliance on cloud services.
  • Efficiency with GGML: Build on the GGML tensor library, which is designed for efficient inference and may make the project suitable for real-time or resource-constrained applications, depending on hardware.
  • C++ Implementation: Benefit from the speed and control offered by C++, without the overhead of Python environments.
  • Cross-Platform Support: Compile and run on Linux x86 and Mac ARM CPUs, NVIDIA GPUs (CUDA), and Apple Silicon with Metal (work in progress).
  • Open Source: The project is open source under the MIT License, encouraging contributions and community-driven development.

Links