# Voicebox: The Open-Source AI Voice Studio for Cloning and Dictation

This repository profile is provided by osrepos.com, an open source repository discovery platform.

Source: osrepos.com
Repository profile: https://osrepos.com/repo/jamiepine-voicebox
Generated for open source discovery and AI-assisted research.

Voicebox is an open-source AI voice studio for cloning voices, generating speech in multiple languages, and dictating into any application. It provides a local-first voice I/O stack as an alternative to cloud-based services: everything runs on your local machine, so voice data stays under your control.

GitHub: https://github.com/jamiepine/voicebox

## Topics

- AI
- Voice Cloning
- Speech Synthesis
- Text-to-Speech
- Speech-to-Text
- Open Source
- TypeScript
- Desktop App

## Repository Information

Last analyzed by OSRepos: Thu Jun 25 2026 08:47:56 GMT+0100 (Western European Summer Time)

## Safety Notice

OSRepos shares public repositories for knowledge and discovery only. Review source code, dependencies, licenses, and security implications before running or installing anything.

## Content

## Introduction

Voicebox is an open-source AI voice studio designed for local-first operation, positioned as an alternative to cloud-based services such as ElevenLabs and WisprFlow. It clones voices from short audio samples, generates speech in 23 languages across 7 TTS engines, and dictates text into any application through a global hotkey. It also integrates with AI agents, so the full voice input/output stack runs on your machine and your voice data stays local.

Key features include:
*   **Complete privacy:** All models, voice data, and captures remain on your local machine.
*   **Diverse TTS engines:** Access 7 different Text-to-Speech engines, including Qwen3-TTS and LuxTTS.
*   **Multi-language support:** Generate speech in 23 languages, from English to Arabic, Japanese, and Hindi.
*   **Voice cloning and presets:** Create zero-shot voice clones or use one of more than 50 curated preset voices.
*   **Advanced audio effects:** Apply pitch shift, reverb, delay, and other post-processing effects.
*   **Global dictation:** Use a hotkey for system-wide voice input, with Whisper-based Speech-to-Text.
*   **Agent integration:** Enable AI agents to speak in cloned voices via a simple API.

## Installation

Pre-built binaries are available for macOS and Windows, and Docker is available for containerized deployment. Linux users can build from source.

**For macOS (Apple Silicon):**
[Download DMG](https://voicebox.sh/download/mac-arm)

**For macOS (Intel):**
[Download DMG](https://voicebox.sh/download/mac-intel)

**For Windows:**
[Download MSI](https://voicebox.sh/download/windows)

**For Docker:**
```bash
docker compose up
```


For detailed instructions, including building from source on Linux, refer to the [official documentation](https://docs.voicebox.sh).

## Examples

Voicebox exposes an HTTP API on the local machine for use from your own applications and scripts. The following curl examples call it at `127.0.0.1:17493`:

**Generate speech:**
```bash
curl -X POST http://127.0.0.1:17493/generate \
  -H "Content-Type: application/json" \
  -d '{"text": "Hello world", "profile_id": "abc123", "language": "en"}'
```


**Agent voice output:**
```bash
curl -X POST http://127.0.0.1:17493/speak \
  -H "Content-Type: application/json" \
  -H "X-Voicebox-Client-Id: my-script" \
  -d '{"text": "Deploy complete.", "profile": "Morgan"}'
```

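The two JSON requests above can also be built from a script. The following is a minimal Python sketch using only the standard library, not an official client. The URL, paths, header name, and payload fields are copied from the curl examples; the helper names `build_json_post` and `send` are hypothetical, and the response format is not described here, so the sketch returns raw bytes without parsing them.

```python
import json
import urllib.request

# Base URL copied from the curl examples; adjust it if your instance differs.
BASE_URL = "http://127.0.0.1:17493"


def build_json_post(path, payload, client_id=None):
    """Build (but do not send) a JSON POST request for a Voicebox endpoint.

    Hypothetical helper: it only mirrors what the curl commands send.
    """
    headers = {"Content-Type": "application/json"}
    if client_id is not None:
        # Header name taken from the /speak curl example.
        headers["X-Voicebox-Client-Id"] = client_id
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def send(request, timeout=30.0):
    """Send a prepared request and return the raw response body as bytes."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


# Same payloads as the /generate and /speak curl examples.
generate_request = build_json_post(
    "/generate",
    {"text": "Hello world", "profile_id": "abc123", "language": "en"},
)
speak_request = build_json_post(
    "/speak",
    {"text": "Deploy complete.", "profile": "Morgan"},
    client_id="my-script",
)
```

With a Voicebox instance running locally, `send(generate_request)` performs the call. Building the request separately from sending it keeps the payload easy to inspect or log before anything reaches the server.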

**Transcribe an audio file:**
```bash
curl -X POST http://127.0.0.1:17493/transcribe \
  -F "audio=@recording.wav" \
  -F "model=whisper-turbo"
```

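The `-F` flags in the curl command send a `multipart/form-data` body. As a rough illustration of what that upload looks like from a script, here is a standard-library Python sketch. The field names `audio` and `model` and the URL come from the curl example; the helper names and the `application/octet-stream` part type are assumptions made for this sketch, and the response format is not described here.

```python
import uuid
import urllib.request

# Endpoint copied from the curl example above.
TRANSCRIBE_URL = "http://127.0.0.1:17493/transcribe"


def encode_multipart(file_field, filename, file_bytes, fields):
    """Encode one file plus plain text fields as multipart/form-data.

    Returns a (body, content_type) tuple.
    """
    boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii") + b"\r\n"
    file_header = (
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'
    )
    parts = [
        delimiter,
        file_header.encode("utf-8"),
        # Assumed part type; the server may not require a specific one.
        b"Content-Type: application/octet-stream\r\n\r\n",
        file_bytes,
        b"\r\n",
    ]
    for name, value in fields.items():
        field_header = f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
        parts += [delimiter, field_header.encode("utf-8"),
                  value.encode("utf-8"), b"\r\n"]
    # The closing delimiter marks the end of the multipart body.
    parts.append(b"--" + boundary.encode("ascii") + b"--\r\n")
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_transcribe_request(audio_bytes, filename="recording.wav",
                             model="whisper-turbo"):
    """Build (but do not send) the upload request for /transcribe."""
    body, content_type = encode_multipart(
        "audio", filename, audio_bytes, {"model": model}
    )
    return urllib.request.Request(
        TRANSCRIBE_URL,
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
```

To use it, read the recording with `pathlib.Path("recording.wav").read_bytes()`, pass the bytes to `build_transcribe_request`, and hand the result to `urllib.request.urlopen` while Voicebox is running.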

**List voice profiles:**
```bash
curl http://127.0.0.1:17493/profiles
```


Voicebox also ships with a built-in Model Context Protocol (MCP) server, so MCP-aware agents such as Claude Code or Cursor can use its voice capabilities.

## Why Use Voicebox?

Voicebox's local-first design keeps your voice data and models on your own machine. It combines multi-engine voice cloning, expressive speech generation, audio effects, and unlimited generation length with AI agent integration and global dictation, which makes it relevant to developers, content creators, and anyone who wants a complete voice I/O stack. It is built with Tauri (Rust) for native performance and supports a wide range of GPUs.

## Links

*   **GitHub Repository:** [jamiepine/voicebox](https://github.com/jamiepine/voicebox)
*   **Official Website:** [voicebox.sh](https://voicebox.sh)
*   **Documentation:** [docs.voicebox.sh](https://docs.voicebox.sh)
*   **Latest Releases:** [GitHub Releases](https://github.com/jamiepine/voicebox/releases/latest)