TTSFM: OpenAI-Compatible Text-to-Speech API Service (Project Notice)

Summary

TTSFM was a project designed to mirror OpenAI's TTS service, offering a compatible API for free text-to-speech conversion with multiple voice options. Built on the openai.fm backend, it provided a Python SDK, a RESTful API, and a web playground for testing and integration. Note: the project is no longer functional because the openai.fm demo website has been shut down.

Repository Info

Updated on March 1, 2026

Introduction

TTSFM was an open-source project that provided a free, OpenAI-compatible text-to-speech API service. It converted text into natural-sounding speech through the openai.fm backend, which was based on OpenAI's GPT-4o mini TTS. The project included a Python SDK, RESTful API endpoints, and a web playground for testing and integration.

NOTICE: This project is no longer functional because the openai.fm demo website has been shut down.

Key features that TTSFM offered included:

  • Multiple Voices: A selection of 11 OpenAI-compatible voices (alloy, ash, ballad, coral, echo, fable, nova, onyx, sage, shimmer, verse).
  • Flexible Audio Formats: Support for 6 audio formats (MP3, WAV, OPUS, AAC, FLAC, PCM).
  • Speed Control: Adjustable playback speed from 0.25x to 4.0x.
  • Long Text Support: Automatic text splitting and audio combining for extended content.
  • Real-time Streaming: WebSocket support for streaming audio generation.
  • Python SDK: Easy-to-use synchronous and asynchronous clients.
  • Web Playground: An interactive web interface for testing and experimentation.
  • Docker Ready: Pre-built Docker images for instant deployment.
  • OpenAI Compatible: Designed as a drop-in replacement for OpenAI's TTS API.
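
The long text support listed above relied on splitting input into chunks before synthesis. The following is a minimal sketch of that idea, not TTSFM's actual implementation: the function name `split_text`, the `max_chars` parameter, and its default of 1000 are assumptions made for illustration. It prefers sentence boundaries and hard-wraps only when a single sentence exceeds the limit. Combining the resulting audio segments is not shown.

```python
import re


def split_text(text: str, max_chars: int = 1000) -> list[str]:
    """Split text into chunks of at most max_chars, preferring sentence boundaries.

    Hypothetical helper for illustration; the limit is an assumed value.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")

    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())

    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        # A single sentence longer than the limit is hard-wrapped.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        # Append the sentence to the current chunk if it still fits.
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be sent as a separate speech request, with the audio segments joined afterwards in order.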

Installation

Although the project is no longer functional, the installation methods are listed here for historical and educational purposes:

Python package

pip install ttsfm        # core client
pip install "ttsfm[web]" # core client + web/server dependencies (quoted so shells such as zsh do not expand the brackets)

Docker image

TTSFM offered two Docker image variants:

Full variant (recommended)

docker run -p 8000:8000 dbcccc/ttsfm:latest

This variant included ffmpeg for advanced features like all 6 audio formats, speed adjustment, and format conversion.

Slim variant - ~100MB

docker run -p 8000:8000 dbcccc/ttsfm:slim

This minimal image provided basic TTS functionality with MP3 and WAV formats only, without speed adjustment or advanced conversion.

The container exposed the web playground at http://localhost:8000 and an OpenAI-compatible endpoint at /v1/audio/speech.
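
The differences between the two image variants can be written down as a small capability table. The sketch below is illustrative only: `ImageVariant`, `FULL`, `SLIM`, and `check_request` are hypothetical names, not part of the TTSFM package. The format lists and the 0.25 to 4.0 speed range come from the descriptions above.

```python
from dataclasses import dataclass

# The six formats the full image was described as supporting.
ALL_FORMATS = frozenset({"mp3", "wav", "opus", "aac", "flac", "pcm"})


@dataclass(frozen=True)
class ImageVariant:
    """Hypothetical description of what one Docker image variant supported."""
    name: str
    formats: frozenset
    speed_control: bool


FULL = ImageVariant("full", ALL_FORMATS, True)
SLIM = ImageVariant("slim", frozenset({"mp3", "wav"}), False)


def check_request(variant: ImageVariant, response_format: str, speed: float = 1.0) -> list[str]:
    """Return a list of problems; an empty list means the request fits the variant."""
    problems = []
    fmt = response_format.lower()
    if fmt not in ALL_FORMATS:
        problems.append(f"unknown format: {response_format}")
    elif fmt not in variant.formats:
        problems.append(f"{fmt} is not available in the {variant.name} image")
    if not 0.25 <= speed <= 4.0:
        problems.append("speed must be between 0.25 and 4.0")
    elif speed != 1.0 and not variant.speed_control:
        problems.append(f"speed adjustment is not available in the {variant.name} image")
    return problems
```

For example, `check_request(SLIM, "flac", 1.5)` reports two problems, while the same request against `FULL` reports none.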

Examples

Here are examples of how TTSFM could be used:

Python client

from ttsfm import TTSClient, AudioFormat, Voice

client = TTSClient()

# Basic usage
response = client.generate_speech(
    text="Hello from TTSFM!",
    voice=Voice.ALLOY,
    response_format=AudioFormat.MP3,
)
response.save_to_file("hello")  # -> hello.mp3

# With speed adjustment (requires ffmpeg)
response = client.generate_speech(
    text="This will be faster!",
    voice=Voice.NOVA,
    response_format=AudioFormat.MP3,
    speed=1.5,  # 1.5x speed (0.25 - 4.0)
)
response.save_to_file("fast")  # -> fast.mp3

CLI

ttsfm "Hello, world" --voice nova --format mp3 --output hello.mp3

REST API (OpenAI-compatible)

# Basic request
curl -X POST http://localhost:8000/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tts-1",
    "input": "Hello world!",
    "voice": "alloy",
    "response_format": "mp3"
  }' --output speech.mp3

# With speed adjustment (requires full image)
curl -X POST http://localhost:8000/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tts-1",
    "input": "Hello world!",
    "voice": "alloy",
    "response_format": "mp3",
    "speed": 1.5
  }' --output speech_fast.mp3
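
The same request can be built from Python using only the standard library. This is a sketch under stated assumptions: `build_speech_request` and `save_speech` are hypothetical helper names, and the default base URL assumes a container listening locally on port 8000 as in the curl examples. Since the service is no longer functional, the request is constructed separately from the step that would send it.

```python
import json
import urllib.request


def build_speech_request(text, voice="alloy", response_format="mp3",
                         speed=None, base_url="http://localhost:8000"):
    """Build (but do not send) a POST request for /v1/audio/speech.

    Hypothetical helper mirroring the JSON body of the curl examples.
    """
    payload = {
        "model": "tts-1",
        "input": text,
        "voice": voice,
        "response_format": response_format,
    }
    # Only include speed when the caller asks for it, as in the second curl call.
    if speed is not None:
        payload["speed"] = speed
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def save_speech(request, path):
    """Send the request and write the returned audio bytes to path."""
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        out.write(response.read())
```

With a running server, `save_speech(build_speech_request("Hello world!", speed=1.5), "speech_fast.mp3")` would correspond to the second curl call.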

Why Use It (Historical Context)

TTSFM was a significant project for several reasons, even though it is no longer active. It demonstrated how an OpenAI-compatible text-to-speech service could be self-hosted and offered for free, providing an alternative for developers. Its comprehensive feature set, including multiple voices, audio formats, speed control, and long text support, made it a versatile tool for various applications. The project's Python SDK and Docker readiness also highlighted its ease of integration and deployment.

Disclaimer: This project was intended for educational and research purposes only. It was a reverse-engineered implementation of the openai.fm service and was not recommended for commercial use or in production environments. Users were responsible for ensuring compliance with applicable laws and terms of service.
