parakeet-mlx: Nvidia's Parakeet ASR Models on Apple Silicon with MLX

This repository profile is provided by osrepos.com, an open source repository discovery platform.

Summary

parakeet-mlx is an open-source Python library that runs Nvidia's Parakeet Automatic Speech Recognition (ASR) models on Apple Silicon using the MLX framework. It offers both a command-line interface and a Python API for transcribing audio files, and it supports streaming transcription for real-time use. It is aimed at developers and researchers doing speech processing on Apple hardware.

Repository Information

Analyzed by OSRepos on January 29, 2026

Use at your own risk

OSRepos shares public repositories for knowledge and discovery only. Any installation, execution, configuration, or use of code from these repositories is the user's own responsibility. Always review the repository, source code, dependencies, licenses, and security implications before running or installing anything. OSRepos is not responsible for issues, damages, or losses resulting from third-party repositories.

Introduction

parakeet-mlx implements Nvidia's Parakeet Automatic Speech Recognition (ASR) models on Apple Silicon using the MLX framework, so audio files are transcribed locally on Apple hardware.

You can convert speech to text from the command-line interface (CLI) or call the library from your own Python applications. Output options include subtitles with word-level timestamps, and the project also provides beam decoding, audio chunking for long files, and real-time streaming transcription.

Installation

Before installing, make sure you have ffmpeg installed on your system, as it is required for the CLI to work properly.

Using uv (recommended):

To add as a project dependency:

uv add parakeet-mlx -U

Or, for the CLI globally:

uv tool install parakeet-mlx -U

Using pip:

pip install parakeet-mlx -U
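
Because the CLI depends on ffmpeg, it can be useful to confirm that both executables are on your PATH before transcribing anything. The snippet below is a minimal sketch using only the Python standard library; `missing_tools` is a hypothetical helper written for this example and is not part of parakeet-mlx.

```python
import shutil


def missing_tools(tools):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


# "ffmpeg" is the requirement stated above; "parakeet-mlx" is the CLI command
# used in the examples on this page.
missing = missing_tools(["ffmpeg", "parakeet-mlx"])
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("ffmpeg and parakeet-mlx are both available.")
```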

Examples

CLI Quick Start

Transcribe a single audio file:

parakeet-mlx audio.mp3

Transcribe multiple files and generate VTT subtitles with word-level timestamps:

parakeet-mlx *.mp3 --output-format vtt --highlight-words

Generate all available output formats:

parakeet-mlx audio.mp3 --output-format all
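
If you want to drive the CLI from a script, one option is to build the argument list in Python and hand it to `subprocess.run`. This is a sketch under assumptions: `build_command` and `transcribe_folder` are hypothetical helpers, and the only flags used are the two shown in the commands above (`--output-format` and `--highlight-words`).

```python
import subprocess
from pathlib import Path


def build_command(files, output_format, highlight_words=False):
    """Build the argv list for a single parakeet-mlx invocation."""
    cmd = ["parakeet-mlx", *[str(f) for f in files]]
    cmd += ["--output-format", output_format]
    if highlight_words:
        cmd.append("--highlight-words")
    return cmd


def transcribe_folder(folder, output_format, highlight_words=False):
    """Run the CLI once over every .mp3 file in `folder` (requires the CLI)."""
    files = sorted(Path(folder).glob("*.mp3"))
    if not files:
        return None  # nothing to transcribe
    cmd = build_command(files, output_format, highlight_words)
    return subprocess.run(cmd, check=True)


# Printing the command first makes it easy to check before running it.
print(build_command(["audio.mp3"], "vtt", highlight_words=True))
```

Passing a list rather than a shell string avoids quoting problems with file names that contain spaces.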

Python API Quick Start

Transcribe a file:

from parakeet_mlx import from_pretrained

model = from_pretrained("mlx-community/parakeet-tdt-0.6b-v3")

result = model.transcribe("audio_file.wav")

print(result.text)

Inspect sentence-level timestamps:

from parakeet_mlx import from_pretrained

model = from_pretrained("mlx-community/parakeet-tdt-0.6b-v3")

result = model.transcribe("audio_file.wav")

print(result.sentences)
# [AlignedSentence(text="Hello World.", start=1.01, end=2.04, duration=1.03, tokens=[...])]
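
The sentence objects printed above carry `text`, `start`, and `end` fields, which is enough to build your own subtitle cues. The sketch below is illustrative only: `Sentence` is a stand-in dataclass that mirrors those three fields so the example runs without the library, and `format_timestamp` and `to_cues` are hypothetical helpers, not parakeet-mlx functions.

```python
from dataclasses import dataclass


@dataclass
class Sentence:
    """Stand-in with the text/start/end fields seen in the printed output."""
    text: str
    start: float  # seconds
    end: float    # seconds


def format_timestamp(seconds):
    """Format a time in seconds as HH:MM:SS.mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def to_cues(sentences):
    """Turn sentences into 'start --> end' cue strings, one per sentence."""
    return [
        f"{format_timestamp(s.start)} --> {format_timestamp(s.end)}\n{s.text}"
        for s in sentences
    ]


cues = to_cues([Sentence("Hello World.", 1.01, 2.04)])
print("\n\n".join(cues))
```

With the real library you would presumably pass `result.sentences` to `to_cues` instead of the stand-in list, assuming the attributes match the output shown above.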

Transcribe long audio in overlapping chunks:

from parakeet_mlx import from_pretrained

model = from_pretrained("mlx-community/parakeet-tdt-0.6b-v3")

result = model.transcribe("audio_file.wav", chunk_duration=60 * 2.0, overlap_duration=15.0)

print(result.sentences)
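
To get a feel for what `chunk_duration` and `overlap_duration` mean, the sketch below lays out overlapping windows over a recording of a given length. It illustrates the general idea of overlapped chunking only; `chunk_windows` is a hypothetical helper, and the library's actual splitting and merging logic may differ.

```python
def chunk_windows(total_duration, chunk_duration, overlap_duration):
    """Return (start, end) windows in seconds, each overlapping the previous one."""
    if chunk_duration <= overlap_duration:
        raise ValueError("chunk_duration must be larger than overlap_duration")
    step = chunk_duration - overlap_duration  # how far each new window advances
    windows = []
    start = 0.0
    while True:
        end = min(start + chunk_duration, total_duration)
        windows.append((start, end))
        if end >= total_duration:
            break
        start += step
    return windows


# Same values as the transcribe() call above, applied to a 5-minute file.
for start, end in chunk_windows(300.0, 60 * 2.0, 15.0):
    print(f"{start:6.1f}s -> {end:6.1f}s")
```

The overlap gives each chunk some shared audio with its neighbour, so words that straddle a boundary appear whole in at least one window.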

Streaming Transcription:

For real-time transcription, use the transcribe_stream method:

from parakeet_mlx import from_pretrained
from parakeet_mlx.audio import load_audio

model = from_pretrained("mlx-community/parakeet-tdt-0.6b-v3")

# Create a streaming context
with model.transcribe_stream(
    context_size=(256, 256),  # (left_context, right_context) frames
) as transcriber:
    # Simulate real-time audio chunks
    audio_data = load_audio("audio_file.wav", model.preprocessor_config.sample_rate)
    chunk_size = model.preprocessor_config.sample_rate  # 1 second chunks

    for i in range(0, len(audio_data), chunk_size):
        chunk = audio_data[i:i+chunk_size]
        transcriber.add_audio(chunk)

        # Access current transcription
        result = transcriber.result
        print(f"Current text: {result.text}")
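
The loop above prints the full transcript after every chunk. If you would rather display only what changed between updates, one approach is to compare successive snapshots of the text. This is a sketch under assumptions: `text_delta` is a hypothetical helper, and it assumes that later snapshots may revise the tail of earlier ones, which is why it also reports how many characters to retract.

```python
import os


def text_delta(previous, current):
    """Return (chars_to_retract, text_to_append) to turn `previous` into `current`."""
    common = len(os.path.commonprefix([previous, current]))
    return len(previous) - common, current[common:]


# Simulated snapshots standing in for successive values of result.text.
snapshots = ["Hello", "Hello word", "Hello world."]
shown = ""
for snapshot in snapshots:
    retract, append = text_delta(shown, snapshot)
    print(f"retract {retract} chars, append {append!r}")
    shown = snapshot
```

In the streaming loop you would presumably call `text_delta` with the previously displayed text and the latest `result.text`.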

Why Use parakeet-mlx?

parakeet-mlx is worth considering if you need ASR that runs locally on Apple Silicon devices.

  • Optimized for Apple Silicon: By leveraging the MLX framework, parakeet-mlx delivers native and efficient performance, making it ideal for Mac users.
  • High-Quality ASR: It implements Nvidia's Parakeet models, known for their accuracy and robustness in speech recognition.
  • Versatility: It provides a command-line tool for quick tasks and a Python API for integration into larger projects.
  • Advanced Features: From detailed word and sentence-level timestamps to advanced decoding options and real-time streaming transcription, the project offers a rich set of functionalities for diverse needs.
  • Ease of Use: With clear installation instructions and comprehensive examples, it is accessible to both beginners and experienced developers.

Links

For more details and documentation, or to contribute to the project, visit the official GitHub repository.
