FreeLLMAPI: Stack 16 Free LLM Tiers for 1.7 Billion Tokens/Month

This repository profile is provided by osrepos.com, an open source repository discovery platform.

Summary

FreeLLMAPI is an OpenAI-compatible proxy that aggregates the free tiers of 16 LLM providers, offering access to approximately 1.7 billion tokens per month. It exposes these models through a single endpoint, with smart routing, automatic failover, and encrypted key storage. It is intended for personal experimentation, letting developers draw on several free LLM tiers at once.

Repository Information

Analyzed by OSRepos on June 27, 2026

Use at your own risk

OSRepos shares public repositories for knowledge and discovery only. Any installation, execution, configuration, or use of code from these repositories is the user's own responsibility. Always review the repository, source code, dependencies, licenses, and security implications before running or installing anything. OSRepos is not responsible for issues, damages, or losses resulting from third-party repositories.

Introduction

FreeLLMAPI is a self-hosted, OpenAI-compatible proxy that consolidates the free tiers of 16 Large Language Model (LLM) providers. By stacking these free resources, it offers access to approximately 1.7 billion tokens per month through a single /v1 endpoint. The project aims to simplify the use of various LLMs for personal experimentation, with smart routing, automatic failover, and encrypted key storage.

Why Use FreeLLMAPI?

Individually, the free tiers offered by major AI labs often feel limited, and combining them by hand means managing multiple SDKs, tracking different rate limits, and handling request failures across many providers. FreeLLMAPI collapses this into one unified, OpenAI-compatible endpoint. You point any OpenAI client library at your local FreeLLMAPI server, which transparently routes requests across the providers for which you've added keys. Aggregated, the individual "toy" tiers add up to a substantial working inference capacity.

Key Features

FreeLLMAPI provides the following features:

  • OpenAI-compatible API: Works seamlessly with official OpenAI SDKs and any OpenAI-compatible client (LangChain, LlamaIndex, etc.) by simply changing the base_url.
  • Anthropic Messages API Support: Integrates with Anthropic's wire format, allowing Claude clients and SDKs to run against your free LLM pool.
  • Image Generation & Text-to-Speech: Routes requests for media models across supported providers.
  • Streaming and Non-Streaming: Supports both Server-Sent Events for streaming and standard JSON responses.
  • Tool Calling: Passes through OpenAI-style tools and tool_choice requests, supporting multi-step tool-use flows.
  • Embeddings: Provides a /v1/embeddings endpoint with family-based routing, ensuring failover only occurs between compatible models.
  • Automatic Failover: Automatically retries requests on the next available model in your fallback chain if a provider returns a 429, 5xx, or times out.
  • Per-Key Rate Tracking: Monitors RPM, RPD, TPM, and TPD counters for each (platform, model, key) to stay within free-tier caps.
  • Encrypted Key Storage: API keys are encrypted with AES-256-GCM for enhanced security.
  • Unified API Key: Clients authenticate to your proxy with a single freellmapi-... bearer token, keeping upstream provider keys private.
  • Admin Dashboard: A React + Vite UI for managing keys, reordering the fallback chain, inspecting analytics, and using a prompt playground.
  • Context Handoff: Optionally injects a compact system message when switching models mid-conversation to improve continuity.
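
The failover and per-key rate tracking described in the list above can be illustrated with a short sketch. This is not FreeLLMAPI's actual implementation: the `Route`, `UpstreamError`, and `route_request` names are hypothetical, only a requests-per-minute counter is modelled (the project also tracks RPD, TPM, and TPD), and the upstream call is replaced by a caller-supplied function.

```python
import time
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, List, Optional


class UpstreamError(Exception):
    """Hypothetical error carrying the HTTP status returned by a provider."""

    def __init__(self, status: int):
        super().__init__(f"upstream returned {status}")
        self.status = status


def is_retryable(status: int) -> bool:
    # Fail over on rate limiting (429) and on any server error (5xx).
    return status == 429 or 500 <= status <= 599


@dataclass
class Route:
    """One (platform, model, key) entry in the fallback chain."""
    platform: str
    model: str
    key_id: str
    rpm_limit: int
    # Timestamps of requests sent through this route in the last 60 seconds.
    sent: Deque[float] = field(default_factory=deque)

    def has_capacity(self, now: float) -> bool:
        while self.sent and now - self.sent[0] >= 60:
            self.sent.popleft()
        return len(self.sent) < self.rpm_limit


def route_request(chain: List[Route], send: Callable[[Route], str],
                  clock: Callable[[], float] = time.monotonic) -> str:
    """Try routes in order, skipping any at its RPM cap and failing over on retryable errors."""
    last_error: Optional[Exception] = None
    for route in chain:
        now = clock()
        if not route.has_capacity(now):
            continue  # this key is at its per-minute cap; try the next route
        route.sent.append(now)
        try:
            return send(route)
        except TimeoutError as exc:
            last_error = exc
        except UpstreamError as exc:
            if not is_retryable(exc.status):
                raise  # e.g. a 400 is the caller's problem, not the provider's
            last_error = exc
    raise RuntimeError("all routes exhausted") from last_error


def fake_send(route: Route) -> str:
    # Stand-in for a real upstream call: the first provider is rate limited.
    if route.platform == "provider-a":
        raise UpstreamError(429)
    return f"answered by {route.platform}/{route.model}"


demo_chain = [
    Route("provider-a", "model-1", "key-1", rpm_limit=1),
    Route("provider-b", "model-2", "key-2", rpm_limit=5),
]
print(route_request(demo_chain, fake_send))
```

Keeping the counters on each route means a rate-limited key is skipped before any request is sent, while errors that only surface after sending still fall through to the next entry in the chain.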

Installation

Getting FreeLLMAPI up and running is straightforward.

One-liner (Docker Required)

For a quick setup, use the provided install script:

curl -fsSL https://freellmapi.co/install.sh | bash

This script sets up ~/freellmapi, generates an encryption key, pulls the Docker image, and starts the container.

Manual Docker Compose

If you prefer a manual Docker Compose setup:

  1. Clone the repository:
    git clone https://github.com/tashfeenahmed/freellmapi.git
    cd freellmapi
    
  2. Generate an encryption key and create a .env file:
    ENCRYPTION_KEY="$(openssl rand -hex 32)"
    printf "ENCRYPTION_KEY=%s\\nPORT=3001\\n" "$ENCRYPTION_KEY" > .env
    
  3. Start the services:
    docker compose up -d
    

    Access the dashboard at http://localhost:3001. Remember to add your provider keys and configure the fallback chain.
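
For readers without openssl available, step 2 can be approximated in Python. This is a minimal sketch: `openssl rand -hex 32` emits 32 random bytes as 64 hex characters, which matches the 256-bit key size AES-256 requires. The helper names are hypothetical, and how FreeLLMAPI decodes the value internally is an assumption not confirmed by the source.

```python
import secrets


def generate_encryption_key() -> str:
    """Return 32 random bytes as 64 hex characters, like `openssl rand -hex 32`."""
    return secrets.token_hex(32)


def render_env(key: str, port: int = 3001) -> str:
    """Build the same two-line .env content that the printf command writes."""
    return f"ENCRYPTION_KEY={key}\nPORT={port}\n"


key = generate_encryption_key()
env_text = render_env(key)
# Print only the key length so the secret never lands in terminal scrollback.
print(f"generated a key of {len(bytes.fromhex(key)) * 8} bits")
```

Writing `env_text` to a `.env` file next to the compose file would reproduce the shell step.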

Desktop App

A native menu-bar application is available for macOS and Windows, providing a local router and dashboard directly from your system tray. You can download the latest .dmg or .exe installer from the GitHub Releases page.

Examples

Once FreeLLMAPI is running, you can use any OpenAI-compatible client.

Python

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:3001/v1",
    api_key="freellmapi-your-unified-key",
)

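# with_raw_response exposes the HTTP response so its headers can be read;
# the parsed completion object itself has no .headers attribute.
raw = client.chat.completions.with_raw_response.create(
    model="auto",  # let the router pick; or specify e.g. "gemini-2.5-flash"
    messages=[{"role": "user", "content": "Summarise the fall of Rome in one sentence."}],
)
resp = raw.parse()
print(resp.choices[0].message.content)
print("Routed via:", raw.headers.get("x-routed-via"))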

curl

curl http://localhost:3001/v1/chat/completions \
  -H "Authorization: Bearer freellmapi-your-unified-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{"role": "user", "content": "hi"}]
  }'

Streaming

stream = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "Stream me a haiku about SQLite."}],
    stream=True,
)
for chunk in stream:
    if chunk.choices:  # some chunks (e.g. a final usage chunk) can carry no choices
        print(chunk.choices[0].delta.content or "", end="", flush=True)
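
Under the hood, the SDK is reading a Server-Sent Events stream. The sketch below shows how such a stream could be parsed by hand, assuming FreeLLMAPI emits the same wire format as OpenAI's API (`data:` lines carrying JSON chunks, terminated by `data: [DONE]`). The source implies this through its OpenAI compatibility but does not show it. The `iter_sse_text` helper and the sample lines are illustrative only.

```python
import json
from typing import Iterable, Iterator


def iter_sse_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield the text fragments carried by OpenAI-style SSE chat chunks."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # sentinel that marks the end of the stream
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Hand-written sample of what a stream might look like on the wire.
sample_lines = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Rows "}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "at rest"}}]}',
    "",
    "data: [DONE]",
]
print("".join(iter_sse_text(sample_lines)))  # prints "Rows at rest"
```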

Tool Calling

FreeLLMAPI supports OpenAI-style tool calling, so a model can request a function call and receive its result in a follow-up message.

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# 1. Model asks for a tool call
first = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "What's the weather in Karachi?"}],
    tools=tools,
    tool_choice="required",
)
call = first.choices[0].message.tool_calls[0]

# 2. You execute the tool, feed the result back
final = client.chat.completions.create(
    model="auto",
    messages=[
        {"role": "user", "content": "What's the weather in Karachi?"},
        first.choices[0].message,
        {"role": "tool", "tool_call_id": call.id, "content": '{"temp_c": 32, "cond": "sunny"}'},
    ],
    tools=tools,
)
print(final.choices[0].message.content)
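
The example above hard-codes the tool result. One way step 2 could be filled in is sketched below: look up the requested function by name, decode its JSON-string arguments, and wrap the result in a `tool` message. The `get_weather` implementation, `TOOL_REGISTRY`, and `run_tool_call` are hypothetical, and a `SimpleNamespace` stands in for the SDK's tool-call object so the snippet runs without a server.

```python
import json
from types import SimpleNamespace


def get_weather(city: str) -> dict:
    """Hypothetical local tool; returns canned data instead of calling a weather API."""
    return {"city": city, "temp_c": 32, "cond": "sunny"}


# Maps the tool names advertised in `tools` to local Python callables.
TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_call(call) -> dict:
    """Execute one tool call and build the matching `tool` role message."""
    func = TOOL_REGISTRY[call.function.name]
    args = json.loads(call.function.arguments)  # arguments arrive as a JSON string
    result = func(**args)
    return {"role": "tool", "tool_call_id": call.id, "content": json.dumps(result)}


# Stand-in shaped like first.choices[0].message.tool_calls[0].
fake_call = SimpleNamespace(
    id="call_123",
    function=SimpleNamespace(name="get_weather", arguments='{"city": "Karachi"}'),
)
print(run_tool_call(fake_call))
```

The returned dictionary can be placed in the `messages` list in place of the hard-coded `tool` entry.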

Vision / Image Input

Send images using standard OpenAI image_url content blocks. The router automatically restricts requests to vision-capable models.

resp = client.chat.completions.create(
    model="auto",  # auto-routes to a vision model
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {"type": "image_url", "image_url": {"url": "data:image/png;base64,<...>"}},
        ],
    }],
)
print(resp.choices[0].message.content)
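
The base64 payload in the data URL is elided in the example above. A minimal sketch of how such a URL could be built from raw image bytes follows. The `to_data_url` and `image_block` helpers are hypothetical names, and the eight-byte PNG signature is used as stand-in data rather than a real image.

```python
import base64


def to_data_url(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def image_block(image_bytes: bytes, mime_type: str = "image/png") -> dict:
    """Build an OpenAI-style image_url content block from raw bytes."""
    return {"type": "image_url", "image_url": {"url": to_data_url(image_bytes, mime_type)}}


# The 8-byte PNG file signature, used here instead of a full image.
png_signature = b"\x89PNG\r\n\x1a\n"
print(to_data_url(png_signature))  # prints data:image/png;base64,iVBORw0KGgo=
```

In practice the bytes would come from reading a file in binary mode, and the resulting block would go in the `content` list next to the text block.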
