TOON: Compact, Human-Readable JSON for LLM Prompts

Summary
TOON, or Token-Oriented Object Notation, is a compact and human-readable data format for serializing JSON data into Large Language Model (LLM) prompts. It typically uses 30-60% fewer tokens than formatted JSON on large uniform arrays while keeping structure explicit, which lowers cost and makes the data easier for models to read reliably. The format combines indentation-based structure with tabular layouts for uniform arrays, offering an alternative to traditional JSON and YAML.
Introduction
TOON represents the same objects, arrays, and primitives as JSON, but with a syntax tuned for LLM prompts. It combines YAML-style indentation for nested objects with a CSV-style tabular layout for uniform arrays. The goal is to cut token cost and make data input more reliable for AI models by removing verbose syntax while keeping the structure explicit.
Installation
Getting started with TOON is straightforward, whether you prefer a command-line interface or a TypeScript library.
CLI (No Installation Required)
You can instantly try TOON using npx:
# Convert JSON to TOON
npx @toon-format/cli input.json -o output.toon
# Pipe from stdin
echo '{"name": "Ada", "role": "dev"}' | npx @toon-format/cli
TypeScript Library
For programmatic use, install the TypeScript SDK:
# npm
npm install @toon-format/toon
# pnpm
pnpm add @toon-format/toon
# yarn
yarn add @toon-format/toon
Example usage:
import { encode } from '@toon-format/toon'
const data = {
  users: [
    { id: 1, name: 'Alice', role: 'admin' },
    { id: 2, name: 'Bob', role: 'user' }
  ]
}
console.log(encode(data))
// users[2]{id,name,role}:
//   1,Alice,admin
//   2,Bob,user
Examples
TOON's syntax is designed for clarity and compactness. Here are some common data structures:
Objects
Simple objects with primitive values:
id: 123
name: Ada
active: true
Nested objects:
user:
  id: 123
  name: Ada
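To make the indentation rule concrete, here is a minimal sketch of how primitive and nested object fields could be written out. The `encodeObject` helper is hypothetical and is not part of `@toon-format/toon`. It assumes two-space indentation and does no quoting or escaping of strings.

```typescript
type Scalar = string | number | boolean | null;
type Nested = { [key: string]: Scalar | Nested };

// Hypothetical helper (not the library's API): emits one line per field,
// indenting nested objects by two spaces per level (an assumption).
function encodeObject(obj: Nested, depth = 0): string[] {
  const pad = "  ".repeat(depth);
  const lines: string[] = [];
  for (const key of Object.keys(obj)) {
    const value = obj[key];
    if (typeof value === "object" && value !== null) {
      // A nested object gets a bare "key:" line, then its fields one level deeper.
      lines.push(`${pad}${key}:`);
      lines.push(...encodeObject(value, depth + 1));
    } else {
      lines.push(`${pad}${key}: ${String(value)}`);
    }
  }
  return lines;
}

console.log(encodeObject({ user: { id: 123, name: "Ada" } }).join("\n"));
```

For real data, use the library's `encode`, which also handles arrays and strings that need quoting.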
Arrays
Primitive arrays (inline):
tags[3]: admin,ops,dev
Arrays of objects (tabular format):
items[2]{sku,qty,price}:
  A1,2,9.99
  B2,1,14.5
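The tabular layout can be sketched in a few lines: declare the length and field names once, then write one comma-separated row per object. The `encodeTabular` helper below is hypothetical and is not the library's implementation. It assumes two-space row indentation and values that contain no commas or quotes. The empty-array form it returns is also an assumption.

```typescript
type Primitive = string | number | boolean | null;
type Row = Record<string, Primitive>;

// Hypothetical helper (not part of @toon-format/toon).
function encodeTabular(key: string, rows: Row[]): string {
  const first = rows[0];
  if (first === undefined) return `${key}[0]:`; // assumed form for an empty array
  const fields = Object.keys(first);
  // The tabular layout only applies when every row has exactly the same fields.
  for (const row of rows) {
    const keys = Object.keys(row);
    if (keys.length !== fields.length || !fields.every((f) => f in row)) {
      throw new Error("rows are not uniform; tabular layout does not apply");
    }
  }
  const header = `${key}[${rows.length}]{${fields.join(",")}}:`;
  const body = rows.map((row) => "  " + fields.map((f) => String(row[f])).join(","));
  return [header, ...body].join("\n");
}

const sampleItems = [
  { sku: "A1", qty: 2, price: 9.99 },
  { sku: "B2", qty: 1, price: 14.5 },
];
console.log(encodeTabular("items", sampleItems));
```

Throwing on non-uniform rows is a simplification made here for brevity. A fuller encoder would fall back to a list format for such data.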
TOON also supports optional key folding to further reduce tokens for deeply nested single-key chains:
data.metadata.items[2]: a,b
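As a rough illustration of key folding, the sketch below walks a chain of single-key objects and joins the keys into a dotted path. The `foldKeyChain` helper is hypothetical and not taken from the library. It ignores the case where a key itself contains a dot, which a real encoder would have to handle.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

function isPlainObject(value: Json): value is { [key: string]: Json } {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Hypothetical helper: follows single-key objects downward and returns
// the dotted path together with the value found at the end of the chain.
function foldKeyChain(key: string, value: Json): { path: string; value: Json } {
  let path = key;
  let current: Json = value;
  while (isPlainObject(current)) {
    const keys = Object.keys(current);
    if (keys.length !== 1) break; // stop at the first object with several keys
    const only = keys[0] as string;
    path += "." + only;
    current = current[only] as Json;
  }
  return { path, value: current };
}

const folded = foldKeyChain("data", { metadata: { items: ["a", "b"] } });
if (Array.isArray(folded.value)) {
  console.log(`${folded.path}[${folded.value.length}]: ${folded.value.join(",")}`);
}
```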
Why Use TOON?
TOON offers significant advantages, especially when working with LLMs:
- Token-efficient: It typically uses 30-60% fewer tokens than formatted JSON on large uniform arrays, which lowers cost and leaves more of the context window free. Benchmarks show TOON achieving higher accuracy per 1K tokens across several LLMs.
- LLM-friendly guardrails: Explicit array lengths ([N]) and field declarations ({field1,field2}) provide built-in validation, helping LLMs parse and generate data more reliably.
- Minimal syntax: It removes redundant punctuation like braces, brackets, and most quotes, making the data cleaner and easier for humans and models to read.
- Tabular arrays: For uniform arrays of objects, TOON declares keys once and streams data as rows, similar to CSV, but with explicit structural information.
- Balanced approach: TOON works best on uniform arrays, but it also handles mixed and non-uniform data by switching to a list format, and it provides guidance on when other formats like compact JSON or CSV might be more suitable.
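The guardrails mentioned above can be checked mechanically. The sketch below compares a tabular block against its own header: the declared length [N] must match the number of rows, and each row must have one value per declared field. `checkTabularBlock` is a hypothetical helper, not a library function. It splits rows naively on commas, so it assumes values are unquoted and contain no commas.

```typescript
interface TabularCheck {
  ok: boolean;
  problems: string[];
}

// Hypothetical validator for a single block of the form key[N]{f1,f2,...}: plus rows.
function checkTabularBlock(block: string): TabularCheck {
  const [header, ...rest] = block.split("\n");
  const match = /^([^\[\]{}]+)\[(\d+)\]\{([^}]*)\}:$/.exec((header ?? "").trim());
  if (!match) {
    return { ok: false, problems: ["header is not of the form key[N]{fields}:"] };
  }
  const declared = Number(match[2]);
  const fields = (match[3] ?? "").split(",");
  const rows = rest.filter((line) => line.trim() !== "");
  const problems: string[] = [];
  if (rows.length !== declared) {
    problems.push(`declared ${declared} rows, found ${rows.length}`);
  }
  rows.forEach((row, i) => {
    const cells = row.trim().split(","); // naive split: no quoted values
    if (cells.length !== fields.length) {
      problems.push(`row ${i + 1}: expected ${fields.length} values, found ${cells.length}`);
    }
  });
  return { ok: problems.length === 0, problems };
}

// A truncated model response: the header promises two rows, but only one arrived.
const truncated = "items[2]{sku,qty,price}:\n  A1,2,9.99";
console.log(checkTabularBlock(truncated).problems);
```

A check like this is one way to catch truncated or malformed model output before it reaches downstream code.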
Links
Explore TOON further with these resources:
- Full Specification: TOON Specification v2.0
- Playgrounds:
- Other Implementations: Discover community and official implementations in various programming languages on the TOON GitHub repository.