TOON: Compact, Human-Readable JSON for LLM Prompts

Introduction

TOON (Token-Oriented Object Notation) is a data format that encodes JSON data compactly and readably for Large Language Model (LLM) prompts. It aims to reduce token cost and make structured input more reliable for models by cutting verbose syntax while keeping the structure explicit. Like JSON, TOON represents objects, arrays, and primitives, but it combines YAML-style indentation for nested objects with a CSV-style tabular layout for uniform arrays, which makes it most efficient for uniform structured data.

Installation

Getting started with TOON is straightforward, whether you prefer a command-line interface or a TypeScript library.

CLI (No Installation Required)

You can instantly try TOON using npx:

# Convert JSON to TOON
npx @toon-format/cli input.json -o output.toon

# Pipe from stdin
echo '{"name": "Ada", "role": "dev"}' | npx @toon-format/cli

TypeScript Library

For programmatic use, install the TypeScript SDK:

# npm
npm install @toon-format/toon

# pnpm
pnpm add @toon-format/toon

# yarn
yarn add @toon-format/toon

Example usage:

import { encode } from '@toon-format/toon'

const data = {
  users: [
    { id: 1, name: 'Alice', role: 'admin' },
    { id: 2, name: 'Bob', role: 'user' }
  ]
}

console.log(encode(data))
// users[2]{id,name,role}:
//   1,Alice,admin
//   2,Bob,user

Examples

TOON's syntax is designed for clarity and compactness. Here are some common data structures:

Objects

Simple objects with primitive values:

id: 123
name: Ada
active: true

Nested objects:

user:
  id: 123
  name: Ada
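
To make the indentation rule concrete, here is a minimal sketch of how primitives and nested objects could be written out in this layout. The function `encodeObjectSketch` is a hypothetical helper written for this illustration, not part of `@toon-format/toon`; it assumes two-space indentation and ignores arrays and any quoting rules.

```typescript
// Hypothetical helper for illustration only; not the library's encoder.
// Covers primitives and nested plain objects, with no arrays or quoting.
type Primitive = string | number | boolean | null
interface Nested { [key: string]: Primitive | Nested }

function encodeObjectSketch(obj: Nested, depth = 0): string {
  const indent = '  '.repeat(depth) // assumed: two spaces per nesting level
  const lines: string[] = []
  for (const [key, value] of Object.entries(obj)) {
    if (value !== null && typeof value === 'object') {
      // Nested object: emit the key alone, then its fields one level deeper.
      lines.push(`${indent}${key}:`)
      const body = encodeObjectSketch(value, depth + 1)
      if (body) lines.push(body)
    } else {
      // Primitive: emit "key: value" on a single line.
      lines.push(`${indent}${key}: ${String(value)}`)
    }
  }
  return lines.join('\n')
}

const nestedExample = encodeObjectSketch({ user: { id: 123, name: 'Ada' }, active: true })
console.log(nestedExample)
```

For real data, the library's `encode` function shown earlier is the right tool; the sketch only shows where each line and indent comes from.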

Arrays

Primitive arrays (inline):

tags[3]: admin,ops,dev

Arrays of objects (tabular format):

items[2]{sku,qty,price}:
  A1,2,9.99
  B2,1,14.5
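
The tabular layout above can be sketched in a few lines: take the field names from the first row, check that every row has the same fields, then emit one header and one comma-separated line per row. `encodeTabularSketch` is a hypothetical name used for this illustration, and the sketch deliberately skips quoting, escaping, and empty arrays, which a real encoder has to handle.

```typescript
// Hypothetical helper for illustration only; not the library's encoder.
type Cell = string | number | boolean | null
type Row = Record<string, Cell>

function encodeTabularSketch(key: string, rows: Row[]): string {
  if (rows.length === 0) throw new Error('this sketch does not cover empty arrays')
  const fields = Object.keys(rows[0])
  // The tabular layout only applies when every row has exactly the same keys.
  const uniform = rows.every(row => {
    const keys = Object.keys(row)
    return keys.length === fields.length && fields.every(f => f in row)
  })
  if (!uniform) throw new Error('rows are not uniform; tabular layout does not apply')
  // Header declares the row count and the field names once.
  const header = `${key}[${rows.length}]{${fields.join(',')}}:`
  // Each row lists its values in header order, indented by two spaces.
  const body = rows.map(row => '  ' + fields.map(f => String(row[f])).join(','))
  return [header, ...body].join('\n')
}

const tabularExample = encodeTabularSketch('items', [
  { sku: 'A1', qty: 2, price: 9.99 },
  { sku: 'B2', qty: 1, price: 14.5 },
])
console.log(tabularExample)
```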

TOON also supports optional key folding, which collapses a chain of nested single-key objects into one dotted path to save further tokens:

data.metadata.items[2]: a,b
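
As a rough sketch of the idea, the helper below walks down an object for as long as each level has exactly one key, collecting those keys into a path. `foldKeyPath` is a hypothetical function written for this illustration; it does not reflect how the library enables or restricts folding (for example, which key names are safe to fold).

```typescript
// Hypothetical helper for illustration only; the library's folding rules may differ.
function foldKeyPath(value: unknown): { path: string[]; leaf: unknown } {
  const path: string[] = []
  let current = value
  // Descend while the current value is a plain object with exactly one key.
  while (
    current !== null &&
    typeof current === 'object' &&
    !Array.isArray(current) &&
    Object.keys(current).length === 1
  ) {
    const [key] = Object.keys(current)
    path.push(key)
    current = (current as Record<string, unknown>)[key]
  }
  return { path, leaf: current }
}

const folded = foldKeyPath({ data: { metadata: { items: ['a', 'b'] } } })
const foldedItems = folded.leaf as string[]
// Join the collected keys with dots and append the inline primitive array.
const foldedLine = `${folded.path.join('.')}[${foldedItems.length}]: ${foldedItems.join(',')}`
console.log(foldedLine)
```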

Why Use TOON?

TOON offers significant advantages, especially when working with LLMs:

Token-efficient: On large uniform arrays, TOON typically uses 30-60% fewer tokens than formatted JSON, which lowers cost and leaves more of the context window free. Benchmarks show TOON achieving higher accuracy per 1K tokens across various LLMs.
LLM-friendly guardrails: Explicit array lengths ([N]) and field declarations ({field1,field2}) provide built-in validation, helping LLMs parse and generate data more reliably.
Minimal syntax: It removes redundant punctuation like braces, brackets, and most quotes, making the data cleaner and easier for humans and models to read.
Tabular arrays: For uniform arrays of objects, TOON declares keys once and streams data as rows, similar to CSV, but with explicit structural information.
Balanced approach: TOON works best on uniform arrays. For mixed and non-uniform data it switches to a list format, and the project provides guidance on when other formats such as compact JSON or CSV may be more suitable.
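
The guardrails described above can be checked mechanically. The sketch below validates a single tabular block by comparing the declared length `[N]` with the number of rows and the declared fields with the number of values per row. `checkTabularBlock` is a hypothetical helper for this illustration, not a library function, and it splits naively on commas, so it assumes no value contains a quoted comma.

```typescript
// Hypothetical validator for illustration only; not part of @toon-format/toon.
interface TabularCheck { ok: boolean; problems: string[] }

function checkTabularBlock(block: string): TabularCheck {
  const lines = block.split('\n').filter(line => line.trim() !== '')
  if (lines.length === 0) return { ok: false, problems: ['block is empty'] }
  // Header shape: key[N]{field1,field2,...}:
  const match = /^(\w+)\[(\d+)\]\{([^}]*)\}:$/.exec(lines[0].trim())
  if (!match) return { ok: false, problems: ['header is not of the form key[N]{fields}:'] }
  const declaredCount = Number(match[2])
  const fields = match[3].split(',')
  const rows = lines.slice(1)
  const problems: string[] = []
  // Guardrail 1: the declared length must equal the number of rows.
  if (rows.length !== declaredCount) {
    problems.push(`declared ${declaredCount} rows, found ${rows.length}`)
  }
  // Guardrail 2: every row must have one value per declared field.
  rows.forEach((row, i) => {
    const cells = row.trim().split(',') // naive split: assumes no quoted commas
    if (cells.length !== fields.length) {
      problems.push(`row ${i + 1} has ${cells.length} values, expected ${fields.length}`)
    }
  })
  return { ok: problems.length === 0, problems }
}

const completeBlock = checkTabularBlock('items[2]{sku,qty,price}:\n  A1,2,9.99\n  B2,1,14.5')
const truncatedBlock = checkTabularBlock('items[2]{sku,qty,price}:\n  A1,2,9.99')
console.log(completeBlock.ok, truncatedBlock.problems)
```

A check like this is one way to catch truncated or malformed model output before it is parsed further.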
