Insight
November 17, 2025
Token-Oriented Object Notation just launched and it’s already changing how developers send data to AI models. Created by Johann Schopplich and released in October 2025, TOON achieves 30–60% token reduction compared to JSON while actually improving LLM comprehension accuracy. For teams spending thousands monthly on OpenAI or Anthropic APIs, this isn’t just elegant engineering — it’s money back in the budget.
TOON addresses a problem every LLM developer faces: JSON is verbose, and when you’re charged per token, that verbosity costs real money. The format combines YAML’s clean indentation with CSV’s efficient tabular layouts, stripping away the braces, brackets, and redundant quotes that make JSON so token-hungry. In benchmark tests across four major LLMs, TOON used 39.6% fewer tokens than JSON while achieving 73.9% accuracy versus JSON’s 69.7%. You save money and get better results.
The timing couldn’t be better. As RAG systems, AI agents, and multi-agent frameworks become production-critical, token efficiency has shifted from nice-to-have to competitive necessity. TOON has already gained 16,495+ GitHub stars and implementations in over 15 programming languages within its first month. This is what developer-driven innovation looks like when it solves a real, expensive problem.
What makes TOON different from JSON
TOON is a data serialization format — not a framework or model — specifically engineered as a translation layer between your application code and LLM prompts. You keep using JSON everywhere in your stack: APIs, databases, configuration files. But when sending structured data to an LLM, you convert to TOON first.
The format’s genius lies in recognizing that uniform arrays of objects are the sweet spot for optimization. Consider a typical JSON payload with 100 employee records. JSON repeats field names 100 times. TOON declares the schema once, then provides just the values in a clean tabular format. This single insight drives the dramatic token savings.
Here’s the same data in both formats: the JSON version weighs in at 84 tokens, the TOON version at just 32.
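As a sketch of what that payload looks like side by side (the exact records are illustrative, with fields matching the {id,name,role} schema discussed next):

```python
import json

# The same two records, serialized both ways.
employees = [
    {"id": 1, "name": "Alice", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "user"},
]

# JSON: braces, quotes, and the three keys repeated for every record.
as_json = json.dumps({"employees": employees}, indent=2)

# TOON: schema declared once in the header, then bare value rows.
as_toon = (
    "employees[2]{id,name,role}:\n"
    "  1,Alice,admin\n"
    "  2,Bob,user"
)

# Character counts roughly track the token gap.
print(len(as_json), len(as_toon))
```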
The 61.9% token reduction comes from eliminating braces, quotes, and repeated keys while adding structural metadata that actually helps LLMs understand the data better. The [2] explicitly tells the model to expect two items, and {id,name,role} declares the schema upfront. This reduces parsing errors compared to JSON where models must infer structure as they go.
Three design principles driving token efficiency
TOON synthesizes three proven serialization paradigms into something new. First, YAML-style indentation for nested objects eliminates the need for braces and keeps hierarchy visually clear. Second, CSV-style tabular layout for uniform arrays means schema definition happens once, not repeatedly. Third, minimal punctuation strips away every character that doesn’t carry meaning.
The format implements smart quoting rules that only add quotes when truly necessary — when strings contain delimiters, colons, or special characters. Most string values in structured data don’t need quotes at all. TOON also supports multiple delimiters (comma, tab, pipe) with explicit encoding in array headers, so the format itself documents which delimiter is active.
Key technical features include explicit array length markers [N] that tell LLMs exactly how many items to expect, reducing generation errors when models create structured output. Field declarations {field1,field2} provide schema validation. A strict validation mode checks array lengths, field counts, and structural integrity. The v1.5 specification added key folding, which collapses single-key wrapper chains into dotted paths like data.metadata.items[2] for additional compression.
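Key folding is easiest to see with a small sketch (the field names here are illustrative, not from the spec):

```python
import json

# A single-key wrapper chain: data -> metadata -> items.
payload = {"data": {"metadata": {"items": [
    {"sku": "A1", "qty": 2},
    {"sku": "B7", "qty": 1},
]}}}

# With v1.5 key folding, the chain collapses into one dotted path,
# and the [2]{sku,qty} header carries the length and schema.
folded = (
    "data.metadata.items[2]{sku,qty}:\n"
    "  A1,2\n"
    "  B7,1"
)

print(len(json.dumps(payload)), len(folded))
```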
Performance benchmarks prove the concept
The official benchmarks tested 209 questions across four LLM models — GPT-5-nano, Gemini-2.5-flash, Claude-Haiku-4.5, and Grok-4-fast. TOON consistently outperformed all alternatives:
TOON: 2,744 tokens, 73.9% accuracy, 26.9 score
JSON compact: 3,081 tokens, 70.7% accuracy, 22.9 score
YAML: 3,719 tokens, 69.0% accuracy, 18.6 score
JSON: 4,545 tokens, 69.7% accuracy, 15.3 score
XML: 5,167 tokens, 67.1% accuracy, 13.0 score
Dataset-specific results are even more dramatic. For 100 employee records (uniform tabular data), TOON achieved 60.7% token reduction versus JSON. Time-series analytics data saved 59.0%. Even nested e-commerce orders with only 33% tabular structure still saved 33.1%. The sweet spot is data where tabular eligibility exceeds 60% — uniform arrays of objects with consistent field structures.
One production deployment reported converting a 1,344-token JSON prompt to a 589-token TOON prompt — a 56% reduction that also cut response time by 5 seconds. For a system making thousands of daily API calls, this translates to measurable cost savings and user experience improvements. Teams spending $1,000 monthly on LLM APIs could save $300–600 just by changing data format.
Where TOON excels and where it doesn’t
TOON shines brightest in specific scenarios. RAG systems passing retrieved documents to LLMs can cut token usage nearly in half — crucial when context windows are constrained. AI agent frameworks benefit from efficient inter-agent communication where state and data pass between multiple models. Analytics and database queries sent to LLMs for analysis are perfect candidates since they’re inherently tabular. Prompt engineering with large example datasets becomes more feasible when each example consumes fewer tokens.
The format is production-ready for any application making high-volume LLM API calls where token costs are significant. If you’re spending $500+ monthly on OpenAI, Claude, or similar services, TOON likely pays for itself immediately. If you regularly pass arrays of 100+ records to models, the savings compound dramatically.
However, TOON isn’t a universal JSON replacement. Deeply nested, irregular structures with low tabular eligibility often don’t save meaningful tokens — compact JSON may be equally efficient. One-off LLM calls don’t justify the setup overhead. General-purpose APIs and data storage should absolutely continue using JSON, which remains the industry standard with vastly larger ecosystem support. TOON is a specialized optimization for the LLM boundary specifically.
Pure flat tables (simple CSV data) actually benefit minimally since CSV itself is already maximally compact. TOON’s value emerges when you need the structure and nesting capability JSON provides but want better token efficiency. The recommended pattern is: use JSON everywhere in your application layer, convert to TOON only when sending to LLMs, let models generate JSON output (still best supported), then work with JSON in your application.
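That round trip can be sketched in a few lines (the data, the hand-written TOON string, and the model reply are all hypothetical):

```python
import json

# Application layer: plain JSON, as usual.
readings = [{"city": "Oslo", "temp": 4}, {"city": "Rome", "temp": 18}]

# LLM boundary: the same data as TOON, written by hand for this sketch
# (a TOON library would normally do the conversion).
toon_payload = "readings[2]{city,temp}:\n  Oslo,4\n  Rome,18"
prompt = toon_payload + "\n\nReturn the warmest city as a JSON object."

# Hypothetical model reply: JSON output remains the best-supported path,
# so the response parses straight back into the application layer.
model_reply = '{"city": "Rome", "temp": 18}'
warmest = json.loads(model_reply)
print(warmest["city"])  # Rome
```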
A thriving ecosystem emerged in weeks
Johann Schopplich launched TOON with a viral tweet on October 26, 2025 that garnered 335,500+ views. Within three weeks, the community delivered implementations in over 15 programming languages. The main TypeScript/JavaScript implementation has 16,495+ GitHub stars.
Official implementations from the toon-format organization cover JavaScript/TypeScript, Python, Rust, .NET/C#, Go, and Dart. Community developers contributed Ruby, PHP, Swift, Java, Kotlin, Elixir, Scala, OCaml, Clojure, Crystal, C++, Gleam, Lua, and R implementations. An R package even reached CRAN on November 10, 2025. The rapid proliferation signals strong developer interest in token optimization.
The CLI tool (@toon-format/cli) launched November 1, enabling quick JSON ↔ TOON conversion from the terminal with token savings statistics. Online converters and playgrounds emerged for experimentation. The specification reached v2.0 on November 10, 2025, following semantic versioning with clear compatibility guarantees and a comprehensive conformance test suite.
Integration patterns are straightforward. In Python with OpenAI:
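A minimal sketch of the pattern follows. The hand-rolled helper below covers only the flat tabular case (the official libraries also handle nesting, quoting, delimiters, and validation), and the API call is shown but not executed:

```python
def to_toon(key, rows):
    # Minimal TOON encoder for a flat, uniform array of objects.
    fields = list(rows[0])
    header = f"{key}[{len(rows)}]{{{','.join(fields)}}}:"
    body = ["  " + ",".join(str(r[f]) for f in fields) for r in rows]
    return "\n".join([header] + body)

tasks = [
    {"id": 101, "title": "Deploy", "status": "done"},
    {"id": 102, "title": "Review", "status": "open"},
]

prompt = "How many tasks are open?\n\n" + to_toon("tasks", tasks)

# The TOON payload is plain text, so it drops into any chat API, e.g.:
# from openai import OpenAI
# client = OpenAI()
# reply = client.chat.completions.create(
#     model="gpt-4o-mini",
#     messages=[{"role": "user", "content": prompt}],
# )
print(prompt)
```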
TOON works with all major LLM providers since it’s text-based — OpenAI, Anthropic, Google, xAI, and open-source models process it identically. The format integrates smoothly with RAG frameworks like LangChain, LlamaIndex, and Haystack. No model retraining or special configuration required.
Media coverage and community sentiment
The format attracted immediate attention from major technical publications. freeCodeCamp published “What is TOON? How Token-Oriented Object Notation Could Change How AI Sees Data” on November 13, 2025. Medium hosted 10+ in-depth articles from AI practitioners. VirtusLab explored “Cutting LLM costs in the protocol layer.” DEV Community contributors produced tutorials and implementation guides.
Developer sentiment is pragmatically positive. The token savings and cost reduction resonate strongly with teams managing LLM infrastructure budgets. Real-world use cases emerged quickly in RAG systems, multi-agent frameworks, and prompt optimization workflows. Notably, the community recognizes TOON as a specialized tool rather than a JSON replacement — this measured perspective suggests sustainable adoption rather than hype-driven churn.
Scalevise reported production deployment with 50%+ token reduction across thousands of daily API requests to ChatGPT and Claude, plus 15% latency improvements. The format’s human readability also simplified debugging compared to reading dense JSON dumps. These early success stories validate the theoretical benchmarks with production metrics.
The future of token-aware engineering
TOON represents a fundamental insight: as AI systems consume more structured data, optimization at the data format layer delivers measurable benefits without changing models or architectures. This is protocol-layer innovation that works with existing infrastructure.
The format could follow a trajectory similar to JSON’s rise for web APIs. As token costs remain a dominant factor in LLM economics, efficiency formats may become standard practice. AI agent communication protocols, RAG system architectures, and multi-agent frameworks might adopt TOON as default serialization. The rapid ecosystem growth suggests developers recognize this potential.
Technical evolution continues with active specification development, community RFC processes, and additional features like advanced delimiter options and further compression techniques. The MIT license and open governance model support sustainable growth. Cross-language conformance testing ensures implementations remain compatible as the specification evolves.
Market pressures favor adoption. Token-based pricing models from OpenAI, Anthropic, and others create direct financial incentives for efficiency. Context window limitations in current models make compact formats more valuable. As LLM applications scale to millions of interactions, token savings compound dramatically.
Conclusion
TOON emerged to solve a specific, expensive problem: JSON wastes tokens when sending structured data to LLMs. With proven 30–60% token reductions, production-ready implementations across major languages, and measurable cost savings, TOON has established itself as a valuable optimization tool for AI developers.
The format excels in scenarios with uniform tabular data, high-volume LLM operations, and token-budget constraints. It’s not a universal JSON replacement — nor does it try to be. Instead, TOON occupies a clear niche as a translation layer between applications and AI models, optimizing the specific boundary where token efficiency matters most.
For teams spending significant amounts on LLM APIs or hitting context window limits, TOON offers straightforward ROI without architectural changes. As the AI industry matures and token efficiency becomes increasingly critical, innovations like TOON demonstrate that sometimes the best optimization is intelligently removing what you don’t need. The question isn’t whether to explore TOON — it’s whether you can afford not to, when competitors might be operating at 40% lower LLM costs.
TOON is production-ready now. Specification v2.0, comprehensive documentation, growing ecosystem, and proven results. The tools exist, the community is engaged, and the savings are measurable. Token-aware engineering has arrived.