Stripping code formatting cuts LLM token cost without hurting accuracy.
Average input tokens drop by 24.5%, with output quality basically unchanged.
The core issue is simple: indentation, spaces, and newlines help humans read, but they inflate the tokens that models pay to process.
The authors remove only cosmetic formatting while keeping program meaning identical, which they check by matching the abstract syntax tree (AST) of the code before and after.
They test Fill-in-the-Middle (FIM) code completion, where a model fills in a missing block, across Java, C++, C#, and Python.
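The AST check can be illustrated with a minimal sketch in Python's standard `ast` module. This is not the paper's tooling; the helper name `same_ast` and the sample snippets are hypothetical, chosen only to show the idea that two differently formatted sources count as equivalent when they parse to the same tree.

```python
import ast


def same_ast(a: str, b: str) -> bool:
    """Return True if two Python sources parse to the same syntax tree.

    ast.dump() omits line and column numbers by default, so pure
    layout differences (spacing, blank lines, indent width) vanish.
    """
    return ast.dump(ast.parse(a)) == ast.dump(ast.parse(b))


# Hypothetical sample: the same function, formatted and compacted.
formatted = """
def add(a, b):
    total = a + b

    return total
"""
compact = "def add(a,b):\n total=a+b\n return total\n"

# A real semantic change (a + b became a - b) for contrast.
changed = "def add(a,b):\n total=a-b\n return total\n"

print(same_ast(formatted, compact))  # True: only layout differs
print(same_ast(formatted, changed))  # False: the operator changed
```

Note that the compact version still indents the function body by one space. That is the reason Python offers less to strip than brace-delimited languages: some of its whitespace is syntax.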
Performance stays stable on unformatted input: big models barely move, and smaller ones wobble a bit. Python sees smaller savings because its layout is part of the language.
One surprise: models still print nicely formatted code even when given smashed input, so output token savings are small.
To fix that, two cheap tactics work: explicit prompts that ask for output without formatting, and light fine-tuning on unformatted samples.
With clear instructions or light fine-tuning, output length shrinks by 25% to 36% while the first-try pass rate (Pass@1) holds.
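The prompting tactic might look roughly like the sketch below. The instruction wording, the constant `NO_FORMAT_INSTRUCTION`, and the helper `build_prompt` are all assumptions made for illustration; the paper's actual prompts may differ.

```python
# Hypothetical instruction text; not taken from the paper.
NO_FORMAT_INSTRUCTION = (
    "Return only code. Do not add indentation, blank lines, or spaces "
    "beyond the minimum that the language syntax requires."
)


def build_prompt(task: str, code: str) -> str:
    """Combine a task description, the no-formatting instruction,
    and the (already stripped) code context into one prompt string."""
    return f"{task}\n\n{NO_FORMAT_INSTRUCTION}\n\n{code}"


prompt = build_prompt(
    "Complete the missing block in the following code.",
    "def add(a,b):\n <MISSING>\n return total",  # placeholder marker, illustrative only
)
print(prompt)
```

The design point is that the instruction is a fixed, short string, so its own token cost is small compared with the output tokens it can save.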
They also ship a tool that strips formatting before inference, then restores it afterward, so humans read clean code while the model pays less.
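A strip-then-restore round trip can be sketched for Python source with the standard library alone. This is an illustrative stand-in, not the authors' tool: the function names are hypothetical, it requires Python 3.9+ for `ast.unparse`, and it drops comments because the AST does not keep them.

```python
import ast


def strip_formatting(src: str) -> str:
    """Shrink Python source: no blank lines, one space per indent level.

    Assumption: comments may be discarded. If the compact form does not
    parse to the same AST (for example, a multi-line docstring whose
    content was touched), the original source is returned unchanged.
    """
    normalized = ast.unparse(ast.parse(src))  # canonical 4-space layout
    lines = []
    for line in normalized.splitlines():
        if not line.strip():
            continue  # drop blank lines
        body = line.lstrip(" ")
        depth = (len(line) - len(body)) // 4
        lines.append(" " * depth + body)
    compact = "\n".join(lines)
    if ast.dump(ast.parse(compact)) != ast.dump(ast.parse(src)):
        return src  # safety fallback: never change program meaning
    return compact


def restore_formatting(src: str) -> str:
    """Re-expand compact source into conventional 4-space layout."""
    return ast.unparse(ast.parse(src))


original = (
    "def add(a, b):\n"
    "    total = a + b\n"
    "\n"
    "    if total > 0:\n"
    "        return total\n"
    "    return 0\n"
)
compact = strip_formatting(original)    # what the model would receive
restored = restore_formatting(compact)  # what the human would read
print(len(original), len(compact))
```

In a real pipeline the stripped text would be sent to the model and the model's reply would pass through the restore step; the AST comparison inside `strip_formatting` mirrors the equivalence check described earlier.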
----
Paper – arxiv.org/abs/2508.13666
Paper Title: "The Hidden Cost of Readability: How Code Formatting Silently Consumes Your LLM Budget"