Google just dropped another banger!
the figures in this paper were drawn by the system described in the paper.
PaperBanana is an agentic framework that generates publication-ready academic illustrations from methodology descriptions.
no manual design, no Figma, just your method section and a caption.
here's how it works:
five specialized agents collaborate in sequence:
> Retriever: finds relevant reference diagrams from a curated set of NeurIPS papers. matches by visual structure, not topic.
> Planner: translates your methodology text into a detailed visual description using in-context learning.
> Stylist: applies aesthetic guidelines (color palettes, typography, layout) auto-summarized from hundreds of top-tier papers.
> Visualizer + Critic loop: the Visualizer generates the image, the Critic checks it against the source text, and the Visualizer refines. repeats for 3 rounds.
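to make the sequence above concrete, here's a minimal python sketch of the control flow. every function name, signature, and default here is hypothetical, and each stage is a stub standing in for a model call. it is not the paper's code, only the order of operations and the fixed 3-round refine loop.

```python
from dataclasses import dataclass, field
from typing import List

# all names below are hypothetical; each stub stands in for a model call


@dataclass
class Draft:
    description: str                      # styled visual description of the figure
    image: str = ""                       # placeholder for the rendered figure
    feedback: List[str] = field(default_factory=list)  # one critic note per round


def retrieve(method_text: str, pool: List[str], k: int = 2) -> List[str]:
    # stub: a real retriever would rank references by visual structure
    return pool[:k]


def plan(method_text: str, caption: str, refs: List[str]) -> str:
    # stub: a real planner would prompt a model with refs as in-context examples
    return f"diagram for '{caption}' based on {len(refs)} reference(s): {method_text}"


def style(description: str, guidelines: str) -> str:
    # stub: appends the aesthetic guidelines to the description
    return f"{description} [style: {guidelines}]"


def visualize(description: str) -> str:
    # stub: a real visualizer would call an image model
    return f"<image of: {description}>"


def critique(image: str, method_text: str) -> str:
    # stub critic: empty string means no complaints
    return "" if method_text in image else "image drifted from source text"


def run_pipeline(method_text: str, caption: str, pool: List[str],
                 guidelines: str = "muted palette, sans-serif",
                 rounds: int = 3) -> Draft:
    refs = retrieve(method_text, pool)
    draft = Draft(description=style(plan(method_text, caption, refs), guidelines))
    for _ in range(rounds):               # fixed number of refine rounds
        draft.image = visualize(draft.description)
        note = critique(draft.image, method_text)
        draft.feedback.append(note)
        if note:                          # fold the critique into the next attempt
            draft.description += f" (fix: {note})"
    return draft
```

calling `run_pipeline("encoder feeds decoder", "Overview", ["ref_a", "ref_b", "ref_c"])` returns a `Draft` with one feedback entry per round. swapping any stub for a real model call leaves the loop unchanged.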
one surprising finding: randomly selected examples work nearly as well as semantically matched ones. what matters is showing the model what good diagrams look like, not finding the topically perfect reference.
in blind evaluations, humans preferred PaperBanana outputs nearly 3 out of 4 times.
it also extends to statistical plots using code-based generation for numerical precision.
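why code helps with precision: when a plot is rendered by code, every bar height is computed from the data rather than drawn by an image model. a tiny sketch of that idea, assuming positive values and using a hand-written SVG writer (a stand-in chosen for illustration, not whatever plotting code the system actually emits; `bar_chart_svg` and its parameters are hypothetical):

```python
from typing import Dict


def bar_chart_svg(values: Dict[str, float], height: int = 100,
                  bar_width: int = 30, gap: int = 10) -> str:
    """Render a bar chart as an SVG string; bar heights are exact ratios of the data."""
    peak = max(values.values())
    if peak <= 0:
        raise ValueError("this sketch assumes at least one positive value")
    parts = []
    for i, (label, v) in enumerate(values.items()):
        h = round(height * v / peak, 2)   # bar height is proportional to the value
        x = gap + i * (bar_width + gap)   # bars laid out left to right
        parts.append(
            f'<rect x="{x}" y="{height - h}" width="{bar_width}" height="{h}">'
            f"<title>{label}: {v}</title></rect>"
        )
    width = gap + len(values) * (bar_width + gap)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
        + "".join(parts)
        + "</svg>"
    )
```

with `bar_chart_svg({"a": 1.0, "b": 2.0})`, the bar for `a` is exactly half the height of the bar for `b`, which is the kind of guarantee pixel-level generation can't give.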
link in the next tweet.