Imaginative Perception Tokens
UW, OpenAI, Microsoft, and AI2 teach vision-language models (VLMs) to imagine unseen visual perspectives.
These tokens improve spatial reasoning relative to text chain-of-thought on perspective-taking, path-tracing, and multiview-counting tasks.
No images are generated at inference time.