To plan effectively in an unseen 3D world, embodied agents need to form and update beliefs about that world. In our recent work, @yifanyin_11 proposed 3D-Belief, a generative 3D world model that continuously updates scene memory AND 3D imagination of unseen space as an agent explores a new environment. We show improved imagination quality 🖼️, spatial reasoning 🌎, and closed-loop planning 🤖 with 3D-Belief. This work offers a new way to think about what kind of world modeling is needed for embodied reasoning and planning under partial observability, particularly in unseen environments.
See the thread below for more details 👇
What should a world model capture for embodied agents?
An agent acting under partial observability needs a belief over the 3D world: what it has seen, what may exist beyond view, and how that belief should update as it moves.
Introducing 3D-Belief, a 3D world model for embodied belief inference under partial observability. 🧵[1/9]