Physical AI Researcher @ NVIDIA

Joined May 2026
Valts Blukis retweeted
Wonderful to be back from #CVPR2026, and excited to share the release of our follow-up work: VoLo: A Physical Orchestrator for Open-Vocabulary Long-Horizon Manipulation

VoLo introduces the idea of a physical orchestrator for open-vocabulary, long-horizon manipulation. Our goal is to move toward robots that can reason, plan, act, monitor, and recover by adaptively using VLA/WAMs, vision models, and action primitives as tools.

We introduce three main contributions:
🤖 VoLoAgent: a physical orchestrator that plans, monitors, and recovers by adaptively using, halting, and redirecting robot actions with tools.
📊 RoboVoLo: a high-fidelity benchmark with 126 open-vocabulary long-horizon manipulation tasks spanning common sense, memory/state tracking, complex references, and world knowledge.
📈 A large-scale empirical study comparing action models, code-as-policy systems, TAMP-style systems, and ablations of the VoLoAgent orchestrator, complemented by real-robot experiments.

This work was done during my internship at @NVIDIA and would not have been possible without my brilliant collaborators: Hugo Hadfield, Alexander Zook, @mikacuy, @luke_ch_song, @erwincoumans, @xuningy, Faisal Ladhak, @qu_1006, @BirchfieldStan, Jonathan Tremblay, and @robovalts. Huge thanks to everyone!

🔗 Project: chicychen.github.io/VoLo/
🔗 Previous work, SpaceTools: spacetools.github.io/

#Robotics #EmbodiedAI #VisionLanguageModels #VLAModels #RobotLearning #NVIDIA #CVPR2026 #LongHorizonManipulation #AI #ComputerVision
Valts Blukis retweeted
🎉 We added 2 SOTA WAMs to the RoboLab Leaderboard 🎉

Current leaders on RoboLab-120 (specific instr.):
🥇 Cosmos3-Nano-Policy (39.7%)
🥈 π0.5 (28.1%)
🥉 DreamZero (28.1%)

→ See full results at: research.nvidia.com/labs/srl…
→ All policy clients available at: github.com/NVlabs/RoboLab/
Valts Blukis retweeted
Presenting BOP-Ask at #CVPR2026 this Saturday in Denver!

33M QA pairs. 6 tasks. 8 VLMs benchmarked against human annotators.

Most VLMs stop at perception. BOP-Ask pushes them into fine-grained interaction.

🔗 bop-ask.github.io/

#ComputerVision #Robotics #VLM #AI