Senior Research Scientist @ NVIDIA | PhD in Robotics @ CMU

Joined March 2009
Pinned Tweet
When every generalist robot model scores 95% on a benchmark, the numbers become meaningless. What if we built a photorealistic benchmark that never saturates and can generate new scenes and tasks with AI Workflows in minutes? We introduce RoboLab! 🧵(1/6)
10
27
149
27,781
A really good way to address long-horizon manipulation problems is to combine a variety of robotics tools and capabilities into a single system, as VoLo does. We call this a physical orchestrator. We've come a long way since code-as-policies.
Wonderful to be back from #CVPR2026, and excited to share the release of our follow-up work:
VoLo: A Physical Orchestrator for Open-Vocabulary Long-Horizon Manipulation
VoLo introduces the idea of a physical orchestrator for open-vocabulary, long-horizon manipulation. Our goal is to move toward robots that can reason, plan, act, monitor, and recover by adaptively using VLA/WAMs, vision models, and action primitives as tools.
We introduce three main contributions:
🤖 VoLoAgent: a physical orchestrator that plans, monitors, and recovers by adaptively using, halting, and redirecting robot actions with tools.
📊 RoboVoLo: a high-fidelity benchmark with 126 open-vocabulary long-horizon manipulation tasks spanning common sense, memory/state tracking, complex references, and world knowledge.
📈 A large-scale empirical study comparing action models, code-as-policy systems, TAMP-style systems, and ablations of the VoLoAgent orchestrator, complemented by real-robot experiments.
This work was done during my internship at @NVIDIA and would not have been possible without my brilliant collaborators: Hugo Hadfield, Alexander Zook, @mikacuy, @luke_ch_song, @erwincoumans, @xuningy, Faisal Ladhak, @qu_1006, @BirchfieldStan, Jonathan Tremblay, and @robovalts. Huge thanks to everyone!
🔗 Project: chicychen.github.io/VoLo/
🔗 Previous work, SpaceTools: spacetools.github.io/
#Robotics #EmbodiedAI #VisionLanguageModels #VLAModels #RobotLearning #NVIDIA #CVPR2026 #LongHorizonManipulation #AI #ComputerVision
2
17
1,830
Xuning Yang retweeted
Cosmos3 (post-trained on DROID) surpassed strong VLA & WAM baselines to rank #1 on RoboLab. All the compute FLOPs invested during the massive Cosmos3 pre-training and mid-training contribute to unlocking a better robot foundation model. 😄
🎉 We added 2 SOTA WAMs to the RoboLab Leaderboard 🎉
Current leaders on RoboLab-120 (specific instr.):
🥇 Cosmos3-Nano-Policy (39.7%)
🥈 π0.5 (28.1%)
🥉 DreamZero (28.1%)
→ See full results at: research.nvidia.com/labs/srl…
→ All policy clients available at: github.com/NVlabs/RoboLab/
9
58
6,165
Xuning Yang retweeted
a new form of greeting has dropped: see you at icml🇰🇷 1/1 accepted to ICML. more details soon
3
2
25
1,862
Xuning Yang retweeted
We are still far from zero-shot policy deployment on new tasks
Evaluation is a critical bottleneck in building robot foundation models. Check out our latest work RoboLab, led by @xuningy, which addresses this exact challenge. It's a high-fidelity simulation environment for testing these models. A truly generalist policy should be able to complete these tasks zero-shot, and this benchmark highlights exactly how far we still have to go. More info 👇
4
10
55
9,328
Xuning Yang retweeted
Generalist robot policies need a benchmark that works across any robot and any policy. 🦾 Introducing RoboLab, a high‑fidelity simulation benchmark built on NVIDIA Isaac and Omniverse to evaluate generalist robot policies in diverse, photoreal, physics‑based environments. Coming soon to the NVIDIA Isaac Lab‑Arena roadmap for large‑scale, robotic policy evaluation. 📖 nvda.ws/47RbOgX #NationalRoboticsWeek
8
39
260
23,906
→ Customization: Comes with 200 objects, 100 backgrounds, lighting, camera poses… Don’t like it? No problem, add your own.
→ Diagnostics: motion quality, failure events, sensitivity analysis for failure attribution
(5/6)
1
7
951
RoboLab comes with RoboLab-120, a curated, diverse benchmark of 120 tasks to get started. Set up and run in <20 min. (6/6)
Try it out 👇
🌐 research.nvidia.com/labs/srl…
📄 arxiv.org/abs/2604.09860
💻 github.com/NVLabs/RoboLab
3
21
1,524