Prof @Stanford, Distinguished Research Scientist and AV research lead @nvidia. PhD from @MITAeroAstro. Robotics, autonomous systems, AI. Opinions are my own.

Joined November 2018
How much time should robots spend thinking? Vision-Language Models are increasingly used as high-level planners for robots, and the prevailing strategy has been to scale test-time compute to boost capability. But more reasoning steps, bigger models, and longer memory all come with increased latency, tokens, and FLOPs—often with diminishing and uneven returns. So when, and where, is test-time compute actually worth its cost? 🧐 We study three dominant scaling axes and find that each unlocks a distinct capability, showing that test-time compute is not a uniform lever: - Chain-of-thought depth helps with tasks involving implicit semantic, physical, or spatial constraints, but its additional latency is not always necessary (on VLABench, a non-CoT model matches a CoT model on 44% of tasks). - Model size governs the breadth of skills a planner can reliably draw upon, but its benefits appear only when those additional skills are actually required. - Memory history improves performance on long-horizon, history-dependent tasks, but can actively hurt performance elsewhere. Across all three axes, a consistent pattern emerges: the gap between cheap and expensive configurations is large, but highly non-uniform and task-dependent. DIRECT (Dynamic Inference Router for Embodied Compute Tradeoffs) is a lightweight router that reads scene instruction context and sends each task to the cheapest planner that can still solve it, allocating compute per task rather than committing to one fixed model. 👉 Takeaway: smart allocation of test-time compute can recover frontier-level planning at a fraction of the cost. 📄 Paper: arxiv.org/abs/2606.12402 🔗 Website: jadee-dao.github.io/direct/ Work led by @_jadelynn @milanganai With an outstanding team of collaborators: @ajaysridhar0 @Mozhgan_nasr @katielulula Clark Barrett @jiajunwu_cs @chelseabfinn #Robotics #VLM #EmbodiedAI #MachineLearning #TestTimeCompute
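The routing idea described above (send each task to the cheapest planner that can still solve it) can be sketched as follows. This is a minimal illustration, not the DIRECT implementation: the `Planner` fields, the cost numbers, the 0.8 threshold, and the hand-written `toy_predict_success` rule are all assumptions invented for this example. A real router would predict success from scene and instruction context.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Sequence


@dataclass(frozen=True)
class Planner:
    """A hypothetical planner configuration (names and costs are illustrative)."""
    name: str
    cost: float        # relative inference cost, arbitrary units
    uses_cot: bool     # chain-of-thought reasoning enabled
    uses_memory: bool  # conditions on interaction history


Task = Mapping[str, bool]
SuccessModel = Callable[[Task, Planner], float]


def route(task: Task, planners: Sequence[Planner],
          predict_success: SuccessModel, threshold: float = 0.8) -> Planner:
    """Return the cheapest planner whose predicted success meets the threshold."""
    for planner in sorted(planners, key=lambda p: p.cost):
        if predict_success(task, planner) >= threshold:
            return planner
    # No planner clears the bar: fall back to the one most likely to succeed.
    return max(planners, key=lambda p: predict_success(task, p))


def toy_predict_success(task: Task, planner: Planner) -> float:
    """Stand-in for a learned success predictor (assumed, hand-written rule)."""
    p = 0.9
    if task.get("implicit_constraints", False) and not planner.uses_cot:
        p -= 0.5  # implicit constraints are assumed to need reasoning
    if task.get("long_horizon", False) and not planner.uses_memory:
        p -= 0.5  # history-dependent tasks are assumed to need memory
    return p


PLANNERS = [
    Planner("small", 1.0, uses_cot=False, uses_memory=False),
    Planner("small-cot", 3.0, uses_cot=True, uses_memory=False),
    Planner("large-cot-memory", 10.0, uses_cot=True, uses_memory=True),
]

chosen = route({"implicit_constraints": True}, PLANNERS, toy_predict_success)
print(chosen.name)
```

Sorting by cost and stopping at the first planner that clears the threshold is what makes the allocation per-task: simple tasks never pay for chain-of-thought or memory they do not need.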
Marco Pavone retweeted
The Alpamayo Summit at CVPR brought AV researchers and industry leaders together under one roof. Hear from Marco Pavone (@drmapavone), Senior Director of Autonomous Vehicle Research, and other NVIDIA experts on how Alpamayo is accelerating AV development. 📺 Watch the on-demand replay: nvidia.com/en-us/on-demand/p…
I look forward to participating in the Verification Summit (verificationsummit.ai/) and sharing my perspective on Physical AI safety. I strongly agree that verification and validation are key frontiers for unlocking Physical AI in high-stakes, high-reliability applications, from autonomous cars to industrial robotics! @fv_summit @khoslaventures @PramaanaLabs @boldcapfund
I think this will turn out to be a very important area. Founders should work on things AI is not good at.
Excited to share the latest expansion of the @nvidia #Alpamayo open platform for reasoning-based autonomous vehicles. Since its launch earlier this year, Alpamayo has seen rapid adoption across industry and academia, with its reasoning models surpassing 400,000 downloads and earning a #COMPUTEX 2026 Best Choice Award. As announced by Jensen Huang during his #COMPUTEX keynote, we are now introducing several major additions designed to accelerate the development of next-generation AV systems (more details here: huggingface.co/blog/drmapavo…): 🚗 Alpamayo 2 Super — a new 32B-parameter driving foundation model with: • Full 360° surround-view perception • Advanced reasoning capabilities and chain-of-causation outputs • Meta-actions such as lane changes, yielding, and stopping • Reasoning auto-labeling and visual grounding for scalable data annotation • State-of-the-art performance across reasoning, prediction, and alignment tasks 🔄 AlpaGym — an open-source framework for closed-loop reinforcement learning, enabling AV models to learn from the consequences of their actions in simulation and helping bridge the gap between training and real-world deployment. 📊 New Open Benchmarks — including challenges for closed-loop driving and long-tail reasoning to help the community measure progress and drive innovation. 🛠️ Alpamayo Recipes — a centralized repository of end-to-end workflows covering supervised fine-tuning, reinforcement learning, quantization, and model customization. Reasoning models and closed-loop training are becoming foundational technologies for autonomous systems. Our goal is to provide the open tools, models, infrastructure, and benchmarks needed to accelerate progress across the entire AV ecosystem. A huge thank you to the many researchers, engineers, and community members whose feedback helped shape this release. 
Resources: • Overview of the latest Alpamayo release (note: some components will be released over the coming weeks): huggingface.co/blog/drmapavo… • @nvidia announcement: nvidianews.nvidia.com/news/n… #AutonomousVehicles #PhysicalAI #Robotics #AI #MachineLearning #ReinforcementLearning #OpenSource #NVIDIA #Alpamayo @NVIDIADRIVE @NVIDIAAI
We’ve just released the #Alpamayo Chain-of-Causation (CoC) Autolabeling Pipeline — a feature that has been highly requested by the community! The pipeline automatically derives: 🔹 Meta-actions: high-level categorical descriptions of ego motion 🔹 Chain-of-causation labels: causal links between scene factors and the ego vehicle’s intended behavior Autolabeling pipeline: github.com/NVlabs/alpamayo-c… Learn more about the Alpamayo open platform: huggingface.co/blog/drmapavo… We’re excited to see what the community builds with it, and we hope this tool will help accelerate research in the rapidly growing area of #reasoning models for #Physical #AI. @NVIDIADRIVE @NVIDIAAI
Thrilled to personally invite you to the @nvidia #Alpamayo Summit at #CVPR! I’ll be opening the event with a talk titled “The ChatGPT Moment for Autonomous Driving” — exploring how reasoning AI is reshaping the entire autonomy stack and accelerating the path toward scalable, safe Level 4 autonomous driving. 📍 June 4, 2026 📍 Le Méridien Denver Downtown (Grove Ballroom) 🕞 3:30 PM — Networking Snacks 🕓 4:00 PM — Program Begins We’ll cover: • Open datasets • New reasoning models • AlpaSim • Safety frameworks • And much more Join the waitlist here → nvevents.nvidia.com/alpamayo… The event is currently sold out, but you can still join the waitlist. @NVIDIADRIVE @NVIDIAAI
Marco Pavone retweeted
Autonomous vehicle technology is advancing at an unprecedented pace. Marco Pavone (@drmapavone), Senior Director of Autonomous Vehicle Research at NVIDIA, breaks down how AI is enabling developers to completely rethink how autonomous systems are built. 📺 Watch the full video: nvda.ws/43rYlJy
🚗🏆 Breaking news: the @nvidia #Alpamayo open platform has been named the winner of a COMPUTEX TAIPEI 2026 Best Choice Award in the Vehicle Technology & Smart Cockpit category, recognizing Alpamayo as one of the year’s major breakthroughs in automotive and Physical AI technology! Announcements: - blogs.nvidia.com/blog/nvidia… - bcaward.computex.biz/WinnerY… I’m incredibly proud of the team behind the Alpamayo open platform (Wenjie Luo, @yan_wang_9, @iamborisi, and the entire NVIDIA Autonomous Vehicle Research Group) and deeply grateful for the contributions from the NVIDIA AV production team, Xinzhou Wu, Sarah Tariq, and many other researchers and developers across NVIDIA. This achievement was truly a collective team effort. 👏 To get started with Alpamayo: - huggingface.co/blog/drmapavo… - huggingface.co/blog/drmapavo… And stay tuned: we’ll have several exciting announcements in the coming weeks. @NVIDIADRIVE @NVIDIAAI
Introducing FRAX: Fast Robot Kinematics and Dynamics in #JAX — to be presented at the 2026 IEEE International Conference on Robotics and Automation (ICRA) Frontiers of Optimization for Robotics (FOR) Workshop. FRAX delivers extremely fast (low-microsecond) execution for common inverse-kinematic and inverse-dynamic control workloads, with a pure Python codebase that can achieve up to 5× faster performance than MuJoCo or Pinocchio Python bindings in several settings. At the same time, FRAX is fully differentiable and seamlessly compatible with CPU, GPU, and TPU execution through #JAX — enabling scalable workflows spanning robotics, control, planning, and machine learning. Our broader goal is to help bridge the gap between modern AI tooling and robotics computation, making it easier to develop scalable #Physical #AI systems. This also makes FRAX a great complement to CBFPY (github.com/StanfordASL/cbfpy), our package for robot safety and control barrier functions. Kudos to @danielpmorton for leading this effort. If you’ll be at ICRA, reach out! The FOR Workshop is on Monday, June 1, and we’ll have a poster there. 💻 GitHub: github.com/StanfordASL/frax 📄 Paper: arxiv.org/pdf/2604.04310 #Robotics #PhysicalAI #JAX #DifferentiablePhysics #MachineLearning #AutonomousSystems #GPU #Simulation #ICRA
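For readers unfamiliar with the kind of inverse-kinematics workload mentioned above, here is a minimal sketch in plain Python for a planar two-link arm. It does not use FRAX or its API; the link lengths, damping value, and function names are assumptions chosen for illustration. The Jacobian is written out by hand here, whereas a JAX-based library would typically obtain it by automatic differentiation.

```python
import math


def forward_kinematics(q, lengths=(1.0, 1.0)):
    """End-effector (x, y) of a planar 2-link arm with joint angles q (radians)."""
    q1, q2 = q
    l1, l2 = lengths
    return (l1 * math.cos(q1) + l2 * math.cos(q1 + q2),
            l1 * math.sin(q1) + l2 * math.sin(q1 + q2))


def jacobian(q, lengths=(1.0, 1.0)):
    """Analytic 2x2 Jacobian d(x, y)/d(q1, q2), returned as a tuple of rows."""
    q1, q2 = q
    l1, l2 = lengths
    s1, c1 = math.sin(q1), math.cos(q1)
    s12, c12 = math.sin(q1 + q2), math.cos(q1 + q2)
    return ((-l1 * s1 - l2 * s12, -l2 * s12),
            (l1 * c1 + l2 * c12, l2 * c12))


def ik_step(q, target, lengths=(1.0, 1.0), damping=1e-2):
    """One damped least-squares update: dq = J^T (J J^T + damping^2 I)^-1 e."""
    x, y = forward_kinematics(q, lengths)
    e0, e1 = target[0] - x, target[1] - y
    (j00, j01), (j10, j11) = jacobian(q, lengths)
    lam2 = damping ** 2
    # A = J J^T + damping^2 I is symmetric positive definite: [[a, b], [b, d]].
    a = j00 * j00 + j01 * j01 + lam2
    b = j00 * j10 + j01 * j11
    d = j10 * j10 + j11 * j11 + lam2
    det = a * d - b * b
    # Solve A y = e with the closed-form 2x2 inverse, then map back with J^T.
    y0 = (d * e0 - b * e1) / det
    y1 = (a * e1 - b * e0) / det
    return (q[0] + j00 * y0 + j10 * y1,
            q[1] + j01 * y0 + j11 * y1)


def solve_ik(target, q0, iters=50, **kwargs):
    """Iterate ik_step from the initial guess q0 for a fixed number of steps."""
    q = q0
    for _ in range(iters):
        q = ik_step(q, target, **kwargs)
    return q


q_solution = solve_ik(target=(1.0, 1.0), q0=(0.1, 1.4))
print(forward_kinematics(q_solution))
```

The damping term keeps the update well defined near singular configurations (for example, a fully stretched arm), at the price of slightly slower convergence.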
Excited to announce the launch of the Stanford Sustainable Mobility Center, where I’ll be serving as inaugural co-director. Housed within Stanford Precourt Institute for Energy, the center brings together @Stanford’s strengths — from energy systems to AI and autonomy — alongside industry and government collaboration to accelerate real-world mobility solutions at scale. 🔗 Overview of the center: news.stanford.edu/stories/20… The center traces its origins to the Center for Automotive Research at Stanford (CARS), which I had the pleasure of directing for several years. 🚗🚢✈️ If you are interested in rethinking how people and goods move across land, sea, and air, I’d love to connect! @StanfordEng @StanfordASL
A central challenge in #physical #AI is data scarcity: vision-language-action (#VLA) models are fundamentally limited by the availability of high-quality robotics demonstrations. In our recent work, we introduce R&B-EnCoRe (arxiv.org/pdf/2602.08167), a framework that enables models to self-bootstrap embodied #reasoning by leveraging synthetic visuo-textual data together with limited embodiment-specific experience. In essence, R&B-EnCoRe allows models to learn how to reason in an embodied setting. Our approach treats reasoning as a latent variable and uses self-supervised refinement to learn reasoning strategies that are directly predictive of successful control, without human annotations, reward engineering, or external verifiers. We validate the approach across a range of embodiments (including manipulation, navigation, and autonomous driving) and across model scales from 1B to 30B parameters, observing consistent improvements: 💪 +28% task success in real-world manipulation 🦿 +101% score in legged locomotion navigation 🚗 −21% collision rate in autonomous driving Overall, this work highlights a promising direction: aligning internet-scale priors with embodiment-specific data to enable scalable, self-improving physical intelligence. Kudos to an amazing team: Milan Ganai, Katie Luo, @JonasFrey96, Clark Barrett 🌐 Website: milanganai.github.io/rnb-enc… 📄 Paper: arxiv.org/pdf/2602.08167
Excitingly, @nvidia #Alpamayo 1.5 is now available within Autoware: github.com/autowarefoundatio… Grateful to @ShinpeiKato and the rest of the TIER IV team for helping democratize the development of AV solutions. I look forward to seeing #Alpamayo’s adoption continue to grow! As Jensen said, “Everything that moves will be autonomous.” Together, we are making big strides toward this vision! More about Alpamayo 1.5: huggingface.co/blog/drmapavo… @NVIDIADRIVE @NVIDIAAI
Ψ₀ (psi-lab.ai/Psi0) is an open foundation model for universal humanoid loco-manipulation—and, more broadly, one of the first and most comprehensive ecosystems for developing humanoid vision-language-action models trained from egocentric data. It advances the state of the art in performance while shedding light on key aspects of model development, including how to effectively structure the training process. 📄 Paper: arxiv.org/abs/2603.12263 Kudos to @yuewang314 for spearheading such an impactful effort—excited to be part of this collaboration!

Introducing Ψ₀ (psi-lab.ai/Psi0) — an open foundation model for universal humanoid loco-manipulation. 🏆 Outperforms GR00T N1.6 by 40% overall success rate 📉 Uses only ~10% of the pre-training data 📦 Fully open-source: model, data, code, and deployment pipeline 1/10
Jensen today announced Alpamayo 1.5 at #NVIDIAGTC! #Alpamayo 1.5 is a major update to Alpamayo 1—@nvidia’s open 10B-parameter chain-of-thought reasoning VLA model, first introduced at #CES. Built on the #Cosmos-Reason2 VLM backbone and post-trained with RL, it adds support for navigation guidance, flexible multi-camera setups, configurable camera parameters, and user question answering. The result is an interactive, steerable reasoning engine for the AV community. We’re also releasing post-training scripts to help researchers and developers adapt the model. Additionally, we’ve significantly expanded the Alpamayo open platform across data and simulation, including releasing highly requested reasoning labels for the PhysicalAI Autonomous Vehicles dataset (huggingface.co/datasets/nvid…), as well as our chain-of-causation auto-labeling pipeline. 🔎 Learn more about Alpamayo 1.5 and the latest extensions to the Alpamayo open platform: huggingface.co/blog/drmapavo… (please note that most of the links will become active in the next few days.) Happy building—and stay tuned for more in the coming months! @NVIDIADRIVE @NVIDIAAI
What does it take to build autonomous vehicles that can reason about the world they drive in? Tomorrow at #NVIDIAGTC, Patrick Liu and I will take a deep dive into the #Alpamayo #reasoning model family—a family of reasoning-based vision–language–action (#VLA) models that form a core component of the Alpamayo open platform (huggingface.co/blog/drmapavo…). We’ll cover three main topics: - How reasoning-based VLA models like Alpamayo 1 are designed and built - What it takes to bring Alpamayo 1 to production, including some of our latest results - Several exciting announcements about the expansion of the Alpamayo open platform If you're working on autonomous driving, robotics, or foundation models for physical AI, this session will offer a look at where the field is heading. Session details: 📅 Monday, Mar 16 | 3:00 PM PDT 📍 #NVIDIAGTC 2026 🔗 nvda.ws/4rze5oj Looking forward to seeing many of you there. @NVIDIADRIVE @NVIDIAAI
Excited to share CoVer-VLA — a contrastive verifier and hierarchical test-time scaling framework that bridges the intention–action gap in generalist robot policies. We show that allocating compute to reasoning and verification at deployment can be more effective than scaling policy training alone. 🌐 Website: cover-vla.github.io 📄 Paper: arxiv.org/abs/2602.12281 🤗 Models: huggingface.co/cover-vla 💻 Code: github.com/cover-vla/cover-v… Work led by @jackyk02, in collaboration with @Azaliamirh and @chelseabfinn
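The idea of spending deployment-time compute on verification can be illustrated with a generic best-of-N selection loop: sample several candidate actions, score each against the instruction with a verifier, and execute the top-scoring one. The sketch below is not the CoVer-VLA method or API; the cosine-similarity verifier, the 2-D toy embeddings, and all function names are assumptions for illustration only.

```python
import math
from typing import Callable, Sequence, Tuple

Vector = Tuple[float, ...]


def cosine_similarity(u: Vector, v: Vector) -> float:
    """Cosine of the angle between u and v (0.0 if either vector is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0.0 else 0.0


def select_action(instruction_embedding: Vector,
                  candidates: Sequence[Vector],
                  embed_action: Callable[[Vector], Vector],
                  verifier: Callable[[Vector, Vector], float] = cosine_similarity) -> Vector:
    """Best-of-N: return the candidate the verifier scores highest."""
    return max(candidates,
               key=lambda action: verifier(instruction_embedding, embed_action(action)))


# Toy setup (assumed): the instruction "move right" is embedded as +x, and
# candidate actions are 2-D displacements embedded as themselves.
instruction = (1.0, 0.0)
candidates = [(0.0, 1.0), (0.9, 0.1), (-1.0, 0.0)]
best = select_action(instruction, candidates, embed_action=lambda a: a)
print(best)
```

Increasing the number of sampled candidates is the knob that trades extra inference compute for a better chance that at least one candidate matches the instruction.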
Deep dive on @NVIDIA Alpamayo 1 (reasoning-based model for AVs) is now up. Watch the full recording: youtube.com/watch?v=V9E4GX5v… @NVIDIADRIVE @NVIDIAAI
💨 How fast can an autonomous vehicle think? With Alpamayo 1, NVIDIA's 10B-parameter chain-of-thought reasoning model, the distilled version can reason in real time. Hear Marco Pavone (@drmapavone), Yan Wang, Yurong You, and Wenhao Ding from our AV Research team break down Alpamayo 1 and what's next for reasoning in autonomous driving. 🔁 Watch the replay: nvda.ws/3O5gKb3