PhD at Stanford CS; previously undergrad at HKU. Interested in robotics.

Joined September 2023
Pinned Tweet
What RL-based humanoid controllers are missing compared with industrial robots is precision and force control. CHIP can do both. We propose a simple recipe for building a humanoid impedance controller, which can be used for wiping, carrying large objects, and multi-robot collaboration.
🚀 Introducing CHIP: Adaptive Compliance for Humanoid Control through Hindsight Perturbation! Current humanoids face a trade-off: they are either Agile & Stiff OR Slow & Soft. CHIP breaks this barrier. We enable on-the-fly switching between Compliant (wiping 🧼, collaborative holding 📦) and Stiff (lifting dumbbells 🏋️, opening doors 🚪💪) behaviors—all while maintaining agile skills like running! 🏃💨 Website: nvlabs.github.io/CHIP/ Join me for a deep dive on how CHIP enables adaptive control for complex tasks. 🧵↓
Sirui Chen retweeted
🪜 What if humanoids could climb ladders and work on them straight out of simulation? Meet LadderMan: a perceptive system for zero-shot sim-to-real ladder climbing and on-ladder manipulation. Watch the humanoid climb, stabilize, and manipulate—all in one system. 🤖👇
As dexterous hands become more human-like, using human data also becomes easier than ever. We explore how to co-train with both human and robot data and generalize to new tasks with only human data.
We show that robots can learn high-level task semantics, such as sorting rules, skill composition, and rule-based ordering, directly from human demos. This is useful because if your target task is a composition of the robot's existing skills, you could just collect human demos for it without collecting further robot data. Introducing Ego-Pi: VLA fine-tuning for egocentric human and robot data, a collaboration between @Stanford and @Meta. Website: egopipaper.github.io/ Paper: egopipaper.github.io/resourc… 1/6
Sirui Chen retweeted
Humanoid robotics is hitting a data wall. Teleop and mocap took us far, but they don’t scale to every object, terrain, and behavior. We’re releasing GRAIL: research.nvidia.com/labs/dai… — a fully digital pipeline for generating loco-manipulation data before the robot moves. 🧵(1/8)
Sirui Chen retweeted
What is missing to bring real-time motion research into AAA games and real-world robotics? We present MotionBricks, a step toward bridging this gap with two key components: - a single generative latent motion backbone covering 350,000 motion skills, running at 15,000 FPS with 2 ms latency and substantially improved quality and reliability. - a unified smart primitive interface for locomotion, object / scene interaction, with fine-grained control over generated behaviors. Webpage: nvlabs.github.io/motionbrick… Code: github.com/NVlabs/GR00T-Whol… Paper: arxiv.org/abs/2604.24833 (ACM TOG / SIGGRAPH 2026)
Sirui Chen retweeted
The Movement Lab has a new website ✨ A look at what we've been working on: humanoid robots, physics-based animation, and robot learning. tml.stanford.edu
Egocentric data for navigation is the way to go!
A person walks around campus for 5 hours with cameras. That's it. That's the training data. The result? A humanoid robot that traverses unseen buildings, crowds, and glass walls — zero robot data, zero finetuning. EgoNav is here. egonav.weizhuowang.com/ None of these behaviors were pre-programmed: • Waiting for a door to open before entering • Steering around glass walls invisible to depth sensors • Yielding to pedestrians and resuming • Re-routing when furniture is rearranged All emerged from 5 hours of a human walking around. The prior is real. (1/6) #Humanoid #Robotics #DiffusionModel #EgoNav
Sirui Chen retweeted
Happy Monday! More exciting SONIC releases incoming.
Sirui Chen retweeted
Can we learn whole-body mobile manipulation directly from human demonstrations? Introducing Whole-Body Mobile Manipulation Interface (HoMMI) Egocentric UMI, 0 teleop -> bimanual & whole-body manipulation, long-horizon navigation, active perception hommi-robot.github.io
Sirui Chen retweeted
Excited to release Minimalist Compliance Control! We achieve robust, compliant robot interaction across robot arms, dexterous hands, and humanoids, with NO force sensors or learning. If you’re wondering what remains, please see the thread below😉 Website: minimalist-compliance-contro…
Generalist for dex manipulation!
🤖 Can a single robot policy manipulate diverse tools without ever seeing them before? Introducing SimToolReal 🔨: a generalist dexterous manipulation policy that transfers zero-shot sim→real to unseen tools and unseen tasks. All videos are 1x speed (60 Hz control) 🧵👇
Finally, everyone can benefit from the results of training on 700 hours of motion data with 128 GPUs!
SONIC is now open-source! Generalist whole-body teleoperation for EVERYONE! Our team has long been building comprehensive pipelines for whole-body control, kinematic planner, and teleoperation, and they will all be shared. This will be a continuous update; the inference code and model are already there, with training code and gr00t integration coming soon! Code: github.com/NVlabs/GR00T-Whol… Docs: nvlabs.github.io/GR00T-Whole… Site: nvlabs.github.io/GEAR-SONIC/
Sirui Chen retweeted
We have seen rapid progress in humanoid control — specialist robots can reliably generate agile, acrobatic, but preset motions. Our singular focus this year: putting generalist humanoids to do real work. To progress toward this goal, we developed SONIC (nvlabs.github.io/GEAR-SONIC/), a Behavior Foundation Model for real-time, whole-body motion generation that supports teleoperation and VLA inference for loco-manipulation. Today, we’re open-sourcing SONIC on GitHub. We are excited to see what the community builds upon SONIC and to collectively push humanoid intelligence toward real-world deployment at scale. 🌐 Paper: arxiv.org/abs/2511.07820 📃 Code: github.com/NVlabs/GR00T-Whol…
Just like in the video, the year of the humanoid is unstoppable!
Can humanoids perform agile, autonomous, long-horizon parkour—based on what they see in the world? We present 𝗣𝗲𝗿𝗰𝗲𝗽𝘁𝗶𝘃𝗲 𝗛𝘂𝗺𝗮𝗻𝗼𝗶𝗱 𝗣𝗮𝗿𝗸𝗼𝘂𝗿 (𝗣𝗛𝗣): a framework that chains dynamic human skills using onboard depth perception for long-horizon traversal. 1/6
Sirui Chen retweeted
Long-tail scenarios remain a major challenge for autonomous driving. Unusual events—like accidents or construction zones—are underrepresented in driving data, yet require semantic and commonsense reasoning grounded in control. We propose SteerVLA, a framework that uses VLM reasoning to steer a driving policy via grounded, fine-grained language instructions. Paper: arxiv.org/abs/2602.08440 Website: steervla.github.io/
This is amazing! Agility and adaptive compliance usually don't go hand in hand; glad GentleHumanoid also makes it work.
The same policy that throws jump kicks can also shake your hand gently. Open-sourcing our mjlab-based universal motion tracking framework, with compliance control built in. Demo: motion-tracking.axell.top
Sirui Chen retweeted
How can robots handle fragile, soft everyday objects like humans do, using vision & tactile to regulate force? 🤖🥚 Introducing our full-stack solution: a low-cost ($150) force gripper (0.45~45N), a force-aware teleoperator, and a reactive policy for learning force control.
Sirui Chen retweeted
Humanoid robots are agile but stiff. CHIP shows how to add adaptive compliance without breaking motion tracking. A single controller handles wiping, door opening, box lifting, writing, and even running while carrying objects.
Sirui Chen retweeted
Robust humanoid perceptive locomotion is still underexplored, especially when different cameras see different terrains, paths get narrow, and payloads disturb balance... Introducing RPL, which tackles this with one unified policy: • Challenging terrains (slopes, stairs, and stepping stones) • Multiple directions • Payloads Trained in sim. Validated long-horizon in the real world. Watch the robot walk it all🦿 Details below👇