PhD CS @ NYU Courant

Joined December 2014
15 Photos and videos
Pinned Tweet
We don't need the name of an object to pick it up; we simply need to know where it is and what it looks like. Introducing Contact-Anchored Policies (CAPs): instead of language, we explicitly condition on contacts. Our policy learns object pickup with only 16 hours of data! 🧡
5
28
108
13,067
Jeff Cui retweeted
[10/n] Broader implication 2 Learning from human videos or wearables is a promising direction. These paradigms, however, often treat "future state" as its pseudo action label, implicitly assuming perfect tracking, which is effectively a stiff-controller assumption. If our findings generalize, rethinking this assumption could unlock even more of their potential.
1
7
66
33,041
Jeff Cui retweeted
Learning from human data requires human-like hardware. Humans use their wrists constantly, but table-top manipulators lack this flexibility. We build upon RUKA and introduce RUKA-v2: a tendon-driven hand with a 2-DOF wrist and finger abduction/adduction 👋✌️
7
29
116
8,795
Jeff Cui retweeted
✨ Meet YOR: Open-Source Bimanual Mobile Manipulator from @nyuniversity Fully open-source mobile manipulator with dual 6-DoF PiPER arms by AgileX Robotics, BOM cost only ~$10k! 🌐 yourownrobot.ai/ #Robotics #OpenSource #AgileXRobotics #PiPER #NYU
7
41
242
16,223
Jeff Cui retweeted
World models are neural simulators. But neural simulators need grounding. If you close your eyes and reach out for the coffee cup in front of you, you’ll be able to manipulate it. To pass The Physical Turing Test, we need action loops at scale, irrespective of the modality, and that’s what the bitter lesson teaches us. We are upgrading Simulation 1.0 to 1.5 - generative assets and scenes, and we are calling it PhysReady. [1/]
A child consumes more data in 1 month than any LLM has ever seen. Embodied agents learn by doing, but the data that teaches them is tactile, sensorial and causal. Such data does not exist. To make physical AGI possible, we need to generate this new data at an industrial scale. Enter Palatial: automated infrastructure that converts raw data into sensory rich playgrounds for robots to learn in. Today, we’re unveiling Palatial PhysReady, the first automated sim asset generator (try it ⬇️) [1/5]
5
4
41
3,677
Jeff Cui retweeted
Robot foundation models are limited by costly real data, while simulation data is plentiful but visually mismatched to reality. We present Point Bridge, a method that enables zero-shot sim-to-real transfer for robot learning with minimal visual alignment. pointbridge3d.github.io
4
40
222
19,742
Jeff Cui retweeted
okay, actually yes
interesting. We also observed that contact predictions help locomotion control. I wonder if this would be general for learning-based control if we first predict the contacts, e.g., for VLA/WMs.
1
16
100
19,069
Jeff Cui retweeted
Introducing YOR. Balancing budget and functionality for a capable mobile robot is always a challenge. To give researchers and hobbyists more options, we built our own open-source one for ~$10k.
1
9
55
3,558
Jeff Cui retweeted
The real gap isn't capability, it's accessibility. We need platforms that labs can actually build, hack and improve without needing big budgets or NDAs. Something modular, documented, cheap and yet capable enough to conduct hours of research. We present you YOR
Why buy a robot when you can build your own? Meet YOR, our new open-source bimanual mobile manipulator robot – built for researchers and hackers alike for only ~$10k. 🧡👇
4
25
135
20,875
Jeff Cui retweeted
Why buy a robot when you can build your own? Meet YOR, our new open-source bimanual mobile manipulator robot – built for researchers and hackers alike for only ~$10k. 🧡👇
7
22
169
38,462
Fully open-source, customizable hardware is the way for robotics research. Introducing Your Own Robot (YOR), a mobile bimanual robot platform for ~$10k.
2
4
32
1,474
The Jetson integration allows us to run our learned policies directly onboard, without having to worry about networking jitter, with multiple RGB streams, base odometry, and proprioception (10x autonomous):
1
4
125
Also check out MolmoSpaces-Bench from @omarrayyann! Our contact-anchored policies (CAPs) perform well zero-shot across diverse environments and objects. Omar is the rockstar behind our sim env for CAP, enabling us to train and evaluate multiple models in a day.
Replying to @omarrayyann
It’s hard to find true zero-shot end-to-end policies – ones that work without any fine-tuning in fully novel, simulated environments, even for single tasks! We test two policy families, the π family from @physical_int and the recent Contact-Anchored Policies (CAP) from NYU & UCB. On all our tasks, we are making steady progress – but we are nowhere close to saturation yet.
1
6
784
Jeff Cui retweeted
Omar is the mastermind of EgoGym – our sim eval-only benchmark that we hillclimbed to improve in the real world. That it was even possible was surprising to me, but it turns out when your robot is trained on diverse data, sim is just another new environment.
Very excited to release Contact-Anchored Policies (CAP) 🧒 today! Check out this thread for more details on that and on our in-the-loop simulation evaluations:
1
8
71
8,468
Jeff Cui retweeted
Best ideas are often the simplest in hindsight. Meet Contact-Anchored Policies (CAP)🧒: by conditioning policies on physical contact (vs language) we achieve env & embodiment generalization with super low resources. This policy ⬇️ learned to pick from scratch w/ 16 hrs of data 🧡
7
31
172
16,618