CEO @Neuracore_AI | Assistant Professor @imperialcollege | ex-Director of Dyson Robot Learning Lab | Postdoc @UCBerkeley w/ @pabbeel | PhD ICL w/ @ajdDavison

Joined January 2010
Pinned Tweet
After 10 years in robot learning, from my PhD at Imperial to Berkeley to building the Dyson Robot Learning Lab, one frustration kept hitting me: Why do I have to rebuild the same infrastructure over and over again?
The pattern I kept seeing:
• New robotics team starts
• Spends 6 months building data collection pipeline
• Spends another 3 months debugging synchronization issues
• Finally starts collecting task-specific data
• Realizes their infrastructure choices limit their flexibility
• Starts over
This is the whole point of robot learning: Robot learning is fundamentally data-driven. Whether you're picking strawberries or assembling electronics, the core infrastructure needs are identical. That's actually why I was so interested in pursuing data-driven robotics over a decade ago.
You always need:
• Multi-sensor data synchronization across different frequencies
• Flexible storage that works with future algorithms
• Visualization tools to understand your data
• The ability to experiment with different temporal resolutions
• Robust logging that captures everything you might need later
The trend towards AI in robotics is growing, with robots needing to process and analyze large amounts of sensor data to manage variability and unpredictability in real environments.
But every team builds this from scratch.
Imagine if every web developer had to build their own database, web server, and deployment pipeline before writing their first line of application code.
This is why I founded Neuracore. Instead of every robotics team spending months on infrastructure, we provide the common tools that let you go from "I have a robot" to "I'm shipping intelligent robot behaviors" in days, not months.
The real innovation in robotics won't come from everyone rebuilding the same plumbing. It'll come from teams who can focus entirely on what makes their application unique.
Robot learning shouldn't be bottlenecked by infrastructure. It should be bottlenecked by creativity.
What's the longest you've spent building infrastructure before getting to the actual robotics problem you wanted to solve?
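As a rough illustration of the first item in the "You always need" list, multi-sensor synchronization across different frequencies often comes down to matching timestamps between streams. The sketch below pairs each sample of a slower stream with the nearest sample of a faster one. The function name `align_nearest`, the `max_gap` tolerance and the 10 Hz / 50 Hz rates are assumptions made for the example, not Neuracore's API.

```python
from bisect import bisect_left


def align_nearest(ref_times, other_times, max_gap):
    """For each reference timestamp, return the index of the nearest
    timestamp in other_times, or None if the gap exceeds max_gap.

    Both lists are assumed to be sorted ascending, in seconds.
    """
    matches = []
    for t in ref_times:
        i = bisect_left(other_times, t)
        # Only the neighbours just before and just after t can be nearest.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(other_times)]
        if not candidates:
            matches.append(None)
            continue
        best = min(candidates, key=lambda j: abs(other_times[j] - t))
        gap = abs(other_times[best] - t)
        matches.append(best if gap <= max_gap else None)
    return matches


# Hypothetical streams: a 10 Hz camera and a 50 Hz joint encoder
# whose recording stops at 0.24 s.
camera_t = [0.00, 0.10, 0.20, 0.30]
joints_t = [round(k * 0.02, 2) for k in range(13)]

pairs = align_nearest(camera_t, joints_t, max_gap=0.015)
print(pairs)  # the last camera frame has no encoder sample close enough
```

Returning `None` instead of the closest available sample keeps dropouts visible in the aligned dataset rather than silently pairing a frame with stale joint data.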
Data scarcity. Sim-to-real gaps. Deployment timelines that stretch into months. Different people, different companies, the same bottlenecks every time. It's the exact problem we started @Neuracore_AI to solve. We're working with robot learning teams on exactly these challenges, so if that sounds like yours, get in touch or drop me a message directly.
We took one question to IEEE International Conference on Robotics and Automation (ICRA) 2026: what's the biggest challenge in industrial robotics right now? Data scarcity. Sim-to-real gaps. Deployment that takes months, not days. Researchers, founders and engineers from KUKA, @Universal_Robot, @FlexivRobotics & @noitomocap all pointing at the same bottlenecks. And all of them are exactly what we're building Neuracore to solve. We're working with robot learning teams on exactly these problems. If that sounds like yours, get in touch. #ICRA2026 #Robotics #RobotLearning #PhysicalAI
The most common thing the team heard at @ieee_ras_icra: "we'd love to take on more learning-based projects, but we can't carry the deployment risk." That risk is exactly what we built @Neuracore_AI to remove. If that sounds familiar, drop me a message.
One week on from @ieee_ras_icra 2026. It was great to connect with researchers, system integrators and so many people working in the field. The conversations confirmed what we're hearing everywhere: teams want to take on robot learning projects, but the infrastructure to collect data, train and deploy at scale is holding them back. If you're a system integrator or automation team looking to take on projects you couldn't before, and deliver them faster than your competitors, let's talk. #ICRA2026 #Robotics #RobotLearning #SystemIntegrators #wuji @AgilexRobotics
Stephen James retweeted
Looking to find us at @ieee_ras_icra? We're right at the entrance to hall B at booth S006. If you're a system integrator, or looking to go from demo to deployment faster than ever, come and chat with the team!
Great to see the @Neuracore_AI team in action at our first ever @ieee_ras_icra! If you're there this year, come by and see the platform in action for yourself!
Are you at @ieee_ras_icra this week? Booth S006 is where you'll find us. We have a live demo running all week, plus the full team on hand to show you what the Neuracore platform can do for your company.
Stephen James retweeted
Neuracore is exhibiting at @ieee_ras_icra 2026 in Vienna, Austria. Join us to see how robotics teams are eliminating the 80% of engineering time currently spent on data pipelines instead of robot learning. Come discuss the infrastructure bottlenecks killing your transition from lab prototypes to distributed fleets. Meet the team. See live demos of data recording, visualisation, training and model deployment using our infrastructure. Booth S006, June 1-5, 2026 | VIECON in Vienna, Austria #Neuracore #ICRA26
Thanks to the STIQ team for hosting last night. It was great to share what we're building at @Neuracore_AI and discuss some of the harder questions around where robotics is headed. Also great to be sharing the stage with All3, @recycleye, and @dexoryHQ. If you're exploring VLA, VLM, or robot learning deployments, my DMs are open - always happy to chat. #Robotics #RobotLearning #UKRobotics #STIQROBOTICS
Stephen James retweeted
That's a wrap on our inaugural sponsored hackathon! Congratulations to the winners of the "Best Use of Neuracore" award at the Oxford Edge and Oxford Artificial Intelligence Society Hackathon this weekend. Well done to Sarthak Das from the robot learning team at Neuracore for his effort on-site supporting teams and presenting the award!
Most simulation benchmarks for VLAs cannot tell you whether their numbers map to reality. REALM can: p < 0.001 correlation with real-world rollouts across 7 manipulation skills and 5 perturbations. The sim-to-real gap has been the central reason I have argued for collecting real data wherever possible. Most simulation benchmarks tell you something, but you cannot tell whether that something maps to reality. REALM, from Martin Sedlacek and the team at CTU Prague and Amsterdam, takes that problem seriously. The team built a simulation environment designed to correlate with real-world performance, and then validated it. Pearson values close to identity on task progression curves. Attention maps from π0 show 0.85 cosine similarity between matched real and simulated frames. They did not skip the validation step. They led with it. That changes what the simulation results actually mean. Across 15 perturbation factors covering visual, semantic, and behavioural variation, π0, π0-FAST, and GR00T N1.5 all show noticeable performance drops under semantic perturbations despite their internet-pretrained VLM backbones. All show sensitivity to camera viewpoint despite training on DROID's unusually diverse viewpoint distribution. The hardest axis of generalisation is across objects and their properties, not across skills. Reliability under perturbation is low across all three models. If the sim correlates with reality at the level REALM demonstrates, these are not simulation artefacts. They are real failure modes that real teams should be planning around. Two things this tells us. Validated simulation has a role in evaluation that it does not yet have in training. The cost of running thousands of perturbed rollouts in the real world is prohibitive. If REALM's correlation holds up across more task families, sim-based evaluation could become a serious tool for surfacing failure modes that ad-hoc real-world testing misses. 
The failure pattern across all three tested models also points back at the same place it always does. Pretraining buys you semantic grounding and skill primitives. It does not buy you robustness. The next generation of training data needs to focus on demonstrations where the object, scene, and viewpoint move underneath the skill, not on more demonstrations of the same skill on the same object. Paper link in comments.
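To make the validation idea concrete, here is a minimal sketch of how a sim-to-real correlation check could be computed: pair each condition's simulated success rate with its real-world success rate and take the Pearson coefficient. The numbers are placeholders invented for illustration and are not REALM results, and `pearson` is a hand-rolled helper rather than anything from the paper's code.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Placeholder success rates per perturbation condition (illustrative only).
sim_success = [0.90, 0.70, 0.55, 0.40, 0.20]
real_success = [0.85, 0.72, 0.50, 0.35, 0.25]

r = pearson(sim_success, real_success)
print(f"sim vs real Pearson r = {r:.3f}")
```

A high coefficient on its own only says the two rankings move together; a significance test and more conditions would be needed before trusting simulated numbers in place of real rollouts.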
There have been some exciting new features added to @Neuracore_AI this month! Head to our YouTube to see more: youtu.be/kjQ8RWJExb4
New to Neuracore? Check out our latest platform tour on YouTube and see how teams collect, observe, train, and deploy, all in one workflow. youtu.be/kjQ8RWJExb4
Excited to have @Neuracore_AI powering this weekend's Hackathon hosted by The Oxford Edge and @OxfordAI!
This weekend we're powering the Oxford Hardware / Physical AI Hackathon at @UniofOxford, with free access to the Neuracore platform for every participant. Hosted by The Oxford Edge and @OxfordAI with hardware from @FoundryRobotics, @Quanser and @huggingface LeRobot. Sensor kits from Atech. Coding credits from @AnthropicAI and @Cursor. If you're going, come find us!
Full fine-tuning is undoing the priors you spent the pretraining budget to build. That's the case PriorVLA is making, and the new paper from the team at CAS, Dexmal and collaborators is one of the cleaner demonstrations I have seen of the problem. Here's what happens. You take a pretrained VLA. You fine-tune on your downstream task. In-distribution evaluation looks fine. Then you test out-of-distribution and the model falls over. The pretraining gave you broad priors across diverse data. Fine-tuning pulled those priors toward the narrow patterns of your training set. The model effectively forgot what it knew. PriorVLA's response is to stop updating the pretrained action expert during fine-tuning. Freeze it, treat it as a read-only prior source, and train a parallel adaptation expert alongside it. Scene priors get pulled from the VLM, motor priors from the frozen expert, both routed into the adaptation expert via learned queries. Only 25% of the parameters a full fine-tune would touch actually get updated. The headline numbers: 11 points over π0.5 on RoboTwin 2.0-Hard, 99.1% average on LIBERO, 81% in-distribution and 57% out-of-distribution across 8 real-world tasks on two embodiments with standard data. The number that actually matters: with 10 demonstrations per task, PriorVLA beats π0.5 by 24 points in-distribution and 22 points out-of-distribution. A 24-point lift from 10 demos is the kind of sample efficiency that maps to how real teams ship robots, where you cannot collect thousands of demonstrations per skill. The broader implication is that we have been treating fine-tuning as if pretraining is just a smarter random initialisation. It isn't. Pretrained VLAs encode structure that downstream training overwrites unless you actively preserve it. Whether the right answer is frozen experts, LoRA-style adapters, or something else, the question of how to adapt without forgetting is now a first-class problem in the VLA stack. 
Credit: @CAS__Science Paper link in comments.
We kicked off our Robotics in Europe series last week with @IvanTregear and the @KAIKAKU_AI team over on Neuracore. The most underrated takeaway: operators don't need convincing on robotics. They need the layer underneath it. No databases, no analytics, no sensing means no foundation for automation to act on. The arm is the easy part. This is exactly why at @Neuracore_AI we stand with the teams building robotics today, ready for the learning-driven era ahead. Full episode and series live on our YouTube channel.
Restaurants don't need convincing on robotics. They need the layer that sits underneath it. @IvanTregear, CTO of @KAIKAKU_AI speaks on the misconception that operators are tech resistant. Most are eager to deploy. The real blocker is foundational: no databases, no analytics, no sensing layer for automation to act on. Head to the link in comments to watch our new series exploring Robotics in Europe.
New embodiments are landing every quarter. New arms, new grippers, new humanoids. Any serious robot learning team will be working with five different platforms inside a year. Neuracore was designed for that reality. Hardware agnostic isn't optional when the field moves this fast. It's the only architecture that holds up.
Most robot learning stacks assume you've already picked your hardware. Switch arms, switch grippers, switch sensors, and your data pipeline breaks. That's the Infrastructure Tax. And it's the reason teams spend more time wiring up robots than training models. Neuracore is hardware agnostic by design. Here it is making a cup of tea on an Open Arm. The same platform runs the same way on any embodiment you point it at, from research arms to industrial manipulators to humanoids. One stack. Any robot. Clean, high-fidelity data flowing into your training pipeline regardless of what's holding the kettle. Your hardware shouldn't decide your roadmap.
Specialist policies have historically outperformed generalist models. Physical Intelligence just released a generalist that matches and in some tasks beats them.
π0.7 is a 5B parameter VLA trained on a mixture that would normally poison a policy: expert demonstrations, suboptimal autonomous rollouts, failure cases, egocentric human video, and web-scale multimodal data. The usual outcome is mode averaging and performance collapse. π0.7 avoids that by conditioning every training episode on multimodal context that describes not just what to do, but how it was done.
Three conditioning signals in the prompt:
• Detailed language instructions. Not "fold laundry" but "grasp the left sleeve with the left gripper, fold the shirt in half vertically."
• Subgoal images. A separate world model generates near-future multi-view targets, and the policy learns inverse dynamics from current observation to subgoal.
• Episode metadata. Quality, execution speed, mistake flags. This is what lets the model learn from suboptimal data without copying its mistakes.
Results out of the box, against the RL specialist baselines from π*0.6:
• Laundry folding (T-shirts, shorts): matches specialist throughput and success rate
• Diverse laundry (hardest items): exceeds specialist throughput by 20%
• Espresso machine: matches specialist performance
• Box building: exceeds specialist throughput
Single checkpoint, no task-specific post-training, no RL fine-tuning.
The cross-embodiment result is more interesting. Shirt folding on a UR5e bimanual system, no task-specific data on that robot: 85.6% task progress, 80% success rate. Expert human teleoperators attempting the same task on UR5e for the first time scored 90.9% progress, 80.6% success. The policy lands in the same range as skilled humans attempting an unfamiliar embodiment cold. The UR5e arms are longer, heavier, and have different morphology than the source robot. π0.7 adapts its strategy. On the source robot, operators use tilted end-effector grasps. On UR5e, the policy switches to vertical grasps suited to the arm kinematics.
Two things this implies for the broader VLA stack:
The bottleneck for generalist VLAs has never really been the architecture. It's been the training mixture. If episode metadata is enough to make heterogeneous, partly-failed data productive, the cost curve for foundation model training in robotics changes meaningfully. Every team currently throwing away their autonomous rollout data is leaving training signal on the floor.
The open question is how far world-model-generated subgoals scale. World model fidelity has been the soft underbelly of similar approaches before, and zero-shot cross-embodiment is a strong demonstration but a narrow one.
Great work as always from the team at @physical_int. Paper link in comments.
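One way to picture the episode-metadata signal is as extra fields attached to each episode's text context, so that good and bad demonstrations are labelled rather than averaged together. The sketch below is purely hypothetical: the field names, the 1 to 5 quality scale, the prompt layout, and the idea of requesting the best settings at deployment are assumptions made for illustration, not Physical Intelligence's format.

```python
from dataclasses import dataclass


@dataclass
class EpisodeMeta:
    """Hypothetical per-episode metadata (names and scales are assumed)."""
    quality: int   # 1 (poor) to 5 (expert)
    speed: str     # "slow", "normal" or "fast"
    mistake: bool  # True if the episode contains a flagged mistake


def build_context(instruction: str, meta: EpisodeMeta) -> str:
    """Join the language instruction and metadata into one context string."""
    mistake_flag = "yes" if meta.mistake else "no"
    return (
        f"task: {instruction} | quality: {meta.quality} | "
        f"speed: {meta.speed} | mistake: {mistake_flag}"
    )


instruction = "fold the shirt in half vertically"

# Training: a slow, flawed rollout is labelled as such instead of discarded.
train_ctx = build_context(instruction, EpisodeMeta(2, "slow", True))

# Deployment (assumed usage): ask for the behaviour you actually want.
deploy_ctx = build_context(instruction, EpisodeMeta(5, "fast", False))

print(train_ctx)
print(deploy_ctx)
```

The point of the labelling is that a single policy can see both kinds of episode during training while still being steered toward the expert mode when it runs.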
Read the full paper and blog here: Paper: arxiv.org/pdf/2604.15483 Blog: pi.website/blog/pi07

Looking forward to joining this panel next Tuesday to talk about what it actually takes to get robots out of simulation and into the real world at scale. If you're in London and working in robotics, come join us!
One week to go. Next Tuesday, our Founder and CEO Stephen James takes the stage at STIQ Ltd's Robotics & Automation networking event, one of London's sharpest gatherings for roboticists, operators, buyers and investors. Stephen joins the panel with Obinna Njoku (@PepsiCo), Mark Slack (@CMRSurgical), Caroline France (@SciTechgovuk) and Oana Andreea Jinga (@dexoryHQ), hosted by Tom Andersson, to talk about how we're building the cloud-native infrastructure that closes the gap between simulation and real-world robot deployment. 📍 20 Primrose St, London ⏰ Tuesday 19 May, 17:30 to 21:00 Sign up here: luma.com/avzbz7v0?tk=cUJK8a
Dynamic manipulation is one of the harder failure modes for VLAs, and the perception-execution gap is part of why.
Static tasks tolerate 200ms inference delays. Dynamic tasks expose them. An object rolling at 0.5 m/s travels 10 cm during a 200ms inference cycle. By the time the action executes, the observation is stale. For continuous motion tasks, this becomes a real constraint.
NTU S-Lab has released DynamicVLA, a 0.4B model that targets exactly this. It is built on a FastViT convolutional encoder and SmolLM2-360M truncated to 16 layers. Inference completes in around 226ms on an RTX A6000. The interesting part is not the raw speed, which is comparable to baselines. It is how the architecture handles latency.
Continuous Inference overlaps prediction and execution, triggering new inference when the previous one completes rather than waiting for action chunk exhaustion. Latent-Aware Action Streaming discards outdated actions and overwrites stale predictions with recent ones to keep what the policy is doing aligned with what the world is doing.
DOM benchmark results reported in the paper:
• 47% success against 13.6% baseline
• 71.6% closed-loop reactivity against 28.3% for SmolVLA
• 30.3% success without continuous inference and action streaming, a 36% relative degradation from 47%
What is also worth noting is the data pipeline. Human teleoperators cannot react quickly enough to objects moving at 0.75 m/s, so the team replaces demonstrations with automated state-machine controllers. Dual RGB views handle 6D pose and velocity tracking in real time. The training data is generated by a system that can keep up with the task, not a human trying to.
Inference lag is a real constraint once you move past pick-and-place demos, and work like this is a meaningful step. But the harder question for production robotics is still upstream. Can you actually collect enough data, at the right quality, to make these systems reliable in unconstrained environments? Architecture matters. The data path matters more.
Nice work from the team at @NTUsg Lab. Paper and project page in comments below. #RobotLearning #VLA
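The latency arithmetic in this post is easy to check, and the same numbers show why stale actions have to be dropped. The sketch below reproduces the 10 cm drift figure, the relative drop from 47% to 30.3%, and a simplified version of discarding outdated actions from a predicted chunk. It assumes a fixed control period (`control_dt_ms`, a made-up parameter) and covers only the discard step, so it is not DynamicVLA's actual streaming mechanism.

```python
def drift_cm(speed_m_s: float, latency_ms: int) -> float:
    """Distance in cm an object moves while one inference call runs."""
    return speed_m_s * latency_ms / 1000.0 * 100.0


def relative_drop(before: float, after: float) -> float:
    """Relative degradation in percent, e.g. success rate before vs after."""
    return (before - after) / before * 100.0


def drop_stale(chunk: list, latency_ms: int, control_dt_ms: int) -> list:
    """Discard actions whose scheduled time slot passed during inference.

    Assumes one action per control period; the number of elapsed periods
    is rounded up with integer ceiling division.
    """
    elapsed_steps = -(-latency_ms // control_dt_ms)
    return chunk[elapsed_steps:]


print(drift_cm(0.5, 200))         # the post's 0.5 m/s, 200 ms case
print(relative_drop(47.0, 30.3))  # ablation result quoted above

# Hypothetical chunk of 10 action indices at a 50 ms control period.
chunk = list(range(10))
print(drop_stale(chunk, latency_ms=226, control_dt_ms=50))
```

With a 226 ms inference time and a 50 ms control period, half of a 10-step chunk is already out of date when it arrives, which is why executing chunks from their first action degrades on moving objects.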