Learning and planning for safe, embodied autonomous systems under uncertainty. Senior Research Scientist @ToyotaResearch. PhD from @StanfordMSL. Japanese & English

Joined March 2018
Pinned Tweet
A huge shout-out to TRI's VLA team for the public release of VLA Foundry! You can take full control of VLA training with this fully open-sourced codebase, which comes with a nice GUI dashboard with rigorous policy comparison powered by STEP🪜 tri-ml.github.io/step/

Releasing VLA Foundry: an open-source framework that unifies LLM, VLM, and VLA training in a single codebase. End-to-end control from language pretraining to action-expert fine-tuning — no more stitching together incompatible repos.
Haruki Nishimura retweeted
We extended the deadline to *June 22nd*! Submit your coolest and craziest (in-progress or completed) works on generalist robot safety 😎 Workshop co-organized with the great team: @ArpitBahety @kensukenk @imp_aa @RobobertoMM @ianabraha @LihanZha
Excited to announce our #RSS2026 workshop: "Rethinking What It Means to be 'Safe' for Generalist Robots"! 🛡️🤖 Have new work or videos of robot safety failures? Submit by June 12! 👇
Haruki Nishimura retweeted
What does it actually mean for a modern robot to be safe? As generalist robots move across tasks, environments, and users, safety must encompass many dimensions: collisions, semantic constraints, perceptions of safety, privacy, and more. Join our discussion at RSS 2026!
Haruki Nishimura retweeted
Releasing RecGen: a collaboration between @ToyotaResearch, @toyota_europe, and @UvA_Amsterdam tackling a core 3D vision challenge: reconstructing complete multi-object scenes (parts, poses, textures, even occluded geometry) from just 1 to a few RGB-D views. Trained purely on synthetic data, RecGen achieves SOTA on real-world robotics and 6D pose benchmarks, handling occlusions, symmetry, and complex interactions. A step toward scalable, high-fidelity digital twins for robotics, and better evaluation and training of generalist policies. reconstruction-by-generation…
Haruki Nishimura retweeted
I was thrilled to be back at @MIT for the Robotics Seminar! The talk recording is available now: Rethinking Robot Safety & Alignment in the Era of Generalist Policies youtu.be/pZM8sgLAye0?si=GG7t…
Haruki Nishimura retweeted
A few interesting rollouts from the Foundry-QwenVLA-2.5B multi-task model on seen tasks in sim: a 🧵. I really like behaviors that involve non-prehensile manipulation, like the little nudges in StoreCerealBoxUnderShelf.
Haruki Nishimura retweeted
Having control over upstream LLM/VLM training is key to training a good robotics model. We hope VLA Foundry opens the door for researchers and practitioners to answer questions they wouldn’t even have thought to ask when upstream pretraining was simply inherited!
Haruki Nishimura retweeted
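VLA Foundry, the last project I was involved in at TRI, has finally been released! Not only can you easily try out different language models and vision models, you can also easily run multi-task evaluations in a simulation environment built with Drake Blender. Please give it a try!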
This is largely based on @das_princeton's implementation, which grew out of an internship project in the collaboration between TLU tri.global/trustworthy-learn… and @Majumdar_Ani's group at Princeton!

This is actually a pretty big deal: we rely on @imp_aa’s implementations to tell when policies are statistically different from each other. If someone presents quick mean-only results internally without the CLD analysis, you can be sure someone will eventually ask for it.
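To illustrate why mean-only results can mislead, here is a minimal sketch in Python. It is not STEP and not the implementation mentioned above; it swaps in a plain two-sided Fisher exact test on success counts. The function name and the rollout counts are hypothetical, chosen only for illustration.

```python
from math import comb


def fisher_exact_two_sided(success_a, trials_a, success_b, trials_b):
    """Two-sided Fisher exact test p-value for two policies' success counts.

    Hypothetical helper for illustration; not part of STEP.
    """
    total_success = success_a + success_b
    total_trials = trials_a + trials_b
    denom = comb(total_trials, total_success)

    def prob(k):
        # Hypergeometric probability that policy A gets exactly k of the
        # total successes, given fixed row and column totals.
        return comb(trials_a, k) * comb(trials_b, total_success - k) / denom

    k_min = max(0, total_success - trials_b)
    k_max = min(trials_a, total_success)
    p_observed = prob(success_a)

    # Sum the probabilities of all tables at least as extreme as the
    # observed one (small tolerance guards against floating-point ties).
    p_value = sum(
        prob(k)
        for k in range(k_min, k_max + 1)
        if prob(k) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Hypothetical rollouts: policy A succeeds 18/20 times, policy B 12/20.
p = fisher_exact_two_sided(18, 20, 12, 20)
print(f"mean A = {18 / 20:.2f}, mean B = {12 / 20:.2f}, p-value = {p:.3f}")
```

With these hypothetical counts, 90% vs. 60% success over 20 rollouts each gives a p-value of about 0.065, so the gap in means alone would not clear a 0.05 threshold.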
See also: "Statistical Thinking for Robot Policy Evaluation: From Rigorous A/B Testing to Effective Visualization" medium.com/toyotaresearch/st…
Haruki Nishimura retweeted
Great to see @LeRobotHF using STEP as a tool for statistically rigorous policy comparison! arxiv.org/abs/2503.10966
Releasing the Unfolding Robotics blog! Time to unfold robotics: we trained a robot to fold clothes using 8 bimanual setups, 100 hours of demonstrations, and 5k GPU hours. Flashy robot demos are everywhere. But you rarely see the real story: the data, the failures, the engineering. We’re sharing everything: code, data, and details in the blog → huggingface.co/spaces/lerobo…
Congrats to the @LeRobotHF team on this remarkable contribution to the robotics community by open-sourcing "everything" including code, data, and all the valuable knowledge! Our TLU team at TRI is fortunate to have collaborated on statistical evaluation and analysis.
Haruki Nishimura retweeted
A really solid step toward scalable, high-quality robot data collection — Raiden, from colleagues at TRI @ZakharovSergeyN (and led by @s1wase) lowering the barrier to entry for bimanual data collection, with support for leader–follower setups and SpaceMouse teleop. Big highlight - it natively supports camera calibration and integrates TRI’s learned stereo depth model out of the box, with strong improvements over vanilla ZED SDK. If you're working on robot learning or data collection pipelines, definitely worth a look👇 tri-ml.github.io/raiden/
Our 3D Vision team (3DGR) is releasing Raiden — a data collection toolkit for YAM robots. Built for scalable, high-quality data: supports leader–follower SpaceMouse teleop, multi-camera setups, and modern stereo depth (incl. TRI learned stereo). tri-ml.github.io/raiden/
Haruki Nishimura retweeted
Baking without premix.
Are you about to evaluate robot policies for your next paper, comparing your policy with baselines? Take a moment to review this article by @MashaItkina and myself, introducing practical tips on rigorous statistical analysis with easy-to-use Python tools: medium.com/toyotaresearch/st…
We also highlight our open-source, plug-and-play plotting tool in Python, which extends STEP to multi-policy comparisons and concisely visualizes the output of the statistical testing.
STEP is open-sourced here: tri-ml.github.io/step/
Explore the new plotting tool and tutorial here: lnkd.in/gBReeEdH
Working examples of our statistical analysis tool can be found in the recent co-training study here: arxiv.org/abs/2602.01067
