Hit 1 million RL steps/sec today!
This policy trained for 1 minute on a single GPU using a from-scratch physics simulator built with custom Triton kernels.
The sim runs at about 5 million physics steps per second. With decimation = 4 (four physics steps per policy action), this produces ~1.3 million env steps/second. Adding the PPO training loop brings end-to-end throughput down to ~1 million steps/second.
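For anyone who wants to sanity-check the throughput numbers, here is a rough back-of-the-envelope sketch. It uses only the rounded figures quoted above, so it lands on 1.25 million rather than the ~1.3 million measured. The function and variable names are made up for illustration and are not taken from the actual codebase, and the rates are assumed to be aggregate across all parallel environments.

```python
def env_steps_per_sec(physics_steps_per_sec: float, decimation: int) -> float:
    """Each env (policy) step consumes `decimation` physics steps."""
    return physics_steps_per_sec / decimation


# Rounded figures from the post (assumed aggregate over all parallel envs).
physics_rate = 5.0e6   # physics steps per second, "about 5 million"
decimation = 4         # physics steps per policy action
training_rate = 1.0e6  # env steps per second with PPO in the loop, "~1 million"

# Sim-only env throughput implied by the physics rate and decimation.
sim_only_rate = env_steps_per_sec(physics_rate, decimation)

# Share of sim-only throughput given up once the PPO update is added.
ppo_overhead_fraction = 1.0 - training_rate / sim_only_rate

# Env steps collected in the 1-minute training run at the end-to-end rate.
env_steps_in_one_minute = training_rate * 60

print(f"sim-only env steps/sec: {sim_only_rate:,.0f}")
print(f"PPO overhead: {ppo_overhead_fraction:.0%}")
print(f"env steps in 1 minute: {env_steps_in_one_minute:,.0f}")
```

With these rounded inputs, the PPO loop costs roughly a fifth of the sim-only throughput, and a 1-minute run covers on the order of 60 million env steps.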
Modern GPU-accelerated physics engines like IsaacSim and MJWarp are amazing, but they are, by design, general purpose. For any specific problem, they can leave a lot of performance on the table. And once you start coloring outside the lines, throughput can fall apart fast.
The video below shows sim2sim verification: the policy was trained in our sim and evaluated in IsaacLab's built-in Go2 environment. Clean transfer!