Fable 5 Sim one shot (ish) >
"Create a mujoco sim environment with an openarm 2.0 URL here> LINK. Place it behind a black table with a cube on the left and a box on the right infront of it, upscale the Quality and lighting. Place one camera on each wrist, within the camera slot on the wrist, and place a camera slightly above the main torso on a stick. make the cameras fisheye. Use Gemini api to be a multimodal critique on everything visually, camera setup, scene etc, loop it with both pro 3.1 and er 1.6, adjust setup, re critique, redo until its perfect, if you think you are done send result image for my eyeball. Do an RL policy with the following rewards: approaching the cube with the gripper, grasping the cube, lifting the cube above the hight of the box, releasing the cube in the box. Disable the arm on the side of the box. If you have any issues training this policy do a policy for each step assuming the success of the prior one and then chain them and hotstart and retrain the full loop. Once this has reached 90 % success, collect 100 successful episodes make it to a lerobot v3 dataset and train pi 0.5 for 5000 steps on it via your Qualia API skill, run inference for 100 episodes give it ample time to complete the task but set some limit your gemini critic might deem reasonable. Send results. If results are above 60%, add some other cube colors, and do the same RL policy for each individual cube and collect 100 episodes for each color and train pi again on those 500 episodes. Again remember your gemini critic. Run inference on the result for different language commands on the colors, collect episodes and give me the success rate."
Resulted in a VLA policy with a 91% success rate in sim, in an afternoon. Silly task, but insane that this is possible in basically one shot.
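
For anyone curious what the four reward stages in the prompt (approach, grasp, lift above the box, release in the box) might look like, here is a minimal sketch. It is not the reward the agent actually wrote: the `State` fields, the `staged_reward` name, the bonus values, and the `box_half_width` parameter are all hypothetical, and in a real MuJoCo env these values would come from the simulator state.

```python
import math
from dataclasses import dataclass


@dataclass
class State:
    # Hypothetical observation fields; a real env would read these from the sim.
    gripper_pos: tuple  # (x, y, z) of the gripper
    cube_pos: tuple     # (x, y, z) of the cube
    box_pos: tuple      # (x, y, z) of the box centre
    box_height: float   # z of the box rim
    grasped: bool       # True while the gripper holds the cube


def staged_reward(s: State, box_half_width: float = 0.05) -> float:
    """Sum of the four stages; each later stage adds on top of the earlier ones."""
    # Stage 1: approach. Shaped term in (0, 1], largest when gripper is at the cube.
    dist = math.dist(s.gripper_pos, s.cube_pos)
    reward = 1.0 - math.tanh(5.0 * dist)

    if s.grasped:
        # Stage 2: grasp bonus.
        reward += 1.0
        # Stage 3: extra bonus once the cube is above the box rim.
        if s.cube_pos[2] > s.box_height:
            reward += 1.0

    # Stage 4: released inside the box footprint and below the rim.
    dx = abs(s.cube_pos[0] - s.box_pos[0])
    dy = abs(s.cube_pos[1] - s.box_pos[1])
    in_box = dx < box_half_width and dy < box_half_width and s.cube_pos[2] <= s.box_height
    if in_box and not s.grasped:
        reward += 4.0  # assumed value, chosen to exceed the best "still holding" reward

    return reward


# Example states for each stage (made-up coordinates, in metres).
box = (0.3, 0.4, 0.0)
far = State((0.0, 0.0, 0.5), (0.3, 0.0, 0.02), box, 0.1, grasped=False)
held = State((0.3, 0.0, 0.02), (0.3, 0.0, 0.02), box, 0.1, grasped=True)
lifted = State((0.3, 0.0, 0.2), (0.3, 0.0, 0.2), box, 0.1, grasped=True)
released = State((0.3, 0.4, 0.2), (0.3, 0.4, 0.03), box, 0.1, grasped=False)
```

The rewards increase monotonically across `far`, `held`, `lifted`, and `released`, which is the property that lets the per-stage policies mentioned in the prompt be trained separately and then chained.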