Check out our #ICLR2026 poster, “Generalizable End-to-End Tool-Use RL with Synthetic CodeGym”!
We developed CodeGym, an automated pipeline that converts coding problems into multi-turn tool-use environments for agent RL training. The pipeline yields 13K environments and 80K task configurations. Training on CodeGym can significantly improve LLM tool-use and multi-turn interaction capabilities on OOD tasks (e.g., Tau-Bench, ALFWorld).
Unfortunately, I can't make it to ICLR in person, so I'm sharing our poster here!
Paper: arxiv.org/abs/2509.17325
Dataset: huggingface.co/datasets/Vani…
How can we boost LLM agents’ generalizability to OOD tasks and environments?
Check out CodeGym, our new project for synthesizing environments for LLM agent RL training. CodeGym is a synthetic environment generation framework for reinforcement learning on multi-turn tool-use tasks. It automatically converts static coding problems into interactive and verifiable RL training environments.
Training in CodeGym leads to strong OOD generalization. For example, a Qwen2.5-32B-Instruct model achieved an 8.7-point absolute accuracy gain on τ-Bench!
We’ve just released the paper, synthesis pipeline, and dataset:
📄 Paper: arxiv.org/abs/2509.17325
💻 Project: github.com/StigLidu/CodeGym
📊 Dataset: huggingface.co/datasets/Vani…
More details in the thread👇