1/
Many recent works study self-evolving agents with skills.
We believe the key value of self-evolution is generalization: agents should extract skills from experience and adapt across tasks by reusing them.
Compared with offline settings, where skill libraries are pre-built from a "training" set, online learning is more challenging and realistic: the agent must extract and reuse skills using only the test tasks.
Following prior work, we therefore formulate online skill learning strictly: given a task stream and an initially empty skill library, the agent extracts skills from its own trajectories on the fly and reuses them on subsequent tasks.
In our paper, we further bar the agent from accessing ground-truth signals from the environment after it finishes each task, leaving trajectory evaluation and skill verification to the agent itself.
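
To make the setting concrete, here is a minimal sketch of the online loop described above. It is an illustration under stated assumptions, not the paper's implementation: `Skill`, `Trajectory`, `run_online`, and the `toy_*` functions are hypothetical names, and the toy agent is a stand-in for an LLM-based one. The point it shows is structural: the library starts empty, only the agent's own judgment gates skill extraction, and no environment reward is ever passed in.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Skill:
    name: str
    body: str


@dataclass
class Trajectory:
    task: str
    steps: List[str]
    skills_used: List[str]


def run_online(
    tasks: List[str],
    act: Callable[[str, List[Skill]], Trajectory],
    self_evaluate: Callable[[Trajectory], bool],
    extract: Callable[[Trajectory], List[Skill]],
) -> Tuple[List[Skill], List[Trajectory]]:
    """Process a task stream once, growing the skill library on the fly."""
    library: List[Skill] = []  # initially empty: nothing is pre-built offline
    log: List[Trajectory] = []
    for task in tasks:
        # The agent only sees skills extracted from earlier tasks in the stream.
        traj = act(task, list(library))
        log.append(traj)
        # No ground-truth signal is available here: the agent judges its own
        # trajectory and decides whether it is worth extracting skills from.
        if self_evaluate(traj):
            known = {s.name for s in library}
            for skill in extract(traj):
                if skill.name not in known:
                    library.append(skill)
                    known.add(skill.name)
    return library, log


# --- Hypothetical toy agent, standing in for a real LLM agent ---

def toy_act(task: str, library: List[Skill]) -> Trajectory:
    kind = task.split(":")[0]
    if any(s.name == kind for s in library):
        return Trajectory(task, [f"reuse {kind}"], [kind])
    return Trajectory(task, [f"explore {kind}", f"solve {kind}"], [])


def toy_self_evaluate(traj: Trajectory) -> bool:
    # Placeholder self-check; a real agent would verify its own result.
    return len(traj.steps) > 0


def toy_extract(traj: Trajectory) -> List[Skill]:
    if traj.skills_used:  # nothing new was learned when a skill was reused
        return []
    kind = traj.task.split(":")[0]
    return [Skill(name=kind, body=f"procedure for {kind}")]


library, log = run_online(
    ["sort:3,1,2", "sum:1,2", "sort:5,4"],
    toy_act, toy_self_evaluate, toy_extract,
)
print([s.name for s in library])  # ['sort', 'sum']
print(log[2].skills_used)         # ['sort']
```

In this toy stream, the third task reuses the skill extracted from the first, which is the kind of cross-task reuse the formulation is meant to measure.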