Our paper has been accepted to ICLR as a spotlight paper!
We introduce the refined Local Learning Coefficient, which measures “how much structure” there is in particular parts of the model associated with particular datasets or behaviors.
1/ How do attention heads form?
With our new approach, we show that attention heads have distinct developmental signatures. These signatures reveal how heads take on functional roles specialized to different subsets of the data. In the process, we discover a new circuit.