🚨 Understanding In-Context Learning:
1. Pretrained LLMs can implement learning algorithms that learn from data in-context.
2. Transformers can encode multiple algorithms for the same task and select one based on context at inference time.
3. Attention-free models also exhibit in-context learning (ICL).
Recently, Transformers have been shown to implement learning algorithms in-context. Key questions:
What are their limits?
Can they exploit informative examples to learn more efficiently?
How does this relate to pretrained LLMs?
Our new preprint explores these questions.
🧵