ALT Pictorial representation of embedding recycling (Figure 1 from our manuscript). The image is divided into two parts. At the top, a stack of documents is sent through a transformer model to perform a task, and the output of an intermediate layer is cached in a database shown in the bottom part of the image. Then, when a transformer model is run on a new task (bottom right), the intermediate representations are retrieved from the embedding store instead of being recomputed.
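
The flow in the figure can be sketched in a few lines of Python. This is only a toy illustration, not the manuscript's implementation: the "layers" are plain functions on lists of floats, and the names `EmbeddingStore`, `get_or_compute`, and `run_task` are hypothetical. It also assumes the lower layers are shared and unchanged between tasks, since cached activations are only reusable in that case.

```python
from typing import Callable, Dict, List

Vector = List[float]
Layer = Callable[[Vector], Vector]


def run_layers(x: Vector, layers: List[Layer]) -> Vector:
    """Apply a sequence of layers in order."""
    for layer in layers:
        x = layer(x)
    return x


class EmbeddingStore:
    """Hypothetical in-memory cache of intermediate-layer outputs, keyed by document id."""

    def __init__(self) -> None:
        self._cache: Dict[str, Vector] = {}
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, doc_id: str, inputs: Vector, lower_layers: List[Layer]) -> Vector:
        if doc_id in self._cache:
            self.hits += 1  # recycled: the lower layers are skipped
        else:
            self.misses += 1  # first time this document is seen: compute and cache
            self._cache[doc_id] = run_layers(inputs, lower_layers)
        return self._cache[doc_id]


def run_task(doc_id: str, inputs: Vector, store: EmbeddingStore,
             lower_layers: List[Layer], upper_layers: List[Layer]) -> Vector:
    """Run a task: fetch (or compute) the intermediate representation, then apply task layers."""
    hidden = store.get_or_compute(doc_id, inputs, lower_layers)
    return run_layers(list(hidden), upper_layers)  # copy so the cached entry is never mutated


# Toy stand-ins for the shared lower half of a transformer.
lower = [lambda v: [2 * a for a in v], lambda v: [a + 1 for a in v]]
# Two different task-specific upper halves.
task_a_upper = [lambda v: [sum(v)]]
task_b_upper = [lambda v: [max(v)]]

store = EmbeddingStore()
out_a = run_task("doc-1", [1.0, 2.0], store, lower, task_a_upper)  # computes and caches
out_b = run_task("doc-1", [1.0, 2.0], store, lower, task_b_upper)  # reuses the cached value
```

In this sketch the second task never runs the lower layers for `doc-1`, so `store.misses` stays at 1 and `store.hits` becomes 1. A real system would store tensors in a persistent database rather than a Python dictionary.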