9/
This work is extremely exciting for AI safety, security, and transparency, and we hope it inspires more research on activation-based task inspection, decoding, and interpretability!
We will open-source our TaskTrack toolkit, containing the dataset, activations, and inspection tools.