2/3 The secret sauce: hidden-layer activations in wide networks live in low-dimensional subspaces! Train your wide net for a few epochs, run PCA on the activations, project the weights onto the PCA basis, and continue training to find your new state-of-the-art subnetwork.
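
A rough NumPy sketch of the PCA-and-project step, for anyone who wants to see the shapes. This is not the thread's actual code. It assumes a single hidden layer with no nonlinearity (my simplification, so the projection folds exactly into both weight matrices), a hypothetical variance threshold `var_kept`, and toy data with a planted rank-4 input subspace. The function names `pca_basis` and `project_weights` are made up for illustration, and the "train for a few epochs" and "continue training" steps are left out.

```python
import numpy as np

def pca_basis(acts, var_kept=0.9999):
    """Mean and top principal directions of activations (rows are samples)."""
    mean = acts.mean(axis=0)
    # Rows of vt are the principal directions of the centered activations.
    _, s, vt = np.linalg.svd(acts - mean, full_matrices=False)
    ratio = np.cumsum(s**2) / np.sum(s**2)
    # Smallest number of components whose cumulative variance reaches var_kept.
    k = min(int(np.searchsorted(ratio, var_kept)) + 1, len(s))
    return mean, vt[:k].T  # shapes: (width,), (width, k)

def project_weights(w_in, b_in, w_out, b_out, mean, basis):
    """Fold the PCA projection into the layers around a LINEAR hidden layer.

    With h = w_in @ x + b_in and z = basis.T @ (h - mean), the approximation
    h ~ basis @ z + mean yields a narrower network of hidden width k.
    """
    w_in_s = basis.T @ w_in            # (k, d)
    b_in_s = basis.T @ (b_in - mean)   # (k,)
    w_out_s = w_out @ basis            # (m, k)
    b_out_s = b_out + w_out @ mean     # (m,)
    return w_in_s, b_in_s, w_out_s, b_out_s

# Toy data (assumption): inputs lie in a planted 4-dimensional subspace.
rng = np.random.default_rng(0)
n_samples, d, width, m, true_rank = 512, 32, 64, 10, 4
x = rng.normal(size=(n_samples, true_rank)) @ rng.normal(size=(true_rank, d))
w_in, b_in = rng.normal(size=(width, d)), rng.normal(size=width)
w_out, b_out = rng.normal(size=(m, width)), rng.normal(size=m)

hidden = x @ w_in.T + b_in                      # wide-layer activations
mean, basis = pca_basis(hidden)
w_in_s, b_in_s, w_out_s, b_out_s = project_weights(
    w_in, b_in, w_out, b_out, mean, basis
)

full_out = hidden @ w_out.T + b_out
small_out = (x @ w_in_s.T + b_in_s) @ w_out_s.T + b_out_s
print("kept components:", basis.shape[1])
print("max output gap:", np.max(np.abs(full_out - small_out)))
```

On this toy setup the narrow network should reproduce the wide one up to floating-point error, because the activations really are low-rank. With a nonlinearity such as ReLU, only the outgoing weights (`w_out @ basis`) fold in this cleanly, which is presumably where the "continue training" step earns its keep.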