Grokfast: Accelerated Grokking by Amplifying Slow Gradients
Researchers have found a way to accelerate 'grokking' (generalization that appears only long after a model has overfit its training data), reaching generalization more than 50 times faster with minimal code changes. The technique splits the gradients into fast-varying and slow-varying components during training and amplifies the slow one, and it is reported to work across various data types. #AI #machinelearningdeveloper
Code: github.com/ironjr/grokfast
Paper: arxiv.org/abs/2405.20233
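
To make "amplifying slow gradients" concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the code from the linked repository: the function name `amplify_slow_gradients`, the use of an exponential moving average as the low-pass filter, and the values `alpha=0.98` and `lamb=2.0` are all choices made for this example.

```python
def amplify_slow_gradients(grads, ema=None, alpha=0.98, lamb=2.0):
    """Add an amplified low-pass (slow) component to each gradient.

    grads: list of floats, the current gradient for each parameter.
    ema:   running average from the previous step, or None on the first step.
    alpha: smoothing factor of the moving average (assumed value).
    lamb:  how strongly the slow component is amplified (assumed value).
    Returns (filtered_grads, new_ema).
    """
    if ema is None:
        # First step: start the running average at the current gradient.
        new_ema = list(grads)
    else:
        # Exponential moving average acts as a simple low-pass filter.
        new_ema = [alpha * m + (1.0 - alpha) * g for m, g in zip(ema, grads)]
    # Keep the raw gradient and add the amplified slow component on top.
    filtered = [g + lamb * m for g, m in zip(grads, new_ema)]
    return filtered, new_ema


if __name__ == "__main__":
    # Hypothetical gradients: the first parameter has a steady gradient,
    # the second one flips sign every step (fast-varying noise).
    ema = None
    for step in range(4):
        raw = [1.0, 1.0 if step % 2 == 0 else -1.0]
        filtered, ema = amplify_slow_gradients(raw, ema)
        print(step, filtered)
```

In a training loop, a filter like this would sit between the backward pass and the optimizer step, so the optimizer sees the filtered gradients instead of the raw ones. The steady gradient is boosted, while the sign-flipping one receives a boost that does not follow its flips.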