Aggregated Momentum: Stability Through Passive Damping
We introduce a simple variant of momentum optimization that can outperform classical momentum, Nesterov momentum, and Adam on deep learning tasks with minimal hyperparameter tuning.
openreview.net