SGDR: Stochastic Gradient Descent with Warm Restarts
Authors | Ilya Loshchilov, Frank Hutter |
Journal/Conference Name | 5th International Conference on Learning Representations, ICLR 2017 - Conference Track Proceedings |
Paper Category | Artificial Intelligence |
Paper Abstract | Restart techniques are common in gradient-free optimization to deal with multimodal functions. Partial warm restarts are also gaining popularity in gradient-based optimization to improve the rate of convergence in accelerated gradient schemes to deal with ill-conditioned functions. In this paper, we propose a simple warm restart technique for stochastic gradient descent to improve its anytime performance when training deep neural networks. We empirically study its performance on the CIFAR-10 and CIFAR-100 datasets, where we demonstrate new state-of-the-art results at 3.14% and 16.21%, respectively. We also demonstrate its advantages on a dataset of EEG recordings and on a downsampled version of the ImageNet dataset. Our source code is available at https://github.com/loshchil/SGDR |
Date of Publication | 2016 |
Code Programming Language | Multiple |
Comment |
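
The warm-restart technique summarized in the abstract periodically resets SGD's learning rate to its upper bound and decays it with a cosine between restarts. Below is a minimal Python sketch of that cosine-annealed schedule with warm restarts; the parameter names eta_min, eta_max, T_0 and T_mult follow the paper's notation, but the function name and the default values are illustrative assumptions, not taken from the authors' repository.

```python
import math

def sgdr_learning_rate(epoch, eta_min=0.0, eta_max=0.1, T_0=10, T_mult=2):
    """Cosine-annealed learning rate with warm restarts (illustrative sketch).

    epoch   -- epochs elapsed since training began (may be fractional)
    eta_min -- lower bound of the learning rate
    eta_max -- upper bound, restored at every restart
    T_0     -- length of the first run, in epochs (assumed default)
    T_mult  -- factor by which each successive run is lengthened (assumed default)
    """
    # Locate the current run: T_i is its length, T_cur the epochs elapsed within it.
    T_i, T_cur = T_0, epoch
    while T_cur >= T_i:
        T_cur -= T_i
        T_i *= T_mult
    # Cosine annealing from eta_max down to eta_min over the current run.
    return eta_min + 0.5 * (eta_max - eta_min) * (1 + math.cos(math.pi * T_cur / T_i))

# Example: the rate starts at eta_max and approaches eta_min just before a restart.
# sgdr_learning_rate(0)    -> 0.1
# sgdr_learning_rate(9.9)  -> ~0.0 (then jumps back to 0.1 at epoch 10)
```

Most deep learning frameworks now ship an equivalent scheduler (for example, PyTorch's torch.optim.lr_scheduler.CosineAnnealingWarmRestarts), which can be used instead of hand-rolling the function above.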