Appropriate Learning Rates of Adaptive Learning Rate Optimization Algorithms for Training Deep Neural Networks
Overview
This article deals with nonconvex stochastic optimization problems arising in deep learning. It provides theoretically justified learning rates for adaptive-learning-rate optimization algorithms (e.g., Adam and AMSGrad) to approximate stationary points of such problems, and it shows that these rates yield faster convergence than previously reported for these algorithms. The algorithms are then evaluated in numerical experiments on text and image classification, where they perform better with constant learning rates than with diminishing learning rates. A minimal illustrative sketch of the constant-versus-diminishing comparison is given below.
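The following sketch is not the paper's experimental setup; it only illustrates, under the assumption that PyTorch is available, how Adam/AMSGrad with a constant learning rate can be compared against the same optimizer with a diminishing 1/sqrt(k) schedule on a toy classification task. The model, data, and learning-rate values are placeholders chosen for illustration.

```python
# Minimal sketch (illustrative only, not the paper's experiments), assuming PyTorch.
# Compares AMSGrad with a constant learning rate against a diminishing 1/sqrt(k) schedule.
import math
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy data: 512 samples, 20 features, 3 classes (stand-in for text/image features).
X = torch.randn(512, 20)
y = torch.randint(0, 3, (512,))

def train(constant_lr: bool, lr: float = 1e-3, steps: int = 200) -> float:
    model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 3))
    # amsgrad=True switches Adam to the AMSGrad variant.
    opt = torch.optim.Adam(model.parameters(), lr=lr, amsgrad=True)
    # Diminishing schedule: lr_k = lr / sqrt(k + 1) (an illustrative choice).
    sched = None
    if not constant_lr:
        sched = torch.optim.lr_scheduler.LambdaLR(
            opt, lr_lambda=lambda k: 1.0 / math.sqrt(k + 1)
        )
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        opt.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()
        opt.step()
        if sched is not None:
            sched.step()
    return loss.item()

print("constant lr   :", train(constant_lr=True))
print("diminishing lr:", train(constant_lr=False))
```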