ePrints@IISc

Hessian-based Bounds on Learning Rate for Gradient Descent Algorithms

Gowgi, P and Garani, SS (2020) Hessian-based Bounds on Learning Rate for Gradient Descent Algorithms. In: 2020 International Joint Conference on Neural Networks, IJCNN 2020, 19-24 Jul 2020, Glasgow, United Kingdom.

PDF: IJCNN_2020.pdf - Published Version (restricted to registered users only; 1MB)
Official URL: http://doi.org/10.1109/IJCNN48605.2020.9207074

Abstract

The learning rate is a crucial parameter governing the convergence of any learning algorithm. Most learning algorithms based on the stochastic gradient descent (SGD) method depend on a heuristic choice of learning rate. In this paper, we derive bounds on the learning rate of SGD-based adaptive learning algorithms from first principles, by analyzing the largest eigenvalue of the Hessian matrix. The proposed approach is analytical. To illustrate its efficacy, we consider several high-dimensional data sets, compare the rate of convergence of the error for the neural gas algorithm, and show that the proposed bounds on the learning rate yield faster convergence than the AdaDec, Adam, and AdaDelta approaches, which require hyper-parameter tuning. © 2020 IEEE.
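Illustration (not from the paper): the general idea behind eigenvalue-based learning-rate bounds can be sketched with the classical stability condition for gradient descent on a quadratic loss, eta < 2 / lambda_max(H), where H is the Hessian. The Python sketch below estimates lambda_max by power iteration using only Hessian-vector products and then runs gradient descent just inside that bound. The synthetic problem and all names are illustrative assumptions, not the authors' method or code.

    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic positive-definite Hessian A and target b (illustrative only).
    # Loss: f(w) = 0.5 * w^T A w - b^T w, whose Hessian is A.
    M = rng.standard_normal((50, 50))
    A = M.T @ M + np.eye(50)
    b = rng.standard_normal(50)

    def grad(w):
        # Gradient of the quadratic loss above.
        return A @ w - b

    def lambda_max(hvp, dim, iters=100):
        # Estimate the largest Hessian eigenvalue by power iteration,
        # using only Hessian-vector products.
        v = rng.standard_normal(dim)
        v /= np.linalg.norm(v)
        for _ in range(iters):
            hv = hvp(v)
            v = hv / np.linalg.norm(hv)
        return float(v @ hvp(v))

    lam = lambda_max(lambda v: A @ v, dim=50)
    eta = 1.9 / lam  # stay strictly inside the 2 / lambda_max stability bound

    w = np.zeros(50)
    for _ in range(500):
        w -= eta * grad(w)

    print(f"lambda_max ~ {lam:.3f}, eta = {eta:.5f}, "
          f"gradient norm = {np.linalg.norm(grad(w)):.2e}")

For non-quadratic losses this bound is only local, and the paper's analysis of SGD-based adaptive algorithms goes beyond this toy setting.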

Item Type: Conference Paper
Publication: Proceedings of the International Joint Conference on Neural Networks
Publisher: Institute of Electrical and Electronics Engineers Inc.
Additional Information: The copyright of this paper belongs to the Institute of Electrical and Electronics Engineers Inc.
Keywords: Approximation theory; Clustering algorithms; Eigenvalues and eigenfunctions; Gradient methods; Heuristic algorithms; Heuristic methods; Matrix algebra; Neural networks; Parameter estimation; Stochastic systems, Adaptive learning algorithm; Analytical approach; Gradient descent algorithms; High dimensional data; Largest eigenvalues; Neural gas algorithms; Rate of convergence; Stochastic gradient descent, Learning algorithms
Department/Centre: Division of Electrical Sciences > Electronic Systems Engineering (Formerly Centre for Electronic Design & Technology)
Date Deposited: 22 Dec 2021 10:31
Last Modified: 22 Dec 2021 10:31
URI: http://eprints.iisc.ac.in/id/eprint/67405
