Title
Investigating the Gradient Descent of Neural Networks at the Edge of Stability
Subject
Computer Science
Creator
Jonathan Auton
Date
2024
Contributor
Ranko Lazic, Matthias Englert
Abstract
Artificial neural networks are a type of self-learning computer algorithm that has become central to the development of modern AI systems. The most widely used learning technique is gradient descent, a simple yet effective algorithm that iteratively improves a network by repeatedly adjusting its parameters in a direction of improving accuracy. However, recent findings suggest that, in practice, the step size cannot be made small enough to avoid the effects of iterative instability. As a result, the learning process tends to become chaotic and unpredictable. What is fascinating is that, despite this chaos, gradient descent still finds effective solutions. My project seeks to develop an understanding of the mechanisms underlying this paradoxically effective chaotic behaviour.
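(The following is an illustrative sketch, not part of the original project: for gradient descent on a one-dimensional quadratic loss f(w) = (lam/2) w^2 with curvature lam, the update w <- w - eta*lam*w is stable only when eta < 2/lam; beyond that threshold the iterates oscillate and grow, the iterative instability the abstract refers to.)

import numpy as np

def gradient_descent(w0, lam, eta, steps):
    """Run plain gradient descent on f(w) = (lam/2) * w**2, recording the loss."""
    w = w0
    losses = []
    for _ in range(steps):
        losses.append(0.5 * lam * w ** 2)
        w -= eta * lam * w  # gradient of f is lam * w
    return np.array(losses)

lam = 4.0  # curvature of the quadratic; stability threshold is 2/lam = 0.5
stable = gradient_descent(1.0, lam, eta=0.4, steps=10)    # eta below 2/lam
unstable = gradient_descent(1.0, lam, eta=0.6, steps=10)  # eta above 2/lam

print("stable final loss:  ", stable[-1])    # decays toward zero
print("unstable final loss:", unstable[-1])  # grows without bound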
Citation
Jonathan Auton, “Investigating the Gradient Descent of Neural Networks at the Edge of Stability,” URSS SHOWCASE, accessed November 23, 2024, https://urss.warwick.ac.uk/items/show/661.