Title

Investigating the Gradient Descent of Neural Networks at the Edge of Stability

Subject

Computer Science

Creator

Jonathan Auton

Date

2024

Contributor

Ranko Lazic, Matthias Englert

Abstract

Artificial neural networks are self-learning computer algorithms that have become central to the development of modern AI systems. The most widely used training technique is gradient descent, a simple yet effective algorithm that improves a network iteratively, repeatedly adjusting its parameters in the direction that most reduces error. However, recent findings suggest that the step size cannot be made small enough to avoid the effects of iterative instability: the sharpness of the loss landscape grows during training until it reaches the stability threshold set by the step size, a regime known as the edge of stability. As a result, the learning process tends to become chaotic and unpredictable. What is fascinating is that, despite this chaotic behaviour, gradient descent still finds effective solutions. My project seeks to develop an understanding of the mechanisms underlying this paradoxically effective chaos.
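
To make the stability threshold mentioned above concrete, here is a minimal sketch (my own illustration, not code from the project) of gradient descent on a one-dimensional quadratic loss f(x) = 0.5 * lam * x^2, whose curvature ("sharpness") is the constant lam. Classical analysis predicts that the update shrinks the error only while the step size eta stays below 2 / lam; the values of lam and eta below are arbitrary choices for demonstration.

```python
# Gradient descent on the toy loss f(x) = 0.5 * lam * x**2.
# The sharpness of this loss is the constant lam, so the classical
# stability threshold for the step size eta is 2 / lam.
lam = 10.0                      # sharpness; threshold is 2 / lam = 0.2

def grad(x):
    return lam * x              # derivative of 0.5 * lam * x**2

for eta in (0.05, 0.19, 0.21):  # below, just below, and above 2 / lam
    x = 1.0
    for _ in range(50):
        x -= eta * grad(x)      # the gradient descent update rule
    print(f"eta = {eta:.2f}: |x| after 50 steps = {abs(x):.3e}")
```

Running this shows |x| shrinking for the two step sizes below the threshold and exploding for the one above it. In a real network the sharpness is not a constant: the edge-of-stability observation is that it rises during training (progressive sharpening) until it hovers near the threshold set by the chosen step size, which is why the instability cannot be avoided simply by picking a smaller step.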

Meta Tags

Jonathan Auton, Ranko Lazic, Matthias Englert, neural network, computer science, DCS, machine learning, self-learning, artificial intelligence, AI, gradient descent, gradient flow, GFS, GFS sharpness, edge of stability, EoS, sharpness, stability, instability, progressive sharpening, bifurcation diagram, alignment, visualisation, isosurface

Citation

Jonathan Auton, “Investigating the Gradient Descent of Neural Networks at the Edge of Stability,” URSS SHOWCASE, accessed November 23, 2024, https://urss.warwick.ac.uk/items/show/661.