Renyuan Xu (USC): Asymptotic Analysis of Deep Residual Networks and Convergence of Gradient Descent Methods

Event Details:

Thursday, October 12, 2023
5:00pm - 6:00pm PDT

Location

Spilker 232
348 Via Pueblo
Stanford, CA 94305
United States

Abstract: Residual networks (ResNets) have displayed impressive results in pattern recognition and, recently, have garnered considerable theoretical interest due to a perceived link with neural ordinary differential equations (neural ODEs). This link relies on the convergence of network weights to a smooth function as the number of layers increases. Through detailed numerical experiments, we investigate the properties of weights trained by stochastic gradient descent and how they scale with network depth. We observe scaling regimes markedly different from those assumed in the neural ODE literature. Depending on certain features of the network architecture, such as the smoothness of the activation function, we prove the existence of an alternative ODE limit, a stochastic differential equation, or neither of these. For each case, we also derive the limit of the backpropagation dynamics and address its adaptedness issue. These findings cast doubt on the validity of the neural ODE model as an adequate asymptotic description of deep ResNets and point to an alternative class of differential equations as a better description of the deep network limit.
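To fix ideas, here is a minimal sketch of the scaling question in standard illustrative notation (not taken from the talk itself). A depth-L residual network updates its hidden state layer by layer as

\[ h_{k+1} = h_k + L^{-\beta}\, f(h_k, \theta_k), \qquad k = 0, \dots, L-1, \]

where \( f \) is the residual block, \( \theta_k \) the layer-\( k \) weights, and \( \beta > 0 \) a depth-scaling exponent. If the trained weights converge to a smooth function of layer time, \( \theta_{\lfloor tL \rfloor} \to \theta(t) \), and \( \beta = 1 \), the hidden state converges to the neural ODE \( \dot h(t) = f(h(t), \theta(t)) \). If instead the weights behave like increments of a Brownian motion (roughly, \( \beta = 1/2 \) with independent fluctuations across layers), the natural limit is a stochastic differential equation; for rougher weight behavior, neither limit need exist. The experiments described in the talk probe which of these regimes trained networks actually fall into.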

When the gradient descent method is applied to the training of ResNets, we prove that it converges linearly to a global minimum if the network is sufficiently deep and the initialization is sufficiently small. In addition, the global minimum found by gradient descent has finite quadratic variation, without the use of any regularization during training. This confirms empirical observations that gradient descent enjoys an implicit regularization property and is capable of generalizing to unseen data.
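In the same illustrative notation as above, the quadratic variation in question is the cumulative squared layer-to-layer weight increment

\[ \sum_{k=0}^{L-1} \lVert \theta_{k+1} - \theta_k \rVert^2, \]

and the claim is that this quantity remains bounded as the depth L grows, even though no explicit regularizer is imposed. Bounded quadratic variation means the weights vary slowly from one layer to the next, which is exactly the kind of regularity a continuous-depth limit requires; this is the sense in which gradient descent acts as an implicit regularizer.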

This is based on joint work with Rama Cont (Oxford), Alain Rossier (Oxford), and Alain-Sam Cohen (InstaDeep).

Bio: Renyuan Xu is a WiSE Gabilan Assistant Professor in the Epstein Department of Industrial and Systems Engineering at the University of Southern California. Before joining USC, she spent two years as a Hooke Research Fellow at the Mathematical Institute, University of Oxford, after completing her Ph.D. in the IEOR Department at UC Berkeley. Her research interests span stochastic analysis, mathematical finance, game theory, and machine learning. She is the recipient of the 2022 JP Morgan AI Research Award, and her research is also supported by the WiSE Institute at USC.
