Renyuan Xu (USC): Asymptotic Analysis of Deep Residual Networks and Convergence of Gradient Descent Methods
348 Via Pueblo
Stanford, CA 94305
Abstract: Residual networks (ResNets) have displayed impressive results in pattern recognition and, recently, have garnered considerable theoretical interest due to a perceived link with neural ordinary differential equations (neural ODEs). This link relies on the convergence of network weights to a smooth function as the number of layers increases. We investigate the properties of weights trained by stochastic gradient descent and their scaling with network depth through detailed numerical experiments. We observe the existence of scaling regimes markedly different from those assumed in the neural ODE literature. Depending on certain features of the network architecture, such as the smoothness of the activation function, we prove the existence of an alternative ODE limit, a stochastic differential equation, or neither of these. For each case, we also derive the limit of the backpropagation dynamics and address its adaptiveness issue. These findings cast doubt on the validity of the neural ODE model as an adequate asymptotic description of deep ResNets and point to an alternative class of differential equations as a better description of the deep network limit.
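The scaling regimes mentioned above can be illustrated with a minimal sketch (an assumption for exposition, not the speakers' code): a depth-L residual recursion x_{l+1} = x_l + L^(-beta) * tanh(W_l x_l), where beta = 1 with weights varying smoothly in l/L corresponds to the neural-ODE picture, while beta = 1/2 with rough weights corresponds to a diffusive (SDE-type) regime.

```python
import numpy as np

# Hypothetical residual recursion x_{l+1} = x_l + L^(-beta) * tanh(W_l @ x_l).
# beta = 1.0 mimics the scaling assumed in the neural ODE literature;
# beta = 0.5 mimics a diffusive scaling, one of the alternative regimes
# the talk discusses. The weights here are random placeholders.
rng = np.random.default_rng(0)

def resnet_forward(x, weights, beta):
    L = len(weights)
    for W in weights:
        x = x + L ** (-beta) * np.tanh(W @ x)
    return x

d, L = 4, 256
weights = [rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(L)]
x0 = np.ones(d)

out_ode = resnet_forward(x0, weights, beta=1.0)  # ODE-type scaling
out_sde = resnet_forward(x0, weights, beta=0.5)  # diffusive scaling
```

Which limit (if any) actually describes trained networks depends on how the trained weights behave with depth, which is the empirical question the talk addresses.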
When the gradient descent method is applied to the training of ResNets, we prove that it converges linearly to a global minimum if the network is sufficiently deep and the initialization is sufficiently small. In addition, the global minimum found by the gradient descent method has finite quadratic variation without using any regularization in the training. This confirms existing empirical results that the gradient descent method enjoys an implicit regularization property and is capable of generalizing to unseen data.
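The quadratic variation referred to here is the sum over layers of squared weight increments; a finite value as depth grows indicates weights that vary smoothly in depth. A minimal sketch of how one would measure it (an illustrative assumption, not the speakers' code):

```python
import numpy as np

# Quadratic variation of a weight sequence (W_1, ..., W_L) along depth:
# sum_l ||W_{l+1} - W_l||_F^2. Finiteness of this quantity in the deep
# limit is the implicit-regularization property mentioned in the abstract.
def quadratic_variation(weights):
    return sum(np.linalg.norm(W2 - W1, 'fro') ** 2
               for W1, W2 in zip(weights, weights[1:]))

# Weights varying smoothly in depth (ODE-like): increments are O(1/L),
# so the quadratic variation is O(1/L) and vanishes as L grows.
L, d = 200, 3
smooth = [np.full((d, d), np.sin(l / L)) for l in range(L)]
qv = quadratic_variation(smooth)
```

For rough (e.g. i.i.d.) weight sequences, the same quantity would blow up with depth, which is one way the different scaling regimes can be distinguished numerically.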
This is based on joint work with Rama Cont (Oxford), Alain Rossier (Oxford), and Alain-Sam Cohen (InstaDeep).
Bio: Renyuan Xu is a WiSE Gabilan Assistant Professor in the Epstein Department of Industrial and Systems Engineering at the University of Southern California. Before joining USC, she spent two years as a Hooke Research Fellow in the Mathematical Institute at the University of Oxford, having completed her Ph.D. in the IEOR Department at UC Berkeley. Renyuan's research interests broadly span stochastic analysis, mathematical finance, game theory, and machine learning. She is the recipient of the JP Morgan AI Research Award 2022, and her research is also supported by the WiSE Institute at USC.