Guan-Horng Liu - Machine Learning PhD Student - School of Aerospace Engineering

The presentation will be held in Coda C0915 Atlantic on Wednesday, June 26, at 1 PM ET.

You are also welcome to join remotely via the provided Zoom link: https://gatech.zoom.us/j/3392051118?omn=97259773696

The abstract is included below, and copies of the proposal are available upon request.

 

Committee:

  1. Dr. Evangelos Theodorou (School of Aerospace Engineering, Georgia Tech; Advisor)
  2. Dr. Molei Tao (School of Mathematics, Georgia Tech)
  3. Dr. Yao Xie (School of Industrial and Systems Engineering, Georgia Tech)
  4. Dr. Justin Romberg (School of Electrical and Computer Engineering, Georgia Tech)
  5. Dr. Arnaud Doucet (Department of Statistics, University of Oxford; Google DeepMind)

 

Large-Scale Optimization for Deep Neural Network Architecture: A Dynamical System Theory Perspective

Abstract:

Optimization of deep neural networks (DNNs) has been a driving force in the advancement of modern artificial intelligence. Despite efforts to design DNN architectures that leverage domain-specific knowledge, the development of optimization algorithms has often progressed independently of architectural innovations. This thesis investigates large-scale optimization methods that exploit the underlying structure of the deep architectures being optimized. Specifically, we demonstrate that dynamical systems and optimal control theory provide a principled foundation for characterizing optimization algorithms across a variety of deep architectures, including standard DNNs, Neural ODEs, and neural SDEs such as diffusion models and diffusion bridges.
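As a brief illustration of this viewpoint (a standard identification from the Neural ODE literature; the notation here is chosen for exposition and is not taken from the proposal), a residual layer can be read as one explicit-Euler step of a continuous-time dynamical system,

\[ x_{k+1} = x_k + f(x_k, \theta_k) \quad \longleftrightarrow \quad \frac{dx}{dt} = f\big(x(t), \theta(t)\big), \]

so that training the network amounts to choosing a control signal \(\theta(t)\) that steers the state \(x(t)\) toward a desired terminal condition.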

Optimal control, in its broadest sense, studies the principle of optimization over dynamical systems. This methodological perspective arises naturally in training neural differential equations and also applies to standard DNNs, where Backpropagation emerges as a form of approximate dynamic programming. Throughout the development, we emphasize control-theoretic components such as differential programming and the nonlinear Feynman-Kac formula, which unify existing optimization methods and extend them to a broader class of complex dynamics and problem setups that would otherwise be difficult to handle. The developed methods have been applied at scale to image generation, restoration, and translation, as well as to solving mean-field games and modeling opinion dynamics.
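To sketch how Backpropagation arises as approximate dynamic programming (a standard first-order derivation; the symbols below are illustrative rather than the proposal's own notation), training can be posed as the discrete-time optimal control problem

\[ \min_{\{\theta_k\}} \; \Phi(x_K) \quad \text{s.t.} \quad x_{k+1} = F(x_k, \theta_k), \quad x_0 = \text{input}, \]

whose Pontryagin-type optimality conditions propagate a costate backward through the layers:

\[ p_K = \nabla \Phi(x_K), \qquad p_k = \left(\frac{\partial F}{\partial x_k}\right)^{\!\top} p_{k+1}, \qquad \nabla_{\theta_k} \Phi = \left(\frac{\partial F}{\partial \theta_k}\right)^{\!\top} p_{k+1}. \]

This backward costate recursion is exactly the chain rule computed by Backpropagation, which is what lets control-theoretic tools transfer to standard DNN training.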