## Doctoral Dissertation Defense: Jared Praniewicz

#### Advisor: Dr. Florian Potra

Wednesday, September 1, 2021 · 10 AM - 12 PM

**Title:**

*Optimization Algorithms For Training Deep Neural Networks*

**Abstract**

A formal representation of a deep neural network is constructed, and it is demonstrated that networks satisfying the representation can be trained efficiently via feedforward evaluation and backpropagation. Analysis of the formal representation proves that optimization algorithms cannot have a computational complexity of less than O(|E|), due to their dependence on backpropagation. To ground the work in practice, a comparison is made of the popular optimization algorithms in use for training deep neural networks. The commonalities of the current algorithms provide a list of features to use and avoid when developing new deep learning optimization algorithms. Finally, two new optimization algorithms are developed. The first is linearized stochastic gradient descent (LSGD), a predictor-corrector method. Testing shows that LSGD achieves comparable or superior quality of fit to SGD, with quicker and more stable initial training. The second is approximate stabilized Hessian gradient descent (ASHgrad), a quasi-Newton method. ASHgrad finds high-quality critical points and trains rapidly, but is slow to compute due to limitations in current machine learning frameworks.
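To illustrate the predictor-corrector idea behind methods like LSGD, the sketch below shows a generic predictor-corrector gradient step on a toy quadratic objective. This is a hypothetical illustration of the general technique, not the dissertation's actual LSGD algorithm; the function names and the averaging corrector are assumptions for the example.

```python
import numpy as np

def predictor_corrector_step(w, grad_fn, lr):
    """One generic predictor-corrector gradient step (illustrative only,
    not the dissertation's LSGD update)."""
    g = grad_fn(w)                       # gradient at the current iterate
    w_pred = w - lr * g                  # predictor: a plain SGD step
    g_corr = grad_fn(w_pred)             # gradient at the predicted point
    return w - lr * 0.5 * (g + g_corr)   # corrector: average the two gradients

# Toy objective f(w) = ||w||^2 / 2, whose gradient is simply w.
w = np.array([1.0, -2.0])
for _ in range(50):
    w = predictor_corrector_step(w, lambda v: v, lr=0.1)
```

Averaging the gradients at the current and predicted points is the same device used by Heun's method for ODEs; it damps oscillation early in training, which is consistent with the more stable initial training reported for LSGD.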