Yasaman Bahri, Google Brain
The IACS seminar series is free and open to the public, but registration is required.
Talk Abstract: Deep neural networks are a rich class of models now used across many domains, but our theoretical understanding of how they learn and generalize remains comparatively underdeveloped. A fruitful angle of investigation has been to study deep neural networks that are also wide (having many hidden units per layer), which has given rise to foundational connections between deep networks, kernel methods, and Gaussian processes. Dr. Bahri will briefly survey her past work in this area and then focus on recent work that sheds light on regimes not captured by existing theory. She will discuss how the choice of learning rate in gradient descent divides the dynamics of deep neural networks into two classes, separated by a sharp phase transition as networks become wider. These two phases have distinct signatures that are predicted by a class of solvable models. Altogether, these findings serve as building blocks for constructing a more complete, predictive theory of deep learning.
Additional information will be posted shortly.
Institute for Applied Computational Science (IACS)