Speakers
Donald Goldfarb
Columbia University
Event Description
Because Deep Neural Networks (DNNs) have an enormous number of parameters, using the Hessian matrix, or even a full approximation to it, is prohibitive. Hence, we have proposed, and will describe in this talk, efficient and effective ways to use second-order information to train DNNs. These include diagonal, block-diagonal, and Kronecker-factored quasi-Newton and natural-gradient (Fisher information matrix) approximations, as well as the concepts of tensor normal covariance and self-concordance, which give rise to methods that often outperform first-order methods.
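To make the flavor of such structured curvature approximations concrete, below is a minimal Kronecker-factored preconditioning sketch for a single fully connected layer, in the general spirit of K-FAC-style natural-gradient methods. It is an illustration under assumed settings (the layer dimensions, damping value `lam`, and random data are all hypothetical), not the specific algorithms presented in the talk.

```python
# Minimal sketch: Kronecker-factored preconditioning for one fully
# connected layer (K-FAC-style illustration; all settings are assumed).
import numpy as np

rng = np.random.default_rng(0)
batch, d_in, d_out = 32, 8, 4

a = rng.standard_normal((batch, d_in))    # layer inputs (activations)
g = rng.standard_normal((batch, d_out))   # gradients w.r.t. layer outputs

# Gradient of the loss w.r.t. the weight matrix W (d_out x d_in).
G = g.T @ a / batch

# Kronecker factors: the curvature (Fisher) block for W is approximated
# by A (x) S, where A and S are small d_in x d_in and d_out x d_out matrices.
A = a.T @ a / batch
S = g.T @ g / batch

# Damping (assumed value) keeps the factor inverses well conditioned.
lam = 1e-3
A_inv = np.linalg.inv(A + lam * np.eye(d_in))
S_inv = np.linalg.inv(S + lam * np.eye(d_out))

# Preconditioned step: (A (x) S)^{-1} vec(G) equals vec(S^{-1} G A^{-1}),
# so two small inverses stand in for inverting the full
# (d_in * d_out) x (d_in * d_out) curvature block.
step = S_inv @ G @ A_inv
```

The point of the factorization is cost: inverting the two small factors A and S replaces inverting one (d_in · d_out) × (d_in · d_out) curvature block, which is what makes second-order information affordable at DNN scale.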