[ABI team seminar] Talk by Dr.Pratik Chaudhari on "A Picture of the Prediction Space of Deep Networks"

January 25, 2024 14:25

Abstract

This is an online seminar.
Registration is required.

Title:
A Picture of the Prediction Space of Deep Networks

Abstract:
Deep networks have many more parameters than the number of training data and can therefore overfit—and yet, they predict remarkably accurately in practice. Training such networks is a high-dimensional, large-scale and non-convex optimization problem and should be prohibitively difficult—and yet, it is quite tractable. This talk aims to shed light on these puzzles.

We will argue that deep networks generalize well because of a characteristic structure in the space of learnable tasks. The input correlation matrix for typical tasks has a “sloppy” eigenspectrum where, in addition to a few large eigenvalues, there is a large number of small eigenvalues that are distributed uniformly over a very large range. As a consequence, the Hessian and the Fisher Information Matrix of a trained network also have a sloppy eigenspectrum. Using these ideas, we will demonstrate an analytical non-vacuous PAC-Bayes generalization bound for general deep networks.

We will next develop information-geometric techniques to analyze the trajectories of the predictions of deep networks during training. By examining the underlying high-dimensional probabilistic models, we will reveal that the training process explores an effectively low dimensional manifold. Networks with a wide range of architectures, sizes, trained using different optimization methods, regularization techniques, data augmentation techniques, and weight initializations lie on the same manifold in the prediction space. We will also show that predictions of networks being trained on different tasks (e.g., different subsets of ImageNet) using different representation learning methods (e.g., supervised, meta-, semi supervised and contrastive learning) also lie on a low-dimensional manifold.

References:
1. Does the data induce capacity control in deep learning? Rubing Yang, Jialin Mao, and Pratik Chaudhari. [ICML ’22] https://arxiv.org/abs/2110.14163
2. The Training Process of Many Deep Networks Explores the Same Low-Dimensional Manifold. Jialin Mao, Itay Griniasty, Han Kheng Teoh, Rahul Ramesh, Rubing Yang, Mark K. Transtrum, James P. Sethna, and Pratik Chaudhari. [2023 arXiv preprint] https://arxiv.org/abs/2305.01604
3. A picture of the space of typical learnable tasks. Rahul Ramesh, Jialin Mao, Itay Griniasty, Rubing Yang, Han Kheng Teoh, Mark Transtrum, James P. Sethna, and Pratik Chaudhari [ICML ’23]. https://arxiv.org/abs/2210.17011

Bio:
Pratik Chaudhari is an Assistant Professor in Electrical and Systems Engineering and Computer and Information Science at the University of Pennsylvania. He is a core member of the GRASP Laboratory. From 2018-19, he was a Senior Applied Scientist at Amazon Web Services and a Postdoctoral Scholar in Computing and Mathematical Sciences at Caltech. Pratik received his PhD (2018) in Computer Science from UCLA, and his Master’s (2012) and Engineer’s (2014) degrees in Aeronautics and Astronautics from MIT. He was a part of NuTonomy Inc. (now Hyundai-Aptiv Motional) from 2014-16. He is the recipient of the Amazon Machine Learning Research Award (2020), NSF CAREER award (2022) and the Intel Rising Star Faculty Award (2022).

More Information

Date	February 16, 2024 (Fri) 10:00 - 12:00
URL	https://c5dc59ed978213830355fc8978.doorkeeper.jp/events/168841

Related Laboratories

last updated on July 1, 2025 08:26Laboratory

Adaptive Bayesian Intelligence Team

Sunday	Monday	Tuesday	Wednesday	Thursday	Friday	Saturday
		Link to the event page for the 1st	Link to the event page for the 2nd	Link to the event page for the 3rd	Link to the event page for the 4th	5th
6th	7th	8th	Link to the event page for the 9th	Link to the event page for the 10th	11th	12th
13th	14th	15th	Link to the event page for the 16th	17th	18th	19th
20th	21th	22th	23th	24th	25th	26th
27th	28th	29th	30th	31th

Center for Advanced Intelligence Project

Events