April 25, 2024 17:48


This talk will be held in a hybrid format: in person at the AIP Open Space of RIKEN AIP (Nihonbashi office) and online via Zoom. AIP Open Space: *only available to AIP researchers.

May 29, 2024: 14:30 – 15:30 (JST)

Adaptive Methods in Machine Learning and Why Adam Works so Well

Frederik Kunstner (University of British Columbia)

The success of the Adam optimizer has made it the default in settings where stochastic gradient descent (SGD) performs poorly. However, our theoretical understanding of why Adam performs better is lagging. The literature presents many competing interpretations and hypotheses, but we do not yet have a clear understanding of which (if any) captures the key problem that Adam “fixes” to outperform SGD. This talk presents empirical results that evaluate recently developed assumptions to model difficulties of modern architectures such as large language models, where a large performance gap between SGD and Adam has been observed. We isolate a key property of language problems — a large vocabulary with a heavy-tailed, unbalanced distribution of output classes — as a potential cause of this performance gap.
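To give a rough sense of the kind of imbalance the abstract refers to, the sketch below quantifies how concentrated a Zipfian distribution over output classes is. The vocabulary size (10,000) and Zipf exponent (1.0) are illustrative assumptions, not values from the talk; word frequencies in natural language are well known to be approximately Zipfian.

```python
import numpy as np


def zipf_distribution(vocab_size: int, exponent: float = 1.0) -> np.ndarray:
    """Normalized Zipfian probabilities over `vocab_size` classes."""
    ranks = np.arange(1, vocab_size + 1)
    weights = ranks ** (-exponent)
    return weights / weights.sum()


p = zipf_distribution(10_000)

# How much probability mass sits in the head vs. the tail of the
# class distribution.
head_mass = p[:100].sum()    # most frequent 1% of classes
tail_mass = p[5_000:].sum()  # least frequent half of classes
print(f"top 100 classes:     {head_mass:.2f} of probability mass")
print(f"bottom 5000 classes: {tail_mass:.2f} of probability mass")
```

Under these assumptions the most frequent 1% of classes capture over half of the probability mass, while the rarer half of the vocabulary captures well under 10% — the sort of unbalanced output distribution the talk proposes as a source of the SGD/Adam gap.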

Frederik Kunstner is a 5th-year PhD student at the University of British Columbia, working with Mark Schmidt. His work is at the intersection of the theory of optimization methods and their application to machine learning, focusing on modeling the difficulties involved in training modern models. Prior to his PhD, Frederik studied at EPFL in Switzerland and had the opportunity to intern at the RIKEN Center for Advanced Intelligence Project in Japan with Emtiyaz Khan and at the Max Planck Institute for Intelligent Systems in Germany with Philipp Hennig.

More Information

Date: May 29, 2024 (Wed) 14:30 – 15:30
URL: https://c5dc59ed978213830355fc8978.doorkeeper.jp/events/172656

last updated on June 13, 2024 10:33