September 20, 2023 18:05


This talk will be held in a hybrid format, both in person at the Open Space of RIKEN AIP (Nihonbashi office) and online via Zoom. The Open Space is available only to AIP researchers.

Title: Demystifying Attention Mechanism in Multi-layer Transformer and its application to Better Inference of Large Language Models (LLMs)

Speaker: Dr. Yuandong Tian, Meta AI Research (FAIR)

Abstract: Large Language Models (LLMs) have demonstrated remarkable efficacy across diverse applications, with the multi-layer Transformer architecture and self-attention playing a pivotal role. In this talk, we analyze the training dynamics of self-attention in 1-layer and multi-layer Transformers in a mathematically rigorous manner. The analysis characterizes how self-attention evolves during training and how tokens are composed to form high-level latent patterns. Our theoretical insights are corroborated by extensive experimental evidence. Notably, one property called “contextual sparsity” enables us to develop novel approaches such as Deja Vu and H2O that substantially accelerate LLM inference. Finally, further study of the attention behavior yields positional interpolation (PI), which extends the context window of pre-trained models with very few fine-tuning steps.

Bio: Yuandong Tian is a Research Scientist and Senior Manager at Meta AI Research (FAIR), working on reinforcement learning, optimization and understanding of neural networks. He has been the project lead for story generation (2023) and the OpenGo project (2018). He is the first-author recipient of the 2021 ICML Outstanding Paper Honorable Mention and the 2013 ICCV Marr Prize Honorable Mention, and also received the 2022 CGO Distinguished Paper Award. Previously, he worked on the Google Self-driving Car team in 2013-2014 and received a Ph.D. from the Robotics Institute, Carnegie Mellon University, in 2013. He has been appointed area chair for NeurIPS, ICML, AAAI and AISTATS.

More Information

Date: October 6, 2023 (Fri) 11:00 - 12:00