[Tensor Learning Team Seminar] Talk by Dr. Yuandong Tian, Meta AI Research (FAIR)

September 20, 2023 18:05

Abstract

This talk will be held in a hybrid format, both in person at the Open Space of RIKEN AIP (Nihonbashi office) and online via Zoom. The Open Space is available only to AIP researchers.

Title: Demystifying Attention Mechanism in Multi-layer Transformer and its application to Better Inference of Large Language Models (LLMs)

Speaker: Dr. Yuandong Tian, Meta AI Research (FAIR), https://yuandong-tian.com

Abstract: Large Language Models (LLMs) have demonstrated remarkable efficacy across diverse applications, with the multi-layer Transformer architecture and self-attention playing a pivotal role. In this talk, we analyze the training dynamics of self-attention in 1-layer and multi-layer Transformers in a mathematically rigorous manner. This analysis characterizes the training dynamics of self-attention and how tokens are composed to form high-level latent patterns. Our theoretical insights are corroborated by extensive experimental evidence. Notably, one property called “contextual sparsity” enables us to develop novel approaches such as Deja Vu and H2O that substantially accelerate LLM inference. Finally, further study of the attention behavior yields positional interpolation (PI), which extends the context window of pre-trained models with very few fine-tuning steps.

Bio: Yuandong Tian is a Research Scientist and Senior Manager at Meta AI Research (FAIR), working on reinforcement learning, optimization, and the understanding of neural networks. He has been the project lead for story generation (2023) and the OpenGo project (2018). He is the first-author recipient of a 2021 ICML Outstanding Paper Honorable Mention and a 2013 ICCV Marr Prize Honorable Mention, and also received the 2022 CGO Distinguished Paper Award. Prior to that, he worked on the Google Self-driving Car team in 2013-2014 and received a Ph.D. from the Robotics Institute, Carnegie Mellon University, in 2013. He has served as an area chair for NeurIPS, ICML, AAAI, and AISTATS.

More Information

Date	October 6, 2023 (Fri) 11:00 - 12:00
URL	https://c5dc59ed978213830355fc8978.doorkeeper.jp/events/163816
