Tensor Learning Team Seminar 20231006

October 12, 2023 14:05

Description

[Tensor Learning Team Seminar]
Date and Time: Oct. 6th, 2023, 11:00 a.m. – 12:00 p.m. (JST)
Venue: RIKEN AIP (Nihonbashi office) and online by Zoom.
Language: English

Speaker: Dr. Yuandong Tian, Meta AI Research (FAIR)
Title: Demystifying Attention Mechanism in Multi-layer Transformer and its application to Better Inference of Large Language Models (LLMs)

Abstract: Large Language Models (LLMs) have demonstrated remarkable efficacy across diverse applications, with the multi-layer Transformer architecture and self-attention playing a pivotal role. In this talk, we analyze the training dynamics of self-attention in 1-layer and multi-layer Transformers in a mathematically rigorous manner. This analysis characterizes how self-attention evolves during training and how tokens are composed to form high-level latent patterns. Our theoretical insights are corroborated by extensive experimental evidence. Notably, one property called “contextual sparsity” enables us to develop novel approaches such as Deja Vu and H2O that substantially accelerate LLM inference. Finally, further study of the attention behavior yields positional interpolation (PI), which extends the context window of pre-trained models with very few fine-tuning steps.

Bio: Yuandong Tian is a Research Scientist and Senior Manager at Meta AI Research (FAIR), working on reinforcement learning, optimization, and the understanding of neural networks. He has been the project lead for the story generation (2023) and OpenGo (2018) projects. He is the first-author recipient of a 2021 ICML Outstanding Paper Honorable Mention and a 2013 ICCV Marr Prize Honorable Mention, and also received the 2022 CGO Distinguished Paper Award. Prior to that, he worked on the Google Self-driving Car team in 2013-2014 and received a Ph.D. from the Robotics Institute, Carnegie Mellon University, in 2013. He has served as an area chair for NeurIPS, ICML, AAAI and AISTATS.


Center for Advanced Intelligence Project
