Talk by Mr. Yao-Hung Hubert Tsai (CMU)

November 27, 2019 09:12

Abstract

Speaker: Yao-Hung Hubert Tsai (Carnegie Mellon University)
https://yaohungt.github.io/

Title: Transformer Dissection: An Unified Understanding for Transformer’s Attention via the Lens of Kernel (EMNLP 2019)

The Transformer is a powerful architecture that achieves superior performance on various sequence learning tasks, including neural machine translation, language understanding, and sequence prediction. At the core of the Transformer is the attention mechanism, which concurrently processes all inputs in the streams. In this paper, we present a new formulation of attention via the lens of the kernel. To be more precise, we observe that attention can be seen as applying a kernel smoother over the inputs, with the kernel scores being the similarities between inputs. This new formulation gives us a better way to understand individual components of the Transformer’s attention, such as a better way to integrate the positional embedding. Another important advantage of our kernel-based formulation is that it opens up a larger design space for composing the Transformer’s attention. As an example, we propose a new variant of the Transformer’s attention which models the input as a product of symmetric kernels. This approach achieves performance competitive with the current state-of-the-art model while requiring less computation. In our experiments, we empirically study different kernel construction strategies on two widely used tasks: neural machine translation and sequence prediction.
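To make the kernel-smoother view in the abstract concrete, the following is a minimal sketch rather than the paper's implementation. It assumes an exponential kernel on query and key vectors that have already been projected, and it omits the learned projection matrices and positional embeddings. All function names (`exp_kernel`, `kernel_smoother_attention`, `softmax_attention`) and the toy vectors are hypothetical and chosen only for illustration.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def exp_kernel(q, k):
    """Assumed kernel: exp(<q, k> / sqrt(d)), a similarity score between q and k."""
    return math.exp(dot(q, k) / math.sqrt(len(q)))


def kernel_smoother_attention(query, keys, values, kernel=exp_kernel):
    """Kernel smoother: average the values, weighted by normalized kernel scores."""
    scores = [kernel(query, k) for k in keys]
    total = sum(scores)
    weights = [s / total for s in scores]
    dim = len(values[0])
    output = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
    return output, weights


def softmax_attention(query, keys, values):
    """Scaled dot-product attention for a single query, used as a reference."""
    logits = [dot(query, k) / math.sqrt(len(query)) for k in keys]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - peak) for l in logits]
    norm = sum(exps)
    weights = [e / norm for e in exps]
    dim = len(values[0])
    output = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
    return output, weights


# Toy inputs (hypothetical): one query attending over three key/value pairs.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

out_kernel, w_kernel = kernel_smoother_attention(query, keys, values)
out_softmax, w_softmax = softmax_attention(query, keys, values)
```

Under this assumed kernel, the two functions return the same weights and outputs up to floating-point error, which is the sense in which softmax attention is one instance of a kernel smoother. Passing a different `kernel` function to `kernel_smoother_attention` is one simple way to experiment with other kernel choices.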

More Information

Date: November 29, 2019 (Fri) 15:00 - 15:45
URL: https://c5dc59ed978213830355fc8978.doorkeeper.jp/events/100989

Venue

Artificial Intelligence Research Unit, Graduate School of Informatics, Kyoto University, Yoshida Honmachi, Sakyo-ku, Kyoto, 606-8501, Japan

Center for Advanced Intelligence Project
