Talk by Mr. Yao-Hung Hubert Tsai (CMU)

2019/11/27 09:12

Abstract

Speaker: Yao-Hung Hubert Tsai (Carnegie Mellon University)
https://yaohungt.github.io/

Title: Transformer Dissection: An Unified Understanding for Transformer’s Attention via the Lens of Kernel (EMNLP 2019)

The Transformer is a powerful architecture that achieves superior performance on various sequence learning tasks, including neural machine translation, language understanding, and sequence prediction. At the core of the Transformer is the attention mechanism, which concurrently processes all inputs in the streams. In this paper, we present a new formulation of attention via the lens of the kernel. More precisely, we observe that attention can be seen as applying a kernel smoother over the inputs, with the kernel scores being the similarities between inputs. This new formulation gives us a better way to understand individual components of the Transformer's attention, such as how best to integrate the positional embedding. Another important advantage of our kernel-based formulation is that it opens up a larger space of ways to compose the Transformer's attention. As an example, we propose a new variant of the Transformer's attention which models the input as a product of symmetric kernels. This approach achieves performance competitive with the current state-of-the-art model while requiring less computation. In our experiments, we empirically study different kernel construction strategies on two widely used tasks: neural machine translation and sequence prediction.
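
To make the kernel-smoother reading in the abstract concrete, here is a minimal sketch in plain Python. It is not the speaker's implementation. It assumes an exponentiated scaled dot product as the kernel and omits the learned query, key, and value projections. The names `exp_kernel`, `kernel_smoother`, `softmax_attention`, and `product_kernel` are hypothetical and chosen for illustration only. Under these assumptions, the kernel smoother and standard softmax attention give the same output for a single query. The `product_kernel` helper is one possible reading of "a product of symmetric kernels" over (feature, position) pairs.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def exp_kernel(q, k):
    """Assumed kernel: exp(<q, k> / sqrt(d)). Symmetric here because no
    separate query/key projections are applied."""
    return math.exp(dot(q, k) / math.sqrt(len(q)))


def kernel_smoother(query, keys, values, kernel):
    """Average of the values, weighted in proportion to kernel(query, key)."""
    scores = [kernel(query, k) for k in keys]
    total = sum(scores)
    dim = len(values[0])
    return [sum(s * v[d] for s, v in zip(scores, values)) / total
            for d in range(dim)]


def softmax_attention(query, keys, values):
    """Scaled dot-product attention for one query, written the usual way."""
    logits = [dot(query, k) / math.sqrt(len(query)) for k in keys]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    dim = len(values[0])
    return [sum((e / z) * v[d] for e, v in zip(exps, values))
            for d in range(dim)]


def product_kernel(feature_kernel, position_kernel):
    """Hypothetical helper: a kernel on (feature, position) pairs formed as
    the product of a feature kernel and a positional kernel."""
    def k(a, b):
        return feature_kernel(a[0], b[0]) * position_kernel(a[1], b[1])
    return k


# Toy inputs (made up for illustration).
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

smoothed = kernel_smoother(query, keys, values, exp_kernel)
attended = softmax_attention(query, keys, values)
print(smoothed)
print(attended)
```

The two printed vectors agree up to floating-point error, which is the point of the kernel view: swapping `exp_kernel` for another kernel, or for `product_kernel(...)` applied to (feature, position) pairs, changes the attention variant without changing `kernel_smoother` itself.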

Details

Date and time	2019/11/29 (Fri) 15:00 - 15:45
URL	https://c5dc59ed978213830355fc8978.doorkeeper.jp/events/100989

Venue

Artificial Intelligence Research Unit, Graduate School of Informatics, Kyoto University, Yoshida Honmachi, Sakyo-ku, Kyoto, 606-8501, Japan

Center for Advanced Intelligence Project
