Summary
This event will be held both in person at the open space and online via Zoom.
Title: A Tensor Perspective on Second-Order RNNs & Simulating Weighted Automata with Transformers
Abstract: After providing a brief overview of the research conducted by my group at Mila and Université de Montréal, I will present two recent works related to the formal analysis of sequence models. I will first introduce two families of models that are closely related: second-order RNNs and weighted automata.
Next, I will present recent results that analyze the ability of transformers to simulate the computations of weighted automata. These results attempt to shed light, from a formal language perspective, on the apparent capacity of transformer models to learn and perform abstract reasoning.
In the second part of the talk, I will provide a formal analysis of how tensor decomposition can be used to compress the parameters of second-order RNNs and how this compression affects their capacity. Specifically, our results show how the rank of the decomposition interacts with the hidden size to control the model’s capacity.
This talk is based on the following publications from my group this year:
Rizvi-Martel, Michael, Maude Lizaire, Clara Lacroce, and Guillaume Rabusseau. “Simulating Weighted Automata Over Sequences and Trees with Transformers.” In International Conference on Artificial Intelligence and Statistics, pp. 2368-2376. PMLR, 2024.
Lizaire, Maude, Michael Rizvi-Martel, Marawan Gamal, and Guillaume Rabusseau. “A Tensor Decomposition Perspective on Second-Order RNNs.” In Forty-First International Conference on Machine Learning, 2024.
Bio:
Guillaume Rabusseau is an Associate Professor in the Department of Computer Science and Operations Research (DIRO) at Université de Montréal, a core member of Mila, and a Canada CIFAR AI Chair holder.
His research focuses on the intersection of machine learning, theoretical computer science, and multilinear algebra. Specifically, he explores the connections between tensors and machine learning, developing efficient learning schemes for structured data by leveraging linear and multilinear algebra.
His broader research interests include tensor decomposition techniques, the use of tensor networks for machine learning, quantum machine learning, kernel methods, (weighted) automata theory, and nonlinear computational models for strings, trees, and graphs.
Details
Date and time | 2024/10/25 (Fri) 14:00 - 15:00 |
URL | https://c5dc59ed978213830355fc8978.doorkeeper.jp/events/178709 |