Speaker: Shaojie Bai (Carnegie Mellon University)
Title: Deep Equilibrium Models: One “Implicit” Layer is All You Need (NeurIPS 2019, spotlight oral)
Abstract: Deep learning has long focused on hierarchies of representations, which are usually learned by adding layers (i.e., depth) to increase both a model's complexity and expressivity. In this work, we revisit and argue for an alternative perspective, in which the model consists of a single layer whose output is defined implicitly. We show how this one-layer model is equivalent to an infinite-depth model, and how it reshapes our view of deep learning through the very concepts of equilibria and dynamical systems. Specifically, we introduce the deep equilibrium (DEQ) model, and discuss how we can 1) solve for this implicit-depth model's equilibria directly via (black-box) quasi-Newton methods; 2) backpropagate directly through these equilibria with O(1) memory (whereas a typical deep network needs O(L) memory for L layers); and 3) theoretically analyze the universality of the DEQ model's representational power (i.e., prove that "one layer" really is all you need). Finally, we demonstrate that the DEQ approach is not predicated on any particular architectural choice, and that it scales to large, realistic, high-dimensional sequence tasks with results on par with (or better than) SOTA architectures (e.g., Transformers), despite using only a single layer and vastly improving memory efficiency (by up to 88%). This work is based on the NeurIPS 2019 paper "Deep Equilibrium Models".
Date: November 29, 2019 (Fri) 15:45 - 16:30
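The two ideas at the heart of the abstract — solving for an equilibrium z* = f(z*, x) with a root-finding routine, then backpropagating through that equilibrium via implicit differentiation rather than an unrolled stack of layers — can be sketched in a toy NumPy example. All dimensions, weight scales, and the use of plain fixed-point iteration below are illustrative assumptions; the paper itself uses Broyden's (quasi-Newton) method and vector-Jacobian products rather than explicit Jacobian matrices.

```python
import numpy as np

# Toy "deep equilibrium" layer: z* = f(z*, x), with f(z, x) = tanh(W z + U x + b).
# The sizes and weight scales here are illustrative, not from the paper.
rng = np.random.default_rng(0)
d = 4
W = rng.normal(scale=0.1, size=(d, d))  # small scale keeps f a contraction in z
U = rng.normal(size=(d, d))
b = rng.normal(size=d)

def f(z, x):
    return np.tanh(W @ z + U @ x + b)

def solve_equilibrium(x, tol=1e-12, max_iter=1000):
    """Find z* with z* = f(z*, x) by simple fixed-point iteration
    (the paper uses Broyden's method instead)."""
    z = np.zeros(d)
    for _ in range(max_iter):
        z_new = f(z, x)
        if np.linalg.norm(z_new - z) < tol:
            return z_new
        z = z_new
    return z

x = rng.normal(size=d)
z_star = solve_equilibrium(x)

# Implicit differentiation at the equilibrium (this is the O(1)-memory step:
# no intermediate layer activations are stored):
#   dz*/dx = (I - df/dz)^(-1) (df/dx), evaluated at z = z*.
pre = W @ z_star + U @ x + b
D = np.diag(1.0 - np.tanh(pre) ** 2)   # elementwise derivative of tanh
J_z = D @ W                            # df/dz at the equilibrium
J_x = D @ U                            # df/dx at the equilibrium
dzdx = np.linalg.solve(np.eye(d) - J_z, J_x)
```

The gradient `dzdx` obtained this way agrees with finite differences through the solver, which is exactly why the DEQ model never needs to store or unroll its "infinite" layers during backpropagation.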