N. Benjamin Erichson, Omri Azencot, Alejandro Queiruga, Liam Hodgkinson, Michael W. Mahoney

Research output: Contribution to conference › Paper › peer-review

24 Scopus citations


Viewing recurrent neural networks (RNNs) as continuous-time dynamical systems, we propose a recurrent unit that describes the hidden state's evolution with two parts: a well-understood linear component plus a Lipschitz nonlinearity. This particular functional form facilitates stability analysis of the long-term behavior of the recurrent unit using tools from nonlinear systems theory, which in turn enables architectural design decisions before experimentation. We obtain sufficient conditions for global stability of the recurrent unit, motivating a novel scheme for constructing hidden-to-hidden matrices. Our experiments demonstrate that the Lipschitz RNN can outperform existing recurrent units on a range of benchmark tasks, including computer vision, language modeling, and speech prediction. Finally, through Hessian-based analysis we demonstrate that our Lipschitz recurrent unit is more robust to input and parameter perturbations than other continuous-time RNNs.
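The abstract does not spell out the update rule, so the following PyTorch sketch is one plausible reading of the described unit: continuous-time dynamics dh/dt = A h + tanh(W h + U x + b) integrated with a forward-Euler step, where the hidden-to-hidden matrices are built from a symmetric/skew-symmetric decomposition with a diagonal shift. The names M_A, M_W and the hyperparameters beta, gamma, and eps are illustrative assumptions, not taken from the source.

```python
import torch
import torch.nn as nn


class LipschitzRNNCell(nn.Module):
    """Illustrative continuous-time recurrent unit (a sketch, not the
    authors' reference implementation):

        dh/dt = A h + tanh(W h + U x + b),

    integrated with one explicit Euler step of size eps per input. The
    matrices A and W are parameterized as a convex combination of
    symmetric and skew-symmetric parts minus a diagonal shift, one way
    to bias the linear component toward stable dynamics."""

    def __init__(self, input_size, hidden_size, beta=0.75, gamma=0.001, eps=0.03):
        super().__init__()
        self.beta, self.gamma, self.eps = beta, gamma, eps
        # Unconstrained parameter matrices from which A and W are built.
        self.M_A = nn.Parameter(torch.randn(hidden_size, hidden_size) / hidden_size**0.5)
        self.M_W = nn.Parameter(torch.randn(hidden_size, hidden_size) / hidden_size**0.5)
        # Input-to-hidden map U x + b.
        self.U = nn.Linear(input_size, hidden_size)

    def _stable_matrix(self, M):
        # (1 - beta) * symmetric part + beta * skew-symmetric part - gamma * I.
        sym = 0.5 * (M + M.T)
        skew = 0.5 * (M - M.T)
        eye = torch.eye(M.shape[0], device=M.device)
        return (1 - self.beta) * sym + self.beta * skew - self.gamma * eye

    def forward(self, x, h):
        A = self._stable_matrix(self.M_A)
        W = self._stable_matrix(self.M_W)
        dh = h @ A.T + torch.tanh(h @ W.T + self.U(x))  # right-hand side of the ODE
        return h + self.eps * dh                        # explicit Euler step
```

To run the cell over a sequence, initialize h to zeros and apply it step by step. Pushing beta toward 1 makes A and W nearly skew-symmetric (purely rotational dynamics), while gamma > 0 shifts their spectra toward the stable half-plane; this is the kind of structured construction of hidden-to-hidden matrices that the abstract's global-stability conditions motivate.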

Original language: English
State: Published - 1 Jan 2021
Event: 9th International Conference on Learning Representations, ICLR 2021 - Virtual, Online
Duration: 3 May 2021 - 7 May 2021


Conference: 9th International Conference on Learning Representations, ICLR 2021
City: Virtual, Online

ASJC Scopus subject areas

  • Language and Linguistics
  • Computer Science Applications
  • Education
  • Linguistics and Language


