TY - UNPB

T1 - Low-Cost Parameterizations of Deep Convolutional Neural Networks

AU - Treister, Eran

AU - Ruthotto, Lars

AU - Sharoni, Michal

AU - Zafrani, Sapir

AU - Haber, Eldad

PY - 2018

Y1 - 2018

N2 - Convolutional Neural Networks (CNNs) filter the input data using a
series of spatial convolution operators with compactly supported
stencils and point-wise nonlinearities. Commonly, the convolution
operators couple features from all channels. For wide networks, this
leads to immense computational cost in the training of and prediction
with CNNs. In this paper, we present novel ways to parameterize the
convolution more efficiently, aiming to decrease the number of
parameters in CNNs and their computational complexity. We propose new
architectures that use a sparser coupling between the channels and
thereby reduce both the number of trainable weights and the
computational cost of the CNN. Our architectures arise as new types of
residual neural network (ResNet) that can be seen as discretizations of
partial differential equations (PDEs) and thus have predictable
theoretical properties. Our first architecture involves a convolution
operator with a special sparsity structure, and is applicable to a large
class of CNNs. Next, we present an architecture that can be seen as a
discretization of a diffusion-reaction PDE, and use it with three
different convolution operators. Our experiments indicate that the
proposed architectures, although considerably reducing the number of
trainable weights, yield comparable accuracy to existing CNNs that are
fully coupled in the channel dimension.

KW - Computer Science - Numerical Analysis

KW - Mathematics - Numerical Analysis

M3 - Preprint

BT - Low-Cost Parameterizations of Deep Convolutional Neural Networks

ER -