Stochastic weight pruning and the role of regularization in shaping network structure

Yael Ziv, Jacob Goldberger, Tammy Riklin Raviv

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

The pressing need to reduce the capacity of deep neural networks has stimulated the development of network dilution methods and their analysis. In this study we present a framework for neural network pruning by sampling from a probability function that favors the zeroing of smaller parameters. This procedure of stochastically setting network weights to zero is done after each parameter updating step in the network learning algorithm. As part of the proposed framework, we examine the contribution of L1 and L2 regularization to the dynamics of pruning larger network structures such as neurons and filters while optimizing for weight pruning. We then demonstrate the effectiveness of the proposed stochastic pruning framework when used together with regularization terms for different network architectures and image analysis tasks. Specifically, we show that using our method we can successfully remove more than 50% of the channels/filters in VGG-16 and MobileNetV2 for CIFAR10 classification; in ResNet56 for CIFAR100 classification; in a U-Net for instance segmentation of biological cells; and in a CNN model tailored for COVID-19 detection. For these filter-pruned networks, we also present competitive weight pruning results while maintaining the accuracy levels of the original, dense networks.
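The procedure described in the abstract, zeroing weights at random after every update with smaller weights more likely to be zeroed, can be illustrated with a minimal sketch. The abstract does not state the probability function, so the exponential form in `prune_probability`, the `scale` parameter, the placeholder gradients, and all function names here are assumptions made for illustration only and should not be read as the authors' implementation.

```python
import math
import random


def prune_probability(w, scale=0.05):
    """Hypothetical pruning probability (an assumption, not from the paper):
    equals 1 at w == 0 and decays toward 0 as |w| grows."""
    return math.exp(-abs(w) / scale)


def stochastic_prune(weights, scale=0.05, rng=random):
    """Zero each weight independently with probability prune_probability(w)."""
    return [0.0 if rng.random() < prune_probability(w, scale) else w
            for w in weights]


def sgd_step(weights, grads, lr=0.1, l1=0.0, l2=0.0):
    """One gradient step with optional L1 and L2 penalty (sub)gradients."""
    def sign(w):
        return (w > 0) - (w < 0)
    return [w - lr * (g + l1 * sign(w) + l2 * w)
            for w, g in zip(weights, grads)]


# Toy usage: one update step followed by one stochastic pruning step.
rng = random.Random(0)
weights = [0.8, -0.01, 0.002, -0.5, 0.03]
grads = [0.0] * len(weights)  # placeholder gradients, for illustration only
weights = sgd_step(weights, grads, l1=1e-3, l2=1e-3)
weights = stochastic_prune(weights, scale=0.05, rng=rng)
```

In this sketch the L1 and L2 terms shrink weights toward zero during the update, which in turn raises their chance of being zeroed in the pruning step. That is one plausible way to read the interaction between regularization and pruning that the abstract refers to.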

Original language: English
Pages (from-to): 555-567
Number of pages: 13
Journal: Neurocomputing
Volume: 462
State: Published - 28 Oct 2021

Keywords

  • COVID-19
  • Neural network compression
  • Node pruning
  • Pruning dynamics
  • Weight decay
  • Weight pruning

ASJC Scopus subject areas

  • Computer Science Applications
  • Cognitive Neuroscience
  • Artificial Intelligence
