Learning flat representations with artificial neural networks

  • Vlad Constantinescu
  • Costin Chiru
  • Tudor Boloni
  • Adina Florea
  • Robi Tacutu

Research output: Contribution to journal › Article › peer-review

Abstract

In this paper, we propose a method of learning representation layers with squashing activation functions within a deep artificial neural network that directly addresses the vanishing gradients problem. The proposed solution is derived by solving the maximum likelihood estimator for components of the posterior representation, which are approximately Beta-distributed, formulated in the context of variational inference. This approach not only improves the performance of deep neural networks that use squashing activation functions on some of their hidden layers, including in discriminative learning, but can also be employed to produce sparse codes.
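As background for the problem the abstract targets, the vanishing gradients effect with squashing activations can be illustrated numerically. This is an illustrative sketch only, not the paper's proposed method: it shows that the sigmoid derivative is bounded by 0.25, so backpropagating through many saturating layers shrinks the gradient signal geometrically.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    # Derivative of the sigmoid: s(x) * (1 - s(x)), maximized at x = 0
    s = sigmoid(x)
    return s * (1.0 - s)

# Even in the best case (all pre-activations at 0, where the derivative
# peaks at 0.25), the chain rule multiplies one such factor per layer,
# so the gradient reaching early layers decays geometrically with depth.
depth = 20
pre_activations = np.zeros(depth)
grad_magnitude = np.prod(sigmoid_grad(pre_activations))
print(grad_magnitude)  # 0.25**20, roughly 9.1e-13
```

With realistic (nonzero) pre-activations the per-layer factors are even smaller, which is why deep stacks of squashing layers are hard to train without countermeasures such as the one the abstract describes.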

Original language: English
Pages (from-to): 2456-2470
Number of pages: 15
Journal: Applied Intelligence
Volume: 51
Issue number: 4
DOIs
State: Published - 1 Apr 2021
Externally published: Yes

Keywords

  • Beta distribution
  • Infomax
  • Learning representations
  • Vanishing gradients

ASJC Scopus subject areas

  • Artificial Intelligence
