Hierarchical Generalization Bounds for Deep Neural Networks

  • Haiyun He
  • Christina Lee Yu
  • Ziv Goldfeld

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Deep neural networks (DNNs) exhibit exceptional generalization capability in practice. This work aims to capture the effect of depth and its potential benefit for learning within the paradigm of information-theoretic generalization bounds. We derive two novel hierarchical bounds on the generalization error that explicitly depend on the internal representations within each layer. The first result is a layer-dependent generalization bound in terms of the Kullback-Leibler (KL) divergence, which shrinks as the layer index increases. The second bound, which is based on the Wasserstein distance, implies the existence of a layer that serves as a generalization funnel, that is, a layer that minimizes the generalization bound. We then specialize our bounds to the case of binary Gaussian classification and present analytic expressions that depend on the rank of the weight matrices for the KL divergence bound and on certain norms for the Wasserstein bound. Our results may provide a new perspective for understanding generalization in deep models.

Original language: English
Title of host publication: 2024 IEEE International Symposium on Information Theory, ISIT 2024 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers
Pages: 2688-2693
Number of pages: 6
ISBN (Electronic): 9798350382846
State: Published - 1 Jan 2024
Externally published: Yes
Event: 2024 IEEE International Symposium on Information Theory, ISIT 2024 - Athens, Greece
Duration: 7 Jul 2024 → 12 Jul 2024

Publication series

Name: IEEE International Symposium on Information Theory - Proceedings
ISSN (Print): 2157-8095

Conference

Conference: 2024 IEEE International Symposium on Information Theory, ISIT 2024
Country/Territory: Greece
City: Athens
Period: 7/07/24 → 12/07/24

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Information Systems
  • Modeling and Simulation
  • Applied Mathematics
