Abstract
We seek an entropy estimator for discrete distributions with fully empirical accuracy bounds. As stated, this goal is infeasible without some prior assumptions on the distribution. We discover that a certain information moment assumption renders the problem feasible. We argue that the moment assumption is natural and, in some sense, minimalistic: it is weaker than finite support or tail decay conditions. Under the moment assumption, we provide the first finite-sample entropy estimates for infinite alphabets, nearly recovering the known minimax rates. Moreover, we demonstrate that our empirical bounds are significantly sharper than the state-of-the-art bounds, for various natural distributions and non-trivial sample regimes. Along the way, we give a dimension-free analogue of the Cover-Thomas result on entropy continuity (with respect to total variation distance) for finite alphabets, which may be of independent interest. Additionally, we resolve all of the open problems posed by Jürgensen and Matthews (2010).
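The abstract does not reproduce the estimator itself. As an illustrative point of reference only, the sketch below shows a standard plug-in entropy estimate together with an empirical surrogate for an information moment of the kind the abstract alludes to; the function names, the use of a second moment, and the geometric example distribution are assumptions for illustration, not the paper's construction.

```python
import numpy as np
from collections import Counter

def plugin_entropy(samples):
    """Plug-in (maximum-likelihood) entropy estimate in nats.

    A generic baseline, not the paper's estimator."""
    n = len(samples)
    counts = np.array(list(Counter(samples).values()), dtype=float)
    p = counts / n
    return float(-(p * np.log(p)).sum())

def empirical_information_moment(samples, order=2):
    """Empirical surrogate for an information moment
    E[log(1/p(X))**order], computed from the empirical distribution.

    The precise moment condition assumed in the paper may differ;
    the choice order=2 here is purely illustrative."""
    n = len(samples)
    counts = np.array(list(Counter(samples).values()), dtype=float)
    p = counts / n
    return float((p * (-np.log(p)) ** order).sum())

# Usage: samples from a geometric distribution, an example of an
# infinite (countable) alphabet with light tails.
rng = np.random.default_rng(0)
samples = rng.geometric(p=0.3, size=10_000)
print(plugin_entropy(samples))                # entropy estimate (nats)
print(empirical_information_moment(samples))  # empirical 2nd information moment
```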
| Original language | English |
| --- | --- |
| Pages (from-to) | 3190-3202 |
| Number of pages | 13 |
| Journal | IEEE Transactions on Information Theory |
| Volume | 69 |
| Issue number | 5 |
| DOIs | |
| State | Published - 27 Dec 2023 |
Keywords
- Information
- empirical estimation
- entropy
ASJC Scopus subject areas
- Information Systems
- Computer Science Applications
- Library and Information Sciences