The Minimax Risk in Testing the Histogram of Discrete Distributions for Uniformity under Missing Ball Alternatives

Alon Kipnis

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

We study the problem of testing the goodness of fit of a discrete sample from many categories to the uniform distribution over the categories. As a class of alternative hypotheses, we consider the removal of an ℓ_p ball of radius ϵ around the uniform rate sequence for p ≤ 2. When the number of samples n and number of categories N go to infinity while ϵ is small, the minimax risk $R_\epsilon^*$ in testing based on the sample's histogram (number of absent categories, singletons, collisions, ...) asymptotes to $2\Phi\left(-n N^{2-2/p} \epsilon^2 / \sqrt{8N}\right)$, with $\Phi(x)$ the normal CDF. This characterization allows the comparison of the many estimators previously proposed for this problem at the constant level, rather than at the rate of convergence of the risk or the scaling order of the sample complexity. The minimax test mostly relies on collisions in the very small sample limit but otherwise behaves like the chi-squared test. Empirical studies over a range of problem parameters show that our estimate is accurate in finite samples and that the minimax test is significantly better than the chi-squared test or a test that only uses collisions. Our analysis relies on the asymptotic normality of histogram ordinates, the equivalence between the minimax setting and a Bayesian setting, and the characterization of the least favorable prior by reducing a multi-dimensional optimization problem to a one-dimensional problem.
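As a rough illustration of the quantities named in the abstract, the following Python sketch evaluates the stated asymptotic risk expression and builds a sample's histogram (how many categories occur exactly m times). The function names `normal_cdf`, `asymptotic_minimax_risk`, and `occupancy_histogram` are hypothetical and not taken from the paper, and the lower bound p > 0 is an assumption added for the input check; the abstract only states p ≤ 2. This is not the paper's minimax test, only a way to plug numbers into the formula.

```python
import math
from collections import Counter


def normal_cdf(x: float) -> float:
    """Standard normal CDF, computed from the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def asymptotic_minimax_risk(n: int, N: int, eps: float, p: float) -> float:
    """Evaluate 2 * Phi(-n * N^(2 - 2/p) * eps^2 / sqrt(8 N)).

    n: number of samples, N: number of categories, eps: ball radius.
    The check 0 < p <= 2 is an assumption; the abstract states only p <= 2.
    """
    if not 0 < p <= 2:
        raise ValueError("expected 0 < p <= 2")
    arg = n * N ** (2.0 - 2.0 / p) * eps ** 2 / math.sqrt(8.0 * N)
    return 2.0 * normal_cdf(-arg)


def occupancy_histogram(sample, N: int) -> dict:
    """Map m -> number of categories that occur exactly m times in the sample.

    m = 0 gives absent categories, m = 1 singletons, m >= 2 collisions.
    """
    counts = Counter(sample)           # category -> number of occurrences
    hist = Counter(counts.values())    # occurrence count m -> number of categories
    hist[0] = N - len(counts)          # categories never observed
    return dict(hist)
```

For example, `occupancy_histogram([0, 0, 1, 3], 5)` reports two absent categories, two singletons, and one category seen twice. With ϵ = 0 the risk expression evaluates to 1, and it decreases as n grows with the other parameters fixed.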

Original language: English
Title of host publication: 2023 59th Annual Allerton Conference on Communication, Control, and Computing, Allerton 2023
Publisher: Institute of Electrical and Electronics Engineers
ISBN (Electronic): 9798350328141
State: Published - 1 Jan 2023
Externally published: Yes
Event: 59th Annual Allerton Conference on Communication, Control, and Computing, Allerton 2023 - Monticello, United States
Duration: 26 Sep 2023 to 29 Sep 2023

Publication series

Name: 2023 59th Annual Allerton Conference on Communication, Control, and Computing, Allerton 2023

Conference

Conference: 59th Annual Allerton Conference on Communication, Control, and Computing, Allerton 2023
Country/Territory: United States
City: Monticello
Period: 26/09/23 to 29/09/23

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computational Theory and Mathematics
  • Computer Networks and Communications
  • Computer Science Applications
  • Computational Mathematics
  • Control and Optimization
