Nearest-Neighbor sample compression: Efficiency, consistency, infinite dimensions

Research output: Contribution to journal › Conference article › peer-review

18 Scopus citations

Abstract

We examine the Bayes-consistency of a recently proposed 1-nearest-neighbor-based multiclass learning algorithm. This algorithm is derived from sample compression bounds and enjoys the statistical advantages of tight, fully empirical generalization bounds, as well as the algorithmic advantages of a faster runtime and memory savings. We prove that this algorithm is strongly Bayes-consistent in metric spaces with finite doubling dimension, which is the first consistency result for an efficient nearest-neighbor sample compression scheme. Rather surprisingly, we discover that this algorithm continues to be Bayes-consistent even in a certain infinite-dimensional setting, in which the basic measure-theoretic conditions on which classic consistency proofs hinge are violated. This is all the more surprising, since it is known that k-NN is not Bayes-consistent in this setting. We pose several challenging open problems for future research.
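
To make the idea of nearest-neighbor sample compression concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the algorithm analyzed in the paper: the abstract does not spell out the procedure, so the greedy net construction, the majority-vote relabeling, the fixed scale `gamma`, and all function names (`build_gamma_net`, `compress`, `predict`) are hypothetical choices made here. The sketch keeps a small subset of the sample whose points are more than `gamma` apart, relabels each kept point by majority vote over the sample points closest to it, and then classifies with plain 1-NN on the compressed set.

```python
import math
from collections import Counter


def build_gamma_net(points, gamma):
    """Greedily select indices of centers (assumed construction).

    Every sample point ends up within `gamma` of some center, and any two
    centers are more than `gamma` apart.
    """
    centers = []
    for i, p in enumerate(points):
        if all(math.dist(p, points[c]) > gamma for c in centers):
            centers.append(i)
    return centers


def compress(points, labels, gamma):
    """Return a list of (center, label) pairs: the compressed sample.

    Each center receives the majority label of the sample points that are
    closer to it than to any other center.
    """
    centers = build_gamma_net(points, gamma)
    votes = [Counter() for _ in centers]
    for p, y in zip(points, labels):
        nearest = min(
            range(len(centers)),
            key=lambda k: math.dist(p, points[centers[k]]),
        )
        votes[nearest][y] += 1
    return [(points[c], v.most_common(1)[0][0]) for c, v in zip(centers, votes)]


def predict(compressed, x):
    """Classify `x` by 1-NN over the compressed (center, label) pairs."""
    return min(compressed, key=lambda pair: math.dist(x, pair[0]))[1]


# Toy data in the Euclidean plane (hypothetical): two clusters, one noisy label.
points = [(0, 0), (0.1, 0), (0, 0.1), (5, 5), (5.1, 5), (5, 5.1)]
labels = ["a", "a", "b", "c", "c", "c"]
compressed = compress(points, labels, gamma=1.0)
```

In this toy run the six sample points are reduced to two labeled centers, and the single noisy label is outvoted. The sketch fixes `gamma` by hand and uses Euclidean distance via `math.dist` (Python 3.8+); how the scale should be selected, and the bounds that justify it, are left to the paper itself.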

Original language: English
Pages (from-to): 1574-1584
Number of pages: 11
Journal: Advances in Neural Information Processing Systems
Volume: 2017-December
State: Published - 1 Jan 2017
Event: 31st Annual Conference on Neural Information Processing Systems, NIPS 2017 - Long Beach, United States
Duration: 4 Dec 2017 to 9 Dec 2017

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
  • Signal Processing
