A finite sample analysis of the Naive Bayes classifier

Research output: Contribution to journal › Article › peer-review

32 Scopus citations

Abstract

We revisit, from a statistical learning perspective, the classical decision-theoretic problem of weighted expert voting. In particular, we examine the consistency (both asymptotic and finitary) of the optimal Naive Bayes weighted majority and related rules. In the case of known expert competence levels, we give sharp error estimates for the optimal rule. We derive optimality results for our estimates and also establish some structural characterizations. When the competence levels are unknown, they must be empirically estimated. We provide frequentist and Bayesian analyses for this situation. Some of our proof techniques are non-standard and may be of independent interest. Several challenging open problems are posed, and experimental results are provided to illustrate the theory.
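As a rough illustration of the known-competence setting, the sketch below implements a weighted majority vote with log-odds weights w_i = log(p_i / (1 - p_i)), the textbook form of the Naive Bayes rule. It assumes votes in {+1, -1}, conditionally independent experts, a uniform prior over the two outcomes, and competences strictly between 0 and 1. The function names, the example competences, and the tie-break toward +1 are illustrative choices and are not taken from the paper.

```python
import math


def log_odds_weights(competences):
    """Return w_i = log(p_i / (1 - p_i)) for each competence p_i in (0, 1)."""
    return [math.log(p / (1.0 - p)) for p in competences]


def weighted_majority(votes, weights):
    """Return the sign of the weighted sum of +/-1 votes.

    Ties are broken toward +1, an arbitrary choice made for this sketch.
    """
    total = sum(w * v for w, v in zip(weights, votes))
    return 1 if total >= 0 else -1


# Hypothetical example: one strong expert and two weak ones who disagree.
competences = [0.9, 0.6, 0.6]
weights = log_odds_weights(competences)
votes = [1, -1, -1]

decision = weighted_majority(votes, weights)               # log-odds weighting
simple_decision = weighted_majority(votes, [1.0, 1.0, 1.0])  # unweighted majority
```

In this example the two rules disagree: the unweighted majority follows the two weaker experts, while the log-odds weighting follows the single expert with competence 0.9, because log(9) exceeds 2·log(1.5).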

Original language: English
Pages (from-to): 1519-1545
Number of pages: 27
Journal: Journal of Machine Learning Research
Volume: 16
State: Published - 1 Aug 2015

Keywords

  • Chernoff-Stein lemma
  • Experts
  • Hypothesis testing
  • Measure concentration
  • Naive Bayes
  • Neyman-Pearson lemma

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Software
  • Statistics and Probability
  • Artificial Intelligence
