Multi-instance learning with any hypothesis class

Sivan Sabato, Naftali Tishby

Research output: Contribution to journal › Article › peer-review

27 Scopus citations


In the supervised learning setting termed Multiple-Instance Learning (MIL), the examples are bags of instances, and the bag label is a function of the labels of its instances. Typically, this function is the Boolean OR. The learner observes a sample of bags and the bag labels, but not the instance labels that determine the bag labels. The learner is then required to emit a classification rule for bags based on the sample. MIL has numerous applications, and many heuristic algorithms have been used successfully on this problem, each adapted to specific settings or applications. In this work we provide a unified theoretical analysis for MIL, which holds for any underlying hypothesis class, regardless of a specific application or problem domain. We show that the sample complexity of MIL is only poly-logarithmically dependent on the size of the bag, for any underlying hypothesis class. In addition, we introduce a new PAC-learning algorithm for MIL, which uses a regular supervised learning algorithm as an oracle. We prove that efficient PAC-learning for MIL can be generated from any efficient non-MIL supervised learning algorithm that handles one-sided error. The computational complexity of the resulting algorithm is only polynomially dependent on the bag size.
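The OR-labeling rule and the induced bag classifier described in the abstract can be sketched as follows. This is a minimal illustration only; the threshold hypothesis `h` is a hypothetical stand-in for an arbitrary underlying hypothesis class, not the paper's construction:

```python
def bag_label(instance_labels):
    """Bag label as the Boolean OR of its (hidden) instance labels (1 = positive)."""
    return int(any(instance_labels))

def bag_classifier(h, bag):
    """Lift an instance-level hypothesis h to bags: positive iff h accepts any instance."""
    return int(any(h(x) for x in bag))

# Hypothetical instance hypothesis: a threshold on a 1-D feature.
h = lambda x: int(x > 0.5)

bags = [[0.1, 0.2, 0.9],   # contains an instance above the threshold -> bag positive
        [0.1, 0.3, 0.4]]   # all instances below the threshold -> bag negative
predictions = [bag_classifier(h, b) for b in bags]
print(predictions)  # [1, 0]
```

Note that the learner only ever sees bag-level labels such as `predictions`; the per-instance labels inside each bag remain unobserved, which is what makes the sample-complexity and reduction results nontrivial.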

Original language: English
Pages (from-to): 2999-3039
Number of pages: 41
Journal: Journal of Machine Learning Research
State: Published - 1 Oct 2012
Externally published: Yes


Keywords

  • Learning theory
  • Multiple-instance learning
  • PAC learning
  • Sample complexity
  • Supervised classification

ASJC Scopus subject areas

  • Software
  • Control and Systems Engineering
  • Statistics and Probability
  • Artificial Intelligence


