Bounds on the sample complexity for private learning and private data release

Amos Beimel, Shiva Prasad Kasiviswanathan, Kobbi Nissim

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

49 Scopus citations

Abstract

Learning is a task that generalizes many of the analyses that are applied to collections of data, and in particular, collections of sensitive individual information. Hence, it is natural to ask what can be learned while preserving individual privacy. [Kasiviswanathan, Lee, Nissim, Raskhodnikova, and Smith; FOCS 2008] initiated such a discussion. They formalized the notion of private learning, as a combination of PAC learning and differential privacy, and investigated which concept classes can be learned privately. Somewhat surprisingly, they showed that, ignoring time complexity, every PAC learning task could be performed privately with polynomially many samples, and in many natural cases this could even be done in polynomial time. While these results seem to equate non-private and private learning, there is still a significant gap: the sample complexity of (non-private) PAC learning is crisply characterized in terms of the VC-dimension of the concept class, whereas this relationship is lost in the constructions of private learners, which exhibit, generally, a higher sample complexity. Looking into this gap, we examine several private learning tasks and give tight bounds on their sample complexity. In particular, we show strong separations between sample complexities of proper and improper private learners (such separation does not exist for non-private learners), and between sample complexities of efficient and inefficient proper private learners. Our results show that VC-dimension is not the right measure for characterizing the sample complexity of proper private learning. We also examine the task of private data release (as initiated by [Blum, Ligett, and Roth; STOC 2008]), and give new lower bounds on the sample complexity. Our results show that the logarithmic dependence on the size of the instance space is essential for private data release.
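
As a rough illustration of the gap the abstract describes, the following sketch places the standard textbook bound for non-private PAC learning next to the commonly cited bound for the generic private learner. The symbols are assumptions introduced here and do not appear on this page: α is the accuracy parameter, β the failure probability, ε the privacy parameter, C the concept class, and d its VC-dimension. The point-function class is a hypothetical example chosen only to show how far log|C| can sit from the VC-dimension; the exact bounds proved in the paper should be taken from the paper itself.

```latex
% Non-private PAC learning (realizable case), with d = VCdim(C):
\[
  m_{\mathrm{PAC}}(\alpha,\beta)
  = O\!\left(\frac{d\,\log(1/\alpha) + \log(1/\beta)}{\alpha}\right).
\]

% Generic private learner (commonly stated form, assumed here),
% whose sample complexity depends on the size of the class:
\[
  m_{\mathrm{private}}(\alpha,\beta,\varepsilon)
  = O\!\left(\frac{\ln|C| + \ln(1/\beta)}{\varepsilon\,\alpha}\right).
\]

% For any finite class, VCdim(C) <= log_2 |C|, and the gap can be large.
% Illustrative example: point functions over {0,1}^n,
%   c_x(y) = 1 if y = x, and 0 otherwise.
\[
  \mathrm{VCdim}(\mathrm{POINT}_n) = 1,
  \qquad
  \ln\left|\mathrm{POINT}_n\right| = n \ln 2 .
\]
```

For such a class the first bound does not grow with n while the second grows linearly in n, which is the kind of discrepancy the abstract refers to.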

Original language: English
Title of host publication: Theory of Cryptography - 7th Theory of Cryptography Conference, TCC 2010, Proceedings
Pages: 437-454
Number of pages: 18
State: Published - 25 Mar 2010
Event: 7th Theory of Cryptography Conference, TCC 2010 - Zurich, Switzerland
Duration: 9 Feb 2010 to 11 Feb 2010

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5978 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 7th Theory of Cryptography Conference, TCC 2010
Country/Territory: Switzerland
City: Zurich
Period: 9/02/10 to 11/02/10
