Abstract
We investigate the direct-sum problem in the context of differentially private PAC learning: What is the sample complexity of solving k learning tasks simultaneously under differential privacy, and how does this cost compare to that of solving k learning tasks without privacy? In our setting, an individual example consists of a domain element x labeled by k unknown concepts (c1; : : : ; ck). The goal of a multi-learner is to output k hypotheses (h1; : : : ; hk) that generalize the input examples. Without concern for privacy, the sample complexity needed to simultaneously learn k concepts is essentially the same as needed for learning a single concept. Under differential privacy, the basic strategy of learning each hypothesis independently yields sample complexity that grows polynomially with k. For some concept classes, we give multi-learners that require fewer samples than the basic strategy. Unfortunately, however, we also give lower bounds showing that even for very simple concept classes, the sample cost of private multi-learning must grow polynomially in k.
Original language | English |
---|---|
Journal | Journal of Machine Learning Research |
Volume | 20 |
State | Published - 1 Jun 2019 |
Keywords
- Agnostic learning
- Differential privacy
- Direct-sum
- PAC learning
ASJC Scopus subject areas
- Control and Systems Engineering
- Software
- Statistics and Probability
- Artificial Intelligence