TY - GEN
T1 - List and Certificate Complexities in Replicable Learning
AU - Dixon, Peter
AU - Pavan, A.
AU - Vander Woude, Jason
AU - Vinodchandran, N. V.
N1 - Publisher Copyright:
© 2023 Neural information processing systems foundation. All rights reserved.
PY - 2023/1/1
Y1 - 2023/1/1
AB - We investigate replicable learning algorithms. Informally, a learning algorithm is replicable if it outputs the same canonical hypothesis over multiple runs with high probability, even when different runs observe a different set of samples from the unknown data distribution. In general, such a strong notion of replicability is not achievable. Thus we consider two feasible notions of replicability called list replicability and certificate replicability. Intuitively, these notions capture the degree of (non-)replicability. The goal is to design learning algorithms with optimal list and certificate complexities while minimizing the sample complexity. Our contributions are the following. - We first study the learning task of estimating the biases of d coins, up to an additive error of ε, by observing samples. For this task, we design a (d + 1)-list replicable algorithm. To complement this result, we establish that the list complexity is optimal, i.e., there is no learning algorithm with a list size smaller than d + 1 for this task. We also design learning algorithms with certificate complexity Õ(log d). The sample complexity of both these algorithms is Õ(d²/ε²), where ε is the approximation error parameter (for a constant error probability). - In the PAC model, we show that any hypothesis class that is learnable with d nonadaptive statistical queries can be learned via a (d + 1)-list replicable algorithm and also via an Õ(log d)-certificate replicable algorithm. The sample complexity of both these algorithms is Õ(d²/ν²), where ν is the approximation error of the statistical query. We also show that for the concept class d-THRESHOLD, the list complexity is exactly d + 1 with respect to the uniform distribution. To establish our upper bound results, we use rounding schemes induced by geometric partitions with certain properties. We use the Sperner/KKM Lemma to establish the lower bound results.
UR - http://www.scopus.com/inward/record.url?scp=85191165008&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85191165008
T3 - Advances in Neural Information Processing Systems
BT - Advances in Neural Information Processing Systems 36 - 37th Conference on Neural Information Processing Systems, NeurIPS 2023
A2 - Oh, A.
A2 - Naumann, T.
A2 - Globerson, A.
A2 - Saenko, K.
A2 - Hardt, M.
A2 - Levine, S.
PB - Neural information processing systems foundation
T2 - 37th Conference on Neural Information Processing Systems, NeurIPS 2023
Y2 - 10 December 2023 through 16 December 2023
ER -