TY - JOUR
T1 - Functions with average smoothness
T2 - 34th Conference on Learning Theory, COLT 2021
AU - Ashlagi, Yair
AU - Gottlieb, Lee-Ad
AU - Kontorovich, Aryeh
N1 - Funding Information:
We thank Luigi Ambrosio and Ariel Elperin for very helpful feedback on earlier attempts to define a notion of average smoothness, and Sasha Rakhlin for useful discussions. Pavel Shvartsman and Adam Oberman were very helpful in placing PMSE in proper historical context. This research was partially supported by the Israel Science Foundation (grant No. 1602/19) and an Amazon Research Award. Additional support was provided by the Ariel Cyber Innovation Center in conjunction with the Israel Cyber Directorate in the Prime Minister’s Office.
Publisher Copyright:
© 2021 Y. Ashlagi, L.-A. Gottlieb & A. Kontorovich.
PY - 2021/1/1
Y1 - 2021/1/1
AB - We initiate a program of average smoothness analysis for efficiently learning real-valued functions on metric spaces. Rather than using the Lipschitz constant as the regularizer, we define a local slope at each point and gauge the function complexity as the average of these values. Since the mean can be dramatically smaller than the maximum, this complexity measure can yield considerably sharper generalization bounds — assuming that these admit a refinement where the Lipschitz constant is replaced by our average of local slopes. In addition to the usual average, we also examine a “weak” average that is more forgiving and yields a much wider function class. Our first major contribution is to obtain just such distribution-sensitive bounds. This required overcoming a number of technical challenges, perhaps the most formidable of which was bounding the empirical covering numbers, which can be much worse-behaved than the ambient ones. Our combinatorial results are accompanied by efficient algorithms for smoothing the labels of the random sample, as well as guarantees that the extension from the sample to the whole space will continue to be, with high probability, smooth on average. Along the way we discover a surprisingly rich combinatorial and analytic structure in the function class we define.
KW - Lipschitz
KW - average case
KW - doubling dimension
KW - metric space
KW - smoothness
UR - http://www.scopus.com/inward/record.url?scp=85124565561&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85124565561
SN - 2640-3498
VL - 134
SP - 186
EP - 236
JO - Proceedings of Machine Learning Research
JF - Proceedings of Machine Learning Research
Y2 - 15 August 2021 through 19 August 2021
ER -