We establish a tight characterization of the worst-case rates for the excess risk of agnostic learning with sample compression schemes and for uniform convergence for agnostic sample compression schemes. In particular, we find that the optimal rates of convergence for size-$k$ agnostic sample compression schemes are of the form $\sqrt{\frac{k \log(n/k)}{n}}$, which contrasts with agnostic learning with classes of VC dimension $k$, where the optimal rates are of the form $\sqrt{\frac{k}{n}}$.
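To make the gap between the two rates concrete, the following is a minimal numerical sketch, not taken from the paper. It assumes the rates are $\sqrt{k \log(n/k)/n}$ and $\sqrt{k/n}$ with all constant factors dropped and the natural logarithm used; the function names `compression_rate` and `vc_rate` are hypothetical and chosen only for illustration.

```python
import math


def compression_rate(n: int, k: int) -> float:
    """Rate sqrt(k * log(n/k) / n) for size-k agnostic sample compression.

    Assumption: constants are omitted and the logarithm is natural.
    """
    if not 0 < k < n:
        raise ValueError("require 0 < k < n")
    return math.sqrt(k * math.log(n / k) / n)


def vc_rate(n: int, k: int) -> float:
    """Rate sqrt(k / n) for agnostic learning with VC dimension k.

    Assumption: constants are omitted.
    """
    if not 0 < k < n:
        raise ValueError("require 0 < k < n")
    return math.sqrt(k / n)


if __name__ == "__main__":
    k = 10
    for n in (1_000, 100_000, 10_000_000):
        c = compression_rate(n, k)
        v = vc_rate(n, k)
        # The ratio of the two rates equals sqrt(log(n/k)), so it grows with n.
        print(f"n={n:>10,d}  compression={c:.5f}  vc={v:.5f}  ratio={c / v:.3f}")
```

Under these assumptions the ratio of the two rates is exactly $\sqrt{\log(n/k)}$, so the extra logarithmic factor widens the gap slowly as $n$ grows relative to $k$.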
|Title of host publication||Proceedings of the 30th International Conference on Algorithmic Learning Theory|
|Editors||Aurélien Garivier, Satyen Kale|
|Place of Publication||Chicago, Illinois|
|Number of pages||17|
|State||Published - 1 Oct 2019|