Abstract
We provide a new proof that the expected error rate of consistent support vector machines matches the minimax rate (up to a constant factor) in its dependence on the sample size and margin. The upper bound was originally established by [1], while the lower bound follows from an argument of [2] together with reasoning about the VC dimension of large-margin classifiers. Our proof differs from the original in that many of our steps reason in the primal space, whereas the original carried out the corresponding steps in the dual space. Our approach provides a unified framework for analyzing both the homogeneous and non-homogeneous cases, with slightly better results for the former. Because our analysis handles the non-homogeneous case explicitly, it yields significantly better bounds than the usual textbook approach of reducing to the homogeneous case. We also extend our proof to give a new upper bound on the error rate of transductive SVMs, with an improved constant factor compared to inductive SVMs. In addition to these bounds on the expected error rate, we provide a simple proof of a margin-based PAC-style bound for support vector machines, and an extension of the agnostic PAC analysis that explicitly handles the non-homogeneous case.
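For orientation, the margin-based PAC-style bound mentioned above has a standard textbook form; the statement below is illustrative and not the paper's refined result, and the constant $c$ and the logarithmic factor are precisely where analyses such as this one differ. For a consistent large-margin classifier $f$ trained on $n$ i.i.d. samples lying in a ball of radius $R$ and separated with margin $\gamma$, with probability at least $1-\delta$,

$$\Pr\bigl[\,f(x) \neq y\,\bigr] \;\le\; \frac{c}{n}\left(\frac{R^{2}}{\gamma^{2}}\log^{2} n + \log\frac{1}{\delta}\right),$$

so the dependence on the sample size $n$ and the margin $\gamma$ is of order $R^{2}/(\gamma^{2} n)$ up to constant and logarithmic factors, which is the sense in which the upper bound matches the minimax rate.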
| Original language | English |
| --- | --- |
| Pages (from-to) | 99-113 |
| Number of pages | 15 |
| Journal | Theoretical Computer Science |
| Volume | 796 |
| DOIs | |
| State | Published - 3 Dec 2019 |
Keywords
- Classification
- Generalization bound
- Margin bound
- PAC learning
- Statistical learning theory
- Support vector machine
ASJC Scopus subject areas
- Theoretical Computer Science
- General Computer Science