Abstract
We present a framework for performing efficient regression in general metric spaces. Roughly speaking, our regressor predicts the value at a new point by computing an approximate Lipschitz extension (the smoothest function consistent with the observed data) after performing structural risk minimization to avoid overfitting. We obtain finite-sample risk bounds with minimal structural and noise assumptions, and a natural runtime-precision tradeoff. The offline (learning) and online (prediction) stages can be solved by convex programming, but this naive approach has runtime complexity $O(n^{3})$, which is prohibitive for large data sets. We instead design a regression algorithm whose speed and generalization performance depend on the intrinsic dimension of the data, to which the algorithm adapts. While our main innovation is algorithmic, the statistical results may also be of independent interest.
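To make the idea of a Lipschitz extension concrete, the following is a minimal sketch of the classical exact construction (the midpoint of the McShane and Whitney upper and lower extensions), not the approximate algorithm of the paper. It assumes the Lipschitz constant `L` is given and that the data are already consistent with it; the paper instead selects the smoothness level by structural risk minimization and uses faster approximate evaluation. The names `lipschitz_extension`, `dist`, and `L` are illustrative and do not come from the paper.

```python
from typing import Callable, Sequence, TypeVar

X = TypeVar("X")  # points of an arbitrary metric space


def lipschitz_extension(
    xs: Sequence[X],
    ys: Sequence[float],
    dist: Callable[[X, X], float],
    L: float,
) -> Callable[[X], float]:
    """Return a predictor built from the exact L-Lipschitz extension of the data.

    Assumption: |ys[i] - ys[j]| <= L * dist(xs[i], xs[j]) for all i, j.
    Under that assumption the predictor interpolates the sample exactly.
    Each prediction costs O(n) distance evaluations.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and of equal length")

    def predict(x: X) -> float:
        # Largest L-Lipschitz function agreeing with the data (upper envelope).
        upper = min(y + L * dist(x, xi) for xi, y in zip(xs, ys))
        # Smallest L-Lipschitz function agreeing with the data (lower envelope).
        lower = max(y - L * dist(x, xi) for xi, y in zip(xs, ys))
        # The midpoint is itself L-Lipschitz and lies between the two envelopes.
        return 0.5 * (upper + lower)

    return predict


# Usage on the real line with the absolute-value metric (illustrative data).
xs = [0.0, 1.0, 3.0]
ys = [0.0, 1.0, 1.0]
predict = lipschitz_extension(xs, ys, lambda a, b: abs(a - b), L=1.0)
print(predict(0.5), predict(2.0))
```

Only the metric `dist` touches the points themselves, so the same sketch applies to strings under edit distance or any other metric space. When the labels are noisy and violate the Lipschitz condition, the midpoint no longer interpolates the sample, which is one reason a risk-minimization step is needed before extending.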
Original language | English |
---|---|
Article number | 7944658 |
Pages (from-to) | 4838-4849 |
Number of pages | 12 |
Journal | IEEE Transactions on Information Theory |
Volume | 63 |
Issue number | 8 |
DOIs | |
State | Published - 1 Aug 2017 |
Keywords
- Regression analysis
ASJC Scopus subject areas
- Information Systems
- Computer Science Applications
- Library and Information Sciences