Abstract
A half-space over a distance space is a generalization of a half-space in a vector space. An important advantage of a distance space over a metric space is that the triangle inequality need not be satisfied, which makes our results potentially very useful in practice. Given two points in a set, a half-space is defined by them, as the set of all points closer to the first point than to the second. In this paper we consider the problem of learning half-spaces in any finite distance space, that is, any finite set equipped with a distance function. We make use of a notion of ‘width’ of a half-space at a given point: this is defined as the difference between the distances of the point to the two points that define the half-space. We obtain probabilistic bounds on the generalization error when learning half-spaces from samples. These bounds depend on the empirical error (the fraction of sample points on which the half-space does not achieve a large width) and on the VC-dimension of the effective class of half-spaces that have a large sample width. Unlike some previous work on learning classification over metric spaces, the bound does not involve the covering number of the space, and can therefore be tighter.
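The two definitions in the abstract, the half-space determined by a pair of points and its width at a point, can be illustrated with a minimal Python sketch. The function names, the nested-list distance table `D`, and the sign convention for the width (positive when the point lies on the side of the first defining point) are assumptions made for illustration; the abstract does not fix them. Ties are treated as lying outside the half-space, which is also an assumption. The example table is symmetric but deliberately violates the triangle inequality, which a distance space permits.

```python
from typing import Iterable, Sequence, Set


def half_space(points: Iterable[int], dist: Sequence[Sequence[float]],
               a: int, b: int) -> Set[int]:
    """Return the points strictly closer to `a` than to `b`.

    Assumption: a tie (equal distance to `a` and `b`) is excluded.
    """
    return {x for x in points if dist[x][a] < dist[x][b]}


def width(dist: Sequence[Sequence[float]], a: int, b: int, x: int) -> float:
    """Difference between the distances from `x` to the two defining points.

    Assumed sign convention: positive when `x` is closer to `a`.
    """
    return dist[x][b] - dist[x][a]


# Hypothetical 4-point distance space. D[0][1] = 5 exceeds
# D[0][2] + D[2][1] = 2, so the triangle inequality fails.
D = [
    [0, 5, 1, 4],
    [5, 0, 1, 2],
    [1, 1, 0, 3],
    [4, 2, 3, 0],
]

members = half_space(range(4), D, a=0, b=1)
widths = [width(D, 0, 1, x) for x in range(4)]
```

Under these assumptions a point belongs to the half-space exactly when its width is positive, so a requirement of "large width" on a sample amounts to asking that the width exceed some positive threshold on most sample points.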
| Original language | English |
|---|---|
| Pages (from-to) | 73-89 |
| Number of pages | 17 |
| Journal | Discrete Applied Mathematics |
| Volume | 243 |
| State | Published - 10 Jul 2018 |
| Externally published | Yes |
Keywords
- Distance and metric spaces
- Half spaces
- Large width learning
- Margin
- Pseudo rank
ASJC Scopus subject areas
- Discrete Mathematics and Combinatorics
- Applied Mathematics