Abstract
A half-space over a distance space is a generalization of a half-space in a vector space. An important advantage of a distance space over a metric space is that the triangle inequality need not be satisfied, so the results also apply to dissimilarity measures that are not metrics, which makes them potentially very useful in practice. Given two points in a set, the half-space they define is the set of all points closer to the first point than to the second. In this paper we consider the problem of learning half-spaces in any finite distance space, that is, any finite set equipped with a distance function. We make use of a notion of ‘width’ of a half-space at a given point: this is defined as the difference between the distances of the point to the two points that define the half-space. We obtain probabilistic bounds on the generalization error when learning half-spaces from samples. These bounds depend on the empirical error (the fraction of sample points on which the half-space does not achieve a large width) and on the VC-dimension of the effective class of half-spaces that have a large sample width. Unlike some previous work on learning classification over metric spaces, the bounds do not involve the covering number of the space, and can therefore be tighter.
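
The definitions in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names, the four-point distance table, the sign convention for the width (distance to the second point minus distance to the first), the use of ±1 labels, and the threshold `gamma` are all assumptions chosen for illustration. The toy distance deliberately violates the triangle inequality, which a distance space permits.

```python
# Toy finite distance space (hypothetical data). Note d(a, c) = 5 > d(a, b) + d(b, c) = 2,
# so the triangle inequality fails, which is allowed in a distance space.
POINTS = ["a", "b", "c", "d"]
_D = {
    frozenset(("a", "b")): 1,
    frozenset(("b", "c")): 1,
    frozenset(("a", "c")): 5,
    frozenset(("a", "d")): 2,
    frozenset(("b", "d")): 4,
    frozenset(("c", "d")): 1,
}


def dist(x, y):
    """Symmetric distance looked up from the table; zero on the diagonal."""
    return 0 if x == y else _D[frozenset((x, y))]


def half_space(points, dist, p, q):
    """Set of all points strictly closer to p than to q."""
    return {x for x in points if dist(x, p) < dist(x, q)}


def width(dist, p, q, x):
    """Width at x of the half-space defined by (p, q).

    Assumed sign convention: positive when x is closer to p than to q.
    """
    return dist(x, q) - dist(x, p)


def empirical_error(sample, dist, p, q, gamma):
    """Fraction of labeled sample points whose signed width falls below gamma.

    `sample` is a list of (point, label) pairs with label +1 or -1 (an assumed
    encoding); a point counts as an error when label * width < gamma.
    """
    misses = sum(1 for x, y in sample if y * width(dist, p, q, x) < gamma)
    return misses / len(sample)


SAMPLE = [("a", +1), ("b", +1), ("c", -1), ("d", -1)]

print(half_space(POINTS, dist, "a", "c"))
print([width(dist, "a", "c", x) for x in POINTS])
print(empirical_error(SAMPLE, dist, "a", "c", gamma=1))
```

In this sketch the point `b` is equidistant from `a` and `c`, so its width is zero and it counts against the empirical error for any positive `gamma`, even though it is not strictly misclassified under a non-strict rule.
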
Original language | English |
---|---|
Pages (from-to) | 73-89 |
Number of pages | 17 |
Journal | Discrete Applied Mathematics |
Volume | 243 |
State | Published - 10 Jul 2018 |
Externally published | Yes |
Keywords
- Distance and metric spaces
- Half spaces
- Large width learning
- Margin
- Pseudo rank
ASJC Scopus subject areas
- Discrete Mathematics and Combinatorics
- Applied Mathematics