Abstract
We study the sample-based k-median clustering objective in a sequential setting without substitutions. In this setting, an i.i.d. sequence of examples is observed. An example can be selected as a center only immediately after it is observed, and it cannot be substituted later. The goal is to select a set of centers with a good k-median cost on the distribution that generated the sequence. We provide an efficient algorithm for this setting and show that its multiplicative approximation factor is twice the approximation factor of an efficient offline algorithm. In addition, we show that if efficiency requirements are removed, there is an algorithm that obtains the same approximation factor as the best offline algorithm. We demonstrate the performance of the efficient algorithm in experiments on real data sets. Our code is available at https://github.com/tomhess/No_Substitution_K_Median.
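To make the setting concrete, the following is a minimal sketch of the no-substitution protocol and of an empirical k-median cost. It is not the algorithm from the paper: the distance-threshold selection rule, the one-dimensional data, and the names `kmedian_cost`, `select_without_substitution`, and `threshold` are all assumptions introduced only for illustration.

```python
import random


def kmedian_cost(points, centers):
    """Empirical k-median cost: mean distance from each point to its nearest center."""
    return sum(min(abs(p - c) for c in centers) for p in points) / len(points)


def select_without_substitution(stream, k, threshold):
    """Naive illustrative rule (an assumption, not the paper's method).

    A point may become a center only at the moment it is observed, and a
    chosen center is never replaced. Here a point is kept if fewer than k
    centers exist and it is farther than `threshold` from all of them.
    """
    centers = []
    for x in stream:
        if len(centers) < k and all(abs(x - c) > threshold for c in centers):
            centers.append(x)  # irrevocable decision
    return centers


# Synthetic i.i.d.-style stream drawn around three well-separated means.
rng = random.Random(0)
stream = [rng.gauss(mu, 0.1) for _ in range(200) for mu in (0.0, 5.0, 10.0)]

centers = select_without_substitution(stream, k=3, threshold=2.0)
cost = kmedian_cost(stream, centers)
print(len(centers), cost)
```

The cost here is computed on the observed sample, whereas the objective in the abstract is defined on the underlying distribution; the sample cost only serves as an estimate in this sketch.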
Original language | English |
---|---|
Pages (from-to) | 962-972 |
Number of pages | 11 |
Journal | Proceedings of Machine Learning Research |
Volume | 108 |
State | Published - 1 Jan 2020 |
Event | 23rd International Conference on Artificial Intelligence and Statistics, AISTATS 2020 - Virtual, Online. Duration: 26 Aug 2020 → 28 Aug 2020 |
ASJC Scopus subject areas
- Artificial Intelligence
- Software
- Control and Systems Engineering
- Statistics and Probability