
The Power of Uniform Sampling for Coresets

  • Vladimir Braverman
  • Vincent Cohen-Addad
  • Shaofeng H.-C. Jiang
  • Robert Krauthgamer
  • Chris Schwiegelshohn
  • Mads Bech Toftrup
  • Xuan Wu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

47 Scopus citations

Abstract

Motivated by practical generalizations of the classic k-median and k-means objectives, such as clustering with size constraints, fair clustering, and Wasserstein barycenter, we introduce a meta-theorem for designing coresets for constrained-clustering problems. The meta-theorem reduces the task of coreset construction to one on a bounded number of ring instances with a much-relaxed additive error. This reduction enables us to construct coresets using uniform sampling, in contrast to the widely-used importance sampling, and consequently we can easily handle constrained objectives. Notably and perhaps surprisingly, this simpler sampling scheme can yield coresets whose size is independent of n, the number of input points. Our technique yields smaller coresets, and sometimes the first coresets, for a large number of constrained clustering problems, including capacitated clustering, fair clustering, Euclidean Wasserstein barycenter, clustering in minor-excluded graph, and polygon clustering under Fréchet and Hausdorff distance. Finally, our technique yields also smaller coresets for 1-median in low-dimensional Euclidean spaces, specifically of size O(ε^{-1.5}) in ℝ^2 and O(ε^{-1.6}) in ℝ^3.
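
To make the idea of uniform sampling on ring instances concrete, the following is a toy sketch only and not the construction from the paper. It assumes a single fixed reference center, groups the points into rings of doubling radius around it, and samples uniformly within each ring, reweighting so that total weight is preserved. The names `ring_uniform_coreset`, `cost`, and `samples_per_ring` are hypothetical, and the sketch ignores the k-center and constrained settings that the meta-theorem actually handles.

```python
import math
import random


def ring_uniform_coreset(points, center, samples_per_ring, rng=None):
    """Toy illustration: uniform sampling inside doubling-radius rings.

    Returns a list of (point, weight) pairs whose weights sum to len(points).
    This is an assumption-laden sketch, not the paper's algorithm.
    """
    rng = rng or random.Random(0)
    if not points:
        return []
    dists = [math.dist(p, center) for p in points]
    positive = [d for d in dists if d > 0]
    if not positive:
        # Every point coincides with the center: one weighted copy suffices.
        return [(tuple(center), float(len(points)))]
    r_min = min(positive)

    # Ring j >= 0 holds points at distance in [r_min * 2^j, r_min * 2^(j+1)).
    # Ring -1 holds points that sit exactly on the center.
    rings = {}
    for p, d in zip(points, dists):
        j = -1 if d == 0 else int(math.floor(math.log2(d / r_min)))
        rings.setdefault(j, []).append(p)

    coreset = []
    for members in rings.values():
        m = min(samples_per_ring, len(members))
        chosen = rng.sample(members, m)   # uniform, without replacement
        weight = len(members) / m         # each sample stands in for len/m points
        coreset.extend((p, weight) for p in chosen)
    return coreset


def cost(weighted_points, center):
    """Weighted 1-median cost: sum of weight * distance to the center."""
    return sum(w * math.dist(p, center) for p, w in weighted_points)
```

Because distances inside one ring differ by less than a factor of two, a uniform sample from a ring cannot misestimate that ring's cost at the reference center by more than that factor, which is the intuition for why no importance weights are needed within a ring. Guarantees for all candidate centers and for constrained objectives require the analysis in the paper.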

Original language: English
Title of host publication: Proceedings - 2022 IEEE 63rd Annual Symposium on Foundations of Computer Science, FOCS 2022
Publisher: Institute of Electrical and Electronics Engineers
Pages: 462-473
Number of pages: 12
ISBN (Electronic): 9781665455190
State: Published - 1 Jan 2022
Externally published: Yes
Event: 63rd IEEE Annual Symposium on Foundations of Computer Science, FOCS 2022 - Denver, United States
Duration: 31 Oct 2022 - 3 Nov 2022

Publication series

Name: Proceedings - Annual IEEE Symposium on Foundations of Computer Science, FOCS
Volume: 2022-October
ISSN (Print): 0272-5428

Conference

Conference: 63rd IEEE Annual Symposium on Foundations of Computer Science, FOCS 2022
Country/Territory: United States
City: Denver
Period: 31/10/22 - 3/11/22

Keywords

  • Wasserstein barycenter
  • capacitated clustering
  • clustering
  • coresets
  • fair clustering

ASJC Scopus subject areas

  • General Computer Science
