Approximating large frequency moments with pick-and-drop sampling

Vladimir Braverman, Rafail Ostrovsky

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

8 Scopus citations

Abstract

Given a data stream $D = \{p_1, p_2, \ldots, p_m\}$ of size $m$ of numbers from $\{1, \ldots, n\}$, the frequency of $i$ is defined as $f_i = |\{j : p_j = i\}|$. The $k$-th frequency moment of $D$ is defined as $F_k = \sum_{i=1}^{n} f_i^k$. We consider the problem of approximating frequency moments in insertion-only streams for $k \ge 3$. For any constant $c$ we show an $O(n^{1-2/k} \log(n) \log^{(c)}(n))$ upper bound on the space complexity of the problem. Here $\log^{(c)}(n)$ is the iterative log function, that is, the logarithm applied $c$ times. Our main technical contribution is a non-uniform sampling method on matrices. We call our method pick-and-drop sampling; it samples a heavy element (i.e., element $i$ with frequency $\Omega(F_k)$) with probability $\Omega(1/n^{1-2/k})$ and gives an approximation $\tilde{f}_i \ge (1-\varepsilon) f_i$. In addition, the estimates never exceed the real values, that is, $\tilde{f}_j \le f_j$ for all $j$. For constant $\varepsilon$, we reduce the space complexity of finding a heavy element to $O(n^{1-2/k} \log(n))$ bits. We apply our method of recursive sketches and resolve the problem with $O(n^{1-2/k} \log(n) \log^{(c)}(n))$ bits. We reduce the ratio between the upper and lower bounds from $O(\log^2(n))$ to $O(\log(n) \log^{(c)}(n))$. Thus, we provide a (roughly) quadratic improvement of the result of Andoni, Krauthgamer and Onak (FOCS 2011).
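To make the definitions in the abstract concrete, the following is a minimal Python sketch that computes the frequency moment exactly and evaluates the iterated logarithm. It is not the pick-and-drop algorithm and not a streaming method: it stores every frequency, so it serves only as a reference baseline for small inputs. The function names `frequency_moment` and `iterated_log` are hypothetical, and the use of base-2 logarithms is an assumption, since the abstract does not fix a base.

```python
from collections import Counter
import math


def frequency_moment(stream, k):
    """Exact F_k = sum over i of f_i^k, where f_i counts occurrences of i.

    Illustrative baseline only: it keeps one counter per distinct element,
    unlike the sublinear-space method described in the abstract.
    """
    freq = Counter(stream)
    return sum(f ** k for f in freq.values())


def iterated_log(n, c):
    """Apply log base 2 to n exactly c times (base 2 is an assumption).

    Raises ValueError if an intermediate value is not positive.
    """
    value = float(n)
    for _ in range(c):
        value = math.log2(value)
    return value


# Frequencies are f_1 = 1, f_2 = 2, f_3 = 3, so F_3 = 1 + 8 + 27.
stream = [1, 2, 2, 3, 3, 3]
print(frequency_moment(stream, 3))  # 36

# log2(65536) = 16 and log2(16) = 4.
print(iterated_log(65536, 2))  # 4.0
```

The second function shows how slowly the extra factor in the space bound grows: two applications of the logarithm already reduce 65536 to 4.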

Original language: English
Title of host publication: Approximation, Randomization, and Combinatorial Optimization
Subtitle of host publication: Algorithms and Techniques - 16th International Workshop, APPROX 2013 and 17th International Workshop, RANDOM 2013, Proceedings
Pages: 42-57
Number of pages: 16
State: Published - 15 Oct 2013
Externally published: Yes
Event: 16th International Workshop on Approximation Algorithms for Combinatorial Optimization Problems, APPROX 2013 and the 17th International Workshop on Randomization and Computation, RANDOM 2013 - Berkeley, CA, United States
Duration: 21 Aug 2013 → 23 Aug 2013

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8096 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 16th International Workshop on Approximation Algorithms for Combinatorial Optimization Problems, APPROX 2013 and the 17th International Workshop on Randomization and Computation, RANDOM 2013
Country/Territory: United States
City: Berkeley, CA
Period: 21/08/13 → 23/08/13

Keywords

  • Data streams
  • Frequency moments
  • Sampling

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
