TY - GEN
T1 - Revisiting frequency moment estimation in random order streams
AU - Braverman, Vladimir
AU - Viola, Emanuele
AU - Woodruff, David P.
AU - Yang, Lin F.
N1 - Publisher Copyright:
© Vladimir Braverman, Emanuele Viola, David P. Woodruff, and Lin F. Yang.
PY - 2018/7/1
Y1 - 2018/7/1
N2 - We revisit one of the classic problems in the data stream literature, namely, that of estimating the frequency moments F_p for 0 < p < 2 of an underlying n-dimensional vector presented as a sequence of additive updates in a stream. It is well-known that using p-stable distributions one can approximate any of these moments up to a multiplicative (1 + ε)-factor using O(ε^{-2} log n) bits of space, and this space bound is optimal up to a constant factor in the turnstile streaming model. We show that surprisingly, if one instead considers the popular random-order model of insertion-only streams, in which the updates to the underlying vector arrive in a random order, then one can beat this space bound and achieve Õ(ε^{-2} + log n) bits of space, where the Õ hides poly(log(1/ε) + log log n) factors. If ε^{-2} ≈ log n, this represents a roughly quadratic improvement in the space achievable in turnstile streams. Our algorithm is in fact deterministic, and we show our space bound is optimal up to poly(log(1/ε) + log log n) factors for deterministic algorithms in the random order model. We also obtain a similar improvement in space for p = 2 whenever F_2 ≳ log n · F_1.
AB - We revisit one of the classic problems in the data stream literature, namely, that of estimating the frequency moments F_p for 0 < p < 2 of an underlying n-dimensional vector presented as a sequence of additive updates in a stream. It is well-known that using p-stable distributions one can approximate any of these moments up to a multiplicative (1 + ε)-factor using O(ε^{-2} log n) bits of space, and this space bound is optimal up to a constant factor in the turnstile streaming model. We show that surprisingly, if one instead considers the popular random-order model of insertion-only streams, in which the updates to the underlying vector arrive in a random order, then one can beat this space bound and achieve Õ(ε^{-2} + log n) bits of space, where the Õ hides poly(log(1/ε) + log log n) factors. If ε^{-2} ≈ log n, this represents a roughly quadratic improvement in the space achievable in turnstile streams. Our algorithm is in fact deterministic, and we show our space bound is optimal up to poly(log(1/ε) + log log n) factors for deterministic algorithms in the random order model. We also obtain a similar improvement in space for p = 2 whenever F_2 ≳ log n · F_1.
KW - Data Stream
KW - Frequency Moments
KW - Insertion Only Stream
KW - Random Order
KW - Space Complexity
UR - https://www.scopus.com/pages/publications/85049776538
U2 - 10.4230/LIPIcs.ICALP.2018.25
DO - 10.4230/LIPIcs.ICALP.2018.25
M3 - Conference contribution
AN - SCOPUS:85049776538
T3 - Leibniz International Proceedings in Informatics, LIPIcs
BT - 45th International Colloquium on Automata, Languages, and Programming, ICALP 2018
A2 - Kaklamanis, Christos
A2 - Marx, Dániel
A2 - Chatzigiannakis, Ioannis
A2 - Sannella, Donald
PB - Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
T2 - 45th International Colloquium on Automata, Languages, and Programming, ICALP 2018
Y2 - 9 July 2018 through 13 July 2018
ER -