TY - GEN
T1 - Clustering on sliding windows in polylogarithmic space
AU - Braverman, Vladimir
AU - Lang, Harry
AU - Levin, Keith
AU - Monemizadeh, Morteza
N1 - Publisher Copyright:
© Vladimir Braverman, Harry Lang, Keith Levin, and Morteza Monemizadeh.
PY - 2015/12/1
Y1 - 2015/12/1
N2 - In PODS 2003, Babcock, Datar, Motwani and O'Callaghan [4] gave the first streaming solution for the k-median problem on sliding windows using $O\left(\frac{k}{\tau^4} W^{2\tau} \log^2 W\right)$ space, with an $O(2^{O(1/\tau)})$ approximation factor, where $W$ is the window size and $\tau \in (0, 1/2)$ is a user-specified parameter. They left as an open question whether it is possible to improve this to polylogarithmic space. Despite much progress on clustering and sliding windows, this question has remained open for more than a decade. In this paper, we partially answer the main open question posed by Babcock, Datar, Motwani and O'Callaghan. We present an algorithm yielding an exponential improvement in space compared to the previous result given in Babcock et al. In particular, we give the first polylogarithmic space $(\alpha, \beta)$-approximation for metric k-median clustering in the sliding window model, where $\alpha$ and $\beta$ are constants, under the assumption, also made by Babcock et al., that the optimal k-median cost on any given window is bounded by a polynomial in the window size. We justify this assumption by showing that when the cost is exponential in the window size, no sublinear space approximation is possible. Our main technical contribution is a simple but elegant extension of smooth functions as introduced by Braverman and Ostrovsky [9], which allows us to apply well-known techniques for solving problems in the sliding window model to functions that are not smooth, such as the k-median cost.
AB - In PODS 2003, Babcock, Datar, Motwani and O'Callaghan [4] gave the first streaming solution for the k-median problem on sliding windows using $O\left(\frac{k}{\tau^4} W^{2\tau} \log^2 W\right)$ space, with an $O(2^{O(1/\tau)})$ approximation factor, where $W$ is the window size and $\tau \in (0, 1/2)$ is a user-specified parameter. They left as an open question whether it is possible to improve this to polylogarithmic space. Despite much progress on clustering and sliding windows, this question has remained open for more than a decade. In this paper, we partially answer the main open question posed by Babcock, Datar, Motwani and O'Callaghan. We present an algorithm yielding an exponential improvement in space compared to the previous result given in Babcock et al. In particular, we give the first polylogarithmic space $(\alpha, \beta)$-approximation for metric k-median clustering in the sliding window model, where $\alpha$ and $\beta$ are constants, under the assumption, also made by Babcock et al., that the optimal k-median cost on any given window is bounded by a polynomial in the window size. We justify this assumption by showing that when the cost is exponential in the window size, no sublinear space approximation is possible. Our main technical contribution is a simple but elegant extension of smooth functions as introduced by Braverman and Ostrovsky [9], which allows us to apply well-known techniques for solving problems in the sliding window model to functions that are not smooth, such as the k-median cost.
KW - Clustering
KW - Sliding windows
KW - Streaming
UR - https://www.scopus.com/pages/publications/84958766652
U2 - 10.4230/LIPIcs.FSTTCS.2015.350
DO - 10.4230/LIPIcs.FSTTCS.2015.350
M3 - Conference contribution
AN - SCOPUS:84958766652
T3 - Leibniz International Proceedings in Informatics, LIPIcs
SP - 350
EP - 364
BT - 35th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science, FSTTCS 2015
A2 - Harsha, Prahladh
A2 - Ramalingam, G.
PB - Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
T2 - 35th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science, FSTTCS 2015
Y2 - 16 December 2015 through 18 December 2015
ER -