TY - JOUR
T1 - DeepStream
T2 - Autoencoder-based stream temporal clustering and anomaly detection
AU - Harush, Shimon
AU - Meidan, Yair
AU - Shabtai, Asaf
N1 - Publisher Copyright: © 2021 Elsevier Ltd
PY - 2021/7/1
Y1 - 2021/7/1
AB - The increasing number of IoT devices in “smart” environments, such as homes, offices, and cities, produces seemingly endless data streams and drives many daily decisions. Consequently, there is growing interest in identifying contextual information from sensor data to facilitate the performance of various tasks, e.g., traffic management, cyber attack detection, and healthcare monitoring. The correct identification of contexts in data streams is helpful for many tasks; for example, it can assist in providing high-quality recommendations to end users and in reporting anomalous behavior based on the detection of unusual contexts. This paper presents DeepStream, a novel data stream temporal clustering algorithm that dynamically detects sequential and overlapping clusters. DeepStream is tuned to classify contextual information in real time and is capable of coping with a high-dimensional feature space. DeepStream utilizes stacked autoencoders both to reduce the dimensionality of unbounded data streams and to represent clusters. This method detects contextual behavior and captures nonlinear relations in the input data, giving it an advantage over existing methods that rely on PCA. We evaluated DeepStream empirically using four sensor and IoT datasets and compared it to five state-of-the-art stream clustering algorithms. Our evaluation shows that DeepStream outperforms all of these algorithms. Our evaluation also demonstrates how DeepStream's improved clustering performance results in improved detection of anomalous data.
KW - Activity recognition
KW - Anomaly detection
KW - Autoencoder
KW - Dimensionality reduction
KW - Stream clustering
UR - http://www.scopus.com/inward/record.url?scp=85105068100&partnerID=8YFLogxK
U2 - 10.1016/j.cose.2021.102276
DO - 10.1016/j.cose.2021.102276
M3 - Article
AN - SCOPUS:85105068100
SN - 0167-4048
VL - 106
JO - Computers and Security
JF - Computers and Security
M1 - 102276
ER -