TY - GEN
T1 - Simulating User Activity for Assessing Effect of Sampling on DB Activity Monitoring Anomaly Detection
AU - Grushka-Cohen, Hagit
AU - Biller, Ofer
AU - Sofer, Oded
AU - Rokach, Lior
AU - Shapira, Bracha
N1 - Publisher Copyright: © 2019, Springer Nature Switzerland AG.
PY - 2019/4/25
Y1 - 2019/4/25
AB - Monitoring database activity is useful for identifying and preventing data breaches. Such database activity monitoring (DAM) systems use anomaly detection algorithms to alert security officers to possible infractions. However, the sheer number of transactions makes it impossible to track each transaction. Instead, solutions use manually crafted policies to decide which transactions to monitor and log. Creating a smart data-driven policy for monitoring transactions requires moving beyond manual policies. In this paper, we describe a novel simulation method for user activity. We introduce events of change in the user transaction profile and assess the impact of sampling on the anomaly detection algorithm. We found that looking for anomalies in a fixed subset of the data using a static policy misses most of these events since low-risk users are ignored. A Bayesian sampling policy identified 67% of the anomalies while sampling only 10% of the data, compared to a baseline of using all of the data.
UR - http://www.scopus.com/inward/record.url?scp=85065780505&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-17277-0_5
DO - 10.1007/978-3-030-17277-0_5
M3 - Conference contribution
AN - SCOPUS:85065780505
SN - 9783030172763
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 82
EP - 90
BT - Policy-Based Autonomic Data Governance
A2 - Calo, Seraphin
A2 - Verma, Dinesh
A2 - Bertino, Elisa
PB - Springer Verlag
T2 - 2nd International Workshop on Policy-based Autonomic Data Governance, PADG 2018 in conjunction with the 23rd European Symposium on Research in Computer Security, ESORICS 2018
Y2 - 3 September 2018 through 7 September 2018
ER -