TY - GEN
T1 - Using Bandits for Effective Database Activity Monitoring
AU - Grushka-Cohen, Hagit
AU - Biller, Ofer
AU - Sofer, Oded
AU - Rokach, Lior
AU - Shapira, Bracha
N1 - Publisher Copyright: © Springer Nature Switzerland AG 2020.
PY - 2020/5/6
Y1 - 2020/5/6
N2 - Database activity monitoring systems aim to protect organizational data by logging users’ activity to identify and document malicious activity. High-velocity streams and operating costs restrict these systems to examining only a sample of the activity. Current solutions use manual policies to decide which transactions to monitor. This limits the diversity of the data collected, creating a “filter bubble” that over-represents specific subsets of the data, such as high-risk users, and under-represents the rest of the population, which may never be sampled. In recommendation systems, bandit algorithms have recently been used to address this problem. We propose treating the sampling problem for database activity monitoring as a recommender system problem. In this work, we redefine the data sampling problem as a special case of the multi-armed bandit problem and present a novel algorithm, (Formula Presented)–Greedy, which combines expert knowledge with random exploration. We analyze the effect of diversity on coverage and downstream event detection using simulated data. We find that adding diversity to the sampling using the bandit-based approach works well for this task, maximizing population coverage without decreasing quality in terms of issuing alerts about events, and outperforming policies manually crafted by experts as well as other sampling methods.
AB - Database activity monitoring systems aim to protect organizational data by logging users’ activity to identify and document malicious activity. High-velocity streams and operating costs restrict these systems to examining only a sample of the activity. Current solutions use manual policies to decide which transactions to monitor. This limits the diversity of the data collected, creating a “filter bubble” that over-represents specific subsets of the data, such as high-risk users, and under-represents the rest of the population, which may never be sampled. In recommendation systems, bandit algorithms have recently been used to address this problem. We propose treating the sampling problem for database activity monitoring as a recommender system problem. In this work, we redefine the data sampling problem as a special case of the multi-armed bandit problem and present a novel algorithm, (Formula Presented)–Greedy, which combines expert knowledge with random exploration. We analyze the effect of diversity on coverage and downstream event detection using simulated data. We find that adding diversity to the sampling using the bandit-based approach works well for this task, maximizing population coverage without decreasing quality in terms of issuing alerts about events, and outperforming policies manually crafted by experts as well as other sampling methods.
KW - Database activity monitoring
KW - Filter bubble
KW - Multi-armed bandit
KW - Sampling
UR - http://www.scopus.com/inward/record.url?scp=85085731996&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-47436-2_53
DO - 10.1007/978-3-030-47436-2_53
M3 - Conference contribution
AN - SCOPUS:85085731996
SN - 9783030474355
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 701
EP - 713
BT - Advances in Knowledge Discovery and Data Mining - 24th Pacific-Asia Conference, PAKDD 2020, Proceedings
A2 - Lauw, Hady W.
A2 - Lim, Ee-Peng
A2 - Wong, Raymond Chi-Wing
A2 - Ntoulas, Alexandros
A2 - Ng, See-Kiong
A2 - Pan, Sinno Jialin
PB - Springer
T2 - 24th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2020
Y2 - 11 May 2020 through 14 May 2020
ER -