TY - GEN
T1 - Secure Best Arm Identification in the Presence of a Copycat
AU - Cohen, Asaf
AU - Günlü, Onur
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025/1/1
Y1 - 2025/1/1
N2 - Consider the problem of best arm identification with a security constraint. Specifically, assume a setup of stochastic linear bandits with K arms of dimension d. In each arm pull, the player receives a reward that is the sum of the dot product of the arm with an unknown parameter vector and independent noise. The player's goal is to identify the best arm after T arm pulls. Moreover, assume a copycat Chloe is observing the arm pulls. The player wishes to keep Chloe ignorant of the best arm. While a minimax-optimal algorithm identifies the best arm with an Ω(T/log(d)) error exponent, it easily reveals its best-arm estimate to an outside observer, as the best arms are played more frequently. A naïve secure algorithm that plays all arms equally results in an Ω(T/d) exponent. In this paper, we propose a secure algorithm that plays with coded arms. The algorithm does not require any key or cryptographic primitives, yet achieves an Ω(T/log²(d)) exponent while revealing almost no information on the best arm.
AB - Consider the problem of best arm identification with a security constraint. Specifically, assume a setup of stochastic linear bandits with K arms of dimension d. In each arm pull, the player receives a reward that is the sum of the dot product of the arm with an unknown parameter vector and independent noise. The player's goal is to identify the best arm after T arm pulls. Moreover, assume a copycat Chloe is observing the arm pulls. The player wishes to keep Chloe ignorant of the best arm. While a minimax-optimal algorithm identifies the best arm with an Ω(T/log(d)) error exponent, it easily reveals its best-arm estimate to an outside observer, as the best arms are played more frequently. A naïve secure algorithm that plays all arms equally results in an Ω(T/d) exponent. In this paper, we propose a secure algorithm that plays with coded arms. The algorithm does not require any key or cryptographic primitives, yet achieves an Ω(T/log²(d)) exponent while revealing almost no information on the best arm.
KW - Best Arm Identification
KW - Coded Best Arm Identification
KW - Linear Stochastic Bandits
KW - Security
UR - https://www.scopus.com/pages/publications/105029067737
U2 - 10.1109/ITW62417.2025.11240381
DO - 10.1109/ITW62417.2025.11240381
M3 - Conference contribution
AN - SCOPUS:105029067737
T3 - 2025 IEEE Information Theory Workshop, ITW 2025
BT - 2025 IEEE Information Theory Workshop, ITW 2025
PB - Institute of Electrical and Electronics Engineers
T2 - 2025 IEEE Information Theory Workshop, ITW 2025
Y2 - 29 September 2025 through 3 October 2025
ER -