TY - JOUR
T1 - Personalizing Interventions with Diversity Aware Bandits
AU - Botta, Colton
AU - Segal, Avi
AU - Gal, Kobi
N1 - Funding Information:
This work was supported in part by the European Union Horizon 2020 WeNet research and innovation program under grant agreement No 823783.
Publisher Copyright:
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
PY - 2023/1/1
Y1 - 2023/1/1
N2 - Online systems utilize user data, such as demographics, past performance, preferences, and skillset, to construct an accurate model of users and maximize personalization. Some of these user features are “shallow” traits which seldom change (e.g. age, race, gender) while others are “deep” traits that are more volatile (e.g. performance, goals, interests). In this work, we explore how reasoning about this diversity of user features can enhance the performance of personalized systems. By modeling the personalization process as a Reinforcement Learning (RL) problem, we introduce Diversity Aware Bandits for Intervention Personalization (DABIP), a novel contextual multi-armed bandit algorithm that leverages the dynamics within user features to cluster users while maximizing outcomes. We demonstrate the efficacy of this approach using two real-world datasets from different domains.
AB - Online systems utilize user data, such as demographics, past performance, preferences, and skillset, to construct an accurate model of users and maximize personalization. Some of these user features are “shallow” traits which seldom change (e.g. age, race, gender) while others are “deep” traits that are more volatile (e.g. performance, goals, interests). In this work, we explore how reasoning about this diversity of user features can enhance the performance of personalized systems. By modeling the personalization process as a Reinforcement Learning (RL) problem, we introduce Diversity Aware Bandits for Intervention Personalization (DABIP), a novel contextual multi-armed bandit algorithm that leverages the dynamics within user features to cluster users while maximizing outcomes. We demonstrate the efficacy of this approach using two real-world datasets from different domains.
KW - Contextual Multi-Armed Bandit
KW - Incentives
KW - Interventions
UR - http://www.scopus.com/inward/record.url?scp=85171175891&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85171175891
SN - 1613-0073
VL - 3456
SP - 254
EP - 263
JO - CEUR Workshop Proceedings
JF - CEUR Workshop Proceedings
T2 - Workshops at the 2nd International Conference on Hybrid Human-Artificial Intelligence, HHAI-WS 2023
Y2 - 26 June 2023 through 27 June 2023
ER -