Abstract
Online systems utilize user data, such as demographics, past performance, preferences, and skill set, to construct an accurate model of users and maximize personalization. Some of these user features are “shallow” traits that seldom change (e.g. age, race, gender), while others are “deep” traits that are more volatile (e.g. performance, goals, interests). In this work, we explore how reasoning about this diversity of user features can enhance the performance of personalized systems. By modeling the personalization process as a Reinforcement Learning (RL) problem, we introduce Diversity Aware Bandits for Intervention Personalization (DABIP), a novel contextual multi-armed bandit algorithm that leverages the dynamics within user features to cluster users while maximizing outcomes. We demonstrate the efficacy of this approach using two real-world datasets from different domains.
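The record does not specify the internals of DABIP; as a minimal illustrative sketch of the general setting it describes (a contextual bandit over interventions, with users clustered on slow-changing “shallow” features and arm selection conditioned on volatile “deep” features), the following LinUCB-style example uses hypothetical names and simulated data and should not be read as the authors' algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

class LinUCBArm:
    """Standard LinUCB arm: ridge-regression estimate plus an upper-confidence bonus."""
    def __init__(self, dim, alpha=1.0):
        self.A = np.eye(dim)       # d x d design matrix
        self.b = np.zeros(dim)     # d-dimensional response vector
        self.alpha = alpha

    def ucb(self, x):
        A_inv = np.linalg.inv(self.A)
        theta = A_inv @ self.b
        return theta @ x + self.alpha * np.sqrt(x @ A_inv @ x)

    def update(self, x, reward):
        self.A += np.outer(x, x)
        self.b += reward * x

def assign_cluster(user_static, centroids):
    """Hard-assign a user to the nearest centroid of the 'shallow' (static) features."""
    dists = np.linalg.norm(centroids - user_static, axis=1)
    return int(np.argmin(dists))

# Hypothetical setup: 3 clusters over static features, 4 interventions (arms),
# and a 5-dimensional context built from the 'deep' (volatile) features.
n_clusters, n_arms, ctx_dim = 3, 4, 5
centroids = rng.normal(size=(n_clusters, 2))
bandits = [[LinUCBArm(ctx_dim) for _ in range(n_arms)] for _ in range(n_clusters)]

def choose_intervention(user_static, user_context):
    c = assign_cluster(user_static, centroids)
    scores = [bandits[c][a].ucb(user_context) for a in range(n_arms)]
    return c, int(np.argmax(scores))

def record_outcome(cluster, arm, user_context, reward):
    bandits[cluster][arm].update(user_context, reward)

# Toy interaction loop with simulated users and placeholder outcomes.
for _ in range(100):
    static = rng.normal(size=2)         # e.g. demographics
    context = rng.normal(size=ctx_dim)  # e.g. recent performance, goals
    c, a = choose_intervention(static, context)
    reward = float(rng.random() < 0.5)  # placeholder outcome signal
    record_outcome(c, a, context, reward)
```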
| Original language | English |
| --- | --- |
| Pages (from-to) | 254–263 |
| Number of pages | 10 |
| Journal | CEUR Workshop Proceedings |
| Volume | 3456 |
| State | Published - 1 Jan 2023 |
| Event | Workshops at the 2nd International Conference on Hybrid Human-Artificial Intelligence, HHAI-WS 2023 - Munich, Germany; Duration: 26 Jun 2023 – 27 Jun 2023 |
Keywords
- Contextual Multi-Armed Bandit
- Incentives
- Interventions
ASJC Scopus subject areas
- General Computer Science