TY - GEN
T1 - Optimizing interventions via offline policy evaluation
T2 - 32nd AAAI Conference on Artificial Intelligence, AAAI 2018
AU - Segal, Avi
AU - Gal, Kobi
AU - Kamar, Ece
AU - Horvitz, Eric
AU - Miller, Grant
N1 - Publisher Copyright:
Copyright © 2018, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
PY - 2018/1/1
Y1 - 2018/1/1
AB - Volunteers who help with online crowdsourcing tasks such as citizen science typically make only a few contributions before exiting. We propose a computational approach for increasing users' engagement in such settings that is based on optimizing policies for displaying motivational messages to users. The approach, which we refer to as Trajectory Corrected Intervention (TCI), reasons about the tradeoff between the long-term influence of engagement messages on participants' contributions and the potential risk of disrupting their current work. We combine model-based reinforcement learning with offline policy evaluation to generate intervention policies without relying on a fixed representation of the domain. TCI works iteratively to learn the best representation from a set of random intervention trials and to generate candidate intervention policies. It refines selected policies offline by exploiting the fact that users can only be interrupted once per session. We implemented TCI in the wild with Galaxy Zoo, one of the largest citizen science platforms on the web. We found that TCI outperformed the state-of-the-art intervention policy for this domain and significantly increased the contributions of thousands of users. This work demonstrates the benefit of combining traditional AI planning with offline policy methods to generate intelligent intervention strategies.
UR - http://www.scopus.com/inward/record.url?scp=85060451404&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85060451404
T3 - 32nd AAAI Conference on Artificial Intelligence, AAAI 2018
SP - 1536
EP - 1544
BT - 32nd AAAI Conference on Artificial Intelligence, AAAI 2018
PB - AAAI Press
Y2 - 2 February 2018 through 7 February 2018
ER -