TY - JOUR
T1 - Relaxed Exploration Constrained Reinforcement Learning
AU - Shperberg, Shahaf S.
AU - Liu, Bo
AU - Stone, Peter
N1 - Funding Information:
This work has taken place in the Learning Agents Research Group (LARG) at the Artificial Intelligence Laboratory, The University of Texas at Austin. LARG research is supported in part by the National Science Foundation (CPS-1739964, IIS-1724157, FAIN-2019844), the Office of Naval Research (N00014-18-2243), Army Research Office (W911NF-19-2-0333), DARPA, General Motors, Bosch, and Good Systems, a research grand challenge at the University of Texas at Austin. The views and conclusions contained in this document are those of the authors alone. Peter Stone serves as the Executive Director of Sony AI America and receives financial compensation for this work. The terms of this arrangement have been reviewed and approved by the University of Texas at Austin in accordance with its policy on objectivity in research.
Publisher Copyright:
© 2023 International Foundation for Autonomous Agents and Multiagent Systems (www.ifaamas.org). All rights reserved.
PY - 2023/1/1
Y1 - 2023/1/1
AB - This extended abstract introduces a novel setting of reinforcement learning with constraints, called Relaxed Exploration Constrained Reinforcement Learning (RECRL). As in standard constrained reinforcement learning (CRL), the aim is to find a policy that maximizes environmental return subject to a set of constraints. In RECRL, however, there is an initial training phase in which the constraints are relaxed, allowing the agent to explore the environment more freely. Once training is complete, the agent is deployed in the environment and is required to satisfy all constraints. As an initial approach to RECRL problems, we introduce a curriculum-based method, named CLiC, that can be applied to existing CRL algorithms to improve their exploration during the training phase while allowing them to converge gradually to a policy that satisfies the full set of constraints. Empirical evaluation shows that CLiC produces policies with higher deployment return than policies learned by training only with the strict set of constraints.
KW - Constrained Reinforcement Learning
KW - Curriculum Learning
UR - http://www.scopus.com/inward/record.url?scp=85171300701&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85171300701
SN - 1548-8403
VL - 2023-May
SP - 2821
EP - 2823
JO - Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS
JF - Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS
T2 - 22nd International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2023
Y2 - 29 May 2023 through 2 June 2023
ER -