Abstract
This research introduces a novel setting for reinforcement learning with constraints, termed Relaxed Exploration Constrained Reinforcement Learning (RECRL). As in standard constrained reinforcement learning (CRL), the objective in RECRL is to discover a policy that maximizes the environmental return while adhering to a predefined set of constraints. However, some real-world settings allow the agent to be trained without strict adherence to the constraints, as long as it satisfies them once deployed. To model such settings, RECRL explicitly incorporates an initial training phase in which the constraints are relaxed, enabling the agent to explore the environment more freely; during deployment, the agent must fully satisfy all constraints. To address RECRL problems, we introduce a curriculum-based approach called CLiC, designed to enhance the exploration of existing CRL algorithms during the training phase and to facilitate convergence to a policy that satisfies the full set of constraints by the end of training. Empirical evaluations demonstrate that CLiC yields policies with significantly higher returns during deployment than training solely under the strict set of constraints. The code is available at https://github.com/Shperb/RECRL.
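To make the setting concrete, the sketch below shows one way a constraint-tightening curriculum could wrap an off-the-shelf CRL learner during the relaxed training phase. It is a minimal illustration under stated assumptions, not the paper's CLiC algorithm: the agent interface (`set_cost_threshold`, `act`, `update`), the environment's scalar cost signal, and the linear annealing schedule are all hypothetical names introduced for the example.

```python
# Minimal sketch of a RECRL-style training loop: the cost threshold starts
# relaxed (permitting freer exploration) and is annealed to the strict
# deployment threshold by the end of training. All interfaces here are
# illustrative assumptions, not the paper's CLiC algorithm.

def tightened_threshold(step, total_steps, d_relaxed, d_strict):
    """Linearly anneal the cost threshold from a relaxed training value
    down to the strict threshold enforced at deployment."""
    frac = min(step / total_steps, 1.0)
    return d_relaxed + frac * (d_strict - d_relaxed)

def train(agent, env, total_steps, d_relaxed=10.0, d_strict=1.0):
    obs = env.reset()
    for step in range(total_steps):
        # The agent optimizes return subject to the *current* threshold;
        # early on it may violate the strict constraint while exploring.
        d_t = tightened_threshold(step, total_steps, d_relaxed, d_strict)
        agent.set_cost_threshold(d_t)

        action = agent.act(obs)
        obs, reward, cost, done = env.step(action)  # env exposes a scalar cost
        agent.update(obs, action, reward, cost, done)
        if done:
            obs = env.reset()
    # By the final step the threshold equals d_strict, so the returned
    # policy is trained to satisfy the full set of constraints.
    return agent
```

A linear schedule is only one choice; any monotone schedule that reaches the strict threshold before deployment fits the same structure.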
| Original language | English |
|---|---|
| Pages (from-to) | 1727-1735 |
| Number of pages | 9 |
| Journal | Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS |
| Volume | 2024-May |
| State | Published - 1 Jan 2024 |
| Event | 23rd International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2024, Auckland, New Zealand. Duration: 6 May 2024 → 10 May 2024 |
Keywords
- Constrained Reinforcement Learning
- Curriculum Learning
ASJC Scopus subject areas
- Artificial Intelligence
- Software
- Control and Systems Engineering