TY - JOUR
T1 - Enhancing Deep Reinforcement Learning with Scenario-Based Modeling
AU - Yerushalmi, Raz
AU - Amir, Guy
AU - Elyasaf, Achiya
AU - Harel, David
AU - Katz, Guy
AU - Marron, Assaf
N1 - Funding Information:
The work of R. Yerushalmi, G. Amir, A. Elyasaf and G. Katz was partially supported by a grant from the Israeli Smart Transportation Research Center (ISTRC). The work of G. Amir was supported by a scholarship from the Clore Israel Foundation. The work of D. Harel, A. Marron and R. Yerushalmi was partially supported by a research grant from the Estate of Harry Levine, the Estate of Avraham Rothstein, Brenda Gruss and Daniel Hirsch, the One8 Foundation, Rina Mayer, Maurice Levy, and the Estate of Bernice Bernath, a grant 3698/21 from the ISF-NSFC joint to the Israel Science Foundation and the National Science Foundation of China, and a grant from the Minerva foundation.
Publisher Copyright:
© 2023, The Author(s), under exclusive licence to Springer Nature Singapore Pte Ltd.
PY - 2023/1/11
Y1 - 2023/1/11
N2 - Deep reinforcement learning (DRL) agents have achieved unprecedented results when learning to generalize from unstructured data. However, the “black-box” nature of trained DRL agents makes it difficult to ensure that they adhere to various requirements posed by engineers. In this work, we put forth a novel technique for enhancing the reinforcement learning training loop, and specifically its reward function, in a way that allows engineers to directly inject their expert knowledge into the training process. This allows us to make the trained agent adhere to multiple constraints of interest. Moreover, using scenario-based modeling techniques, our method allows users to formulate the defined constraints using advanced, well-established behavioral modeling methods. The combination of such modeling methods with machine learning tools produces agents that are both high-performing and more likely to adhere to prescribed constraints. Furthermore, the resulting agents are more transparent and hence more maintainable. We demonstrate our technique by evaluating it on a case study from the domain of internet congestion control, and present promising results.
AB - Deep reinforcement learning (DRL) agents have achieved unprecedented results when learning to generalize from unstructured data. However, the “black-box” nature of trained DRL agents makes it difficult to ensure that they adhere to various requirements posed by engineers. In this work, we put forth a novel technique for enhancing the reinforcement learning training loop, and specifically its reward function, in a way that allows engineers to directly inject their expert knowledge into the training process. This allows us to make the trained agent adhere to multiple constraints of interest. Moreover, using scenario-based modeling techniques, our method allows users to formulate the defined constraints using advanced, well-established behavioral modeling methods. The combination of such modeling methods with machine learning tools produces agents that are both high-performing and more likely to adhere to prescribed constraints. Furthermore, the resulting agents are more transparent and hence more maintainable. We demonstrate our technique by evaluating it on a case study from the domain of internet congestion control, and present promising results.
KW - Deep reinforcement learning
KW - Domain expertise
KW - Machine learning
KW - Rule-based specifications
KW - Scenario-based modeling
UR - http://www.scopus.com/inward/record.url?scp=85146228557&partnerID=8YFLogxK
U2 - 10.1007/s42979-022-01575-2
DO - 10.1007/s42979-022-01575-2
M3 - Article
AN - SCOPUS:85146228557
SN - 2662-995X
VL - 4
JO - SN Computer Science
JF - SN Computer Science
IS - 2
M1 - 156
ER -