Abstract
We consider in this paper the application of deep reinforcement learning techniques to learning closed-loop control and goal-oriented trajectory planning in a robotic application. We employ an end-to-end (from the motor input to the required task), model-free approach, using a deep Q-learning framework to learn a motor skill. We propose several improvements to the naive deep Q-learning algorithm, which otherwise fails. First, we use rough prior knowledge of the task goal to heuristically explore the environment. Second, we prevent the so-called catastrophic forgetting of neural networks. We present simulation results for an accurate striking task in air hockey and show that the proposed modifications make the learning algorithm successful and stable. We also present simulations that further support our claim of successfully mitigating catastrophic forgetting.
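The abstract gives no implementation details, so the following is only a minimal Python sketch of the first modification, heuristic exploration, under the assumption of a standard epsilon-greedy deep Q-learning setup. All names here (`QNetwork`, `heuristic_action`, `select_action`) are hypothetical, and the placeholder heuristic stands in for the paper's actual prior knowledge about the striking goal.

```python
import random
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Small MLP mapping a state vector to one Q-value per action.
    Illustrative only; the paper's actual architecture is not given."""
    def __init__(self, state_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)

def heuristic_action(state: torch.Tensor, n_actions: int) -> int:
    """Placeholder for rough prior knowledge about the goal, e.g. an
    action that moves the mallet toward the puck. Random here only so
    the sketch runs; a real heuristic would use the task geometry."""
    return random.randrange(n_actions)

def select_action(q_net: QNetwork, state: torch.Tensor,
                  epsilon: float, n_actions: int) -> int:
    """Epsilon-greedy, except the exploration branch queries the
    heuristic instead of sampling uniformly, biasing early experience
    toward goal-relevant states."""
    if random.random() < epsilon:
        return heuristic_action(state, n_actions)
    with torch.no_grad():
        return int(q_net(state.unsqueeze(0)).argmax(dim=1).item())
```

Routing exploration through a goal-aware heuristic rather than a uniform draw is one common way to inject rough prior knowledge without changing the Q-learning update itself.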
| Original language | English |
|---|---|
| Pages (from-to) | 158-169 |
| Journal | International Journal of Mathematical Models and Methods in Applied Sciences |
| Volume | 11 |
| State | Published - 2017 |
| Externally published | Yes |