TY - UNPB
T1 - Metric-Based Imitation Learning Between Two Dissimilar Anthropomorphic Robotic Arms
AU - Ebner von Eschenbach, Marcus
AU - Manela, Binyamin
AU - Peters, Jan
AU - Biess, Armin
PY - 2020/2/25
Y1 - 2020/2/25
N2 - The development of autonomous robotic systems that can learn from human demonstrations to imitate a desired behavior - rather than being manually programmed - has huge technological potential. One major challenge in imitation learning is the correspondence problem: how to establish corresponding states and actions between expert and learner, when the embodiments of the agents are different (morphology, dynamics, degrees of freedom, etc.). Many existing approaches in imitation learning circumvent the correspondence problem, for example, kinesthetic teaching or teleoperation, which are performed on the robot. In this work we explicitly address the correspondence problem by introducing a distance measure between dissimilar embodiments. This measure is then used as a loss function for static pose imitation and as a feedback signal within a model-free deep reinforcement learning framework for dynamic movement imitation between two anthropomorphic robotic arms in simulation. We find that the measure is well suited for describing the similarity between embodiments and for learning imitation policies by distance minimization.
AB - The development of autonomous robotic systems that can learn from human demonstrations to imitate a desired behavior - rather than being manually programmed - has huge technological potential. One major challenge in imitation learning is the correspondence problem: how to establish corresponding states and actions between expert and learner, when the embodiments of the agents are different (morphology, dynamics, degrees of freedom, etc.). Many existing approaches in imitation learning circumvent the correspondence problem, for example, kinesthetic teaching or teleoperation, which are performed on the robot. In this work we explicitly address the correspondence problem by introducing a distance measure between dissimilar embodiments. This measure is then used as a loss function for static pose imitation and as a feedback signal within a model-free deep reinforcement learning framework for dynamic movement imitation between two anthropomorphic robotic arms in simulation. We find that the measure is well suited for describing the similarity between embodiments and for learning imitation policies by distance minimization.
KW - Computer Science - Robotics
KW - Computer Science - Machine Learning
KW - Statistics - Machine Learning
KW - I.2.6
KW - I.2.9
U2 - 10.48550/arXiv.2003.02638
DO - 10.48550/arXiv.2003.02638
M3 - Preprint
BT - Metric-Based Imitation Learning Between Two Dissimilar Anthropomorphic Robotic Arms
ER -