Team-Imitate-Synchronize for Solving Dec-POMDPs.

Eliran Abdoo, Ronen I. Brafman, Guy Shani, Nitsan Soffair

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review


Multi-agent collaboration under partial observability is a difficult task. Multi-agent reinforcement learning (MARL) algorithms that do not leverage a model of the environment struggle with tasks that require sequences of collaborative actions, while Dec-POMDP algorithms that use such models to compute near-optimal policies scale poorly. In this paper, we suggest the Team-Imitate-Synchronize (TIS) approach, a heuristic, model-based method for solving such problems. Our approach begins by solving the joint team problem, assuming that observations are shared. Then, for each agent, we solve a single-agent problem designed to imitate its behavior within the team plan. Finally, we adjust the single-agent policies for better synchronization. Our experiments demonstrate that our method provides solutions comparable to those of Dec-POMDP solvers on small problems while scaling to much larger problems, and finds collaborative plans that MARL algorithms are unable to identify.
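The three phases described in the abstract can be sketched schematically. This is a minimal illustrative toy, not the paper's actual algorithm: the problem representation, the stand-in solvers, and all function names here are hypothetical, and the real TIS phases involve planning in POMDPs rather than the table lookups used below.

```python
# Hypothetical sketch of the Team-Imitate-Synchronize pipeline structure.
# All names and the toy problem format are illustrative assumptions.

def solve_joint_team_problem(problem):
    # Phase 1 (Team): plan for the whole team as if observations were
    # shared. Stand-in: map every joint observation to one joint action.
    return {obs: problem["joint_actions"][0]
            for obs in problem["joint_observations"]}

def imitate(team_policy, agent, problem):
    # Phase 2 (Imitate): derive a single-agent policy that mimics this
    # agent's role within the team plan.
    return {obs: team_policy[obs][agent]
            for obs in problem["joint_observations"]}

def synchronize(policies):
    # Phase 3 (Synchronize): adjust the individual policies so their
    # actions stay coordinated (a no-op in this toy sketch).
    return policies

def team_imitate_synchronize(problem):
    team_policy = solve_joint_team_problem(problem)
    policies = [imitate(team_policy, i, problem)
                for i in range(problem["n_agents"])]
    return synchronize(policies)

# Toy two-agent problem: both observations call for the same joint action.
toy = {
    "n_agents": 2,
    "joint_observations": ["o1", "o2"],
    "joint_actions": [("push", "pull")],
}
print(team_imitate_synchronize(toy))
# → [{'o1': 'push', 'o2': 'push'}, {'o1': 'pull', 'o2': 'pull'}]
```

The sketch only shows the data flow between the phases; in the paper each phase is itself a planning problem.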
Original language: English
Title of host publication: Joint European Conference on Machine Learning and Knowledge Discovery in Databases ECML/PKDD (4)
Editors: Massih-Reza Amini, Stéphane Canu, Asja Fischer, Tias Guns, Petra Kralj Novak, Grigorios Tsoumakas
Publisher: Springer Cham
Number of pages: 17
ISBN (Electronic): 978-3-031-26412-2
ISBN (Print): 978-3-031-26411-5
State: Published - 17 Mar 2023

Publication series

Name: Lecture Notes in Computer Science
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


