
Solving Dec-POMDPs as POMDPs Using Imitation Learning

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Decentralized partially observable Markov decision processes (Dec-POMDPs) model cooperative, sequential multi-agent decision problems. They are computationally challenging to solve, and scaling up solution methods is difficult. We describe a method for solving Dec-POMDPs in the paradigm of centralized planning with distributed execution. First, we solve a team POMDP in which agent observations are common knowledge. Then, each agent uses imitation learning to try to imitate its part of the centralized policy. Unlike some previous work, each agent tries to imitate not only its behavior within the team but also its belief state. A final offline synchronization stage improves the likelihood that the agents' policies will be well coordinated with each other. On standard Dec-POMDP benchmarks, our method performs better than both the best model-based Dec-POMDP solution method and QMIX, a leading multi-agent RL algorithm.
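The belief state mentioned in the abstract can be illustrated with the textbook POMDP Bayes filter. The sketch below is not taken from the paper. It assumes a small tabular model in which `action` and `observation` are indices of a joint action and a joint observation of the team, and the names `belief_update`, `T`, and `O` are hypothetical.

```python
def belief_update(belief, T, O, action, observation):
    """Textbook POMDP belief update (Bayes filter) on a tabular model.

    Assumed layout (illustrative, not from the paper):
      belief[s]       = P(s)
      T[a][s][s2]     = P(s2 | s, a)   for joint action a
      O[a][s2][z]     = P(z | s2, a)   for joint observation z
    """
    n = len(belief)
    # Prediction step: push the current belief through the transition model.
    predicted = [
        sum(belief[s] * T[action][s][s2] for s in range(n))
        for s2 in range(n)
    ]
    # Correction step: weight each next state by the observation likelihood.
    unnormalized = [O[action][s2][observation] * predicted[s2] for s2 in range(n)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("observation has zero probability under this belief")
    return [u / total for u in unnormalized]


# Toy two-state example: the state never changes and the joint
# observation reports the true state with probability 0.85.
T = [[[1.0, 0.0], [0.0, 1.0]]]
O = [[[0.85, 0.15], [0.15, 0.85]]]
new_belief = belief_update([0.5, 0.5], T, O, action=0, observation=0)
```

In a team POMDP of the kind described above, a centralized planner could maintain such a belief over the joint observation history. How each agent approximates it from its own local observations is the part the paper addresses, and it is not shown here.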

Original language: English
Title of host publication: PRIMA 2025
Subtitle of host publication: Principles and Practice of Multi-Agent Systems - 26th International Conference, Proceedings
Editors: Catalin Dima, Angelo Ferrando, Vadim Malvone
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 117-132
Number of pages: 16
ISBN (Print): 9783032135612
DOIs
State: Published - 1 Jan 2026
Event: 26th International Conference on Principles and Practice of Multi-Agent Systems, PRIMA 2025 - Modena, Italy
Duration: 16 Dec 2025 - 19 Dec 2025

Publication series

Name: Lecture Notes in Computer Science
Volume: 16366 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 26th International Conference on Principles and Practice of Multi-Agent Systems, PRIMA 2025
Country/Territory: Italy
City: Modena
Period: 16/12/25 - 19/12/25

Keywords

  • Dec-POMDP
  • Imitation Learning
  • POMDP

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
