A survey of point-based POMDP solvers

Guy Shani, Joelle Pineau, Robert Kaplow

Research output: Contribution to journal › Article › peer-review

218 Scopus citations

Abstract

The past decade has seen a significant breakthrough in research on solving partially observable Markov decision processes (POMDPs). Where past solvers could not scale beyond perhaps a dozen states, modern solvers can handle complex domains with many thousands of states. This breakthrough was mainly due to the idea of restricting value function computations to a finite subset of the belief space, permitting only local value updates for this subset. This approach, known as point-based value iteration, avoids the exponential growth of the value function, and is thus applicable for domains with longer horizons, even with relatively large state spaces. Many extensions were suggested to this basic idea, focusing on various aspects of the algorithm, mainly the selection of the belief space subset and the order of value function updates. In this survey, we walk the reader through the fundamentals of point-based value iteration, explaining the main concepts and ideas. Then, we survey the major extensions to the basic algorithm, discussing their merits. Finally, we include an extensive empirical analysis using well-known benchmarks, in order to shed light on the strengths and limitations of the various approaches.

Original language: English
Pages (from-to): 1-51
Number of pages: 51
Journal: Autonomous Agents and Multi-Agent Systems
Volume: 27
Issue number: 1
State: Published - 1 Jul 2013

Keywords

  • Decision-theoretic planning
  • Partially observable Markov decision processes
  • Reinforcement learning
