Hierarchical heuristic forward search in stochastic domains

Nicolas Meuleau, Ronen I. Brafman

Research output: Contribution to journal › Conference article › peer-review

6 Scopus citations

Abstract

Many MDPs exhibit a hierarchical structure in which the agent needs to perform various subtasks that are coupled only by a small subset of variables containing, notably, shared resources. Previous work has shown how this hierarchical structure can be exploited: instead of solving one huge MDP representing the whole problem, one solves several sub-MDPs representing the different subtasks in different calling contexts, together with a root MDP responsible for sequencing and synchronizing the subtasks. Another important idea, used by efficient algorithms for solving flat MDPs such as (L)AO* and (L)RTDP, is to exploit reachability information and an admissible heuristic in order to accelerate the search by pruning states that cannot be reached from a given starting state under an optimal policy. In this paper, we combine both ideas and develop a variant of the AO* algorithm for performing forward heuristic search in hierarchical models. This algorithm shows large performance improvements over hierarchical approaches using standard MDP solvers such as Value Iteration, as well as over AO* applied to a flat representation of the problem. Moreover, it presents a general new method for accelerating AO* and other forward search algorithms: substantial performance gains may be obtained in these algorithms by partitioning the set of search nodes and solving a subset of nodes completely before propagating the results to other subsets.
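
The closing idea of the abstract, solving one subset of nodes completely before propagating its results to other subsets, can be illustrated with a minimal sketch. This is not the authors' hierarchical AO* variant: it is plain value iteration applied block by block to a toy stochastic chain, and the names `TRANSITIONS`, `bellman_backup`, `solve_block` and `partitioned_solve` are hypothetical, as are the uniform step cost and the zero initial values (used here as a trivially admissible heuristic).

```python
# Toy stochastic chain (an assumption for illustration): from state s the
# single action "go" reaches s + 1 with probability 0.5 and stays put
# otherwise. State 3 is the goal and has no actions.
TRANSITIONS = {
    0: {"go": [(1, 0.5), (0, 0.5)]},
    1: {"go": [(2, 0.5), (1, 0.5)]},
    2: {"go": [(3, 0.5), (2, 0.5)]},
    3: {},
}


def bellman_backup(state, values, transitions, step_cost):
    """Minimum expected cost-to-go over the actions available in `state`."""
    return min(
        step_cost + sum(prob * values[nxt] for nxt, prob in outcomes)
        for outcomes in transitions[state].values()
    )


def solve_block(block, values, transitions, step_cost, tol):
    """Back up the states of one block until they converge.

    Returns the largest single update applied during the call.
    """
    largest = 0.0
    while True:
        residual = 0.0
        for state in block:
            new_value = bellman_backup(state, values, transitions, step_cost)
            residual = max(residual, abs(new_value - values[state]))
            values[state] = new_value
        largest = max(largest, residual)
        if residual < tol:
            return largest


def partitioned_solve(blocks, transitions, step_cost=1.0, tol=1e-9):
    """Solve each block to convergence in turn; repeat until nothing changes."""
    values = {state: 0.0 for state in transitions}  # zero is a lower bound on cost
    while True:
        change = 0.0
        for block in blocks:
            change = max(change, solve_block(block, values, transitions, step_cost, tol))
        if change < tol:
            return values


# Blocks are ordered from the goal side towards the start state.
values = partitioned_solve([[2], [1], [0]], TRANSITIONS)
```

In this chain the expected cost satisfies V(s) = 2 + V(s+1), so the values approach 2, 4 and 6 for states 2, 1 and 0. Because each block is converged before the next one reads its values, the second outer pass only confirms that nothing changes.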

Original language: English
Pages (from-to): 2542-2549
Number of pages: 8
Journal: IJCAI International Joint Conference on Artificial Intelligence
State: Published - 1 Dec 2007
Event: 20th International Joint Conference on Artificial Intelligence, IJCAI 2007 - Hyderabad, India
Duration: 6 Jan 2007 → 12 Jan 2007

ASJC Scopus subject areas

  • Artificial Intelligence
