An Interface for Black Box Learning in Probabilistic Programs

Jan-Willem van de Meent, Brooks Paige, David Tolpin, Frank Wood

Research output: Contribution to conference (Abstract)

Abstract

In this abstract we are interested in algorithms that combine inference with learning. As a motivating example we consider a program (see Figure 1), written in the language Anglican [7], which simulates the Canadian traveler problem (CTP) domain. In the CTP, an agent must travel along a graph, which represents a network of roads, to get from the start node (green) to the target node (red). Due to bad weather some roads are blocked, but the agent does not know in advance which ones. The agent performs depth-first search along the graph, which requires a varying number of steps depending on which edges are closed, and incurs a cost for the distance traveled. The program in Figure 1 defines two types of policies for the CTP. For the policy where edges are chosen at random, we may perform online planning by simulating future actions and outcomes, also known as rollouts, and choosing the action that minimizes the expected cost. Alternatively, we may learn a policy that, after an initial training period, can be applied without calculating rollouts. To do so we consider a deterministic policy for which we learn a set of parameters (the edge preferences).
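The setup described above can be illustrated with a small simulation. What follows is a minimal sketch in plain Python, not the Anglican program of Figure 1, and it uses simple Monte Carlo averaging in place of the inference machinery discussed in the abstract. All names (`dfs_travel`, `random_order`, `preference_order`, `rollout_first_move`), the toy graph, the independent edge-open probability `p_open`, and the hand-set preference values are assumptions made for illustration. The rollout routine further assumes that the edges incident to the start node have been observed to be open and that worlds in which the target is unreachable are discarded.

```python
import random

def dfs_travel(graph, open_edges, start, target, order_fn):
    """Depth-first travel from start to target using only open edges.

    Returns (distance traveled, whether the target was reached).
    Backtracking out of a dead end is charged the edge length again.
    """
    visited = {start}

    def visit(node):
        if node == target:
            return 0.0, True
        cost = 0.0
        open_neighbors = [n for n in graph[node]
                          if frozenset((node, n)) in open_edges]
        for nxt in order_fn(node, open_neighbors):
            if nxt in visited:
                continue
            visited.add(nxt)
            sub_cost, found = visit(nxt)
            cost += graph[node][nxt] + sub_cost
            if found:
                return cost, True
            cost += graph[node][nxt]  # walk back after a dead end
        return cost, False

    return visit(start)

def random_order(rng):
    """Policy that tries the open edges in a random order."""
    def order(node, neighbors):
        shuffled = list(neighbors)
        rng.shuffle(shuffled)
        return shuffled
    return order

def preference_order(prefs):
    """Deterministic policy: try edges in decreasing order of preference."""
    def order(node, neighbors):
        return sorted(neighbors, key=lambda n: -prefs[frozenset((node, n))])
    return order

def rollout_first_move(graph, start, target, p_open, n_rollouts, rng):
    """Choose the first move with the lowest average simulated cost."""
    edges = {frozenset((u, v)) for u in graph for v in graph[u]}
    # Assumption: roads leaving the start node are known to be open.
    incident = {frozenset((start, v)) for v in graph[start]}
    base = random_order(rng)
    best, best_cost = None, float("inf")
    for first in graph[start]:
        def order(node, neighbors, first=first):
            rest = base(node, neighbors)
            if node == start and first in rest:
                rest.remove(first)
                rest.insert(0, first)  # force the candidate first move
            return rest
        costs = []
        for _ in range(n_rollouts):
            world = incident | {e for e in edges if rng.random() < p_open}
            cost, found = dfs_travel(graph, world, start, target, order)
            if found:  # assumption: skip worlds with no route to the target
                costs.append(cost)
        mean = sum(costs) / len(costs) if costs else float("inf")
        if mean < best_cost:
            best, best_cost = first, mean
    return best, best_cost

# Toy road network (hypothetical): node -> {neighbor: distance}.
graph = {
    "A": {"B": 1.0, "C": 1.0},
    "B": {"A": 1.0, "D": 1.0},
    "C": {"A": 1.0, "D": 5.0},
    "D": {"B": 1.0, "C": 5.0},
}
all_edges = {frozenset((u, v)) for u in graph for v in graph[u]}
# Hand-set edge preferences standing in for learned parameters.
prefs = {e: 1.0 for e in all_edges}
prefs[frozenset(("A", "B"))] = 2.0
prefer_b = preference_order(prefs)

cost_all_open = dfs_travel(graph, all_edges, "A", "D", prefer_b)
cost_bd_closed = dfs_travel(
    graph, all_edges - {frozenset(("B", "D"))}, "A", "D", prefer_b)
first_move = rollout_first_move(graph, "A", "D", 0.7, 200, random.Random(0))
```

In this sketch the rollout planner pays for many simulations at every decision, whereas the preference-based policy only sorts edges by a fixed set of parameters. That is the trade-off the abstract points to when it contrasts online planning with a learned policy; how the edge preferences would actually be learned is not shown here.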
Original language: English
State: Published - 23 Jan 2016
Event: POPL Workshop on Probabilistic Programming Semantics - Guadalajara, Mexico City, Mexico
Duration: 23 Jan 2016 → 23 Jan 2016

Conference

Conference: POPL Workshop on Probabilistic Programming Semantics
Country/Territory: Mexico
City: Guadalajara, Mexico City
Period: 23/01/16 → 23/01/16
