autonlab.org

Multi-Value-Functions: Efficient Automatic Action Hierarchies for Multiple Goal MDPs (1999)

Andrew Moore, Leemon Baird

Tags

Markov Decision Processes, Optimization, Reinforcement Learning

Abstract

Suppose you have planned to achieve one particular goal in a stochastic delayed-rewards problem, and then someone asks about a different goal: what should you do? What if you need to be ready to quickly supply an answer for any possible goal? This paper shows that, by using a new kind of automatically generated abstract action hierarchy, preparing for all N possible goals in an N-state problem can be much cheaper than N times the work of preparing for one goal. In goal-based Markov Decision Problems, it is usual to generate a policy π(x), mapping states to actions, and a value function J(x), mapping states to an estimate of the minimum expected cost-to-goal starting at x. In this paper we use the terminology that a multi-policy π*(x, y) (for all state-pairs (x, y)) maps a state x to the first action it should take in order to reach y with minimum expected cost, and a multi-value-function J*(x, y) is defined as this minimum cost. Building these objects quickly and with little memory is the main purpose of this paper, but a secondary result is a natural, automatic way to create a set of parsimonious yet powerful abstract actions for MDPs. The paper concludes with a set of empirical results on increasingly large MDPs.
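
To make the definitions in the abstract concrete, the following is a minimal sketch of the naive baseline the abstract compares against: filling in the multi-value-function and multi-policy by solving one single-goal problem per goal, that is, N times the work of one goal. It is not the hierarchical method of the paper. The 4-state chain, the 0.8 success probability, the unit step cost, and all function names are assumptions made up for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical toy MDP (not from the paper): a 4-state chain in which each
# move succeeds with probability 0.8 and otherwise leaves the state unchanged.
N_STATES = 4
ACTIONS = ("left", "right")
STEP_COST = 1.0  # assumed cost of every action taken before reaching the goal


def transitions(x: int, action: str) -> List[Tuple[int, float]]:
    """Return (next_state, probability) pairs for taking `action` in state x."""
    target = max(x - 1, 0) if action == "left" else min(x + 1, N_STATES - 1)
    if target == x:  # bumping into the end of the chain
        return [(x, 1.0)]
    return [(target, 0.8), (x, 0.2)]


def q_value(x: int, action: str, J: List[float]) -> float:
    """Expected cost of taking `action` in x and then following the costs in J."""
    return STEP_COST + sum(p * J[nx] for nx, p in transitions(x, action))


def solve_single_goal(goal: int, tol: float = 1e-10, max_sweeps: int = 10_000):
    """Plain value iteration for one goal: returns J(x) and the policy pi(x)."""
    J = [0.0] * N_STATES  # the goal state keeps cost 0 throughout
    for _ in range(max_sweeps):
        delta = 0.0
        for x in range(N_STATES):
            if x == goal:
                continue
            new = min(q_value(x, a, J) for a in ACTIONS)
            delta = max(delta, abs(new - J[x]))
            J[x] = new
        if delta < tol:
            break
    policy: List[Optional[str]] = [
        None if x == goal else min(ACTIONS, key=lambda a: q_value(x, a, J))
        for x in range(N_STATES)
    ]
    return J, policy


def solve_all_goals():
    """Naive multi-value-function: one full value iteration per goal y."""
    multi_J = [[0.0] * N_STATES for _ in range(N_STATES)]
    multi_pi: List[List[Optional[str]]] = [[None] * N_STATES for _ in range(N_STATES)]
    for y in range(N_STATES):
        J, policy = solve_single_goal(y)
        for x in range(N_STATES):
            multi_J[x][y] = J[x]        # J*(x, y): minimum expected cost from x to y
            multi_pi[x][y] = policy[x]  # pi*(x, y): first action from x toward y
    return multi_J, multi_pi


multi_J, multi_pi = solve_all_goals()
print(round(multi_J[0][3], 4), multi_pi[0][3])
```

In this sketch each single step has expected cost 1 / 0.8 = 1.25, so the entry for travelling from state 0 to state 3 comes out near 3.75. Both the table size and the total work grow with the number of goals, which is the cost the paper sets out to reduce.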

Full text

Download (application/pdf, 447.5 kB)

Approximate BibTeX Entry

@inproceedings{moore-multivalue,
    Year = {1999},
    Pages = {1316--1323},
    Publisher = {Morgan Kaufmann},
    Address = {340 Pine Street, 6th Fl., San Francisco, CA 94104},
    Booktitle = {Proceedings of the International Joint Conference on Artificial Intelligence, Stockholm},
    Author = {Andrew Moore and Leemon Baird},
    Title = {Multi-Value-Functions: Efficient Automatic Action Hierarchies for Multiple Goal MDPs}
}

Copyright 2008, Carnegie Mellon University, Auton Lab. All Rights Reserved.