autonlab.org

Covariant Policy Search (2003)

Drew Bagnell, Jeff Schneider

Tags

Markov Decision Processes, Optimization, Reinforcement Learning

Abstract

We investigate the problem of non-covariant behavior of policy gradient reinforcement learning algorithms. The policy gradient approach is amenable to analysis by information geometric methods. This leads us to propose a natural metric on controller parameterization that results from considering the manifold of probability distributions over paths induced by a stochastic controller. Investigation of this approach leads to a covariant gradient ascent rule. Interesting properties of this rule are discussed, including its relation with actor-critic style reinforcement learning algorithms. The algorithms discussed here are computationally quite efficient and on some interesting problems lead to dramatic performance improvement over non-covariant rules.
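
The abstract does not state the update rule itself, so the following is only a minimal sketch of what "covariant" means for a gradient step. It assumes a step of the generic form lr * G^{-1} * grad, where G is a positive-definite metric on the parameters. The metric, gradient, and reparameterization matrix below are made-up numbers, and every function name is hypothetical. In particular, G here is not the metric over path distributions that the paper proposes. The sketch compares one step taken in parameters theta with the same step taken in parameters phi, where theta = A * phi.

```python
# Illustrative only: all values and helper names are assumptions, not the paper's.

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def transpose(M):
    return [list(col) for col in zip(*M)]

def inv2(M):
    """Invert a 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def plain_step(grad, lr):
    """Ordinary gradient ascent step: lr * grad."""
    return [lr * g for g in grad]

def covariant_step(grad, metric, lr):
    """Metric-preconditioned step: lr * metric^{-1} * grad."""
    return [lr * x for x in matvec(inv2(metric), grad)]

def max_abs_diff(u, v):
    return max(abs(a - b) for a, b in zip(u, v))

# Made-up gradient and positive-definite metric in theta coordinates.
grad_theta = [1.0, -2.0]
G_theta = [[2.0, 0.5], [0.5, 1.0]]

# Made-up invertible linear reparameterization: theta = A * phi.
A = [[3.0, 1.0], [0.0, 0.5]]
lr = 0.1

# Chain rule: the gradient transforms as A^T * grad, the metric as A^T * G * A.
grad_phi = matvec(transpose(A), grad_theta)
G_phi = matmul(matmul(transpose(A), G_theta), A)

# Take each step in phi coordinates, then map the step back to theta via A.
plain_theta = plain_step(grad_theta, lr)
plain_phi_as_theta = matvec(A, plain_step(grad_phi, lr))

cov_theta = covariant_step(grad_theta, G_theta, lr)
cov_phi_as_theta = matvec(A, covariant_step(grad_phi, G_phi, lr))

print("plain step mismatch:    ", max_abs_diff(plain_theta, plain_phi_as_theta))
print("covariant step mismatch:", max_abs_diff(cov_theta, cov_phi_as_theta))
```

Under these assumptions the plain step depends on which parameterization was used, while the metric-preconditioned step agrees in both up to floating-point rounding, because the factors of A cancel when the metric is transformed along with the gradient.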

Full text

Download (application/pdf, 135.8 kB)

Approximate BibTeX Entry

@inproceedings{bagnellCovariant,
    Month = {July},
    Year = {2003},
    Booktitle = {Proceedings of the International Joint Conference on Artificial Intelligence},
    Author = {Drew Bagnell and Jeff Schneider},
    Title = {Covariant Policy Search}
}

Copyright 2010, Carnegie Mellon University, Auton Lab. All Rights Reserved.