tutorials:mdp.html

**How do you plan efficiently if the results of your actions are
uncertain? There is some remarkably good news, and some some significant
computational hardship. We begin by discussing Markov Systems (which
have no actions) and the notion of Markov Systems with Rewards. We then
motivate and explain the idea of infinite horizon discounted future
rewards. And then we look at two competing approaches to deal with the
following computational problem: given a Markov System with Rewards,
compute the expected long-term discounted rewards. The two methods,
which usually sit at opposite corners of the ring and snarl at each
other, are straight linear algebra and dynamic programming. We then make
the leap up to Markov Decision Processes, and find that we've already
done 82% of the work needed to compute not only the long term rewards of
each MDP state, but also the optimal action to take in each state.**

- In addition to these slides, for a survey on Reinforcement Learning, please see this paper or Sutton and Barto's book.

Download Tutorial Slides (PDF format)

**Powerpoint Format:** The Powerpoint originals of these slides are
freely available to anyone who wishes to use them for their own work, or
who wishes to teach using them in an academic institution. Please email
Andrew Moore at awm@cs.cmu.edu if you would
like him to send them to you. The only restriction is that they are not
freely available for use as teaching materials in classes or tutorials
outside degree-granting academic institutions.

**Advertisment: I have recently joined Google, and am starting up the
new Google Pittsburgh office on CMU's campus. We are hiring creative
computer scientists who love programming, and Machine Learning is one
the focus areas of the office. If you might be interested, feel welcome
to send me email: awm@google.com .**

tutorials/mdp.html.txt · Last modified: 2016/10/04 21:27 (external edit)

Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Noncommercial 4.0 International