|
TITLE: Patching Approximate Solutions in Reinforcement Learning
PRESENTER: Min Sub Kim, http://www.cse.unsw.edu.au/db/staff/info/msk.html, msk@cse.unsw.EDU.AU
AFFILIATION: School of Computer Science and Engineering, University of New South Wales, http://www.cse.unsw.edu.au
DATE: Friday 8th September 2006
TIME: 12:30:00
PLACE: CSE Seminar Room, Level 1, K17
ABSTRACT:
In this talk, I will present an approach to improving an approximate
solution in reinforcement learning by augmenting it with a small
overriding patch. Approximations are widely used in reinforcement
learning to cope with large problems, and offer several potential
advantages such as faster learning and reduced storage requirements.
However, these gains usually come at the cost of solution quality,
and the best solution within the constraints of the approximation may
be arbitrarily worse than the global optimum.
I will present a technique for efficiently learning a small patch, which,
when combined with an approximate solution to a reinforcement learning
problem, produces performance much closer to the global optimum. This
approach is motivated by the observation that the sub-optimality of many
approximate solutions may be attributed to poor behaviour in small but
important parts of the problem. Augmenting the solution with an
overriding patch can overcome these shortcomings while retaining the
benefits of approximation elsewhere. Empirical evaluation demonstrates
the effectiveness of patching, producing combined solutions that are much
closer to the global optimum.
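As a rough illustration of the overriding idea only, the sketch below
combines an approximate policy with a small table of exceptions. The
abstract does not say how the patch is represented or learned, so the
names (make_patched_policy, base_policy, patch) and the toy corridor
problem are assumptions made for this example, not the method presented
in the talk.

```python
from typing import Callable, Dict, Hashable

State = Hashable
Action = Hashable


def make_patched_policy(
    base_policy: Callable[[State], Action],
    patch: Dict[State, Action],
) -> Callable[[State], Action]:
    """Return a policy that prefers the patch and falls back to base_policy.

    Hypothetical helper: the patch is assumed to be a small lookup table
    covering only the states where the approximate solution behaves badly.
    """
    def policy(state: State) -> Action:
        if state in patch:
            return patch[state]      # the patch overrides the approximation
        return base_policy(state)    # elsewhere, keep the cheap approximation
    return policy


# Toy corridor with states 0..9 (an assumption for illustration).
# The approximate policy always moves right, which is wrong in state 4.
def approximate_policy(state: int) -> str:
    return "right"


patch = {4: "jump"}  # one overriding entry for the problematic state
policy = make_patched_policy(approximate_policy, patch)
actions = [policy(s) for s in range(10)]
```

Here the patch stores a single entry, so the combined policy needs little
storage beyond the approximation itself while correcting the one state
where the approximation fails.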
BIOGRAPHY OF SPEAKER:
Min is a PhD student at the ARC Centre of Excellence for Autonomous
Systems in the School of Computer Science and Engineering at the
University of New South Wales. His PhD research is on reinforcement
learning and its application to games.
His other research interests include decision-theoretic planning, machine
learning, and applications of machine learning to control and robotics.
Seminar information is also available at
http://www.cse.unsw.edu.au/db/ai/seminars/list/index.html
Host:
William Uther
Seminar Convenor:
Van Hai Ho
|