TITLE: Hierarchical Reinforcement Learning in Adversarial Environments
PRESENTER: Hing-Wah Kwok, Hing-Wah.Kwok@dsto.defence.gov.au
AFFILIATION: School of Computer Science and Engineering, UNSW; Defence Science and Technology Organisation (DSTO), Department of Defence, http://www.dsto.defence.gov.au/
DATE: Friday 14th September 2007
PLACE: CSE Seminar Room, Level 1, K17
Reinforcement learning has been applied successfully to single-agent
domains. However, traditional reinforcement learning has several
shortfalls in concurrent multi-agent adversarial environments, where
the optimal policy depends directly on the policies of the other agents
in the system. Several techniques have been developed for learning in
adversarial domains; we will look specifically at one technique for
handling opponents, the Win or Learn Fast (WoLF) principle.
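The WoLF idea can be sketched in a few lines: the agent keeps a mixed policy and an average policy, and hill-climbs towards the greedy action with a small step when it is "winning" (its current policy beats its average policy against its own value estimates) and a larger step when it is losing. The minimal repeated-game agent below is an illustrative sketch of that principle, not the speaker's implementation; all constants and the stateless-game setting are assumptions.

```python
import random

# Illustrative sketch of WoLF policy hill-climbing for a repeated
# (stateless) matrix game. Step sizes are assumed values.
class WoLFAgent:
    def __init__(self, n_actions, alpha=0.1, delta_win=0.01, delta_lose=0.04):
        self.n = n_actions
        self.q = [0.0] * n_actions                    # action-value estimates
        self.pi = [1.0 / n_actions] * n_actions       # current mixed policy
        self.avg_pi = [1.0 / n_actions] * n_actions   # long-run average policy
        self.count = 0
        self.alpha = alpha
        self.delta_win = delta_win    # learn slowly while winning
        self.delta_lose = delta_lose  # learn fast while losing

    def act(self):
        # Sample an action from the current mixed policy.
        r, c = random.random(), 0.0
        for a, p in enumerate(self.pi):
            c += p
            if r <= c:
                return a
        return self.n - 1

    def update(self, action, reward):
        # Value update for the chosen action.
        self.q[action] += self.alpha * (reward - self.q[action])
        # Incrementally track the average policy.
        self.count += 1
        for a in range(self.n):
            self.avg_pi[a] += (self.pi[a] - self.avg_pi[a]) / self.count
        # "Winning" = current policy scores better against Q than average policy.
        cur = sum(p * q for p, q in zip(self.pi, self.q))
        avg = sum(p * q for p, q in zip(self.avg_pi, self.q))
        delta = self.delta_win if cur > avg else self.delta_lose
        # Hill-climb towards the greedy action by step delta, then renormalise.
        best = max(range(self.n), key=lambda a: self.q[a])
        for a in range(self.n):
            if a == best:
                self.pi[a] = min(1.0, self.pi[a] + delta)
            else:
                self.pi[a] = max(0.0, self.pi[a] - delta / (self.n - 1))
        total = sum(self.pi)
        self.pi = [p / total for p in self.pi]
```

The asymmetric step sizes are the heart of the principle: cautious moves preserve a winning policy, while fast moves escape a losing one, which is what gives WoLF its convergence behaviour in self-play.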
Another shortfall of traditional reinforcement learning is the curse of
dimensionality. Hierarchical value function decomposition was developed
to cope with ever-increasing state spaces: hierarchical techniques have
been shown to yield a significant increase in learning speed at the
expense of true optimality. We will look at adapting hierarchical
reinforcement learning to an adversarial environment, and then show
that a similar speed and performance increase results from combining
adversarial reinforcement learning techniques with these hierarchical
techniques.
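To make "value function decomposition" concrete, here is a toy MAXQ-style sketch: the value of a composite task splits into the value of a subtask plus a completion term for finishing the parent task afterwards. The two-level hierarchy, task names, and all numbers below are invented for illustration and are not from the talk.

```python
# MAXQ-style decomposition: V(i, s) = max_a [ V(a, s) + C(i, s, a) ],
# where C(i, s, a) is the expected reward for completing task i after
# subtask a terminates. The toy tables are assumed values.

# Immediate expected reward of primitive actions in each state.
primitive_reward = {("north", "s0"): -1.0, ("pickup", "s0"): 10.0}

# Completion values C[(task, state, subtask)].
C = {("root", "s0", "north"): 4.0, ("root", "s0", "pickup"): 0.0}

# Task hierarchy: which subtasks each composite task may invoke.
children = {"root": ["north", "pickup"]}

def value(task, state):
    """Recursively evaluate the decomposed value function."""
    if (task, state) in primitive_reward:
        return primitive_reward[(task, state)]   # primitive action
    return max(value(a, state) + C[(task, state, a)]
               for a in children[task])

# value("root", "s0") -> max(-1.0 + 4.0, 10.0 + 0.0) = 10.0
```

Because each subtask learns its own value table over only the state variables it needs, the decomposition is what buys the learning-speed increase mentioned above, while restricting policies to the hierarchy is what costs true optimality.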
BIOGRAPHY OF SPEAKER:
Hing-Wah Kwok is a postgraduate student at CSE under the supervision of
Dr. William Uther (CSE) and Gregory Calbery (DSTO). Hing-Wah is working on
his thesis in the area of Acquisition of Domain Models for Stochastic
Planners and Reinforcement Learning Systems.
Van Hai Ho