C-Trace: A new algorithm for reinforcement learning of robotic control. Pendrith M.D., Ryan M.R.K., ROBOLEARN-96, Key West, Florida, 19-20 May, 1996.

There has been much recent interest in the potential of using reinforcement learning techniques for control in autonomous robotic agents. However, implementing effective reinforcement learning in a real-world robotic environment still raises many open questions. Are standard reinforcement learning algorithms like Watkins' Q-Learning appropriate, or are other approaches more suitable? Two specific issues to be considered are noise/disturbance and the possibly non-Markovian aspects of the control problem. These are the issues we focus on in this paper.

The test-bed for the experiments described in this paper is a real six-legged insectoid walking robot; the task set is to learn an effectively co-ordinated walking gait. The performance of a new algorithm we call C-Trace is compared to Watkins' well-known 1-step Q-Learning reinforcement learning algorithm. We discuss the markedly superior performance of this new algorithm in the context of both theoretical and existing empirical results regarding learning in noisy and non-Markovian domains.
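
For readers unfamiliar with the baseline, the following is a minimal sketch of the standard 1-step Q-Learning update, Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)). It is illustrative only and is not taken from the paper: the tabular representation, the state and action names, and the values of alpha and gamma are all assumptions. C-Trace itself is not sketched here, since this abstract does not define it.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.9):
    """Apply one 1-step Q-learning backup in place and return the new value.

    q is a table mapping (state, action) pairs to value estimates.
    alpha (learning rate) and gamma (discount factor) are assumed values.
    """
    # Greedy estimate of the value of the successor state.
    best_next = max(q[(next_state, a)] for a in actions)
    # 1-step target: immediate reward plus discounted successor value.
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the target.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical state and action names, chosen only for illustration.
q = defaultdict(float)
actions = ["lift_leg", "swing_leg"]
new_value = q_learning_update(q, "s0", "lift_leg", 1.0, "s1", actions)
```

Because each backup uses only the immediate reward and the estimated value of the next state, this update relies on the Markov property, which is one reason the noisy and possibly non-Markovian robot domain is of interest.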


Mark Pendrith - pendrith@cse.unsw.edu.au
Malcolm Ryan - malcolmr@cse.unsw.edu.au