Reinforcement Learning

My work in Reinforcement Learning began at the Turing Institute in 1987 when, under contract from the Westinghouse Corporation, we developed a procedure for controlling an Earth-orbiting satellite. Conventional control theory requires a mathematical model that predicts the behaviour of a process so that appropriate control decisions can be made. Many processes, however, are too complicated to model accurately, and often not enough information is available about the process's environment. In such cases an adaptive controller may work: it learns how to use the available control actions to meet the system's objective. The process is treated as a 'black box', and the program learns to interact with it by conditioned response.

BOXES learning to control the pole and cart system.
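
To make the 'black box' idea concrete, here is a minimal sketch of a BOXES-style learner on a simulated pole and cart. It is an illustration under stated assumptions, not the original Michie and Chambers update rule or the code used in the work described here. The physics constants, the partition thresholds, the decay factor, and the names `step`, `box_of` and `BoxesLearner` are all hypothetical choices made for this example. The learner sees only which box the state falls in and when the trial fails. For each box it keeps a decayed record of how long the system survived after pushing left or right, and it prefers the action with the better record.

```python
import math
import random

# Assumed simulation constants (illustrative values only).
GRAVITY, CART_MASS, POLE_MASS, POLE_HALF_LEN = 9.8, 1.0, 0.1, 0.5
FORCE, DT = 10.0, 0.02
X_LIMIT, THETA_LIMIT = 2.4, math.radians(12)

def step(state, action):
    """Advance the cart-pole one time step (Euler); action 1 pushes right, 0 left."""
    x, x_dot, theta, theta_dot = state
    force = FORCE if action == 1 else -FORCE
    total_mass = CART_MASS + POLE_MASS
    pm_len = POLE_MASS * POLE_HALF_LEN
    temp = (force + pm_len * theta_dot ** 2 * math.sin(theta)) / total_mass
    theta_acc = (GRAVITY * math.sin(theta) - math.cos(theta) * temp) / (
        POLE_HALF_LEN * (4.0 / 3.0 - POLE_MASS * math.cos(theta) ** 2 / total_mass))
    x_acc = temp - pm_len * theta_acc * math.cos(theta) / total_mass
    return (x + DT * x_dot, x_dot + DT * x_acc,
            theta + DT * theta_dot, theta_dot + DT * theta_acc)

def failed(state):
    """The only feedback the learner gets: cart off the track or pole fallen."""
    return abs(state[0]) > X_LIMIT or abs(state[2]) > THETA_LIMIT

def bucket(value, thresholds):
    """Index of the interval that value falls in, given ascending thresholds."""
    return sum(value > t for t in thresholds)

def box_of(state):
    """Map a continuous state to a discrete box (thresholds are assumptions)."""
    x, x_dot, theta, theta_dot = state
    deg = math.radians
    return (bucket(x, (-0.8, 0.8)),
            bucket(x_dot, (-0.5, 0.5)),
            bucket(theta, (deg(-6), deg(-1), 0.0, deg(1), deg(6))),
            bucket(theta_dot, (deg(-50), deg(50))))

class BoxesLearner:
    """Simplified BOXES-style learner: per-box, per-action survival statistics."""

    def __init__(self, decay=0.99, seed=0):
        self.life = {}    # (box, action) -> decayed sum of steps survived afterwards
        self.usage = {}   # (box, action) -> decayed count of times the action was used
        self.decay = decay
        self.rng = random.Random(seed)

    def value(self, box, action):
        used = self.usage.get((box, action), 0.0)
        if used == 0.0:
            return math.inf  # untried actions look attractive, which forces a first try
        return self.life[(box, action)] / used

    def choose(self, box):
        left, right = self.value(box, 0), self.value(box, 1)
        if left == right:
            return self.rng.randrange(2)
        return 0 if left > right else 1

    def run_trial(self, max_steps=1000):
        """Run until failure or max_steps, then credit every decision made."""
        state, t, visits = (0.0, 0.0, 0.0, 0.0), 0, []
        while not failed(state) and t < max_steps:
            box = box_of(state)
            action = self.choose(box)
            visits.append((box, action, t))
            state = step(state, action)
            t += 1
        for key in self.life:          # older experience counts for less
            self.life[key] *= self.decay
            self.usage[key] *= self.decay
        for box, action, when in visits:
            key = (box, action)
            self.life[key] = self.life.get(key, 0.0) + (t - when)
            self.usage[key] = self.usage.get(key, 0.0) + 1.0
        return t
```

A possible way to exercise the sketch is `learner = BoxesLearner()` followed by `lengths = [learner.run_trial() for _ in range(200)]`, then inspecting how the trial lengths change over time. Because action selection here is purely greedy, the sketch may settle on poor choices in some boxes. A fuller implementation would need a better way of balancing exploration against the experience gathered so far.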

References

Law, J. K. C. (1992). Adaptive Rule-based Control. Unpublished M.Cog.Sc., School of Computer Science and Engineering, University of New South Wales.

Michie, D. and Chambers, R. A. (1968). Boxes: An Experiment in Adaptive Control. In E. Dale and D. Michie (Eds.), Machine Intelligence 2. Edinburgh: Oliver and Boyd.

Sammut, C. (1988). Experimental Results from an Evaluation of Algorithms that Learn to Control Dynamic Systems. Proceedings of the Fifth International Conference on Machine Learning, Ann Arbor, Michigan: Morgan Kaufmann, pp. 437-443.

Sammut, C. and Cribb, J. (1990). Is Learning Rate a Good Performance Criterion of Learning? In B. W. Porter & R. J. Mooney (Eds.), Proceedings of the Seventh International Machine Learning Conference, Austin, Texas: Morgan Kaufmann, pp. 170-178.

Sammut, C. and Michie, D. (1991). Controlling a 'Black Box' Simulation of a Space Craft. AI Magazine, 12(1), pp. 56-63.

Sammut, C. A. (1994). Recent Progress with BOXES. In K. Furukawa, D. Michie & S. Muggleton (Eds.), Machine Intelligence 13. Oxford: The Clarendon Press, OUP, pp. 363-384.

McGarity, M., Clements, D. and Sammut, C. (1995). Controlling a Steel Mill with BOXES. In S. Muggleton, K. Furukawa, & D. Michie (Eds.), Machine Intelligence 14. Oxford University Press.

Claude Sammut
