Introduction
Machine learning has generally ignored time in supervised
classification. There are many tools for learning from static
information, such as classifying flowers according to their
attributes or determining people's credit-worthiness, but these are
not the only types of real-world classification problem.
Most real domains change over time. It is interesting and useful not
just to recognise when our classification is obsolete (changes in the
economic environment, for example, may affect the accuracy of a
credit-worthiness classifier), but also to use the patterns of change
over time directly as a means of classification.
Consider a typical real-world temporal domain: speech. Our vocal
cords generate sound whose amplitude and frequency vary over time.
These variations denote a higher-level concept, such as a word.
Looking at the amplitude or frequency at one point in time is unlikely
to help recognise words; it is only by looking at how the amplitude
and frequency vary that classification becomes possible. Other
examples include:
- Recognising action sequences or gestures. These arise in areas
such as human-computer interaction, handwriting recognition and
robot imitation.
- Medical applications. A patient's body rhythms are frequently
recorded and used for diagnosis, for example: electrocardiographs,
electroencephalographs, levels of various chemicals in the body and
so on.
- Observing sequences of events and extracting higher-level
meaning from them. For example, looking at network logs and
trying to identify causes of congestion problems.
- Economic and financial time series, where the user may be
interested in event patterns that indicate or precede a particular
phenomenon. For example, the user might be interested in the events
preceding a large market crash.
- Industrial/scientific data often includes temporal data;
increasingly, today's production facilities have many embedded
sensors which produce measurements at regular time intervals. The
task of interest may, for example, be detecting when catastrophic
events will occur, based on measurements of volume, pressure,
temperature or thickness.
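The point made above about speech, that a value at a single instant says little while the way values vary over time says a lot, can be illustrated with a toy sketch. The two signals and the `mean_slope` feature below are hypothetical choices made purely for illustration; they are not taken from the thesis or from the system it describes. Both signals contain exactly the same values, so an order-independent view cannot separate them, whereas a simple temporal feature can.

```python
# Toy illustration only: the signals and the "slope" feature are hypothetical,
# chosen to show why the order of values matters. They are not from the thesis.

def static_summary(series):
    """Order-independent view: the sorted values, ignoring when they occurred."""
    return sorted(series)


def mean_slope(series):
    """Average change between consecutive samples: a simple temporal feature."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    return sum(diffs) / len(diffs)


rising = [0.0, 1.0, 2.0, 3.0, 4.0]
falling = list(reversed(rising))  # same values, opposite order in time

same_static = static_summary(rising) == static_summary(falling)
rising_slope = mean_slope(rising)
falling_slope = mean_slope(falling)

print(same_static, rising_slope, falling_slope)
```

Here the static summaries are identical, while the mean slope has opposite sign for the two signals, so only the temporal feature distinguishes them.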
This thesis explores the design and implementation of a general
classification tool for such temporal domains that tries to balance
the specific properties of each domain against the general issues that
arise in temporal classification. To do so, it proceeds in the following way:
- A brief explanation of the terminology and key problems is
presented, together with a theoretical foundation and formalisation.
- Existing techniques for dealing with temporal classification are
discussed. These include hidden Markov models, dynamic time warping
and recurrent neural networks. Current research in the artificial
intelligence community is also explored.
- Metafeatures, the core of the TClass system, are presented.
- A general system for classification of multivariate time series, based on
metafeatures, is proposed.
- The performance of the new system, TClass, is evaluated on
two artificial and two real-world datasets: sign recognition and
electrocardiograph (ECG) diagnosis.
- Extensive avenues for future research are discussed. These
include both direct extensions to the current work and some
very different approaches to temporal classification.
- Conclusions on the work are presented.
Mohammed Waleed Kadous
2002-12-10