
Summary of the thesis

This section gives a brief outline of the thesis and highlights its novel contributions.

One way to characterise multivariate time series such as sign language is to look for sub-events that a human might detect as part of a sign. Consider one class in the Auslan domain, say thank. The gloss for thank is shown in Figure 1.1. The sign is described by its sub-events, such as "placed on chin" and "moved away with stress".

Figure 1.1: The gloss for the Auslan sign thank [Joh89]. (Image: thank-gloss.eps)

The sign could be characterised as consisting of even simpler sub-events: a raising of the hand to touch the chin (in other words an increase in the vertical position of the hand), a vertical maximum as the hand touches the chin, followed by the hand moving down and away from the body (in other words a decrease in the vertical position of the hand, as well as an increase in the lateral distance from the body). Humans are quite comfortable talking about time series in these terms: increasing, decreasing, local maximum and so on. However, humans might also talk of different sub-events - we might talk of a person making a circular movement, for example.

Metafeatures parametrise events in a way that captures their properties, including their temporal characteristics. A simple metafeature might be a local maximum. The parameters of this metafeature might be the time at which the local maximum occurs and its height. To take a more complicated example, another metafeature might be a circular motion, which is parametrised by centre, radius, angular velocity, start time and duration. Metafeatures themselves are not novel; concepts such as extracting the minima and maxima of a signal, or its increasing, decreasing or flat periods, have previously been explored for classification.
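As an illustration of the local maximum metafeature described above, the following is a minimal sketch in Python. The names LocalMax and extract_local_maxima are hypothetical and are not taken from TClass; the sketch also assumes a noise-free signal and treats only strict interior peaks as maxima, which a practical extractor would probably need to relax.

```python
from typing import List, NamedTuple


class LocalMax(NamedTuple):
    """One instantiated event of the (hypothetical) local maximum metafeature."""
    time: int      # index of the sample at which the maximum occurs
    height: float  # value of the signal at that sample


def extract_local_maxima(signal: List[float]) -> List[LocalMax]:
    """Return every strict interior local maximum of a single channel."""
    events = []
    for t in range(1, len(signal) - 1):
        # A sample is a local maximum if it exceeds both of its neighbours.
        if signal[t - 1] < signal[t] > signal[t + 1]:
            events.append(LocalMax(t, signal[t]))
    return events


# Made-up vertical hand position, for illustration only.
y = [0.10, 0.30, 0.54, 0.40, 0.20, 0.25, 0.15]
events = extract_local_maxima(y)
```

Each returned event is a point in the two-dimensional (time, height) parameter space of the metafeature.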

Our application of them, however, is novel. One can imagine that each local maximum we find in the training instances is a point in a 2-dimensional space (one axis being the time, the other being the height). This parameter space provides a rich ground for feature construction.

For example, we may find that signs with a local maximum around the height of the chin are classified very differently from signs whose local maximum is only a few centimetres higher, at the nose (in fact, the sign smell is almost identical to thank aside from this position). If we discover that the class distribution of local maxima differs significantly depending on whether the maximum is near the nose or near the chin, then two interesting local maxima are selected: one near the chin and one near the nose.

We term these "interesting examples" synthetic events. Features for learning are constructed by detecting whether each training instance has an actual (or, as it is termed in this thesis, instantiated) event that is similar to the synthetic one. We describe several algorithms for discovering interesting examples, including a novel and effective one: directed segmentation.

TClass uses these synthetic events as the basis for feature construction. For example, an unlabelled test instance would be searched for local maxima, and each local maximum found would be compared with the synthetic events. If the instance had a local maximum near the nose, it would be labelled as such. In fact, two new features are constructed: HasYMaxNearChin and HasYMaxNearNose. Of course, these are human labels that we have associated with each of the two centroids; the system itself would call them YMax0 and YMax1. Each training and test instance is attributed with these newly created features.
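The feature construction step above can be sketched as follows. This is an illustrative sketch only: it assumes that "similar" means "nearest synthetic event by Euclidean distance in the parameter space", which is our simplification and not necessarily the measure TClass uses. The function names and the chin and nose coordinates are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Event = Tuple[float, float]  # (time, height) of a local maximum


def nearest(event: Event, synthetic: List[Event]) -> int:
    """Index of the synthetic event closest to `event`.

    Euclidean distance is an assumption; in practice the parameters
    would probably need to be normalised to comparable scales first.
    """
    return min(range(len(synthetic)),
               key=lambda i: math.dist(event, synthetic[i]))


def build_features(instantiated: List[Event],
                   synthetic: List[Event],
                   prefix: str = "YMax") -> Dict[str, bool]:
    """One boolean attribute per synthetic event: True if some
    instantiated event of the instance is closest to that synthetic event."""
    features = {f"{prefix}{i}": False for i in range(len(synthetic))}
    for event in instantiated:
        features[f"{prefix}{nearest(event, synthetic)}"] = True
    return features


# Hypothetical synthetic events: index 0 near the chin, index 1 near the nose.
synthetic_events = [(20.0, 0.54), (20.0, 0.60)]
# Local maxima found in one unlabelled instance (made-up values).
instance_events = [(21.0, 0.61)]
features = build_features(instance_events, synthetic_events)
```

Here the single local maximum lies closest to the second synthetic event, so only YMax1 is set.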

Several metafeatures can be applied to the training instances, each constructing synthetic features. Each list of features is an attribute vector and hence these can be concatenated to form one long attribute vector. Furthermore, non-temporal features (like age and gender of an ECG patient) and temporal aggregate features (like standard deviations, averages, maxima and minima) can also be appended. TClass makes it easy to mix temporal and non-temporal features, unlike many other temporal classification systems.
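The concatenation of attribute vectors can be sketched as below. The representation of an attribute vector as a dictionary, the function names, and the patient values are all assumptions made for illustration; they do not describe the actual TClass data structures.

```python
from statistics import mean, pstdev
from typing import Dict, List, Union

Value = Union[bool, float, str]


def aggregate_features(channel: str, signal: List[float]) -> Dict[str, Value]:
    """Temporal aggregate features for one channel."""
    return {
        f"{channel}_mean": mean(signal),
        f"{channel}_min": min(signal),
        f"{channel}_max": max(signal),
        f"{channel}_std": pstdev(signal),  # population standard deviation
    }


def combine(*parts: Dict[str, Value]) -> Dict[str, Value]:
    """Concatenate several attribute vectors into one long vector."""
    combined: Dict[str, Value] = {}
    for part in parts:
        overlap = combined.keys() & part.keys()
        if overlap:
            raise ValueError(f"duplicate attribute names: {sorted(overlap)}")
        combined.update(part)
    return combined


event_features = {"YMax0": False, "YMax1": True}  # from synthetic events
non_temporal = {"age": 57.0, "gender": "F"}       # hypothetical patient data
vector = combine(event_features,
                 aggregate_features("y", [0.1, 0.3, 0.5]),
                 non_temporal)
```

The resulting vector has one entry per attribute, in the order in which the parts were supplied, and can be handed to any propositional learner that accepts mixed boolean, numeric and nominal attributes.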

These combined attribute vectors can now be used as inputs to a propositional learner to produce a classifier. Furthermore, we can replace the attributes constructed from synthetic events with their original descriptions, leading to a concept description expressed in terms of the metafeatures. For example, a rule of the form If YMax0 = True could be rewritten using the original synthetic event as something like If y has local max of height 0.54 at time 20 (assuming that the chin is at a height of 0.54 units). This is a major novel contribution of this thesis: a temporal classifier that produces descriptions that are both comprehensible and accurate.
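The rewriting of a learnt rule back into metafeature terms can be illustrated with a small sketch. The textual rule format and the names SyntheticEvent, describe and rewrite_rule are hypothetical; the sketch handles only positive tests of the form "name = True".

```python
from typing import Dict, NamedTuple


class SyntheticEvent(NamedTuple):
    """Parameters of one synthetic local maximum (hypothetical layout)."""
    channel: str
    time: int
    height: float


def describe(event: SyntheticEvent) -> str:
    """Human-readable description of a synthetic local maximum."""
    return (f"{event.channel} has local max of height {event.height} "
            f"at time {event.time}")


def rewrite_rule(rule: str, events: Dict[str, SyntheticEvent]) -> str:
    """Replace each 'name = True' test by the description of its event."""
    for name, event in events.items():
        rule = rule.replace(f"{name} = True", describe(event))
    return rule


events = {"YMax0": SyntheticEvent(channel="y", time=20, height=0.54)}
readable = rewrite_rule("If YMax0 = True", events)
```

With the synthetic event above, the opaque attribute name is replaced by the description used in the example in the text.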

However, the above gives no feel for the "scope" of acceptable variation for each metafeature. For example, if a rule contains a comparison of the form above, it is not clear what reasonable bounds are to be expected. Does "approximately 0.54" mean from 0.4 to 0.7, or from 0 to 1? By looking at the original training data, we can extract bounds that give the user a "feel" for what is reasonable. While these do not form a complete picture of the learnt concept, they give some intuition. This process of postprocessing the output of the learner to convey the bounds on variation is also novel.
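One simple way to obtain such bounds is sketched below. It assumes that the bounds are the minimum and maximum of each parameter over the training events matched to a synthetic event; this is our assumption for illustration, and the postprocessing actually used may differ. The function name and the data are hypothetical.

```python
from typing import Dict, List, Tuple


def parameter_bounds(
    matched: List[Dict[str, float]],
) -> Dict[str, Tuple[float, float]]:
    """(min, max) of every parameter over the instantiated training
    events that were matched to one synthetic event."""
    if not matched:
        raise ValueError("no matched events to derive bounds from")
    return {
        name: (min(e[name] for e in matched), max(e[name] for e in matched))
        for name in matched[0]
    }


# Made-up training events that were matched to the "chin" synthetic event.
matched_events = [
    {"time": 18, "height": 0.50},
    {"time": 20, "height": 0.54},
    {"time": 23, "height": 0.57},
]
bounds = parameter_bounds(matched_events)
```

A rule could then be reported as "height 0.54 (between 0.50 and 0.57) at time 20 (between 18 and 23)", which conveys the range of variation seen in training.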

We have implemented this system and tested it on two artificial datasets and two real-world ones. We compare TClass with two "controls": hidden Markov models and naive segmentation (dividing the data into a number of segments, averaging each segment, and then using a propositional learner). The first artificial dataset, proposed by other researchers, turns out to be too easy: all learners, including the controls and TClass, attained 100 per cent accuracy. Still, TClass generated rules that closely corresponded to the original (known) concepts. We propose a second, more difficult artificial dataset of our own, on which TClass performs better in terms of accuracy and also produces comprehensible descriptions. The two real-world domains are Auslan (Australian Sign Language) and Type I ECGs. In the Auslan domain, we obtain 98 per cent accuracy; in the ECG domain, we obtain about 72 per cent accuracy, comparable to a human expert with an accuracy of 70 per cent, and similar to a hand-extracted set of features with 71 per cent.
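The naive segmentation control can be sketched as follows for a single channel. The equal-length splitting rule and the function name are assumptions made for illustration; the exact segmentation used in the experiments is not specified in this summary.

```python
from typing import List


def naive_segmentation(signal: List[float], n_segments: int) -> List[float]:
    """Split a channel into n roughly equal segments and average each,
    giving a fixed-length attribute vector for a propositional learner."""
    if not 0 < n_segments <= len(signal):
        raise ValueError("n_segments must be between 1 and len(signal)")
    means = []
    for i in range(n_segments):
        # Integer arithmetic spreads any remainder across the segments.
        start = i * len(signal) // n_segments
        end = (i + 1) * len(signal) // n_segments
        segment = signal[start:end]
        means.append(sum(segment) / len(segment))
    return means


attributes = naive_segmentation([1.0, 3.0, 2.0, 4.0, 6.0, 8.0], 3)
```

Every instance, whatever its length, is reduced to the same number of attributes per channel, at the cost of discarding all detail within each segment.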

There is great scope for further research in this field, at a time when data mining is growing at a phenomenal rate. We describe several avenues for future work in detail, including some early explorations of entirely different approaches to temporal classification. Finally, we draw some conclusions.


Mohammed Waleed Kadous 2002-12-10