Figure: General architecture for training.
We propose a general architecture for temporal classification problems as shown in figure 4. To explain concisely, we will use a simple illustrative domain, the Blues and Reds (BR) task. This domain has single channel (c), which can only have two possible values: true (T) or false (F). There are two possible classes for each stream, Blue or Red. A sample training set is shown in table 1. As can be seen, streams vary in length from 10 frames to 15 frames.
Table: The training set for the BR task.
The data is processed in two ways: global feature calculation and
event extraction. Global feature calculation extracts various
features the stream as a whole. Typical global features
include global maxima, global minima, means of each channel and the
duration of the stream. For many temporal domains, these global
features provide a good ``first cut'' representation for
classification
.
In the BR task, a simple global feature is the ratio of time that the channel c is true to the overall duration of the stream. The results of applying this to the data of table 1 are shown in column 2 of table 3.
Event extraction is applied by using a set of parametrised event primitives (PEPs). PEPs represent particular types of ``events'' that are expected to occur in the domain (the best PEPs are typically domain-specific). Each PEP is defined by:
A key characteristic of a PEP is that it should explicitly represent the temporal characteristics of the events as parameters; thus representing time in a way that facilitates learning.
One simple PEP we could use with the BR task is to finds runs of T values, and returns a tuple (t, d) where t is the time the run starts and d is the duration of the run. The finding function would be given a stream and return all pairs (t, d). The results of applying this PEP to the BR training data is shown in table 2. We shall term this PEP the TrueRun PEP.
Table: TrueRun PEP applied to the BR training set.
Since each PEP is described in terms of a set of parameters, clustering can be performed in the parameter space of each PEP. For example, if the data from table 2 is plotted (as in figure 5) we can cluster it as shown. Typically, rather than hand-clustering, some clustering algorithm would be used.
Each cluster then forms the basis of a synthetic event attribute. For each stream, the value of the synthetic event attribute is true only if it has an event belonging to that cluster. More generally, a confidence metric of cluster membership is used, rather than just a true/false value.
Figure: TrueRun parameter space and clustering for BR task.
Once we have these clusters, the training data can be re-attributed with these synthetic event attributes. Applying this to the BR data, we get the results shown in columns 3, 4 and 5 of table 3.
Table: Global and event attributes for the BR task.
The data from the global and event processing streams are recombined,
in a format suitable for processing by an attribute-value learner. The
result of applying c4.5rules [Quinlan, 1993] on the BR
data in table 3 are shown in figure
6 (left side). Furthermore, if there is some way
to produce a prototype from the clusters (for instance, by finding the
centroid of the cluster), we can substitute the prototype's
description to produce a descriptive rule
(as shown on
the right hand side of figure 6). It should also
be pointed out that because of the way C4.5 operates, the rule for
Blue is discriminant (i.e., how to tell it from Red), rather than
characteristic (giving the defining characteristics of Blue).
Figure: Rules produced by applying c4.5rules to table 3.