next up previous
Next: IMPLEMENTATION Up: Learning Comprehensible Descriptions of Previous: AUSLAN SIGN RECOGNITION

ARCHITECTURE AND IMPLEMENTATION

 

 

  figure854


Figure: General architecture for training.

We propose a general architecture for temporal classification problems as shown in figure 4. To explain concisely, we will use a simple illustrative domain, the Blues and Reds (BR) task. This domain has single channel (c), which can only have two possible values: true (T) or false (F). There are two possible classes for each stream, Blue or Red. A sample training set is shown in table 1. As can be seen, streams vary in length from 10 frames to 15 frames.

  table645
Table: The training set for the BR task.  

The data is processed in two ways: global feature calculation and event extraction. Global feature calculation extracts various features the stream as a whole. Typical global features include global maxima, global minima, means of each channel and the duration of the stream. For many temporal domains, these global features provide a good ``first cut'' representation for classificationgif.

In the BR task, a simple global feature is the ratio of time that the channel c is true to the overall duration of the stream. The results of applying this to the data of table 1 are shown in column 2 of table 3.

Event extraction is applied by using a set of parametrised event primitives (PEPs). PEPs represent particular types of ``events'' that are expected to occur in the domain (the best PEPs are typically domain-specific). Each PEP is defined by:

A key characteristic of a PEP is that it should explicitly represent the temporal characteristics of the events as parameters; thus representing time in a way that facilitates learning.

One simple PEP we could use with the BR task is to finds runs of T values, and returns a tuple (t, d) where t is the time the run starts and d is the duration of the run. The finding function would be given a stream and return all pairs (t, d). The results of applying this PEP to the BR training data is shown in table 2. We shall term this PEP the TrueRun PEP.

  table672
Table: TrueRun PEP applied to the BR training set.  

Since each PEP is described in terms of a set of parameters, clustering can be performed in the parameter space of each PEP. For example, if the data from table 2 is plotted (as in figure 5) we can cluster it as shown. Typically, rather than hand-clustering, some clustering algorithm would be used.

Each cluster then forms the basis of a synthetic event attribute. For each stream, the value of the synthetic event attribute is true only if it has an event belonging to that cluster. More generally, a confidence metric of cluster membership is used, rather than just a true/false value.

   figure862
Figure: TrueRun parameter space and clustering for BR task.

Once we have these clusters, the training data can be re-attributed with these synthetic event attributes. Applying this to the BR data, we get the results shown in columns 3, 4 and 5 of table 3.

  table689
Table: Global and event attributes for the BR task.  

The data from the global and event processing streams are recombined, in a format suitable for processing by an attribute-value learner. The result of applying c4.5rules [Quinlan, 1993] on the BR data in table 3 are shown in figure 6 (left side). Furthermore, if there is some way to produce a prototype from the clusters (for instance, by finding the centroid of the cluster), we can substitute the prototype's description to produce a descriptive rulegif (as shown on the right hand side of figure 6). It should also be pointed out that because of the way C4.5 operates, the rule for Blue is discriminant (i.e., how to tell it from Red), rather than characteristic (giving the defining characteristics of Blue).

 

figure713


Figure: Rules produced by applying c4.5rules to table 3. 




next up previous
Next: IMPLEMENTATION Up: Learning Comprehensible Descriptions of Previous: AUSLAN SIGN RECOGNITION

Mohammed Waleed Kadous
Wed May 19 20:21:38 EST 1999