
Inspiration for metafeatures

The previous section gave an overview of the application of metafeatures to a real problem. The motivation and inspiration behind metafeatures are now explained in greater depth.

At the core of many machine learning problems, and especially that of temporal classification, is the issue of appropriate representation: how to represent the learning problem in a manner that makes it amenable to established and well-understood machine learning techniques, and how to convert from the representation we are provided with as input into the required representation. The nature and difficulty of this change in representation depend on our target learning technique. Three possible approaches to temporal classification are: (1) using techniques developed specifically for sequential data, such as hidden Markov models; (2) using relational learning techniques; and (3) converting the temporal problem into a propositional form, so that existing propositional learners can be applied.

Each of these approaches is valid and presents different representational challenges. Even the first approach requires a representation that makes sequential learning possible: raw sound signals, for instance, are not usually fed directly into a hidden Markov model; rather, the original audio data are converted into a sequence of short-term Fourier transform coefficients and energy spectrum coefficients. The second approach, in theory, allows for a more complex representation, since relational learning systems have a richer observation language than propositional learners. In particular, unlike propositional learners, relational learners do not limit us to a fixed set of features. In practice, however, relational learners have two major weaknesses that arise in temporal domains: they cope poorly with large data sets and with noisy data. This necessitates some care in the selection of an appropriate representation, which has been explored to some extent in [RAB00]. Rodriguez does this by allowing temporal statements that are more robust to noise, for instance ``during this time period, the value exceeds 0.5 for 80 per cent of the time''.
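To make the idea of a noise-tolerant temporal statement concrete, the following is a minimal sketch of such a test over a sampled channel. It is only an illustration of the quoted statement, not the predicate language of [RAB00]; the function name exceeds_fraction and its parameters are hypothetical.

```python
def exceeds_fraction(series, start, end, threshold=0.5, fraction=0.8):
    """Return True if at least `fraction` of the samples in
    series[start:end] are above `threshold`.

    Hypothetical helper: an illustration of a noise-tolerant
    temporal statement, not an interface from any cited system.
    """
    window = series[start:end]
    if not window:
        return False
    above = sum(1 for value in window if value > threshold)
    return above / len(window) >= fraction


# A clean signal and the same signal with one corrupted sample.
clean = [0.9, 0.8, 0.7, 0.9, 0.8, 0.9, 0.7, 0.8, 0.9, 0.8]
noisy = list(clean)
noisy[3] = 0.1

# The strict statement "the value always exceeds 0.5" fails on the
# noisy signal, while the tolerant statement still holds.
strict_holds = all(value > 0.5 for value in noisy)
tolerant_holds = exceeds_fraction(noisy, 0, len(noisy))
```

A single corrupted sample is enough to falsify the strict statement, whereas the tolerant one survives it, which is the robustness the quoted example is after.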

The approach taken in this thesis is the third: finding a general way to convert temporal problems into a propositional form; then using existing propositional machine learning techniques.

It should also be stated that the approaches are not quite as distinct as the above would indicate. For example, there are relational learners such as LINUS [LD94] that accomplish relational learning by converting relational data into a propositional form. Similarly, with hidden Markov models, it is not unusual to find that emission probabilities are modelled not by the traditional Gaussian distribution, but by a neural network. There is also a ``middle ground'' between relational and propositional learning, such as graph-based induction, which is explored in Section 7.2.4.

Metafeatures are the basis of the change in representation to a format for propositional learners. They allow for the inclusion of background knowledge and domain knowledge for temporal classification. They also allow concepts learnt by the propositional learner to be re-expressed in the original temporal domain.

The key step in using metafeatures is to find some kind of sub-event, some temporal pattern that occurs in the domain, for which the following can be defined:

Once the instantiated features have been extracted from the training data, they can be plotted in a space termed the parameter space. Within the parameter space, we can try to find typical examples that allow the construction of features, which in turn make it possible to represent the original temporal data in a format that facilitates propositional learning.
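The pipeline described above can be sketched as follows. This is a toy illustration under several assumptions that are not taken from the text: the sub-event is a local maximum parameterised by (time, height), typical examples are found with a plain k-means loop, and each typical example becomes a boolean attribute that is true when a series has an instance within a fixed radius of it. All function names are hypothetical.

```python
import math


def local_maxima(series):
    """Assumed sub-event extractor: (time, height) of each local maximum."""
    return [(t, series[t]) for t in range(1, len(series) - 1)
            if series[t - 1] < series[t] > series[t + 1]]


def typical_examples(points, k, iterations=10):
    """Find k typical points in parameter space with a plain k-means loop.

    k-means is an assumed stand-in for "finding typical examples".
    """
    step = max(1, len(points) // k)
    centres = points[::step][:k]  # crude deterministic start, sketch only
    for _ in range(iterations):
        groups = [[] for _ in centres]
        for p in points:
            nearest = min(range(len(centres)),
                          key=lambda i: math.dist(p, centres[i]))
            groups[nearest].append(p)
        # Move each centre to the mean of its group (keep it if empty).
        centres = [tuple(sum(c) / len(g) for c in zip(*g)) if g else centres[i]
                   for i, g in enumerate(groups)]
    return centres


def to_row(series, centres, radius=1.5):
    """One boolean attribute per typical example: is an instance near it?"""
    instances = local_maxima(series)
    return [any(math.dist(p, c) <= radius for p in instances)
            for c in centres]


training = [
    [0, 1, 5, 1, 0, 0, 0, 0],  # early peak
    [0, 1, 6, 1, 0, 0, 0, 0],  # early peak
    [0, 0, 0, 0, 1, 5, 1, 0],  # late peak
    [0, 0, 0, 0, 1, 6, 1, 0],  # late peak
]

# Pool every instance from the training data into the parameter space.
points = [p for s in training for p in local_maxima(s)]
centres = typical_examples(points, k=2)
# Each series becomes a fixed-length row for a propositional learner.
table = [to_row(s, centres) for s in training]
```

On this toy data the two typical examples settle on an "early peak" and a "late peak", and each series is reduced to a two-attribute row that any propositional learner could consume. The radius and the choice of k are arbitrary here and would need tuning in a real domain.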

The application of metafeatures is not limited to temporal domains. Any problem where we can characterise instances in the domain as having substructures that can be represented parametrically can use metafeatures. Such domains include image recognition (where lines and shapes can be represented with parameters), optical character recognition and more. The exploration of these is beyond the scope of this thesis, but it is one extremely interesting avenue of future work.


Mohammed Waleed Kadous 2002-12-10