This work has a very strong relation to the divide-and-conquer
framework proposed by dietterich:dnc. Rather than the
simple propositional form where instances are presented in the form of
, where
is a fixed size vector and
is a
class label, Dietterich's framework allows both
and
to be
complex, for instance,
may be a time series, an image, or a time
series of images (e.g. a movie), and similarly for
. In contrast,
the approach used in this paper allows for a complex
(such as a
time series), but retains the conventional representation for
-
i.e., a single class label. Dietterich's divide-and-conquer approach
consist of three stages: (a) dividing the original problem into
sub-problems (b) conquer (applying learning) and (c) merging the
subproblem solutions. Because the
is so simple in our case, the
merging step is usually trivial.
This work also closely relates to the areas of feature extraction and construction [Liu & Motoda, 1998], although the formal definitions of extraction and construction assume attribute-vector representation [Liu & Motoda, 1998, p. 4]. They do point to automated preprocessing (or data categorization) as future work [Liu & Motoda, 1998, p. 9]. They also point to the importance of comprehensibility.
There are some general techniques that can be applied to temporal and
structured domains. The best developed technique for temporal
classification is the hidden Markov model
[Rabiner, 1989]. HMMs have proved themselves to be very useful
for speech recognition, and are the basis of most commercial speech
recognition systems. However, they do suffer some serious drawbacks
for general use. Firstly, the structure of the HMM - similar to that
of a finite state machine - needs to be specified a priori.
The structure selected can have a critical effect on performance.
Secondly, extracting comprehensible rules from HMMs is not at all
easy. Thirdly, even making some very strong assumptions about the
probability model, there are frequently hundreds or thousands of
parameters per HMM. As a result, many training instances are required
to learn effectively. Recurrent neural networks and Kohonen maps have
also been used to solve temporal and sequence problems
[Bengio, 1996], but suffer similar problems.
Lee and Kim [Lee & Kim, 1995] take a syntactic approach to time series recognition. Based on knowledge of the financial markets, they develop a grammar for events. In our parlance, they are developing a fixed set of metafeatures for the problem.
Recent interest has also arisen in applying ILP to temporal classification problems. rodriguez:foltime is an interesting example of this, using a tailored search algorithm designed to cope with temporal constraints, although the scalability of this approach is an issue. geurts:ts also presents a system that uses ILP for temporal classification.