next up previous contents
Next: Recent interest from the Up: Related work Previous: Recurrent neural nets

Dynamic time warping

 

An older technique that been fallen out of favour since the advent of hidden Markov models is dynamic time warping [MR81]. In some senses, it is the temporal domain equivalent of instance-based learning with a complex distance function. In dynamic time warping, the problem is represented as finding the minimum distance between a set of template streams and the input stream. The class chosen is the ``closest'' template. However, rather than using a straightforward technique of comparing the value of the input stream at time t to the template stream at time t, an algorithm is used that tries to search the space of mappings from the time sequence of the input stream to that of the template stream, so that the total distance is minimised. This is not always a linear path; for example, we may find that time tex2html_wrap_inline1679 in the input stream corresponds to the time tex2html_wrap_inline1681 in the template stream, whereas tex2html_wrap_inline1683 in the input stream corresponds to tex2html_wrap_inline1685 in the template stream. The search space is constrained to reasonable bounds, such as the mapping function from input time to template time must be monotonically increasing (in other words, the sequence of events between input and template is preserved).

   figure602
Figure 2.5: Dynamic time warping

This is shown in figure 2.5, where the horizontal axis represents the time sequence of the input stream, and the vertical axis represents the time sequence of the template stream. The path shown results in the minimum distance between the input and template streams. The shaded in area represents the search space for the input time to template time mapping function.

Using dynamic programming techniques, the search for the minimum distance path can be done in polynomial time. However, dynamic time warping has some limitations. Whereas the calculations can be accomplished in polynomial time, this polynomial is approximately O( tex2html_wrap_inline1687 ) where N is the length of the sequence, and V is the number of templates to be considered. This means that practical applications may be quite slow. In addition, they require the definition of a distance function between the values of the stream at a given time frame. This may be easy if the stream is univariate, but it is more difficult if the stream is, as in our case, a multivariate one. Furthermore, it does not create any meaningful descriptions of the data. In addition, creation of the template streams from data is non-trivial and typically depends on the order of presentation of the training instances, since if we have eight instances from which we are trying to create a template, we have to warp them to each other in pairs, and repeat the process until we arrive at a single template.


next up previous contents
Next: Recent interest from the Up: Related work Previous: Recurrent neural nets

Mohammed Waleed Kadous
Tue Oct 6 13:04:40 EST 1998