Next: Feature subset selection Up: Work on extending TClass Previous: Speed and Space   Contents


Downsampling

Data is frequently presented at a high sampling rate. For example, sign language information about the hand might be received at 200 frames per second. Hence an average 2 second sign might be 400 frames long. Similarly, with ECG data, the electrical signals may be sampled at 500 frames per second (sometimes also described as 500 Hertz); hence an individual heartbeat consists of roughly 430 to 600 samples (corresponding to heart rates between 70 beats per minute and 50 beats per minute). This can be a huge amount of data to consider.
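The arithmetic behind these figures can be checked with a short sketch. The function name `samples_per_beat` is purely illustrative and is not part of TClass.

```python
def samples_per_beat(sample_rate_hz, beats_per_minute):
    """Number of samples spanned by one heartbeat at the given sampling rate."""
    # One beat lasts 60 / beats_per_minute seconds.
    return sample_rate_hz * 60.0 / beats_per_minute


if __name__ == "__main__":
    print(samples_per_beat(500, 70))  # a little under 430 samples
    print(samples_per_beat(500, 50))  # 600 samples
    print(200 * 2)                    # frames in a 2 second sign at 200 frames per second
```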

Therefore, the data could be resampled at a lower frequency, commonly called downsampling. Under certain circumstances, this may reduce feature extraction time without adversely affecting learning and classification performance. For example, say we were examining the sign language data at 200 frames per second. For each channel, we could take each pair of consecutive samples from the original data and average them, so that every two frames in the input yield one frame in the output. In this case we are applying a downsampling factor of two: the data is reduced in size by a factor of 2. Hence, a 400 frame sample would be reduced to a 200 frame sample, halving the amount of data that must be dealt with. Similarly, a downsampling factor of 4 would reduce the data to one quarter of its original size.
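The averaging scheme described above might be sketched as follows. This is a minimal illustration rather than the TClass implementation: the name `downsample_mean`, the representation of the data as a list of frames (each frame a list of channel values), and the handling of a trailing partial block are all assumptions made for the example.

```python
def downsample_mean(frames, factor):
    """Average each run of `factor` consecutive frames, channel by channel.

    `frames` is a list of frames; each frame is a list of numeric channel values.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    out = []
    # Step through the input in blocks of `factor` frames. Assumption: a
    # trailing partial block is averaged over the frames it actually contains.
    for start in range(0, len(frames), factor):
        block = frames[start:start + factor]
        n_channels = len(block[0])
        out.append([sum(frame[c] for frame in block) / len(block)
                    for c in range(n_channels)])
    return out


if __name__ == "__main__":
    # A hypothetical 400-frame, two-channel recording.
    frames = [[float(i), float(2 * i)] for i in range(400)]
    print(len(downsample_mean(frames, 2)))  # 200 frames
    print(len(downsample_mean(frames, 4)))  # 100 frames
```

With a factor of 2 the 400-frame recording becomes 200 frames, and with a factor of 4 it becomes 100 frames, matching the reductions described in the text.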

Of course, averaging works well for continuous attributes, but not so well for discrete attributes, where the mean of two symbolic values is generally not meaningful. For discrete attributes, other methods can be employed, such as voting.
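One possible reading of voting is to replace each block of consecutive values with the value that occurs most often in it. The sketch below assumes that interpretation; the name `downsample_vote` and the tie-breaking rule (the value seen first in the block wins) are choices made for illustration and are not taken from the source.

```python
from collections import Counter


def downsample_vote(values, factor):
    """Replace each run of `factor` consecutive discrete values by the most common one."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    out = []
    for start in range(0, len(values), factor):
        block = values[start:start + factor]
        # most_common(1) gives the most frequent value in the block. On
        # Python 3.7+ ties go to the value encountered first.
        out.append(Counter(block).most_common(1)[0][0])
    return out


if __name__ == "__main__":
    # Hypothetical hand-shape labels, one per frame.
    labels = ["open", "open", "fist", "fist", "fist", "open"]
    print(downsample_vote(labels, 3))  # ['open', 'fist']
```

Note that with a downsampling factor of 2, any block whose two values differ is a tie, so the tie-breaking rule matters more for small factors.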


Mohammed Waleed Kadous 2002-12-10