The naive Bayes learner attained an accuracy of approximately 76 per cent. It seemed that some classes were classified far more accurately than others.
The C4.5 learner produced the correct classification 42 per cent of the time, the incorrect classification 16 per cent of the time, and no classification 42 per cent of the time. This high number of ``unclassifieds'' may be caused by the uneven distribution of positive and negative instances to the learners. Since there are ten classes, only 10 per cent of the instances will be positive and 90 per cent will be negative. Faced with this, C4.5 favours negative classifications as a default.
More interesting was the results obtained by looking at the trees produced for each class. For example, consider the sign come. The tree produced is shown in figure 6.5. This particular tree had approximately 80 per cent accuracy. It was also observed that the trees produced in this way were very small. The smallest trees had five nodes and the largest trees had nine nodes, which is within the bounds of human comprehension.
Note that come-z-2, come-y-1 and come-fore-5 are names for clusters; the parameters of which are shown in table 6.1. Recall also that the numbers in table 6.1 represent the logs of the confidences we determined for our clusters.
Figure 6.5: Decision tree produced for come
Table 6.1: Clusters used by C4.5 to classify come.
Surprisingly, this is actually understandable when one considers the sign for come. It is a moving of the hand initially away from the body, towards a person standing in front of you. Then the hand is brought towards the body, usually with a slight upwards motion, finally closing the finger.
By looking simultaneously at the clusters and the decision tree, the above can roughly be translated as: If the person moves his hand away from himself rapidly and towards someone standing in front of him, then it is a come sign. Otherwise see if there is a medium speed upwards motion. If that upward motion ends with the finger being bent, then it is a come sign. Otherwise it is not.
The above can also be represented in a graphical form. This could be accomplished in the following way:
What we now have is a set of prototypes for the come sign. In the example shown, we would have two prototypes: one containing only the come-z-2 synthetic event and the other indicating the come-y-1 and come-fore-5 synthetic events.