
Flock data

The Flock data is much ``cleaner'' than the Nintendo data. The results of the first test are shown in Table 6.14.


Table 6.14: Error rates for the Flock sign language data.
Approach                     Error rate (%)
TClass with J48              $14.5 \pm 0.4$
TClass with PART             $16.7 \pm 0.9$
TClass with IB1              $60.3 \pm 1.1$
TClass with Naive Bayes      $29.2 \pm 0.8$
TClass with Bagging (Bag)    $9.4 \pm 0.8$
TClass with AdaBoost (AB)    $6.4 \pm 0.4$
Naive Segmentation           $\mathbf{5.5 \pm 0.5}$
Hidden Markov Model          $12.9 \pm 0.6$


The hidden Markov model does not seem to perform as well here as it did on the Nintendo data. Our hypothesis is that the streams in this case are much longer and contain many more channels. Hence, there is likely to be confusion over the attribution of particular states.

We ran the same smoothing tests as before, and also searched with a higher number of instances, to see whether either had any effect.


Table 6.15: Error rates with minor refinements for the Flock data.
Learner  Base             Smooth           Centroids        Relative         C + R
J48      $14.5 \pm 0.4$   $18.0 \pm 0.4$   $15.3 \pm 1.0$   $14.2 \pm 0.4$   $15.1 \pm 0.6$
PART     $16.7 \pm 0.9$   $18.4 \pm 0.6$   $17.6 \pm 0.5$   $16.7 \pm 0.8$   $16.8 \pm 0.4$
IB1      $60.3 \pm 1.1$   $56.5 \pm 1.6$   $60.1 \pm 1.4$   $52.2 \pm 3.2$   $49.6 \pm 1.5$
Bag      $9.4 \pm 0.8$    $9.4 \pm 0.8$    $8.8 \pm 0.6$    $6.8 \pm 0.3$    $6.7 \pm 0.2$
AB       $6.4 \pm 0.4$    $7.7 \pm 0.6$    $6.2 \pm 0.2$    $5.3 \pm 0.5$    $\mathbf{5.1 \pm 0.3}$


Smoothing did not improve results for most learners; in fact, for J48, PART and AdaBoost it made them worse (only IB1 improved, and bagging was unchanged). This is because the data from our gloves is almost noise-free: the sensors are of much higher quality. The improvement from one data set to the other is huge; the error rate is typically one quarter of what it was on the Nintendo data.
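As a rough check of the smoothing comparison, the Base and Smooth columns of Table 6.15 can be differenced per learner. The following is a minimal Python sketch and not part of the TClass system; the names `ERROR_RATES` and `smoothing_deltas` are hypothetical, and only the mean error rates are used (the $\pm$ terms are ignored).

```python
# Mean error rates (per cent) copied from Table 6.15 as (base, smoothed).
# The +/- terms from the table are deliberately left out of this sketch.
ERROR_RATES = {
    "J48": (14.5, 18.0),
    "PART": (16.7, 18.4),
    "IB1": (60.3, 56.5),
    "Bag": (9.4, 9.4),
    "AB": (6.4, 7.7),
}


def smoothing_deltas(rates):
    """Return {learner: smoothed - base}, rounded to one decimal place.

    A positive value means smoothing increased the error rate.
    """
    return {
        name: round(smooth - base, 1)
        for name, (base, smooth) in rates.items()
    }


deltas = smoothing_deltas(ERROR_RATES)
for name, delta in deltas.items():
    print(f"{name:5s} {delta:+.1f}")
```

A positive entry indicates that the smoothed error rate is higher than the base error rate for that learner.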

This also puts paid to one of the concerns in the design: that the TClass system would not be able to handle copious data. Altogether, each sign in the Flock data contains approximately 11 times as much data as a sign in the Nintendo data. It was feared that this additional processing load would have an adverse impact on accuracy. As it turns out, this was not the case: TClass took full advantage of the additional channels in both accuracy and comprehensibility.

Also notable is the difference in accuracy between the final algorithm and the base algorithm. In the case of AdaBoost, the difference is very significant; the error rate is now lower than that of any competing method, without applying voting. To see how well we could do, we ran boosting on the final method. The results are shown in Figure 6.29. They show that the error rate obtained is significantly lower.

Figure 6.29: Effect of voting on the Flock sign data domain.
[Figure: newsign-abvote.eps]

However, the gain from the search for more random centroids is probably not worth the additional computational effort. For example, if we look at computation times for running the J48 classifier, the average with 1000 random centroids is 3542 seconds (i.e. just under an hour), while with 10000 centroids it is 27417 seconds (i.e. about 7 hours 37 minutes). The larger search does not seem to increase accuracy in the J48 case, and the effect in the AdaBoost case, about 0.2 per cent, may not be statistically significant, which reinforces the point. The difference in tree size between the two is negligible.
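The timing figures quoted above can be converted and compared directly. This is a small Python sketch of that arithmetic only; the helper `format_duration` and the constant names are hypothetical and do not come from the TClass code.

```python
# Average J48 computation times in seconds, as quoted in the text.
T_1000_CENTROIDS = 3542
T_10000_CENTROIDS = 27417


def format_duration(seconds):
    """Render a whole number of seconds as 'Hh MMm SSs'."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}h {minutes:02d}m {secs:02d}s"


# Ratio of the two run times: how much longer the larger search takes.
slowdown = T_10000_CENTROIDS / T_1000_CENTROIDS

print(format_duration(T_1000_CENTROIDS))
print(format_duration(T_10000_CENTROIDS))
print(f"slowdown: {slowdown:.2f}x")
```

On these figures, a tenfold increase in the number of centroids costs a little under eight times the computation.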

Also notable is the effect on tree size. For almost the same learning task, the Nintendo data produces trees that contain on average 322.8 leaf nodes. With the Flock data, the tree size is massively reduced to only 132.6 leaf nodes, a 59 per cent reduction. This figure is close to the 95 classes we have; optimally the tree would have exactly 95 leaf nodes, assuming that the concepts are disjunctive. This means that there is an average of 1.4 leaf nodes for every class. Furthermore, for the decision rule learner PART, there are 108.6 rules on average.
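The tree-size figures above follow from simple ratios, which the short Python sketch below reproduces. The variable names are hypothetical; the inputs are the averages quoted in the text.

```python
# Average model sizes quoted in the text.
NINTENDO_LEAVES = 322.8   # mean leaf nodes per tree, Nintendo data
FLOCK_LEAVES = 132.6      # mean leaf nodes per tree, Flock data
FLOCK_PART_RULES = 108.6  # mean number of PART rules, Flock data
NUM_CLASSES = 95          # number of sign classes

# Relative reduction in tree size when moving from Nintendo to Flock data.
reduction = (NINTENDO_LEAVES - FLOCK_LEAVES) / NINTENDO_LEAVES

# Average number of leaves (or rules) needed per class.
leaves_per_class = FLOCK_LEAVES / NUM_CLASSES
rules_per_class = FLOCK_PART_RULES / NUM_CLASSES

print(f"tree size reduction: {reduction:.1%}")
print(f"leaf nodes per class: {leaves_per_class:.2f}")
print(f"PART rules per class: {rules_per_class:.2f}")
```

A value of 1.0 for either per-class ratio would correspond to the optimal case of exactly one leaf or rule per class.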


Mohammed Waleed Kadous 2002-12-10