
Conclusions

TClass is able to produce highly accurate, comprehensible descriptions of the TTest dataset. It is competitive in generating correct and accurate descriptions, especially when there is little or no noise. Smoothing does not appear to help. If the comprehensibility of the learnt concepts is not a concern, voting methods can be employed to great effect, obtaining accuracies far in excess of those of the baseline learners; on this data, the error rate is at least halved. Further, TClass shows better results than the baseline learners when fewer instances are available.

There does seem to be a problem with overfitting, however, most likely because so many features are generated. The trees produced are much larger than optimal; by adjusting the parameters of decision-tree-based learners to prune more aggressively, we can generate more comprehensible rules with equivalent or higher accuracy. We will have to take this into account with our real-world datasets.



Mohammed Waleed Kadous 2002-12-10