
Comprehensibility - A subjective goal

In some domains we may also be interested in gaining insight into how the classifier works. For example, if a simple rule is generated for classification that can be understood by a human, it may be more desirable than another classifier which has a greater accuracy, but whose internal representation is a list of statistical tables.

Consider once again the Tech Support domain. It is possible to build a temporal classifier that predicts whether the customer is happy or angry, but that does not produce comprehensible descriptions. However, this would not be very useful in helping us to understand why people are unhappy.

It is, however, notoriously difficult to measure "understandability" objectively, as this is a matter of taste. Some people can extract meaning from 200 numbers because of their knowledge of the domain and learning algorithm; in other cases, even simple rules might provide no insight.

Although comprehensibility is subjective, researchers in machine learning often use a number of heuristics to measure it. Unfortunately, these methods are tied to particular learners. For example, with decision trees, the number of nodes in the tree is often used as a measure of comprehensibility. The number of leaf nodes is sometimes also used. These measures have obvious limitations, since they ignore the complexity of the individual nodes: the test at a node could be as simple as a comparison on a single attribute, or as complex as a linear combination of all attribute values. Obviously the former is much simpler than the latter.
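The limitation of node counts can be illustrated with a minimal sketch. The `Node` class and the `attrs_in_test` field are hypothetical, introduced only for this illustration; they are not taken from any particular decision tree learner. Two trees of identical shape receive the same node and leaf counts, even though one tests a single attribute at its root and the other a combination of ten.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    # Hypothetical field: number of attributes used by this node's test
    # (0 for a leaf, 1 for a single-attribute comparison, more for a
    # linear combination of attributes).
    attrs_in_test: int = 0
    children: List["Node"] = field(default_factory=list)


def count_nodes(node: Node) -> int:
    """Total number of nodes in the tree, internal nodes and leaves."""
    return 1 + sum(count_nodes(child) for child in node.children)


def count_leaves(node: Node) -> int:
    """Number of leaf nodes (nodes without children)."""
    if not node.children:
        return 1
    return sum(count_leaves(child) for child in node.children)


def total_attrs_tested(node: Node) -> int:
    """Sum of attributes used across all tests; one possible way to
    account for the complexity of the nodes themselves."""
    return node.attrs_in_test + sum(
        total_attrs_tested(child) for child in node.children
    )


# Same shape, very different tests at the root.
simple_tree = Node(attrs_in_test=1, children=[Node(), Node()])
complex_tree = Node(attrs_in_test=10, children=[Node(), Node()])

for name, tree in [("simple", simple_tree), ("complex", complex_tree)]:
    print(name, count_nodes(tree), count_leaves(tree), total_attrs_tested(tree))
```

Both trees report three nodes and two leaves; only the third figure separates them. Summing attributes per test is just one assumed refinement, and it is no less tied to a particular representation than the counts it supplements.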

For decision lists, a count of the number of rules is sometimes used. However, this too has limitations, since some rules can contain more disjuncts than others, and the disjuncts themselves can vary in their complexity. Some users therefore also consider the average number of disjuncts per rule.
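As a rough sketch of these two measures, a decision list can be represented as a list of rules, each rule being a list of its disjuncts. This representation and the attribute names in the example are assumptions made for illustration only. Two lists with the same number of rules can differ markedly in the average number of disjuncts per rule.

```python
from typing import List

# Assumed representation: a rule is a list of disjuncts, each written
# here as a plain string; a decision list is a list of such rules.
Rule = List[str]


def rule_count(rules: List[Rule]) -> int:
    """Number of rules in the decision list."""
    return len(rules)


def mean_disjuncts_per_rule(rules: List[Rule]) -> float:
    """Average number of disjuncts per rule (0.0 for an empty list)."""
    if not rules:
        return 0.0
    return sum(len(rule) for rule in rules) / len(rules)


# Hypothetical rules; the attribute names are invented for this example.
compact = [
    ["calls > 3"],
    ["wait_time > 20"],
]
sprawling = [
    ["calls > 3", "wait_time > 20", "transferred"],
    ["escalated", "repeat_caller"],
]

for name, rules in [("compact", compact), ("sprawling", sprawling)]:
    print(name, rule_count(rules), mean_disjuncts_per_rule(rules))
```

Neither figure captures how complex each individual disjunct is, which is the remaining limitation noted above.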


Mohammed Waleed Kadous 2002-12-10