In Table 3, the features generated are binary -
i.e. we are checking for the presence of particular instantiated
features. However, this gives a very ``crisp'' decision boundary. It
may be more useful to have a richer measure of membership. We use the
measure
, where
is the distance to the
nearest centroid and
is the distance to the second nearest
centroid. This measure has useful properties; for instance a point on
the boundary between two regions has
, whereas the centroid has a
measure of
. This expands the hypothesis language
significantly and makes the classification more robust.