Next: Implemented metafeatures
Up: Implemented segmenters
Previous: Expectation-Maximisation
This is an implementation of the algorithm discussed in Section
4.12. It accepts the following parameters:
- numTrials: The number of random trials to run. Default
value is 1000.
- minCent: The minimum number of centroids to
consider. Default value is 2.
- maxCent: The maximum number of centroids to
consider. Default value is 8.
- clustBias: The algorithm currently selects a random
number of centroids between minCent and maxCent based on a
linear probability distribution. In other words, the probability of
choosing $k$ centroids is $k/j$ times the probability of choosing
$j$ centroids. Strictly speaking, this is incorrect, since there are
far more subsets with 8 instances than there are with 2 instances.
In fact, the number of subsets follows a binomial distribution, and
for cases where there are far more instances than centroids, it
approximates a factorial distribution. To compensate for this, one
can set a cluster bias $b$, so that the probability of choosing a
subset with $k$ instances is proportional to $k^b$. Default value
is 1 (i.e. linear).
- dispMeasure: The disparity measure to use. Possible
values are gainratio, gain and chisquare.
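
The effect of clustBias can be illustrated with a minimal Python sketch. This is not the actual implementation. It assumes that the probability of choosing a given number of centroids is proportional to that number raised to the power clustBias, so that a bias of 1 gives the linear distribution. The function names and the figure of 100 instances are hypothetical, chosen only for illustration.

```python
import math
import random


def centroid_count_weights(min_cent=2, max_cent=8, clust_bias=1.0):
    """Return {k: probability}, assuming P(k) is proportional to k ** clust_bias."""
    counts = range(min_cent, max_cent + 1)
    raw = [k ** clust_bias for k in counts]
    total = sum(raw)
    return {k: w / total for k, w in zip(counts, raw)}


def sample_centroid_count(rng, min_cent=2, max_cent=8, clust_bias=1.0):
    """Draw one centroid count from the biased distribution (illustrative only)."""
    weights = centroid_count_weights(min_cent, max_cent, clust_bias)
    ks = list(weights)
    return rng.choices(ks, weights=[weights[k] for k in ks], k=1)[0]


# With the default bias of 1, 8 centroids are 8/2 times as likely as 2.
probs = centroid_count_weights()
ratio_linear = probs[8] / probs[2]

# By contrast, the number of possible subsets grows much faster.
# The 100 instances here are a hypothetical example size.
ratio_subsets = math.comb(100, 8) / math.comb(100, 2)

# One random draw, seeded so the sketch is repeatable.
k = sample_centroid_count(random.Random(0))
```

Raising clust_bias above 1 shifts the draws towards larger numbers of centroids, which moves the selection closer to the subset counts without enumerating them.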
Mohammed Waleed Kadous
2002-12-10