What characteristics are desirable in our artificial dataset? Firstly, it must be multivariate; that is to say, there must be more than one channel of data. Secondly, it must show some of the forms of variation discussed in Chapter 3, namely:
Thirdly, it would be worth considering the effects of noise at the feature level, as well as at the Gaussian level; that is to say, to have, say one channel of data irrelevant or insignificant. Fourthly, we want the ``correct'' or ideal solution to the problem to be known; hence we can compare concepts produced by TClass with the known concepts. Fifthly, we want it to be reproducible, so that we can produce as many data as we like.
Furthermore, we want all of these parameters to be tunable, hence allowing us to look at the effect that each has individually.