How long will TClass take to learn and how much memory/space
will it take in order to calculate the results? A rather informal
analysis of each will be presented. First, some notation: Let
be
the number of streams in the training set. Let
be the number of
metafeatures we are applying to the domain. Furthermore, let us
assume that applying a particular metafeature will on average generate
instantiated features. Also, assume that on average, the length of
the streams is
.
The instantiated feature extraction consists of an inner loop that
examines each of the
streams in turn. Within that inner loop, the
function extract is called for each instance. Assuming that the
extract function is linear in the length of the stream (i.e.
is
)
, then the
feature extraction will take
time. However, the function is
called once for every metafeature, and hence the extraction stage is
.
As to space, under the previous assumptions, the amount of data to be
stored is to be stored is
: for
streams, and for
metafeatures,
instantiated features must be stored.
Synthetic feature construction is called once for each
metafeature. It is flattened into a single layer, but the amount of
data for each metafeature is still
. Consider the random search
algorithm. At each stage, it chooses a random subset of points with
some upper bound (in the default case eight). It then compares each
point (of which there are
) to each of the elements of the
subset. This operation is repeated a fixed number of times, so this is
still
. Since it is called once for each metafeature, the total
amount of time is also
.
Since there is some upper bound on the number of regions that will be
produced, then there will be
centroids which need to be stored.
Then there is the attribution stage. For each instance, of which there
are
,
instances (on average) with
metafeatures must be
compared. This involves looking against a set of centroids with an
upper bound. Hence this too is an
operation.
The data storage requirements are
, since for each instance,
there are a certain maximum number of metafeatures and a maximum
number of centroids per metafeature.
The final stage is the learning stage. However in general, we do not
know what the amount of time for the learning algorithm is. Let us
assume that the learning algorithm depends linearly on the number of
attributes and the number of instances. Then in this case, there are
instances, each with
features, so the learner will take
time.
The global feature extraction will generate
features. Since
many global features depend on things like working out the mean, each
one may take up to
time. Hence the total time is
.
Hence finally, the total time is
, or after
simplifying, it is
. As for the space, the most space
is required to store the data is
.
This is good. It means that the total time and space used is linear in five components: