

Domain application file

To decide which components to use for a particular problem domain, TClass reads an application file (usually given the suffix ``.tal'', short for TClass Application List). This file describes which global features, metafeatures and parameter space segmenters to use for that domain. An example component description file for the Tech Support domain is shown in Figure 5.13.

Figure 5.13: Component description file for Tech Support domain.
\begin{figure}
\begin{boxedverbatim}
globalcalc {
global V-mean mean {
channel...
...
numTrials ''10000''
dispMeasure chisquare
}
}
\end{boxedverbatim}
\end{figure}

The first part of the file in Figure 5.13 describes the global feature extractors. The first is a mean volume level: L is treated as 0 and H is treated as 1, so the mean is effectively the proportion of time for which the conversation volume is high. The format for the entries in the global section is:

global <attribute name> <feature type> {
  <parameter> <value> 
  ...
  <parameter> <value>
}

Figure 5.13 shows a typical example of a global declaration. It creates a global attribute called V-mean, using a mean global feature extractor (other types of global feature extractor include min and max). The mean global feature extractor accepts the parameter channel, which tells it which channel to compute the mean of. The channel must be listed in the domain description file. More information about globals can be found in Section 5.5.3.
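The equivalence between the mean of a two-valued channel and the proportion of time spent at the high level can be checked with a few lines of Python. This is only an illustrative sketch, not TClass code: the function name mean_level and the sample data are assumptions made for the example.

```python
def mean_level(samples, mapping=None):
    """Mean of a symbolic channel after mapping each symbol to a number."""
    if mapping is None:
        # Assumed encoding, following the text: L counts as 0, H as 1.
        mapping = {"L": 0, "H": 1}
    if not samples:
        raise ValueError("cannot take the mean of an empty channel")
    return sum(mapping[s] for s in samples) / len(samples)


# Hypothetical volume channel, one symbol per time step.
volume = ["L", "H", "H", "L", "H", "H", "H", "L"]

v_mean = mean_level(volume)
fraction_high = volume.count("H") / len(volume)
```

Under this encoding, v_mean and fraction_high are the same number for any sequence of L and H values, which is why the V-mean attribute can be read as the proportion of the conversation during which the volume is high.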

The next section describes which metafeatures to use. Each metafeature application is described in a section like:

metafeature <metafeature name> <metafeature type> {
  <parameter> <value>
  ...
  <parameter> <value>
}

The metafeature's name is used later in the segmenter section. The type determines which kind of metafeature is applied. For instance, in Figure 5.13, the rle (short for run-length encoding) metafeature is a straightforward generalisation of the LoudRun metafeature. In this case, we are looking for ``runs'' on the channel V with a minimum length of 1. We are also interested only in runs of high volume, not low volume, so we restrict our attention to runs of ``H''s.
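To make the idea of extracting runs concrete, the following Python sketch finds maximal runs of a chosen value in a symbolic channel. It is a rough illustration under stated assumptions, not the TClass rle implementation: the function name runs, its parameters value and min_length, and the (start, length) representation of a run are all chosen for the example.

```python
from itertools import groupby


def runs(samples, value="H", min_length=1):
    """Return (start, length) for each maximal run of `value` in `samples`."""
    found = []
    position = 0
    for symbol, group in groupby(samples):
        length = sum(1 for _ in group)
        # Keep only runs of the requested symbol that are long enough.
        if symbol == value and length >= min_length:
            found.append((position, length))
        position += length
    return found


# Hypothetical volume channel, one symbol per time step.
volume = list("LHHLHHHL")

loud_runs = runs(volume, value="H", min_length=1)
long_loud_runs = runs(volume, value="H", min_length=3)
```

With a minimum length of 1, every stretch of H is reported; raising the minimum length discards the shorter runs, which is the role the minimum-length setting plays in the description above.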

The final section sets up parameter space segmentation. Typically, there will be an equal number of metafeature applications and parameter space segmenters. Each segmenter entry specifies which segmenter to apply to which metafeature. The general format is:

segmenter <attribute prefix> <segmenter type>  {
   <parameter> <value>
   ...
   <parameter> <value>
}

The attribute prefix is the name that will be used for the resulting attributes (a number is appended to indicate which centroid each attribute corresponds to). The segmenter type determines whether we are using k-means (kmeans), expectation-maximisation (em) or directed segmentation (directed). Every segmenter takes one required parameter: the metafeature to apply the segmentation to. In Figure 5.13, we build a segmenter for the metafeature loudrun, which is a directed segmenter (i.e., the random search algorithm described in Section 4.12). We specify that it should try 10000 random subsets for centroids, and that we wish to use the $\chi^2$ disparity measure.
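The global, metafeature and segmenter entries all share the same shape: a keyword, a name, a type, and a brace-delimited list of parameter/value pairs. The Python sketch below reads entries of that shape into dictionaries. It is not the TClass parser, and several details are assumptions made for illustration: the function parse_entries, the sample text, the segmenter prefix loudrun-seg, the parameter name metafeature, and the use of channel as a parameter of rle do not come from the original file.

```python
import re

# One entry: <keyword> <name> <type> { <parameter> <value> ... }
ENTRY = re.compile(r"\b(\w+)\s+(\S+)\s+(\S+)\s*\{([^{}]*)\}")


def parse_entries(text):
    """Parse application-file style entries into a list of dictionaries."""
    entries = []
    for keyword, name, kind, body in ENTRY.findall(text):
        params = {}
        for line in body.strip().splitlines():
            parts = line.split(None, 1)
            if len(parts) == 2:
                # Drop surrounding double quotes from the value, if any.
                params[parts[0]] = parts[1].strip().strip('"')
        entries.append(
            {"keyword": keyword, "name": name, "type": kind, "params": params}
        )
    return entries


# Hypothetical sample; names and values marked above are assumptions.
SAMPLE = """
global V-mean mean {
  channel V
}
metafeature loudrun rle {
  channel V
}
segmenter loudrun-seg directed {
  metafeature loudrun
  numTrials 10000
  dispMeasure chisquare
}
"""

entries = parse_entries(SAMPLE)
```

Each resulting dictionary records the keyword (global, metafeature or segmenter), the name or attribute prefix, the type, and the parameters. The sketch only handles individual entries and ignores any enclosing section such as globalcalc.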

This completes the description of the inputs to the TClass system. We now discuss some of the available components.


Mohammed Waleed Kadous 2002-12-10