next up previous contents
Next: 2.7.5 Hidden Markov Models Up: 2.7 Learning tools Previous: 2.7.3 Instance-based Learning

2.7.4 Grammar-based techniques

Although not at the moment a mainstream learning technique, grammar-based learning algorithms may also be applied, although they are less general than other learning techniques.

A complete review of grammars is beyond the scope of this thesis, but a basic introduction will be given here.

A grammar consists of four things:

A set of terminals,
where a terminal is the most primitive symbol in the grammar. In English, for example, words would be considered terminals.

A set of non-terminals,
where non-terminals are symbols which are, as the name indicates, symbols which are not terminals. Typically, in English, this would be things like sentences, noun phrases, adverbial phrases, subjects and so on.

A set of production rules,
where a production rule is a transformation from one sequence of terminals and/or non-terminals to another sequence of terminals and non-terminals. The left hand side must have at least one non-terminal in it. An example of a production would be that if you have a noun_phrase (a non-terminal symbol) then it can be replaced with peartridgegif (which is a terminal symbol). There can be more than one production for a given right-hand side.

A Start symbol,
where a start symbol shows where all valid sequences of symbols we want to be able to produce can be derived from. For example, in English, the Start symbol might be sentence, from which we could derive some valid sentences, by following the production rules.

Given a stream of terminals, it is possible to take these terminals and build out of it a ``parse tree'' -- a tree illustrating the steps taken to get from the start symbol to the sequence of terminals observed (this activity is of course known as parsing)gif. This tree acts as a store of information, and by applying operations to this tree, meaning can be extracted from the sequence of apparently random terminal symbols.

Although this sounds complex, it is done on a daily basis by the compilers used in our computers. Although most people think of symbols as words, it might be possible to take a broader view and think of basic sub-gestures as the terminals (with some additional terminals for movement) of a language which could be the set of Auslan signs.

The problem with these systems at the moment is that these sort of grammars are typically hand-built. There is on-going research into automatic grammar learning programs, but these are still in their infancy.



next up previous contents
Next: 2.7.5 Hidden Markov Models Up: 2.7 Learning tools Previous: 2.7.3 Instance-based Learning



waleed@cse.unsw.edu.au