Bill Wilson's Natural Language Processing Research Page

Bill Wilson
School of Computer Science and Engineering
University of New South Wales

Except for occasional projects/papers with my former research student, Dr Kyongho Min, I've now moved on to other projects, but this page remains as a historical record.

One problem that I worked on was an attempt to capture the graphotactic patterns of English words: e.g. that a word can end but not begin with "nd". It developed into a comparison of different recurrent network archictectures for studying problems of this type. The graphotactic project in turn developed out of a project that used letter trigrams to try to distinguish, among novel words in text, those that were typographical or similar errors from those that were real words not covered by one's lexicon. (Obviously, some typographical errors result in words with graphotactically legal structure, like the mis-spelling of "architecture" that I just noticed, a couple of sentences back, and some real unknown words would be borrowings from languages with different graphotactic structures, but the idea was to get at least some idea of whether the novel word was OK.)
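As a rough illustration of the trigram idea, here is a minimal sketch in Python. It is not the code used in the project: the toy lexicon, the "#" word-boundary marker, and the function names are all assumptions made for the example. It collects the letter trigrams that occur in known words and then lists the trigrams of a novel word that were never seen.

```python
def trigrams(word):
    """Return the letter trigrams of a word, padded with '#' at both ends.

    The '#' boundary marker is an assumption of this sketch; it lets a
    trigram record whether a letter pair starts or ends a word.
    """
    padded = f"#{word.lower()}#"
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


def build_trigram_set(lexicon):
    """Collect every trigram that occurs in some word of the lexicon."""
    seen = set()
    for word in lexicon:
        seen.update(trigrams(word))
    return seen


def unseen_trigrams(word, seen):
    """List the trigrams of a word that never occur in the lexicon."""
    return [t for t in trigrams(word) if t not in seen]


# A toy lexicon, purely illustrative; a real one would be far larger.
LEXICON = ["hand", "band", "under", "kind", "end", "dance", "tend"]
SEEN = build_trigram_set(LEXICON)

# "dand" is not in the lexicon, but all of its trigrams are attested,
# so it looks graphotactically legal.
print(unseen_trigrams("dand", SEEN))

# "ndad" begins with "nd", so the trigram "#nd" is among those flagged.
print(unseen_trigrams("ndad", SEEN))
```

With the boundary marker, "nd#" (word-final "nd") is attested in this lexicon while "#nd" (word-initial "nd") is not, which is the kind of pattern mentioned above. A word with no unseen trigrams may still be an error, and a word with several may still be a genuine borrowing, so a check like this gives only a hint.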

Two former students in this area are:

Here's a collection of ill-formed sentences (as used by Min, but collected by me).
Bill Wilson's contact info

Last updated: