- So it's lame.
But all the names having ``glove'' in them have been taken :-).
- Moore's Law states that the amount of computing
power available for a given price doubles once every two years. In
fact, Moore's Law was made to look conservative when the
improvements in instrumented glove technology that occurred while
this investigation was being carried out are considered.
- Plans are already underway to create a
CD-ROM version of Trevor Johnston's Auslan Dictionary
[Joh89]. It would not be difficult to record the
definitions in Auslan itself as well as in English.
example of this is known as the Stokoe (stoa-kee) notation,
which looks to the untrained eye as if it was lifted off the side of
an Egyptian pyramid -- essentially it appears to be a series of
- By using a k-nearest
neighbour approach with IBL1, rather than the 1-nearest neighbour,
and with C4.5 by using ``soft thresholds'', where, if a comparison is
being made on a particular attribute and the value of that attribute
is close to the threshold it is being compared against, we explore both
- As the Architecture Machine Group
at MIT came to be known.
- This term comes from D.
J. Sturman's PhD thesis [Stu92].
- Yes, that is an awful pun.
- This makes one think carefully
about why English uses so many words. Other languages, such as
Arabic and Chinese, use fewer words to convey a similar message. One
possible cause is that with vocal languages, high levels of
redundancy are used as a means of allowing error correction. The
same level of redundancy may not be required in signing, or for that
matter in other vocal languages.
- This is not
always the case. If subject order is important, then it does matter.
For example, ``man shoot dog'' and ``dog shoot man'' are equivalent
in Auslan, since it is clear that the dog cannot shoot the man. On
the other hand, ``man shoot woman'' and ``woman shoot man'' have
very different meanings. Usually the most emphatic phrase is at the
beginning of the sentence. So you can still use constructions such
as ``woman pause man shoot'', if you want to emphasise that it
was the woman in particular who the man shot.
however, objects with proper names have a sign attributable to an
aspect of that object. For example, some signers have a sign name
that reflects something about the way they look or act. The sign for
Sydney, for example, is the hands making the shape of the Harbour
- The Deaf
community feels that such an approach is not particularly effective
and based on recent studies, they suggest that a ``bilingual''
approach should be taken, where children are first taught Auslan,
and are then taught a second language such as Signed English, or
Oral English, in a similar way to the treatment given in ESL
(English as a second language) for non-native English speakers.
- As clichéd as the words Poetry in Motion
are, they seem nowhere more apt than watching an energetic Auslan
speaker in full swing. The speed and coordination required to do so
are surprising. If you don't believe me, go to a Deaf Club meeting,
or watch [Uni89].
- One member of the Deaf community told me that most
effective lip-readers are at least 50 years old -- because that's
how long it takes them to learn and get used to it.
- While this was true at the time of writing
of [SZ94], it is no longer the case, with several
low-cost gloves now on the market.
- It seems that
Microsoft has started a very depressing trend.
- The same argument we applied before applies here. If you use
a sigmoid function, you get additional information like ``just above
or just below''.
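
A minimal sketch of the difference, assuming the comparison is between a hard threshold unit and a sigmoid unit (the function names `step` and `sigmoid_unit` are illustrative only and do not come from the thesis software):

```python
import math


def step(x, threshold=0.0):
    """Hard threshold: only says which side of the threshold x is on."""
    return 1.0 if x >= threshold else 0.0


def sigmoid_unit(x):
    """Logistic sigmoid: the output also reflects how far x is from 0."""
    return 1.0 / (1.0 + math.exp(-x))


# Both inputs are above the threshold, so the step unit cannot tell them apart.
just_above, well_above = 0.1, 5.0
same_for_step = step(just_above) == step(well_above)

# The sigmoid gives a value near 0.5 for "just above" and near 1 for "well above".
graded = (sigmoid_unit(just_above), sigmoid_unit(well_above))
```

The step unit throws away the distance from the threshold; the sigmoid keeps it as a graded output.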
- This sort of
network is commonly known as ``feed-forward''. There are other
types of neural nets that have feedback paths in them that feed the
output of the previous layer back through the net again.
- This tree is
actually one of the trees generated early on in the testing process.
It determines whether a handshape is a ``B'' or not.
- Of course, it's possible to get ties, and when this
occurs, several methods exist for breaking the tie, such as
reverting to 1-nn (i.e. just looking at the nearest neighbour), or
taking the sum of the distances and finding the minimum.
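
The second tie-breaking method can be sketched as follows. This is an illustration only, not the IBL1 implementation used here: it assumes Euclidean distance and a training set held as a list of (feature vector, label) pairs, and the name `knn_classify` is made up for the example.

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """Classify `query` by majority vote among its k nearest neighbours.

    `train` is a list of (feature_vector, label) pairs. If several labels
    tie on votes, the tied label whose neighbours have the smallest summed
    distance to the query wins.
    """
    # Distance from the query to every training instance, nearest first.
    neighbours = sorted(
        ((math.dist(features, query), label) for features, label in train),
        key=lambda pair: pair[0],
    )[:k]

    votes = Counter(label for _, label in neighbours)
    best = max(votes.values())
    tied = [label for label, count in votes.items() if count == best]
    if len(tied) == 1:
        return tied[0]

    # Tie: sum the distances per tied label and take the minimum.
    totals = {
        label: sum(d for d, lab in neighbours if lab == label)
        for label in tied
    }
    return min(totals, key=totals.get)


train = [
    ((0.0, 0.0), "a"),
    ((1.0, 0.0), "b"),
    ((0.0, 1.0), "b"),
    ((5.0, 5.0), "a"),
]
clear_vote = knn_classify(train, (0.1, 0.1), k=3)  # two "b" votes against one "a"
tie_broken = knn_classify(train, (0.1, 0.1), k=2)  # one vote each, "a" is closer
```

In the two-neighbour case, reverting to 1-nn would give the same answer, since the closer neighbour also has the smaller summed distance; the two methods can differ for larger k.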
- A peartridge is a strange animal, known for
being incredibly funny, almost as funny as frogs. It is also, by
the way, a noun, and hence a noun phrase.
course, if you were using the grammar proposed in the previous
section, you would end up with a peartridge in the parse tree,
which, surprisingly, is a fact that is well known by
``sobriety-challenged'' people around the world, especially on the
first day of Christmas.
- It is common to
find words that can be verbs or nouns or adjectives in spoken
languages. In sign languages this happens even more frequently.
- Most current large lexicon
speech recognition systems use HMMs as well as a number of other
- Dorner uses this to indicate 3 dimensions in space plus
one in time.
- Formants are components of speech, identified by
their unique spectral properties.
- Several other mapping techniques were attempted
in addition to the one presented, but the one presented was found to
be the most effective.
- For example, if each
execution of C4.5 takes 20 minutes, this means we could sample 72
possibilities a day. A simple neural net might take 3 hours to
evaluate, which would mean 8 possibilities considered a day.
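
The arithmetic behind these figures, as a small sketch that assumes runs are executed one after another on a single machine for a full 24 hours (the helper name `evaluations_per_day` is illustrative):

```python
MINUTES_PER_DAY = 24 * 60


def evaluations_per_day(minutes_per_run):
    """Number of complete runs that fit into one day of back-to-back execution."""
    return MINUTES_PER_DAY // minutes_per_run


c45_runs = evaluations_per_day(20)       # 20-minute C4.5 runs
neural_runs = evaluations_per_day(180)   # 3-hour neural net runs
```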
- And the greater research dollars
available in the US, and the larger Deaf population to support
- Humans have
become incredibly good at this, although we frequently don't
appreciate it. We can for example determine a type of material
simply by running our fingers over it, yet the sensation of touch is
not incredibly complex, and does not provide particularly accurate
or clean data [Rhe91].
- The dominant hand
may be either the left hand or the right hand, but is consistent
for a given signer -- just as in handwriting. However, the
PowerGlove only comes in a right-handed model. This means that the
system as it exists now is only suitable for right-handed signers.
This will be fixed once the PC-PowerGlove arrives.
- In the vague hope that by some miracle a $6500
CyberGlove would appear on the doorstep of the School of Computer
Science and Engineering one day.
- It was decided that it was best to
separate the functions of setting the terminal and the actual
glove-reading to allow the code to be maximally portable -- even to
platforms other than Unix. Provided the operating system the
software is being ported to has some form of redirection possible,
then gloveread will run on it.
it turns out, even this is not sufficient, since frequently people
will interpret the gloss differently. This turned out to be one of
the reasons that it was difficult to get inter-signer recognition
working as well as intra-signer recognition.
- Each ``click'' in the X dimension, for example, is
about 0.5 cm. This gives a total swing in each of the
dimensions of only about 120 cm, a limit that is imposed not by the hardware,
but by the simple fact that you can only express 256 values with 8
bits. This is enough for a relative positioning system (since most
signs do involve some bend at the elbow and people do not like to
swipe each other out by swinging their arms around dramatically),
but it is not enough for an absolute positioning system that is not
centred relative to the body.
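
As a rough check on that range, the sketch below multiplies the number of values an 8-bit reading can take by the approximate click size quoted above. The constant and function names are illustrative, and the 0.5 cm figure is only approximate.

```python
BITS_PER_AXIS = 8     # width of each position reading
CM_PER_CLICK = 0.5    # approximate size of one "click", as quoted above


def total_swing_cm(bits=BITS_PER_AXIS, cm_per_click=CM_PER_CLICK):
    """Upper bound on the span that can be expressed along one axis."""
    distinct_values = 2 ** bits
    return distinct_values * cm_per_click


swing = total_swing_cm()  # 256 values at about 0.5 cm each
```

This gives 128 cm, in line with the figure of about 120 cm.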
- The words ``feature'' and ``attribute'' are used
synonymously in this document. From my understanding of the issues,
they are called features by the people trying to extract the
features/attributes and attributes by the people who write the
learning algorithms. Furthermore, when the word ``feature'' is used
here, it is not in the same context as that used in Wexelblat's
thesis, where a feature means an interesting change in a value of
the raw data.
- Unless a signer is really, really annoyed and
has entered into a very heated debate. In the same way that speaking
people raise their voices when they're arguing, signs become more
exaggerated and energetic as the arguments become more heated.
- It seems unusual, but the
ultrasonic emitters are mounted on the glove itself, and not vice
versa -- i.e. a logical alternative would be to put the emitters
at the L-bar and the receivers on the glove. As Chris Hand said in
discussion of this apparent design flaw: ``I don't know why they
did it. It was a mistake anyway. If you go the other way round
(as the Logitech flying mouse does), then you're sending
ultrasonic pulses towards something soft and round (human being)
rather than something hard and flat (monitor, desk) which reflects
echoes of the pulses all over the place, causing glitches.''
- Don't worry, nothing too
- In fact it can be shown that in the limit with
continuous glitching the filter behaves as 48#44 (in Z transform notation) which is an unusual
off-line linear filter.
- The official reason for this is that it is only fair not
to remove outlier points, and part of the testing of any system is
to see how well it copes with noisy data. The non-official,
pragmatic reason is that 6 650 signs were collected and
going through them, analysing them and deciding which ones were true
outliers can take just a little time, a resource which an
undergraduate thesis student has plenty of :-).
- This dataset was the first
dataset taken. After the dataset was taken, the results appeared
too consistent. After some thought, it was realised that this was
because the words were appearing in the same order each time, and
thus the ``fatigue'' effects were equivalent for a given sign. In
subsequent samplings, the sign order was shuffled randomly to
ensure that the fatigue effects were not causing unnatural
consistency in the data.
- In fact there are several forms of
cross-validation -- for example there is n-fold cross validation,
where the data is divided into n near-equally sized sets. The
system is trained on n-1 of the sets and tested on the remaining
set. The test is then repeated n times, each time with a different
test set. At the extreme end there is leave-one-out cross
validation. In this case, we train on all of the training set
except one and then test the accuracy on the ``left-out'' sample.
This is repeated with each sample in turn left out once. In
practice, this is only computationally efficient with IBL1.
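
A minimal sketch of how the two schemes partition the data, working on sample indices only. The function names are illustrative, and the samples are assumed to have been shuffled beforehand if their order matters.

```python
def n_fold_splits(num_samples, n):
    """Yield (train_indices, test_indices) pairs for n-fold cross-validation.

    The samples are divided into n near-equally sized contiguous sets;
    each set serves as the test set exactly once.
    """
    indices = list(range(num_samples))
    base, extra = divmod(num_samples, n)
    start = 0
    for fold in range(n):
        # The first `extra` folds get one additional sample each.
        size = base + (1 if fold < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def leave_one_out_splits(num_samples):
    """Leave-one-out is n-fold cross-validation with one sample per fold."""
    return n_fold_splits(num_samples, num_samples)


three_fold = list(n_fold_splits(10, 3))
loo = list(leave_one_out_splits(5))
```

With 10 samples and 3 folds the test sets have sizes 4, 3 and 3. Leave-one-out on N samples needs N separate training runs, which is why it is only practical with a learner such as IBL1 where ``training'' amounts to little more than storing the instances.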
- Note that we
ignore the first sample, since 51#47 is clearly not defined,
since 52#48 are not defined.
- Note: We have ignored the first two samples,
because, as previously discussed, 60#56 and 61#57 do
- In this case the range is symmetrical about the x-axis,
but this is just a coincidence.
- It was thought there would
be little difference in the behaviour of IBL1, IBL2 and IBL3.
- A poor man in this situation is
someone who does not have much CPU time or power available to him.
- This technique has certain statistically
complicating factors. For example, it means that the error rate
reported at low numbers of samples should be more reliable, except
for the fact that the learner is probably very sensitive to which
training examples were selected. At high numbers of examples, there
is less sensitivity to selection of the training examples, but the
testing examples are fewer and thus the reporting of error rate is
very sensitive. Furthermore, we cannot apply cross-validation,
since frequently the training set is smaller than the test set. Of
course, inverse cross-validation might be possible, i.e. simply
cycling through the possible training sets, rather than the test
sets, but this makes comparison of test results unfair. In brief,
however, expect the results to be a little noisy.
- PowerGloves cost approximately US$50 each, and the
interface device could be built for less than US$30.
- Typical difference in
performance was up to 5 per cent. At the same time, however, IBL3
significantly reduced the number of instances that had to be kept,
sometimes by up to 70 per cent. This of course means less computing
time spent comparing distances.
- You will recall that even though Starner's system
makes use of a camera, it requires that the user wear a light
yellow glove on one hand and a light orange glove on the other.
- There are actually 31 handshapes in Auslan, but 30
handshapes used in this test. The reason is that the ``four''
handshape is identical to the second variant of the
``spread'' handshape. Their use in sign language is
distinguished by the context in which the handshape occurs, much as
in English spelling, where the letters ``c'' and ``k'' can represent the
same phoneme. We tell which is being used in spoken English using
- Actually, although
the MCP (who actually wants to read ``metacarpophalangeal'', let
alone type it) is considered the first joint, because that's where
our fingers start, there are four joints in each of our
fingers -- the trapeziometacarpal for the thumb and the
metacarpocarpals for the other four digits. See figure
2.1 for a diagram.