A pivotal problem in American Sign Language (ASL) recognition is managing complexity. Because ASL's rich set of inflections causes signs to appear in many different forms, it is futile to model the language on a per-sign basis for large-scale recognition. Instead, it is necessary to break the signs down into their constituent phonemes, which are limited in number.
Such a breakdown, however, raises two problems. First, current work in ASL phonology still has gaps that make it difficult to apply directly to ASL recognition. Second, many events occur simultaneously in ASL, such as the movements of the strong and the weak hands, or a change of handshape and a movement of the arm.
To solve the first problem, we experimented with a modified version
of Liddell and Johnson's Movement-Hold model [1]. We show what changes
were necessary to adapt this model to a Hidden Markov Model (HMM) framework
for continuous recognition. Studying these changes may in turn help to
clarify the position of sign language linguistics on
phonological models of ASL.
The second problem is serious because HMMs are by nature a sequential framework and are thus inadequate for capturing phonemes that occur simultaneously in a sign. A naive approach would be to model every combination of strong-hand and weak-hand movements as a single phoneme, but this would require so many HMMs that training them would be intractable, because of the huge amount of training data required. Clearly, it is necessary to keep the number of HMMs as low as possible.
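As a rough illustration of why the combined approach scales poorly, the sketch below compares the number of HMMs needed when every strong-hand/weak-hand pair gets its own model with the number needed when each hand is modeled separately. The function name `model_counts` and the inventory sizes 30 and 20 are hypothetical and are not taken from this work.

```python
def model_counts(n_strong: int, n_weak: int) -> tuple[int, int]:
    """Return (combined, independent) HMM counts for two channels.

    n_strong and n_weak are the numbers of distinct phonemes in the
    strong-hand and weak-hand channels (hypothetical inputs).
    """
    combined = n_strong * n_weak      # one HMM per (strong, weak) pair
    independent = n_strong + n_weak   # one HMM per phoneme per channel
    return combined, independent


# Hypothetical inventory sizes, chosen only to show the growth rates.
combined, independent = model_counts(30, 20)
print(combined, independent)
```

The combined count grows as a product of the channel inventories, while the independent count grows only as their sum.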
To this end, we discuss a modification to the HMM framework that models
the strong and the weak hands as moving in parallel, independently from
each other. The advantage of this approach is that the HMMs for the strong
and the weak hands can be trained separately, greatly reducing the amount
of training data needed. If successful, this approach could also easily
be extended to include hand configuration and nonmanual markers in the
recognition algorithms.
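One way to picture the independence assumption is that a joint score for the two hands factors into per-channel scores. The following is a minimal sketch, assuming discrete observation symbols and assuming that the channels are combined by summing their log-likelihoods; neither assumption is stated above, and the function names are hypothetical. It scores whole sequences with the standard scaled forward algorithm and does not attempt continuous recognition.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Log-likelihood of a discrete observation sequence under one HMM.

    start[i]    : probability of starting in state i
    trans[i][j] : probability of moving from state i to state j
    emit[i][o]  : probability of state i emitting symbol o
    """
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_lik = 0.0
    for o in obs[1:]:
        # Rescale at each step to avoid underflow on long sequences.
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][o]
            for j in range(n)
        ]
    return log_lik + math.log(sum(alpha))


def parallel_log_likelihood(strong_obs, weak_obs, strong_hmm, weak_hmm):
    """Joint score under the independence assumption.

    Each *_hmm is a (start, trans, emit) tuple. Because the channels are
    treated as independent, their log-likelihoods simply add.
    """
    return (forward_log_likelihood(strong_obs, *strong_hmm)
            + forward_log_likelihood(weak_obs, *weak_hmm))
```

Because each channel's parameters appear only in its own term, each HMM can be estimated from its own channel's data, which is the source of the reduced training requirement described above.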
References
==========
[1] Scott K. Liddell and Robert E. Johnson. American Sign Language: The
phonological base. _Sign Language Studies_, 64:195--277, 1989.