
Common Gesture Language

so we know what our bodies are saying

Overview

There are many standards for describing human body posture and gestures across the related fields of human-computer interaction, psychology, computer graphics and animation, motion capture, and so forth. This project is an attempt to unify those approaches and to specify a language that can serve all of these disciplines. We propose ways to achieve a flexible language that decouples "gesture recognition" methods (in the broad sense) from the many applications that use them. The benefits of such a unifying approach include fast and easy technology changes without modifications to the content modules, development of source methods independently of target applications, automatic translators between language descriptors, and so forth. In terms of applications, this has immediate benefits for humans in immersive training and simulation applications, for interaction with mixed reality environments (e.g., for situational awareness), for human activity analysis (surveillance), and for automated translators of cultural body language.

Examples

Descriptors

Visemes

The English language requires 22 visemes to describe all mouth shapes that produce English phonemes. That both the sound and the visuals matter is best showcased by the so-called McGurk effect. Microsoft's text-to-speech API uses the 13 Disney visemes, which were deemed sufficient to render cartoon characters that speak. A possible mapping between the two sets is shown on this good page about visemes. Visemes are not sufficient for modern animation demands, primarily because phonemes and visemes do not simply occur at the same time. Instead, we "co-articulate" phonemes: we prepare for the next one while still sounding the prior one. How animation can move to the anatomically based Facial Action Coding System (FACS) description is detailed in this viseme article at Gamasutra. The following example shows how to represent visemes in the suggested CGL.
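Separately from the CGL notation itself, the co-articulation point can be illustrated with a small Python sketch. It is not CGL syntax and not the project's method: the `TimedViseme` type, the `viseme_weights` function, the viseme labels, and the 80 ms `lead` time are all hypothetical choices made for illustration. The idea is only that each mouth shape starts forming shortly before its phoneme sounds, so two shapes overlap for a moment.

```python
from dataclasses import dataclass


@dataclass
class TimedViseme:
    viseme: str   # hypothetical label, e.g. "closed_lips"
    start: float  # time in seconds at which the phoneme starts sounding
    end: float    # time in seconds at which the phoneme stops sounding


def viseme_weights(track, t, lead=0.08):
    """Return {viseme: weight} for time t (empty if no shape is active).

    Assumption for this sketch: every mouth shape ramps up linearly during
    the `lead` seconds before its phoneme starts, which makes it overlap
    with the previous phoneme. Weights are scaled down only when they sum
    to more than 1, so a lone, partially formed shape stays partial.
    """
    raw = {}
    for v in track:
        if v.start - lead <= t < v.start:
            # Anticipation phase: the shape is forming but not yet sounded.
            w = (t - (v.start - lead)) / lead
        elif v.start <= t < v.end:
            # The phoneme is sounding: the shape is fully formed.
            w = 1.0
        else:
            continue
        raw[v.viseme] = raw.get(v.viseme, 0.0) + w
    scale = max(sum(raw.values()), 1.0)
    return {name: w / scale for name, w in raw.items()}


# Two consecutive phonemes; the second shape begins forming at 0.12 s.
track = [
    TimedViseme("closed_lips", 0.0, 0.2),
    TimedViseme("open_jaw", 0.2, 0.5),
]
print(viseme_weights(track, 0.16))
```

At 0.16 s the first phoneme is still sounding while the second shape is halfway through its ramp, so the blend is roughly two thirds `closed_lips` and one third `open_jaw`. A one-viseme-per-phoneme lookup cannot express that overlap.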

Skeleton Tree

A skeleton tree that could be employed in a common gesture language. Note that it is not as detailed as it should be, but any common gesture language should be extensible to allow for future integration of finer detail. Also note that, unlike in most other human skeleton models, FACS action units are a part of this tree.
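As a rough illustration of the two properties named above (extensibility, and FACS action units living in the same tree as the joints), here is a minimal Python sketch. The `Node` class, the `kind` tags, the joint names, and the layout of the tree are assumptions made for this example and are not the project's actual skeleton. Only the three action unit names (AU1, AU12, AU26) come from FACS.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    kind: str = "joint"  # hypothetical tag: "joint" or "action_unit"
    children: list = field(default_factory=list)

    def add(self, child):
        """Attach a child node and return it, so calls can be chained."""
        self.children.append(child)
        return child

    def find(self, name):
        """Depth-first search for a node by name; None if absent."""
        if self.name == name:
            return self
        for child in self.children:
            hit = child.find(name)
            if hit is not None:
                return hit
        return None

    def paths(self, prefix=""):
        """Yield a slash-separated path for this node and every descendant."""
        path = f"{prefix}/{self.name}"
        yield path
        for child in self.children:
            yield from child.paths(path)


def build_skeleton():
    """Build a deliberately coarse tree (joint names are illustrative)."""
    root = Node("pelvis")
    neck = root.add(Node("spine")).add(Node("neck"))
    # Facial action units hang off the head like any other node.
    face = neck.add(Node("head")).add(Node("face"))
    face.add(Node("AU1_inner_brow_raiser", kind="action_unit"))
    face.add(Node("AU12_lip_corner_puller", kind="action_unit"))
    face.add(Node("AU26_jaw_drop", kind="action_unit"))
    arm = neck.add(Node("left_shoulder")).add(Node("left_elbow"))
    arm.add(Node("left_wrist"))
    return root


skeleton = build_skeleton()
# Extending the tree later: finer detail is added under an existing node.
skeleton.find("left_wrist").add(Node("left_index_finger"))
for p in skeleton.paths():
    print(p)
```

Because finer detail is attached beneath nodes that already exist, paths to the coarser nodes stay valid when the tree grows, which is one way to read the extensibility requirement.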

Publications and Presentations

Links

People

Currently involved with the Common Gesture Language project are Mathias Kölsch, Craig Martell, and Jeff Weekley.