Real-time control of
three-dimensional avatars (controllable, responsive animated characters) is an important
problem in the context of computer games and virtual environments. Avatar
animation and control are difficult, however, because a large repertoire of
avatar behaviors must be made available, and the user must be able to select
from this set of behaviors, possibly with a low-dimensional input device. One
appealing approach to obtaining a rich set of avatar behaviors is to collect an
extended, unlabeled sequence of motion data appropriate to the application. In
this project, we explore efficient methods to exploit such a motion database
for interactive avatar control.
In our system, the motion
database initially consists of a number of motion clips containing many motion
frames. The motion database is preprocessed to add variety and flexibility by
creating connecting transitions where good matches in poses, velocities, and
contact state of the character exist. The motion frames are then clustered into
groups for efficient searching and for presentation in the interfaces. A unique
aspect of our approach is that the original motion data and the generalization
(clusters) of that data are closely linked; each frame of the original motion
data is associated with a tree of clusters that captures the set of actions
that can be performed by the avatar from that specific frame. The resulting
cluster forest allows us to take advantage of the power of clusters to
generalize the motion data without losing the actual connectivity and detail
that can be derived from that data. This two-layer data structure (motion graph
+ cluster forest) can be efficiently searched at run time to find appropriate
paths to behaviors and locations specified by the user.
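To make this two-layer structure concrete, the following Python sketch shows one way the two layers could be represented. All names (Frame, ClusterNode, build_transitions, cluster_tree), the Euclidean matching thresholds, and the per-frame transition test are illustrative assumptions, not the actual implementation.

    from dataclasses import dataclass, field

    def distance(x, y):
        """Euclidean distance between two equal-length vectors."""
        return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5

    @dataclass
    class Frame:
        pose: list              # joint angles (simplified representation)
        velocity: list          # joint velocities
        contact: tuple          # e.g. (left_foot_down, right_foot_down)
        transitions: list = field(default_factory=list)  # frames reachable next

    @dataclass
    class ClusterNode:
        cluster_id: int         # a group of similar frames
        children: list = field(default_factory=list)  # clusters one step deeper

    def build_transitions(frames, pose_tol, vel_tol):
        """Connect frame i to frame j where poses, velocities, and contact match."""
        for i, a in enumerate(frames):
            for j, b in enumerate(frames):
                if j == i + 1:
                    a.transitions.append(j)  # successive frames (clip boundaries ignored here)
                elif (i != j and a.contact == b.contact
                      and distance(a.pose, b.pose) < pose_tol
                      and distance(a.velocity, b.velocity) < vel_tol):
                    a.transitions.append(j)  # new transition spliced between clips

    def cluster_tree(frames, cluster_of, root, depth):
        """Tree of clusters reachable from frame `root` within `depth` transitions."""
        node = ClusterNode(cluster_of[root])
        if depth > 0:
            seen = set()
            for j in frames[root].transitions:
                if cluster_of[j] not in seen:
                    seen.add(cluster_of[j])
                    node.children.append(cluster_tree(frames, cluster_of, j, depth - 1))
        return node

Building one such tree per frame yields the cluster forest described above.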
Developing an intuitive
interface for avatar control is challenging due to the high dimensionality of
the avatar's motion and the real-time constraints. We explored three different
interfaces to provide the user with intuitive control of the avatar's motion:
sketch, choice, and performance.
In the maze example, we recorded
a subject walking in an empty environment to create a motion database, and then
used that motion to control the avatar in a virtual environment with obstacles.
The user specifies a path through the environment by sketching on the terrain,
and the database is searched to find motion sequences to follow the path and
avoid the obstacles.
Similarly,
we recorded motion on small sections of rough terrain and used that motion to
allow an avatar to navigate an extended rough terrain environment.
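As a rough illustration of the search behind both examples, the greedy walk below (reusing the Frame structure from the earlier sketch) picks motion-graph transitions that keep the avatar's root close to the sketched waypoints while rejecting transitions that collide with obstacles. The real system searches more globally through the clustered database; displacement, blocked, and the waypoint tolerance are assumptions made for this sketch.

    import math

    def follow_path(frames, displacement, start_frame, start_pos, path, blocked):
        """Greedily track the sketched waypoints in `path`. `displacement[j]` is
        the planar root offset contributed by frame j; `blocked(p)` tests whether
        position p lies inside an obstacle. Both are illustrative stand-ins."""
        frame, pos, chosen, target = start_frame, start_pos, [], 0
        while target < len(path):
            best, best_cost = None, float("inf")
            for j in frames[frame].transitions:
                p = (pos[0] + displacement[j][0], pos[1] + displacement[j][1])
                if blocked(p):
                    continue            # discard motions that enter an obstacle
                cost = math.hypot(p[0] - path[target][0], p[1] - path[target][1])
                if cost < best_cost:
                    best, best_cost = j, cost
            if best is None:
                break                   # dead end: no collision-free transition
            frame = best
            pos = (pos[0] + displacement[best][0], pos[1] + displacement[best][1])
            chosen.append(best)
            if best_cost < 0.1:         # waypoint reached; aim for the next one
                target += 1
        return chosen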
In choice interfaces, the user
is continuously presented with a set of possible options (directions,
locations, or behaviors) from which to choose. The user can scroll through the
possible options to select one at any time. As the avatar moves, the display
changes so that the choices remain appropriate to the context. The display
should be uncluttered to avoid confusing the user. In practice, this means that
roughly three or four actions should be presented to the user at any given
time. We use the cluster forest to obtain a small set of well-dispersed
actions for display.
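One simple way to pick such a set, sketched below, is greedy farthest-point selection over representative feature vectors for the candidate actions (for example, the heading each action produces). The feature representation and the fixed limit of four are assumptions for this illustration, not the system's exact selection rule.

    def distance(x, y):
        return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5

    def dispersed_choices(options, k=4):
        """Greedy farthest-point selection of up to k actions to display.
        `options` maps an action label to a representative feature vector,
        an illustrative stand-in for the cluster forest's summaries."""
        labels = list(options)
        if not labels:
            return []
        chosen = [labels[0]]            # seed with an arbitrary option
        while len(chosen) < min(k, len(labels)):
            # keep the remaining option farthest from everything already chosen
            best = max(
                (l for l in labels if l not in chosen),
                key=lambda l: min(distance(options[l], options[c]) for c in chosen),
            )
            chosen.append(best)
        return chosen

With planar headings as features, for instance, this would keep four distinct directions such as left, right, forward, and back, and drop a near-duplicate veer to the left.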
In performance interfaces, the
user acts out the desired motion in front of a video camera and the avatar duplicates
the motion by selecting a sequence of motions from the database. The video data
is processed to produce a silhouette, which is then matched to the silhouettes
generated from the database in order to find a matching motion. When the
appropriate motion is not in the database, the closest motion is identified and
selected.
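A minimal version of that matching step, assuming binary silhouette images and a plain pixel-overlap cost (the system's actual metric is not reproduced here), could look like this:

    import numpy as np

    def silhouette_cost(user_mask, db_mask):
        """Dissimilarity of two equal-sized binary silhouettes: the number of
        pixels where exactly one mask is foreground."""
        return int(np.logical_xor(user_mask, db_mask).sum())

    def closest_motion(user_mask, db_masks):
        """Index of the database silhouette most similar to the user's, so the
        nearest motion is chosen even when no exact match exists."""
        return min(range(len(db_masks)),
                   key=lambda i: silhouette_cost(user_mask, db_masks[i]))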
Acknowledgement
Supported in part by the NSF under Grants
IIS-0205224 and IIS-0326322.