Motion capture

from Wikipedia, the free encyclopedia
Motion capture markers on a subject's legs during a biomechanical examination

Motion capture (literally "motion detection") is a tracking method for detecting and recording movements so that computers can play them back, analyze them, process them further, and use them to control applications.

An example of such an application is the transfer of human movements to computer-generated 3D models. Other examples are head tracking and eye tracking, used e.g. to control screen output or for analysis purposes, and stereoscopic motion measurement. Another special motion capture process is performance capture, which includes the capture of human facial and finger movements, i.e. facial expressions and gestures.

Motion capture has also become widespread in video games, e.g. with Kinect, PlayStation Move and the Wii Remote.


3D motion analysis in movement science using optical tracking

In principle, several methods can be distinguished, and they can also be used in combination.

Tracking with markers

Animation of the marker movements of a signing person whose movements were recorded by motion capture.

In optical tracking with markers, cameras track active (i.e. signal-emitting) or passive markers attached to the persons or objects to be captured. From the marker movements in the individual camera images, the 3D position of each marker can be calculated by triangulation.
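The triangulation step can be sketched as follows. This is a minimal illustration, not the method of any particular system: it assumes each camera has already been calibrated so that a marker seen in its image corresponds to a known ray (camera centre plus direction) in 3D space. The function name `triangulate_midpoint` and the example camera positions are hypothetical. Because real rays rarely intersect exactly, the sketch returns the midpoint of the shortest segment between the two rays.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Dot product of two 3D vectors."""
    return sum(a * b for a, b in zip(u, v))


def triangulate_midpoint(c1: Vec3, d1: Vec3, c2: Vec3, d2: Vec3) -> Vec3:
    """Estimate a marker position from two camera rays.

    c1, c2: camera centres; d1, d2: ray directions towards the marker
    (assumed to come from a prior camera calibration).
    Returns the midpoint of the closest points on the two rays.
    """
    w = tuple(p - q for p, q in zip(c1, c2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; the marker cannot be triangulated")
    s = (b * e - c * d) / denom  # parameter along ray 1
    t = (a * e - b * d) / denom  # parameter along ray 2
    p1 = tuple(o + s * k for o, k in zip(c1, d1))
    p2 = tuple(o + t * k for o, k in zip(c2, d2))
    return tuple((x + y) / 2 for x, y in zip(p1, p2))


# Hypothetical setup: two cameras 4 units apart, both seeing one marker.
marker = triangulate_midpoint(
    (0.0, 0.0, 0.0), (1.0, 2.0, 5.0),
    (4.0, 0.0, 0.0), (-3.0, 2.0, 5.0),
)
print(marker)
```

With more than two cameras, the same idea is usually extended to a least-squares solution over all rays.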

Tracking without a marker

Pattern recognition

Among other things, pattern recognition in image processing makes it possible to track without any markers. Stereoscopic methods often make recognition easier, since the object can be located by triangulation from the two (or more) camera positions.

Silhouette tracking

Since the beginning of the 21st century, systems have been developed that perform 3D motion capture based on silhouettes. After the silhouette has been extracted from the scene using image processing algorithms, a virtual model is used to determine the joint positions, and the kinematic parameters are calculated from these. These systems offer significant advantages over conventional marker-based analysis: potential marker errors (shifting on the skin, inaccurate placement of the markers, etc.) are avoided, and recording requires considerably less effort.
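The text does not say which image processing algorithms are used for silhouette extraction. As one simple possibility, the sketch below assumes plain background subtraction on grayscale frames: a pixel belongs to the silhouette if it differs from an empty background image by more than a threshold. The function names and the threshold value are hypothetical, and the later model-fitting step is not shown.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]   # grayscale image as rows of 0..255 values
Mask = List[List[bool]]   # True where a pixel belongs to the silhouette


def extract_silhouette(frame: Image, background: Image, threshold: int = 30) -> Mask:
    """Mark every pixel that differs from the background by more than threshold."""
    return [
        [abs(p - b) > threshold for p, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


def silhouette_centroid(mask: Mask) -> Optional[Tuple[float, float]]:
    """Return the (row, column) centroid of the silhouette, or None if it is empty."""
    pixels = [(r, c) for r, row in enumerate(mask) for c, on in enumerate(row) if on]
    if not pixels:
        return None
    n = len(pixels)
    return (sum(r for r, _ in pixels) / n, sum(c for _, c in pixels) / n)


# Tiny hypothetical example: uniform background, two bright subject pixels.
background = [[10] * 4 for _ in range(3)]
frame = [row[:] for row in background]
frame[1][1] = 200
frame[1][2] = 200
mask = extract_silhouette(frame, background)
print(silhouette_centroid(mask))
```

A centroid is only a crude summary of the silhouette; the systems described above go further and fit an articulated body model to it.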


The earliest serial photographs by Eadweard Muybridge were already intended for analyzing human and animal movement. Later photographers and filmmakers attached clearly recognizable marks to their subjects to simplify the analysis of the movements. This technique survived into the video and computer age, although the marks still had to be transferred to the analysis software by hand, frame by frame. Its main areas of application were (sports) medicine, accident research (crash tests), and the rationalization of movement sequences in the workplace.

This technique contributed little to film production, although Muybridge's photographs still serve as reference material for animators all over the world. It was Max Fleischer's invention of rotoscoping that first made it possible to transfer complex movements into animation relatively easily. The Lord of the Rings (1978) by Ralph Bakshi was created in this way, as were two more recent films by Richard Linklater. Rotoscoping, however, generates no movement data; only the outward appearance of the actors is transferred into the animation (in part with software support).

Motion capture for animated films

Further processing

After digitization, the raw data can be imported into common 3D software packages using a suitable plug-in and processed there. The data is transferred to a virtual skeleton (a kind of three-dimensional stick figure). This skeleton is in turn linked to a wireframe model that follows the recorded movements. The construction of the skeleton (rigging) and its binding to the figure to be moved (skinning) contribute significantly to a credible reproduction of the recorded movement data.
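How recorded joint rotations drive a virtual skeleton can be illustrated with forward kinematics. The following is a minimal 2D sketch under simplifying assumptions, not the data model of any particular 3D package: each joint stores its parent, the length of the bone leading to it, and a rotation relative to the parent bone. The `Joint` class and the three-joint arm are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Joint:
    name: str
    parent: Optional[int]  # index of the parent joint, None for the root
    length: float          # bone length from the parent to this joint (unused for the root)
    angle_deg: float       # rotation relative to the parent bone, in degrees


def forward_kinematics(joints: List[Joint]) -> List[Tuple[float, float]]:
    """Compute world positions for a skeleton whose parents precede their children."""
    positions: List[Tuple[float, float]] = []
    world_angles: List[float] = []
    for joint in joints:
        if joint.parent is None:
            # The root sits at the origin; its angle sets the base orientation.
            positions.append((0.0, 0.0))
            world_angles.append(math.radians(joint.angle_deg))
        else:
            px, py = positions[joint.parent]
            angle = world_angles[joint.parent] + math.radians(joint.angle_deg)
            positions.append((px + joint.length * math.cos(angle),
                              py + joint.length * math.sin(angle)))
            world_angles.append(angle)
    return positions


# Hypothetical arm: upper arm pointing up, forearm bent back to horizontal.
arm = [
    Joint("shoulder", None, 0.0, 0.0),
    Joint("elbow", 0, 1.0, 90.0),
    Joint("wrist", 1, 1.0, -90.0),
]
for joint, (x, y) in zip(arm, forward_kinematics(arm)):
    print(f"{joint.name}: ({x:.2f}, {y:.2f})")
```

In this picture, a motion capture take supplies a new set of joint angles for every frame, and skinning then moves the mesh vertices along with the bones.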

After rendering, it looks to the viewer as if the virtual character is performing the movements of the original actor. This technique is increasingly used in computer-generated films such as Final Fantasy: The Spirits Within or Shrek. It is also used in many newer PC and video games (e.g. Tony Hawk's Project 8 or the FIFA series from EA Sports). In the Rockstar Games title L.A. Noire, both production and marketing focused on the performance capture technology used in the game: all characters were given faces that were as realistic as possible, with matching facial expressions recorded by performance capture, in order to heighten the atmosphere.

The "quality of movement" of the computer models depends on several factors:

  • Number of similar movements to be evaluated:
    The actor performs the same movement several times; the movement patterns are compared and averaged.
  • "Joints of the computer model":
    A computer model is made up of various joints and bones (similar to a human body). If the model has fewer joints in the arm and hand than the actor, not all data from the real movement can be used, and angular, jerky movements result.
  • Attention to detail:
    With skeleton animation (in contrast to performance capture, see below), subtle movements such as muscle deformation or the movement of skin folds are not recorded by the MoCap systems. The (invisible) movement of the skeleton conveys only part of the overall impression of a movement, which therefore depends not only on the quality of the raw data but also on the skill of the animation artist in constructing the figure.
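The averaging of repeated takes mentioned in the first point can be sketched as follows. This is a simplified illustration that assumes the takes have already been time-aligned and resampled to the same number of frames, which real recordings generally need as a separate step. The function name `average_takes` and the sample angles are hypothetical.

```python
from statistics import fmean
from typing import List, Sequence


def average_takes(takes: Sequence[Sequence[float]]) -> List[float]:
    """Average one joint angle per frame across several takes of the same movement.

    Assumes all takes are time-aligned and contain the same number of frames.
    """
    if not takes:
        raise ValueError("at least one take is required")
    frame_count = len(takes[0])
    if any(len(take) != frame_count for take in takes):
        raise ValueError("all takes must have the same number of frames")
    # zip(*takes) groups the samples of all takes frame by frame.
    return [fmean(samples) for samples in zip(*takes)]


# Hypothetical knee angles (degrees) from three takes of the same step.
takes = [
    [0.0, 10.0, 20.0],
    [2.0, 12.0, 22.0],
    [4.0, 14.0, 24.0],
]
print(average_takes(takes))
```

Averaging suppresses random differences between takes, but it can also smooth away intentional detail, so it is a trade-off rather than a guaranteed improvement.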

Animating cartoon characters, which are brought to life by clearly exaggerating natural movement sequences, has proven easier in practice than the lifelike reproduction that motion capture techniques would theoretically allow. Studies have shown that animations are perceived as stranger the more they strive for realism and avoid deliberate exaggeration. This dip in the perception curve is known as the uncanny valley. Despite the use of motion capture setups, the uncanny valley is not yet considered to have been fully overcome.


With motion capture, complex motion sequences (e.g. running, dancing) can be produced with relatively little effort and time, whereas other animation methods (e.g. keyframe animation) would make them difficult to achieve or possible only at great expense of time.


Movements produced by motion capture can appear artificial even though the underlying data is of natural origin. The reason is that retargeting (the transfer of animation data to a virtual character of possibly different stature) is a complex process. If the human actor and the computer figure differ in their proportions, the data must be converted in some way. If the movements of an average human actor are used, for example, to move a dwarf with very short legs, a decision must be made: either the style of movement is preserved (the dwarf then walks naturally but takes very small steps) or the absolute distance covered is preserved (in which case the dwarf takes giant strides, which may look unnatural or impossible). Since this fundamental problem arises every time motion capture data is adapted to an artificial figure, retargeting is in fact the most laborious part of the whole process once usable motion capture data is available. In larger productions, the movement data is therefore often post-processed by experienced animators or used only as a template for manual work (as in Titanic; this is known as roto-animation).
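The two retargeting choices described above can be made concrete with a small sketch. It is deliberately reduced to one dimension (the forward position of the character at each footfall) and assumes leg length is the only difference between actor and character; real retargeting operates on full skeletons. The function names and the `preserve` parameter are hypothetical.

```python
from typing import List, Sequence


def retarget_path(actor_positions: Sequence[float],
                  actor_leg: float,
                  character_leg: float,
                  preserve: str = "gait") -> List[float]:
    """Retarget forward positions at each footfall to a character with other leg length.

    preserve="gait":     scale distances by the leg-length ratio (natural but small steps).
    preserve="distance": keep absolute distances (possibly implausible strides).
    """
    if preserve == "gait":
        scale = character_leg / actor_leg
    elif preserve == "distance":
        scale = 1.0
    else:
        raise ValueError("preserve must be 'gait' or 'distance'")
    return [p * scale for p in actor_positions]


def stride_to_leg_ratio(positions: Sequence[float], leg: float) -> float:
    """Length of the first stride relative to leg length, as a plausibility measure."""
    return (positions[1] - positions[0]) / leg


# Hypothetical numbers: actor leg 0.9 m, dwarf leg 0.3 m, strides of 0.75 m.
actor_path = [0.0, 0.75, 1.5]
for mode in ("gait", "distance"):
    path = retarget_path(actor_path, 0.9, 0.3, preserve=mode)
    print(mode, path, round(stride_to_leg_ratio(path, 0.3), 2))
```

In "gait" mode the stride-to-leg ratio stays the same as the actor's, while in "distance" mode it triples for this example, which corresponds to the giant strides described in the text.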

Performance capture

Optical motion capture markers on a face for performance capture

Performance capture is a further development of motion capture technology in which not only the body movements but also the facial expressions of the actors are recorded. Another criterion often used to distinguish motion capture from performance capture is how much the recorded actor differs from the animated character: if the difference is very small, the movement can be used with little touch-up; if the differences are large, complex conversion processes are needed to transfer the movement to the new figure.

Performance capture has been used to animate fully computer-generated characters in the following films:

Integrated performance capture

Films that move between people recorded in the real world and computer-generated figures without a visible boundary between the two.

Films with individual performance capture characters:

The film Avatar (2009, director: James Cameron) occupies a special position here, since a large proportion of its characters were animated with performance capture. In addition to the motion capture data of the actors, data from the camera was also recorded so that it could be transferred to the virtual environment.

References

  1. Mori, Masahiro: The uncanny valley. Translated from the Japanese by MacDorman, Karl F.; Schwind, Valentin. In: Haensch, Konstantin Daniel; Nelke, Lara; Planitzer, Matthias (eds.): Uncanny Interfaces. Textem Verlag, Hamburg 2019, pp. 212–219. ISBN 978-3-86485-217-6 [republication]
  2. MacDorman, Karl F.: Masahiro Mori and the uncanny valley: A Retrospective. In: Haensch, Konstantin Daniel; Nelke, Lara; Planitzer, Matthias (eds.): Uncanny Interfaces. Textem Verlag, Hamburg 2019, pp. 220–234. ISBN 978-3-86485-217-6
