The Neuropsychological Basis of
Perception of Biological Motion
A Dissertation
Submitted for
the Degree of “Philosophiae doctoris” (PhD) in Neuroscience
at the
International Graduate School of Neuroscience (IGSN)
of the
RUHR-UNIVERSITY BOCHUM
by
Daniel Jokisch
Supervised by
Prof. Dr. Irene Daum
and
Prof. Dr. Nikolaus F. Troje
October 2004
Printed with permission of the International Graduate School of Neuroscience of the
RUHR-UNIVERSITY BOCHUM
First Referee: Prof. Dr. Irene Daum
Second Referee: Prof. Dr. Nikolaus F. Troje
Third Referee: Prof. Emily D. Grossman, PhD
Date of the oral examination: November 30th, 2004
Table of Contents
I General Introduction
II Study 1: Biological Motion as Cue for the Perception of Size
III Study 2: Structural Encoding and Recognition of Biological Motion: Evidence from Event-related Potentials and Source Analysis
IV Study 3: Self Recognition versus Recognition of Others by Biological Motion: Viewpoint-dependent Effects
V Study 4: Differential Involvement of the Cerebellum in Biological and Coherent Motion Detection
VI General Discussion
VII References
List of Partial Publications
Declaration
Acknowledgments
Curriculum Vitae
Chapter I General Introduction
I General Introduction
Movement patterns from fellow human beings contain a wide variety of information
providing important cues for successful social interaction. Therefore, the ability to
efficiently read this information is of particular relevance for each individual in
everyday life. Such motion patterns characteristic of living beings are termed
biological motion (BM). The importance of this information is clearly not restricted to
human social interaction. In the animal kingdom, movements of
conspecifics, predators and prey provide a pivotal source of information. Accordingly,
its correct interpretation plays a major adaptive role.
For any animal, motion is an essential part of the visual environment. The ability to
detect animate motion and to adequately react to it is a basic requirement with respect
to an animal’s survival and successful reproduction. On the one hand, accurate and fast
movement recognition of a prey or predator animal and anticipation of its future
movements increases an animal’s fitness. Therefore, its chance of survival increases
since it can adjust a fight or flight reaction optimally. On the other hand, within a given
species, successful social interaction between possible partners or communication with
rivals requires the decoding of a variety of complex social signals mediated by
biological motion. Such successful social interaction is a prerequisite for successful
reproductive behavior. The possible partner needs to be classified in terms of sex, age,
social status and other attributes of biological, social and psychological relevance.
Motion patterns are an important source of information in that respect providing
information not only about the actions of a conspecific but also about its current
constitution in terms of physical fitness and emotional state.
The human species is characterized by a highly developed social structure and the most
complex communicative behavior among all animate beings. Communicative behavior
is based on speech as well as on non-verbal cues like gestures and facial expressions associated
with biological motion. Another non-verbal communication channel is the style
in which a person moves, which provides a first impression of a person’s emotions and
personality traits. This information may influence whether an individual finds someone
else likeable or even sexually attractive.
To efficiently use the large information content associated with BM and produce
appropriate behavior, the mammalian brain must be capable of perceiving this visual
information, decoding its meaning and deriving correct conclusions from it. The
extraordinary ability of BM perception is very old in terms of evolution and is assumed
to provide the basis for a number of higher cognitive functions. Its special relevance
results in the motivation to further elucidate its neuropsychological basis and to
investigate the interplay between perception and action in terms of meaningful
movements.
The following paragraphs describe the state of the art concerning research on the
perception of biological motion. First, the work of Gunnar Johansson, who
initiated research on this perceptual phenomenon with his innovative studies, is described. Next, the
large body of psychophysical literature investigating the information
content in the kinematics of movement patterns is summarized. In the following
paragraphs, the specificity of BM is stressed by the distinction of the categories of BM
and non-BM. The next large section of this chapter links the perceptual phenomenon of
BM perception to its underlying neural machinery. This section starts with a brief
overview about the foundations of the human visual system and its division into the
dorsal and ventral visual streams performing different aspects of visual analysis.
Within this framework, the results from electrophysiological studies, imaging studies
and computational studies are discussed in detail in order to give an extensive view
of recent findings and current developments in research on the neuronal basis of
BM perception. The next section emphasizes the functional significance of BM
perception and its associated neural machinery in higher cognitive functions. Among
these higher cognitive functions are social perception, action understanding, speech
perception and theory of mind. This section shows that BM perception is not an
isolated high-level visual phenomenon but rather an important perceptual ability
relevant for many high-level cognitive functions. Finally, in the last section of this
chapter the objectives of the current work are specified and the relationship to previous
studies is depicted.
1.1 Theoretical background
In this section the theoretical background and the empirical evidence for the perception
of BM are described.
1.1.1 The concept of biological motion and psychophysical findings
About 30 years ago the Swedish psychologist Gunnar Johansson introduced the
concept of BM to experimental psychology (Johansson, 1973). He defined BM as
“motion patterns characteristic of living organisms in locomotion”. In everyday
perception, the visual information from BM is coupled with other sources of
information such as form or shape of animate objects. In order to isolate the
significance of animate motion under laboratory conditions, information from BM must
be separated from other sources of information. Johansson designed a new visual
stimulus display, the point-light display technique, which fulfilled these requirements.
By visualizing the position of the main joints of a walking person as bright dots against
a dark background he generated the vivid impression of a human figure in motion.
Using these displays, the compelling power of perceptual organization from BM from
only a few points was demonstrated. Observers need only 100-200 ms to organize such
displays into a coherent percept (Johansson, 1976).
Since this time, a large number of studies have used Johansson’s point-light displays as
stimulus material. It has been demonstrated that BM perception goes far beyond the
ability to recognize a set of moving dots as a human walker. The rudimentary
information contained in point-light displays of BM is sufficient even to solve
sophisticated recognition tasks. Observers are able to recognize the gender of a walking
person (Barclay, Cutting, & Kozlowski, 1978; Cutting, 1978; Kozlowski & Cutting,
1977; Mather & Murdoch, 1994; Troje, 2002a), can identify friends by their gaits
(Cutting & Kozlowski, 1977; Troje, Westhoff, & Lavrov, in press), and can recognize
themselves from a point-light display of their own movements (Beardsworth &
Buckner, 1981). Based on BM it is even possible to derive information about the
emotion of an actor (Dittrich, Troscianko, Lea, & Morgan, 1996; Pollick, Paterson,
Bruderlin, & Sanford, 2001). The ability to perceive BM is not restricted to human
movements. Mather and West (1993) extended the point-light display paradigm to
animations of four-legged animals and showed that human observers can identify
different animals.
BM perception is strongly orientation dependent. Recognition performance decreases if
the stimuli are rotated with respect to their normal upright orientation. A number of
studies have shown that inversion of point-light displays impairs both the detection of an
actor and the recognition of actions, emotion, identity and several other attributes
(Bertenthal, Proffitt, & Kramer, 1987; Dittrich, 1993; Pavlova & Sokolov, 2000;
Shipley, 2003; Sumi, 1984). These findings resemble findings from face perception
(Thompson, 1980; Valentine, 1988). Both classes of visual stimuli have in common
that they contain sophisticated information for social recognition and communication.
Such orientation effects of faces and BM stimuli depend on the stimulus orientation
relative to the observer (Troje, 2003). Standard BM stimuli as used in the experiments
described above consist of an animation of the motion of dots attached to the joints of a
moving figure. As a result, the animation contains information about the position of the
joints and about the motion of these points over time. Beintema and Lappe (2002)
developed a BM display in which the dots were not located on the joints, but rather on
a random position between the joints. Using a limited lifetime technique, each point
was reallocated to another position on the limbs in order to strongly degrade the local
motion. Observers were still able to recognize a human figure in these displays
indicating that in addition to local motion, dynamic form information about body
posture also contributes to BM perception.
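The two display manipulations discussed above, inversion and dots placed at random positions along the limbs with a limited lifetime, can be sketched in a few lines of Python. This is an illustrative sketch only: the three-joint skeleton, its coordinates and the function names are hypothetical, and the code does not reproduce the exact procedure of Beintema and Lappe (2002).

```python
import random

# Hypothetical toy skeleton: joint name -> (x, y) position in one frame.
# The coordinates are made up for illustration, not motion-capture data.
FRAME = {"hip": (0.0, 1.0), "knee": (0.1, 0.5), "ankle": (0.0, 0.0)}
LIMBS = [("hip", "knee"), ("knee", "ankle")]


def invert(frame):
    """Turn a frame upside down by mirroring it about the horizontal axis."""
    return {name: (x, -y) for name, (x, y) in frame.items()}


def dots_on_limbs(frame, limbs, n_dots, rng):
    """Place each dot at a random position on a randomly chosen limb segment.

    Calling this anew for every frame gives each dot a lifetime of one frame,
    so a dot never follows a joint and carries little local motion signal.
    """
    dots = []
    for _ in range(n_dots):
        a, b = rng.choice(limbs)
        t = rng.random()  # relative position along the segment, 0..1
        (xa, ya), (xb, yb) = frame[a], frame[b]
        dots.append((xa + t * (xb - xa), ya + t * (yb - ya)))
    return dots


rng = random.Random(0)
inverted = invert(FRAME)
dots = dots_on_limbs(FRAME, LIMBS, n_dots=4, rng=rng)
```

In the second function the dots still lie on the body at every instant, which is why such a display preserves form information while degrading local motion.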
Both top-down and bottom-up mechanisms contribute to the perceptual analysis of
BM. At the beginning of research on BM, a bottom-up or low-level processing
explanation was favored. This view was put forward by Johansson’s original approach
in which he considered BM processing from the perspective of visual vector analysis
(Johansson, 1973, 1976). He proposed an automatic extraction of a mathematically
lawful spatio-temporal relation in early visual patterns. This perspective was supported
by early bottom-up computational models (Hoffman & Flinchbaugh, 1982; Webb &
Aggarwal, 1982) and experimental findings (Mather, Radford, & West, 1992). Further
support for the contribution of bottom-up processing of BM was given by Thornton
and Vuong (2004). Their results provide direct evidence that complex dynamic patterns
can be processed incidentally by showing that task irrelevant point-light displays of
BM cannot be ignored and are processed to a level where they influence behavior.
Later on, several psychophysical studies indicated that top-down processes also play an
important role in BM perception. Bertenthal and Pinto (1994) suggested that the
perception of a global form specified by BM precedes the perception of the individual
elements or the local relation. This conclusion was drawn from an experimental
paradigm in which complex masking elements were used to render low-level
constraints uninformative. Additional support for the contribution of top-down
mechanisms was derived from findings that stereoscopic depth cues in conflict with
depictions of point-light walkers do not affect the perception of these walkers
(Bülthoff, Bülthoff, & Sinha, 1998). This result was attributed to a top-down
recognition-based influence.
Support for the notion of top-down influences in the perception of BM also stems from
experiments which varied the temporal characteristics of BM stimuli. Perception of
apparent motion results from the sequential presentation of static objects in different
spatial locations. When presented with sequential static images of an inanimate object
in different positions, the object is perceived as moving along the shortest or most
direct path, regardless of whether this path is physically possible or not. The perception of
apparent motion is different when the object to be presented is a human figure.
When showing images of apparent motion of humans, observers perceived either a
direct path which was biomechanically impossible or an indirect path which was
biomechanically possible depending on the time interval between the stimuli (Shiffrar
& Freyd, 1990; Shiffrar & Freyd, 1993). If the time interval between images matched
the time required to perform the movement along the biomechanically possible path, a
realistic path was perceived. Based on this result it was concluded that perception of
human movement is constrained by an observer’s knowledge or experience with the
biomechanical properties of his own body.
Given the contribution of top-down and bottom-up processing to the perception of BM,
the question of the role of visual attention arises. Attentional effects in processing of
BM were explored by Cavanagh, Labianca and Thornton (2001) and Thornton,
Rensink and Shiffrar (2002). Cavanagh and colleagues (2001) showed that
discrimination of specific features of point-light displays of BM seems to be a serial
process, since reaction times increased with the number of items. The reaction time
increase was attributed to increasing attentional demands of the task. Results in a dual
task paradigm to explore the role of attention in the processing of BM (Thornton et al.,
2002) suggested that, in some cases, perception of BM can be automatic. But if
strategies operating in a global, top-down fashion are required, attentional demands
play a vital role.
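The inference from set-size effects to serial processing rests on the slope of reaction time against the number of displayed items. A minimal sketch of that computation follows; the function name and the numerical values are hypothetical and chosen only for illustration, and they are not data from Cavanagh et al. (2001).

```python
def search_slope(set_sizes, mean_rts_ms):
    """Least-squares slope and intercept of reaction time against set size."""
    n = len(set_sizes)
    mean_x = sum(set_sizes) / n
    mean_y = sum(mean_rts_ms) / n
    sxx = sum((x - mean_x) ** 2 for x in set_sizes)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(set_sizes, mean_rts_ms))
    slope = sxy / sxx                      # ms per additional item
    intercept = mean_y - slope * mean_x    # ms at a notional set size of 0
    return slope, intercept


# Hypothetical values chosen only to illustrate the computation.
set_sizes = [1, 2, 4]
mean_rts_ms = [600.0, 700.0, 900.0]
slope, intercept = search_slope(set_sizes, mean_rts_ms)
```

A slope close to zero would suggest that items are processed in parallel, whereas a clearly positive slope, as in these made-up values, is the pattern usually interpreted as serial, attention-demanding processing.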
1.1.2 Specificity of biological motion
Taken together, the behavioral findings indicate that the motion of living beings has
special features. In the absence of any other cue, motion can convey detailed and
specific information about what other organisms are doing. The content of information
in BM can be organized in three hierarchical levels. At the lowest level, humans can
detect whether an object is animate. The movements of inanimate objects are driven by
external forces whereas animate objects are usually self-propelled. At the next level
humans can detect agency since movements of agents are defined by their goals. At the
highest level we can detect intentionality. The movements of intentional agents are
determined by their beliefs and desires.
Looking closer at the physical properties of BM, the fact that BM is self-propelled
does not provide a sufficient explanation for the specificity of BM. For instance,
cars, aircraft, trains etc. share this property as well but clearly do not belong to the
category of animate objects. Moreover, it has been shown that the brain processes
categories of self-propelled, man-made objects differently compared to animate objects
(Caramazza & Shelton, 1998). An interesting approach stresses the relevance of the
direct influence of the constant force of gravity on BM (Alexander, 1989). Gravity
determines one special feature of BM: the periodic dissipation of energy against
gravity. It might be possible that this typical pattern of motion created by the influence
of gravity is the crucial feature which is used by the mammalian visual system to detect
BM and process it as a special category.
1.1.3 Development of a neural machinery for the perception of biological
motion
The information channel for the analysis of BM is assumed to be very old in terms of
evolution. This assumption is based on the particular relevance of this ability in the
animal kingdom and is further supported by experimental findings of the ability to
perceive BM in a number of different species (cats, pigeons, macaques, chicks and
quails), using Johansson-like displays as stimuli (Blake, 1993; Dittrich, Lea, Barrett, &
Gurr, 1998; Oram & Perrett, 1994; Regolin, Tommasi, & Vallortigara, 1999;
Yamaguchi & Fujita, 1999). With respect to its evolutionary importance, it is an
interesting question whether the ability to perceive BM and the underlying neuronal
connections are inborn or acquired during learning. Strong support for the notion that
this ability is innate comes from findings that newly hatched chicks prefer point-light
displays of chicks compared to other visual stimuli (Regolin et al., 1999; Yamaguchi &
Fujita, 1999). With respect to humans, the ability to process point-light displays of BM
has already developed in the first few months of life (Bertenthal, Proffitt, Spetner, &
Thomas, 1985; Fox & McDaniel, 1982). This finding does not, however, rule out that
the ability to perceive BM is innate.
1.1.4 Pathways of the human visual system
Before discussing the evidence for the neural substrates of BM perception on the basis
of neurophysiological and imaging studies, an overview of the brain mechanisms and
neural structures underlying human motion and form perception is given. This section
cannot provide a detailed description of all issues concerning the visual system because
of space limitations. Nevertheless, it provides the general framework in which the
experimental findings reported in the following sections can be integrated. The main
focus will be on the framework of the two extrastriate visual processing streams
(Ungerleider & Mishkin, 1982), the dorsal pathway and the ventral pathway, and their
significance for the perceptual mechanisms contributing to BM processing. This brief
overview is based on the description in Kandel, Schwartz and Jessell (2000).
The primary visual cortex (area V1), the first neocortical structure analyzing visual
information, receives input from the magnocellular (M) and the parvocellular (P)
pathway from the retina. The M pathway is more sensitive to stimuli with lower spatial
and higher temporal frequencies. In contrast, the P pathway is essential for color vision
and is particularly sensitive to stimuli with higher spatial and lower temporal
frequencies. The M and P pathways pass from the retina through the magnocellular and
parvocellular layers of the lateral geniculate nucleus, respectively, to the input layer of the primary
visual cortex. At this stage they feed into the two parallel extrastriate pathways
extending through the cerebral cortex. The dorsal pathway extends from V1, through
the middle temporal area (MT) and medial superior temporal area (MST), to the posterior
parietal cortex. The ventral pathway extends from V1, through V4, to the inferior
temporal cortex. Whereas the parietal pathway appears to primarily receive
magnocellular input, the inferior temporal pathway depends on both the P and the M
input.
The idea of two separate processing streams was put forward by Ungerleider and
Mishkin (1982). They found specific visual deficits following selective lesions in the
temporal and parietal cortex and suggested that the dorsal processing stream subserves
spatial perception and object localization (“where system”), whereas the ventral
processing stream underlies object perception and recognition (“what system”). This
view was modified by Goodale and Milner (1992), who confirmed the distinction
between the dorsal and the ventral pathway in human and nonhuman primates but
suggested a new interpretation with respect to the functional role of both pathways.
They proposed that both pathways use the same information in different ways.
According to their view, the dorsal pathway is assumed to use the visual information to
guide action (“how” system) and the ventral system is assumed to use the information
for conscious perception and object recognition. There is strong evidence that
processing in these two cortical pathways is hierarchical. Each level has strong
projections to the next level and back projections from higher to lower levels. The type
of processing changes systematically from one level to the next with an increasing
degree of complexity.
1.1.4.1 Dorsal path
The dorsal pathway is critically involved in the perception of location and movement.
Moreover, it plays an important role in the control of eye and hand movements. The
first processing step of the dorsal pathway is performed in area V1. Cells in this area
respond to motion in one direction, while motion in the opposite direction does not
elicit a response. In monkeys, area MT is devoted to motion processing since almost all
of its cells are direction selective. Cells in area
MT have receptive fields that are ten times wider than those of the cells in V1 projecting to
MT. Neurons in area MT respond to motion of bars of light by detecting contrasts in
luminance or by differences in texture or color. A cortical area adjacent to area MT,
area MST, also contains neurons that show responses to visual motion. These neurons
are assumed to process a specific type of global motion in the visual field called optic
flow. Information from optic flow plays a major role for a person’s own movement
through the environment since it concerns the perceived motion of the visual field
resulting from an individual’s own movement. Neurons in area MST have receptive
fields that cover a large part of the visual field and respond preferentially to large-field
motion. Additionally these neurons are sensitive to shifts in the origin of full-field
motion and to differences in speed between the center and periphery of the field.
Another area involved in perception of optic flow is the superior temporal
polysensory area (STP).
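The properties of optic flow mentioned above, an origin of the flow field that shifts with the direction of self-motion and speeds that grow from center to periphery, follow from the geometry of a translating observer. The sketch below uses the standard pinhole-camera model with focal length 1; this model and the function names are assumptions introduced for illustration and are not taken from the studies cited here.

```python
def translational_flow(x, y, depth, t_x, t_y, t_z):
    """Image velocity (u, v) of a static point for a translating observer.

    (x, y) are image coordinates for a pinhole camera with focal length 1,
    depth is the point's distance along the line of sight, and
    (t_x, t_y, t_z) is the observer's translation velocity.
    """
    u = (-t_x + x * t_z) / depth
    v = (-t_y + y * t_z) / depth
    return u, v


def focus_of_expansion(t_x, t_y, t_z):
    """Image location where translational flow is zero (requires t_z != 0)."""
    return t_x / t_z, t_y / t_z
```

For pure forward motion the focus of expansion lies at the image center and flow speed increases with eccentricity; adding a sideways component to the translation shifts the focus of expansion, which corresponds to the shifts in the origin of full-field motion described above.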
1.1.4.2 Ventral path
The ventral processing stream extends from V1 through V2 to V4 and then to the
inferior temporal cortex. As in area V1, neurons in area V2 are sensitive to the
orientation of stimuli, to color and to their disparity. These cells extend the analysis of
contours initiated in area V1. V2 neurons perform an analysis at a more abstract level
compared to V1 neurons since they respond not only to real contours but also to
illusory contours. Neurons in area V4 respond to color and to more complex forms
compared to neurons in previous layers. Recognition of complex forms is related to
processes occurring in the inferior temporal cortex. The most important visual input to
the inferior temporal cortex comes from area V4. The inferior temporal cortex consists
of two major regions, areas TEO and TE. Receptive fields of neurons in area TEO are
generally larger than those of V4 neurons but smaller than those of neurons in area TE.
The primary inputs of TEO neurons come from area V4, the primary outputs go to area
TE. Therefore, the neural coding of visual object features in area TEO is more global
than in area V4 but not as global as in area TE.
1.1.5 Neural machinery contributing to the perception of biological
motion
The behavioral findings show the high degree of specificity of BM processing. They
imply a highly developed neural machinery dedicated to the perceptual analysis of BM
information reflecting the relevance of information from BM. This assumption has
stimulated much neurophysiological, imaging and neuropsychological research on the
neural basis of the perception of BM in recent years.
1.1.5.1 Electrophysiological studies of biological motion perception in non-humans
A few electrophysiological studies have investigated the recognition of BM in macaque
monkeys (Jellema & Perrett, 2003a, 2003b; Oram & Perrett, 1994, 1996). Some
neurons in the superior temporal polysensory area (STP) responded selectively to full-
body or hand movements. Single cell recordings in area STP yielded neurons which
responded selectively to the sight of whole body movements as well as to point-light
displays of BM (Oram & Perrett, 1994). Some of these neurons are view-dependent
since their response decreases substantially when the stimulus is presented from a
different viewing angle than the neuron’s preferred view. Many cells in the area around
STP have multimodal properties as indicated by the integration of information about
the form and motion of animate objects (Oram & Perrett, 1996). Area STP is located in
the vicinity of the superior temporal sulcus. In the following, the term STS-complex
refers to the area around the superior temporal sulcus.
Recently it has been suggested that the neural representation for actual BM in the STS-
complex may also extend to BM implied from articulated static postures which form
the end point of an action (Jellema & Perrett, 2003a). Neural activity in response to
face and body postures in the STS-complex can be influenced by the perceptual history
in terms of immediately preceding actions (Jellema & Perrett, 2003b). Such a
mechanism could support the formation of expectations about the impending behavior
of others.
Another set of action selective neurons has been found in the monkey premotor cortex
(Gallese, Fadiga, Fogassi, & Rizzolatti, 1996; Rizzolatti, Fogassi, & Gallese, 2001).
These neurons were termed “mirror neurons” since they respond both when the
monkey performs an action and when it observes the same action. Similar to neurons
in the STS-complex, mirror neurons often have multimodal properties (Kohler et al.,
2002). The functional role of this mirror neuron system will be described in a separate
paragraph.
1.1.5.2 Findings from electrophysiological studies in humans
Electrophysiological techniques such as electroencephalography (EEG) and event-related
potentials (ERP) provide the opportunity to measure brain activity with a very high
temporal resolution, but have the disadvantage of a low spatial resolution in
comparison to functional imaging techniques. Two different approaches are currently
applied in the literature on electrophysiological studies of BM and action perception in
humans. One approach measures electrical brain activity when presenting stimuli
consisting of body movements in full view in which the display initially stands still and
movement onset occurs with a delay. The other approach uses point-light displays of
BM as stimulus material.
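Both approaches rely on the same basic computation: single-trial EEG epochs, time-locked to stimulus or movement onset, are averaged to obtain the ERP, and component latencies are then read off within a time window. The following is a minimal sketch of that idea, not the analysis pipeline of any study cited here; the function names, the assumption that sample 0 coincides with onset, and the omission of filtering, baseline correction and artifact rejection are simplifications made for illustration.

```python
def average_epochs(epochs):
    """Average single-trial epochs sample by sample to obtain an ERP."""
    n_trials = len(epochs)
    return [sum(samples) / n_trials for samples in zip(*epochs)]


def peak_latency_ms(erp, sampling_rate_hz, window_ms, polarity=-1):
    """Latency of the largest deflection of the given polarity in a window.

    Sample 0 is assumed to coincide with stimulus (or motion) onset.
    polarity=-1 looks for a negative peak, polarity=+1 for a positive peak.
    """
    start = int(window_ms[0] * sampling_rate_hz / 1000)
    stop = int(window_ms[1] * sampling_rate_hz / 1000)
    best = max(range(start, stop + 1), key=lambda i: polarity * erp[i])
    return best * 1000 / sampling_rate_hz
```

Because averaging preserves the millisecond-scale timing of the signal, this procedure illustrates why ERPs offer high temporal resolution even though the scalp recording says little about where in the brain the activity originates.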
By applying the first approach, neural responses to onset of movements of the mouth
and the eyes (Puce, Smith, & Allison, 2000) were observed within 200 ms after motion
onset as measured by ERPs. Facial movements occurring on a continuously present
face elicited different N170 amplitudes for mouth opening versus closing and for eye
aversion versus eyes gazing at the observer. Similar results were found for the
observation of whole body actions of others (Wheaton, Pipingas, Silberstein, & Puce,
2001). ERPs elicited in response to movement onset in movie sequences of body
stepping, hand closing and opening, and mouth opening and closing were selective for
specific hand and body motions.
An fMRI and ERP study of visual processing of natural and line-drawing displays of
moving faces (Puce et al., 2003) supports the notion that the temporal lobe integrates
facial form and motion in humans. The STS and the fusiform gyrus responded
selectively to both types of face stimuli, and they evoked larger ERPs compared to
control stimuli at around 200 ms after motion onset. Puce and Perrett (2003) recently
concluded that a specialized visual mechanism exists in the STS complex of both
humans and non-human primates which produces selective neural responses to moving
natural images of faces and bodies. By using the first approach, i.e. presenting displays
with delayed motion onset, it is possible to separate stimulus onset and movement
onset. But such displays lack typical features of BM, which make up its specificity. For
instance, the specific style of a movement is not considered. This source of information
is transported purely by the dynamics of the movement and cannot be conveyed by
displays of apparent motion. Therefore, information about the smoothness or the
intensity of a movement remains underestimated in this paradigm.
Hirai, Fukushima and Hiraki (2003) used the second approach and tried to clarify the
neural dynamics in BM perception by comparing ERPs elicited by point-light displays
of BM and scrambled motion. They report that both types of stimuli elicited peaks at
around 200 and 240 ms, which were larger in the BM condition than in the scrambled
motion condition. A recently published paper (Pavlova, Lutzenberger, Sokolov, &
Birbaumer, 2004) analyzed gamma MEG activity in response to BM generated from a
computer algorithm. Recognizable upright and non-recognizable inverted walkers
evoked enhancements in oscillatory gamma brain activity (25-30 Hz) over the left
occipital cortices as early as 100 ms from stimulus onset. Upright BM elicited a further
gamma response over the parietal (130 ms) and right temporal (170 ms) lobes.
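Scrambled motion, the control condition mentioned above, is commonly built by keeping each dot's own trajectory but relocating it in the display. The sketch below shows one simple way to do this; the function name and the uniform random offsets are assumptions made for illustration, and the cited studies may have constructed their scrambled stimuli differently.

```python
import random


def scramble(trajectories, max_offset, rng):
    """Shift every dot's whole trajectory by its own random offset.

    trajectories: list of dots, each a list of (x, y) positions over frames.
    Each dot keeps its frame-to-frame displacements (local motion), but the
    spatial relations between dots (the body configuration) are destroyed.
    """
    scrambled = []
    for path in trajectories:
        dx = rng.uniform(-max_offset, max_offset)
        dy = rng.uniform(-max_offset, max_offset)
        scrambled.append([(x + dx, y + dy) for x, y in path])
    return scrambled
```

Because the local motion of every dot is unchanged, a difference in the response to intact and scrambled displays can be attributed to the global configuration of the figure and not to low-level motion energy.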
1.1.5.3 Findings from functional neuroimaging studies in humans
Neuroimaging research has demonstrated that viewing BM engages a specific structure
located in the area around the superior temporal sulcus (STS) that is often termed STS-
complex. The first suggestion for the existence of a specialized mechanism dedicated
to the perception of BM came from an fMRI study aiming to examine the properties of
the V5-complex (Howard et al., 1996). This area is specialized for visual motion
perception. One of the stimulus categories used in this experiment was point-light
displays of BM. Activity in response to these stimuli was found in the V5-complex as
well as in areas of the superior temporal cortex. The latter finding was unexpected,
since this part of the superior temporal cortex had been assumed to belong to the
auditory cortex and was normally activated by the perception of speech. Several
functional imaging studies using fMRI or PET were carried out in the following years
which presented point-light displays of BM as stimulus material and used different
types of stimuli as control condition, such as scrambled motion, coherent motion,
rotating objects or static objects (Bonda, Petrides, Ostry, & Evans, 1996; Grezes et al.,
2001; Grossman & Blake, 2001, 2002; Grossman et al., 2000; Servos, Osu, Santi, &
Kawato, 2002; Vaina, Solomon, Chowdhury, Sinha, & Belliveau, 2001). These studies
report selective activation of the superior temporal sulcus (STS) to visual stimuli
consisting of BM. In addition to area STS, activation specific to BM has also been
found in the cerebellum (Grossman et al., 2000; Vaina et al., 2001), area VP (Servos et
al., 2002), the amygdala (Bonda et al., 1996), the occipital and fusiform face areas
(Grossman & Blake, 2002) and the premotor cortex (Santi, Servos, Vatikiotis-Bateson,
Kuratate, & Munhall, 2003; Saygin, Wilson, Hagler, Bates, & Sereno, 2004).
There seem to be hemispheric asymmetries associated with the processing of BM,
irrespective of the visual field in which the display was presented, with a pronounced
activity in the right STS-complex (Grezes et al., 2001; Grossman et al., 2000).
Moreover, even inverted displays of BM or imagined BM are sufficient to induce
activity in the STS-complex, but the activity level was lower in these conditions than
during actual viewing of the BM animations (Grossman & Blake, 2001). Recently it
has been shown that point-light displays of BM also induce activation in the premotor
cortex (Saygin et al., 2004). This finding is consistent with the mirror neuron theory
(Rizzolatti, Fadiga, Gallese, & Fogassi, 1996) which is described in detail in a separate
section.
Experimental evidence for the specificity of the STS-region to the processing of BM
was provided by a study comparing BM with meaningful and coordinated non-BM
such as the pendulum movements of a grandfather clock. Activity in the STS-region
was not induced by meaningful and coordinated non-BM (Pelphrey et al., 2003). A
dissociation between visual processing of moving humans and moving manipulable
objects was also supported by the findings of Beauchamp and colleagues (Beauchamp,
Lee, Haxby, & Martin, 2003). They showed STS activity in response to human point-
light and video displays in contrast to activity in the middle temporal gyrus evoked by
tool video and point-light displays.
1.1.5.4 Findings from lesion studies in humans
Results from imaging studies are consistent with neuropsychological findings in
neurological patients suffering from focal brain lesions (Cowey & Vaina, 2000;
McLeod, Dittrich, Driver, Perrett, & Zihl, 1996; Schenk & Zihl, 1997; Vaina, 1994;
Vaina, Lemay, Bienfang, Choi, & Nakayama, 1990). These case studies provide
evidence for a dissociation between mechanisms involved in the perception of BM on
the one hand, and mechanisms involved in inanimate visual motion tasks or static
object recognition tasks on the other hand.
Patients with bilateral lesions involving the posterior visual pathways, such as patients
LM (McLeod et al., 1996) and AF (Vaina et al., 1990),
showed severe deficits in visual motion perception (seeing coherent motion in random
noise, speed discrimination) but could nevertheless recognize human action patterns
presented as point-light displays. Patients with bilateral ventral lesions involving the
posterior temporal lobes, such as patient EW (Vaina, 1994), who suffered from
prosopagnosia and object agnosia, could also identify BM in point-light animations. A
different pattern emerged in patient AL (Cowey & Vaina, 2000) who is hemianopic and
suffers from visual perceptual impairments in her seeing hemifield resulting from an
additional lesion in ventral extrastriate cortex. AL fails to recognize BM displays
despite intact static form perception and motion detection. Based on the investigation of
a sample of 39 patients with acquired brain damage, it was concluded that deficits in
perception of BM are not caused by impairments of basic visual motion or form
perception. They are a consequence of damage to structures involved in the combined
analysis of visual motion and form information (Schenk & Zihl, 1997).
Perception of BM also seems to be affected in patients suffering from lesions in the
parietal cortex (Battelli, Cavanagh, & Thornton, 2003). Patients could easily perform a
classical form-from-motion task but were severely impaired in a visual search task
using BM sequences. The authors hypothesized that attentional deficits impair the
integration process that links the unconnected traces of single dots to generate a
global percept of a human walker.
1.1.6 Computational simulations modeling neural mechanisms of
biological motion perception
According to a neural model based on previously reported findings from
psychophysical, neurophysiological and imaging studies, both the dorsal and ventral
processing streams contribute to the perceptual analysis of BM (Giese & Poggio,
2003). This computational model suggests a learning-based, feedforward mechanism
and provides a neurophysiologically plausible explanation for many of the key
experimental findings described in the previous paragraphs. Therefore, this model is
explained in more detail in the following paragraph. The computational model makes
four assumptions: First, it is divided into two parallel processing streams analogous to
the ventral and dorsal visual stream described previously. Second, both pathways
consist of “neural feature detectors” in a hierarchical order that extract form or optical
flow features. Third, the model assumes that the hierarchy is predominantly
feedforward. Fourth, the representation of BM is based on a set of learned patterns that
are encoded as snapshots of body postures in the form pathway and by sequences of
complex optic flow patterns in the motion pathway.
1.1.6.1 The form pathway in the model
The form pathway recognizes BM by extracting the form information contained in
individual snapshots from sequences of body postures. Consistent with the general
organization of the visual system, the position and scale invariance as well as the sizes
of the receptive fields increase along the hierarchy. The form pathway of the model is
subdivided into four levels. The first level comprises local orientation detectors, which
are tuned to eight preferred orientations at two spatial scales. This stage models simple
cells in early visual cortex corresponding to brain areas V1 and V2. The second
level in this pathway contains position and scale invariant bar detectors. They extract
local orientation information. Within their receptive field, the responses are
independent of the spatial position and the scale of contours. There are complex cells in
area V2 and cells in area V4 with these properties. The next level along this hierarchy
consists of snapshot neurons that are selective for specific body postures. Such
snapshot neurons have large receptive fields and have substantial position and scale
invariance. Neurons with such features might be located in area IT, and in area STS
and area FA. The highest level within this model hierarchy contains motion pattern
neurons. These neurons temporally smooth and summate the activity of the snapshot
neurons of the previous layer that contribute to the encoding of the same movement
pattern. Motion pattern neurons in the form pathway of this model are assumed to be
sequence selective. Therefore, their responses are restricted to a sequence of snapshots
of body postures that occur during natural movements. Neurons with these properties
might be located in area STS, the premotor cortex (area F5) and area FA.
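The summation and temporal smoothing performed by the motion pattern neurons can be illustrated with a minimal Python sketch. It is not the implementation of Giese and Poggio (2003): the Gaussian tuning of the snapshot neurons, the leaky integrator, and the names `snapshot_responses`, `motion_pattern_activity`, `sigma` and `tau` are assumptions chosen for illustration, and sequence selectivity is deliberately left out.

```python
import math


def snapshot_responses(posture, templates, sigma=1.0):
    """Response of each snapshot neuron to one posture.

    Each neuron is assumed to have Gaussian tuning around a stored
    posture template (an illustrative choice, not taken from the model).
    """
    responses = []
    for template in templates:
        dist_sq = sum((p - t) ** 2 for p, t in zip(posture, template))
        responses.append(math.exp(-dist_sq / (2.0 * sigma ** 2)))
    return responses


def motion_pattern_activity(frames, templates, tau=3.0, sigma=1.0):
    """Summate snapshot responses per frame and smooth them over time.

    A leaky integrator with time constant tau (in frames) stands in for
    the temporal smoothing; this is an assumed simplification.
    """
    activity = 0.0
    trace = []
    for posture in frames:
        summed = sum(snapshot_responses(posture, templates, sigma))
        activity += (summed - activity) / tau
        trace.append(activity)
    return trace
```

Fed with a sequence of postures that resemble the stored templates, the activity trace builds up over successive frames; for postures far from all templates it stays close to zero.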
1.1.6.2 The motion pathway in the model
The motion pathway recognizes BM by analyzing complex optic-flow patterns that are
specific to BM. In analogy to the form pathway, the motion pathway consists of a
hierarchy of neural detectors for optic flow features. Along the hierarchy, there are
increases in the receptive field sizes, invariance of the detectors and complexity of the
extracted features. In parallel to the form pathway of the model, the motion pathway is
subdivided into four layers.
The model assumes local motion detectors and component motion-selective neurons on
the first level. Neurons on this level compute a signal that is derived from local optic
flow vectors. Speed and direction-selective neurons in areas V1 and V2 and area MT
have such properties. The model contains detectors for four different motion directions
and two speed classes. The next level consists of neurons that are selective for
opponent motion. Neurons with this property might be located in areas MT, MST and
KO. Such opponent motion detectors are obtained by combining the responses of two
adjacent subfields with selectivity to opposite directions. Activity of each subfield is
obtained by combining the responses of the local motion detectors of the first level
with the same direction preference. The detectors of the third level are selective for
complex optic flow patterns that arise for individual movements of BM patterns. These
neurons are analogous to snapshot neurons in the form pathway and may be located in
area STS and FA. Finally, the motion pattern neurons of the motion pathway in the
fourth layer summate and smooth the output signals of optic flow pattern neurons of
layer three. Moreover, motion pattern neurons are sequence selective. These motion
pattern neurons are assumed to be located in areas STS and F5.
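The construction of the opponent motion detectors on the second level can be sketched in a few lines of Python. The description above only states that responses are "combined"; the summation within a subfield, the minimum across subfields, and all names used here are assumptions made for illustration, not details taken from the model.

```python
def subfield_activity(local_responses, preferred_direction):
    """Pool the local motion detectors of one subfield that share a direction preference.

    local_responses: list of (direction, response) pairs from first-level detectors.
    Summation is an assumed pooling rule.
    """
    return sum(resp for direction, resp in local_responses
               if direction == preferred_direction)


def opponent_motion_response(subfield_a, subfield_b,
                             direction_a="left", direction_b="right"):
    """Respond only if two adjacent subfields signal opposite motion directions.

    Taking the minimum is an assumed combination rule: the detector stays
    silent unless both subfields are active for their preferred direction.
    """
    a = subfield_activity(subfield_a, direction_a)
    b = subfield_activity(subfield_b, direction_b)
    return min(a, b)
```

With this rule, dots moving apart or towards each other in neighboring subfields drive the detector, whereas uniform translation across both subfields does not.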
1.1.6.3 Model features
A realistic neural model of BM perception in humans must fulfill several criteria. On
the one hand, the model must be able to generalize BM patterns across position, scale
and identity of an actor. On the other hand, the model must be selective enough to
recognize subtle details to derive the wide variety of information in BM patterns.
Consistent with psychophysical data (Beardsworth & Buckner, 1981; Cutting &
Kozlowski, 1977), the selectivity of the model is sufficient to identify individuals by
their gait. Concerning generalization, the model is invariant with respect to position
changes, scale changes and changes in the speed of the walker. Another property of
BM perception is view-dependence. When a point-light walker is rotated in the image
plane (Bertenthal et al., 1987; Dittrich, 1993; Pavlova & Sokolov, 2000; Shipley, 2003;
Sumi, 1984), recognition performance drops substantially. Simulations with the model
yielded the same result. Another key feature of BM perception is the robustness of the
phenomenon. Even under poor viewing conditions, such as dim illumination, the neural
architecture of the model recognizes BM.
1.1.6.4 Limitations of the model
Although the model demonstrates that a relatively simple, biologically plausible neural
architecture can account for many properties of BM recognition, it involves several
simplifications. One of these simplifications concerns the role of top-down attentional
effects. There is experimental evidence for such effects on BM recognition (Cavanagh
et al., 2001; Thornton et al., 2002). Moreover, the model does not consider the
complexities of everyday vision such as eye movements and shifts of attention. To
take these into account, the neural architecture would have to include top-down mechanisms
and their substrates, i.e. back projections from higher levels to basic levels of the model.
1.1.7 Functional significance of biological motion perception in higher
cognitive functions
The extraordinary ability of the visual system to derive the compelling percept of a
human person from a few moving bright dots is only one feature of BM perception. It
has been directly related to a number of higher cognitive functions. Research on the
cognitive neuroscience of social perception (Adolphs, 1999, 2001, 2003), action
understanding (Rizzolatti et al., 2001), speech perception (Hauser, Chomsky, & Fitch,
2002) and theory of mind (Blakemore & Decety, 2001; Frith & Frith, 2003; Gallese &
Goldmann, 1998) has received much attention in recent years. BM perception plays an
important role in higher cognitive processes either by providing the interface between
neural systems of perception and action (action understanding and speech perception)
or by performing the initial stages in the processing of socially relevant information in
a larger social cognition network (social perception and theory of mind).
1.1.7.1 Social perception and biological motion
Social perception refers to the initial stages of information processing that culminate in
the accurate analysis of the dispositions and intentions of other individuals (Allison,
Puce, & McCarthy, 2000). As pointed out in one of the previous sections, area STS
plays an important role in the processing and analysis of BM. This area is
not only involved in the analysis of point-light-displays, but also contributes to the
analysis of other visual features containing social information. One of these features is
direction of gaze which is thought to provide information in social situations, express
intimacy and exercise social control (Kleinke, 1986). Direction of gaze is also an
indicator of social attention by guiding the focus of another person’s attention.
Other visual stimuli activating area STS are mouth, hand and body movements. Mouth
movements can be broadly divided into non-speech movements and speech related
movements. Non-speech mouth movements are an important component of facial
gestures. In non-human primates it is assumed that any mouth movement which is
meaningful to another individual will preferentially activate a population of cells in the
STS-complex. One important function of speech related mouth movements is to
improve our comprehension of what is being said. Lip reading activates a region within
the STS-complex bilaterally (Calvert et al., 1997). These regions probably also play a
role in visual-auditory illusions such as the McGurk effect (McGurk & MacDonald,
1976). It has been suggested that lip reading involves regions of the STS-complex
which are distinct from those being involved in non-speech mouth movements.
Observation of hand movements activates parts of the STS-complex. Responsiveness to
hand movements is stronger if movements are goal directed such as grasping an object
(Rizzolatti, Fadiga, Matelli et al., 1996). There seems to be an advantage of the left
STS-complex in visual analysis with respect to meaningful hand movements (Grafton,
Arbib, Fadiga, & Rizzolatti, 1996; Rizzolatti, Fadiga, Matelli et al., 1996), although
one study reported bilateral activation (Grezes, Costes, & Decety, 1998). Activation of
the STS-complex is not restricted to full views of actors performing an action. Point-
light displays of goal directed hand movements also activate the left STS-complex
(Bonda et al., 1996). Observation of American Sign Language (ASL) hand movements
elicits differential activation in the STS-complex in a group of subjects not knowing
ASL and a group of subjects knowing ASL (Neville et al., 1998). STS-activity was
only observed in subjects knowing ASL. Based on this finding it was suggested that the
STS-complex is primarily activated by meaningful or communicative hand gestures.
The STS-complex also plays an important role in implied motion. Implied BM refers to
static images of an animate being performing an action (for instance a soccer player in
the act of kicking a ball). Stronger fMRI activity in the STS-complex has been found
when viewing images containing implied motion compared to images without implied
motion (Kourtzi & Kanwisher, 2000).
Social perception is one part of a larger domain of cognitive functions subserving
social communication (Adolphs, 2001). Apart from social perception, social cognition
and social behavior complement the domain of social communication. Information
from social perception feeds into the domain of social cognition, which in turn guides
automatic and planned social behavior. With respect to the domain of social cognition,
two brain areas play an important role: the amygdala and the orbitofrontal cortex. One
of the principal functions of the amygdala seems to be the attachment of emotional
salience to sensory input (Adolphs, 1999). This can be achieved by feedforward and
feedback projections between the STS-complex and the amygdala. Such a circuit might
lead to an attentional amplification of activity of the STS-complex. Similar
mechanisms may effect interactions between the orbitofrontal cortex, the amygdala and
the STS-complex.
The role of the orbitofrontal cortex in social cognition and regulation of behavior has
been studied since the famous case report of Phineas Gage, whose decision making
abilities were severely impaired after a large orbitofrontal lesion. In spite of normal
intellectual functioning, lesions of the orbitofrontal cortex lead to stereotyped and
inappropriate social behavior and lack of concern for other individuals (Damasio,
Grabowski, Frank, Galaburda, & Damasio, 1994). Two models of orbitofrontal cortex
function in social cognition are currently discussed. One model suggests that the
orbitofrontal cortex serves to control impulsive, aggressive and violent social behavior
(Davidson, Putnam, & Larson, 2000). The other, the somatic marker hypothesis (Damasio, 1996),
suggests that the prefrontal cortex contributes to a mechanism which acquires,
represents and retrieves values of actions. This mechanism generates representations of
somatic states which correspond to the anticipated outcome of decisions. The somatic
markers guide decision-making on the basis of the individual’s past experience with
similar situations and favor those decisions that were advantageous for the individual.
Taken together, the domain of social communication entails several component
processes: social perception, social cognition and social behavior. The social cognition
system uses information to construct a complex mental representation of the social
environment. Such information is provided by the social perception system. Processes
in the social cognition system in turn modulate effector systems, resulting in social
behavior. A complex neural machinery subserves the social communication system.
Regions in the temporal lobe such as the STS-complex and the fusiform gyrus
subserving social perception interact with a network of structures including the
amygdala and the orbitofrontal cortex subserving social cognition. These structures in
turn provide the input to motor and premotor systems and the basal ganglia which
guide social behavior.
1.1.7.2 Action understanding and biological motion
Action understanding can be defined as the capacity to achieve the internal description
of an action and to use it to organize appropriate future behavior. Two explanatory
approaches for this ability are currently discussed: the visual hypothesis and the direct
matching hypothesis (Rizzolatti et al., 2001). The first view suggests that action
understanding is based on visual analysis of the elements constituting an action.
According to this hypothesis, associations between the action elements and inferences
about their interactions are sufficient to understand an action. In terms of neural
structures, extrastriate visual areas, the inferior temporal lobe and the STS-complex
mediate action understanding. Motor involvement is not required.
The second approach states that we understand an action when we map the visual
representation of the observed action onto our motor representation of the same action
(Rizzolatti et al., 2001). Observation of an action causes resonance in the motor system.
If this view is correct, we understand an action because the motor representation of that
action is activated in our brain. The direct matching hypothesis does not completely
rule out that other cognitive processes, such as those suggested by the visual hypothesis, could
also contribute to this function. Experimental evidence supports the direct matching hypothesis. An
action observation-execution matching mechanism does exist in monkeys and humans, which has
a number of implications for the understanding and imitation of actions. Evidence
comes from studies applying transcranial magnetic stimulation (Fadiga, Fogassi,
Pavesi, & Rizzolatti, 1995; Gangitano, Mottaghy, & Pascual-Leone, 2001), from MEG
studies (Hari et al., 1998) and from fMRI studies (Buccino et al., 2001; Iacoboni et al.,
1999). In contrast to the monkey mirror system the human analogue seems to be more
flexible, since it reacts not only to goal-directed actions but also shows resonance
behavior to intransitive movements, i.e. movements not directed towards an object.
Rizzolatti and colleagues (Rizzolatti et al., 2001) argued that sensory binding of
different actions reflected by activity in the STS-complex may have derived from the
development of motor synergistic actions. Efference copies of actions may activate
specific sensory targets in order to improve the control of action. As a result, such an
interaction between sensory and motor systems can be used to understand the actions of
others.
Taken together, the strong interaction between the sensory and motor systems serves as
a potential mechanism for several higher cognitive functions such as imitation learning
and action understanding. Moreover, this mechanism could also play an important role
in other fundamental functions such as speech perception and the development of a
theory of mind. The following sections address these issues in more detail.
1.1.7.3 Speech perception and biological motion
Hauser and colleagues (2002) developed a theoretical framework for the understanding
of language. They suggested a distinction between the faculty of language in the broad
and the narrow sense. According to this view, language in the broad sense consists of
three subsystems. The first subsystem is a computational system for recursion,
providing the capacity to generate an infinite range of expressions from a finite set of
elements. This system also represents language in the narrow sense. The second
subsystem is a conceptual-intentional system including categorization, reference and
reasoning. This subsystem is involved in the acquisition of conceptual representations,
referential vocal signals and a voluntary control over signal production. The third
system is the motor-sensory system linking the action and perception system with
respect to modalities of language production and perception.
The way in which sounds conveying words are transformed into linguistic
representations in the brain is still under debate. Among several theories trying to
explain how speech is perceived, the “motor theory of speech perception”, which was
originally proposed by Liberman and colleagues (Liberman, Cooper, Shankweiler, &
Studdert-Kennedy, 1967; Liberman & Mattingly, 1985), has received much attention.
The main assumption of the motor theory is that the constituents of speech are not
sounds per se, but the articulatory gestures associated with these sounds that are shared
by the speaker and the listener. Accordingly, speech is perceived by matching the
articulatory gestures contained in heard words onto the listener’s motor repertoire. This
theory has been supported by experimental evidence of the close relation between
neural systems underlying action and perception as previously described. Further
evidence for the motor theory of speech perception is provided by the McGurk effect
(McGurk & MacDonald, 1976), which has been described in the section about social
perception. The kinematics of the face seem to play a special role in the McGurk effect,
since a point-light talking mouth aids the perception of speech in noise (Rosenblum,
Johnson, & Saldana, 1996) and interferes with audio speech perception when the
auditory and visual streams are incongruent (Rosenblum & Saldana, 1996). Speech
related and walking BM perception rely on networks which share some cortical
regions, but there are also regions that are relatively independent (Santi et al., 2003).
Santi and colleagues (2003) suggested that at the level of the STS-complex, the left
hemisphere becomes dominant in speech related BM, while the right STS maintains its
dominant role in processing whole-body BM.
The hypothesis of the specific role of BM perception of meaningful hand and mouth
movements in the development of a neural system for speech perception is also stressed
by Fadiga and Craighero (2003). A specific brain region is assumed to act as a
comparator between own and others’ motor representations. This region would allow
individuals to automatically understand the perceived action, because they are able to
reproduce the same action or the same sensory consequences of that action. In
Liberman’s motor theory of speech perception, it is necessary to have a motor
resonance system for movements of the vocal tract involved in the production of
speech sounds and perception of speech sounds. Such a mirror system may have
evolved from a matching system of motor and perceptual representations of hand
actions.
1.1.7.4 Theory of mind and biological motion
One of the most complex cognitive functions discussed in connection with BM
perception and its neuronal correlates is the ability to understand other people’s mental
states, i.e. their beliefs, desires and intentions. This capacity is described as having a
‘theory of mind’, or mentalizing (Blakemore & Decety, 2001; Frith & Frith, 1999; Frith
& Frith, 2003; Gallese & Goldmann, 1998). The neuronal machinery involved in
mentalizing is likely to have evolved from several preexisting mechanisms. Frith and
Frith (1999) argued that such mechanisms might include the ability to distinguish
animate and inanimate entities (1), the ability to share attention by following the gaze
of another agent (2), the ability to represent goal-directed actions (3), and the ability to
distinguish between actions of the self and of others (4). The neuronal structures
assumed to underlie these functions are the STS-complex for the detection of animate
objects and other people’s focus of attention, inferior frontal regions for the
representation of goal directed actions, medial prefrontal regions for the representation
of mental states of the self and the decoupling mechanism that distinguishes mental
state representations from physical state representations (Frith & Frith, 1999; Frith &
Frith, 2003). The activation of these components in concert seems to be critical to
mentalizing.
Gallese and Goldmann (1998) explicitly stressed the role of an action perception
matching system in the context of mentalizing. Their simulation theory of mind reading
suggests that other people’s mental states are represented by adopting their perspective
by matching their states with resonant states of one’s own. At least one component of
mentalizing, the understanding of intention, might have evolved from such a
mechanism (Blakemore & Decety, 2001). Their approach builds upon the suggestion
that we understand other people’s actions by mapping the observed action onto our
own motor representations of the same action. A mechanism for the recognition of
other people’s intentions could work in the following way. Sensory consequences of
our own actions are predicted by a forward model mechanism that works automatically
and stores a large number of sensory predictions resulting from different motor actions.
Efference copy signals that are generated simultaneously with the motor commands
may play a special role within this forward model. If this mechanism also operated in
the reverse direction, the process used by the forward model to predict the sensory
consequences of one’s own actions could in principle also be applied to estimate motor
commands from the observation of other people’s actions. Accordingly, observation of
other people’s action activates the motor commands that guide this action. Based on the
activation of this motor command it might be possible to estimate our own intention if
we performed the same action in the same context.
1.1.7.5 Concluding remarks on the contribution of biological motion perception to
higher cognitive functions
It is beyond the scope of this section to give a detailed description of all aspects of
social communication, action understanding, speech perception and theory of mind.
The main focus was on the relation between the neural network involved in
BM perception and its critical contribution to these higher-level cognitive functions.
It is widely accepted that the evolution of neural mechanisms involved in high
level cognitive functions builds upon more basic mechanisms which we share with
other animals. One of these important mechanisms is the neural machinery originally
dedicated to the perception of BM. In the animal kingdom the ability to detect BM quickly and
efficiently directly increases an animal’s chance of survival. In humans, the
neural machinery that has evolved from this mechanism plays a more sophisticated role
because of its critical contribution to social communication, action understanding,
speech perception and mentalizing, and because of the highly adaptive value of social
interaction.
1.1.8 Open issues concerning perception of biological motion
As indicated by the large number of publications dealing with this topic, the knowledge
about the neuronal basis of BM perception has substantially increased in recent years.
Nevertheless, there are still several open issues. Initial research of BM perception
concentrated on the information contained in the kinematics of movement patterns. It
was shown that a wide variety of information can be retrieved by human observers
from such visual stimuli and that the visual system can process this kind of stimuli very
efficiently. However, it is as yet unclear what kind of sensory filter the visual system
uses to allow such fast and efficient processing.
Whereas substantial knowledge is available about the neocortical structures underlying
BM perception from a number of neuroimaging studies described previously, many
issues concerning the distinct processing stages and their temporal characteristics in
humans are as yet unclear. Accordingly, no attempts have been made so far to link
distinct processing stages to distinct neural structures. As the results of psychophysical
studies have shown, attentional resources are clearly necessary in the perceptual
analysis of BM. It is an additional open question what constitutes their neural
correlates.
The term BM refers to a visual stimulus and the visual information conveyed by the
kinematics of movement patterns, respectively. It is still an unsolved question how this
information is mentally represented. BM information might be stored object centered
and viewpoint-invariant or viewer centered and viewpoint-dependent. The two kinds of
representation imply different neuronal mechanisms. Furthermore, it is unclear
which role the motor system plays with respect to the representation of human
movement patterns and visual representation of BM.
The role of subcortical regions in the perceptual analysis of BM also
remains to be clarified. Neuroimaging studies of BM perception have yielded
inconsistent results with respect to the role of the cerebellum. Clinical lesion studies,
which investigate perceptual performance in patients with distinct lesions, would
provide the opportunity to further elucidate this issue, but have as yet focused on the
role of neocortical brain structures.
1.2 Objectives of the current work
The current work aims to further investigate several issues concerning the
neuropsychological basis of BM perception. Four studies address a number of open
questions described above by using different methodological approaches. The first
study investigates specific aspects of information contained in the kinematics of
animate motion patterns using a psychophysical approach (Study 1). This study is
related to the comprehensive psychophysics literature on BM perception reviewed
previously and might give further insight into the sensory filters subserving the
detection of BM. The second study examines the temporal aspects of BM processing
and associated processing stages by the analysis of event-related potentials and source
analysis (Study 2). This study aims to further elucidate the time course of BM processing and to
link distinct processing stages to distinct neural systems. The third study of this thesis
explores differential visual representations of one’s own movement patterns compared
to other familiar movement patterns by psychophysical methods (Study 3). This study
is related to research on action perception and aims to gain further insight into the
mental representation of movement patterns and a possible interaction of perceptual
and motor representations. Finally, the fourth study investigates the role of the
cerebellum in the perceptual analysis of BM (Study 4). Using a lesion approach, this
study attempts to clarify the role of the cerebellum in BM perception. The objectives of
these studies are further specified in the following paragraphs.
1.2.1 Study 1: Biological motion as cue for the perception of size
The constant force of earth gravity directly influences the motion patterns of animate
beings and inanimate objects in the physical world. With respect to BM, gravity determines
periodic fluctuations between kinetic and potential energy. Therefore, gravity
determines a fixed relation between temporal (i.e. the stride frequency) and spatial
parameters (i.e. the length of a leg) of energetically optimal gait patterns. In fact it has
been proven that animals adjust their gait patterns in order to minimize the energy
required for their locomotion and that such a relation between size and stride frequency
does exist for a number of different species. It might be possible that this typical
pattern of motion created by the influence of gravity is the critical feature used by the
mammalian visual system to detect BM and process it as special category. If this
specific motion pattern plays a crucial role as sensory filter for the detection of BM, it
can be assumed that the visual system has implicit knowledge about the relationship
between temporal and spatial parameters of BM as defined by gravitational forces.
The first study* of this thesis explored whether human observers can employ the
relation between temporal and spatial parameters as defined by gravity in order to
retrieve size information from point-light displays of animate motion. The rationale was
to induce different size percepts by manipulating the temporal parameters, specifically
the stride frequency. If observers have implicit knowledge about the influence of
gravity on the movement patterns of animate beings, an inverse quadratic relation
between the perceived size of the animate being and its actual stride frequency is
expected.
* The study “Biological motion as cue for the perception of size” comprises two experiments that explored
whether human observers have implicit knowledge about the relation between temporal and spatial
parameters in animate motion patterns as defined by gravitational forces. Raw data of the first experiment
were part of the diploma thesis. These data were reanalyzed and an additional correlation analysis was
performed. The second experiment was conducted entirely within the PhD project and extends the
first one in two important respects. First, the mechanism for indicating perceived size was changed.
Second, a supplementary task was introduced in order to obtain a direct size estimate based on static
cues. This procedure made it possible to separate static and kinematic size cues and to calculate
how both sources of information are integrated.
1.2.2 Study 2: Structural encoding and recognition of biological motion:
Evidence from event-related potentials and source analysis
Whereas the neural structures involved in BM perception have frequently been
examined using functional neuroimaging, only a few studies have so far attempted to
elucidate the temporal course of processing of point-light displays of BM in humans.
The second study of this PhD project aimed to investigate how different processing
stages involved in the perceptual analysis of BM are reflected by modulations in
event-related potentials (ERPs) in order to elucidate the time course and location of neural
processing of BM. Data analysis was carried out using conventional averaging
techniques as well as source localization with low resolution brain electromagnetic
tomography (LORETA).
Observers were presented with three stimulus classes: point-light displays of a walking
figure in normal orientation, the same displays in inverted orientation, and displays of
scrambled motion as control stimuli, in which the dots have the same motion vectors as in
the BM condition but randomized initial positions. We predicted an inversion effect for
BM in the time window up to 200 ms after stimulus onset as usually found for faces or
other stimulus categories for which the observer is an expert. Moreover, we expected
an ERP source specific to BM in the STS-complex, concerned with the fine analysis of
motion patterns, which provide biologically relevant information and contribute to
social perception.
1.2.3 Study 3: Self recognition versus recognition of others by biological
motion: Viewpoint-dependent effects
It is an open question whether the mental representation of one’s own movement
pattern is different from representations of other familiar movement patterns. This
question is addressed in a psychophysical approach examining viewpoint-dependent
recognition effects. The knowledge about such recognition effects provides insight into
the mental representation and perceptual mechanisms of BM processing. It is still under
debate whether the visual representations of objects are viewpoint-dependent (Bulthoff
& Edelman, 1992; Tarr & Bulthoff, 1995) or viewpoint-invariant (Biederman &
Gerhardstein, 1993, 1995).
Viewpoint-invariance indicates that object recognition is independent of the viewpoint
of previous exposure to the object. By contrast, the hypothesis of viewpoint-
dependence proposes superior object recognition if the object is presented in a familiar
perspective. A dissociation between the mental representation of one’s own gait pattern
and the representation of another familiar person might be based on a common coding
of perception and action, as suggested by the direct matching hypothesis, which was
introduced in a previous section.
The third study of this thesis was conducted to address several issues. The first aim was
to investigate recognition performance of gait patterns from familiar persons
represented as point-light displays. The second objective was to elucidate viewpoint-
dependent effects in the representation of gait kinematics by exploring the influence of
viewing angle on recognition performance. The third aim was to examine a potential
dissociation between the mental representation of one’s own gait patterns and gait
patterns of other familiar persons.
1.2.4 Study 4: Cerebellar contribution to the perception of biological
motion
Whereas there is general agreement concerning the role of the STS-complex in BM perception
(Bonda et al., 1996; Grezes et al., 2001; Grossman & Blake, 2001, 2002; Grossman et
al., 2000; Servos et al., 2002; Vaina et al., 2001), the literature on subcortical regions
and the cerebellum in particular is inconsistent. Two of the above-mentioned studies
reported cerebellar involvement (Grossman et al., 2000; Vaina et al., 2001), while the
others failed to detect cerebellar activity. Moreover, there are also inconsistencies
regarding the cerebellar subregion that may be involved in BM perception. Grossman
and colleagues (2000) found cerebellar activity in the anterior portion near the midline,
while Vaina and colleagues (2001) reported activity specific to BM in lateral parts of
the cerebellum.
The fourth study of the present project aimed to explore the role of the cerebellum in
BM perception by assessing BM perception in patients with selective ischemic
cerebellar lesions. More specifically, the study addresses the question whether the
cerebellar activity observed in previous imaging studies reflects a critical involvement
in the perceptual analysis of BM or whether it is a consequence of co-activation due to a
feedforward mechanism initiated by a visual stimulus of human movements. The latter hypothesis
is supported by the strong interaction between neural systems for perception and action
as described previously and by the finding of cerebellar contribution to motor imagery
(Decety, 1996; Decety, Sjoholm, Ryding, Stenberg, & Ingvar, 1990; Hanakawa et al.,
2003; Luft, Skalej, Stefanou, Klose, & Voigt, 1998; Ryding, Decety, Sjoholm,
Stenberg, & Ingvar, 1993).
1.2.5 Concluding remarks
Taken together, the series of studies aims to further elucidate the neuronal mechanisms
underlying BM perception in humans and to gain new insights concerning the
neuropsychological basis of BM perception. The studies included in this thesis should
contribute to a better understanding of the sensory filters allowing very fast and
efficient detection of animate motion patterns. Knowledge about the temporal course of
BM processing makes it possible to associate distinct processing steps with circumscribed
neuronal structures. Further insight into the mental representation of movement patterns and a
possible interaction of perceptual and motor representations might have implications
for general mechanisms in human brain functioning. Finally, knowledge about the role
of subcortical structures with respect to BM perception might help to understand the
interplay between subcortical and cortical structures in higher visual functions.
Chapter II Biological Motion as Size Cue
33
II Study 1: Biological Motion as a Cue for
the Perception of Size
Daniel Jokisch and Nikolaus F. Troje
Summary
Animals as well as humans adjust their gait patterns in order to minimize energy
required for their locomotion. A particularly important factor is the constant force of
earth’s gravity. In many dynamic systems, gravity defines a relation between temporal
and spatial parameters. The stride frequency of an animal that moves efficiently in
terms of energy consumption depends on its size. In two psychophysical experiments,
we investigated whether human observers can employ this relation in order to retrieve
size information from point-light displays of dogs moving with varying stride
frequencies across the screen. In Experiment 1, observers had to adjust the apparent
size of a walking point-light dog by placing it at different depths in a three-dimensional
depiction of a complex landscape. In Experiment 2, the size of the dog could be
adjusted directly. Results show that displays with high stride frequencies are perceived
to be smaller than displays with low stride frequencies and that this relationship
closely follows the predicted inverse quadratic relation between stride frequency and
size. We conclude that biological motion can serve as a cue to retrieve the size of an
animal and, therefore, to scale the visual environment.
2.1 Introduction
The perception of motion is a fundamental property of the visual system. Among the
most complex but also most familiar types of motion are the nonrigid movement
patterns of living organisms. For animals as well as for humans, animate motion
patterns contain a wide variety of information. Correct interpretation of this information
is an important ability. In the animal kingdom, accurate and fast movement recognition
of a prey or predator animal increases an animal’s fitness and, therefore, its chance of
survival. For humans, the ability to identify, interpret, and predict the actions of others
is of particular relevance in the context of successful social interaction that plays a
major adaptive role.
Visualizing the position of the main joints of a walking person by bright dots is enough
to convey a vivid impression of a human figure in motion. The percept collapses into a
meaningless array of unconnected dots when the walker stands still, demonstrating that
the interpretation is carried solely by the dynamics of the display (Johansson, 1973).
Observers require only 100–200 ms to organize such point-light displays into a coherent
percept (Johansson, 1976). The rudimentary information contained in point-light
displays of biological motion (BM) is sufficient even to solve sophisticated recognition
tasks. Observers are able to recognize the gender of a walking person (Barclay, Cutting,
& Kozlowski, 1978; Cutting, 1978; Kozlowski & Cutting, 1977; Mather & Murdoch,
1994; Troje, 2002), can identify friends by their gait (Cutting & Kozlowski, 1977), and
can even recognize themselves from a recorded point-light display of their own
movements (Beardsworth & Buckner, 1981). Mather and West (1993) extended the
point-light display paradigm to animations of four-legged animals and showed that
human observers can identify different animals by their movement pattern. Inversion
effects of BM displays of animal movements were investigated by Pinto and Shiffrar
(1999). The ability to perceive BM is not restricted to humans. It has been shown that
cats are able to identify point-light displays of conspecifics (Blake, 1993), that pigeons
are capable of discriminating between categories of conspecifics’ walking and pecking
when presented as point-light displays (Dittrich, Lea, Barrett & Gurr, 1998), and that
chicks and quails also have the ability to perceive point-light displays of BM of
conspecifics (Yamaguchi & Fujita, 1999). The ability of nonhuman primates to perceive
BM was indicated by the finding of single cells responding selectively to BM displays
(Oram & Perrett, 1994).
Animals as well as humans adjust their gait patterns in order to minimize the energy
required for their locomotion. The energy costs are determined by the properties of the
physical world. A particularly important factor in this context is the constant force of
earth’s gravity. For many dynamic events occurring under constant gravity conditions, a
fixed relation between temporal and spatial parameters is maintained. This relation is
particularly valid for inanimate motion systems, such as pendulum motion or ballistic
motion. However, it also seems to hold for many animate motion patterns. Therefore,
from a theoretical point of view, time can be used as an information source about spatial
scale in visually recognizable events under the influence of gravity. Several studies have
investigated the perception of scale properties in inanimate dynamic events.
Pittenger (1985, 1990) examined the perception of the scale properties in pendulum
motion. The length of a freely swinging pendulum is proportional to the square of its
period. Pittenger (1985) found that observers could estimate the length of a pendulum
when given information about its period. The estimated lengths were found to be a
linear function of actual lengths, though with wide differences in slopes among
individual observers. When viewing normal pendulums with physically correct periods
and perturbed pendulums with either shorter or longer periods, observers could rate the
naturalness of motion with a high degree of acuity (Pittenger, 1990).
The same idea has also been applied to the perception of the distance of objects in free
fall (Saxberg, 1987a, 1987b; Watson, Banks, von Hofsten, & Royden, 1992). The law
of free fall motion relates the height of a fall to the duration of the event. Analogous to
pendulum motion, the height of fall is proportional to the square of its duration. In a
simulated catching task, in which observers had to predict the position where a ball
approaching along a parabolic trajectory would fall, Saxberg (1987b) tested whether
observers make use of this information. When the display contained information both
from image expansion and vertical component of free fall, observers performed this task
well, but when information from image expansion was eliminated, they failed. Saxberg
concluded that the latter finding demonstrated a failure to use the information conveyed
by the relation between height of fall and its duration. However, Watson and colleagues
(1992) argued that this failure was based on conflicting sources of information and not
purely on the inability to retrieve the relation between height and duration of the event.
Stappers and Waller (1993) tested people’s ability to use the time of free fall of objects
as a reference to spatial scale and showed that observers reliably matched gravitational
acceleration to apparent depth in a computer simulation. Hecht, Kaiser and Banks
(1996) examined whether observers could utilize size and distance information provided
by gravitational acceleration by presenting observers with displays of the motion of
rising and falling objects. Observers were able to use the information to some extent but
were more sensitive to average velocity than to gravitational acceleration. Another study
that investigated the perception of spatio-temporal patterns of object motion (Warren,
Kim, & Husney, 1987) demonstrated observers’ ability to make accurate perceptual
judgments of elasticity of bouncing objects by detecting single period duration visually
or auditorily in the absence of height information. McConnell, Muchisky and Bingham
(1998) tested observers’ ability to judge object size in event displays that eliminated all
information other than time and trajectory forms. Initially, judgment variability was
substantial, but after feedback on one event, observers performed better and generalized
training to other events. Observers were sensitive to the general form of the spatio-
temporal scaling relation, but required feedback to attune event-specific constants.
The general form of the relation between a spatial scale s and a temporal scale T in
events governed by gravity is given by
s = k T^{2},    (1)
where k is a constant factor specific to the event being considered.
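As a worked illustration of Equation 1, the constant k can be written out for the two inanimate events discussed above, pendulum motion and free fall. The following is a minimal sketch, assuming standard gravity (g = 9.81 m/s²) and the textbook expressions for an ideal pendulum and for free fall from rest; the function names are chosen for illustration only.

```python
import math

G = 9.81  # assumed gravitational acceleration in m/s^2


def k_pendulum(g=G):
    """Constant k in s = k * T**2 for an ideal pendulum (s = length, T = period)."""
    return g / (4 * math.pi ** 2)


def k_free_fall(g=G):
    """Constant k in s = k * T**2 for free fall from rest (s = height, T = duration)."""
    return g / 2


def spatial_scale(k, t):
    """Spatial scale implied by a temporal scale t under Equation 1."""
    return k * t ** 2


# Doubling the temporal scale quadruples the spatial scale, whatever k is.
for name, k in (("pendulum", k_pendulum()), ("free fall", k_free_fall())):
    print(name, round(spatial_scale(k, 1.0), 3), round(spatial_scale(k, 2.0), 3))
```

The quadratic form is shared by both events; only k differs, which is the quantity that observers in the studies cited above found harder to recover.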
The above findings document that the human visual system seems to be able to use this
quadratic relation in order to obtain size information from temporal cues. The absolute
quantitative relation expressed in the constant k, however, is not as easily obtainable.
Psychophysical studies considering the relationship between temporal and spatial
parameters as visual cues for event perception have not been restricted to inanimate
dynamic systems. In the domain of animate motion, such visual cues are proposed to
play a role in action perception. Runeson and Frykholm (1981, 1983) have shown that
the weight of an object can be readily estimated by observing another person lifting and
carrying it when the person is represented as a point-light display. They concluded that
the crucial information is embedded in the kinematics of the action pattern, in which an
object’s weight is specified by the magnitude of postural adjustments relative to the
acceleration of the object. Bingham (1987, 1993c) provided further empirical evidence
for the content of information about an object’s weight in the kinematic pattern. The
studies by Runeson and Frykholm (1981, 1983) and Bingham (1987, 1993c) investigate
the ability to derive additional information from visual point-light displays of human
actions employing knowledge about the effects of gravity on objects in the physical
world. Therefore, these studies are related to our question. However, they do not
directly address the question whether temporal parameters from BM can be used as a
cue about size information of animate beings.
From a physical point of view, the relation between temporal and spatial parameters
described above is also evident in animate locomotion patterns. A simple model for a
walking biped is an inverted pendulum that idealizes the total body mass to a point mass
on a rigid mass-less leg (Alexander, 1977). More complex models consider humans and
animals as a set of coupled, articulated pendulum segments. No mechanical energy is
needed to maintain the movements of an ideal undamped pendulum because kinetic and
gravitational potential energy fluctuations are equal in amplitude and exactly 180° out
of phase. In humans, the pendulum-like mechanism conserves about 65% of the
mechanical energy from step to step at the preferred walking speed (Cavagna, Thys, &
Zamboni, 1976). Pendulum-like energy exchange diminishes at faster walking speeds
because of a mismatch in the magnitudes and phases of the fluctuations of the two
forms of mechanical energy. Thus, at non-optimal speeds, the muscles must provide
additional mechanical power. The relation between the length l and the period T of an
ideal pendulum is
l = \frac{g T^{2}}{4 \pi^{2}}    (2)
with g being gravitational acceleration. In order to obey this relation, smaller animals
have to move with a higher stride frequency f = 1/T than larger animals.
The major force that determines the pendulum-like movements during walking is
gravity, which must be at least equal to the centripetal force needed to keep the center of
mass traveling along a circular arc. The centripetal force needed is equal to mv²/L,
where m is body mass, L is leg length, and v is forward speed (Kram, Domingo, &
Ferris, 1997). The ratio between the centripetal force and the gravitational force
(mv²/L)/(mg) = v²/(gL) is the dimensionless Froude number (Alexander, 1989). Therefore,
if animals travel with equal Froude number, their speeds v are proportional to the square
root of the leg length L. If they move in dynamically similar fashion (Alexander &
Jayes, 1983), the stride length l is proportional to the leg length and hence the stride
frequency f = v/l is inversely proportional to the square root of the leg length.
Pennycuick (1975) measured the stride frequencies of African mammals moving
spontaneously in their natural habitat and found that they are in fact inversely
proportional to the square root of the stride length to a very good approximation. Thus,
the findings show that the relation between spatial and temporal scales expressed in
Equation 1 is also reflected in the locomotion patterns of animals.
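The scaling argument above can be checked numerically. The sketch below assumes standard gravity and uses hypothetical values for the Froude number and for the ratio of stride length to leg length (neither value is taken from the cited studies); it only illustrates that, at equal Froude number and with dynamically similar gaits, stride frequency falls with the square root of leg length.

```python
import math

G = 9.81  # assumed gravitational acceleration in m/s^2


def froude_number(v, leg_length, g=G):
    """Dimensionless Froude number v^2 / (g * L)."""
    return v ** 2 / (g * leg_length)


def stride_frequency(leg_length, froude, stride_per_leg, g=G):
    """Stride frequency f = v / l for a given Froude number.

    stride_per_leg is the (hypothetical) ratio of stride length to leg
    length, held constant for animals moving in a dynamically similar way.
    """
    v = math.sqrt(froude * g * leg_length)       # speed at this Froude number
    stride_length = stride_per_leg * leg_length  # stride length l
    return v / stride_length


# Hypothetical values: Froude number 0.5, stride length twice the leg length.
f_small = stride_frequency(0.25, froude=0.5, stride_per_leg=2.0)
f_large = stride_frequency(1.00, froude=0.5, stride_per_leg=2.0)

# A fourfold longer leg halves the stride frequency (ratio close to 2).
print(f_small / f_large)
```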
In this study, we examined whether the human visual system is able to use this relation
to derive the size of an animal in the absence of other cues. To achieve this, we
presented observers with point-light displays of a dog. Varying the playback speed, we
asked observers to estimate the size of the dog. We predicted that animals are perceived
to be larger in animations presented with low stride frequency and smaller in animations
with high stride frequency. More specifically, we assumed that the relationship between
the stride frequency f of an animal and its estimated size s_dyn is
s_{dyn} = c_1 \frac{1}{f^{2}}    (3)
where c_1 is a constant factor quantifying the spatio-temporal scaling relation. The
absolute value of c_1 depends on gravitational acceleration and on the gait pattern (e.g.,
trotting, cantering, etc.).
However, the kinematics of the animation may not be the only source of information
about the dog’s size. Additional size cues might be contained in an animal’s posture or
proportions of body segments. For example, Pittenger and Todd (1983) have shown that
changes of static body proportions of line drawings of a human body have an effect on
perception of growth, and, therefore, also have an indirect effect on the perception of
size. Studies using other biological objects have also shown that the perception of size
can be influenced by form information. Bingham (1993a, 1993b) showed that properties
of tree form could be used to estimate the height of trees.
The size information embedded in body proportions is independent of the temporal
scaling factor and can be described as a second constant:
s_{stat} = c_2    (4)
s_dyn and s_stat exist simultaneously and both may contribute to a size estimate. Here, we
assume linear integration, and we introduce a factor λ accounting for the relative weight
of the two terms:
s_{all} = \lambda c_1 \frac{1}{f^{2}} + (1 - \lambda) c_2    (5)
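The behavior of this linear integration model can be sketched in a few lines. The parameter values below are hypothetical and serve only to show how the weight λ shifts the size estimate between the purely static cue (λ = 0) and the purely kinematic cue (λ = 1); the function names are illustrative.

```python
def size_dynamic(f, c1):
    """Equation 3: size implied by the stride frequency f alone."""
    return c1 / f ** 2


def size_combined(f, c1, c2, lam):
    """Equation 5: weighted sum of the dynamic cue (weight lam)
    and the static cue c2 (weight 1 - lam)."""
    return lam * size_dynamic(f, c1) + (1 - lam) * c2


# Hypothetical parameter values, chosen only for illustration.
c1, c2 = 1000.0, 60.0
for lam in (0.0, 0.5, 1.0):
    sizes = [round(size_combined(f, c1, c2, lam), 1) for f in (2.5, 3.6, 5.1)]
    print(lam, sizes)
```

With λ = 0 the estimate does not depend on stride frequency at all, whereas with λ = 1 it follows the inverse quadratic relation exactly.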
In order to test this hypothesized model, we conducted two experiments presenting
observers with point-light displays depicting a dog moving across the screen. We chose
a dog as a model because dogs cover a wide range of different sizes ensuring that size
estimations made by observers are not restricted too much by the range of possible
sizes. Our point-light dog was shown as walking through a three-dimensional scene
depicting a desert landscape. When observing the image of such a scene, the perceived
size of different objects within the scene depends, on the one hand, on the visual angle
covered by the objects and, on the other hand, on the perceived position in depth within
the scenery. As a consequence of this size-distance ambiguity, there exist two methods
to change the size of an object within the scene: (1) varying its position in depth while
maintaining a fixed visual angle or (2) showing the object at a fixed distance and
varying the size of the object’s visual angle. For both methods, the size of other objects
embedded within the scene provides an absolute reference.
In Experiment 1, observers were asked to adjust the apparent size of the dog by
changing its position in depth while maintaining its projected size on the screen, and,
therefore, its subtended visual angle. In Experiment 2, observers were allowed to
change the size of the dog directly. In Experiment 2, we also added a second task: In
addition to estimating the size of the dynamic point-light displays, observers were
required to estimate the size of a static stick-figure display.
2.2 Experiment 1
The observers’ task was to estimate the size of the dog animations. The point-light
displays were presented in a desert landscape with varying stride frequencies.
Perspective and texture gradient created a three-dimensional percept. Reference objects
(cactuses and posts) were scattered across the scene to provide size references at
different depths. With the visual angle subtended by the dog remaining constant,
observers could place the animation at different locations in depth in order to indicate
the perceived size.
2.2.1 Methods
2.2.1.1 Participants
Sixteen students (11 females and 5 males) between the ages of 20 and 39 years from the
psychology and biology departments at the Ruhr-University participated in this
experiment. They received course credit for their participation. All participants had
normal or corrected-to-normal vision. They were naive as to the objectives of this
experiment.
2.2.1.2 Stimuli
Synthetic motion data of a dog (“Animania Dog” by Credo Interactive Inc.) were
presented in sagittal view as point-light displays. The display consisted of 20 dots
altogether. Three dots represented the position of each leg’s main joints (forelegs:
elbow, carpal, and phalange; hind legs: knee, tarsal, and phalange). The positions of the
pelvis and the scapula were both represented by two dots each. Two dots represented
the position of the head and two represented the position of the thoracic and coccygeal
vertebrae. Each dot had a size of 4 mm² and was displayed in a bright green coloring.
An additional set of 20 black dots represented the shadows of the dots depicting the
dog’s body. Adding a shadow ensures that observers perceive the animal’s legs to have
contact with the ground. The point-light display had a size of 4 cm on the screen
corresponding to 4 deg of visual angle at the viewing distance of 58 cm. This distance
was fixed by using a wooden chinrest. The image sizes of the point-light displays were
held constant across all trials.
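The stated correspondence between screen size and visual angle can be verified with the standard formula for an object viewed frontally. This is a minimal sketch; the function name is illustrative.

```python
import math


def visual_angle_deg(size_cm, distance_cm):
    """Visual angle (in degrees) subtended by an object of the given size
    viewed frontally from the given distance."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))


# 4 cm on the screen viewed from 58 cm, as reported in the text.
angle = visual_angle_deg(4.0, 58.0)
print(round(angle, 2))  # slightly below 4 deg
```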
In order to determine exactly the gait pattern of our animated dog, we examined the
phase relations between the feet. The difference between various gait patterns is
described by the phase relations between the movements of the four legs. For instance,
the trot is a symmetrical gait in which diagonal pairs of legs move together. In cantering
animals, this symmetry is broken. Whereas one diagonal pair of legs moves in
synchrony, the other pair is out of phase, with the respective foreleg being ahead of the
contralateral hind leg. According to Alexander (1984), the phase difference of this
asynchronous pair is 140 deg. In our data, the phases of the legs with respect to the left
foreleg were 155, 205, and 0 deg for the right foreleg, the left hind leg, and the right
hind leg, respectively. This pattern clearly shows the asynchronous characteristic of the
canter, but the phase difference between foreleg and hind leg of the asynchronous leg
pair is smaller than described by Alexander (1984). We nevertheless refer to the gait pattern of our
animated dog in the following experiments as a “canter,” accepting some mismatch
between the phase relations in our data and those reported in the literature.
The point-light displays were presented on a background depicting a perspective
landscape (Fig. 2.1). The landscape was designed with the software Bryce 4 by Meta
Creations. It portrayed a desert scene in which some objects (cactuses and posts) were
embedded to serve as reference objects. All objects belonging to the same class had the
same size within the perspective scene (posts 1 m; cactuses 2 m), resulting in varying
image sizes on the screen according to their positions in spatial depth. Posts were
positioned at regular intervals on two parallel lines. Cactuses were arranged in random
order. The lens of the camera recording this scenery was positioned 1.5 m above the
ground with a tilt angle of 8°. The scenery subtended a visual angle of 35.5 × 24.5
deg.
Fig. 2.1. Display of a dog on the perspective background. The lines connecting the dots were shown only in the stick-figure depictions of the second subtask of Experiment 2. They were omitted in Experiment 1 and in the first subtask of Experiment 2.
2.2.1.3 Procedure
Animated dogs moved across the scene from the left-hand side to the right-hand side.
The playback speed was varied systematically, resulting in five different stride
frequencies (2.54, 3.02, 3.59, 4.27, and 5.08 cycles/s). These frequencies corresponded
to 71, 84, 100, 119, and 141% of the original stride frequency. By pressing the arrow
buttons on the keyboard, participants could change the vertical position of the point-
light display on the screen and hence the perceived position in depth in 21 steps. The
physical size of the point-light display remained constant. Due to the perspective
background, each vertical screen position corresponded to one position in spatial depth,
resulting in a changed size impression. Apparent size changed from one position to the
adjacent one by a factor of 1.09. Across the 21 different positions, apparent size
changed altogether by a factor of 5.66.
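The reported stimulus parameters can be reproduced numerically. The sketch below assumes that neighbouring stride frequencies differ by a constant ratio of 2^(1/4) around the original frequency of 3.59 cycles/s; this spacing is not stated in the text but is consistent with the reported values. The step factor for apparent size follows from the reported total range of 5.66 spread over the 20 steps between the 21 positions.

```python
BASE_FREQUENCY = 3.59  # original stride frequency in cycles/s (from the text)
FREQ_STEP = 2 ** 0.25  # assumed ratio between neighbouring frequencies

# Two steps below and two steps above the original frequency.
frequencies = [BASE_FREQUENCY * FREQ_STEP ** i for i in range(-2, 3)]
percentages = [100 * f / BASE_FREQUENCY for f in frequencies]

N_POSITIONS = 21         # number of depth positions (from the text)
TOTAL_SIZE_RANGE = 5.66  # overall change in apparent size (from the text)

# 21 positions are separated by 20 equal multiplicative steps.
size_step = TOTAL_SIZE_RANGE ** (1 / (N_POSITIONS - 1))

print([round(f, 2) for f in frequencies])
print([round(p) for p in percentages])
print(round(size_step, 2))
```

Because predicted size scales with 1/f², a twofold range of stride frequencies corresponds to a fourfold range of predicted sizes, which fits within the available adjustment range of 5.66.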
The experiment took place in a separate experimental room. Animations were presented
on a 19-inch monitor (90 Hz) at a frame rate of 45 Hz. Observers were told that they
would be shown dogs of different sizes animated as point-light displays. They
were instructed to adjust the apparent size of the dogs so that the display on the screen
looked as natural as possible. In each trial, observers were allowed to try different
positions as often as they wanted. Each time the observers hit a key to change the size,
the dog started at the initial position on the left side of the screen. A trial was completed
when the observers had selected one position and confirmed their choice by pressing the
space bar. Time for solving the task was unlimited. No feedback was given following
the size judgments. Before starting the experimental trials, observers were shown six
demonstration trials in order to familiarize them with the displays and the setup. During
those demonstration trials, the experimenter pointed out the perspective properties of
the scene and drew attention to the various sizes of the objects (posts and cactuses)
serving as reference scale.
The experiment was conducted using a one-factorial repeated-measures (within-subjects)
design. The independent variable was the stride frequency of the dog animation, with
five levels. In each condition, 11 repeated trials were presented. The trials started
with different initial sizes covering the whole range of possible sizes. The order of the
55 trials was randomized individually for each participant.
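A trial list of this kind could be generated as follows. This is only a sketch of the design, not the software used in the experiment: it assumes that the 11 initial sizes per condition were spread evenly over the 21 possible positions (an assumption; the text states only that they covered the whole range), and the function and variable names are hypothetical.

```python
import random

STRIDE_FREQUENCIES = [2.54, 3.02, 3.59, 4.27, 5.08]  # cycles/s, from the text
N_POSITIONS = 21                                      # depth positions, from the text


def build_trials(rng):
    """Return a shuffled list of (stride_frequency, initial_position) pairs."""
    # Assumption: 11 initial positions spread evenly over the 21 positions
    # (0, 2, ..., 20), one per repetition of each condition.
    initial_positions = list(range(0, N_POSITIONS, 2))
    trials = [(f, p) for f in STRIDE_FREQUENCIES for p in initial_positions]
    rng.shuffle(trials)  # individual random order for one participant
    return trials


# A separate seed per participant yields an individual trial order.
trials = build_trials(random.Random(1))
print(len(trials))
```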
2.2.2 Results and discussion
The effect of stride frequency on perceived size was significant as tested by an analysis
of variance (ANOVA) (F(4,60) = 11.85, p < 0.001). On average across all participants,
animated dogs moving with high stride frequency were perceived to be smaller than
dogs moving with low stride frequency (Fig. 2.2). This outcome confirms the
hypothesis that observers retrieve size information from the stride frequency that
animals use for locomotion. Recall that the instructions did not explicitly draw
observers’ attention to the stride frequency of the animated animals. According to the
instructions, observers were requested to adjust the position so that the scene looked as
natural as possible. Therefore, observers seem to use implicit knowledge to make their
size judgments.
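The degrees of freedom reported above, F(4,60), follow from 5 stride frequencies and 16 observers. A minimal sketch of a one-way repeated-measures ANOVA is given below. The data are synthetic placeholders (generated from an inverse quadratic trend plus noise), not the measured size estimates, and the function name is hypothetical.

```python
import random


def rm_anova_oneway(data):
    """One-way repeated-measures ANOVA.

    data: one row per observer, one value per condition.
    Returns (F, df_effect, df_error).
    """
    n = len(data)       # observers
    k = len(data[0])    # conditions
    grand = sum(sum(row) for row in data) / (n * k)
    cond_means = [sum(row[j] for row in data) / n for j in range(k)]
    subj_means = [sum(row) / k for row in data]
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_error = ss_total - ss_cond - ss_subj  # condition-by-observer residual
    df_effect, df_error = k - 1, (k - 1) * (n - 1)
    f_value = (ss_cond / df_effect) / (ss_error / df_error)
    return f_value, df_effect, df_error


# Synthetic placeholder data: 16 observers x 5 stride frequencies.
rng = random.Random(0)
freqs = [2.54, 3.02, 3.59, 4.27, 5.08]
data = []
for _ in range(16):
    observer_offset = rng.gauss(0, 10)
    data.append([141 / f ** 2 + 35 + observer_offset + rng.gauss(0, 5) for f in freqs])

f_value, df_effect, df_error = rm_anova_oneway(data)
print(df_effect, df_error)  # 4 60
```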
Based on the assumptions formulated in Equation 5, the function
$$ s_{all} = k_1 \frac{1}{f^2} + k_2 \qquad (6) $$
was fitted to the data. Using averages across observers, the best-fitting values are k1 =
141 and k2 = 35. With these values, Equation 6 accounts for the means of estimated sizes
across all observers with r² = 0.96, so that only 4% of the variance of the data remains
unexplained. A linear fit, on the other hand, yields r² = 0.88 for the empirical data,
therefore leaving 12% of the variance unexplained.
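Because Equation 6 is linear in 1/f², its two parameters can be obtained by ordinary least squares, with k1 as the slope and k2 as the intercept. The sketch below illustrates this and compares the result with a fit that is linear in f. The sizes are placeholders generated from the reported fit (k1 = 141, k2 = 35), not the measured means, so the r² values it produces are not those reported above. The helper name is hypothetical.

```python
STRIDE_FREQUENCIES = [2.54, 3.02, 3.59, 4.27, 5.08]  # cycles/s


def linear_least_squares(x, y):
    """Fit y = slope * x + intercept; return (slope, intercept, r_squared)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return slope, intercept, 1 - ss_res / ss_tot


# Placeholder sizes (cm) generated from the reported parameters; NOT measured data.
sizes = [141 / f ** 2 + 35 for f in STRIDE_FREQUENCIES]

# Inverse quadratic model: regress size on 1/f^2.
k1, k2, r2_model = linear_least_squares([1 / f ** 2 for f in STRIDE_FREQUENCIES], sizes)

# Competing description: size as a linear function of f itself.
_, _, r2_linear = linear_least_squares(STRIDE_FREQUENCIES, sizes)

print(round(k1), round(k2))  # 141 35
```

With real data, the two r² values would be compared as in the text. Here the inverse quadratic model fits perfectly by construction.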
Fig. 2.2. Means across all 16 observers in Experiment 1. The estimated size is plotted for each stride frequency. Error bars indicate SEM. The curve shows the fit of the theoretical model. The coefficient of determination between the function and the means across all observers is r² = 0.96.
When focusing on the patterns of results obtained from each observer, clear
interindividual differences in consideration of the spatio-temporal scaling relation
become obvious, as is indicated by the variability of k1 (Fig. 2.3). Substantial individual
differences are also evident in the correlation between the empirical data and the model
fit (Table 2.1). Out of 16 observers estimating the size of the animated dogs, 9 showed a
response pattern correlating significantly with the model fit. The response patterns of the
others failed to reach a level of significant correlation. One of the observers (T.B.)
reaching a significant level of correlation between his response pattern and the model fit
interpreted the temporal scaling factor in opposition to the expected direction. This
observer associated high stride frequencies with large sizes and low stride frequencies
with small sizes, resulting in a negative value for k1.
Fig. 2.3. Mean estimated size for each observer for each stride frequency in Experiment 1 (n = 11). Error bars indicate SEM. The curve shows the model fitted to each observer individually. *p < 0.05; **p < 0.01 indicates the level of significance of the correlation between the model and individual size estimations.
Consistent size information could be retrieved by 50% of the observers in the setting
realized in Experiment 1. This outcome indicates substantial interindividual differences
in the ability to retrieve information from the spatio-temporal scaling relation. Such an
outcome might have at least two possible sources. One explanation is that some
observers neglect the spatio-temporal scaling relation in their estimations and rely only
on other size cues. Alternatively, some observers may not have
understood the relationship between changes in vertical position and spatial depth and
therefore had major difficulty indicating their size impression adequately within this
experimental setup.
Table 2.1. Experiment 1: Parameters of the theoretical model (Equation 6) fitted to the data of individual participants. r² = coefficient of determination. *p < 0.05; **p < 0.01.
Participant   k1     k2    r²
F.N.          308    27    0.38**
K.S.          358    23    0.62**
L.J.          47     56    0.00
C.O.          324    14    0.75**
Z.K.          155    23    0.45**
M.H.          286    39    0.17**
I.L.          8      59    0.00
L.M.          341    16    0.66**
H.B.          -26    48    0.05
T.R.          93     37    0.05
T.B.          -111   60    0.12**
M.V.          76     37    0.08
J.B.          104    36    0.11*
R.R.          -11    31    0.01
A.S.          215    17    0.61**
S.B.          93     44    0.04
We inferred perceived size by requiring observers to adjust the position of the dog
animation in the landscape. One objection to this task could be that the phenomenon of
visual depth compression might cause perceptual distortions of the otherwise well-
defined relation between the distance of an object and its projected size. However, as
Sedgwick (1993) points out, this would not affect frontal plane dimensions of a
projected object. In addition, the scene provides reference objects at different depths.
The observers therefore did not have to rely on distance provided by depth cues alone.
The size of the dog could be indicated simply in relation to the size of the cactuses and
posts scattered around the scene.
Moreover, from the setting realized in Experiment 1, neither the weight factor λ
providing information about the individual weights of both sources of information
(static versus dynamic) nor the constants c1 and c2 can be calculated directly, because λ
is confounded with the constant scaling factors c1 and c2 (Equation 5). The constant k1,
which combines λ and c1, only weakly reflects the extent to which the temporal
scaling relation is taken into account. As a consequence of the issues discussed above, we
designed a second experiment. In this experiment, observers were allowed to directly
change the size of the dog. While this may facilitate indication of perceived size for the
observers, it also rules out any remaining concerns about depth-compression effects.
Furthermore, a second subtask was added to deal with the confounding of
the weight factor with the scaling factors.
2.3 Experiment 2
In this experiment, we changed the mechanism for indicating perceived size. Observers
could change perceived size of the dog directly by changing its projected size while its
position in spatial depth remained constant. In the supplementary task, with the goal to
get a direct size estimate based on cues independent of the stride frequency, observers
were requested to estimate apparent size of a static stick-figure depiction to derive a
direct measure of c2 in Equation 5. In combination with measurements k1 and k2
obtained from the first part of Experiment 2, this was used to derive values for λ and c1.
By this procedure, we are able to separate size information from static and dynamic
sources and to calculate how the sources of information are integrated.
2.3.1 Method
2.3.1.1 Participants
Sixteen students (8 females and 8 males) between the ages of 19 and 32 years from the
psychology department of the Ruhr-University participated in this experiment. None of
these participants had participated in Experiment 1. Participants received course credit
for their participation. All participants had normal or corrected-to-normal vision. They
were naive as to the objectives of this experiment.
2.3.1.2 Stimuli
Stimuli were identical to the ones used in Experiment 1 with the exception that rather
than displaying the dog with constant projected size at 21 different positions in depth,
this time we generated 21 differently sized dogs and displayed all of them at the same
position. The range of apparent sizes covered by this mechanism was the same as in the
previous experiment. The visual angle of the dog animation varied from 2.2 deg for the
smallest animation to 12.4 deg for the largest animation. The pixel size of the dots
describing the positions of the main joints and their shadows on the ground was
adjusted accordingly. As in Experiment 1, five different stride frequencies were used:
2.54, 3.02, 3.59, 4.27, and 5.08 cycles/s. For the second part of the experiment, we
generated a static stick-figure depiction of the point-light display on the perspective
background used before. The stick figure was positioned in the middle of the screen.
Dots belonging to adjacent joints were connected, illustrating the articulation of the
joints (Fig. 2.1).
2.3.1.3 Procedure
The procedure in the first subtask in Experiment 2 was performed similarly to the one
used in Experiment 1. The only difference was the mechanism for indicating size.
Observers’ instructions were similar to the ones in the former experiment, but were
adapted to the new procedure. Six demonstration trials preceded the 55 experimental
trials, in which observers gave their size estimates by choosing the dog with the size
that looked most natural. Observers were given no feedback following their size
judgments. The experiment was conducted using a one-factorial repeated-measures
within-subjects design. In each of the five different frequency conditions, 11 repeated
trials were presented. Each trial started with different initial sizes covering the whole
range of possible sizes. The order of the 55 trials was randomized individually for each
participant. Having completed the first part of the experiment, participants were
instructed about the second subtask, in which they were presented with 11 trials
showing static stick-figure displays of a dog. Observers were explicitly told that all
stick-figure displays were based on the same animal, varying only in their initial display
size and the phase of the stride cycle. Using the arrow keys on the
computer keyboard, their task was to indicate the size of the stick-figure dogs by the
same mechanism as in the first subtask.
2.3.2 Results and discussion
The results of the first part of this experiment were analyzed as in Experiment 1. Similar
to the previous experiment, on average across all observers, dogs moving with high
stride frequency were estimated to be smaller than dogs moving with low stride
frequency (Fig. 2.4). This effect was significant as tested by an ANOVA (F(4,60) =
20.67, p < 0.001).
Fig. 2.4. Means across all 16 observers in Experiment 2. The estimated size is plotted for each stride frequency. Error bars indicate SEM. The curve shows the fit of the theoretical model. The coefficient of determination between the function and the means across all observers is r² = 0.98.
This finding again supports the spatio-temporal scale hypothesis. The following
function provides the best fit between the theoretical model and the empirical data:
$$ s = 189 \cdot \frac{1}{f^2} + 38 $$
The coefficient of determination between this function and the means of estimated sizes
across all observers was r² = 0.98. A linear fit to the same means yields r² = 0.94.
Comparing the proposed model fit with a linear fit, the proposed model leaves only 2%
of the variance unexplained, whereas the linear fit leaves 6% of the variance
unexplained. The median of the static figure size estimations of each observer in the
second subtask was taken as value for c2, representing size information independent of
any temporal scaling cue. On average across all observers, c2 assumes a value of 61.47
cm. The standard deviation of 13.90 cm is relatively small, indicating a generally
uniform behavior in this subtask. Individual measures for c2 were used to determine the
weight factor λ = 1 - k2 / c2 and the spatio-temporal scaling factor c1 = k1 c2 / (c2 - k2)
for each observer, according to Equation 5 (Table 2.2).
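For reference, both expressions follow directly from the parameter definitions k1 = λ c1 and k2 = (1-λ) c2 that relate Equation 6 to Equation 5. A short derivation, written here as a sketch in LaTeX notation:

```latex
\begin{align*}
k_2 &= (1-\lambda)\,c_2 &&\Rightarrow\quad \lambda = 1 - \frac{k_2}{c_2},\\
k_1 &= \lambda\,c_1     &&\Rightarrow\quad c_1 = \frac{k_1}{\lambda}
     = \frac{k_1\,c_2}{c_2 - k_2}.
\end{align*}
```

As a worked example using the values of observer N.K. (k2 = 12, c2 = 71.34): λ = 1 - 12/71.34 ≈ 0.83.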
Table 2.2. Experiment 2: Characteristics of the theoretical model (Equation 5) fitted to the data of individual participants. Note: k1 = λ c1; k2 = (1-λ) c2. c2 was derived from the median of the size estimations per observer given in the static stick-figure trials. r² = coefficient of determination. *p < 0.05; **p < 0.01.
Participant   k1     k2    c1        c2      λ       r²
A.C.          491    20    732.84    60.00   0.67    0.71**
J.A.          186    30    413.33    55.00   0.45    0.25**
H.O.          219    38    521.43    65.43   0.42    0.29**
U.A.          204    34    340.02    84.82   0.60    0.24**
A.A.          -13    69    86.67     60.00   -0.15   0.00
S.I.          68     53    566.67    60.00   0.12    0.02
J.N.          143    49    572.01    65.43   0.25    0.08
N.K.          427    12    514.46    71.34   0.83    0.81**
C.N.          184    33    408.89    60.00   0.45    0.52**
D.M.          -5     70    -20.83    92.50   0.24    0.00
P.P.          97     33    440.91    42.41   0.22    0.18**
M.H.          -27    43    128.57    35.67   -0.21   0.06
A.G.          205    34    427.03    65.43   0.48    0.26**
C.K.          180    29    382.98    55.00   0.47    0.36**
M.K.          283    21    435.38    60.00   0.65    0.45**
J.C.          373    33    1065.71   50.43   0.35    0.18**
The individual response patterns again showed considerable inter-individual differences
in the use of the spatio-temporal scaling factor (Fig. 2.5). In this experiment, a very
clear division into two groups became apparent. Whereas for 11 out of 16 observers the
correlation with the proposed model was highly significant (p < 0.01), there was no
correlation at all for the remaining 5 observers (p > 0.05). Showing very flat curves,
these observers did not seem to pay any attention to the different stride frequencies.
Their response patterns appeared entirely insensitive to the
independent variable (i.e., the stride frequency). Two observers (J.N. and D.M.) also
showed very large variances across similar stimulus repetitions, which indicates that
they responded in a disoriented manner. Observers from this group also gave the largest
and smallest values for the size of the statically displayed dog. Consequently, for some
of them, very low (and in two cases even negative) values for λ are obtained.
Fig. 2.5. Mean estimated size for each observer for each simulated stride frequency in Experiment 2 (n = 11). Error bars indicate SEM. The curve shows the model fitted to each observer individually. *p < 0.05; **p < 0.01 indicates the level of significance of the correlation between the model and individual size estimations.
Disregarding the five participants who did not show any meaningful behavior, the results
show that the inverse quadratic relation between characteristic size and stride frequency
is employed by the visual system when estimating the size of an animal in the absence
of other cues.
2.4 General discussion of both experiments
As summarized above, previous experimental work has shown that observers are able to
judge object size in inanimate dynamic systems governed by gravity. The experiments
reported here provide the first empirical evidence that those findings can be extended to
the domain of animate motion as well. The human visual system uses the physically
determined relation between spatial and temporal scales to obtain the size of a moving
animal in the absence of other cues. In both experiments conducted to test the spatio-
temporal scale hypothesis, we found the predicted effect of stride frequency on
perceived size. Nevertheless, when investigating the individual size estimations in terms
of the parameters of the proposed model, substantial interindividual differences became
evident. These differences were more pronounced in Experiment 1 than in Experiment
2. The results obtained in the modified setting show that observers retrieved the motion-
mediated size information more efficiently. The data show less intersubject variability
and larger values for k1 when compared to Experiment 1, in which we had attempted to
provide a method for transforming observers’ size impression into a corresponding
response while maintaining a constant retinal size of the stimulus.
In the two experiments reported here, we presented observers with a single scaling
relation between time and space and required them to judge spatial scale
based on temporal variations. One might argue that observers simply assign numbers to
the temporal variations without really detecting these variations as information about
scale. However, if this were the case, one would expect observers to assign the direction
of the mapping between time and space arbitrarily. Only one of 32 observers showed a
reversed correlation between perceived size and stride frequency. Moreover, we found a
quadratic relation rather than a simple linear one, which reflects the physical properties
of the temporal spatial relation. Simply assigning numbers to temporal variations would
probably lead to a linear relation instead of a quadratic one.
Altogether, seven observers in Experiment 1 and five observers in the optimized setting
in Experiment 2 neglected the temporal-spatial scaling relation by showing a random
pattern in their results. A reason for this pattern of results might be the methodological
approach. We used a method similar to Pittenger (1985), in which participants were
given only timing as information about spatial scale in pendulum motion. Pittenger’s
results were similar to the current results in that they were noisy with strong individual
differences. In a related study concerning pendulum motion (Pittenger, 1990), the
observers were given precise information about spatial scale, but the timing of the event
was manipulated to be either consistent or inconsistent with the pendulum law. Rather
than having to readjust the correct timing, observers had to judge only its correctness.
Observers performed with high accuracy on this task. According to Pittenger’s results,
observers seem to be more sensitive to violation of the temporal-spatial scaling relation
than to transforming temporal information about spatial parameters into size judgments.
A similar effect may have also played a role in our setup.
Given constant stride length, a higher stride frequency goes along with a higher
locomotion speed. One might be concerned about this confounding of stride frequency
and locomotion speed, arguing that the current results could depend on simple
translational speed rather than on the details of the gait itself. In a previous study
(Jokisch, Midford, & Troje, 2001), we used point-light displays of BM of dog
animations, having subtracted the translational motion component. Consequently, the
position of the point-light animal remained constant in the center of the screen. Varying
the stride frequency, we found a significant effect on perceived size. Therefore, we are
confident that the crucial source conveying size information in the experiments we are
reporting here is the stride frequency itself. Nevertheless, we cannot entirely exclude
that translational speed may contribute to the size judgment. In a natural display, stride
frequency, locomotion speed, and stride length cannot be unconfounded. However, we
did not aim to resolve the details of the perceptual cues used to derive
size from BM. Instead, we wanted to test whether the human visual system is able to
employ the relation between temporal and spatial scales, which is physically defined
through gravitational acceleration.
Human observers seem to be able to employ the general inverse quadratic relation
between size and stride frequency to derive information about size from temporal
parameters. In addition to this qualitative result, the measurements taken in Experiment
2 can also be used to make quantitative comparisons between the absolute size indicated
by the observers and the size of real animals that walk with the respective stride
frequencies. The relation between size and stride frequency of walking animals is
expressed by the factor c1 in Equation 3. Summarizing the results of Experiment 2, we
compute c1 as the median of the 11 observers that did respond in a consistent manner.
The resulting value amounts to 435 cm s⁻². Unfortunately, the only set of data that we
are aware of which can be used to derive the spatio-temporal relation factor from natural
locomotion patterns is the one reported by Pennycuick (1975), who compared stride
frequencies and shoulder heights of 14 African quadruped mammal species for different
gait patterns. The smallest animal in this study (Thomson’s gazelle) had a shoulder
height of 60 cm; the largest one (elephant) had a shoulder height of 310 cm. From
Pennycuick’s Fig. 13, we calculated c1 to amount to 410 cm s⁻² for cantering animals.
This value is very close to the one obtained from our data.
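The median reported above can be reproduced from the c1 column of Table 2.2. The sketch below selects the 11 observers whose r² is marked ** in that table; the variable names are hypothetical.

```python
from statistics import median

# c1 values (cm s^-2) from Table 2.2 for the 11 observers whose responses
# correlated significantly with the model.
C1_CONSISTENT = {
    "A.C.": 732.84, "J.A.": 413.33, "H.O.": 521.43, "U.A.": 340.02,
    "N.K.": 514.46, "C.N.": 408.89, "P.P.": 440.91, "A.G.": 427.03,
    "C.K.": 382.98, "M.K.": 435.38, "J.C.": 1065.71,
}
C1_PENNYCUICK_CANTER = 410.0  # cm s^-2, value derived in the text from Pennycuick (1975)

c1_median = median(C1_CONSISTENT.values())
relative_difference = (c1_median - C1_PENNYCUICK_CANTER) / C1_PENNYCUICK_CANTER

print(c1_median)  # 435.38
```

The relative difference between the two values is about 6%.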
The close match between the empirical data for cantering animals (Pennycuick, 1975)
and the data obtained in our experiments seems to imply that the human visual system
not only takes into consideration the general inverse quadratic relation between stride
frequency and size but also takes advantage of implicit knowledge about the
particular observed gait pattern. We want to note, however, that the good quantitative fit
between Pennycuick’s and our data may well be accidental. There are a number of
factors that introduce uncertainty into the absolute value of the spatio-temporal scaling
factor c1 as derived from our experiments. For instance, the perceived height of the
reference objects in the scenery may deviate from their “real” height. The posts were
intended to have a height of 1 m and the cactuses a height of 2 m. Those numbers were
given to the observers in their introduction to the experiment. However, the reference
objects may still have been perceived to be larger or smaller, changing the reference
frame used to indicate the dog’s size. Another critical point is the determination of the
constant c2 in Equation 5. In the second subtask of Experiment 2, we tried to measure
the perceived size as given by cues that are independent from stride frequency. We did
that by asking the observers to estimate the size of a static stick-figure display.
However, this procedure may not be sufficient to accurately derive the desired
information. It is still possible that a moving dog provides cues about its size that
are not available in the static display but that nevertheless do not depend on the stride
frequency. A last factor that adds uncertainty is the fact that living animals, even if they
try to minimize energy consumption during locomotion, are still different from
inanimate dynamic systems. In a swinging pendulum or a bouncing ball, the relation
between temporal and spatial parameters is exactly defined by gravity, because no other
forces affect these motions. In contrast, in dynamic animate systems, muscular forces
controlled by intentional behavior play an important role. They are used not only to
compensate for damping effects in the articulated pendulum system of the body;
they can also be used to significantly alter the motion pattern to cover a wider range of
stride frequencies within a given gait pattern.
In summary, we can state that human observers are able to employ implicit knowledge
about the general inverse quadratic relation between size and stride frequency to derive
information about the size of an animal from temporal parameters. The exact scaling of
this relation is dependent on a number of parameters that are beyond the control of our
current experiments. We are therefore cautious about the close agreement of
our data with quantitative predictions involving knowledge about the biomechanics of
particular quadruped gaits. It would be interesting, however, to measure whether the
perceived size of animals traveling with a given stride frequency changes in a
predictable way as a function of the gait pattern.
Chapter III Encoding and Recognition of Biological Motion
57
III Study 2: Structural Encoding and
Recognition of Biological Motion:
Evidence from Event-related Potentials
and Source Analysis
Daniel Jokisch, Irene Daum, Boris Suchan1 and Nikolaus F. Troje
Summary
In the present study we investigated how different processing stages involved in the
perceptual analysis of biological motion (BM) are reflected by modulations in event-
related potentials (ERP) in order to elucidate the time course and location of neural
processing of BM. Data analysis was carried out using conventional averaging
techniques as well as source localization with low resolution brain electromagnetic
tomography (LORETA). ERPs were recorded in response to point-light displays of a
walking person, an inverted walking person and displays of scrambled motion.
Analysis yielded a pronounced negativity with a peak at 180 ms after stimulus onset
which was more pronounced for upright walkers than for inverted walkers and
scrambled motion. A later negative component between 230 and 360 ms after stimulus
onset had a larger amplitude for upright and inverted walkers as compared to scrambled
walkers. In the later component, negativity was more pronounced in the right
hemisphere revealing asymmetries in BM perception. LORETA analysis yielded
evidence for sources specific to BM within the right fusiform gyrus and the right
superior temporal gyrus for the second component, whereas sources for BM in the
early component were located in areas associated with attentional aspects of visual
processing. The early component might reflect the pop-out effect of a moving dot
pattern representing the highly familiar form of a human figure, whereas the later
component might be associated with the specific analysis of motion patterns providing
biologically relevant information.
1 Boris Suchan contributed to this study by providing assistance with performing the LORETA analysis and discussing the results.
3.1 Introduction
The human visual system is very sensitive to the detection of animate motion patterns.
We can efficiently detect another living being in a visual scene, recognize human
action patterns and attribute many features of psychological, biological and social
relevance to other persons. An experimental approach for studying information from
biological motion (BM) with reduced interference from non-dynamic cues is to
represent the main joints of a person’s body by bright dots against a dark background
(Johansson, 1973). From such point-light displays, observers can easily recognize a
human walker, determine his/her gender (Barclay et al., 1978; Cutting, 1978;
Kozlowski & Cutting, 1977; Mather & Murdoch, 1994; Troje, 2002a), recognize
various action patterns (Dittrich, 1993), identify individual persons (Cutting &
Kozlowski, 1977) and even recognize themselves (Beardsworth & Buckner, 1981). The
evolutionary importance of the perception of animate motion patterns has led to the
development of a specific neural machinery as shown by several brain imaging studies
using fMRI and PET (Bonda et al., 1996; Grezes et al., 2001; Grossman & Blake,
2001, 2002; Grossman et al., 2000; Servos et al., 2002; Vaina et al., 2001). These
studies report selective activation of the superior temporal sulcus (STS) to visual
stimuli consisting of BM. In addition to area STS, activation specific to BM has also
been shown in the cerebellum (Grossman et al., 2000; Vaina et al., 2001), area VP
(Servos et al., 2002), the amygdala (Bonda et al., 1996) and the occipital and fusiform
face area (Grossman & Blake, 2002). Activity in the STS-region is not induced by
meaningful and coordinated non-BM such as the pendulum movements of a
grandfather clock (Pelphrey et al., 2003). A dissociation between visual processing of
moving humans and moving manipulable objects was also supported by Beauchamp
and colleagues (2003) by showing STS activity to human point-light and video displays
in contrast to activity in the middle temporal gyrus to tool video and point-light
displays.
According to a neural model based on BM perception studies, both the dorsal and
ventral processing streams contribute to the perceptual analysis of BM (Giese &
Poggio, 2003). The ventral form pathway is thought to provide information about
sequences of body postures; the dorsal motion pathway is thought to provide
information about complex optic flow patterns. Data from both pathways are integrated
in the STS region. This region is not only involved in the perception of whole body
movements. Activation of STS is also observed during perception of movements of the
eyes, hand and mouth and even when looking at implied motion in static images
(Allison et al., 2000). With respect to face perception, STS is suggested to be involved
in the processing of dynamic aspects of faces that convey information facilitating social
communication (Haxby, Hoffman, & Gobbini, 2000).
Results from imaging studies as well as from simulations (Giese & Poggio, 2003) are
consistent with neuropsychological findings in neurological patients suffering from
focal brain lesions (Cowey & Vaina, 2000; McLeod et al., 1996; Schenk & Zihl, 1997;
Vaina, 1994; Vaina et al., 1990). These case studies provide evidence for a dissociation
between mechanisms involved in the perception of BM on the one hand, and
mechanisms involved in inanimate visual motion tasks or static object recognition tasks
on the other hand.
One approach to study the neuronal dynamics of perception of action is to measure
activity when presenting stimuli consisting of body movements in full view in which
the display initially stands still and movement onset occurs with a delay. Neural
responses to onset of movements of the mouth and the eyes (Puce et al., 2000) were
observed within 200 ms after motion onset as measured by ERPs. Facial movements
occurring on a continuously present face elicited different N170 amplitudes for mouth
opening versus closing and for eye aversion versus eyes gazing at the observer. Similar
results were found for the observation of whole body actions of others (Wheaton et al.,
2001). ERPs elicited in response to movement onset in movie sequences of body
stepping, hand closing and opening, and mouth opening and closing were selective for
specific hand and body motions.
Findings from ERP and functional imaging studies in humans are complemented by
electrophysiological studies in monkeys. Single cell recordings in the macaque superior
temporal polysensory area (STP) yielded neurons which responded selectively to the
sight of whole body movements as well as to point-light displays of BM (Oram &
Perrett, 1994). Many STS cells integrate information about the form and motion of
animate objects (Oram & Perrett, 1996). Further support for the notion that the
temporal lobe also integrates facial form and motion in humans stems from an fMRI and
ERP study of visual processing of natural and line-drawing displays of moving faces
(Puce et al., 2003). The STS and the fusiform gyrus responded selectively to both types
of face stimuli, and they evoked larger ERPs compared to control stimuli at around 200
ms post motion onset. Puce and Perrett (2003) recently concluded that specialized
visual mechanisms exist in the STS complex of both humans and non-human primates
which produce selective neural responses to moving natural images of faces and
bodies. These mechanisms are also involved in the processing of point-light displays of
BM. Whereas substantial knowledge is available about the neural structures underlying
BM perception, many issues concerning the processing stages and their temporal
characteristics in humans are as yet unclear. Hirai and colleagues (2003) tried to clarify
the neural dynamics in BM perception by comparing ERPs elicited by point-light
displays of BM and scrambled motion. They report that both types of stimuli elicited
peaks at around 200 and 240 ms which were larger in the BM condition than in the
scrambled motion condition.
The aim of the present study was to further elucidate the nature, time course and
location of neural processing involved in the perception of BM by using event-related
potentials and, in addition, low resolution brain electromagnetic tomography
(LORETA). Point-light displays of whole body motion in upright and inverted
orientation served as stimuli in order to focus on the dynamic aspects of body motion
and to reduce form cues from body shape. In contrast to some previous ERP-studies
(Puce et al., 2000; Wheaton et al., 2001) and in accordance with the approach of Hirai and
colleagues (2003), in the current study form information has to be derived from the
information of the motion trajectories. Furthermore, the comparison of upright and
inverted BM aims to provide deeper insight into distinct processing stages associated
with BM since inverted displays convey the same structural information as upright BM
but the detection of an actor is substantially impaired as shown in several
psychophysical experiments (Pavlova & Sokolov, 2000; Sumi, 1984; Troje, 2003).
Control stimuli were displays of scrambled motion in which the dots had the same motion
vectors as in the BM condition but randomized initial positions. We
predicted an inversion effect for BM in the time window up to 200 ms after stimulus
onset as usually found for faces or other stimulus categories for which the observer is
an expert. Moreover we expected an ERP source specific to upright BM in the STS-
complex, concerned with the fine analysis of motion patterns which provide
biologically relevant information and contribute to social perception.
3.2 Methods
3.2.1 Participants
Fifteen healthy volunteers (eight females, seven males; ages 20 to 35 years) participated in
this study, which was undertaken with the understanding and written consent of each
participant. The procedure was approved by the Ethics Committee of the Ruhr-
University Bochum. All participants had normal or corrected-to-normal vision.
3.2.2 Stimuli
The visual stimuli used in this study were obtained from 20 men and 20 women
walking on a treadmill, who served as models to acquire BM data. Data were
recorded in 3D space using a motion capture system (Vicon; Oxford Metrics, Oxford,
UK). A framework which allows using linear methods to transform BM data (Troje,
2002a) was applied. As a result of this procedure, an “average walker” was computed
from our data set and animated as a point-light display. The dots representing the major
joints of the body were located at the ankles, the knees, the hips, the wrists, the elbows,
the shoulders, the center of the pelvis, on the sternum, and in the center of the head.
The point-light displays were presented in frontal view on a black screen either in
upright orientation, inverted orientation (rotated by 180 degrees in the fronto-parallel plane)
or as scrambled motion. Fig. 3.1 illustrates the three categories of visual stimuli. In the
latter condition, the moving dots had the same local motion trajectories as in the
upright BM displays, but their starting position was randomized, destroying the
spatial relation among the dots. The area in which the scrambling occurred was
matched with respect to size to the other stimulus conditions.
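The scrambling procedure can be illustrated with a minimal sketch. It assumes each dot is stored as a list of (x, y) positions per frame and that new starting positions are drawn uniformly from a rectangle of the stimulus size; the function name, the data layout and the uniform sampling are illustrative assumptions, not the procedure actually used to generate the stimuli.

```python
import random


def scramble_walker(trajectories, width, height, seed=None):
    """Return a scrambled copy of a point-light display.

    trajectories: list of dots, each a list of (x, y) positions per frame.
    Every dot keeps its frame-to-frame displacements (local motion) but
    receives a new starting position inside a width x height rectangle
    centred on (0, 0). Uniform sampling is an assumption of this sketch.
    """
    rng = random.Random(seed)
    scrambled = []
    for dot in trajectories:
        x0, y0 = dot[0]
        new_x = rng.uniform(-width / 2, width / 2)
        new_y = rng.uniform(-height / 2, height / 2)
        # Shift the whole trajectory so that it starts at the new position.
        scrambled.append([(x - x0 + new_x, y - y0 + new_y) for x, y in dot])
    return scrambled
```

Because only the starting positions change, each dot's local trajectory is identical to that in the upright display, while the spatial relations among the dots are lost.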
Fig. 3.1. Categories of stimuli: BM in upright orientation (BM), BM in inverted orientation (IBM) and scrambled motion (SCR).
3.2.3 Experimental setup
Participants were seated in a dimly lit, sound-attenuated cabin, with response buttons
under their right hands. A computer screen was mounted at a distance of 90 cm in front
of the participant’s eyes. At this distance the stimuli subtended a visual angle of
approximately 4.1 degrees in height and 1.6 degrees in width. All stimuli were presented
for 800 ms at the center of the screen; successive trials were separated by intertrial
intervals of 2000 ms in which a black screen was presented. Stimulus and motion onset
occurred simultaneously. The experiment consisted of 60 trials per condition, resulting
in 180 trials altogether which were presented in randomized order. Participants were
asked to maintain central eye fixation during the trials and to respond as quickly and
accurately as possible by pressing the right button to dot patterns representing BM (in
upright and inverted orientation) and the left button to scrambled motion. Observers
did not receive feedback on their responses. Before starting the experimental trials,
observers were shown some demonstration trials of all stimulus conditions in order to
familiarize them with the display and the setup.
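The relation between viewing distance, visual angle and physical stimulus size can be checked with a short calculation. The sketch below uses the standard formula size = 2 d tan(angle / 2); the function names are illustrative, and the resulting on-screen sizes (roughly 6.4 cm by 2.5 cm) are derived here rather than reported in the text.

```python
import math


def stimulus_size_cm(angle_deg, distance_cm):
    """Physical extent that subtends angle_deg at a viewing distance of distance_cm."""
    return 2 * distance_cm * math.tan(math.radians(angle_deg) / 2)


def visual_angle_deg(size_cm, distance_cm):
    """Visual angle subtended by an extent of size_cm at distance_cm."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))


height_cm = stimulus_size_cm(4.1, 90)  # stimulus height at 90 cm
width_cm = stimulus_size_cm(1.6, 90)   # stimulus width at 90 cm
```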
3.2.4 EEG-recording
EEG was recorded with Ag-AgCl electrodes mounted in an elastic cap from 30 scalp
sites (F5, FZ, F6, T7, C5, C3, CZ, C4, C6, T8, TP7, CP5, CP3, CP4, CP6, TP8, P7, P5,
P3, PZ, P4, P6, P8, PO7, O1, OZ, O2, PO8, A1, A2) according to the 10-20 system,
referenced to an electrode on the tip of the nose. EOG was recorded from above and
below the left eye as well as from the outer canthi of both eyes. Impedance was kept
below 5 kΩ. A Neuroscan Synamps System with related software was used for
recording. EEG was sampled at 200 Hz and stored on hard disk.
The EEG data were analyzed off-line using the Brain Vision Analyzer software
package. Raw data were digitally filtered with a 0.1 Hz high-pass and a 40 Hz low-pass
filter and segmented into epochs ranging from 200 ms before stimulus onset to 2000
ms after stimulus onset. After removing segments containing artifacts, ocular
correction was carried out. Artifact detection was performed automatically (criterion
±75 µV) and checked visually afterwards. Epochs were baseline-corrected using the
signal during the 200 ms that preceded the onset of the stimulus and averaged
according to the three experimental conditions. Thereafter grand averaged ERPs were
calculated. Since the error rate was generally very low (less than 5%), all trials were
included in the analysis.
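The epoch-level steps described above (artifact rejection, baseline correction and averaging) can be sketched as follows. This is not the Brain Vision Analyzer procedure: it assumes that the ±75 µV criterion is applied to absolute amplitude within an epoch, which the text does not specify, and it omits filtering and ocular correction. At a sampling rate of 200 Hz, the 200 ms baseline corresponds to 40 samples.

```python
def baseline_correct(epoch, n_baseline):
    """Subtract the mean of the first n_baseline samples from every sample."""
    baseline = sum(epoch[:n_baseline]) / n_baseline
    return [value - baseline for value in epoch]


def has_artifact(epoch, threshold_uv=75.0):
    """Flag an epoch if any sample exceeds +/- threshold_uv (assumed criterion)."""
    return any(abs(value) > threshold_uv for value in epoch)


def average_epochs(epochs, n_baseline, threshold_uv=75.0):
    """Reject flagged epochs, baseline-correct the rest and average them."""
    kept = [
        baseline_correct(epoch, n_baseline)
        for epoch in epochs
        if not has_artifact(epoch, threshold_uv)
    ]
    if not kept:
        raise ValueError("all epochs were rejected")
    # zip(*kept) groups the samples of all kept epochs by time point.
    return [sum(samples) / len(kept) for samples in zip(*kept)]
```

Applying `average_epochs` separately to the epochs of each condition yields one ERP per condition and electrode for a given participant.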
3.2.5 Data analysis
Behavioral performance was analyzed by conducting a one-way repeated-measures
ANOVA to determine effects of stimulus condition on response times. ERP effects of
experimental variables were determined by conducting repeated-measures ANOVAs on
ERP peak amplitude values (N170) or ERP mean amplitude values (N300). N170
amplitude was measured as peak amplitude in the 150-200 ms time range using an
automated procedure. N300 amplitude was defined as mean activity in the time range
between 230 and 360 ms after stimulus onset and calculated automatically. Peak
latencies were analyzed for the N170 component and for the positive component
preceding the N300 component in the time window 220-280 ms after stimulus onset.
The ANOVAs were conducted for the factors stimulus category (BM versus inverted
BM versus scrambled motion), selected electrode locations (O1, O2 versus PO7, PO8
versus P7, P8 versus TP7, TP8) and recording side (right versus left). Electrode
selection was based on the main areas of interest as suggested by previous functional
imaging studies (Bonda et al., 1996; Grezes et al., 2001; Grossman & Blake, 2001,
2002; Grossman et al., 2000; Servos et al., 2002; Vaina et al., 2001). Greenhouse-
Geisser adjustments to the degrees of freedom were performed when appropriate.
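The two amplitude measures defined above can be illustrated with a small sketch. It assumes an averaged ERP stored as a list of samples that starts 200 ms before stimulus onset and is sampled at 200 Hz (5 ms per sample); the function names are hypothetical and do not reproduce the automated procedure of the analysis software.

```python
def window_indices(t_start_ms, t_end_ms, epoch_start_ms=-200.0, sfreq_hz=200.0):
    """First and last sample index of the window [t_start_ms, t_end_ms]."""
    step_ms = 1000.0 / sfreq_hz
    first = round((t_start_ms - epoch_start_ms) / step_ms)
    last = round((t_end_ms - epoch_start_ms) / step_ms)
    return first, last


def negative_peak(erp, t_start_ms, t_end_ms, epoch_start_ms=-200.0, sfreq_hz=200.0):
    """Most negative amplitude in the window and its latency in ms."""
    first, last = window_indices(t_start_ms, t_end_ms, epoch_start_ms, sfreq_hz)
    idx = min(range(first, last + 1), key=lambda i: erp[i])
    latency_ms = epoch_start_ms + idx * 1000.0 / sfreq_hz
    return erp[idx], latency_ms


def mean_amplitude(erp, t_start_ms, t_end_ms, epoch_start_ms=-200.0, sfreq_hz=200.0):
    """Mean amplitude across all samples in the window."""
    first, last = window_indices(t_start_ms, t_end_ms, epoch_start_ms, sfreq_hz)
    segment = erp[first:last + 1]
    return sum(segment) / len(segment)
```

With these helpers, the N170 measure corresponds to `negative_peak(erp, 150, 200)` and the N300 measure to `mean_amplitude(erp, 230, 360)`.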
3.2.6 LORETA-Analysis
LORETA (Pascual-Marqui et al., 1999; Pascual-Marqui, Michel, & Lehmann, 1994)
calculates the current density at each voxel in the gray matter and the hippocampus of a
reference brain as a linear, weighted sum of the scalp electric potentials. LORETA
chooses the smoothest of all possible current density configurations throughout the
brain volume. This procedure assumes only that neighboring voxels have
maximally similar activity; no other constraints are used. LORETA images represent
the electrical activity at each voxel as the squared magnitude of the computed
current density.
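The smoothness constraint can be stated compactly. The formulation below follows the notation commonly used for LORETA and is given only as an illustration, since no formula appears in the text: \Phi denotes the vector of scalp potentials, K the lead field matrix, J the current density, B a discrete spatial Laplacian operator and W a diagonal weighting matrix; the superscript + denotes the Moore-Penrose pseudoinverse.

```latex
\hat{J} = \arg\min_{J} \lVert B W J \rVert^{2}
\quad \text{subject to} \quad \Phi = K J ,
\qquad
\hat{J} = T \Phi , \quad
T = (W B^{\top} B W)^{-1} K^{\top}
    \left[ K (W B^{\top} B W)^{-1} K^{\top} \right]^{+}
```

Because T does not depend on the data, the estimated current density is a linear, weighted sum of the scalp potentials, as stated above.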
Amplitudes of the N170 component and mean activity between 230 and 360 ms (N300)
were exported from the results as ASCII data for LORETA analysis. For each subject
and each condition, one LORETA image was generated. These images were converted
(http://www.ihb.spb.ru/~pet_lab/L2S/L2SMain.htm) for further analysis with SPM99
(http://www.fil.ion.ucl.ac.uk/spm/). In SPM99 a PET/SPECT design with a two sample
t-test was performed. The following parameters were used for the analysis: global
normalisation with proportional scaling and proportional scaling to a global mean = 50,
absolute threshold masking with an analysis threshold set to 0 and global calculation of
mean voxel value (within per image). The level of significance was set to p < 0.03.
Foci of significant differences were transformed into Talairach space (Talairach &
Tournoux, 1988) using the algorithm suggested by Brett
(http://www.mrc-cbu.cam.ac.uk/Imaging/mnispace.html) for anatomical labelling.
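The coordinate conversion can be sketched as a piecewise linear transform that treats points above and below the AC plane differently. The coefficients below are those widely circulated for Brett's MNI-to-Talairach conversion and are given as an assumption; they should be checked against the cited page before use.

```python
def mni_to_talairach(x, y, z):
    """Approximate MNI -> Talairach conversion (piecewise linear in z).

    Coefficients are the commonly quoted ones for Brett's transform and
    are an assumption of this sketch, not taken from the text.
    """
    if z >= 0:
        # Points at or above the AC plane.
        return (0.9900 * x,
                0.9688 * y + 0.0460 * z,
                -0.0485 * y + 0.9189 * z)
    # Points below the AC plane are compressed more strongly in z.
    return (0.9900 * x,
            0.9688 * y + 0.0420 * z,
            -0.0485 * y + 0.8390 * z)
```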
3.3 Results
3.3.1 Behavioral performance
Correct identification of the stimulus category exceeded 95% in all three
conditions. RTs in the upright BM condition were
significantly shorter than in the other conditions (F(2,20) = 27.66; p < 0.001). RTs in
the inverted BM condition and in the scrambled motion condition did not differ
significantly from each other as tested post hoc with Bonferroni adjusted measures
(Fig. 3.2). RT analysis is based on 11 out of 15 subjects, because of missing data in
four subjects.
Fig. 3.2. Reaction times for correct responses in the three experimental conditions. Error bars indicate SEM.
3.3.2 ERP effects
Fig. 3.3A shows grand averaged ERPs in response to the three experimental conditions.
In the latency window up to 400 ms after stimulus onset, two distinct components
emerged in all experimental conditions. The first negative component peaks on average
at a latency of 183 ms. In accordance with previous studies (Puce & Perrett, 2003; Puce
et al., 2000; Puce et al., 2003; Wheaton et al., 2001), this component was termed N170.
The later negative component is located in the time window between 230 and 360 ms
after stimulus onset. We refer to this second negative component as N300. Fig. 3.3B
shows the difference waveforms of BM minus scrambled motion as well as inverted
BM minus scrambled motion to illustrate the differential amplitudes of BM and
inverted BM compared to scrambled motion. These difference waveforms reach their
largest amplitudes earlier than the N170 peak and the N300 peak respectively.
3.3.2.1 N170 amplitude and latency
N170 amplitude was assessed individually as peak amplitude within the 150-200 ms
latency window. ANOVA yielded significant main effects of stimulus condition
(F(2,28) = 9.97; p = 0.001) and electrode location (F(2,28) = 4.42; p = 0.044) as well as
a significant interaction (F(2,28) = 3.31; p = 0.018). Differences between stimulus
conditions were tested post-hoc with Bonferroni adjusted measures yielding a
significant difference between the BM and scrambled motion conditions (p = 0.003)
and between the BM and inverted BM conditions (p = 0.029). The main effect of
electrode locations is due to generally smaller peak amplitudes at parieto-temporal
electrodes in comparison to the other sites. The interaction between stimulus condition
and electrode location reflects larger peak amplitude differences between upright BM
and the other conditions for posterior electrodes than for anterior electrodes. Analysis
of peak latencies for the N170 component (Table 3.1) did not yield significant
differences between the three conditions (F(2,28) = 0.32; p = 0.969).
Fig. 3.3. A) Grand-averaged ERPs recorded at lateral posterior electrodes (left hemisphere: O1, PO7, P7 and TP7; right hemisphere: O2, PO8, P8 and TP8) in response to BM (solid lines), inverted BM (dotted lines) and scrambled motion (dashed lines). Arrows indicate mean peak latency of the N170 and the N300 component. B) Difference waveforms obtained by subtracting ERPs to scrambled motion from ERPs to BM (solid line) and by subtracting ERPs to scrambled motion from ERPs to inverted BM (dotted line).
Table 3.1. Peak latencies for the N170 component and the positive peak preceding the N300 component (means and SEM in ms).
                                      BM            IBM           SCR
Mean N170 peak latency                184.3 (2.7)   181.6 (3.1)   182.9 (3.1)
Mean positive peak latency pre N300   246.1 (3.4)   247.7 (4.9)   257.9 (3.0)
3.3.2.2 N300 amplitude and latency
The later negative component (N300) was quantified as mean amplitude within the
230-360 ms latency window. The N300 component had a larger amplitude for upright
and inverted walkers as compared to scrambled walkers (F(2,28) = 14.90; p < 0.001). A
post-hoc test with Bonferroni adjusted measures revealed a significant difference
between the BM and scrambled motion conditions (p = 0.001) and between the
inverted BM and scrambled motion conditions (p = 0.003). In addition, there was a
significant interaction between stimulus condition and side of recording (F(2,28) =
4.21; p < 0.027). This interaction is due to larger differences in mean amplitudes
between BM and scrambled motion at right temporo-parietal and parietal electrodes.
This indicates a pronounced right-hemispheric advantage in visual processing of BM,
particularly at later processing stages.
In addition to amplitude differences, significant differences in the latencies (Table 3.1)
of the positive peak preceding the N300 component emerged (F(2,28) = 4.70; p <
0.024). The peak was delayed in the scrambled motion condition, whereas the latencies
for BM and inverted BM did not differ significantly. Moreover, there was also a
significant effect of peak amplitudes for this positive peak (F(2,28) = 18.76; p < 0.001)
with a pronounced positivity for scrambled motion. Amplitude differences between
BM and inverted BM were not significant.
3.3.2.3 Source analysis
Source analysis was performed for two reasons: to consider the data from all
electrodes during the two time windows of interest and to get an approximation of the
ERP component sources. Results of the LORETA analysis are listed in Table 3.2. It has
to be pointed out that the spatial resolution of this analysis is not as high as that of
fMRI experiments. Source analysis was performed separately for the N170 and
N300 component.
For the N170 component, the contrast between upright BM and scrambled motion and
the contrast between upright BM and inverted BM were calculated, since the ERP
amplitudes between those conditions differed significantly from each other. The
contrast between upright BM and scrambled motion revealed sources in the posterior
cingulate gyrus and in the left lingual gyrus. The upright BM versus inverted
BM contrast yielded three distinct sources: one in the posterior cingulate gyrus, one in
the area of the subcallosal gyrus/precuneus and another in the right occipital gyrus. Fig.
3.4 illustrates schematically the location of the N170 sources for both contrasts.
Table 3.2. Results of LORETA analysis for N170 and N300 contrasts showing Talairach space coordinates, probable Brodmann areas (BA) in the range of 3 mm and levels of significance.
Region                                        BA        x     y     z    p
N170 BM vs SCR
  Lingual gyrus                               10      -24    79    -7    0.016
  Posterior cingulate gyrus                   23, 30   -3   -44    22    0.024
N170 BM vs IBM
  Subcallosal gyrus, Precuneus                        -16   -64    23    0.014
  Posterior cingulate gyrus                   30       -3   -51    16    0.014
  Middle occipital gyrus                      19       46   -78    11    0.024
N300 BM vs SCR
  Fusiform gyrus, Cerebellum                  6        32   -45   -15    0.001
  Subcallosal gyrus, Anterior cingulate gyrus 6        -3     9   -11    0.009
  Medial frontal gyrus, Rectal gyrus          25, 11  -10    16   -19    0.015
  Superior temporal gyrus                              52   -37    16    0.011
N300 IBM vs SCR
  Inferior frontal gyrus, Middle frontal gyrus 47, 11  25    29   -12    0.004
  Medial frontal gyrus, Rectal gyrus          25, 11   -3    16   -18    0.013
  Anterior cingulate gyrus                    25       -3    16    -6    0.013
  Superior temporal gyrus                              52   -37     9    0.027
Fig. 3.4. LORETA-analysis: Group comparison of absolute current density values between the upright BM condition and the scrambled motion condition (BM-SCR) and between the upright BM condition and the inverted BM condition (BM-IBM) for the N170 peak. Three percent p-value threshold.
For the N300 component, the contrast between upright BM and scrambled motion and
the contrast between inverted BM and scrambled motion were calculated. The largest
contrast between upright BM and scrambled motion emerged for the right fusiform
gyrus. In addition, a source within the right superior temporal gyrus was estimated.
Additional sources were observed in the orbitofrontal cortex (subcallosal gyrus and
anterior cingulate gyrus; rectal gyrus and medial frontal gyrus). Results for the inverted
BM versus scrambled motion contrast yielded four distinct sources. One source was
located in the right superior temporal gyrus, the other sources were located in
orbitofrontal brain areas (rectal gyrus and medial frontal gyrus; and anterior cingulate
gyrus) and in the inferior frontal gyrus. Locations of the N300 sources for both
contrasts are illustrated in Fig. 3.5.
Fig. 3.5. LORETA-analysis: Group comparison of absolute current density values between the upright BM condition and the scrambled motion condition (BM-SCR) and between the inverted BM condition and the scrambled motion condition (IBM-SCR) in the time range 230-360 ms after stimulus onset (N300). Three percent p-value threshold.
LORETA analysis yielded evidence for sources specific to BM within the right
fusiform gyrus and the right superior temporal gyrus for the second component,
whereas sources specific for BM in the early component were located in areas
associated with attentional aspects of visual processing (posterior cingulate cortex).
Additional sources generating the second component were located in orbitofrontal
brain areas (anterior cingulate gyrus, medial frontal gyrus).
3.4 Discussion
The current results present clear evidence for the involvement of two distinct
processing stages in the visual analysis of BM: an early negative component (N170)
peaking at 180 ms after stimulus onset and a later negative component (N300) in the
time window between 230 and 360 ms after stimulus onset. The N170 component is
modulated differently by upright BM in comparison to inverted BM and scrambled
motion. The difference between upright and inverted BM reflects an inversion effect in
BM perception. The amplitude of the N300 component did not differ significantly
between inverted and upright BM stimuli, but was less pronounced for scrambled
motion, indicating similar processing for upright and inverted BM conditions in later
processing stages. Whereas the sources generating the N170 component are mainly
located in posterior areas near the midline, the later N300 component is generated by
sources in the superior temporal gyrus and the fusiform gyrus in the right hemisphere.
In our experimental setup, onset of stimulus presentation and onset of motion occur in
parallel. Therefore, the ERPs are evoked in response to stimulus onset as well as in
response to motion onset and consequently both processes may contribute to the neural
responses recorded in the ERPs. The neural basis of motion perception has been
studied psychophysiologically using motion-onset VEPs (Bach & Ullrich, 1994, 1997;
Hoffmann, Unsold, & Bach, 2001). Visual motion onset evokes VEPs at two major
sites, the occipital/occipital-temporal sites and the vertex. At occipital sites, visual
motion onset per se evokes ERP components that are dominated by a minor positivity
(P1) around 100-130 ms and a pronounced negativity (N2) around 150-200 ms. The
negative component represents motion mechanisms as shown by its susceptibility to
motion adaptation, while the positive component is more likely associated with form-
processing mechanisms. In visual perception of faces or man-made objects like houses,
stimulus onset elicits a negative component around a latency of 170 ms (Eimer,
2000b). Taking these factors into consideration, the first negative ERP component
(N170) obtained in the present experiment may reflect the contribution of both
processes. Since both processes are inherent features of BM, the relative contribution
of each process to ERPs is difficult to estimate.
BM processing occurs with very short latencies. Visual processing needed to perform
the highly demanding task of discrimination between upright BM patterns on the one
hand and scrambled motion and inverted BM on the other hand can be achieved within
a time period of 180 ms. In static images of natural scenes, the processing required to
decide whether the scene contains an animal can even be performed within 150 ms, as
measured by ERP modulation (Thorpe, Fize, & Marlot, 1996). Processing speed cannot
be improved by extensive training (Fabre-Thorpe, Delorme, Marlot, & Thorpe, 2001),
and seems to be limited by the underlying neural mechanism. The present study is
concerned with motion rather than with static stimuli representing animate versus
inanimate stimuli. Given a monitor frame rate of 60 Hz, the visual system needs at least
2/60 s ≈ 33 ms to integrate two frames, which is a prerequisite for generation of a
coherent percept in stimuli of BM. Because of the short latencies obtained in the
present study, it seems likely that the processing of BM must be based on highly
automated feed-forward mechanisms.
Nevertheless, we cannot rule out completely that static cues played a minor role
in the discrimination task. The three classes of stimuli can be discriminated on
the basis of static versions of the displays since the spatial arrangement is slightly
different for upright, inverted and scrambled stimuli. Such a discrimination of static
displays would be based purely on geometrical cues and is clearly different from the
discrimination based on the percept of a walking human figure which is only evoked by
animated displays of BM. Moreover, we found sources of the ERP signal in the second
component in brain areas which were reported to be selectively involved in BM
perception in several imaging studies (Bonda et al., 1996; Grezes et al., 2001;
Grossman & Blake, 2001, 2002; Grossman et al., 2000; Servos et al., 2002; Vaina et
al., 2001) but were not reported to be activated by static displays consisting of a few
dots which match joint positions.
The ERP effects are in accordance with psychophysical findings showing a strong
inversion effect in perception of BM (Bertenthal et al., 1987; Dittrich, 1993; Dittrich,
Troscianko, Lea, & Morgan, 1996; Mitkin & Pavlova, 1990; Pavlova & Sokolov, 2000;
Sumi, 1984; Troje, 2003) as well as in face perception (e.g., Thompson, 1980; Troje,
2003; Valentine, 1988). As expected, the ERP effects are mirrored by behavioral data,
showing shorter response times for the detection of upright walkers in comparison to
inverted walkers and scrambled motion. Taking into account the longer response times
for inverted BM and scrambled motion, similar modulations in the early ERP-
component in response to inverted BM and scrambled motion can be expected. The
pronounced effect for upright BM, therefore, probably reflects the pop-out effect of
BM in familiar orientation that leads to shorter reaction times for the detection of this
type of visual information. This pop-out effect might be associated with the global
recognition of a human person depicted as a point-light display as previously described
(Bertenthal & Pinto, 1994).
A pop-out phenomenon resulting from perceptual experience requires the involvement
of high-level areas, since complex visual stimuli such as point-light displays of BM
must be integrated in neural populations having large receptive fields. Due to the short
latency of the N170 component, this can only be achieved by a feedforward
mechanism. This interpretation is in accordance with the reverse hierarchy theory of
Hochstein and Ahissar (2002), which suggests that “vision at a glance” matches a
high-level, generalized, categorical scene interpretation by neural processing along the
feedforward hierarchy of areas leading to increasingly complex representations. High-
level spread attention associated with “vision at a glance” subserves the initial, crude
global percept of the gist of a scene. The pop-out effect is only one aspect of this crude
initial assessment. In contrast, for later “vision with scrutiny”, reverse hierarchy
routines focus attention to specific units incorporating detailed information available
there into conscious perception. Therefore, fine discrimination depends on re-entry to
low-level specific receptive fields to bind features.
In the N170 component, sources generating both the contrast between BM and inverted
BM and the contrast between BM and scrambled motion are located in the posterior
cingulate cortex. This area seems to have different attention-related functions. It has
been suggested that the cingulate region may establish a neural interface between
attention and motivation (Small et al., 2003). In addition, activity in the posterior
cingulate cortex was correlated with the speed of detecting a visual target when it was
preceded by a predictive cue (Mesulam, Nobre, Kim, Parrish, & Gitelman, 2001). We
assume that the posterior cingulate cortex might reflect high-level spread attention
subserving neural processing leading to the global percept. The finding of sources
generating the contrast between BM and the other conditions in attention-related areas
also fits well with recent behavioral data suggesting that attention is required for the
visual analysis of point-light displays (Cavanagh et al., 2001; Thornton et al., 2002).
Cavanagh and colleagues (2001) showed that discrimination of specific features of
point-light displays of BM seems to be a serial process since reaction times increased
with the number of items. The reaction time increase was attributed to increasing
attentional demands of the task. Results in a dual task paradigm to explore the role of
attention in the processing of BM (Thornton et al., 2002) suggested that, in some cases,
perception of BM can be automatic. But if strategies operating in a global, top-down
fashion are required, attentional demands play a vital role.
The N300 component might reflect processes associated with “vision with scrutiny”
according to the reverse hierarchy theory (Hochstein & Ahissar, 2002) that are
responsible for the fine analysis of BM patterns which is necessary to retrieve visual
information of social and psychological relevance. This view is in agreement with
source localization relating to the contrast between BM and scrambled motion in the
second component. These sources are located in the right superior temporal gyrus and
the right fusiform gyrus. For the contrast between inverted BM and scrambled motion,
there was only one source located in the superior temporal gyrus. In several brain
imaging studies, the STS complex as well as the fusiform face area were shown to be
involved in perception of BM (Bonda et al., 1996; Grezes et al., 2001; Grossman &
Blake, 2001, 2002; Grossman et al., 2000; Servos et al., 2002; Vaina et al., 2001). The
present results from ERPs and source analysis are in accordance with the findings from
these brain imaging studies.
In addition to STG and FFG, we also found sources within the anterior cingulate cortex
and the orbital prefrontal cortex for the second component. Several studies have shown
that the anterior cingulate cortex is involved in response selection (Bunge, Hazeltine,
Scanlon, Rosen, & Gabrieli, 2002), monitoring of performance and conflict evaluation
(Carter et al., 1998) and regulation of attention (Posner & DiGirolamo, 1998). The
anterior cingulate cortex source may thus be more related to cognitive processes
associated with the task requirements (to make a decision between two response
alternatives) than to perceptual analysis of various stimulus classes. A number of
studies (Adolphs, 2001) provided evidence that the orbitofrontal cortex plays a crucial
role in social cognition and has strong interconnections to the STS-complex as well as
to the fusiform gyrus which are engaged in social perception. In the present study,
sources in the orbitofrontal cortex might also be related to these neural mechanisms.
The ERP correlates of the inversion effect in BM perception are opposite to inversion
effects obtained in face recognition paradigms, in which inverted faces elicit a higher
negativity in the N170 component than upright faces (Rossion & Gauthier, 2002).
When comparing upright faces with control stimuli (consisting of houses etc.), faces
elicit a more pronounced N170 effect (Eimer, 2000a, 2000b). In the present procedure,
we used scrambled motion as control stimuli. There were no differences in the early
component between scrambled and inverted BM, indicating that these two stimulus
types are processed similarly during initial processing stages. In addition, the mean
peak latency of the first component was 180 ms. This latency is longer than the peak
latencies usually reported for face perception (Bentin, Allison, Puce, Perez, &
McCarthy, 1996). This difference may be a consequence of an extended integration
time for the detection of form conveyed by motion which is not required for static
image perception.
The finding of hemispheric asymmetries in STG and FFG in later processing of BM
representing human gait patterns is also consistent with evidence from fMRI studies
(Grezes et al., 2001; Grossman et al., 2000), which reported pronounced activity in the
right hemisphere associated with the perception of BM. This finding may be related to
information useful for the recognition of individual human features. These displays
contain such sources of information and provide important cues for social interaction.
Our results are in accordance with the ERP findings of Hirai and colleagues (2003)
who reported peaks at around 200 and 240 ms after stimulus onset which were larger in
response to BM than to scrambled motion. In contrast to our study, they used only two
experimental conditions (BM versus scrambled motion) and did not include source
analysis. Their displays were presented in profile view and were generated by a
computer algorithm. There are, however, differences concerning the latency of the first
component which was on average shorter in our paradigm than in the study by Hirai
and colleagues (2003). Our study extends this work by demonstrating inversion effects
in early visual processing reflecting a pop-out effect with upright walkers, but not in
later processing stages associated with the fine analysis of BM. In addition, our
findings offer some evidence for the brain areas associated with the two components.
A recently published paper (Pavlova et al., 2004) analyzed gamma MEG activity in
response to BM generated from a computer algorithm. Recognizable upright and non-
recognizable inverted walkers evoked enhancements in oscillatory gamma brain
activity (25-30 Hz) over the left occipital cortices as early as 100 ms from stimulus
onset. Upright BM elicited further gamma response over the parietal (130 ms) and right
temporal (170 ms) lobes. Whereas the temporal order and approximate localization of
brain areas showing synchronized firing patterns in the gamma band is in accordance
with our findings, the absolute timing of the occurrence of gamma activity is different
from the timing of the ERP-components found in our study. This difference might be
related to differences in recording techniques (MEG versus EEG) and analysis methods
(frequency analysis versus ERP).
Taken together, our findings suggest two distinct components of BM processing,
recruiting different neuronal populations. The first processing stage (N170) reflects the
generation of a global percept of the visual scene leading to a pop-out effect of upright
BM. The sensitivity to upright BM is the result of the familiarity of BM in normal
orientation. At the level of the second processing stage (N300), brain areas such as STG and
FFG, which are known from fMRI studies to be involved in the fine and detailed
perceptual analysis of BM, play an important role. The evidence for fast, efficient
processing underlines the importance of perception of BM, and provides further
evidence for a specific neural network involved in processing biologically relevant
motion signals. The right-hemispheric dominance associated with BM perception
shows clear parallels to asymmetries in face perception and probably reflects the social
relevance of animate motion perception. Furthermore, STG and FFG are primarily
involved in the second ERP component (N300) and show clear hemispheric
asymmetries in the perceptual analysis of BM only during later processing stages.
Chapter IV Viewpoint-dependent Recognition of Biological Motion
80
IV Study 3: Self Recognition versus
Recognition of Others by Biological
Motion: Viewpoint-dependent Effects
Daniel Jokisch, Irene Daum and Nikolaus F. Troje
Summary
In the present study we investigated the influence of viewing angle on recognition
performance of walking patterns of one’s own person and familiar individuals such as
friends or colleagues. Viewpoint-dependent recognition performance was tested in two
groups of twelve persons who knew each other very well. Participants’ motion data
were acquired by recording their walking patterns in three-dimensional space using a
motion capture system. Size-normalized point-light displays of biological motion of
these walking patterns, including one’s own, were presented to the same group
members on a computer screen in frontal view, half profile view and profile view.
Observers were requested to assign the person’s name to the individual gait pattern
being presented without receiving feedback.
Whereas recognition performance for one’s own walking patterns was viewpoint-independent,
the recognition rate for other familiar individuals was better for frontal and
half profile view than for profile view. Viewpoint-dependent recognition effects for
other people might be due to selective attention to approaching people, leading to
preferential exposure to frontal and half profile views of gait patterns. The finding of
viewpoint-independent representation of one’s own movement patterns might be related to a
crossmodal transfer from motor to visual representations.
4.1 Introduction
Animate motion patterns are among the most biologically salient events. Humans can
efficiently detect another living being in a visual scene and retrieve many features of
psychological, biological and social relevance. The ability to identify, interpret, and
predict the actions of others is of particular relevance and essential for successful social
interaction. Visualizing the positions of the main joints of a walking human body by
bright dots against a dark background (Johansson, 1973) yields information from
biological motion (BM) with reduced interference from non-dynamic cues. From such
point-light displays, observers can easily recognize a human walker within 200 ms
(Johansson, 1976), determine his/her gender (Barclay et al., 1978; Cutting, 1978;
Kozlowski & Cutting, 1977; Mather & Murdoch, 1994; Troje, 2002a) and recognize
various action patterns (Dittrich, 1993). Dynamic cues from walking patterns also
contain sufficient information to recognize identity, if observers are familiar with the
persons to be presented (Cutting & Kozlowski, 1977), and even to recognize oneself
from a recorded point-light display of one’s own movements (Beardsworth & Buckner,
1981). Recognition performance in the latter studies was significantly above chance
level, but information from BM failed to provide a cue for identity as reliably as facial
information or voice information. When presented with gait patterns from six familiar
persons, recognition performance varied between 35–40% for correct identifications of
other persons, whereas the recognition rate for one’s own gait pattern was 60%.
Perceptual analysis of BM is performed by a specific neuronal network (Bonda et al.,
1996; Grezes et al., 2001; Grossman & Blake, 2001, 2002; Grossman et al., 2000;
Servos et al., 2002; Vaina et al., 2001) which involves both the dorsal motion pathway
as well as the ventral form pathway (Giese & Poggio, 2003). Data from both pathways
are integrated in a region around the superior temporal sulcus. Whereas there exists a
large body of literature about the impressive ability of the visual system to derive a
coherent percept of a human body from a small number of moving dots, the principles
underlying information encoding and retrieval are not yet fully understood.
One important aspect is the viewpoint from which a walker is seen and its influence on
our ability to extract information from BM. The knowledge about viewpoint-dependent
recognition effects may provide insight into the mental representations and perceptual
mechanisms of BM processing. There is an ongoing debate whether visual
representations of objects are viewpoint-dependent (Bulthoff & Edelman, 1992; Tarr &
Bulthoff, 1995) or viewpoint-invariant (Biederman & Gerhardstein, 1993, 1995).
Viewpoint-invariance indicates that object recognition is independent of the
viewpoint of previous exposure to the object, whereas viewpoint-dependence results in
better object recognition when presented in a familiar perspective.
The Recognition-by-Components approach of viewpoint-invariance (Biederman &
Gerhardstein, 1993) is restricted to inanimate objects which fulfill specific criteria
(objects must be decomposable into viewpoint-invariant parts, so-called “geons”;
structural descriptions of different objects must be distinctive; identical structural
descriptions over different viewpoints). Other approaches (Bulthoff & Edelman, 1992)
hold that viewpoint-dependence is valid for all object classes, independent of specific
features. To reconcile both theories, viewpoint-independent recognition may occur at a
basic level, whereas viewpoint-dependence applies to a subordinate level. This view is
supported by Foster and Gilson (2002). They showed that both image-based as well as
structural representations can play a role, dependent on the object class and the level of
object specificity. Recognition processes based on localized features seem to be more
viewpoint-dependent, and generalization is limited (Bulthoff & Edelman, 1992).
Consistent with the latter theory, viewpoint-dependence has been shown for unfamiliar
faces (Hill & Bruce, 1996; Hill, Schyns, & Akamatsu, 1997; Troje & Bulthoff, 1996)
and for familiar faces, with slightly longer response times to profile views than to frontal
views (Bruce, Valentine, & Baddeley, 1987). Observers’ performance is poorer at
recognizing their own profile (which is an unfamiliar view for one’s own face)
compared with a frontal view, whereas there is no difference in response time between
frontal and profile views of faces of highly familiar individuals (Troje & Kersten,
1999). Taken together, these results provide strong evidence for the viewpoint
dependency of recognition of identity.
In the domain of BM there is so far only one study on viewpoint-dependent recognition
of identity in an artificial learning paradigm (Troje et al., in press). This study yielded
an overall advantage of frontal views compared to profile and half-profile views.
Change of viewpoint from training to test resulted in a performance decrease.
Viewpoint dependence has also been investigated in the context of gender classification
based on BM information (Mather & Murdoch, 1994; Troje, 2002a). As in the above
mentioned study, observers derived more information about the gender of a walker
from frontal view.
It is as yet unclear whether the results from Troje et al. (in press) also apply to
ecological settings. In other words, little is known about the representation of gait
dynamics of familiar persons such as colleagues and friends with whom we interact in
daily life. As we usually do not see our own gait patterns from a third person view, it is
also of interest whether there is a dissociation between the mental representation of our
own gait patterns and the representation of another familiar person. This question is of
special relevance, since evidence from neurophysiological and imaging studies suggests
a common coding between perception and action (Blakemore & Decety, 2001; Decety
& Grezes, 1999; Rizzolatti & Fadiga, in press; Rizzolatti et al., 2001) which may have
distinctive implications for the neuronal representation of one’s own movement patterns.
The direct matching hypothesis (Rizzolatti et al., 2001) postulates that we understand
actions by mapping the visual representation of observed actions on the motor
representations of the same action. According to this view, observation of an action
induces resonance in the motor system of the observer. In the premotor cortex of
monkeys “mirror neurons” were found that discharge when the monkey performs
specific hand actions and also when it observes another individual performing the same
action (Gallese et al., 1996). There is evidence that a “mirror system”, similar to that
described in the monkey, also exists in humans. In contrast to the monkey mirror
system, the human analogue is more flexible since it reacts not only to goal directed
actions but also shows resonance behavior to intransitive, i.e. not object-directed,
actions. Evidence for such a flexible mirror system comes from studies applying
transcranial magnetic stimulation (Fadiga et al., 1995; Gangitano et al., 2001), MEG
studies (Hari et al., 1998) and from functional brain imaging (Buccino et al., 2001;
Iacoboni et al., 1999).
The present study addressed three issues. The first aim was to investigate recognition
of walking patterns from familiar individuals such as friends or colleagues represented
as point-light displays (PLD) within a larger sample of different gait patterns and with
a more sophisticated presentation technique than in previous studies. The second aim
was to elucidate viewpoint-dependent effects in the representation of gait dynamics by
exploring the influence of viewing angle on recognition performance. The third aim
was to examine differences in the representation of one’s own gait pattern and gait
patterns of other persons.
4.2 Methods
4.2.1 Stimuli.
Two groups of 12 participants each served as models to acquire the motion data. All of
them were staff at the Ruhr-University of Bochum (10 females, 14 males; ages 21 to 42
years). Individuals in each group knew each other well from daily interaction in the
working environment. Motion data of the participants were acquired by recording their
walking patterns in three-dimensional space using a motion capture system equipped
with 9 CCD-cameras (Oxford Metrics, Vicon 512). Participants were instructed to walk
at a comfortable speed through the capture volume which was 7 m long. A set of 41
retroreflective markers was attached to their bodies. The system tracks the positions of
the markers with a spatial accuracy in the range of 1 mm and a temporal resolution of
120 Hz. From these 41 markers the trajectories of 15 “virtual” markers positioned at
major joints of the body were computed. Commercially available software
(Bodybuilder, Oxford Metrics) for biomechanical modeling was used to obtain the
respective computations. Translational motion was subtracted and the data were
normalized in size. Eventually, they were animated as point-light displays such that the
walkers seemed to walk as if on a treadmill. By fitting a Fourier series to the data (Troje,
2002b), the displays could be looped continuously to allow a variable presentation time.
The dots representing the major joints of the body were located at the ankles, the knees,
the hips, the wrists, the elbows, the shoulders, the center of the pelvis, on the sternum,
and in the center of the head.
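The looping step can be illustrated with a minimal sketch. It assumes a single coordinate of a single marker, sampled at equal intervals over exactly one gait cycle; the function names, the number of harmonics and the synthetic trajectory are illustrative assumptions and not the procedure described by Troje (2002b). Because the fitted series is periodic, it can be evaluated at any phase, which is what allows an open-ended presentation time.

```python
import math

def fit_fourier(samples, n_harmonics):
    """Fourier coefficients for equally spaced samples covering one full cycle.

    Requires len(samples) > 2 * n_harmonics. Under that condition the
    projections below coincide with the least-squares fit.
    """
    n = len(samples)
    mean = sum(samples) / n
    coeffs = []
    for k in range(1, n_harmonics + 1):
        a = 2.0 / n * sum(s * math.cos(2 * math.pi * k * i / n)
                          for i, s in enumerate(samples))
        b = 2.0 / n * sum(s * math.sin(2 * math.pi * k * i / n)
                          for i, s in enumerate(samples))
        coeffs.append((a, b))
    return mean, coeffs

def evaluate(mean, coeffs, phase):
    """Evaluate the series at any phase (in gait cycles); periodic with period 1."""
    value = mean
    for k, (a, b) in enumerate(coeffs, start=1):
        value += (a * math.cos(2 * math.pi * k * phase)
                  + b * math.sin(2 * math.pi * k * phase))
    return value

# Synthetic (hypothetical) trajectory: 120 samples over one cycle.
n_samples = 120
trajectory = [1.0 + 0.5 * math.sin(2 * math.pi * i / n_samples)
              + 0.1 * math.cos(4 * math.pi * i / n_samples)
              for i in range(n_samples)]
mean, coeffs = fit_fourier(trajectory, n_harmonics=2)
```

Evaluating `evaluate(mean, coeffs, phase)` with a phase that keeps increasing from frame to frame yields a seamless loop, since phases 0.3 and 1.3 give the same position.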
The displays were presented in frontal view (FV, 0°), half profile view (HV, 30°) and
profile view (PV, 90°) as white dots on a black computer screen (Fig. 4.1). The walkers
subtended 6.4 deg of visual angle at the viewing distance of 90 cm. They were
computed in real time on a frame by frame basis and synchronized with the 60 Hz
refresh rate of the 19” CRT monitor to ensure smooth, regular motion. Stimuli were
presented using Matlab with the Psychophysics Toolbox extensions (Brainard, 1997;
Pelli, 1997).
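As a worked illustration of the stimulus geometry, the following sketch converts the reported visual angle and viewing distance into a physical extent on the screen. The function name is an illustrative assumption; only the values 6.4 deg and 90 cm come from the text.

```python
import math

def size_from_visual_angle(angle_deg, distance_cm):
    """Physical extent (cm) that subtends angle_deg at distance_cm."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)

# Walker extent for 6.4 deg of visual angle at a viewing distance of 90 cm.
walker_cm = size_from_visual_angle(6.4, 90.0)
print(f"{walker_cm:.2f} cm")
```

Under these values the walker spans roughly 10 cm on the screen.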
Fig. 4.1. Categories of stimuli: Point-light display in frontal view (FV), half-profile view (HV) and profile view (PV).
4.2.2 Participants
Twenty of the 24 subjects who supplied the motion data participated as observers in the
experiment (9 females, 11 males; ages 21 to 42 years) which was undertaken with the
understanding and written consent of each participant. All subjects had worked in one
of two different laboratories for at least six weeks, saw each other daily and knew each
other well by name.
4.2.3 Procedure
Before the experiment was started the procedure was explained in detail to the
observers and they were shown a list of names of all people to be presented, including
their own. Three blocks of 36 trials (12 gait patterns x 3 orientations) were presented in
randomized order. Consecutive blocks were separated by a short break. Stimulus
presentation time was not limited.
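The block structure can be sketched as follows. This is a hypothetical reconstruction that assumes the order was randomized independently within each block; the text does not specify this detail, and the function and variable names are illustrative.

```python
import random

VIEWS = ("FV", "HV", "PV")  # frontal, half profile, profile

def make_block(n_walkers=12, views=VIEWS, rng=None):
    """Return one block: every walker in every view once, in random order."""
    if rng is None:
        rng = random.Random()
    trials = [(walker, view) for walker in range(n_walkers) for view in views]
    rng.shuffle(trials)
    return trials

# Three blocks of 36 trials each; seeds are fixed only to make the sketch reproducible.
blocks = [make_block(rng=random.Random(seed)) for seed in range(3)]
```

Each block then contains 12 × 3 = 36 trials, and every gait pattern appears once per orientation within a block.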
Each display remained on the screen until the observers indicated that they had
recognized the gait pattern by pressing a response button. Then a list containing name
buttons of all persons being presented appeared on the screen. Observers were asked to
indicate the name of the person by button press. Then the next trial started. Observers
did not receive feedback about their responses.
4.2.4 Data analysis
Overall recognition performance was analyzed using a one-way ANOVA to determine
effects of viewing angle on percentage of correct identifications. Only blocks 2 and 3
were included in the analysis. The first block served to familiarize the subjects with the
displays and to show them the whole range of different gait patterns in the sample.
Greenhouse-Geisser adjustments to the degrees of freedom were performed when
appropriate. Because of the experimental design, self recognition and recognition of
others are represented asymmetrically in this analysis. Recognition performance of
one’s own gait pattern and of gait patterns of others were therefore analyzed separately
in order to elucidate viewpoint-dependent effects for each group. For recognition of
others, a further ANOVA was conducted. By contrast, recognition of one’s own gait
pattern was analyzed non-parametrically because recognition rate was not distributed
normally.
4.3 Results
On average, mean scores for correct identification were 28.5% for frontal view, 26.9%
for half profile view, and 19.4% for profile view. Analysis of overall recognition
performance by a repeated-measures ANOVA revealed a significant effect of viewing
angle on recognition performance (F(2,38) = 7.10, p = 0.003). For frontal views (p =
0.012) and half profile views (p = 0.018) recognition performance was significantly
better in comparison to profile views as tested post hoc with Bonferroni adjusted
measures (Fig. 4.2).
Fig. 4.2. Overall percentage of correct identification for frontal (FV), half-profile (HV) and profile view (PV). The dashed line indicates chance level.
The separate comparison of recognition performance of one’s own gait and of gait
patterns of other persons indicated that only recognition of others’ gait patterns was
viewpoint-dependent (F(2,38) = 7.57, p = 0.002). Rates of correct identification were
28.6% (FV), 26.6% (HV) and 18.4% (PV). Again, for frontal views (p = 0.008) and
half profile views (p = 0.018) percentage of correct identification was significantly
better in comparison to profile views.
By contrast, recognition performance of one’s own gait pattern was almost at the same
level in all viewing conditions. Observers identified their own gait pattern correctly in
27.5% of the trials in the frontal view condition and in 30% of the trials in the half
profile view as well as in the profile view condition (Fig. 4.3). Statistical analysis
revealed no significant effect of viewing angle on recognition rate (χ²(2) = 0.08, p =
0.960). Statistical power of the χ²-test was analyzed post-hoc by means of the software
G-Power (Erdfelder, Faul, & Buchner, 1996). The following parameters were used for
analysis: effect size ω was set to 0.33, which indicates a medium effect size according
to Cohen (1988) and corresponds to a population effect of 10%; the α-level was set to
0.05. For these parameters the statistical power was 0.91 and, therefore, the
null hypothesis was accepted.
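The reported p-value and power can be checked with a short sketch that uses only the closed forms available for a χ² distribution with even degrees of freedom. The sample size of 120 is an assumption (20 observers × 3 views × 2 analyzed blocks of self-recognition trials); the text does not state the N entered into G-Power, and the function names are illustrative.

```python
import math

def chi2_sf_df2(x):
    """Upper-tail probability of a central chi-square with 2 df."""
    return math.exp(-x / 2.0)

def central_chi2_cdf_even(x, df):
    """CDF of a central chi-square with even df (closed-form finite sum)."""
    half = x / 2.0
    term, total = 1.0, 0.0
    for i in range(df // 2):
        if i > 0:
            term *= half / i          # builds half**i / i! incrementally
        total += term
    return 1.0 - math.exp(-half) * total

def power_chi2_df2(w, n, alpha=0.05, max_terms=200):
    """Power of a 2-df chi-square test for effect size w and n observations."""
    lam = n * w * w                   # noncentrality parameter
    crit = -2.0 * math.log(alpha)     # critical value for 2 df
    weight = math.exp(-lam / 2.0)     # Poisson weight for j = 0
    cdf = 0.0
    for j in range(max_terms):
        if j > 0:
            weight *= (lam / 2.0) / j
        # Noncentral chi-square as a Poisson mixture of central chi-squares.
        cdf += weight * central_chi2_cdf_even(crit, 2 + 2 * j)
    return 1.0 - cdf

p_self = chi2_sf_df2(0.08)            # p-value for chi2(2) = 0.08
power = power_chi2_df2(0.33, 120)     # N = 120 is an assumed sample size
```

With these inputs `p_self` is about 0.96, matching the reported value, and under the assumed N the power should come out close to the reported 0.91.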
Fig. 4.3. Overall percentage of correct identification for frontal (FV), half-profile (HV) and profile view (PV) separated for self recognition and recognition of others. Number of trials are different for self recognition versus recognition for others. The dashed line indicates chance level.
4.4 Discussion
The current results present further evidence that kinematic cues from BM provide
information about personal identity which can be transferred from real life experience
to reduced point-light displays of BM. Recognition performance in this study was
found to be three times higher than chance level. Nevertheless, information from BM
failed to provide a highly reliable cue for individual identification, if walking patterns
of familiar persons had not been seen before as point-light displays. The process
involved in individual recognition of identity from BM is clearly different from other
processes which derive information about identity like face recognition. In comparison
to earlier studies (Beardsworth & Buckner, 1981; Cutting & Kozlowski, 1977) absolute
recognition rates in the current study were lower. However, in these studies,
the number of walkers was lower, too. If recognition performance is considered with
respect to chance level performance, the differences between the current study and the
previous studies become marginal and even reverse. Recognition of others is 3.0 times
higher and self recognition is 3.51 times higher than chance level in the current study
as compared to ratios of 2.27 in the study by Cutting and Kozlowski (1977), and 3.48
(self recognition) and 1.89 (recognition of others) in the study by Beardsworth and
Buckner (1981). Moreover, mere size, which was not accounted for in the older
studies, could not be used as a cue in the current study since stimuli were normalized with
respect to the walker’s size.
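The comparison with chance level can be retraced from the rates given in the Results section. The sketch below assumes that the ratios were obtained by averaging the three viewing conditions and dividing by a chance level of 100/12 %; the variable names are illustrative. It also checks that the overall rates are consistent with a weighting of eleven other walkers to one own gait pattern per view.

```python
def chance_level(n_alternatives):
    """Chance performance in percent for an n-alternative identification task."""
    return 100.0 / n_alternatives

def ratio_to_chance(percent_correct, n_alternatives):
    return percent_correct / chance_level(n_alternatives)

# Percent correct for FV, HV and PV as reported in the Results section.
others = [28.6, 26.6, 18.4]
own = [27.5, 30.0, 30.0]

ratio_others = ratio_to_chance(sum(others) / len(others), 12)
ratio_own = ratio_to_chance(sum(own) / len(own), 12)

# Overall rate per view, assuming 11 other walkers and 1 own pattern per view.
overall = [(11 * o + s) / 12 for o, s in zip(others, own)]
```

With an unrounded chance level the ratios come out near 2.9 and 3.5, in line with the roughly threefold figures quoted above; small deviations from the quoted values presumably reflect rounding.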
As concerns the role of the viewpoint in recognition of identity of other people,
individual features of gait dynamics can be extracted more efficiently when seen in
frontal or half profile view. This result is in accordance with the findings by Troje and
colleagues (in press). The viewpoint dependency might be due to attention being
automatically drawn to approaching people, resulting in increased exposure to frontal
and half profile views of gait patterns. This finding supports the hypothesis of a viewer
centered representation of BM information from other individuals. In contrast to
viewpoint-dependent recognition for familiar individuals, recognition of one’s own
walking patterns was found to be independent of the viewing angle.
As suggested by the direct matching hypothesis (Rizzolatti et al., 2001), which assumes
a common coding between perception and action, a mechanism different from that in
recognition of others might contribute to the extraction of information from one’s own
movement pattern. This view is supported by the fact that it is quite unusual to watch
one’s own movements from a third-person perspective. As a consequence, we have
little experience with visual feedback from our own locomotion movements.
Exceptions might be rare situations in which one walks towards a mirror while looking
at one’s own mirror image, or watches video sequences showing oneself
walking. Nevertheless, recognition performance of one’s own gait was at the same
level in the frontal and half profile view condition as in the condition to recognize
others. Moreover, recognition performance from profile view was as good as from
other viewing angles and, therefore, exceeded the rate of correct identifications of other
familiar persons. For the recognition of movement patterns of familiar people,
observers have to rely on stored representations of gait kinematics of those persons and
compare them with the actual kinematics provided by the point-light displays. By
contrast, when individuals observe their own movement patterns, they refer to motor
representations associated with their own gait patterns. Referring to motor
representations in order to compare them with visual representations of movements
requires the transfer of BM information from the motor or action system to the visual
system or vice versa.
Such motor representations of the metrics of one’s own movements are clearly stored
in three-dimensional space. In order to compare the visual representations of a sample
of different gait patterns including one’s own with the kinematics of one’s own
movements, we assume that the three-dimensional motor representation of one’s own
kinematics is aligned to any two-dimensional visual representation independent of the
viewpoint of the gait pattern. If there is an exact match between both representations,
one’s own gait is successfully recognized. In case there is no exact match between
visual and motor representations, the identity of the walker has to be determined on the
basis of the visual representation of the familiar gait patterns.
Comparing the present findings on viewpoint dependency in BM perception with those
from face perception (Troje & Kersten, 1999), there is an important distinction.
Whereas for face perception an advantage of frontal view in comparison to profile view
emerges for the recognition of oneself, a similar effect was not observed for
BM. For recognition of other familiar persons the reverse pattern emerged: person
identification did not vary with angle in faces, whereas there is a clear frontal view
advantage for BM perception. This dissociation supports again the assumption that
information conveyed by the motor systems contributes to the perception and
recognition of one’s own movements. Nevertheless, neither the visual information nor the
information from motor representation is perfect given the substantial error rate.
Information from BM in everyday life is used for different purposes, such as estimating
the smoothness and attractiveness of the movements of a possible partner or for the
inference about a person’s emotions and personality traits from the way he or she
moves. Deriving information from motor cognition in the context of self recognition,
on the other hand, might depend on the precision of one’s own body schema or the degree
of experience with physical exercise.
We can confirm earlier findings on person identification from BM. Even though error
rates are rather high, performance is well above chance level. For recognition of
familiar persons the viewing angle plays an important role. Identity information can be
extracted more reliably from frontal and half profile view. Finally, recognition of one’s
own movements is independent of the viewing angle. We hypothesize that this reflects
a cross-modal transfer between visual and motor representations according to the direct
matching hypothesis.
Chapter V Biological Motion Perception in Cerebellar Patients
92
V Study 4: Differential Involvement of
the Cerebellum in Biological and Coherent
Motion Perception
Daniel Jokisch, Nikolaus F. Troje, Benno Koch¹, Michael Schwarz² and
Irene Daum
Summary
Perception of biological motion (BM) is a fundamental property of the human visual
system. It is as yet unclear which role the cerebellum plays with respect to the
perceptual analysis of BM represented as point-light displays. Imaging studies
investigating BM perception revealed inconsistent results concerning cerebellar
contribution. The present study aims to explore the role of the cerebellum in the
perception of BM by testing the performance of BM perception in patients suffering
from circumscribed cerebellar lesions and comparing their performance with an age-
matched control group.
Perceptual performance was investigated in an experimental task testing the threshold
to detect BM masked by scrambled motion and a control task testing detection of
motion direction of coherent motion masked by random noise. Results show clear
evidence for a differential contribution of the cerebellum to the perceptual analysis of
coherent motion perception compared to BM. Whereas the ability to detect BM masked
by scrambled motion was unaffected in the patient group, their ability to discriminate
direction of coherent motion in random noise was substantially affected. We conclude
that intact cerebellar function is not a prerequisite for a preserved ability to detect BM.
Since the dorsal motion pathway as well as the ventral form pathway contribute to the
visual perception of BM, the question remains open whether cerebellar dysfunction
affecting the dorsal pathway is compensated for by the unaffected ventral pathway or
whether perceptual analysis of BM is performed completely without cerebellar
contribution.
¹ Benno Koch contributed to this study by providing assistance with pre-selecting patients on the basis of clinical and MRI characteristics and by performing neurological examinations.
² Michael Schwarz contributed to the discussion of the design and discussion of the results.
5.1 Introduction
Motion patterns characteristic of living beings are termed biological motion (BM).
Detection of such motion patterns is a fundamental property of the human visual
system. Humans can efficiently detect another living being in the visual environment,
and are able to retrieve many features from its kinematics. An experimental approach to
uncouple information from BM from other non-dynamic sources of information is to
represent the main joints of a person’s body by bright dots against a dark background
(Johansson, 1973). Employing this point-light display technique, observers can easily
recognize a human walker, determine his/her gender (Barclay et al., 1978; Cutting,
1978; Kozlowski & Cutting, 1977; Mather & Murdoch, 1994; Troje, 2002a), recognize
various action patterns (Dittrich, 1993), identify individual persons (Cutting &
Kozlowski, 1977) and even recognize themselves (Beardsworth & Buckner, 1981).
The highly adaptive value of an efficient perception of animate motion patterns is
reflected by a specific neural machinery performing perceptual analysis of this kind of
visual information (Bonda et al., 1996; Grezes et al., 2001; Grossman & Blake, 2001,
2002; Grossman et al., 2000; Servos et al., 2002; Vaina et al., 2001). Neuroimaging
studies report selective activation of the superior temporal sulcus (STS) to visual
stimuli consisting of BM. In addition to area STS, activation specific for BM has also
been shown in the cerebellum (Grossman et al., 2000; Vaina et al., 2001), area VP
(Servos et al., 2002), the amygdala (Bonda et al., 1996), the occipital and fusiform face
area (Grossman & Blake, 2002) and the premotor cortex (Saygin et al., 2004). Results
from these studies and from computational modelling (Giese & Poggio, 2003) are
consistent with neuropsychological findings in neurological patients suffering from
focal cortical brain lesions (Cowey & Vaina, 2000; McLeod et al., 1996; Schenk &
Zihl, 1997; Vaina, 1994; Vaina et al., 1990).
The cerebellum has traditionally been viewed as a brain structure subserving skilled
motor behavior. While recent work has suggested a much broader functional role of the
cerebellum with contributions to a wide range of cognitive and perceptual functions
(for review see Daum, Snitz, & Ackermann, 2001; Justus & Ivry, 2001), the role of the
cerebellum in BM perception is unclear. The neuroimaging literature on the role of the
cerebellum with respect to the perception of BM is inconsistent, with some studies
reporting cerebellar involvement (Grossman et al., 2000; Vaina et al., 2001), while
others failed to detect cerebellar activity associated with BM perception (e.g. Grezes et
al., 2001; Grossman & Blake, 2001; Servos et al., 2002). Moreover, there are some
inconsistencies regarding the cerebellar substructure which may be involved in BM
perception. Grossman and colleagues (2000) found cerebellar activity in the anterior
portion near the midline, whereas Vaina and colleagues (2001) reported activity
specific to BM in lateral parts of the cerebellum.
The current study aims to elucidate the functional role of the cerebellum in perception
of BM using a lesion approach, i.e. examining the perceptual performance of patients
with selective cerebellar lesions. Within this context, a particular issue of interest was
the differential cerebellar contribution to visual processing of BM relative to motion
perception per se.
5.2 Methods
Two experimental tasks were administered in order to explore the functional role of the
cerebellum in the perception of BM and to compare its involvement in non-BM
perception. A group of patients with selective ischemic cerebellar lesions was examined
in these tasks. The patients’ perceptual performance was compared to the performance
of an age-matched control group. Perceptual performance was assessed by determining
the threshold for the detection of masked BM and masked non-BM. In the BM task, the
presence or absence of a point-light walker that was masked by dots consisting of
scrambled motion had to be detected. In the non-BM task, observers had to detect the
motion direction of coherently moving dots that were masked by random noise dots.
5.2.1 Participants
Seven cerebellar patients and seven healthy control subjects participated in the
investigation. The patients (ranging from 27 to 68 years, mean age 45.6 years) suffered
from a cerebellar infarction of either the posterior inferior cerebellar artery (PICA), the
anterior inferior cerebellar artery (AICA) or the superior cerebellar artery (SupCA) in
the post-acute state. Main cerebellar symptoms in the acute stage included ataxia,
dysmetria, dysarthria, and impairments of fine motor coordination. Table 5.1 presents a
summary of relevant clinical information. The examination was carried out between 16
and 47 months (mean 27.4 months) after the ischemic event. At this time, patients
suffered only from residual motor impairments.
Table 5.1. Summary of relevant patient information. Asterisks indicate patients being able to solve the control task. (AICA = anterior inferior cerebellar artery; PICA = posterior inferior cerebellar artery; SupCA = superior cerebellar artery)
Patient   Sex   Hemisphere   Lesion type   Location      Time since lesion (months)
Pat 1     M     R            AICA          medial        47
*Pat 2    M     L            PICA          medial        23
*Pat 3    F     LR           SupCA         lateral       20
Pat 4     F     LR           SupCA         lateral       28
Pat 5     F     R            AICA          medial        29
*Pat 6    M     L            PICA          medio-basal   16
Pat 7     M     L            PICA          medial        29
Patients were extensively screened for neuropsychological functioning. Their present
state IQ was assessed by the subtests similarities and picture completion of the short
German version of the Wechsler Intelligence Scale (Dahl, 1972). According to these
subtests, their mean IQ was 113.6 and, therefore, in the average to upper average range.
Patients’ ability to scan the visual field was tested with the subtest visual scanning of a
widely used German attention test battery (Zimmermann & Fimm, 1993). In this
subtest, patients’ performance for search accuracy ranged between percentile score 14
and percentile score 58. This pattern of performance revealed no specific impairment in
visual scanning in our sample of patients.
Healthy control subjects were recruited by advertisement to match the patients with
respect to age (ranging from 24 to 67 years, mean age 45.1 years) and sex. All
participants had normal or corrected to normal vision. The examination was undertaken
with the understanding and written consent of each participant. The study had been
approved by the ethics committee of the Ruhr-University Bochum.
5.2.2 Stimuli
Stimuli in all experimental tasks were presented using Matlab with the Psychophysics
Toolbox extension (Brainard, 1997; Pelli, 1997). In all experimental trials, stimuli were
presented for a duration of 200 ms in order to preclude an effect of fixation shifts.
5.2.2.1 Biological motion detection
Perception of BM was tested with stimuli of point-light walkers in frontal view masked
by noise dots consisting of scrambled motion. Stimuli were presented as white dots on a
black screen (Fig. 5.1). The mask dots had the same local motion trajectories as the dots
defining the point-light displays, but the spatial relation among the dots was removed
by randomizing their initial starting position.
Fig. 5.1. Depiction of the stimuli used in the BM task: Dots connected by lines represent the point-light walker. Remaining dots represent scrambled motion. Lines are only drawn in the figure depiction for the sake of clarity.
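The construction of the scrambled mask can be sketched as follows. This is a minimal Python illustration of the principle described above, not the Matlab/Psychophysics Toolbox code used in the study. The function name scramble_walker, the data layout (one list of per-frame positions per dot) and the uniform placement of starting positions inside a square field are assumptions made for the example.

```python
import random

def scramble_walker(trajectories, field_size=7.4, rng=None):
    """Return mask-dot trajectories with unchanged local motion.

    trajectories: list of dots, each a list of (x, y) positions per frame (deg).
    field_size:   side length of the square display area in deg (assumed layout).
    """
    rng = rng or random.Random()
    mask = []
    for dot in trajectories:
        x0, y0 = dot[0]
        # Draw a new starting position uniformly inside the display area.
        nx, ny = rng.uniform(0, field_size), rng.uniform(0, field_size)
        # Shift the whole trajectory: frame-to-frame displacements stay the same,
        # only the spatial relation to the other dots is destroyed.
        mask.append([(x - x0 + nx, y - y0 + ny) for x, y in dot])
    return mask

# Hypothetical two-dot, three-frame "walker" used only to exercise the function.
walker = [[(0.0, 0.0), (0.1, 0.2), (0.3, 0.1)],
          [(1.0, 1.0), (1.0, 1.5), (0.8, 1.4)]]
mask = scramble_walker(walker, rng=random.Random(1))
```

Only the starting position is constrained to the field in this sketch; how trajectories that cross the field border were handled in the original stimuli is not stated in the text.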
Three male individuals served as walking models for the construction of the point-light
displays. Motion data of the models were acquired by recording their walking patterns
in three-dimensional space using a motion capture system equipped with 9 CCD cameras
(Oxford Metrics, Vicon 512). Models were instructed to walk at a comfortable
speed through the capture volume, which was 7 m long. A set of 41 retroreflective
markers was attached to their bodies. The motion capture system tracks the
three-dimensional trajectories of the markers with spatial accuracy in the range of 1 mm and a
temporal resolution of 120 Hz. From the original 41 markers the trajectories of 15
“virtual” markers positioned at major joints of the body were computed. Commercially
available software (Bodybuilder, Oxford Metrics) for biomechanical modeling was
used to perform the respective computations. Translational motion was subtracted such
that the walkers appeared to walk on a treadmill.
Degree of difficulty of the detection task was manipulated by varying the number of
mask dots from 0 to 60 dots in steps of five dots. Accordingly, thirteen different
degrees of difficulty were obtained. In half of the trials the walker was present, in the
other half of the trials the walker was absent and replaced by the same number of
scrambled dots. Mask dots and random dots were displayed in an area subtending 7.4 x
7.4 deg of visual angle. Within the display area, the position of the point-light displays
as well as the positions of the mask dots were chosen randomly. The walkers subtended
5.5 x 1.5 deg of visual angle at the viewing distance of 57 cm. Point-light displays of
walkers were computed in real time on a frame by frame basis and synchronized with
the 60 Hz refresh rate of the 15” monitor to ensure smooth, regular motion.
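For orientation, the visual angles given above can be converted into physical sizes on the screen. The following Python sketch uses the standard relation between visual angle, viewing distance and object size; the function name visual_angle_to_cm is hypothetical, and only the viewing distance of 57 cm and the angles are taken from the text.

```python
import math

def visual_angle_to_cm(angle_deg, distance_cm=57.0):
    """Physical extent (cm) that subtends angle_deg at distance_cm."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)

# Extents reported in the text (deg of visual angle).
sizes_cm = {
    "display area (side)": visual_angle_to_cm(7.4),
    "walker (larger extent)": visual_angle_to_cm(5.5),
    "walker (smaller extent)": visual_angle_to_cm(1.5),
}
for label, cm in sizes_cm.items():
    print(f"{label}: {cm:.2f} cm")
```

At a viewing distance of 57 cm, one degree of visual angle corresponds to almost exactly 1 cm on the screen, which is presumably why this distance was chosen.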
5.2.2.2 Coherent motion detection
Perception of non-BM was tested with displays of coherent motion in random noise
which were matched with respect to size to the BM stimuli (7.4 x 7.4 deg). Displays
consisted of 200 white dots with a size of 0.05 x 0.05 deg presented on a black screen.
A specified percentage of the dots moved coherently at a speed of 6 deg/second either
to the right or to the left hand side (Fig. 5.2). Signal dots had a limited lifetime of five
frames. After the end of lifetime the dots disappeared and reappeared on the screen at a
location opposite to the direction of movement. The mask dots were positioned randomly
within the display area and had a limited lifetime of two frames in which they were
displayed stationary. After disappearing they reappeared at a new location. The
percentage of signal dots was varied between 65% and 5% in steps of 5% resulting in
thirteen different degrees of difficulty.
Fig. 5.2. Depiction of the stimuli used in the control task (coherent motion in random noise): Direction of coherently moving dots is illustrated by arrows. Remaining dots represent random noise.
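The stimulus parameters given above imply a few derived quantities that are useful when reproducing the display. The short Python sketch below works them out; the constant names are hypothetical, and it assumes that the 60 Hz refresh rate reported for the BM task also applied to the coherent motion task and that the signal percentage refers to the full set of 200 dots.

```python
REFRESH_HZ = 60          # monitor refresh rate (assumed identical in both tasks)
DURATION_MS = 200        # stimulus duration
SPEED_DEG_PER_S = 6.0    # speed of the signal dots
N_DOTS = 200             # total number of dots in the display
SIGNAL_LIFETIME = 5      # lifetime of a signal dot in frames

# Number of frames shown in one trial.
frames_per_trial = DURATION_MS * REFRESH_HZ // 1000

# Displacement of a signal dot from one frame to the next (deg).
step_deg = SPEED_DEG_PER_S / REFRESH_HZ

# Distance a signal dot travels before it is replaced (deg).
path_per_lifetime_deg = step_deg * SIGNAL_LIFETIME

# Signal percentages from 65% down to 5% in steps of 5%.
coherence_levels = list(range(65, 0, -5))

def signal_dot_count(percent):
    """Number of coherently moving dots at a given signal percentage."""
    return round(N_DOTS * percent / 100)

print(frames_per_trial, step_deg, path_per_lifetime_deg, len(coherence_levels))
```

Under these assumptions a trial comprises only a dozen frames, so each signal dot is replaced at most twice during a presentation.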
5.2.3 Procedure
The experiment was carried out in the Klinikum Dortmund and in the Institute of
Cognitive Neuroscience of the Ruhr-University Bochum. All participants were seated
in front of a 15” monitor at a distance of 57 cm with response buttons under their right
hands. Stimuli in both tasks were presented at the center of the screen for 200 ms in
order to preclude eye movements. Successive trials were separated by intertrial
intervals of 2000 ms during which a black screen was presented. Before each trial a
fixation cross was presented for 2000 ms.
Both experiments comprised three blocks of 52 trials each (13 degrees of difficulty x 4
repetitions) resulting in 156 single trials per experimental task. Trials within each block
were presented in random order. The two experimental tasks were presented in
counterbalanced order. Participants were asked to keep central eye fixation and to
respond as accurately as possible by pressing one of the response buttons. Instructions
stressed accuracy rather than speed of responding. Observers did not receive feedback
on their responses. Before starting the experimental trials participants were shown
demonstration trials in order to familiarize them with the display and the setup.
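The trial structure described above (13 degrees of difficulty x 4 repetitions per block, three blocks, random order within each block) can be sketched in Python as follows. The function name build_block is hypothetical, and the sketch does not model how walker-present and walker-absent trials were distributed over the repetitions, since the text only states that each occurred in half of the trials.

```python
import random

def build_block(levels, repetitions=4, rng=None):
    """Return one block: every difficulty level repeated, in random order."""
    rng = rng or random.Random()
    trials = [level for level in levels for _ in range(repetitions)]
    rng.shuffle(trials)
    return trials

# BM task: number of mask dots from 0 to 60 in steps of five.
bm_levels = list(range(0, 61, 5))

# Three blocks per task; the seeds are arbitrary and only make the example reproducible.
blocks = [build_block(bm_levels, rng=random.Random(seed)) for seed in range(3)]
total_trials = sum(len(block) for block in blocks)
print(len(bm_levels), [len(b) for b in blocks], total_trials)
```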
5.2.4 Data analysis
Experimental data were analyzed in two consecutive steps separately for each
experiment. First, the detection threshold for each subject was determined for both
experimental paradigms. The probability of responding correctly by chance was 50% in
both experimental tasks. The threshold was defined as the signal to noise ratio needed
to perform correctly in 75% of the trials per degree of difficulty. To achieve this, a
sigmoidal curve (Boltzmann function) was fitted to the experimental data with the
upper asymptote fixed at 100% performance and the lower asymptote fixed at chance
level corresponding to 50% performance. Fig. 5.3 illustrates this procedure for a single
subject.
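The threshold estimation can be illustrated with a small Python sketch. It fits a Boltzmann sigmoid with asymptotes fixed at 50% and 100% by a coarse least-squares grid search; the fitting routine actually used in the study is not specified, so the grid search, the function names and the synthetic data are assumptions made purely for illustration. With these asymptotes the 75% point coincides with the midpoint parameter x0 of the sigmoid.

```python
import math

def boltzmann(x, x0, slope):
    """Percent correct with asymptotes fixed at 50 (chance) and 100."""
    z = -(x - x0) / slope
    z = max(-700.0, min(700.0, z))  # guard against overflow in exp
    return 50.0 + 50.0 / (1.0 + math.exp(z))

def fit_threshold(levels, percent_correct, n_x0=121, n_slope=100):
    """Coarse least-squares grid search; returns (x0, slope)."""
    lo, hi = min(levels), max(levels)
    best = None
    for i in range(n_x0):
        x0 = lo + (hi - lo) * i / (n_x0 - 1)
        for j in range(1, n_slope + 1):
            for sign in (1.0, -1.0):  # rising or falling psychometric function
                slope = sign * 0.2 * j
                err = sum((boltzmann(x, x0, slope) - p) ** 2
                          for x, p in zip(levels, percent_correct))
                if best is None or err < best[0]:
                    best = (err, x0, slope)
    return best[1], best[2]

# Synthetic BM-like data: performance falls as the number of mask dots rises.
mask_levels = list(range(0, 61, 5))
synthetic = [boltzmann(x, 22.5, -4.0) for x in mask_levels]
threshold, slope = fit_threshold(mask_levels, synthetic)
print(threshold, slope)
```

A negative slope describes the BM task, where performance drops with increasing noise, and a positive slope describes the coherent motion task, where performance rises with the percentage of signal dots.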
Subsequently, a group comparison of the thresholds of the subjects of the experimental
and control group was performed by a t-test for independent measures separately for
each experiment. In addition, the reaction times were recorded. A group comparison of
the median reaction time per subject was performed by a t-test for independent
measures separately for each experiment.
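The group comparison can be sketched in Python as a pooled-variance t-test for independent samples. With seven participants per group this yields 12 degrees of freedom, which matches the t(12) values reported in the Results; whether the pooled-variance form was in fact used is an assumption. The function name independent_t is hypothetical and the threshold values below are invented solely to exercise the function.

```python
import math
from statistics import mean, variance

def independent_t(a, b):
    """Pooled-variance t statistic and degrees of freedom for two samples."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return t, df

# Invented thresholds (not data from the study), seven values per group.
group_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
group_b = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
t_value, df = independent_t(group_a, group_b)
print(round(t_value, 3), df)
```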
Fig. 5.3. Illustration of the procedure to determine the detection threshold for a single subject in BM detection (A) and coherent motion detection (B).
5.3 Results
5.3.1 Biological motion detection
On average, the control group reached the threshold criterion in the BM paradigm at a
noise level of 23.3 masking dots. Cerebellar patients showed a similar performance
reaching the criterion at a noise level of 22.2 masking dots (Fig. 5.4). A group
comparison between experimental and control group revealed no significant difference
in the threshold to detect BM in the current paradigm (t(12) = 0.183; p = 0.858).
Fig. 5.4. BM task: Perceptual threshold as number of mask dots for the detection of a point-light walker masked by scrambled motion. Error bars indicate SEM.
5.3.2 Coherent motion detection
In the control task comprising the detection of the direction of coherent motion in
random noise, control subjects reached the criterion at a signal to noise ratio of 34.40%.
By contrast, only three out of seven cerebellar patients were able to solve the task even
at the highest signal to noise ratio of 65% presented in the experiment. In order to
compare both groups statistically, for those patients who did not succeed in solving the
task, a threshold value of 70% signal to noise ratio was submitted to analysis. It is
important to note that this conservative procedure underestimates the magnitude of the
impairment in direction detection of coherent motion in random noise in the patient
group. Applying this procedure, the patient group needed on average a signal to noise
ratio of 61% to fulfill the criterion (Fig. 5.5). A group comparison revealed a significant
difference between experimental and control group (t(12) = 2.757; p = 0.020).
Fig. 5.5. Control task: perceptual threshold as signal to noise ratio for direction detection of coherent motion masked by random noise. Error bars indicate SEM.
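The substitution procedure allows a simple plausibility check of the reported group mean. The Python sketch below back-calculates the mean threshold of the three patients who solved the task from the figures given in the text; because the group mean of 61% is a rounded value, the result is only approximate, and the variable names are hypothetical.

```python
n_patients = 7            # patients in the group
n_failed = 4              # patients who could not solve the task
substitute = 70.0         # threshold (%) entered for these patients
group_mean = 61.0         # reported (rounded) mean threshold of the patient group

n_solved = n_patients - n_failed
# Sum of all thresholds minus the substituted values, divided among the solvers.
implied_solver_mean = (group_mean * n_patients - substitute * n_failed) / n_solved
print(implied_solver_mean)
```

The implied value lies well above the control group's threshold of 34.40%, so even the patients who solved the task appear to have needed a higher proportion of signal dots than controls on average.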
5.3.3 Relation between biological and coherent motion perception
In order to examine the relation between performance in BM detection and coherent
motion detection, a correlation analysis was calculated. For the patient group, Pearson’s
correlation coefficient between performance in both tasks was –0.293 (p = 0.524). For
the control group, Pearson’s correlation coefficient was 0.116 (p = 0.805). The
correlation between performance in BM and coherent motion detection failed to reach
significance in both groups, although the absolute value of Pearson’s correlation coefficient
was slightly higher in the patient group. According to these results, BM perception and coherent
motion perception seem to be independent processes.
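The reported p-values follow from the correlation coefficients and the group size of seven. The Python sketch below converts r into a t statistic with n - 2 degrees of freedom and integrates the t distribution numerically with Simpson's rule; the function names are hypothetical and the numerical integration is merely one convenient way of obtaining the two-tailed p-value without external libraries.

```python
import math

def t_pdf(x, df):
    """Probability density of Student's t distribution."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1.0 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=10000):
    """Two-tailed p-value via Simpson integration of the density from 0 to |t|."""
    upper = abs(t)
    if upper == 0.0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3.0  # P(0 < T < |t|)
    return 1.0 - 2.0 * central

def correlation_p(r, n):
    """Two-tailed p-value for a Pearson correlation r based on n pairs."""
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1.0 - r * r)
    return two_tailed_p(t, df)

p_patients = correlation_p(-0.293, 7)
p_controls = correlation_p(0.116, 7)
print(round(p_patients, 3), round(p_controls, 3))
```

With only seven participants per group, even moderate correlations would not reach significance, so the absence of a significant correlation should be interpreted with caution.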
5.3.4 Reaction times
In addition to accuracy, the reaction times were analyzed (Fig. 5.6). For coherent
motion detection, a group comparison revealed longer reaction times for patients (989
ms) than for controls (451 ms) (t(12) = 3.454; p = 0.005). Separate analyses of correct
and error trials yielded significant group differences for correct trials (p = 0.005) and a
trend towards significant differences for incorrect trials (p = 0.107). A different pattern
emerged for BM detection. Reaction times for patients (864 ms) and controls (707 ms)
did not differ significantly (t(12) = 1.2; p = 0.253). This is true for correct trials (p =
0.175) as well as for incorrect trials (p = 0.267).
Fig. 5.6. Reaction times in the BM detection task (BM) and the coherent motion detection task (CM) for each group. Error bars indicate SEM.
5.4 Discussion
The objective of the current study was to elucidate the differential contribution of the
cerebellum to the perceptual analysis of BM by examining perceptual performance of
patients with selective cerebellar lesions. Previous imaging studies revealed
inconsistent results with respect to cerebellar activation in BM perception (Bonda et al.,
1996; Grezes et al., 2001; Grossman & Blake, 2001, 2002; Grossman et al., 2000;
Servos et al., 2002; Vaina et al., 2001). It is difficult to estimate on the basis of
neuroimaging studies alone which brain regions are critically involved in specific
aspects of cognitive function, since multiple co-activations are usually observed when
applying this method. As a consequence, neuropsychological studies of patients with
selective lesions to different regions play an important role in the evaluation of the
distinct nature of information processing in each brain region.
The present results show clear evidence for a differential contribution of the cerebellum
to the perceptual analysis of coherent motion on the one hand and BM on
the other hand. Whereas the perception of coherent motion in random noise was
substantially affected in our patients with selective cerebellar lesions, the ability to
perceive BM camouflaged by scrambled motion was unaffected. In addition, we did not
observe significant correlations between the perceptual threshold for BM detection and
coherent motion detection in either group. Moreover, patients’ higher threshold for
coherent motion detection corresponds to longer reaction times. When comparing
overall performance in the present BM detection task with performance in other studies
exploring BM detection using scrambled motion as mask (Bertenthal & Pinto, 1994),
the detection threshold obtained in the present work is substantially higher. This
difference is probably due to the very short presentation time in the present study (200
ms) compared to 1000 ms in the study by Bertenthal and Pinto (1994).
The finding of an impairment in the detection of movement direction in the control
paradigm confirms previous reports (Ivry & Diener, 1991; Nawrot & Rizzo, 1995,
1998). Results from a study comparing perceptual judgments of the velocity of moving
stimuli and the position of static stimuli in cerebellar patients (Ivry & Diener, 1991)
showed selective impairments for the discrimination of moving stimuli. Further support
for the notion of cerebellar involvement in motion perception was given by Nawrot and
Rizzo (1995, 1998), who showed that midline cerebellar lesions can cause visual
motion perception deficits in tasks such as detecting the direction of dot movements in
a masking paradigm. These deficits occur during the acute stage as well as in the
chronic stage of lesions. The primary interest of the current study was to explore
whether there is a differential contribution of the cerebellum to BM perception as
compared to motion perception per se.
The general framework, that is, presentation time and general procedure, was identical
in both experimental tasks and cannot explain the performance deficits of the patients in
the control task. This is particularly true for the role of eye movements. In order to
control the influence of eye movements a very short presentation time of 200 ms was
chosen. Within this short time period it is almost impossible to initiate eye movements.
If ocular motor problems or defective fixation had played a major role in these tasks,
performance in both tasks would have been affected to a similar extent. The cerebellum
plays a critical role in motor control, with the lateral regions mediating movement
planning and programming and the medial regions contributing to the execution of
movements (Dichgans & Diener, 1984). Accordingly, the most prominent symptoms
after cerebellar dysfunction are impairments in motor control. For this reason, accuracy
rather than reaction time was stressed in the instructions of the experiments. Since
significant reaction time differences between patients and controls were observed only
in the coherent motion detection paradigm, these differences cannot be attributed to
motor impairments.
To understand the current results of intact perception of BM in patients with selective
cerebellar lesions, the consideration of the cortical network involved in the perception
of BM might provide deeper insight. Both the dorsal motion pathway and the
ventral form pathway contribute to the perceptual analysis of BM (for a review, see Giese &
Poggio, 2003). Findings from imaging studies are complemented by computational
simulations modeling key experimental findings with respect to BM perception (Giese
& Poggio, 2003) and neuropsychological studies examining patients suffering from
selective cortical lesions (Cowey & Vaina, 2000; MacLeod, 1988; Schenk & Zihl,
1997; Vaina, 1994; Vaina et al., 1990). These case studies provide evidence for a
dissociation between mechanisms involved in the perception of BM on the one hand
and mechanisms involved in inanimate visual motion tasks or static object recognition
tasks on the other hand. Patients LM (MacLeod, 1988) and AF (Vaina et al., 1990), who
have bilateral lesions involving the posterior visual pathway, showed severe deficits in
visual motion perception but could nevertheless recognize human action patterns
presented as point-light displays. Patients with bilateral ventral lesions involving the
posterior temporal lobes, such as patient EW (Vaina, 1994), who suffered from
prosopagnosia and object agnosia, could identify BM in point-light animations as well.
On the other hand, there is patient AL (Cowey & Vaina, 2000) who is hemianopic and
suffers from visual perceptual impairments in her intact hemifield as a consequence of an
additional lesion in the ventral extrastriate cortex. AL fails to recognize BM displays
despite intact static form perception and motion detection. This pattern of impairments
makes sense when assuming that the lesion affecting the intact hemifield includes the
STS-complex, which receives input from both the ventral and the dorsal visual stream.
Given these case studies and computational simulations, it is reasonable to assume that
detection of BM can be achieved by the ventral or dorsal visual stream alone if the
STS-complex is still intact. From this point of view it might be possible that the
cerebellum facilitates perceptual analysis of BM in the dorsal visual stream.
Nevertheless, dysfunctional cerebellar processing would not necessarily lead to a
significant impairment in BM perception, since dysfunctions of the dorsal visual stream
could be compensated for by intact processing of the ventral visual stream.
Alternatively, one might argue that the perceptual analysis of BM is completely
performed by neocortical structures without any cerebellar contribution to BM
perception at all. This view is also in accordance with our present findings. Moreover,
there is empirical evidence that the cerebellum becomes active not only during
execution of movement sequences but also during motor imagery in tasks such as
imagination of complex movement sequences (Decety, 1996; Decety et al., 1990;
Hanakawa et al., 2003; Luft et al., 1998; Ryding et al., 1993). Activity in response to
point-light displays of BM as observed in some imaging studies (Grossman et al., 2000;
Vaina et al., 2001) might be a result of such feedforward mechanisms.
The variety of neural connections of the cerebellum to cortical areas provides the
neuroanatomical basis for cerebellar contributions to a variety of perceptual tasks. The
cerebellum projects from lateral parts via the dentate nucleus and the thalamus to
several neocortical structures, among them the prefrontal cortex, the superior temporal
sulcus and the parietal cortex (Schmahmann & Pandya, 1997). These regions project
back to the cerebellum via the pontine nuclei. The lateral cerebellum was shown to be
engaged during the acquisition and discrimination of somatosensory information (Gao
et al., 1996). It was suggested that the lateral cerebellum may be specifically active
during motor, perceptual and cognitive performances because of the requirement to
process sensory data. In the view of Bower (1997), the cerebellum is assumed to
facilitate the efficiency with which other brain structures perform their own function,
and therefore, it is considered useful but not imperative for many different kinds of
brain functions. The view of a general contribution of the cerebellum to the acquisition
of sensory data is inconsistent with the present findings, since the ability of cerebellar
patients to perceive BM was spared. Therefore, cerebellar function with respect to
sensory data acquisition must be more specific.
Keele and Ivry (1990) have put forward the idea that the cerebellum has the function of
an internal clock which measures time intervals in the millisecond range. Such an exact
timing of very short intervals subserves motor as well as non-motor functions. The
demonstration of the role of the cerebellum in visual perceptual functions that require
velocity perception (Ivry & Diener, 1991; Nawrot & Rizzo, 1995, 1998) was also
interpreted to be in accordance with the timing hypothesis. Similarly, deficits in speech
perception (Ackermann, Graber, Hertrich, & Daum, 1999) and classical conditioning
(Daum et al., 1993; Topka, Valls-Sole, Massaquoi, & Hallett, 1993; Woodruff-Pak,
Papka, & Ivry, 1996) have also been discussed in relation to impaired timing in
cerebellar patients.
Recently the timing hypothesis of cerebellar function has been modified by
differentiating event timing from emergent timing (Ivry, Spencer, Zelaznik, &
Diedrichsen, 2002; Spencer, Zelaznik, Diedrichsen, & Ivry, 2003). Event timing is
defined as a form of representation in which the temporal goals are explicitly
represented. In contrast, emergent timing reflects temporal consistencies that arise
through the control of other parameters. Whereas the cerebellum is involved in tasks
requiring explicit temporal representation (event timing), it seems to be less important
in emergent timing which requires other control parameters not associated with the
cerebellum.
The timing hypothesis, especially in its modified form, can best explain our present
findings. An exact timing is necessary in order to detect coherent motion in random
noise. Considering motion as spatial displacement per time unit, the direct link between
the accurate representation of small time units and motion perception becomes obvious.
In the case of BM, the motion information only mediates the form of a human figure.
A precise timing in the millisecond range seems to be unnecessary with respect to this
perceptual demand. Nevertheless, the timing hypothesis may not explain all deficits
seen in cerebellar patients. Thier and colleagues (Thier, Haarmeier, Treue, & Barash,
1999) tried to identify the nature of visual impairments resulting from cerebellar
dysfunction by a set of experiments. Their results support the presence of visual deficits
in cerebellar disease, but in contrast to previous studies, they provide evidence against a
common, simple denominator that can explain the deficits in both motion perception
and position discrimination.
Previous studies (Ivry & Diener, 1991; Nawrot & Rizzo, 1995, 1998) reported that
visual motion perception is linked to the medial rather than the lateral cerebellum. The
impairment pattern found in our sample of cerebellar patients does not support this
view. Patients in our study failed to show such a clear distinction between more medially
and more laterally located lesions with respect to perceptual performance in the control
task. Three out of four patients who showed deficits in direction discrimination of
coherently moving dots had lesions primarily affecting medial parts whereas only one
patient had a lesion primarily affecting lateral parts. On the other hand, the three
patients who were able to solve the task had lesions in medial parts, lateral parts, and
medio-basal parts. This result pattern suggests only a slight tendency for intact
processing in the medial cerebellum to be necessary for normal motion perception.
Taken together, functional integrity of the cerebellum is not required for BM detection.
Taking into consideration that both the dorsal visual stream and the ventral visual
stream contribute to the perception of BM, it might be possible that processing in the
dorsal motion pathway is affected by cerebellar dysfunction but processing in the
ventral form pathway can compensate for this deficit. An impairment would thus not
emerge on the behavioral level when the ventral form pathway is intact.
Chapter VI General Discussion
110
VI General Discussion
The present thesis aimed to shed further light on the neuropsychological basis of BM
perception. The four studies described in the previous chapters investigated several
open issues in that respect. In order to answer these unsolved questions the studies
included in this thesis used different methodological approaches focusing on several
aspects of BM perception. The objectives of the studies were to find out more about the
content of information contained in the kinematics of animate motion patterns
(Study 1), the temporal aspects of BM processing and associated distinct processing
stages (Study 2), differential visual representations of one’s own movement patterns
compared to other familiar movement patterns (Study 3) and the role of the cerebellum
in the perceptual analysis of BM (Study 4). In the following paragraphs, the different
experimental approaches are described in a nutshell and the main findings of the four
studies are briefly summarized and discussed. The last paragraph provides
an overview of the contribution of this thesis to a better understanding of the
neuropsychological basis of BM perception. Finally, an outlook is given on possible
future directions of research on BM perception.
The first study of this thesis investigated whether information from BM, i.e. from the
kinematics of locomotion patterns, can serve as a cue for the size of animate beings. The
fact that the constant force of earth gravity directly influences the motion patterns of
animate beings and inanimate objects in the physical world is the theoretical basis for this idea.
With respect to BM, gravity determines periodic fluctuations between kinetic and
potential energy. Therefore, gravity determines a fixed relation between temporal (i.e.
the stride frequency) and spatial parameters (i.e. the length of a leg) of energetically
optimal gait patterns. In fact, it has been shown that animals adjust their gait patterns in
order to minimize the energy required for their locomotion and that such a relation
between size and stride frequency does exist for a number of different species.
In the two experiments included in this study, the temporal parameters of animal
locomotion represented as point-light displays were manipulated in order to examine
their influence on size estimations of human observers. The results indicated that this
procedure had a significant effect on observers’ size judgments in the expected
direction. Displays with high stride frequency were perceived to be smaller than
displays with low stride frequency. Therefore, it was concluded that human observers
are able to retrieve size information from the kinematics of animate motion. Moreover,
size judgments of the observers were consistent with an inverse quadratic relation
between size and stride frequency rather than a simple linear relation. This finding
shows that observers did not make their judgments according to a simple general rule
associating a higher stride frequency with a smaller size of an animal, but that they seem
to have an implicit knowledge about the exact property of this relation.
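One way to see why an inverse quadratic relation is expected is to idealize the swinging leg as a simple pendulum of length L under gravitational acceleration g. This idealization is an assumption introduced here for illustration and is not necessarily the exact model underlying Study 1. Under this assumption, the period T and the stride frequency f relate to size as follows:

```latex
T = 2\pi\sqrt{\frac{L}{g}}
\quad\Rightarrow\quad
f = \frac{1}{T} = \frac{1}{2\pi}\sqrt{\frac{g}{L}}
\quad\Rightarrow\quad
L = \frac{g}{4\pi^{2} f^{2}} \;\propto\; \frac{1}{f^{2}}
```

In this idealization, halving the stride frequency corresponds to a fourfold increase in size, whereas a simple linear rule would predict only a doubling.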
The impact of the current findings goes beyond the fact that kinematics of BM patterns
contain cues for the size of animate objects, which can be derived by human observers.
Deriving this information requires a highly developed visual system and extensive
experience with this kind of visual stimulation. One might speculate that the typical
pattern generated by the influence of gravity on BM is one of the crucial features used
by the mammalian visual system to detect BM and process it as a special category. As a
consequence, this specific motion pattern, i.e. the optimized periodic fluctuations of
kinetic and potential energy, might play a crucial role as a sensory filter for the detection
of BM. An efficient sensory filter for BM detection plays a very important role in the
animal kingdom as well as in the development of the human species, since prey or
predators must be detected as fast as possible to have the chance to react optimally to
this potentially life-threatening danger or source of food.
The second study elucidated the temporal course and location of neural processing of
the perceptual analysis of BM. Observers were presented with point-light displays of a
walking figure in normal orientation, inverted orientation and displays of scrambled
motion while recording EEG. Data analysis was carried out using conventional
averaging techniques to obtain the event-related potentials as well as source
localization with low resolution brain electromagnetic tomography (LORETA).
Application of this methodological approach (ERP and LORETA) offers the
advantage of measuring brain activity with a very precise temporal resolution and
additionally localizing the sources generating the distinct ERP components with an
acceptable spatial resolution.
The present findings suggest two distinct components of BM processing, recruiting
different neuronal populations. The first processing stage (N170) reflects the
generation of a global percept of the visual scene leading to a pop-out effect of upright
BM. The sensitivity to upright BM is the result of the familiarity of BM in normal
orientation. Moreover, the finding of sources in the posterior cingulate cortex, which
has different attention-related functions, fits well with behavioral data suggesting that
attention is required for the visual analysis of point-light displays of BM. At the level
of the second processing stage (N300), brain areas such as the superior temporal gyrus and
the fusiform gyrus, which are known from fMRI studies to be involved in the fine and
detailed perceptual analysis of BM, play an important role.
The spatial resolution obtained by LORETA is lower than that obtained by fMRI but its
temporal resolution is much higher. Nevertheless, it makes sense to directly compare
the findings reported here with neuroimaging findings from previous studies using
similar stimulus categories. The present findings are generally consistent with those
from neuroimaging studies. Therefore, the validity of the results with respect to
localization as calculated by source analysis is supported. Moreover, current findings
extend evidence from fMRI-studies, since activity in distinct brain areas can directly be
related to distinct time windows.
The evidence for fast, efficient processing underlines the importance of perception of
BM, and provides further evidence for a specific neural network involved in processing
biologically relevant motion signals. The right-hemispheric dominance associated with
BM perception shows clear parallels to asymmetries in face perception and probably
reflects the social relevance of animate motion perception. Furthermore, the superior
temporal gyrus and fusiform gyrus are primarily involved in the second ERP
component and show clear hemispheric asymmetries in the perceptual analysis of BM
only during later processing stages.
The third study of this thesis addressed the question whether the mental representation
of one’s own movement pattern is different from representations of the movement patterns
of other familiar persons. Using a psychophysical approach, viewpoint-dependent
recognition effects were examined in order to gain deeper knowledge of the mental
representation and perceptual mechanisms of BM processing. Observers were
presented with point-light display animations of their own walking pattern and of the
walking patterns of familiar persons, shown from three different viewpoints.
The present results offer further evidence that kinematic cues from BM provide
information about personal identity which can be transferred from real life experience
to reduced point-light displays of BM. Whereas recognition performance of one’s own
walking pattern was viewpoint independent, recognition rate for other familiar persons
was better for frontal and half profile view than for profile view. Viewpoint-dependent
recognition effects for other people might be due to selective attention to approaching
people leading to preferential exposure to frontal and half profile views. The finding of
a viewpoint-independent representation of one’s own movement patterns might be
related to a crossmodal transfer from motor to visual representations.
Therefore, these results are consistent with the hypothesis that humans understand an
action by mapping the visual representation of the observed action onto their motor
representation of the same action. Such a mechanism would account for the observed
pattern of results. Mapping the visual representation onto the motor
representation would provide an advantage only for the identification of one’s own gait
pattern, since for this stimulus the action representation and the visual representation
refer to exactly the same individual movement pattern and, therefore, match perfectly.
Moreover, the motor representation is assumed to be stored in a three-dimensional
format, which would explain viewpoint-independent recognition of one’s own movement
pattern. In contrast, mapping the visual representation of another person’s gait pattern
onto the observer’s motor representation would support the correct recognition of the
movement as walking but would not provide any advantage for identifying the person
whose gait pattern is seen.
The observer has to rely on the stored visual representation of the familiar gait pattern
and to compare it with the stimulus gait pattern. Viewpoint dependency is probably due
to a different degree of experience for different perspectives.
The neuroimaging literature on the role of the cerebellum in the
perceptual analysis of BM has been inconsistent. The fourth study of the present thesis
explored the role of the cerebellum in the perception of BM by assessing the
performance of BM perception in patients with distinct ischemic cerebellar lesions.
Perceptual performance was investigated in an experimental task testing the threshold
to detect BM masked by scrambled motion and a control task testing detection of
motion direction of coherent motion masked by random noise. Results revealed clear
evidence for a differential contribution of the cerebellum to the perceptual analysis of
coherent motion compared to BM. Whereas the ability to detect BM masked
by scrambled motion was unaffected in the patient group, their ability to discriminate
direction of coherent motion in random noise was substantially affected. Based on this
finding it was concluded that intact cerebellar function is not a prerequisite for a
preserved ability to detect BM. Since the dorsal motion pathway as well as the ventral
form pathway contribute to the visual perception of BM, it cannot be stated definitively
whether cerebellar dysfunction affecting the dorsal pathway is compensated for by the
unaffected ventral pathway or whether perceptual analysis of BM is performed
completely without cerebellar contribution.
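The control task described above (direction discrimination of coherent motion in random noise) can be illustrated with a minimal random-dot sketch. This is a generic illustration of the stimulus principle, not the stimulus code used in Study 4; the function names `step_dots` and `mean_displacement`, the dot count, the speed, and the coherence values are all assumptions chosen for illustration.

```python
import math
import random


def step_dots(dots, coherence, direction_rad, speed, rng):
    """Advance every dot by one frame.

    With probability `coherence` a dot is a signal dot and moves in the
    common direction; otherwise it is a noise dot and moves in a random
    direction. All dots travel the same distance `speed` per frame.
    """
    moved = []
    for x, y in dots:
        if rng.random() < coherence:
            angle = direction_rad
        else:
            angle = rng.uniform(0.0, 2.0 * math.pi)
        moved.append((x + speed * math.cos(angle), y + speed * math.sin(angle)))
    return moved


def mean_displacement(before, after):
    """Average (dx, dy) across dots; the signal weakens as coherence drops."""
    n = len(before)
    dx = sum(a[0] - b[0] for b, a in zip(before, after)) / n
    dy = sum(a[1] - b[1] for b, a in zip(before, after)) / n
    return dx, dy


# Hypothetical parameters: 200 dots, rightward signal direction.
rng = random.Random(0)  # fixed seed so the sketch is reproducible
dots = [(rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)) for _ in range(200)]
fully_coherent = step_dots(dots, coherence=1.0, direction_rad=0.0, speed=0.05, rng=rng)
mostly_noise = step_dots(dots, coherence=0.1, direction_rad=0.0, speed=0.05, rng=rng)
```

In a sketch like this, a detection threshold would correspond to the lowest coherence at which the common direction can still be judged reliably; how the threshold was actually estimated in Study 4 is described in Chapter V, not here.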
Taken together, the results of the studies included in this thesis contributed to a better
understanding of the neuropsychological basis of BM perception with respect to a
number of open questions: First, it was shown that the human visual system can extract
information about the size of animate beings from the kinematics of their movement
patterns. The relation between temporal and spatial parameters of the kinematics of
movement patterns as defined by natural laws has been suggested as a crucial feature of a
sensory filter for the detection of animate beings. Second, the time course of
the visual analysis of BM was clarified, and distinct processing stages
could be linked to distinct neuronal structures, providing new insights into the
underlying neural mechanisms. Third, the mental representations of one’s own movement
patterns and those of other people were investigated. Evidence for a common
coding of perception and action was provided and, therefore, the assumption of a direct
matching between visual representations and action representations was further
supported. Fourth, the role of the cerebellum in the perceptual analysis of BM was
explored in a lesion approach, clarifying inconsistent results from neuroimaging
studies. Cerebellar dysfunction was found not to affect detection of BM.
The present findings have several implications for theories on BM perception. It is
widely accepted that BM is processed as a special category, i.e., that animate motion is
processed differently than inanimate motion. Based on results from Study 1 it was
suggested that the crucial feature making BM unique is the periodic dissipation
between potential and kinetic energy under the influence of gravity associated with
animate motion patterns. The development of a sensory filter sensitive for this specific
motion style during evolution might be the reason for the very fast and efficient
processing of BM.
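The link between temporal and spatial parameters under gravity can be illustrated with an ideal pendulum, in which potential and kinetic energy are exchanged periodically and the period depends only on length. The following is a minimal sketch under that simplifying assumption (small-angle, frictionless pendulum); the function names and numerical values are illustrative and are not taken from Study 1.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def pendulum_period(length_m, g=G):
    """Period in seconds of an ideal small-angle pendulum of the given length."""
    return 2.0 * math.pi * math.sqrt(length_m / g)


def length_from_period(period_s, g=G):
    """Invert the pendulum relation: the length in metres implied by a period."""
    return g * (period_s / (2.0 * math.pi)) ** 2


def energy_over_cycle(length_m, amplitude_rad, mass_kg=1.0, steps=8, g=G):
    """Sample (time, potential, kinetic) energy over one period.

    Uses the small-angle solution theta(t) = A * cos(omega * t), so the
    total energy is only approximately constant.
    """
    omega = math.sqrt(g / length_m)
    period = 2.0 * math.pi / omega
    samples = []
    for i in range(steps):
        t = period * i / steps
        theta = amplitude_rad * math.cos(omega * t)
        theta_dot = -amplitude_rad * omega * math.sin(omega * t)
        potential = mass_kg * g * length_m * (1.0 - math.cos(theta))
        kinetic = 0.5 * mass_kg * (length_m * theta_dot) ** 2
        samples.append((t, potential, kinetic))
    return samples


# Hypothetical leg lengths: quadrupling the length doubles the period.
short_leg = pendulum_period(0.25)
long_leg = pendulum_period(1.0)
```

Because the period grows with the square root of length, an observed movement frequency constrains the size of the moving system, which is the kind of cue an observer could in principle exploit.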
Based on the findings from Study 2, two processing stages in the
perceptual analysis of BM were suggested. The STS-complex and the
fusiform face area were mainly involved in the second processing stage, whereas
attention-related areas seem to play an important role in the first component. One might
speculate that the later component reflects the fine analysis of BM. In contrast, the first
stage might reflect a pop-out effect of BM caused by activation of the neural correlate
of a sensory filter for the BM features as specified above.
Theories on action perception have proposed a strong interaction between visual and
motor representations of movement patterns. This view was supported by
psychophysical findings from Study 3 showing a viewpoint invariant representation of
one’s own movement pattern compared to movement patterns from other persons.
Therefore, perception of motor actions does not seem to depend exclusively on sensory
processing, but might be facilitated by the activation of premotor
representations.
The analysis of the role of the cerebellum with respect to biological motion showed
that intact cerebellar function is not a prerequisite for a preserved ability to detect BM.
This finding has some implications for the understanding of the neuronal correlates of
BM perception. Since motion perception per se, a function of the dorsal visual stream,
is affected by cerebellar dysfunction, intact BM perception with distinct cerebellar
lesions can be either explained by intact processing in the ventral stream compensating
for impaired processing in the dorsal stream or by no cerebellar involvement in BM
perception.
Findings from the studies included in this thesis might stimulate future research on BM
in several directions. The hypothesis concerning the features of a sensory filter for the
detection of BM has received initial support. Such a sensory filter might be specified by
the periodic dissipation of potential and kinetic energy. It would be interesting to
design different classes of stimuli for psychophysical experiments in order to isolate
the crucial features for the sensory filter of BM. Next, it would be interesting to
investigate the neural activity elicited by these stimulus classes in neuroimaging
experiments in order to explore the neural correlates of such a sensory filter.
Differences in the mental representations of one’s own movement pattern and
movement patterns of other persons have been shown by a psychophysical approach in
this thesis. It would be interesting to examine the neural correlates of this difference in
representations by a neuroimaging approach. If the predictions based on the
psychophysical findings are correct, BM animations of one’s own movement patterns
should elicit stronger premotor involvement than perception of movement
patterns from other persons. Information from BM can be used for different purposes,
e.g. for action understanding or for social perception. It would be interesting to know to
what extent BM perception for different purposes recruits the same neural structures or
relies on different neural networks. Similarly, hemispheric asymmetries associated with
BM perception for different purposes would be interesting to explore in future research.
Chapter VII References
117
VII References
Ackermann, H., Graber, S., Hertrich, I., & Daum, I. (1999). Cerebellar contributions to
the perception of temporal cues within the speech and nonspeech domain. Brain
and Language, 67(3), 228-241.
Adolphs, R. (1999). Social cognition and the human brain. Trends in Cognitive Sciences,
3(12), 469-479.
Adolphs, R. (2001). The neurobiology of social cognition. Current Opinion in
Neurobiology, 11(2), 231-239.
Adolphs, R. (2003). Investigating the cognitive neuroscience of social behavior.
Neuropsychologia, 41(2), 119-126.
Alexander, R. M. (1977). Mechanics and scaling of terrestrial locomotion. In T. J.
Pedley (Ed.), Scale Effects in Animal Locomotion (pp. 93-110). New York:
Academic Press.
Alexander, R. M. (1984). The gaits of bipedal and quadrupedal animals. The
International Journal of Robotics Research, 3(2), 49-59.
Alexander, R. M. (1989). Optimization and gaits in the locomotion of vertebrates.
Physiological Reviews, 69(4), 1199-1227.
Alexander, R. M., & Jayes, A. S. (1983). A dynamic similarity hypothesis for the gaits
of quadrupedal mammals. Journal of Zoology, 201, 135-152.
Allison, T., Puce, A., & McCarthy, G. (2000). Social perception from visual cues: role
of the STS region. Trends in Cognitive Sciences, 4(7), 267-278.
Bach, M., & Ullrich, D. (1994). Motion adaptation governs the shape of motion-evoked
cortical potentials. Vision Research, 34(12), 1541-1547.
Bach, M., & Ullrich, D. (1997). Contrast dependency of motion-onset and pattern-
reversal VEPs: interaction of stimulus type, recording site and response
component. Vision Research, 37(13), 1845-1849.
Barclay, C. D., Cutting, J. E., & Kozlowski, L. T. (1978). Temporal and spatial factors
in gait perception that influence gender recognition. Perception &
Psychophysics, 23, 145-152.
Battelli, L., Cavanagh, P., & Thornton, I. M. (2003). Perception of biological motion in
parietal patients. Neuropsychologia, 41(13), 1808-1816.
Beardsworth, T., & Buckner, T. (1981). The ability to recognize oneself from a video
recording of one's movements without seeing one's body. Bulletin of the
Psychonomic Society, 18, 19-22.
Beauchamp, M. S., Lee, K. E., Haxby, J. V., & Martin, A. (2003). FMRI responses to
video and point-light displays of moving humans and manipulable objects.
Journal of Cognitive Neuroscience, 15(7), 991-1001.
Beintema, J. A., & Lappe, M. (2002). Perception of biological motion without local
image motion. Proceedings of the National Academy of Sciences, 99(8), 5661-
5663.
Bentin, S., Allison, T., Puce, A., Perez, E., & McCarthy, G. (1996).
Electrophysiological studies of face perception in humans. Journal of Cognitive
Neuroscience, 8(6), 551-565.
Bertenthal, B. I., & Pinto, J. (1994). Global processing of biological motions.
Psychological Science, 5(4), 221-225.
Bertenthal, B. I., Proffitt, D. R., & Kramer, S. J. (1987). Perception of biological motion
by infants: implementation of various processing constraints. Journal of
Experimental Psychology: Human Perception and Performance, 13, 577-585.
Bertenthal, B. I., Proffitt, D. R., Spetner, N. B., & Thomas, M. A. (1985). The
development of infant sensitivity to biomechanical motions. Child Development,
56(3), 531-543.
Biederman, I., & Gerhardstein, P. C. (1993). Recognizing depth-rotated objects:
evidence and conditions for three-dimensional viewpoint invariance [published
erratum appears in Journal of Experimental Psychology: Human Perception and
Performance 1994 Feb;20(1):80]. Journal of Experimental Psychology: Human
Perception and Performance, 19(6), 1162-1182.
Biederman, I., & Gerhardstein, P. C. (1995). Viewpoint-dependent mechanisms in
visual object recognition - reply to Tarr and Bülthoff (1995). Journal of
Experimental Psychology: Human Perception & Performance, 21(6), 1506-
1514.
Bingham, G. P. (1987). Kinematic form and scaling: further investigations on the visual
perception of lifted weight. Journal of Experimental Psychology: Human
Perception & Performance, 13(2), 155-177.
Bingham, G. P. (1993a). Perceiving the size of trees: biological form and the horizon
ratio. Perception & Psychophysics, 54(4), 485-495.
Bingham, G. P. (1993b). Perceiving the size of trees: Form as information about scale.
Journal of Experimental Psychology: Human Perception and Performance, 19,
1139-1161.
Bingham, G. P. (1993c). Scaling judgments of lifted weight: Lifter size and the role of
the standard. Ecological Psychology, 5, 31-64.
Blake, R. (1993). Cats perceive biological motion. Psychological Science, 4(1), 54-57.
Blakemore, S. J., & Decety, J. (2001). From the perception of action to the
understanding of intention. Nature Reviews Neuroscience, 2(8), 561-567.
Bonda, E., Petrides, M., Ostry, D., & Evans, A. (1996). Specific involvement of human
parietal systems and the amygdala in the perception of biological motion.
Journal of Neuroscience, 16(11), 3737-3744.
Bower, J. M. (1997). Control of sensory data acquisition. In J. D. Schmahmann (Ed.),
The cerebellum and cognition (Vol. 41, pp. 490-513). San Diego, London:
Academic Press.
Brainard, D. H. (1997). The Psychophysics Toolbox. Spatial Vision, 10(4), 433-436.
Bruce, V., Valentine, T., & Baddeley, A. D. (1987). The basis of the 3/4 view advantage
in face recognition. Applied Cognitive Psychology, 1, 109-120.
Buccino, G., Binkofski, F., Fink, G. R., Fadiga, L., Fogassi, L., Gallese, V., et al.
(2001). Action observation activates premotor and parietal areas in a
somatotopic manner: an fMRI study. European Journal of Neuroscience, 13(2),
400-404.
Bülthoff, H. H., & Edelman, S. (1992). Psychophysical support for a two-dimensional
view interpolation theory of object recognition. Proceedings of the National
Academy of Sciences, 89, 60-64.
Bülthoff, I., Bülthoff, H. H., & Sinha, P. (1998). Top-down influences on stereoscopic
depth-perception. Nature Neuroscience, 1(3), 254-257.
Bunge, S. A., Hazeltine, E., Scanlon, M. D., Rosen, A. C., & Gabrieli, J. D. (2002).
Dissociable contributions of prefrontal and parietal cortices to response
selection. Neuroimage, 17(3), 1562-1571.
Calvert, G. A., Bullmore, E. T., Brammer, M. J., Campbell, R., Williams, S. C.,
McGuire, P. K., et al. (1997). Activation of auditory cortex during silent
lipreading. Science, 276(5312), 593-596.
Caramazza, A., & Shelton, J. R. (1998). Domain-specific knowledge systems in the
brain: the animate-inanimate distinction. Journal of Cognitive Neuroscience,
10(1), 1-34.
Carter, C. S., Braver, T. S., Barch, D. M., Botvinick, M. M., Noll, D., & Cohen, J. D.
(1998). Anterior cingulate cortex, error detection, and the online monitoring of
performance. Science, 280(5364), 747-749.
Cavagna, G. A., Thys, H., & Zamboni, A. (1976). The sources of external work in level
walking and running. Journal of Physiology, 262(3), 639-657.
Cavanagh, P., Labianca, A. T., & Thornton, I. M. (2001). Attention-based visual
routines: sprites. Cognition, 80(1-2), 47-60.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences. Hillsdale, NJ:
Erlbaum.
Cowey, A., & Vaina, L. M. (2000). Blindness to form from motion despite intact static
form perception and motion detection. Neuropsychologia, 38(5), 566-578.
Cutting, J. E. (1978). Generation of synthetic male and female walkers through
manipulation of a biomechanical invariant. Perception, 7(4), 393-405.
Cutting, J. E., & Kozlowski, L. T. (1977). Recognizing friends by their walk: Gait
perception without familiarity cues. Bulletin of the Psychonomic Society, 9(5),
353-356.
Dahl, G. (1972). Reduzierter Wechsler Intelligenztest (Short Version of the Wechsler
Intelligence Test). Meisenheim: Hain.
Damasio, A. R. (1996). The somatic marker hypothesis and the possible functions of the
prefrontal cortex. Philosophical Transactions of the Royal Society of London
Series B, 351(1346), 1413-1420.
Damasio, H., Grabowski, T., Frank, R., Galaburda, A. M., & Damasio, A. R. (1994).
The return of Phineas Gage: clues about the brain from the skull of a famous
patient. Science, 264(5162), 1102-1105.
Daum, I., Schugens, M. M., Ackermann, H., Lutzenberger, W., Dichgans, J., &
Birbaumer, N. (1993). Classical conditioning after cerebellar lesions in humans.
Behavioral Neuroscience, 107(5), 748-756.
Daum, I., Snitz, B. E., & Ackermann, H. (2001). Neuropsychological deficits in
cerebellar syndromes. International Review of Psychiatry, 13, 268-275.
Davidson, R. J., Putnam, K. M., & Larson, C. L. (2000). Dysfunction in the neural
circuitry of emotion regulation--a possible prelude to violence. Science,
289(5479), 591-594.
Decety, J. (1996). Do imagined and executed actions share the same neural substrate?
Brain Research - Cognitive Brain Research, 3(2), 87-93.
Decety, J., & Grezes, J. (1999). Neural mechanisms subserving the perception of human
actions. Trends in Cognitive Sciences, 3(5), 172-178.
Decety, J., Sjoholm, H., Ryding, E., Stenberg, G., & Ingvar, D. H. (1990). The
cerebellum participates in mental activity: tomographic measurements of
regional cerebral blood flow. Brain Research, 535(2), 313-317.
Dichgans, J. E., & Diener, H. C. (1984). Clinical evidence for functional
compartmentalisation of the cerebellum. In J. Bloedel, J. Dichgans & W. Precht
(Eds.), Cerebellar functions (pp. 126-147). Berlin: Springer.
Dittrich, W. H. (1993). Action categories and the perception of biological motion.
Perception, 22(1), 15-22.
Dittrich, W. H., Lea, S. E. G., Barrett, J., & Gurr, P. R. (1998). Categorization of natural
movements by pigeons - visual concept discrimination and biological motion.
Journal of the Experimental Analysis of Behavior, 70, 281-299.
Dittrich, W. H., Troscianko, T., Lea, S. E., & Morgan, D. (1996). Perception of emotion
from dynamic point-light displays represented in dance. Perception, 25(6), 727-
738.
Eimer, M. (2000a). Effects of face inversion on the structural encoding and recognition
of faces. Evidence from event-related brain potentials. Brain Research -
Cognitive Brain Research, 10(1-2), 145-158.
Eimer, M. (2000b). Event-related brain potentials distinguish processing stages
involved in face perception and recognition. Clinical Neurophysiology, 111(4),
694-705.
Erdfelder, E., Faul, F., & Buchner, A. (1996). GPOWER: A general power analysis
program. Behavior Research Methods, Instruments and Computers, 28, 1-11.
Fabre-Thorpe, M., Delorme, A., Marlot, C., & Thorpe, S. (2001). A limit to the speed of
processing in ultra-rapid visual categorization of novel natural scenes. Journal of
Cognitive Neuroscience, 13(2), 171-180.
Fadiga, L., & Craighero, L. (2003). New insights on sensorimotor integration: from
hand action to speech perception. Brain and Cognition, 53(3), 514-524.
Fadiga, L., Fogassi, L., Pavesi, G., & Rizzolatti, G. (1995). Motor facilitation during
action observation: a magnetic stimulation study. Journal of Neurophysiology,
73(6), 2608-2611.
Foster, D. H., & Gilson, S. J. (2002). Recognizing novel three-dimensional objects by
summing signals from parts and views. Proceedings of the Royal Society of
London Series B, 269(1503), 1939-1947.
Fox, R., & McDaniel, C. (1982). The perception of biological motion by human infants.
Science, 218(4571), 486-487.
Frith, C. D., & Frith, U. (1999). Interacting minds--a biological basis. Science,
286(5445), 1692-1695.
Frith, U., & Frith, C. D. (2003). Development and neurophysiology of mentalizing.
Philosophical Transactions of the Royal Society of London Series B, 358(1431),
459-473.
Gallese, V., Fadiga, L., Fogassi, L., & Rizzolatti, G. (1996). Action recognition in the
premotor cortex. Brain, 119, 593-609.
Gallese, V., & Goldman, A. (1998). Mirror neurons and the simulation theory of
mind-reading. Trends in Cognitive Sciences, 2(12), 493-501.
Gangitano, M., Mottaghy, F. M., & Pascual-Leone, A. (2001). Phase-specific
modulation of cortical motor output during movement observation. Neuroreport,
12(7), 1489-1492.
Gao, J. H., Parsons, L. M., Bower, J. M., Xiong, J., Li, J., & Fox, P. T. (1996).
Cerebellum implicated in sensory acquisition and discrimination rather than
motor control. Science, 272(5261), 545-547.
Giese, M. A., & Poggio, T. (2003). Neural mechanisms for the recognition of biological
movements. Nature Reviews Neuroscience, 4(3), 179-192.
Goodale, M. A., & Milner, A. D. (1992). Separate visual pathways for perception and
action. Trends in Neurosciences, 15(1), 20-25.
Grafton, S. T., Arbib, M. A., Fadiga, L., & Rizzolatti, G. (1996). Localization of grasp
representations in humans by positron emission tomography. 2. Observation
compared with imagination. Experimental Brain Research, 112(1), 103-111.
Grezes, J., Costes, N., & Decety, J. (1998). Top-down effect of strategy on the
perception of human biological motion: A PET investigation. Cognitive
Neuropsychology, 15(6-8), 553-582.
Grezes, J., Fonlupt, P., Bertenthal, B. I., Delon-Martin, C., Segebarth, C., & Decety, J.
(2001). Does perception of biological motion rely on specific brain regions?
Neuroimage, 13(5), 775-785.
Grossman, E. D., & Blake, R. (2001). Brain activity evoked by inverted and imagined
biological motion. Vision Research, 41(10-11), 1475-1482.
Grossman, E. D., & Blake, R. (2002). Brain Areas Active during Visual Perception of
Biological Motion. Neuron, 35(6), 1167-1175.
Grossman, E. D., Donnelly, M., Price, R., Pickens, D., Morgan, V., Neighbor, G., et al.
(2000). Brain areas involved in perception of biological motion. Journal of
Cognitive Neuroscience, 12(5), 711-720.
Hanakawa, T., Immisch, I., Toma, K., Dimyan, M. A., Van Gelderen, P., & Hallett, M.
(2003). Functional properties of brain areas associated with motor execution and
imagery. Journal of Neurophysiology, 89(2), 989-1002.
Hari, R., Forss, N., Avikainen, S., Kirveskari, E., Salenius, S., & Rizzolatti, G. (1998).
Activation of human primary motor cortex during action observation: a
neuromagnetic study. Proceedings of the National Academy of Sciences, 95(25),
15061-15065.
Hauser, M. D., Chomsky, N., & Fitch, W. T. (2002). The faculty of language: what is it,
who has it, and how did it evolve? Science, 298(5598), 1569-1579.
Haxby, J. V., Hoffman, E. A., & Gobbini, M. I. (2000). The distributed human neural
system for face perception. Trends in Cognitive Sciences, 4(6), 223-233.
Hecht, H., Kaiser, M. K., & Banks, M. S. (1996). Gravitational acceleration as a cue for
absolute size and distance? Perception & Psychophysics, 58(7), 1066-1075.
Hill, H., & Bruce, V. (1996). Effects of lighting on the perception of facial surfaces.
Journal of Experimental Psychology: Human Perception and Performance,
22(4), 986-1004.
Hill, H., Schyns, P. G., & Akamatsu, S. (1997). Information and viewpoint dependence
in face recognition. Cognition, 62(2), 201-222.
Hirai, M., Fukushima, H., & Hiraki, K. (2003). An event-related potentials study of
biological motion perception in humans. Neuroscience Letters, 344(1), 41-44.
Hochstein, S., & Ahissar, M. (2002). View from the top: hierarchies and reverse
hierarchies in the visual system. Neuron, 36(5), 791-804.
Hoffman, D. D., & Flinchbaugh, B. E. (1982). The interpretation of biological motion.
Biological Cybernetics, 42(3), 195-204.
Hoffmann, M. B., Unsold, A. S., & Bach, M. (2001). Directional tuning of human
motion adaptation as reflected by the motion VEP. Vision Research, 41(17),
2187-2194.
Howard, R. J., Brammer, M., Wright, I., Woodruff, P. W., Bullmore, E. T., & Zeki, S.
(1996). A direct demonstration of functional specialization within motion-
related visual and auditory cortex of the human brain. Current Biology, 6(8),
1015-1019.
Iacoboni, M., Woods, R. P., Brass, M., Bekkering, H., Mazziotta, J. C., & Rizzolatti, G.
(1999). Cortical mechanisms of human imitation. Science, 286, 2526-2528.
Ivry, R. B., & Diener, H. C. (1991). Impaired velocity perception in patients with
lesions of the cerebellum. Journal of Cognitive Neuroscience, 3(4), 355-366.
Ivry, R. B., Spencer, R. M., Zelaznik, H. N., & Diedrichsen, J. (2002). The cerebellum
and event timing. Annals of the New York Academy of Sciences, 978, 302-317.
Jellema, T., & Perrett, D. I. (2003a). Cells in monkey STS responsive to articulated
body motions and consequent static posture: a case of implied motion?
Neuropsychologia, 41(13), 1728-1737.
Jellema, T., & Perrett, D. I. (2003b). Perceptual history influences neural responses to
face and body postures. Journal of Cognitive Neuroscience, 15(7), 961-971.
Johansson, G. (1973). Visual perception of biological motion and a model for its
analysis. Perception & Psychophysics, 14(2), 201-211.
Johansson, G. (1976). Spatio-temporal differentiation and integration in visual motion
perception. Psychological Research, 38, 379-393.
Jokisch, D., Midford, P. E., & Troje, N. F. (2001). Biological motion as a cue for the
perception of absolute size. [Abstract] Journal of Vision, 1(3), 357a,
http://journalofvision.org/1/3/357/, doi:10.1167/1.3.357.
Justus, T. C., & Ivry, R. B. (2001). The cognitive neuropsychology of the cerebellum.
International Review of Psychiatry, 13, 276-282.
Kandel, E. R., Schwartz, J. H., & Jessell, T. M. (2000). Principles of neural science
(4th ed.). New York: McGraw-Hill.
Keele, S. W., & Ivry, R. (1990). Does the cerebellum provide a common computation
for diverse tasks? A timing hypothesis. Annals of the New York Academy of
Sciences, 608, 179-207; discussion 207-211.
Kleinke, C. L. (1986). Gaze and eye contact: a research review. Psychological Bulletin,
100(1), 78-100.
Kohler, E., Keysers, C., Umilta, M. A., Fogassi, L., Gallese, V., & Rizzolatti, G. (2002).
Hearing sounds, understanding actions: action representation in mirror neurons.
Science, 297(5582), 846-848.
Kourtzi, Z., & Kanwisher, N. (2000). Activation in human MT/MST by static images
with implied motion. Journal of Cognitive Neuroscience, 12(1), 48-55.
Kozlowski, L. T., & Cutting, J. E. (1977). Recognizing the sex of a walker from a
dynamic point-light display. Perception & Psychophysics, 21(6), 575-580.
Kram, R., Domingo, A., & Ferris, D. P. (1997). Effect of reduced gravity on the
preferred walk-run transition speed. Journal of Experimental Biology, 200(Pt 4),
821-826.
Liberman, A. M., Cooper, F. S., Shankweiler, D. P., & Studdert-Kennedy, M. (1967).
Perception of the speech code. Psychological Review, 74(6), 431-461.
Liberman, A. M., & Mattingly, I. G. (1985). The motor theory of speech perception
revised. Cognition, 21(1), 1-36.
Luft, A. R., Skalej, M., Stefanou, A., Klose, U., & Voigt, K. (1998). Comparing
motion- and imagery-related activation in the human cerebellum: a functional
MRI study. Human Brain Mapping, 6(2), 105-113.
MacLeod, C. M. (1988). Forgotten but not gone: Savings for pictures and words in
long-term memory. Journal of Experimental Psychology: Learning, Memory, and
Cognition, 14(2), 195-212.
Mather, G., & Murdoch, L. (1994). Gender discrimination in biological motion displays
based on dynamic cues. Proceedings of the Royal Society of London Series B,
258, 273-279.
Mather, G., Radford, K., & West, S. (1992). Low-level visual processing of biological
motion. Proceedings of the Royal Society of London Series B, 249(1325), 149-
155.
Mather, G., & West, S. (1993). Recognition of animal locomotion from dynamic point-
light displays. Perception, 22(7), 759-766.
McConnell, D. S., Muchisky, M. M., & Bingham, G. P. (1998). The use of time and
trajectory forms as visual information about spatial scale in events. Perception
& Psychophysics, 60(7), 1175-1187.
McGurk, H., & MacDonald, J. (1976). Hearing lips and seeing voices. Nature,
264(5588), 746-748.
McLeod, P., Dittrich, W., Driver, J., Perrett, D., & Zihl, J. (1996). Preserved and
impaired detection of structure from motion by a "motion-blind" patient. Visual
Cognition, 3(4), 363-391.
Mesulam, M. M., Nobre, A. C., Kim, Y. H., Parrish, T. B., & Gitelman, D. R. (2001).
Heterogeneity of cingulate contributions to spatial attention. Neuroimage, 13(6
Pt 1), 1065-1072.
Mitkin, A. A., & Pavlova, M. A. (1990). Changing a natural orientation: Recognition of
biological motion pattern by children and adults. Psychologische Beitraege,
32(1-2), 28-35.
Nawrot, M., & Rizzo, M. (1995). Motion perception deficits from midline cerebellar
lesions in human. Vision Research, 35(5), 723-731.
Nawrot, M., & Rizzo, M. (1998). Chronic motion perception deficits from midline
cerebellar lesions in human. Vision Research, 38(14), 2219-2224.
Neville, H. J., Bavelier, D., Corina, D., Rauschecker, J., Karni, A., Lalwani, A., et al.
(1998). Cerebral organization for language in deaf and hearing subjects:
biological constraints and effects of experience. Proceedings of the National
Academy of Sciences, 95(3), 922-929.
Oram, M. W., & Perrett, D. I. (1994). Responses of anterior superior temporal
polysensory (STPa) neurons to "biological motion" stimuli. Journal of Cognitive
Neuroscience, 6(2), 99-116.
Oram, M. W., & Perrett, D. I. (1996). Integration of form and motion in the anterior
superior temporal polysensory area (STPa) of the macaque monkey. Journal of
Neurophysiology, 76(1), 109-129.
Pascual-Marqui, R. D., Lehmann, D., Koenig, T., Kochi, K., Merlo, M. C., Hell, D., et
al. (1999). Low resolution brain electromagnetic tomography (LORETA)
functional imaging in acute, neuroleptic-naive, first-episode, productive
schizophrenia. Psychiatry Research, 90(3), 169-179.
Pascual-Marqui, R. D., Michel, C. M., & Lehmann, D. (1994). Low resolution
electromagnetic tomography: a new method for localizing electrical activity in
the brain. International Journal of Psychophysiology, 18(1), 49-65.
Pavlova, M., Lutzenberger, W., Sokolov, A., & Birbaumer, N. (2004). Dissociable
cortical processing of recognizable and non-recognizable biological movement:
analysing gamma MEG activity. Cerebral Cortex, 14(2), 181-188.
Pavlova, M., & Sokolov, A. (2000). Orientation specificity in biological motion
perception. Perception & Psychophysics, 62(5), 889-899.
Pelli, D. G. (1997). The VideoToolbox software for visual psychophysics: transforming
numbers into movies. Spatial Vision, 10(4), 437-442.
Pelphrey, K. A., Mitchell, T. V., McKeown, M. J., Goldstein, J., Allison, T., &
McCarthy, G. (2003). Brain activity evoked by the perception of human
walking: controlling for meaningful coherent motion. Journal of Neuroscience,
23(17), 6819-6825.
Pennycuick, C. J. (1975). On the running of the gnu (Connochaetes taurinus) and other
animals. Journal of Experimental Biology, 63, 775-799.
Pinto, J., & Shiffrar, M. (1999). Subconfigurations of the human form in the perception
of biological motion displays. Acta Psychologica, 102(2-3), 293-318.
Pittenger, J. B. (1985). Estimation of pendulum length from information in motion.
Perception, 14(3), 247-256.
Pittenger, J. B. (1990). Detection of violations of the law of pendulum motion:
Observers' sensitivity to the relation between period and length. Ecological
Psychology, 2(1), 55-81.
Pittenger, J. B., & Todd, J. T. (1983). Perception of growth from changes in body
proportions. Journal of Experimental Psychology: Human Perception and
Performance, 9(6), 945-954.
Pollick, F. E., Paterson, H. M., Bruderlin, A., & Sanford, A. J. (2001). Perceiving affect
from arm movement. Cognition, 82(2), B51-61.
Posner, M. I., & DiGirolamo, G. J. (1998). Executive attention: Conflict, target
detection, and cognitive control. In R. Parasuraman (Ed.), The attentive brain
(pp. 401-423). Cambridge, Massachusetts: MIT Press.
Puce, A., & Perrett, D. (2003). Electrophysiology and brain imaging of biological
motion. Philosophical Transactions of the Royal Society of London Series B,
358(1431), 435-445.
Puce, A., Smith, A., & Allison, T. (2000). ERPs evoked by viewing facial movements.
Cognitive Neuropsychology, 17(1-3), 221-239.
Puce, A., Syngeniotis, A., Thompson, J. C., Abbott, D. F., Wheaton, K. J., & Castiello,
U. (2003). The human temporal lobe integrates facial form and motion: evidence
from fMRI and ERP studies. Neuroimage, 19(3), 861-869.
Regolin, L., Tommasi, L., & Vallortigara, G. (1999). Discrimination of point-light
animation sequences by newborn chicks. Perception, 28 Supplement, 23.
Rizzolatti, G., & Fadiga, L. (in press). The mirror-neuron system and action recognition.
In H. J. Freund, M. Jeannerod & M. Hallett (Eds.), Higher-order motor
disorders: from Neuroanatomy and Neurobiology to Clinical Neurology. New
York: Oxford University Press.
Rizzolatti, G., Fadiga, L., Gallese, V., & Fogassi, L. (1996). Premotor cortex and the
recognition of motor actions. Cognitive Brain Research, 3(2), 131-141.
Rizzolatti, G., Fadiga, L., Matelli, M., Bettinardi, V., Paulesu, E., Perani, D., et al.
(1996). Localization of grasp representations in humans by PET: 1. Observation
versus execution. Experimental Brain Research, 111(2), 246-252.
Rizzolatti, G., Fogassi, L., & Gallese, V. (2001). Neurophysiological mechanisms
underlying the understanding and imitation of action. Nature Reviews
Neuroscience, 2(9), 661-670.
Rosenblum, L. D., Johnson, J. A., & Saldana, H. M. (1996). Point-light facial displays
enhance comprehension of speech in noise. Journal of Speech and Hearing
Research, 39(6), 1159-1170.
Rosenblum, L. D., & Saldana, H. M. (1996). An audiovisual test of kinematic primitives
for visual speech perception. Journal of Experimental Psychology: Human
Perception and Performance, 22(2), 318-331.
Rossion, B., & Gauthier, I. (2002). How does the brain process upright and inverted
faces? Behavioural and Cognitive Neuroscience Reviews, 1(1), 62-74.
Runeson, S., & Frykholm, G. (1981). Visual perception of lifted weight. Journal of
Experimental Psychology: Human Perception and Performance, 7(4), 733-740.
Runeson, S., & Frykholm, G. (1983). Kinematic specification of dynamics as an
informational basis for person-and-action perception: Expectation, gender
recognition, and deceptive intention. Journal of Experimental Psychology:
General, 112(4), 585-615.
Ryding, E., Decety, J., Sjoholm, H., Stenberg, G., & Ingvar, D. H. (1993). Motor
imagery activates the cerebellum regionally. A SPECT rCBF study with 99mTc-HMPAO.
Brain Research - Cognitive Brain Research, 1(2), 94-99.
Santi, A., Servos, P., Vatikiotis-Bateson, E., Kuratate, T., & Munhall, K. (2003).
Perceiving biological motion: dissociating visible speech from walking. Journal
of Cognitive Neuroscience, 15(6), 800-809.
Saxberg, B. V. (1987a). Projected free fall trajectories. I. Theory and simulations.
Biological Cybernetics, 56(2-3), 159-175.
Saxberg, B. V. (1987b). Projected free fall trajectories. II. Human experiments.
Biological Cybernetics, 56(2-3), 177-184.
Saygin, A. P., Wilson, S. M., Hagler, D. J., Jr., Bates, E., & Sereno, M. I. (2004).
Point-light biological motion perception activates human premotor cortex. Journal of
Neuroscience, 24(27), 6181-6188.
Schenk, T., & Zihl, J. (1997). Visual motion perception after brain damage: II. Deficits
in form-from-motion perception. Neuropsychologia, 35(9), 1299-1310.
Schmahmann, J. D., & Pandya, D. N. (1997). The cerebrocerebellar system. In J. D.
Schmahmann (Ed.), The cerebellum and cognition (Vol. 41, pp. 31-60). San
Diego, London: Academic Press.
Sedgwick, H. A. (1993). The effects of viewpoint on the virtual space of pictures. In S.
R. Ellis, M. K. Kaiser & A. Grunwald (Eds.), Pictorial communication in virtual
and real environments. New York: Taylor & Francis.
Servos, P., Osu, R., Santi, A., & Kawato, M. (2002). The Neural Substrates of
Biological Motion Perception: an fMRI Study. Cerebral Cortex, 12(7), 772-782.
Shiffrar, M., & Freyd, J. J. (1990). Apparent motion of the human body. Psychological
Science, 1(4), 257-264.
Shiffrar, M., & Freyd, J. J. (1993). Timing and apparent motion path choice with human
body photographs. Psychological Science, 4, 379-384.
Shipley, T. F. (2003). The effect of object and event orientation on perception of
biological motion. Psychological Science, 14(4), 377-380.
Small, D. M., Gitelman, D. R., Gregory, M. D., Nobre, A. C., Parrish, T. B., &
Mesulam, M. M. (2003). The posterior cingulate and medial prefrontal cortex
mediate the anticipatory allocation of spatial attention. Neuroimage, 18(3), 633-641.
Spencer, R. M., Zelaznik, H. N., Diedrichsen, J., & Ivry, R. B. (2003). Disrupted timing
of discontinuous but not continuous movements by cerebellar lesions. Science,
300(5624), 1437-1439.
Stappers, P. J., & Waller, P. E. (1993). Using the free fall of objects under gravity for
visual depth estimation. Bulletin of the Psychonomic Society, 31(2), 125-127.
Sumi, S. (1984). Upside-down presentation of the Johansson moving light-spot pattern.
Perception, 13(3), 283-286.
Talairach, J., & Tournoux, P. (1988). Co-planar stereotaxic atlas of the human brain.
Thieme Medical Publishers.
Tarr, M. J., & Bulthoff, H. H. (1995). Is human object recognition better described by
geon structural descriptions or by multiple views? Comment on Biederman and
Gerhardstein (1993). Journal of Experimental Psychology: Human Perception
and Performance, 21(6), 1494-1505.
Thier, P., Haarmeier, T., Treue, S., & Barash, S. (1999). Absence of a common
functional denominator of visual disturbances in cerebellar disease. Brain, 122,
2133-2146.
Thompson, P. (1980). Margaret Thatcher -- A new illusion. Perception, 9, 483-484.
Thornton, I. M., Rensink, R. A., & Shiffrar, M. (2002). Active versus passive
processing of biological motion. Perception, 31(7), 837-853.
Thornton, I. M., & Vuong, Q. C. (2004). Incidental processing of biological motion.
Current Biology, 14(12), 1084-1089.
Thorpe, S., Fize, D., & Marlot, C. (1996). Speed of processing in the human visual
system. Nature, 381(6582), 520-522.
Topka, H., Valls-Sole, J., Massaquoi, S. G., & Hallett, M. (1993). Deficit in classical
conditioning in patients with cerebellar degeneration. Brain, 116(Pt 4), 961-969.
Troje, N. F. (2002a). Decomposing biological motion: a framework for analysis and
synthesis of human gait patterns. Journal of Vision, 2(5), 371-387.
Troje, N. F. (2002b). The little difference: Fourier based gender classification from
biological motion. In R. P. Würtz & M. Lappe (Eds.), Dynamic Perception (pp.
115-120). Berlin: Aka Verlag.
Troje, N. F. (2003). Reference frames for orientation anisotropies in face recognition
and biological-motion perception. Perception, 32(2), 201-210.
Troje, N. F., & Bulthoff, H. H. (1996). Face recognition under varying poses: the role of
texture and shape. Vision Research, 36(12), 1761-1771.
Troje, N. F., & Kersten, D. (1999). Viewpoint-dependent recognition of familiar faces.
Perception, 28(4), 483-487.
Troje, N. F., Westhoff, C., & Lavrov, M. (in press). Person identification from
biological motion: Effects of structural and dynamic cues. Perception &
Psychophysics.
Ungerleider, L. G., & Mishkin, M. (1982). Two cortical visual systems. In D. L. Ingle,
J. W. Mansfield & M. A. Goodale (Eds.), Advances in the Analysis of Visual
Behavior (pp. 549-596). Cambridge, MA: MIT Press.
Vaina, L. M. (1994). Functional segregation of color and motion processing in the
human visual cortex: clinical evidence. Cerebral Cortex, 4(5), 555-572.
Vaina, L. M., Lemay, M., Bienfang, D. C., Choi, A. Y., & Nakayama, K. (1990). Intact
"biological motion" and "structure from motion" perception in a patient with
impaired motion mechanisms: A case study. Visual Neuroscience, 5(4), 353-369.
Vaina, L. M., Solomon, J., Chowdhury, S., Sinha, P., & Belliveau, J. W. (2001).
Functional neuroanatomy of biological motion perception in humans.
Proceedings of the National Academy of Sciences, 98(20), 11656-11661.
Valentine, T. (1988). Upside-down faces: a review of the effect of inversion upon face
recognition. British Journal of Psychology, 79(Pt 4), 471-491.
Warren, W. H., Jr., Kim, E. E., & Husney, R. (1987). The way the ball bounces: visual
and auditory perception of elasticity and control of the bounce pass. Perception,
16(3), 309-336.
Watson, J. S., Banks, M. S., von Hofsten, C., & Royden, C. S. (1992). Gravity as a
monocular cue for perception of absolute distance and/or absolute size.
Perception, 21(1), 69-76.
Webb, J. A., & Aggarwal, J. K. (1982). Structure from motion of rigid and jointed
objects. Artificial Intelligence, 19(1), 107-130.
Wheaton, K. J., Pipingas, A., Silberstein, R. B., & Puce, A. (2001). Human neural
responses elicited to observing the actions of others. Visual Neuroscience, 18(3),
401-406.
Woodruff-Pak, D. S., Papka, M., & Ivry, R. B. (1996). Cerebellar involvement in
eyeblink classical conditioning in humans. Neuropsychology, 10, 443-458.
Yamaguchi, M. K., & Fujita, K. (1999). Perception of biological motion by newly
hatched chicks and quail. Perception, 28 Supplement, 23-24.
Zimmermann, P., & Fimm, B. (1993). Testbatterie zur Aufmerksamkeitsprüfung.
Würselen: PSYTEST.
List of Partial Publications
Jokisch, D., Daum, I., Suchan, B., & Troje, N. F. (in press). Structural encoding and
recognition of biological motion: Evidence from event-related potentials and
source analysis. Behavioural Brain Research.
Jokisch, D., Daum, I., & Troje, N. F. (submitted). Self recognition versus recognition of
others by biological motion: Viewpoint-dependent effects. Perception.
Jokisch, D., & Troje, N. F. (2003). Biological motion as a cue for the perception of size.
Journal of Vision, 3(4), 252-264.
Jokisch, D., Troje, N. F., Koch, B., Schwarz, M., & Daum, I. (submitted). Differential
involvement of the cerebellum in biological and coherent motion perception.
European Journal of Neuroscience.
Declaration
I guarantee that I have written this dissertation autonomously and without any
illegitimate aids; the references and aids used are cited in their entirety. This
dissertation has not been submitted to another faculty, and it has not yet been published,
with the exception of the partial publications listed above. I guarantee that I will not
publish the dissertation before completion of the promotion procedure.
I have complied with the regulations laid down in the latest version of the “Guidelines
for Good Scientific Practice and Procedural Principles for Dealing with Suspected
Infringements in Academic Research Work”.
Acknowledgments
I would like to acknowledge many people for helping me during my doctoral work.
First of all, I would like to thank my supervisors Prof. Dr. Irene Daum and Prof. Dr.
Nikolaus F. Troje. Throughout my doctoral work they supported me with their great
engagement, their excellent knowledge and their analytical skills.
I would like to thank the International Graduate School of Neuroscience of the Ruhr-
University Bochum for providing great research and education opportunities and for
funding my research and attendance at international conferences.
I am grateful to my fellow colleagues at the Department of Neuropsychology and the
Biomotion-Laboratory of the Institute of Cognitive Neuroscience for their assistance
and fruitful discussions concerning my work. Special thanks to my fellow graduate
student Christian Bellebaum for proofreading and helpful comments on a preliminary
version of my dissertation. Thanks to Cord Westhoff for programming some of the
stimuli used in my experiments.
I would like to thank my fellow graduate students at the International Graduate School
of Neuroscience for their support during the entire period of our studies.
Last but not least, I am especially grateful to my girlfriend, my family and all my
friends for their understanding, patience and for helping me to keep my life in balance.
Curriculum Vitae
Personal Data
Name: Daniel Jokisch
Date of birth: 22.07.1975
Place of birth: Bottrop, Germany
Nationality: German
Address: Institute of Cognitive Neuroscience
Department of Neuropsychology
Ruhr-University Bochum
Universitätsstr. 150, 44780 Bochum
e-mail: [email protected]
Private address: Am Gartenkamp 18, 44807 Bochum
Educational background
1982-1995 Primary school and grammar school in Bottrop, degree
“Allgemeine Hochschulreife”
1995-1996 Alternative civilian service at the “Ambulanz Hilfe für das
autistische Kind” in Bottrop, Germany
1996-1999 Psychology student at the University of Trier, Germany
October 1998 Intermediate diploma in psychology (cumulative grade “sehr
gut”)
1999-2001 Psychology student at the Ruhr-University Bochum, Germany;
student assistant at the Department of Biopsychology
2000-2001 Diploma thesis “Visuelle Wahrnehmung von absoluter Größe in
biologischer Bewegung”
October 2001 Diploma in Psychology “with distinction”
2001-2004 PhD student at the “International Graduate School of
Neuroscience” of the Ruhr-University Bochum
since October 2004 Research assistant at the Department of Neuropsychology of the
Institute of Cognitive Neuroscience of the Ruhr-University
Bochum