“Agency and motor representations: new perspectives on intersubjectivity”*
Vittorio Gallese
Istituto di Fisiologia Umana, Università di Parma, Italy
*This paper includes abridged and slightly revised versions of several papers, whose references are given at the end. The relevant bibliography can be retrieved from these papers.
-I. A new perspective on the motor system: the ventral premotor area F5 of the monkey.
Convergent anatomical evidence (see Matelli
and Luppino 1997) shows that the ventral premotor cortex (referred to also
as inferior area 6) is composed of two distinct areas, designated as F4
and F5 (Matelli et al., 1985). Area F5 occupies the rostralmost part of
inferior area 6, extending rostrally within the posterior bank of the inferior
limb of the arcuate sulcus. Area F5 is reciprocally connected with the
hand field of the primary motor cortex (Matelli et al. 1986) and has direct,
although limited projections to the upper cervical segments of the spinal
cord (He et al. 1993). Intracortical microstimulation of F5 evokes hand
and mouth movements at thresholds generally higher than those in the primary
motor cortex (Gentilucci et al. 1988; Hepp-Reymond et al. 1994). The functional
properties of F5 neurons were assessed in a series of single-unit recording
experiments (Rizzolatti et al., 1981; Okano and Tanji, 1987; Rizzolatti
et al. 1988). These experiments showed that the activity of F5 neurons
is correlated with specific distal motor acts and not with the execution
of individual movements.
An important distinction to be made is
that between movement and motor act: what turns a movement into a motor act
is the presence of a goal. Using the effective motor act as the classification
criterion, the following classes of neurons were described: “Grasping neurons”,
“Holding neurons”, “Tearing neurons”, and “Manipulation neurons”.
The most interesting aspect of F5 neurons
is that they code movement in quite abstract terms. What is coded is not
simply a parameter such as force or movement direction, but rather the
relationship, in motor terms, between the agent and the object of the action.
F5 neurons become active only if a particular type of action (e.g. grasp,
hold, etc.) is executed to achieve a particular type of goal (e.g. to take
possession of a piece of food, to throw away an object, etc.).
The metaphor of a “motor vocabulary” has
been introduced (Gentilucci and Rizzolatti 1990; Rizzolatti et al. 1988)
in order to conceptualize the function of these neurons. This vocabulary
collects various “words”, each constituted by groups of neurons related
to different motor acts. The hierarchical value of these “words” can be
different: some of them indicate the general goal of the action (e.g. grasp,
hold, tear). Some other “words” concern the way in which a particular action
has to be executed, e.g. to grasp with the index finger and the thumb (precision
grip). Another group of “words” deals with the temporal phases in which
the action to be performed can be segmented (e.g. hand aperture phase).
The presence in the motor system of a “vocabulary”
of motor acts allows a much simpler selection of a particular action within
a given context. Within the context of a motor “vocabulary”, motor action
can be conceived as a simple assembly of words, instead of being described
in the less economical terms of the control of individual movements. This
radically new way of conceiving the function of the motor system opened the
possibility, which will become clearer in the following sections, of tackling
from a neurobiological perspective cognitive aspects of behavior such as
intersubjectivity.
Canonical neurons
Since most grasping actions are executed
under visual guidance, it is extremely interesting to elucidate the relationship
between the features of 3D visual objects and the specific "words" of the
motor vocabulary. On this view, the appearance of a graspable object in
visual space immediately retrieves the appropriate ensemble of
"words". This process, in neurophysiological terms, implies that the same
neuron must be able not only to code motor acts, but also to respond to
the visual features triggering them.
A considerable percentage of F5 grasping
neurons, "canonical neurons", respond to the visual presentation of objects
of different size and shape in the absence of any detectable movement (Rizzolatti
et al. 1988; Jeannerod et al. 1995; Murata et al. 1997). Very often a strict
congruence has been observed between the type of grip coded by a given
neuron and the size or the shape of the object effective in triggering
its visual response. The most interesting aspect, however, is the fact
that in a considerable percentage of neurons the congruence is observed
between the high selectivity for a given type of executed grip and the
selectivity for the visual presentation of a group of objects that, although
differing in shape, nevertheless all "afford" the same type of grip, which
is identical to the motorically coded one.
A first conclusion that can be drawn from
these data is that it is extremely difficult to conceptualize the function
of F5 canonical grasping neurons in purely sensory or motor terms. At this
stage objects seem to be processed in relational terms. In other words,
by means of a neural network, a series of physical entities, 3D objects,
are identified and differentiated not in relation to their mere physical
appearance, but in relation to the effect of the interaction with an acting
agent.
Mirror neurons
A second class of grasping-related neurons,
mirror neurons, has been described in area F5 of the macaque monkey (Gallese
et al. 1996; Rizzolatti et al. 1996a). These neurons, although sharing
the same motor properties as canonical neurons, differ sharply in the
nature of their visual properties. Mirror neurons are not activated during
the observation of objects but only during the observation of an agent
(a human being or a monkey) acting in a purposeful way with his hand or
his mouth upon objects. Neither the sight of the object alone nor that of the
agent alone is effective in driving these neurons. Mimicking the action
in the absence of the target object, or using a tool to execute the object-related
action, is similarly ineffective in driving mirror neurons' activity.
In a relevant percentage of mirror neurons
a strict congruence between the observed action effective in triggering
the neural visual response and the executed action effective in driving
the motor response has been observed. In other words, the observed action
performed by another individual evokes in the observer the same neural
pattern that occurs during the observer's own active execution of that action. Grasping,
holding, manipulating or tearing objects are the actions that, both when
observed and executed, most frequently activate these neurons.
On the basis of their functional properties
mirror neurons can be considered as constituting an action observation/execution
matching system. What is the link between acting and observing someone
else acting? This link is constituted by the presence in both instances
of a goal. This goal is recognized and "understood" by the observer by
mapping it on a shared motor representation. Again, as in the case of canonical
neurons, the motor system exhibits a double function. On the one hand it supervises
action execution, and on the other it "validates" in motor terms what is
perceived.
-II. Agency: Objects, Actions and their "Representations".
Agency, together with the related concept of motor representation, is particularly interesting from a neurophysiological point of view. The motor system masters not only the actual expression of its domain feature, movement, but also its own representation.
All different aspects of action, from its intention to its execution or observation, are part of a single representation-execution continuum. This continuum provides a unified but at the same time very flexible frame of reference that allows the non-self to be mapped onto the self by means of a process of analogy.
Both canonical and mirror neurons become
active motorically only when a given movement pattern of the hand is aimed
at a certain target in a certain way to achieve a certain goal. Within
this context, a goal could be conceptually defined as the explanation in
teleological terms of a willed relational attitude. I posit that this relational
attitude is also used to give, at a pre-conceptual level (well before
the development of linguistic competence), preliminary "intentional"
coherence to the array of visual stimuli we are exposed to. An object,
as coded by canonical neurons, is transformed from a physical textured
pattern of given shape, size and color into something that acquires its
meaning in virtue of being constituted as the target of an action. The
physical object becomes an intentional object. At this stage, the nervous
system elaborates a code that "classifies" the objects of the external
world according to their relational value for the acting subject. The object
ceases to exist by itself but acquires a meaning in virtue of its relation
to the acting subject.
There is undoubtedly a similarity between this
“intentional” model of object coding and the “pragmatic” type of coding
considered as the distinctive hallmark of the visual processing occurring
in the dorsal stream (see Jeannerod 1994). Within the “pragmatic” model,
however, the emphasis is placed on the physical properties of a given object
that can “afford” a given pragmatic relation with the acting agent, while
the “intentional” model places more stress on the role of the acting subject
in determining the meaning of the physical world (see also Bermudez 1995).
The same relational attitude is applied
when observing other behaving individuals. The observer begins to "understand"
the observed behavior of a third party when a "motor equivalence"
between action observation and execution is established by means of a shared
motor representation.
This notion of motor representation configures
a pre-conceptual level of analysis of information, which is deeply rooted
in the intrinsically relational nature of the motor system. From this perspective,
agency constitutes the key for understanding, both in phylogenetic and
ontogenetic terms, how our knowledge of the world is built.
-III. Mirror neurons and mind-reading.
Mind-reading is the activity of representing
specific mental states of others, e.g., their perceptions, goals, beliefs
or expectations, and the like. It is generally agreed that all normal humans
develop the capacity to represent mental states in others, a system of
representation often called folk psychology. Whether non-human primates
also deploy folk psychology is more controversial, but it certainly has
not been precluded. The hypothesis explored here is that mirror neurons (MNs) are part of
(perhaps a rudimentary part of) the folk psychologizing mechanism.
Like imitation learning, mind-reading could
make a contribution to inclusive fitness. Detecting another agent's goals
and/or other inner states can be useful to an observer because it helps
him anticipate the agent's future actions, which may be cooperative, non-cooperative,
or even threatening. Accurate understanding and anticipation enable the
observer to adjust his responses appropriately.
It is conceivable that externally generated
MN activity serves the purpose of "retrodicting" the target's mental state,
moving backwards from the observed action. Let us interpret internally
generated activation in MNs as constituting a plan to execute a certain
action, e.g., the action of holding a certain object, grasping it, or manipulating
it. When the same MNs are externally activated -- by observing a target
agent execute the same action -- MN activation still constitutes a plan
to execute this action. But in the latter case the subject of the MN activity
knows (visually) that the observed target is concurrently performing this
very action. So we assume that he "tags" the plan in question as belonging
to that target. In fact, externally generated MN activity does not normally
produce motor execution of the plan in question. Externally generated plans
are largely inhibited, or taken "off line", precisely as Simulation Theory
postulates. So MN activity seems to be nature's way of getting the observer
into the same "mental shoes" as the target, which is exactly what the conjectured
simulation heuristic is all about.
However, in the monkey case the attributer
does not go back to a distal goal or set of beliefs. He only goes back
to a motoric plan. Still, this seems to be a "primitive" use of simulation
which bears a resemblance to the motor theory of speech perception advocated
by Liberman, in which the common link between the sender and the receiver
is not sound but the neural mechanism, shared by both, allowing the production
of phonetic gestures.
The speculative suggestion that we have
put forward (see Gallese and Goldman 1998) is that a "cognitive continuity"
seems to exist within the domain of intentional-state attribution between
non-human primates and humans, and that MNs represent its neural correlate.
This continuity is grounded in the ability to detect goals in the observed
behavior of conspecifics. The capacity to understand action goals, already
present in non-human primates, relies on a process that matches the observed
behavior onto the action plans of the observer. It is true, as pointed
out by Meltzoff and Moore, that the understanding of action goals does
not imply a full grasp of mental states like belief or desire. The understanding
of action goals nevertheless constitutes a necessary phylogenetic stage
within the evolutionary path leading to the fully developed mind-reading
abilities of human beings.
References
Gallese, V., Fadiga, L., Fogassi, L. and Rizzolatti, G. Action recognition in the premotor cortex. Brain 119: 593-609, 1996.
Fadiga, L. and Gallese, V. Action representation and language in the brain. Theoretical Linguistics, 23: 267-280, 1997.
Rizzolatti, G. and Gallese, V. From action to meaning. In: Les Neurosciences et la Philosophie de l'Action. J.-L. Petit (ed.). Librairie Philosophique J. Vrin, Paris, 1997.
Gallese, V. and Goldman, A. Mirror neurons and the simulation theory of mind-reading. Trends in Cognitive Sciences, 2(12): 493-501, 1998.
Gallese, V. Agency and the self model. Consc. Cogn. 8: 387-389, 1999.
Gallese, V. From grasping to language: mirror neurons and the origin of social communication. In: Towards A Science of Consciousness. S. Hameroff, A. Kaszniak and D. Chalmers (eds.), MIT Press, 2000.
Gallese, V. The acting subject: towards the neural basis of social cognition. In: Neural Correlates of Consciousness - Empirical and Conceptual Questions. T. Metzinger (ed.), pp. 325-333, MIT Press, 2000.
Rizzolatti, G., Fogassi, L. and Gallese, V. Cortical mechanisms subserving object grasping and action recognition: a new view on the cortical motor functions. In: The New Cognitive Neurosciences, 2nd Edition, M. Gazzaniga (ed.), pp. 539-552, MIT Press, 2000.
Rizzolatti, G., Fadiga, L., Fogassi, L. and Gallese, V. From mirror neurons to imitation: facts and speculations. In: W. Prinz and A. Meltzoff (eds.), The Imitative Mind: Development, Evolution and Brain Bases, Cambridge University Press, 2000 (in press).