from sensation to cognition brain 1998


  • 8/10/2019 From Sensation to Cognition Brain 1998

    1/40

    Brain (1998), 121, 1013–1052

    REVIEW ARTICLE

    From sensation to cognition

    M.-Marsel Mesulam

    The Cognitive Neurology and Alzheimer's Disease Center, Departments of Neurology and Psychiatry and Behavioral Sciences, Northwestern University Medical School, Chicago, USA

    Correspondence to: M. Mesulam, Cognitive Neurology and Alzheimer's Disease Center, Northwestern University Medical School, 320 East Superior Street, 11450, Chicago, IL 60611, USA. E-mail: [email protected]

    Summary

    Sensory information undergoes extensive associative elaboration and attentional modulation as it becomes incorporated into the texture of cognition. This process occurs along a core synaptic hierarchy which includes the primary sensory, upstream unimodal, downstream unimodal, heteromodal, paralimbic and limbic zones of the cerebral cortex. Connections from one zone to another are reciprocal and allow higher synaptic levels to exert a feedback (top-down) influence upon earlier levels of processing. Each cortical area provides a nexus for the convergence of afferents and divergence of efferents. The resultant synaptic organization supports parallel as well as serial processing, and allows each sensory event to initiate multiple cognitive and behavioural outcomes. Upstream sectors of unimodal association areas encode basic features of sensation such as colour, motion, form and pitch. More complex contents of sensory experience such as objects, faces, word-forms, spatial locations and sound sequences become encoded within downstream sectors of unimodal areas by groups of coarsely tuned neurons. The highest synaptic levels of sensory-fugal processing are occupied by heteromodal, paralimbic and limbic cortices, collectively known as transmodal areas. The unique role of these areas is to bind multiple unimodal and other transmodal areas into distributed but integrated multimodal representations. Transmodal areas in the midtemporal cortex, Wernicke's area, the hippocampal–entorhinal complex and the posterior parietal cortex provide critical gateways for transforming perception into recognition, word-forms into meaning, scenes and events into experiences, and spatial locations into targets for exploration. All cognitive processes arise from analogous associative transformations of similar sets of sensory inputs. The differences in the resultant cognitive operation are determined by the anatomical and physiological properties of the transmodal node that acts as the critical gateway for the dominant transformation. Interconnected sets of transmodal nodes provide anatomical and computational epicentres for large-scale neurocognitive networks. In keeping with the principles of selectively distributed processing, each epicentre of a large-scale network displays a relative specialization for a specific behavioural component of its principal neuropsychological domain. The destruction of transmodal epicentres causes global impairments such as multimodal anomia, neglect and amnesia, whereas their selective disconnection from relevant unimodal areas elicits modality-specific impairments such as prosopagnosia, pure word blindness and category-specific anomias. The human brain contains at least five anatomically distinct networks. The network for spatial awareness is based on transmodal epicentres in the posterior parietal cortex and the frontal eye fields; the language network on epicentres in Wernicke's and Broca's areas; the explicit memory/emotion network on epicentres in the hippocampal–entorhinal complex and the amygdala; the face-object recognition network on epicentres in the midtemporal and temporopolar cortices; and the working memory-executive function network on epicentres in the lateral prefrontal cortex and perhaps the posterior parietal cortex. Individual sensory modalities give rise to streams of processing directed to transmodal nodes belonging to each of these networks. The fidelity of sensory channels is actively protected through approximately four synaptic levels of sensory-fugal processing. The modality-specific cortices at these four synaptic levels encode the most veridical representations of experience. Attentional, motivational and emotional modulations, including those related to working memory, novelty-seeking and mental imagery, become increasingly more pronounced within downstream components of unimodal areas, where they help to create a highly edited subjective version of the world. The prefrontal cortex plays a critical role in these attentional and emotional modulations and allows neural responses to reflect the significance rather than the surface properties of sensory events. Additional modulatory influences are exerted by the cholinergic and monoaminergic pathways of the ascending reticular activating system. Working memory, one of the most prominent manifestations of prefrontal cortex activity, prolongs the neural impact of environmental and mental events in a way that enriches the texture of consciousness. The synaptic architecture of large-scale networks and the manifestations of working memory, novelty-seeking behaviours and mental imagery collectively help to loosen the rigid stimulus–response bonds that dominate the behaviour of lower animal species. This phylogenetic trend has helped to shape the unique properties of human consciousness and to induce the emergence of second-order (symbolic) representations related to language. Through the advent of language and the resultant ability to communicate abstract concepts, the critical pacemaker for human cognitive development has shifted from the extremely slow process of structural brain evolution to the much more rapid one of distributed computations where each individual intelligence can become incorporated into an interactive lattice that promotes the transgenerational transfer and accumulation of knowledge.

    Keywords: cerebral cortex; language; memory; consciousness; attention; neural networks

    Abbreviations: AIT = anterior inferotemporal cortex; ARAS = ascending reticular activating system; BA = Brodmann area; LIP = lateral intraparietal sulcus; MST = medial superior temporal area; PIT = posterior inferotemporal cortex

    © Oxford University Press 1998

    Introduction

    A major task of the CNS is to configure the way in which sensory information becomes linked to adaptive responses and meaningful experiences. Small brains cope with this challenge by shifting the detection of significant events to peripheral organs. In the frog, for example, the inner ear is selectively tuned to spectral and temporal properties of species-specific mating calls (Capranica, 1978) and the retinotectal projection contains 'bug perceiver' fibres that respond best when a dark object, smaller than a receptive field, enters that field, stops, and moves about intermittently (Lettvin et al., 1959). Although this type of automatic pattern detection maximizes the limited processing capacity of the CNS, it also restricts the range of events that can be identified and the flexibility with which they become interpreted.

    Inflexible bonds between sensation and action lead to instinctual and automatic behaviours that are resistant to change, even when faced by negative consequences. For example, frogs whose optic nerve has been cut and allowed to fully regenerate after a 180° rotation of the eye will repeatedly snap at mud and moss on the ground when presented with a fly above the head (Sperry, 1965); a turkey hen, whose protective maternal instincts dictate an attack on any moving object that fails to utter the characteristic peep of her chicks, will peck her own newly hatched progeny to death if she is made deaf (Schleidt and Schleidt, 1960); a herring gull whose eggs have been displaced to an adjacent and clearly visible site will proceed to incubate the emptied original nest and ignore the clutch of eggs lying right next to her (Tinbergen, 1951); and rats with crossed sensory nerves in the hind limbs, one of which is inflamed, will hop on three legs to protect the healthy rather than the sore foot (Sperry, 1965).

    Advanced mammals with an intact CNS are less vulnerable to the emergence of such inflexible patterns. With the exception of some autonomic, brainstem and spinal reflexes, the behaviour of primates displays a much greater latitude in translating sensation into action, so that identical sensory events can potentially trigger one of many different reactions, depending on the peculiarities of the prevailing context. A stimulus that deserves to be approached in one setting may need to be avoided in another; highly desirable consummatory acts may need to be postponed in the presence of danger; the same glass may appear half-full or half-empty depending on mood; and a petite madeleine can trigger a spectrum of reactions ranging from brief salivation to a torrent of words that can hardly be contained by seven volumes of compact prose. This loosening of stereotyped stimulus–response linkages endows the organism with the biological freedom to choose one of many potentially available responses. The resultant coarse mapping of responses onto circumstances creates a setting where the rules of competitive selection can operate to promote rapid change and adaptation.

    Streams of sensory processing in amphibians, reptiles and birds are kept on a short leash: the synaptic interval between the onset of a sensory event and its closure in the form of multimodal convergence, perceptual recognition, motivational valuation and action is brief, usually of the order of one or two synapses (Ingle, 1975; Northcutt, 1978; Gorlick et al., 1984; Gaillard, 1990). The CNS of advanced mammals displays major modifications in this plan of organization. The fidelity of sensory encoding is enhanced by reserving vast areas of the cerebral neocortex for modality-specific processing. The opportunities for subsequent integration are enhanced by the development of extensive multimodal association areas. And an intrinsic bias emerges to pursue novelty and flexibility rather than sameness and stereotypy. The emergence of behavioural flexibility in mammals can be attributed to an expansion of the synaptic bridge that links sensation to action and recognition. An excessive length in such a synaptic bridge could have led to unacceptable delays in reacting to the environment, whereas too many lanes could have undermined the capacity for integration. As will be shown below, primate evolution reflects a compromise between such extremes: the synaptic chain that links sensation to action has been lengthened, but not by too many synapses; and the bridge between the two has been broadened by the introduction of numerous parallel lines of communication, but with ample opportunities for integrative interaction among individual channels.

    The neural systems that bridge the gap between sensation and action provide the substrates for 'intermediary' or 'integrative' processing. The behavioural outcome of intermediary processing is known as 'cognition', and includes the diverse manifestations of memory, emotion, attention, language, thought and consciousness. The synaptic volume dedicated to intermediary processing shows a marked increase in phylogeny and occupies the great majority of the cerebral cortex in advanced primates and cetaceans. These intermediary areas of the brain enable identical stimuli to trigger different responses depending on the situational context, past experience, present needs and contemplated consequences.

    The neurons that support intermediary processing are located predominantly within the association and limbic areas of the cerebral cortex. The importance of the cerebral cortex to behaviour varies from species to species. In the fish, frog and pigeon, decortication produces little change in sensation or locomotion (Ferrier, 1876; Northcutt, 1978). Extensive neocortical lesions in hamsters, cats and monkeys cause considerably lesser and more reversible functional deficits than analogous lesions in humans (Lashley, 1952; MacLean, 1982). Furthermore, cortical lesions may be far more effective than tectal lesions in causing deficits such as hemispatial neglect in monkeys, whereas the converse relationship is seen in cats (Mesulam et al., 1977). These observations point to the existence of a trend towards a progressive corticalization of function, and suggest that cognition, defined as the behavioural outcome of intermediary processing in the cerebral cortex, becomes established as an obligatory rather than a facultative correlate of information processing in the course of evolution.

    The synaptic arrangement of neural pathways involved in intermediary processing provides biological constraints that shape the nature of cognition and comportment. The goal of this review is to sketch some of the principles that guide the organization of these pathways. Several limitations will become apparent in the course of the following account. First, much of the discussion will be limited to the cerebral cortex and will omit the very important contributions of subcortical structures such as the thalamus, striatum, claustrum and brainstem. Second, only visual and auditory streams of processing will be discussed in any detail. Third, although domains such as perception, language, memory, attention and emotion will be addressed, this will be done in a very limited fashion, to provide highly selective examples of common principles that link sensation to cognition. Fourth, although several phylogenetic trends will be mentioned, including numerous references to the CNS in the frog, only the monkey and human brains will be discussed in any serious detail. Fifth, the subject matter will lead to a discussion of consciousness, but only in the form of a brief incursion into realms of immense philosophical complexity.

    Behavioural neuroanatomy of the cerebral cortex

    Functional zones

    The human cerebral cortex contains ~20 billion neurons (Pakkenberg and Gundersen, 1997). The absence of clear anatomical demarcations has encouraged the development of numerous independent approaches to the subdivision of the cerebral cortex. The resultant maps can be divided into two groups: those based primarily on structural (architectonic) features, and those based primarily on functional affiliations. Proponents of one school have constructed a wide variety of cortical maps, ranging in complexity from the map of Exner (1881), which boasted hundreds of sharply delineated subdivisions, to the more modest and also more widely accepted ones of Brodmann (1909), the Vogts (Vogt and Vogt, 1919), von Economo (Economo and Koskinas, 1925) and Flechsig (1920). The second school is more difficult to identify since few of its proponents have produced systematic surveys of the entire brain. Members of this second school include theoreticians of brain function such as Campbell (1905), Broca (1878), Abbie (1942), Filimonoff (1947), Yakovlev (1959) and Sanides (1970). The thinking of this school has led to the subdivision of the cerebral cortex into the five major functional subtypes that will be described below: primary sensory–motor, unimodal association, heteromodal association, paralimbic and limbic (Mesulam, 1985b). The principal factual base for this parcellation is derived from anatomical, physiological and behavioural experiments in macaque monkeys. The homologies to the human brain have been inferred from comparative cytoarchitectonics, electrophysiological recordings, functional imaging, and the behavioural effects of focal lesions. Unless stated otherwise, the descriptions in this section can be assumed to apply to the brains of both species.

    The primary sensory and motor cortices are easily delineated on cytoarchitectonic and functional grounds. The primary visual cortex [also known as V1, striate cortex, calcarine cortex or Brodmann area (BA) 17] covers the banks of the calcarine fissure; the primary auditory cortex (also known as A1, or BA 41–42) covers Heschl's gyrus on the floor of the Sylvian cistern; the primary somatosensory cortex (also known as S1 or BA 3b) covers the anterior flank of the postcentral gyrus; the primary gustatory cortex is probably located in the fronto-insular junction in BA 43; and the primary motor cortex (also known as M1) includes BA 4 and probably also a posterior rim of BA 6 in the precentral gyrus. The primary sensory areas provide an obligatory portal for the entry of sensory information into the cortical circuitry, whereas the primary motor cortex provides a major gateway for relaying complex motor programmes into bulbar and spinal motoneurons. The primary sensory areas (where densely packed small neurons give rise to a distinctive 'koniocortical' architecture) and the primary motor cortex (where large pyramidal neurons give rise to a distinctive 'macropyramidal' architecture) represent the most highly differentiated and specialized subdivisions of the cerebral cortex (Mesulam, 1985b).

    Sensory-fugal processing starts with the transfer of information from primary sensory areas to unimodal association areas. These areas fulfil three criteria: (i) their major source of sensory-fugal projections is located in primary sensory areas and other unimodal association areas in that modality; (ii) the constituent neurons respond to stimulation predominantly, if not exclusively, in that particular sensory modality; and (iii) lesions lead to behavioural deficits confined to tasks under the control of that particular sensory modality. Unimodal areas can be divided into upstream and downstream components: upstream areas are only one synapse away from the relevant primary sensory area whereas downstream areas are at a distance of two or more synaptic units from the corresponding primary area.
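The upstream/downstream distinction is a statement about synaptic distance, so it can be sketched as a shortest-path computation on a directed graph of sensory-fugal connections. The example below is only an illustration under stated assumptions: the function names and the dictionary `VISUAL_CONNECTIONS` are hypothetical, the V1 projections follow the review's later description of the visual system, and the remaining edges are simplifications chosen for the example rather than a complete anatomical map.

```python
from collections import deque

# Illustrative sensory-fugal connections (hypothetical, simplified).
# Only the V1 row follows the text of the review (V2, V3, V4 and V5
# receive monosynaptic input from V1); the other edges are assumptions
# made for this example.
VISUAL_CONNECTIONS = {
    "V1": ["V2", "V3", "V4", "V5"],
    "V4": ["PIT"],
    "V5": ["MST"],
    "PIT": ["AIT"],
}


def synaptic_distances(connections, primary):
    """Breadth-first search giving the minimum number of synapses
    separating each reachable area from the primary sensory area."""
    dist = {primary: 0}
    queue = deque([primary])
    while queue:
        area = queue.popleft()
        for target in connections.get(area, ()):
            if target not in dist:
                dist[target] = dist[area] + 1
                queue.append(target)
    return dist


def classify_unimodal(distance):
    """Label a unimodal area by its synaptic distance from the primary
    area: one synapse is upstream, two or more is downstream."""
    if distance == 0:
        return "primary"
    if distance == 1:
        return "upstream"
    return "downstream"


if __name__ == "__main__":
    distances = synaptic_distances(VISUAL_CONNECTIONS, "V1")
    for area, d in sorted(distances.items(), key=lambda item: item[1]):
        print(area, d, classify_unimodal(d))
```

The classification applies only to areas that already satisfy the three unimodal criteria; synaptic distance alone does not distinguish a downstream unimodal area from a heteromodal one.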

    The unimodal visual association cortex can be divided into an upstream peri-striate component which includes areas BA 18–19, and a downstream temporal component which includes the inferotemporal regions (BA 21–20) in the monkey, and the fusiform, inferior temporal and perhaps parts of the middle temporal gyri in humans.¹ The unimodal auditory association cortex covers the superior temporal gyrus (BA 22) and perhaps also parts of the middle temporal gyrus (BA 21) in the human (Creutzfeldt et al., 1989). The connectivity of the monkey brain would suggest that the posterior parts of the superior temporal cortex (BA 22) display the properties of upstream auditory association cortex whereas the more anterior parts of this gyrus and the dorsal banks of the superior temporal sulcus may fit the designation of downstream auditory association cortex (Pandya and Yeterian, 1985). In the monkey brain, rostral BA 5 represents the upstream component of somatosensory unimodal association cortex, whereas caudal BA 5 and BA 7b may represent its downstream components (Pandya and Yeterian, 1985). In the human, unimodal somatosensory association cortex may include BA 5, parts of BA 7 and perhaps BA 2. The subdivision of unimodal auditory and somatosensory association cortices into upstream and downstream areas remains to be elucidated in the human brain. Premotor regions (anterior BA 6 and caudal BA 8) fulfil the role of motor 'association' areas because they provide the principal cortical input into the primary motor cortex.

    ¹ Approximately half of middle temporal gyrus (MTG) neurons respond to speech (Creutzfeldt et al., 1989; Vincent et al., 1997). However, some tasks based on visually presented faces, words and objects can also activate the MTG (Vandenberghe et al., 1996; Gorno Tempini et al., 1997; Schultz et al., 1997). At this time it is not clear whether all of the MTG should be classified as heteromodal cortex or whether it should be divided into sections of unimodal auditory, heteromodal, and unimodal visual cortex.

    The next stage of sensory-fugal processing occurs in heteromodal association areas, which are characterized by three criteria: (i) they receive convergent inputs from unimodal areas in more than one modality; (ii) unit recordings show that constituent neurons respond to stimulation in more than one sensory modality or that neurons responding to one modality are interspersed with those that respond to another; (iii) lesions always yield multimodal behavioural deficits. The monkey brain has heteromodal areas in the prefrontal cortex (BA 9, 10, 45, 46, anterior BA 8, anterior BA 11–12), in the inferior parietal lobule (parts of BA 7), in the banks of the superior temporal sulcus (junction of BA 22 with BA 21), and in the parahippocampal region (Pandya and Yeterian, 1985). In the human brain the analogous realms of the heteromodal cortex are located in the prefrontal cortex, the posterior parietal cortex (posterior BA 7, BA 39–40), parts of the lateral temporal cortex (probably corresponding to parts of BA 37 and BA 21) and portions of the parahippocampal gyrus (parts of BA 35–36). The unimodal and heteromodal areas are characterized by a six-layered homotypical architecture. The columnarization and laminar differentiation of neurons is more conspicuous in unimodal than heteromodal areas. If primary sensory and motor areas constitute the most highly specialized and differentiated parts of the cortex, unimodal and heteromodal areas occupy the two subsequent levels of differentiation.

    A further stage of sensory-fugal processing occurs in a group of areas designated paralimbic. These areas provide a zone of gradual cytoarchitectonic transition between the homotypical isocortex and the more primitive allocortex of core limbic structures. The primate brain contains five paralimbic regions: the caudal orbitofrontal cortex (caudal BA 11–12, BA 13), the insula (BA 14–16), the temporal pole (BA 38), the parahippocampal gyrus (BA 27, 28 and parts of BA 35), and the retrosplenial-cingulate-parolfactory complex (BA 23–26 and BA 29–33). These paralimbic formations can be divided into two groups. The temporopolar-insular-orbitofrontal regions merge into each other and constitute the olfactocentric subdivision of the paralimbic belt because they provide a transition between the olfactory allocortex and the homotypical cortex. The amygdala is the major core limbic structure associated with this set of paralimbic regions. The parahippocampal and posterior cingulate regions constitute the hippocampocentric subdivision of the paralimbic zone because they provide a transition between the hippocampal formation (including its induseal rudiment) and the homotypical cortex. These two subdivisions collectively form a gapless paralimbic ring which encircles the medial and basal components of the cerebral hemispheres (Mesulam and Mufson, 1985).

    The last cortical stage in the sensory-fugal stream of information processing occurs within five core limbic formations: the hippocampal complex, the amygdaloid complex, the prepiriform olfactory cortex, the septal area and the substantia innominata. These regions are characterized by two properties: they display a primitive allocortical architecture (palaeocortical in the case of olfactory areas, archicortical in the case of the hippocampus, and corticoid in the case of the amygdala, septum and substantia innominata), and they have massive reciprocal connections with the hypothalamus.

    Fig. 1 The straight arrows illustrate monosynaptic sensory-fugal neural connections in the visual and auditory modalities. The thick arrows represent more massive connections than the thin arrows. The broken arrows illustrate motor output pathways. The latter are not discussed in this review.

    The entire cortical surface can thus be divided into five functional zones which collectively display a continuous spectrum of cytoarchitectonic differentiation from the most highly differentiated primary sensory–motor areas to the least differentiated limbic structures. This architectonic hierarchy is paralleled by a relative hierarchy (or polarization) of connectivity. Figure 1 illustrates the sensory-fugal gradient of connectivity which sequentially conveys sensory information about the extrapersonal environment from the primary sensory to the upstream unimodal, downstream unimodal, heteromodal, paralimbic and limbic areas of the monkey brain (Pandya et al., 1981; Mesulam and Mufson, 1985; Pandya and Yeterian, 1985; Van Essen, 1985; Moran et al., 1987; Morecraft et al., 1992; Aggleton, 1993). Some connections, as in the case of the projection from the posterior superior temporal gyrus to the entorhinal cortex (Amaral et al., 1983), cross levels. However, such connections are not as prominent as those that extend between two adjacent levels. Figure 1 is based on the organization of visual and auditory pathways. Although somatosensory pathways follow many of the same principles of organization, they also display unique properties such as the existence of monosynaptic connections between primary sensory and primary motor areas. Olfactory and gustatory pathways have been excluded from this analysis because they represent chemical senses and follow a different plan of organization, reflecting a closer relationship to the internal than to the external milieu.
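The sensory-fugal gradient can be pictured as an ordered list of zones in which most connections join adjacent entries and a few skip levels. The sketch below is a hypothetical illustration, not part of the review: the names `ZONES` and `describe_connection` are invented here, and placing the posterior superior temporal gyrus in the upstream unimodal zone and the entorhinal cortex in the paralimbic zone is an assumption drawn from the zone descriptions earlier in this section.

```python
# Zones of the sensory-fugal gradient, ordered from the primary sensory
# cortex to the limbic structures, as listed in the text.
ZONES = [
    "primary sensory",
    "upstream unimodal",
    "downstream unimodal",
    "heteromodal",
    "paralimbic",
    "limbic",
]


def describe_connection(source_zone, target_zone):
    """Return (direction, levels spanned, crosses_levels) for a projection
    between two zones. A projection 'crosses levels' when it spans more
    than one step of the gradient."""
    step = ZONES.index(target_zone) - ZONES.index(source_zone)
    if step > 0:
        direction = "sensory-fugal"
    elif step < 0:
        direction = "feedback"
    else:
        direction = "same level"
    return direction, abs(step), abs(step) > 1


if __name__ == "__main__":
    # Typical case: a projection between two adjacent levels.
    print(describe_connection("upstream unimodal", "downstream unimodal"))
    # Assumed zone assignments for the level-crossing projection from the
    # posterior superior temporal gyrus to the entorhinal cortex.
    print(describe_connection("upstream unimodal", "paralimbic"))
```

Because connections between zones are reciprocal, each sensory-fugal pair in such a table would have a feedback counterpart obtained by swapping the two arguments.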

    As noted above, only the constituents of the limbic zone have massive reciprocal monosynaptic connections with the hypothalamus, a nuclear complex that functions as the principal coordinator of the homeostatic, autonomic and hormonal aspects of the internal milieu. In keeping with this connectivity, the behavioural affiliations of the limbic zone are polarized towards the internal milieu and deal with the regulation of emotion, motivation, memory and autonomic–endocrine function. The opposite pole of this spectrum is occupied by the primary sensory–motor areas, which display the most highly differentiated cytoarchitecture and which are polarized towards the extrapersonal world rather than the internal milieu. The zones of unimodal, heteromodal and paralimbic areas are inserted between these two extremes and act as neural bridges that link the inside to the outside world so that the needs of the internal milieu can be discharged according to the opportunities and restrictions presented by the extrapersonal environment. Within the context of these behavioural affiliations, the unimodal and heteromodal areas are most closely involved in perceptual elaboration and motor planning, whereas the paralimbic areas play a more critical role in channelling emotion and motivation to behaviourally relevant intrapsychic and extrapersonal targets. This functional landscape of the cerebral cortex provides the basic template for linking sensation to cognition in the primate brain.

    Connections and multimodal convergence

    The literature of the 1940s and 1950s expressed considerable ambivalence about the status of the cerebral cortex in mental functioning and tended to ignore the seminal contributions of Hughlings Jackson, Ferrier, Charcot, Dejerine, Wernicke and Liepmann. Wilder Penfield (1938), for example, had attributed the highest levels of integration to thalamic activity and Karl Lashley (1952) had raised serious doubts about the existence of functional specializations in the association cortex, even in the monkey. The modern resurgence of interest in the cerebral cortex can be traced to the publication of the 'disconnexion syndromes' papers by Norman Geschwind (1965). Based on a comprehensive review of the literature, Geschwind suggested that the critical neural substrate of mental function revolved around precisely organized cortico-cortical pathways that interconnect behaviourally specialized regions of the cerebral cortex. This inference was particularly interesting, since very little was known then about the details of cortical connectivity. At about the same time, new methods based on the selective silver impregnation of degenerated axons were becoming available and enabled a detailed investigation of cortico-cortical connectivity in monkeys. The basic organization of these connections was described in two classic papers, one by Pandya and Kuypers (1969) and the other by Jones and Powell (1970). These papers outlined an orderly and hierarchical connectivity, mostly consistent with Geschwind's account, for linking sensory cortices to primary, secondary and sometimes even tertiary modality-specific association areas, which in turn sent convergent projections to multimodal sensory association zones. Such multimodal sensory convergence areas were identified in the posterior parietal, lateral prefrontal and temporal cortices of the rhesus monkey.

    The field of neuroscience had been primed to anticipate such a sequential organization through the work of Hubel and Wiesel (1965), who had demonstrated a hierarchy of simple, complex and hypercomplex neurons in the primary visual cortex, each successive level encoding a more composite aspect of visual information. The discoveries of Pandya and Kuypers and of Jones and Powell seemed to be extending this serial and convergent organization from the realms of the primary sensory cortex and sensation to those of the association cortex and cognition. A great deal of emphasis was placed on the pivotal role of multimodal convergence as the final and supreme site of integration for all aspects of mental function, including the storage of memories, the formation of concepts and the acquisition of language (Geschwind, 1965; Pandya and Kuypers, 1969; Jones and Powell, 1970; Van Hoesen et al., 1972; Pandya and Seltzer, 1982; Mesulam, 1985b).

    While the importance of serial processing and multimodal convergence to cognitive function was widely accepted, some potentially serious computational limitations of such an arrangement were also acknowledged (Rumelhart and McClelland, 1986; Goldman-Rakic, 1988; Mesulam, 1990). The surfacing of these concerns coincided with the development of far more powerful neuroanatomical methods, based on the intra-axonal transport of horseradish peroxidase and tritiated amino acids, which started to show that the sensory-fugal flow of information was more complicated than previously surmised. The classic description of visuo-fugal pathways, for example, had assumed that information from V1 (BA 17) is transferred successively and serially to a first-order visual association area, V2 (BA 18), then to a second-order visual association area corresponding to BA 19, and finally to a third-order association area in the inferotemporal cortex. The newer methods revealed a somewhat different picture. The entire extent of V1 did indeed project to V2 (BA 18) in a topographically well-ordered fashion. However, V1 and V2 then gave rise to multiple parallel pathways that projected to numerous discrete peristriate visual association areas, located mostly within BA 19 and designated V3, V4, V5 (MT), VP, V6 (PO) and so on (Felleman and Van Essen, 1991). The further occipito-fugal flow of visual information took the form of two divergent multisynaptic pathways. One was directed dorsally towards the parietal cortex and was specialized for encoding the spatial attributes of visual information, and the second was directed ventrally towards downstream visual association areas of the temporal lobe and was specialized in the identification of faces and objects (Ungerleider and Mishkin, 1982; Mesulam, 1994a). This expanded view of visuo-fugal pathways will provide a starting point for exploring a new form of connectivity that links sensation to cognition.

    From sensation to perception

    The representation of visual experience

    Figure 2a contains a schematized summary of cortical connectivity in the visual system of the monkey brain and is based on the review by Felleman and Van Essen (1991). Virtually all of these connections are reciprocal. They are represented on a template of concentric circles where each circle is separated from the next by at least one unit of synaptic distance. V1 occupies the first synaptic level. The ascending synaptic levels in Fig. 1 follow a 'downstream', 'feed-forward', 'sensory-fugal' or 'bottom-up' direction with respect to the visual modality, whereas descending levels can be described as following an 'upstream', 'feed-back', 'sensory-petal' or 'top-down' direction.

    Primary visual cortex (V1) is the exclusive cortical recipient of projections from the magnocellular and parvocellular layers of the lateral geniculate nucleus and provides a precise retinotopic mapping of the visual fields. In addition to retinotopic location, its neurons are sensitive to orientation, movement, binocular disparity, length, spatial frequency, wavelength and luminance. Aggregations of neurons that are preferentially sensitive to colour, stereopsis, orientation and movement form a multidimensional mosaic of columns, layers and cytochrome oxidase-reactive (or -negative) modules (Van Essen, 1985). Areas V2, V3, V4 and V5 (MT) are monosynaptically connected with V1 and therefore constitute upstream visual association areas at the second synaptic level, whereas MST (medial superior temporal area), LIP (lateral intraparietal sulcus), the posterior and anterior inferotemporal cortex (PIT and AIT), temporal area TF (BA 20) and the caudal inferior parietal lobule (BA 7a) constitute some of the downstream visual association areas at the third and fourth synaptic levels.² The cortical nodes at the second and perhaps also third synaptic levels fulfil the criteria for unimodal visual association areas as described above. When taken as a whole, area TF and area 7a display features of heteromodal cortex but appear to

    ² Parts of V4 receive monosynaptic input from V1, but this may not be seen in every case (Felleman and Van Essen, 1991). Area V4 may therefore have parts that should be designated upstream visual association cortex and parts that more closely fit the definition of a downstream visual association cortex.


    Fig. 2 Each concentric ring represents a different synaptic level. Any two consecutive levels are separated by at least one unit of synaptic distance. Level 1 is occupied by the primary sensory cortex. Small empty circles represent macroscopic cortical areas or nodes, one to several centimetres in diameter. Nodes at the same synaptic level are reciprocally interconnected by the black arcs of the concentric rings. Coloured lines represent reciprocal monosynaptic connections from one synaptic level to another. (a) Visual pathways as demonstrated by experimental neuroanatomical methods in the macaque brain. (b) The inferred organization of the homologous visual pathways in the human brain. (c) Visual (green) and auditory (blue) pathways in the human brain. (d) Visual (green), auditory (blue) and transmodal (red) pathways in the human brain. In b, c and d, the anatomical details of individual pathways are inferred from experimental work in the monkey. The anatomical identity of many of the nodes is not specified because their exact anatomical location is not critical. This review is guided by the hypothesis that these types of anatomical interconnections and functionally specialized nodes exist in the human brain even though their exact location has not yet been determined. The terms 'dorsal' and 'ventral' in a and b refer to the separation of visuo-fugal pathways, especially at the fourth synaptic level, into dorsal and ventral streams of processing. The gaps in the circles at the first four levels indicate the absence of monosynaptic connections between modality-specific components of auditory and visual pathways. Abbreviations: A1 = primary auditory cortex; AIT = anterior inferotemporal cortex; f = area specialized for face encoding; L = hippocampal–entorhinal or amygdaloid components of the limbic system; LIP = lateral intraparietal cortex; MST = medial superior temporal cortex; P = heteromodal posterior parietal cortex; Pf = lateral prefrontal cortex; s = area specialized for encoding spatial location; PIT = posterior inferotemporal cortex; T = heteromodal lateral temporal cortex; TF = part of medial inferotemporal cortex; v = area specialized for identifying individual voice patterns; V1 = primary visual cortex; V2, V3, V4, V5 = additional visual areas; W = Wernicke's area; wr = area specialized for encoding word-forms; 7a(Opt) = part of dorsal parieto-occipital cortex.


    have subregions (such as Opt in the inferior parietal lobule and the caudal part of area TF) that may be engaged predominantly in the processing of visual information (Mesulam et al., 1977; Pandya and Yeterian, 1985; Andersen et al., 1990). The nodes at the fourth synaptic level in Fig. 2a represent these relatively modality-specific subregions of areas 7a and TF.

    The primary dimension of visual mapping is retinotopic and is achieved by finely tuned neurons which provide an exquisitely ordered spatial representation of the visual fields in V1. Other dimensions of visual experience, such as colour and motion, are mapped in V1 and V2 by coarsely tuned neurons, and become more fully encoded at further synaptic stages such as V4 and V5. Nodes at upstream synaptic levels tend to contain neuronal groups specialized for encoding relatively elementary attributes of visual experience, whereas nodes at more downstream levels are organized into neuronal groups specialized for encoding more composite features. The gradual increase of response latency, visual field size and response complexity in the progression from V1 to V2, V4, PIT and AIT confirms the existence of a synaptic hierarchy in the organization of visuo-fugal pathways. Although a visual event activates nodes at higher levels of this hierarchy with increasing latencies (11 ms between V1 and V2 and 9 ms between V1 and V5, but 40 ms from V1 to PIT–AIT), all areas eventually become concurrently active in the course of visual processing (Raiguel et al., 1989; Dinse and Kruger, 1994). It seems as if each node is continually passing on information to the others rather than fulfilling its part of the processing and then transmitting a completed product to the next station (Tovee, 1994).

    The specialization of V4 for colour perception and of V5 for movement perception has been documented extensively in the monkey (Van Essen, 1985). In humans, studies based on functional imaging have shown that the posterior parts of the lingual and, to a lesser extent, fusiform gyri (ventral BA 19) display selective activation in response to colour stimulation (Lueck et al., 1989; Chao et al., 1997). This area constitutes a human homologue of the colour-sensitive V4 region of the monkey since its unilateral destruction leads to a contralateral loss of colour perception (hemi-achromatopsia) without equivalent impairments of acuity, movement perception or object identification (Damasio, 1985; Mesulam, 1994a). Other functional activation studies have shown that a lateral occipitotemporal area at the confluence of BA 19 and 37 shows selective activation in response to visual motion (Watson et al., 1993). This region appears to represent the human homologue of area V5 in the monkey. Its destruction causes a state known as akinetopsia where the patient cannot perceive visual motion, although acuity and colour perception may be relatively preserved (Zeki, 1991; Zihl et al., 1991; Mesulam, 1994a). The clinical dissociation of achromatopsia from akinetopsia proves that the V1 projections to V4 and V5 are organized largely in parallel rather than in series. The presence of such parallel pathways would be expected to increase processing efficiency by allowing the simultaneous analysis of individual attributes associated with a visual event.

    The elementary sensory features encoded at the first two synaptic levels are used by more downstream areas along the ventral visuofugal pathway for the discrimination of form and complex patterns. In the monkey, for example, a posterolateral inferotemporal region (area TEO or PIT at the junction of lateral BA 19 with BA 20–21) plays a critical role in form and pattern discrimination (Yaginuma et al., 1982). A homologous area in the human brain includes parts of the fusiform gyrus (BA 19 and 37), just anterior to V4, and probably extends into the adjacent lingual and inferior occipital gyri (Halgren et al., 1997; Kanwisher et al., 1997b). This area appears to be involved in the construction of shape from simpler visual features since it becomes activated by tasks that require attention to both simple and complex shapes and does not give differential responses to upright versus inverted faces, real versus nonsense objects, or novel versus familiar stimuli (Corbetta et al., 1991; Haxby et al., 1994, 1997b; Martin et al., 1996; Clark et al., 1997; Kanwisher et al., 1997b). In comparison with the brain of the macaque, the downstream components of the ventral visuo-fugal pathway in the human appear to have been transposed ventromedially, probably in response to the expansion of the lateral temporal and posterior parietal cortices.

    If the identification of visual events and objects had to be based on a sequential compilation of the colour, form and motion data encoded at the first three synaptic levels, perception would probably take an inordinately long time and might not allow the rapid recognition of frequently encountered and behaviourally significant patterns. This potential limitation is overcome at the fourth synaptic level, where neuronal groups selectively tuned to specific visual categories promote the rapid identification of entities such as faces, objects and words. In the monkey, the anterior inferotemporal area (AIT or anterior BA 20–21) contains neuronal ensembles specialized for face and object identification (Gross et al., 1972; Desimone, 1991). In the human brain, functional imaging studies, electrophysiological evoked responses and the location of lesions in patients with the syndrome of prosopagnosia indicate that the homologous areas specialized for face and object identification are located predominantly within the mid-portion of the fusiform gyrus (BA 37 and BA 20) (Damasio, 1985; Sergent et al., 1992; Allison et al., 1994; Puce et al., 1996).

    The face area in the human brain (f in Fig. 2b) is more strongly activated by faces than by other objects (Kanwisher et al., 1997a). It is also more strongly activated by upright and intact faces than by inverted or scrambled ones, but does not show a differential response to familiar versus novel faces (Gorno Tempini et al., 1997; Haxby et al., 1997b). This area therefore appears to encode faces at a categorical or generic level, prior to the stage of individual recognition. The fourth synaptic level of the human brain contains additional regions specialized for the identification of other common objects such as chairs and houses (Ishai et al., 1997). An area specialized for the encoding of word-forms and word-like letter-strings (wr in Fig. 2b) has also been identified in this region, at a location perhaps slightly more lateral to that of the fusiform face area (Nobre et al., 1994; Puce et al., 1996). A second potential visual word-form identification area may be located in a more lateral occipito-temporal region, at the confluence of BA 19 with BA 37 (Petersen et al., 1988). Considering the extremely recent emergence of written language in human phylogeny and its relatively late acquisition in ontogeny, the organization of the word-form area is almost certainly not genetically or epigenetically programmed. A more likely possibility is that it represents an experiential modification of neuronal subgroups within populations tuned to the encoding of faces and objects. The visual word-form areas could thus mediate a sort of processing where written words are handled as objects rather than as symbols.

    The fourth synaptic level also contains components of the dorsal visuofugal pathways (s in Fig. 2b), where relatively more elementary retinotopic and visuomotor information leads to the selective identification of extrapersonal targets. Neurons in parts of area 7a of the macaque, for example, can compute the allocentric coordinates of extrapersonal events by combining retinotopic location with information about eye position (Zipser and Andersen, 1988). Some of these neurons display tuning for locations in head-centred coordinate space and show response enhancements to spatial positions containing events that will become targets of visual or manual grasp (Mesulam, 1981; Andersen et al., 1985). Functional imaging experiments based on tasks of spatial localization indicate that the analogous region in the human brain may be located in the dorsal occipito-parietal region, at the junction of BA 19 with BA 7 (Haxby et al., 1994).
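The combination of retinotopic location with eye position described above can be illustrated with a minimal sketch. It is not taken from the review: the additive rule, the function names and every parameter value are assumptions chosen for illustration, and the gain-modulated unit is a hypothetical toy rather than a model of any recorded neuron.

```python
import math

def head_centred(retinal_deg, eye_deg):
    """Assumed additive rule: head-centred location = retinal location + eye position.

    Both arguments are (horizontal, vertical) tuples in degrees.
    """
    return tuple(r + e for r, e in zip(retinal_deg, eye_deg))

def gain_field_response(retinal_deg, eye_deg,
                        rf_centre=(0.0, 0.0), rf_sigma=10.0,
                        gain_slope=(0.02, 0.0)):
    """Hypothetical unit: a Gaussian retinal receptive field whose amplitude
    is scaled by a planar function of eye position (all values illustrative)."""
    d2 = sum((r - c) ** 2 for r, c in zip(retinal_deg, rf_centre))
    visual = math.exp(-d2 / (2.0 * rf_sigma ** 2))
    gain = max(0.0, 1.0 + sum(s * e for s, e in zip(gain_slope, eye_deg)))
    return visual * gain

# Two different retinal/eye combinations that point at the same place in space.
target_a = head_centred((10.0, 0.0), (0.0, 0.0))
target_b = head_centred((0.0, 0.0), (10.0, 0.0))
```

In this toy, the same head-centred target is reached from different retinal positions, while the unit's response to a fixed retinal stimulus changes with gaze, so a population of such units would carry both pieces of information.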

    Representation of individual faces and objects

    Neuronal ensembles within downstream visual association areas provide representations of objects and faces through a process of group encoding. The tuning is broad and coarse: one neuron may be activated by several faces and the same face may excite several neurons (Rolls, 1987). The face neurons in the inferotemporal cortex are selectively tuned to category-specific canonical features such as intereye distance, style of hair, expression, and direction of gaze (Yamane et al., 1988). Neurons tuned to similar object features form vertical columns measuring ~0.4 mm in diameter. Several adjacent columns responsive to similar effective features may be linked to form larger patches or modules measuring several millimetres (Harries and Perrett, 1991; Tanaka, 1996). Groups of such patches may form interconnected but distributed ensembles collectively tuned to the set of canonical features that define an object class. The tangential inter-patch connections that are necessary for establishing such an organization could be stabilized during the period of cortical development and subsequently strengthened by experience through the mediation of temporally correlated multifocal activity (Löwel and Singer, 1992).

    Although neurons in a given column respond preferentially to similar canonical features of an object or face, optimal tuning properties vary among constituent cells so that columnar activation may encode generic properties whereas the activities of individual cells within the column may help to encode distinguishing (subordinate) features of individual exemplars (Fujita et al., 1992). In response to an object, a small subset of neurons in a given column can fire maximally and set constraints to guide and restrict the interpretation of less active neurons within the same ensemble (Geisler and Albrecht, 1995). Identification can thus start by matching the coarse (or generic) features and then focusing on finer (subordinate) detail. A visual entity may be represented by a small number of modules, each broadly selective for some reference object or face, which collectively measure the similarity of the target stimulus to the reference entities (Edelman, 1998). This type of encoding, also known as second-order isomorphism, is thought to be computationally more parsimonious than representations based on a more direct isomorphic matching of the target shape (Edelman, 1998).
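The idea of representing a stimulus by its similarity to a few reference entities can be sketched as follows. This is an illustrative toy under stated assumptions, not the scheme of the cited work: the two-dimensional feature vectors, the Gaussian similarity function, its width and the three reference points are all hypothetical.

```python
import math

def similarity(a, b, width=1.0):
    """Broadly tuned similarity: a Gaussian of the Euclidean distance
    between two feature vectors (the Gaussian form is an assumption)."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-d2 / (2.0 * width ** 2))

def encode(stimulus, references, width=1.0):
    """Describe a stimulus by one similarity value per reference module,
    rather than by its own feature vector."""
    return [similarity(stimulus, ref, width) for ref in references]

def code_distance(p, q):
    """Euclidean distance between two similarity codes."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

# Three hypothetical reference entities in an arbitrary feature space.
references = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]

code_a = encode((0.5, 0.2), references)   # a stimulus close to reference 0
code_b = encode((0.6, 0.2), references)   # a near neighbour of the first stimulus
code_c = encode((1.9, 0.1), references)   # a stimulus close to reference 1
```

Under these assumptions, stimuli that are close in feature space receive similar codes, and a stimulus is described by only as many numbers as there are reference modules.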

    The face-responsive ensembles display considerable plasticity so that some neurons alter their firing rate to a given face when it becomes more familiar or when a new one is added to the set (Rolls et al., 1989), suggesting that the identity of an individual face is not encoded by fixed rates of firing but by the relative firing frequencies (and perhaps interneuronal correlation patterns) across the entire ensemble. The processing parsimony offered by this organization is substantial (Erickson, 1982). A very large number of faces can be encoded by a small number of neurons, recognition can be graded (rather than being all-or-none) and based on partial information, the same information can be probed through multiple associations, generalizations based on a few common features (or analysis based on differences) can be achieved rapidly, the progression from categorical to subordinate identification can proceed smoothly, and damage or refractory states in a subset of neurons within the ensemble can lead to a graceful, partial degradation of function. These neurons can achieve the rapid detection of behaviourally relevant, recurrent and composite visual events, obviating the need for a cumbersome compilation of the more elementary sensory features. The general principles that guide the visual identification of faces probably also apply to the encoding of other classes of objects, words and spatial targets.
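Two of the properties listed above, encoding by relative rather than absolute firing rates and graceful degradation after partial damage, can be illustrated with a small sketch. The six-neuron ensemble, the firing rates, the face labels and the use of cosine similarity as the read-out are all assumptions made for illustration; none of them comes from the cited recordings.

```python
import math

def cosine(u, v):
    """Cosine similarity: depends on the relative pattern of rates across
    the ensemble, not on their overall magnitude."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def identify(response, stored):
    """Return the stored label whose pattern best matches the response."""
    return max(stored, key=lambda name: cosine(response, stored[name]))

# Hypothetical firing rates (spikes/s) of six coarsely tuned neurons.
stored = {
    "face_A": [9.0, 6.0, 2.0, 1.0, 5.0, 3.0],
    "face_B": [2.0, 7.0, 8.0, 3.0, 1.0, 6.0],
    "face_C": [4.0, 1.0, 3.0, 9.0, 7.0, 2.0],
}

# Same relative pattern as face_B at half the overall firing rate.
scaled = [0.5 * r for r in stored["face_B"]]

# face_A pattern with two of the six neurons silenced (a partial "lesion").
lesioned = [0.0 if i in (1, 4) else r for i, r in enumerate(stored["face_A"])]
```

In this toy, every neuron responds to every face, yet the pattern across the ensemble still identifies the face after a global change in rate or the loss of a third of the neurons, with a lower but still best match.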

    In the frog's brain, the equivalent process of object identification begins in the retina and is largely completed one synapse later in the tectum. In the primate brain, at least four synaptic levels of cerebral cortex are devoted to identifying a bug and transforming its location into a target for action. The disadvantages of the increased synaptic distance are partially offset by the insertion of parallel pathways (AIT, for example, can receive inputs either through PIT or directly from V4) and the extensive myelination of the axons. The advantages of this more flexible arrangement are extensive: we can identify the individual species of bug, comment on its function, remember the experience for future reference, let it inspire poetry,³ and, if necessary, inhibit the tendency for immediate predatory action in order to plan a strategy for approach, avoidance or attack.

    The anatomical areas that play crucial roles in the identification of colour, movement, faces, words, objects and spatial targets display relative rather than absolute specializations. For example, V4 is specialized for colour perception but also participates in spatial attention, the identification of salience and the encoding of form (Moran and Desimone, 1985; Schiller, 1995; Connor et al., 1996). In turn, the processing of colour information may involve not only V4 but also a part of the lateral peristriate cortex (Corbetta et al., 1990). Furthermore, neuronal ensembles selectively tuned to canonical features of faces participate, although to a lesser extent, in encoding other visual entities (Rolls, 1987). It is quite likely that several ensembles, each composed of neurons optimally tuned to a different category, form an interdigitating or partially overlapping mosaic and that the predominant type of ensemble varies from one location to another.⁴ This organization has been designated selectively distributed processing (Mesulam, 1994c; Seeck et al., 1995) to set it apart from other models based on equipotentially distributed processing (Lashley, 1929), parallel distributed processing (Rumelhart and McClelland, 1986) and modular processing (Fodor, 1983).

    Although the flow of visual information shown in Fig. 2b obeys a core hierarchy, it also contains many nodes of convergence and divergence embedded within multiple parallel pathways. This arrangement offers the well-known computational advantages of parallel processing. It also helps to sort the undifferentiated visual information that impinges on the eye into attributes that become available for new combinatorial permutations. This process may be likened to the prismatic diffraction of white light into primary colours which, in turn, become available for creating a large number of secondary and tertiary colours not present in the original spectrum.

    Auditory experience

    The primary dimension of auditory mapping is tonotopic. The superior temporal plane of the monkey contains three tonotopic maps, one each in areas A1, R and CM, raising the possibility that the primate cerebral cortex contains multiple auditory areas just as it contains multiple visual areas (Rauschecker et al., 1997). Two of these areas, A1 and

    ³ 'Just so much honour, when thou yield'st to me,/Will waste, as this flea's death took life from thee.' John Donne, The Flea.

    ⁴ The specialization of an anatomical region for a certain task can theoretically be assessed by calculating the sum of changes (in metabolism, relative frequency of firing or cross-correlation of activity) across the entire region of interest when responding to one versus another class of stimuli.

    R, receive the most extensive input from the primary thalamic auditory relay nucleus in the ventral medial geniculate. They can be activated directly by pure-tone responses whereas area CM depends on inputs from A1 and R for its pure-tone responses and is more responsive to complex broad-band stimuli and high-frequency sounds used for sound localization. These three areas may occupy cortical nodes at the first two synaptic levels of auditory pathways.

    The complex cytoarchitecture and connectivity of the superior temporal gyrus in the monkey brain suggest that multiple association areas with hierarchies and interconnections analogous to those described in the visual system are also likely to exist in the auditory modality. Pitch and tone discrimination are accomplished at the level of A1 and closely related upstream auditory association areas of the posterior superior temporal gyrus, whereas the identification of more complex auditory sequences, the discrimination of species-specific calls and the detection of sound motion and localization engage downstream auditory association areas in the superior temporal gyrus (Colombo et al., 1996).

    As in the case of visual pathways, the auditory pathways of the monkey brain may be divided into dorsal and ventral processing streams. Area Tpt of the posterior superior temporal gyrus, for example, may play a prominent role in the dorsal audio-fugal stream of processing, and may be specialized for detecting the spatial localization of sound sources, whereas the more anterior and ventral parts of the superior temporal gyrus may be specialized for identifying complex auditory sequences and species-specific vocalizations (Leinonen et al., 1980; Heffner and Heffner, 1984; Colombo et al., 1996). The role of the superior temporal gyrus in the auditory identification of species-specific calls appears analogous to the role of inferotemporal cortex in face identification, both processes serving crucial functions in social communication.

    A similar organization may exist in the human brain. Neurons in A1 are tuned to pure tones and pitch, whereas those of the mid to anterior parts of the human superior temporal gyrus are relatively unresponsive to pure tones and non-linguistic noises but respond to specific phonetic parameters of spoken language (Petersen et al., 1988; Demonet et al., 1992; Zatorre et al., 1994). The superior temporal gyrus neurons are broadly tuned to the segmentation and sequencing of phonemes as well as to their coherence within polysyllabic and compound words (Creutzfeldt et al., 1989). They encode speech at a presemantic level, since they respond to spoken real words as readily as to distorted backward speech (Creutzfeldt et al., 1989). They may thus be analogous to the visual word-form neurons of the fusiform and occipitotemporal areas where word-forms are processed as perceptual patterns rather than symbols. The middle temporal gyrus in the human brain (BA 21) appears to contain a further downstream auditory association area, since approximately half of its neurons give highly selective responses, mostly in the form of suppression, to understandable speech but not to distorted speech (Creutzfeldt et al., 1989).

    These observations suggest that the flow of auditory information in the human brain may follow a template similar to the one in the visual modality (Fig. 2c). Although the evidence is not as easily interpreted as in the case of visual pathways, it appears that upstream auditory areas tend to encode more elementary features such as frequency and pitch, whereas downstream areas may contain neuronal groups that encode more composite features related to the identification of words (wr in Fig. 2c), the localization of sound sources (s in Fig. 2c), the categorization of object-specific sounds, and perhaps also the characterization of individual voice patterns (area v in Fig. 2c).

    Sensory fidelity and memory in unimodal areas

    One of the most important principles in the organization of the primate cerebral cortex is the absence of interconnections linking unimodal areas that serve different sensory modalities. In the monkey, for example, the auditory association cortex in BA 22 has no direct connections with the visual association cortex in BA 19, BA 20 or BA 21 (Pandya and Yeterian, 1985). Neurons in visually responsive areas such as BA 19–21 do not respond to auditory stimuli, and neurons in auditory cortices such as BA 22 do not respond to visual stimuli. This is particularly interesting since many of these unimodal association areas receive monosynaptic feedback projections from heteromodal cortices which are responsive to both auditory and visual stimuli. The sensory-petal (or feedback) projections from heteromodal cortices therefore appear to display a highly selective arrangement that actively protects the fidelity of sensory tuning during the first four synaptic levels of sensory-fugal processing.

    Virtually every neuron and synapse in the CNS can potentially alter its excitability or efficacy in a manner that allows the storage of information for periods of time ranging from a few milliseconds to a lifetime. The molecular and cellular bases of such learning include a broad spectrum of phenomena such as reversible alterations in the conductance of ionic channels, long-term potentiation (or depression), changes of synaptic efficacy, and the expression of genes which alter the structure and number of synapses (Bailey et al., 1996; Bear, 1996). Durable learning effects, however, would be least desirable among neurons of the first synaptic level, where the accurate registration of new inputs necessitates a rapid return to a narrowly tuned baseline, whereas they would be highly useful at more downstream levels, where synaptic plasticity, induced by life experiences, could play a critical role in the adaptive modification of response patterns.⁵

    ⁵ Breuer expressed this succinctly when he said that 'the mirror of a telescope cannot at the same time be a photographic plate' (Breuer and Freud, 1895). Although experience-induced changes have been reported in primary sensory areas in the adult brain of phylogenetically more advanced species (Creutzfeldt and Heggelund, 1975; Wiesel, 1981; Cruikshank and Weinberger, 1996), they tend to be less common than in association or limbic areas, and require more drastic circumstances such as prolonged exposure to anomalous visual input, nerve section and limb amputation.

    In keeping with these expectations, prominent learning effects have been identified in downstream unimodal association areas of the cerebral cortex. In the monkey, for example, when pattern A is paired with pattern B in a paired-associates task acquired prior to the recording period, an AIT neuron responsive to B but not to A will increase its firing in anticipation of B towards the end of the delay period following stimulation by A, showing that it has learned the arbitrary association between the two stimuli (Naya et al., 1996). Neurons in AIT also display a familiarity response to faces encountered as long as 24 h ago, indicating that the initial exposure had been stored in long-term memory (Fahy et al., 1993). These observations have led to the suggestion that the downstream visual association cortices in the temporal lobe act as a memory storehouse for object vision (Mishkin, 1982; Naya et al., 1996). Analogous long-term memory encoding probably exists in the downstream components of unimodal auditory association cortices, although these phenomena have not been investigated as extensively as those in the visual modality. Modifications of synaptic efficacy, such as those necessary for long-term potentiation and depression, have been obtained in the human middle and inferior temporal gyri and could mediate similar long-term encoding (Chen et al., 1996).

    The foregoing comments indicate that the neural nodes in Fig. 2c can both identify and record visual and auditory events. Furthermore, evanescent cross-modal coherence of the visual and auditory features encoded by the nodes in Fig. 2c could arise through the temporal synchrony of the two sensory channels during the actual unfolding of an event. Modality-specific cortices may thus initially appear to provide all the necessary ingredients for the stable registration of experience. A brain that contained only those components shown in Fig. 2c, however, would face serious challenges if an associative synthesis or retrospective reconstruction of the relevant experience became necessary. It would be impossible, for example, to encode the relationships between the visual and auditory components of the experience, since the two sets of unimodal cortices have no interconnections. Experience for such a brain would therefore tend to be incoherent across multiple channels of sensory processing. The unimodal association areas in Fig. 2c would be sufficient for deciding if the sensory features of two words or two faces are identical or not, and even for distinguishing one object category from another, but could not lead from word to meaning, from physiognomy to recognition, or from sensory events to coherent experiences. Such transformations of sensation into cognition necessitate the participation of a different class of cortical areas that can be classified as transmodal.

    From perception to recognition: transmodal gateways

    Transmodal areas and binding

    Transmodal areas include all heteromodal, paralimbic and limbic areas and occupy the fifth and sixth synaptic levels in Fig. 2d. Their only common feature is the absence of specificity for any single modality of sensory input. They receive afferents predominantly from the more downstream parts of unimodal areas and from other transmodal regions (Fig. 1). These connections are reciprocal and enable transmodal areas to provide a site for multimodal convergence and also to exert a top-down, sensory-petal or feedback influence upon unimodal areas.

    Everyday experiences unfold in multiple modalities. The establishment of a durable record of experience, and its associative incorporation into the existing base of knowledge, necessitate multimodal integration. The desirability of such integration (or binding) had been articulated and its presence postulated on multiple occasions. The most widely acknowledged version was introduced more than three centuries ago by Descartes, who proposed a convergence of sensory information within the pineal gland, where the immaterial mind could observe the representation of experience provided by the material brain. Although Cartesian dualism has attracted severe and often justified criticism (Dennett and Kinsbourne, 1992), the convergence that it postulated continues to be as compelling now as it was then. Its organization, however, appears to entail more than a spatial confluence within a single theatre where, in Dennett and Kinsbourne's words, 'it all comes together' for the benefit of its immaterial spectator. As will be shown in the following discussion, the brain favours a distributed type of Cartesian convergence where there are multiple theatres and where the actor and spectator are one and the same.

    The neuroanatomy of the 1960s and 1970s identified several areas for multimodal convergence, raising the possibility that these might be the sites where a multidimensional synthesis of knowledge, memory and experience could be taking place. The logistic and epistemological arguments against such a purely convergent organization of knowledge have been enumerated on multiple occasions (Goldman-Rakic, 1988; Mesulam, 1990; McClelland, 1994). Two of these objections are most relevant to this review: (i) if knowledge of X was encoded in convergent form by a small number of neurons within a transmodal area, the brain would have to resolve the cumbersome problem of conveying X-related information in all relevant modalities to the one highly specific address where this convergent synthesis is located; (ii) the modality-specific attributes of X would succumb to cross-modal contamination during the process of convergence and the sensorial fidelity of the experience would be lost. This second circumstance can be likened to the mixing of yellow and blue to obtain green, a process which precludes the subsequent extraction of the original hues from the resultant mixture.

    Both objections can be addressed by assuming that the role of transmodal nodes is not only to support convergent multimodal synthesis but also, predominantly, to create directories (or address codes, maps, look-up tables) for binding distributed modality-specific fragments into coherent experiences, memories and thoughts. This alternative process can be likened to obtaining green by superimposing a blue and a yellow lens, which can then be separated from each other to yield back the original uncontaminated colours. Transmodal areas allow multidimensional integration through two interactive processes: (i) the establishment, by local neuronal groups, of convergent cross-modal associations related to a target event; and (ii) the formation of a directory pointing to the distributed sources of the related information. Transmodal areas can thus enable the binding of modality-specific information into multimodal representations that have distributed as well as convergent components.
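    The contrast between purely convergent synthesis and directory-based binding can be made concrete with a small Python sketch. It is offered only as an illustration of the look-up-table analogy, not as a model of cortical circuitry: the names `convergent_blend`, `UnimodalArea` and `TransmodalNode`, the feature vectors and the "telephone" event are all hypothetical and do not come from the anatomical literature.

```python
from dataclasses import dataclass, field


def convergent_blend(a, b):
    """Purely convergent synthesis: fuse two feature vectors into one sum.

    Different inputs can yield the same blend, so the originals cannot be
    recovered (the 'mixing yellow and blue' case)."""
    return [x + y for x, y in zip(a, b)]


@dataclass
class UnimodalArea:
    """Hypothetical modality-specific store that keeps its fragments intact."""
    name: str
    fragments: dict = field(default_factory=dict)


@dataclass
class TransmodalNode:
    """Hypothetical gateway that holds addresses rather than the content."""
    areas: dict                                     # modality name -> UnimodalArea
    directory: dict = field(default_factory=dict)   # event -> [(modality, key)]

    def bind(self, event, fragments):
        """Leave each fragment in its own area and record where it lives."""
        addresses = []
        for modality, content in fragments.items():
            key = f"{event}/{modality}"
            self.areas[modality].fragments[key] = content
            addresses.append((modality, key))
        self.directory[event] = addresses

    def recall(self, event):
        """Follow the directory back to every distributed fragment."""
        return {m: self.areas[m].fragments[k] for m, k in self.directory[event]}


areas = {"visual": UnimodalArea("visual"), "auditory": UnimodalArea("auditory")}
node = TransmodalNode(areas)
node.bind("telephone", {"visual": "handset shape", "auditory": "ringing tone"})
```

    In this sketch the blend loses the identity of its inputs, whereas the directory returns each fragment unchanged from its own store (the 'superimposed lenses' case). Deleting the directory entry would leave the fragments in place but unreachable as a bound whole, which is one way to picture why a transmodal node can act as both a gateway and a bottleneck.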

    Transmodal areas are not necessarily centres where convergent knowledge resides, but critical gateways (or hubs, sluices, nexuses) for accessing the relevant distributed information (Mesulam, 1994c). Paradoxically, they also provide neural bottlenecks in the sense that they constitute regions of maximum vulnerability for lesion-induced deficits in the pertinent cognitive domain. Transmodal areas in different parts of the brain share similar principles of organization, each in relation to a specific cognitive domain. Examples that will be examined in this review include the pivotal role of midtemporal and temporopolar cortices in face and object recognition, Wernicke's area in lexical retrieval, the hippocampal-entorhinal cortex in explicit memory, the prefrontal cortex in working memory, the amygdala in emotion, and the dorsal parietal cortex in spatial awareness.

    Face recognition and associative agnosias

    Downstream visual association cortices are essential for the categorical encoding of faces and objects. By itself, this information would provide an isolated percept devoid of meaning or context. The ability of this modality-specific perceptual information to activate the relevant multimodal associations that lead to recognition requires the mediation of transmodal cortical areas. In the monkey, for example, unimodal AIT neurons are sensitive to the sensory properties of face stimuli whereas more downstream transmodal neurons of the superior temporal sulcus are also tuned to more complex aspects of faces such as their expression and familiarity (Young and Yamane, 1992). In humans, the perceptual identification of unfamiliar faces activates unimodal visual association areas in the fusiform region, whereas the recognition of familiar faces also activates transmodal nodes, including those in the lateral midtemporal cortex (Gorno Tempini et al., 1997). Other transmodal areas, such as those in the temporopolar cortex, also appear to play an important role in face recognition, since their involvement in neurological lesions frequently impairs the ability to recognize famous faces (Tranel et al., 1997). Transmodal areas in the midtemporal and temporopolar cortex (represented by area T in Fig. 2d) may therefore act as gateways for binding the additional associations (such as the name, voice, expression, posture, and private recollections) that collectively lead to the recognition of familiar faces.

    Some neurological lesions lead to a specific face recognition deficit known as associative prosopagnosia.⁶ This syndrome is most commonly caused by bilateral lesions in the mid to anterior parts of the lingual and fusiform gyri. Bilateral lesions in this part of the brain are not infrequent since the posterior cerebral arteries which supply the relevant regions arise from a common basilar trunk. In the scheme of Fig. 2d, associative prosopagnosia can potentially arise from one of three different types of interruption in the connections between area f and transmodal node T: (i) the visual input into region f of the fusiform face area may be interrupted by a more upstream lesion; (ii) area f may be damaged directly; (iii) the output from area f to the inferotemporal heteromodal cortex (T in Fig. 2d) may be interrupted by a more downstream lesion. In all three instances perception (as tested by the ability to tell if two faces have an identical shape) can be relatively intact, presumably because more upstream areas of the occipito-temporal cortex remain intact. Although a prosopagnosic patient cannot recognize familiar faces by visual inspection, recognition becomes possible when information in a non-visual modality, for example the voice pattern characteristic of that person, becomes available. This auditory information can presumably access transmodal area T through area v of the unimodal auditory cortex in Fig. 2d, leading to the activation of the other distributed associations that lead to recognition. Furthermore, a face that is not consciously recognized can occasionally still elicit a physiological emotional response (Bauer, 1984; Tranel and Damasio, 1985), presumably because the damage is located downstream to f and interrupts its connections to area T but not to limbic areas such as those represented by area L in Fig. 2d.

    The most conspicuous manifestation of prosopagnosia is the impaired recognition of familiar faces, including the patient's own face. However, patients with prosopagnosia can also display impaired subordinate-level recognition of additional object categories (Damasio, 1985). Such patients may have no difficulty in the generic recognition and naming of object classes (for example, they may recognize and name a car as a car or a face as a face) but may not be able to determine the make of a particular car, recognize a favourite pet, or identify a personal object from among other examples of the same category. This additional feature of prosopagnosia raises the possibility that area f may also participate in the identification of objects other than faces or, alternatively, that most lesion sites may involve immediately adjacent regions that encode additional object categories.

    If prosopagnosia represents an impairment of subordinate-level recognition, associative visual object agnosia represents an impairment that extends to the level of categorical recognition. The patient with this syndrome can neither name a familiar object nor describe its nature. While a prosopagnosic patient can tell that a face is a face and a pencil is a pencil, a patient with object agnosia is unable to perform this task but retains the ability to determine if two objects are perceptually identical or not. In analogy to prosopagnosia, object agnosia appears to represent a disconnection between visual association areas involved in the categorical encoding of visual entities and transmodal nodes analogous to T in Fig. 2d.

    ⁶ This syndrome should be differentiated from apperceptive prosopagnosia, where the face recognition deficit results from faulty visuospatial perceptual synthesis.

    The question may be asked why all visual agnosias are not of this type. If the fusiform and inferior occipital areas specialized for face and object identification are organized as described above, how can a prosopagnosic patient with damage to these areas continue to name and identify objects and faces at a categorical level? Conceivably, the encoding of proprietary features that lead to subordinate classification may require more information than the encoding of generic features. The identification of unique exemplars would thus necessitate the participation of a larger group of neurons or further downstream processing. Object agnosia may therefore represent the outcome of a more extensive or more upstream lesion than that associated with prosopagnosia. Although clinical reports show that the lesions in object agnosia appear very similar to those in prosopagnosia, minor differences in lesion size or location could easily escape detection in case studies based on cerebrovascular accidents.

    A second potential distinction between prosopagnosia and object agnosia may be based on the memory systems that support the recognition of generic versus proprietary information. The generic recognition of familiar objects is part of semantic knowledge, whereas the recognition of familiar faces and objects is more closely related to autobiographical experience. Although prosopagnosia and object agnosia are usually seen after bilateral lesions, these syndromes do occasionally arise after unilateral lesions, in which case prosopagnosia tends to result from lesions in the right hemisphere and object agnosia from lesions in the left (Farah and Feinberg, 1997). This dissociation is interesting since the right hemisphere appears to have a greater role in the activation of autobiographical memories (Fink et al., 1996).

    Associative agnosias have also been identified in the auditory modality. Patients with a condition known as auditory object agnosia fail to associate the ringing of a telephone or the siren of an ambulance with the corresponding entity, even though more elementary auditory perceptual abilities remain relatively preserved. This syndrome may reflect a disconnection of unimodal auditory areas specialized for encoding auditory properties of familiar objects from transmodal nodes (such as T) that coordinate their multimodal recognition. The lesions that give rise to auditory agnosia typically involve the auditory association cortex, but the more detailed anatomical correlates of this relatively rare syndrome remain to be elucidated (Spreen et al., 1965).

    Associative agnosias arise when unimodal areas specialized for categorical perceptual encoding fail to access transmodal gateways that lead to explicit multimodal recognition and conceptual knowledge. The transmodal areas involved in this process are not centres for convergent conceptual knowledge, but optimal conduits for accessing the relevant distributed associations. These syndromes highlight the importance of sensory-fugal pathways to the process of recognition and offer a neuroanatomical basis for the distinction between perception and recognition. Other cognitive domains, such as explicit memory, language, spatial awareness and emotion, display analogous principles of organization but revolve around transmodal gateways located in different parts of the brain.

    Arbitrary associations of sensory experience: explicit memory and language

    Explicit/declarative/episodic memory

    The encoding of space, colour, movement and form displays considerable species-specific invariance and is relatively unaffected by peculiarities of individual experience. Much of mental content, however, is based on idiosyncratic associations which endow percepts and events with personal significance. The recording and explicit recall of these arbitrary relationships require the establishment of long-term memories. The long-term storage of individual experience becomes increasingly more important in nervous systems where the linkage of sensation to cognition is complex, and where experience can alter the contingencies signalled by a sensory event. It would seem superfluous to evolve a personal system of explicit memory if future contingencies are identical to those of the past, and if the significance of sensory events does not vary from one time to another, or from one individual to another. In the frog, where a certain retinal pattern invariably signals a bug to be snapped at, regardless of context or experience, the encoding of individual encounters would appear relatively unimportant. In fact, even the most rudimentary form of conditioning has been quite difficult to establish in frogs (Russek, 1969).

    The situation is drastically different in the primate brain. Each component of the unimodal and transmodal cortex appears to participate in learning arbitrary associations in its own area of specialization. The inferotemporal cortex encodes new information in memory tasks related to faces and complex visual patterns (Sobotka and Ringo, 1993; Naya et al., 1996; Owen et al., 1996; Squire and Zola, 1996), the midtemporal cortex in memory tasks related to words (Ojemann et al., 1988), the dorsal parietal cortex in memory tasks related to spatial relationships (Roland and Friberg, 1985; Owen et al., 1996), and the posterior parietal and prefrontal cortices in memory tasks involving multimodal associations (Kapur et al., 1996; Haxby et al., 1997a).

    In accessing established knowledge, such as the name of a colour, the meaning of a word or the identity of a familiar face, recall is based on rich and stable associations that have been consolidated for many years. In order to encode and access new facts and experiences, however, fragile and initially sparse linkages have to be established, nurtured and inserted into the matrix of existing knowledge. This kind of learning, also known as explicit, declarative or episodic memory,⁷ is critically dependent on a special type of binding subserved by transmodal nodes within the limbic system,⁸ especially within its hippocampal and entorhinal components.

    The first case report describing the onset of severe amnesia in a patient with bilateral hippocampal damage was published in 1900 by Bechterew. Since then, a very large number of clinical observations have shown that limbic lesions, especially those that involve the hippocampal-entorhinal complex, completely abolish the conscious recall of new and recent events while allowing relatively more effective access to remote memories and semantic knowledge (Scoville and Milner, 1957; Signoret, 1985; Mesulam, 1988; Nadel and Moscovitch, 1997). Lesions outside the limbic system do not lead to similar deficits.

    The amnestic state caused by limbic lesions is characterized by dissociation between the explicit/declarative/episodic recording of new experience, which is severely impaired, and the implicit learning of motor tasks and perceptual associations, which is relatively preserved. An amnestic patient, for example, may develop new motor skills, improve performance in priming and stem completion tasks, and learn to avoid situations that have recently been associated with aversive consequences, even when he or she has no conscious memories of the relevant experiences (Claparede, 1911; Milner et al., 1968; Schacter, 1995). In addition to the impairment of new learning (anterograde amnesia), these patients also display a retrograde amnesia for events that occurred before the onset of the limbic lesion, and a gradual shrinkage of the time period encompassed by this retrograde amnesia in the course of recovery (Benson and Geschwind, 1967). The shrinkage of the memory impairment during recovery suggests that some of the memories lost to retrograde amnesia had not been obliterated, but had become impossible to retrieve.

    These observations, especially those related to the shrinkage of the retrograde amnesia, suggest that the limbic system is unlikely to be a central storage site for memories. According to a model of explicit memory which is gradually attracting considerable support, facts and events are initially recorded at multiple sites with an anatomical distribution that reflects the modality- and category-specific aspects of the relevant information. This information is relayed, through reciprocal multisynaptic pathways, to transmodal nodes within the limbic system. These transmodal nodes appear to play their critical roles by establishing a directory that guides the binding of the modality- and category-specific fragments of individual events into coherent multimodal experiences (Mishkin, 1982; Damasio, 1989; Mesulam, 1990; McClelland, 1995; Squire and Zola, 1996; Nadel and Moscovitch, 1997). In addition to its binding function, the hippocampal-entorhinal complex also appears to promote the stable encoding of new associations in other parts of the neocortex, including unimodal sensory areas (Halgren et al., 1985; Eichenbaum et al., 1996; Higuchi and Miyashita, 1996).

    ⁷ Explicit memory refers to voluntary and conscious recall that can be reported overtly; episodic memory refers to the explicit recall of personal experiences, including their temporal and spatial contexts and the feeling of having been there; semantic memory refers to the explicit recall of general and invariable facts related to the world around us; declarative memory is a collective term referring to episodic and semantic memories and is usually used as a synonym for explicit memory; implicit memory refers to circumstances where exposure to a task or stimulus influences future performance even when the subject has no conscious awareness of the experience related to the prior exposure.

    ⁸ The interconnected components of the limbic system include the cortical areas in the paralimbic and limbic zones, the limbic nuclei of the thalamus (such as the midline, anterior and magnocellular dorsomedial nuclei), and the hypothalamus (Mesulam, 1985b).

    When a critical volume of the limbic system is destroyed, new associations become more fragile and the process of binding is jeopardized. Consequently, fragments of new and recent events cannot be integrated interactively into the overall fabric of consciousness with the type of coherence that is necessary for declarative recall. However, some of the information related to new events continues to be encoded in the neocortical association cortex in a manner that supports implicit learning. The unbound, fragmentary form of this information helps in understanding why implicit learning tasks such as priming are so sensitive to the surface (rather than associative) properties of the stimuli and why they are resistant to transmodal generalization (Schacter, 1995). There is probably no fundamental difference in the type of encoding that is involved in implicit versus explicit memory. In implicit memory, the information remains in the form of isolated fragments, mostly within unimodal and heteromodal association areas; in explicit memory it becomes incorporated into a coherent context through the binding function of limbic nodes. In keeping with this formulation, tasks of explicit memory lead to the activation of medial temporolimbic as well as neocortical areas, whereas tasks of implicit memory lead to the activation predominantly of neocortical areas (Squire et al., 1992; Haxby et al., 1997a; Seeck et al., 1997).

    One of the most important components of the amnestic state is retrograde memory loss. Retrograde amnesia is usually much more severe for events that occurred just before the onset of the limbic lesion than for those of the distant past,⁹ and more severe for autobiographical-episodic experiences than for semantic knowledge (Nadel and Moscovitch, 1997). In general, the severity of the anterograde amnesia in patients with hippocampal-entorhinal lesions tends to be correlated with the severity of the retrograde component, suggesting that the limbic components are as crucial for encoding as they are for retrieval (Nadel and Moscovitch, 1997). The hippocampal subregions that participate in retrieval may be different from those that participate in encoding (Gabrieli et al., 1997), providing a potential explanation for the emergence of major dissociations between retrograde and anterograde amnesia in some patients (O'Connor et al., 1992).

    ⁹ This is known as Ribot's gradient.

    The emergence of retrograde amnesia and the temporal gradient that it displays in some patients suggest that the hippocampal-entorhinal complex and association neocortex are involved in a continuing process of reconstruction, updating and associative elaboration, which collectively lead to the consolidation of new memories. At the initial stage of encoding, a new fact or event has few associations and depends on the limbic system for maintenance and coherent retrieval. In time, as additional linkages become established through reciprocal connections with other transmodal and unimodal areas, the relevant information can be probed through numerous associative approaches and becomes less dependent on the limbic system. The participation of the hippocampal-entorhinal complex in retrieval is likely to be most critical for the most recently acquired memories, for those that have limited resonance with other mental contents, for those that have been registered casually rather than intentionally, for those with relatively weak emotional valence, for those that require extensive cross-modal integration, for those that have been recalled rarely and have therefore failed to establish associative elaboration, and for those that require the reactivation of idiosyncratic contextual anchors related to temporal and spatial circumstances. The existence of such multiple factors helps to explain why the vulnerability of a memory to retrograde amnesia is not always a simple function of its time of acquisition and why clear temporal gradients are not universally found in amnestic patients.

    Memory consolidation appears to involve a gradual increase in the density of the matrix that binds the components of the memory to each other and to other aspects of mental content. The outcome is to increase the number of associative approaches through which the memory can be probed. The hippocampal-entorhinal complex may well participate in the retrieval of all autobiographical and episodic memories, recent and remote, and even in the retrieval of the semantic knowledge related to arbitrary facts about the world, but it may no longer be critical for the recall of facts and events that have established a rich matrix of associations. In keeping with this formulation, functional imaging experiments show that the magnitude of hippocampal-entorhinal activation during memory retrieval is inversely proportional to the strength of encoding (Petersson et al., 1997). The transmodal nodes that play a critical role in the retrieval of consolidated knowledge remain outside the limbic system. Examples include the relationship of transmodal midtemporal and temporopolar cortices to the recognition of familiar faces and, as will be shown below, that of Wernicke's area to the recognition of words. Thus even the most massive hippocampal-entorhinal lesions sustained during adulthood spare areas of consolidated knowledge such as the recognition of familiar faces and lexical retrieval.

    Why is new learning dependent on the limbic system? A tentative answer may be based on the constraints that the CNS faces: the number of neurons is fixed with little hope of obtaining new ones, every existing neuron is already occupied by previously stored information, new information needs to be written on top of existing items, and the amount of new information is boundless. The CNS may therefore need to be protected from learning too rapidly and indiscriminately, since this could jeopardize the stability of existing knowledge (McClelland, 1995). An initial filtering is provided by attentional systems which select subsets of behaviourally relevant events for further consideration. The limbic system appears to erect a second line of defence. It provides a mechanism that allows the rapid learning of behaviourally relevant relationships, but in an initially transient (limbic-dependent) form that may induce a relatively small amount of permanent change in the association cortex. This transitional period may allow new memories to enter associative readjustments before being assimilated in a more permanent form and also to compete with each other, allowing only the fittest to solidify their hold on precious synaptic space. Through these processes, the limbic system simultaneously satisfies the need to limit the indiscriminate influx of new learning and the need to adapt to a rapidly changing environment (McClelland, 1994).

    The question may also be asked why a function as vital as explicit memory should display a critical dependency on a phylogenetically primitive part of the brain such as the limbic system. One explanation is that explici