The PDP Approach to Understanding the Mind and Brain J. McClelland Cognitive Core Class Lecture March 7, 2011



Descartes’ Legacy

• Mechanistic approach to sensation and action

• Divine inspiration creates mind

• This leads to four dissociations:
– Mind / Brain
– Higher Cognitive Functions / Sensory-motor systems
– Human / Animal
– Descriptive / Mechanistic

Early Computational Models of Human Cognition (1950-1980)

• The computer contributes to the overthrow of behaviorism.

• Computer simulation models emphasize strictly sequential operations, using flow charts.

• Simon announces that computers can ‘think’.

• Symbol processing languages are introduced allowing some success at theorem proving, problem solving, etc.

• Minsky and Papert kill off Perceptrons.

• Cognitive psychologists distinguish between algorithm and hardware.

• Neisser deems physiology to be only of ‘peripheral interest’.

• Psychologists investigate mental processes as sequences of discrete stages.

Ubiquity of the Constraint Satisfaction Problem

• In sentence processing
– I saw the Grand Canyon flying to New York
– I saw the sheep grazing in the field

• In comprehension
– Margie was sitting on the front steps when she heard the familiar jingle of the “Good Humor” truck. She remembered her birthday money and ran into the house.

• In reaching, grasping, typing…

Graded and variable nature of neuronal responses

Lateral Inhibition in Eye of Limulus (Horseshoe Crab)

The Interactive Activation Model

Distributed Representations in the Brain: Overlapping Patterns for Related Concepts (Kiani et al, 2007)

[Figure: example patterns labeled dog, goat, hammer]

• Many hundreds of single neurons recorded in monkey IT.

• 1000 different photographs were presented twice each to each neuron.

• Hierarchical clustering based on the distributed representation of each picture:
– The pattern of activation over all the neurons

Kiani et al, J Neurophysiol 97: 4296–4309, 2007.
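As a rough illustration of clustering on distributed representations, the sketch below runs a simple average-linkage agglomerative clustering over activation patterns. The three patterns, the four "neurons", the Euclidean distance, and the average-linkage rule are all assumptions made for illustration; they are not the data or the exact method of the study.

```python
import math

# Hypothetical activation patterns (one value per neuron); not data from the study.
patterns = {
    "dog":    [0.9, 0.8, 0.1, 0.2],
    "goat":   [0.8, 0.9, 0.2, 0.1],
    "hammer": [0.1, 0.2, 0.9, 0.8],
}

def distance(a, b):
    """Euclidean distance between two activation patterns."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cluster_distance(c1, c2):
    """Average-linkage distance: mean pairwise distance between members."""
    pairs = [(i, j) for i in c1 for j in c2]
    return sum(distance(patterns[i], patterns[j]) for i, j in pairs) / len(pairs)

def agglomerate(names):
    """Repeatedly merge the two closest clusters; return the merge order."""
    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        a, b = min(
            ((x, y) for i, x in enumerate(clusters) for y in clusters[i + 1:]),
            key=lambda pair: cluster_distance(*pair),
        )
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b))
    return merges

merges = agglomerate(list(patterns))
for left, right in merges:
    print(left, "+", right)
```

With these made-up patterns, the two animals merge first because their patterns overlap most, and the hammer joins only at the last step.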

The Rumelhart Model

The Quillian Model

DER’s Goals for the Model

1. Show how learning could capture the emergence of hierarchical structure

2. Show how the model could make inferences as in the Quillian model

[Figure: representations over the course of experience. Panels: Early, Later, Later Still]

Start with a neutral representation on the representation units. Use backprop to adjust the representation to minimize the error.

The result is a representation similar to that of the average bird…

Use the representation to infer what this new thing can do.
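The procedure above can be sketched in a few lines. This is a minimal illustration, not the Rumelhart network itself: it assumes a single layer of frozen, already-learned weights from two representation units to three attribute units, so backpropagating the error reduces to one gradient computation per step. The weights, attribute names, learning rate, and step count are all hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical frozen weights: attribute -> (weights from 2 representation units, bias).
WEIGHTS = {
    "has_feathers": ([4.0, -4.0], -1.0),
    "can_fly":      ([3.0, -3.0], -0.5),
    "has_scales":   ([-4.0, 4.0], -1.0),
}

def predict(rep):
    """Attribute activations produced by a given representation."""
    return {attr: sigmoid(sum(w * r for w, r in zip(ws, rep)) + b)
            for attr, (ws, b) in WEIGHTS.items()}

def infer_representation(observed, steps=200, lr=0.5):
    """Adjust only the representation (weights stay fixed) to reduce the
    squared error on the observed attributes."""
    rep = [0.0, 0.0]  # neutral starting representation
    for _ in range(steps):
        out = predict(rep)
        grad = [0.0, 0.0]
        for attr, target in observed.items():
            y = out[attr]
            delta = (y - target) * y * (1.0 - y)  # error signal at the output unit
            ws, _bias = WEIGHTS[attr]
            for j, w in enumerate(ws):
                grad[j] += delta * w  # error passed back to representation unit j
        rep = [r - lr * g for r, g in zip(rep, grad)]
    return rep

before = predict([0.0, 0.0])
rep = infer_representation({"has_feathers": 1.0})  # told only that it has feathers
after = predict(rep)
print(round(before["can_fly"], 3), "->", round(after["can_fly"], 3))
```

Only "has feathers" is observed, yet the adjusted representation also raises the activation of "can fly" and lowers "has scales", which is the sense in which the representation is used to infer what the new thing can do.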

Questions About the Rumelhart Model

• Does the model offer any advantages over other approaches?
– Do distributed representations really buy us anything?
– Can the mechanisms of learning and representation in the model tell us anything about
   • Development?
   • Effects of neuro-degeneration?

Phenomena in Development

• Progressive differentiation
• Overgeneralization of
– Typical properties
– Frequent names
• Emergent domain-specificity of representation
• Basic level advantage
• Expertise and frequency effects
• Conceptual reorganization

Disintegration in Semantic Dementia

• Loss of differentiation
• Overgeneralization

The Hierarchical Naïve Bayes Classifier Model (with R. Grosse and J. Glick)

• The world consists of things that belong to categories.

• Each category in turn may consist of things in several sub-categories.

• The features of members of each category are treated as independent:
– $P(\{f_i\} \mid C_j) = \prod_i p(f_i \mid C_j)$

• Knowledge of the features is acquired for the most inclusive category first.

• Successive layers of sub-categories emerge as evidence accumulates supporting the presence of co-occurrences violating the independence assumption.

[Figure: category hierarchy. Living Things divide into Animals (Birds, Fish) and Plants (Flowers, Trees).]

Property     | One-Class Model | 1st class in two-class model | 2nd class in two-class model
Can Grow     | 1.0             | 1.0                          | 0
Is Living    | 1.0             | 1.0                          | 0
Has Roots    | 0.5             | 1.0                          | 0
Has Leaves   | 0.4375          | 0.875                        | 0
Has Branches | 0.25            | 0.5                          | 0
Has Bark     | 0.25            | 0.5                          | 0
Has Petals   | 0.25            | 0.5                          | 0
Has Gills    | 0.25            | 0                            | 0.5
Has Scales   | 0.25            | 0                            | 0.5
Can Swim     | 0.25            | 0                            | 0.5
Can Fly      | 0.25            | 0                            | 0.5
Has Feathers | 0.25            | 0                            | 0.5
Has Legs     | 0.25            | 0                            | 0.5
Has Skin     | 0.5             | 0                            | 1.0
Can See      | 0.5             | 0                            | 1.0

A One-Class and a Two-Class Naïve Bayes Classifier Model
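To see why a second layer of classes can emerge, it may help to compare how well each model explains a single item. The sketch below takes five rows from the table above and computes the likelihood of a tree-like item under the one-class model and under an equal-weight mixture of the two classes. Treating each property as a present/absent (Bernoulli) feature, the 0.5/0.5 mixing weights, and the example item are assumptions made for illustration.

```python
# Feature probabilities copied from five rows of the table above.
ONE_CLASS = {"has_roots": 0.5, "has_leaves": 0.4375, "has_gills": 0.25,
             "can_swim": 0.25, "has_skin": 0.5}
CLASS_1 = {"has_roots": 1.0, "has_leaves": 0.875, "has_gills": 0.0,
           "can_swim": 0.0, "has_skin": 0.0}
CLASS_2 = {"has_roots": 0.0, "has_leaves": 0.0, "has_gills": 0.5,
           "can_swim": 0.5, "has_skin": 1.0}

def likelihood(item, probs):
    """Naive Bayes likelihood: product over independent features.
    Assumes a feature that is absent contributes (1 - p)."""
    result = 1.0
    for feature, p in probs.items():
        result *= p if item[feature] else (1.0 - p)
    return result

def mixture_likelihood(item, classes, priors):
    """Likelihood under a mixture of classes with the given prior weights."""
    return sum(prior * likelihood(item, probs)
               for prior, probs in zip(priors, classes))

# Hypothetical item: has roots and leaves, lacks gills, cannot swim, has no skin.
tree_like = {"has_roots": 1, "has_leaves": 1, "has_gills": 0,
             "can_swim": 0, "has_skin": 0}

one = likelihood(tree_like, ONE_CLASS)
two = mixture_likelihood(tree_like, [CLASS_1, CLASS_2], [0.5, 0.5])  # assumed equal priors
print(one, two)
```

The one-class model treats roots and leaves as unrelated, so it assigns this item a much lower likelihood than the two-class model, which captures the fact that these properties co-occur.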

Accounting for the network’s feature attributions with mixtures of classes at different levels of granularity

[Figure: Regression Beta Weight (y-axis) plotted against Epochs of Training (x-axis)]

Property attribution model:
$P(f_i \mid \text{item}) = \alpha_k\, p(f_i \mid c_k) + (1-\alpha_k)\,[\alpha_j\, p(f_i \mid c_j) + (1-\alpha_j)[\ldots]]$
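The nested mixture above can be read as follows: each class level contributes its own feature probability with some weight and hands the remaining weight to the next, more general level. Here is a small sketch under that reading. The function name, the argument layout, the choice to end the recursion at a most inclusive class, and the example weights are all hypothetical; the two probabilities for "Has Leaves" (0.875 for the specific class, 0.4375 for the one-class model) come from the table earlier in the lecture.

```python
def attribute_probability(levels, base):
    """Evaluate the nested mixture from the inside out.

    levels: list of (weight, probability) pairs, most specific class first.
    base:   feature probability under the most inclusive class, which is
            assumed to receive whatever weight is left over.
    """
    prob = base
    for weight, p in reversed(levels):
        prob = weight * p + (1.0 - weight) * prob
    return prob

# "Has Leaves": 0.875 in the specific class, 0.4375 in the one-class model.
# The weights stand in for a shift from the general to the specific class.
for weight in (0.0, 0.5, 1.0):
    print(weight, attribute_probability([(weight, 0.875)], base=0.4375))
```

With a weight of 0 the attribution equals the general-class value, and as the weight on the specific class grows it moves toward the specific-class value.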

Should we replace the PDP model with the Naïve Bayes Classifier?

• It explains a lot of the data, and offers a succinct abstract characterization

• But
– It only characterizes what’s learned when the data actually has hierarchical structure

• So it may be a useful approximate characterization in some cases, but can’t really replace the real thing.

Structure Extracted by a Structured Statistical Model

Predictions

• Similarity ratings (and patterns of inference) will violate the hierarchical structure

• Patterns of inference will vary by context

Experiments

• Size, predator/prey, and other properties affect similarity across birds, fish, and mammals

• Property inferences show clear context specificity

• Future experiments will examine whether inferences (even of biological properties) violate a hierarchical tree for items like weasels, pandas, and beavers

The Nature of Cognition, and the Place of PDP in Cognitive Theory?

• Many view human cognition as inherently
– Structured
– Systematic
– Rule-governed

• In this framework, PDP models are seen as
– Mere implementations of higher-level, rational, or ‘computational level’ models
– … that don’t work as well as models that stipulate explicit rules or structures

The Alternative

• We argue instead that cognition (and the domains to which cognition is applied) is inherently
– Quasi-regular
– Semi-systematic
– Context sensitive

• On this view, highly structured models:
– Are Procrustean beds into which natural cognition fits uncomfortably
– Won’t capture human cognitive abilities as well as models that allow a more graded and context sensitive conception of structure

Levels of Analysis

• Marr (1982) suggested we should analyze cognitive tasks at three levels:

– Computation: what are the goals, what information is available, how could the information be used to achieve the goals; what is the best that can be done with the given information?

– Algorithms and representations: How is information represented? What algorithms are used in manipulating representations?

– Implementation: How are the algorithms and representations implemented in neural circuitry?

• PDP models often closely approximate (and can in many cases exactly match) idealized competence models (including structured probabilistic models).

• Which is the approximation?

• The PDP approach encourages computational level analysis but asks many questions about it:

– How do we know what task – which computations – an organism is actually trying to carry out?

– Is performance constrained by tasks the organism was trying to perform when it evolved or that it has performed habitually? Such constraints may be ‘wired into’ the processing mechanism, constraining its performance and preventing optimality for a given task.

– The approach leads us to ask: How does the architecture and/or type of processing machinery constrain the problem and its solution? Perhaps performance is being optimized within such constraints?

• The PDP approach also blurs the distinction between the algorithmic and implementation levels

– PDP models generally do not concern themselves with the minute details of neural implementation, and their performance often approximates performance that would be achieved by an explicit algorithm – thus they appear to lie between Marr’s algorithmic and implementation levels

– PDP models do not deny that there are temporally extended cognitive processes, e.g. in problem solving and planning, that involve many steps and that can often be usefully characterized in terms of a sequence of discrete states (but leave open the possibility that insight and creativity short-circuit such processes).

– The automatic and intuition-based nature of PDP models may, however, be very relevant even in our most advanced forms of cognition.