The Appeal of Parallel Distributed Processing


Post on 06-Jan-2016


Page 1: The Appeal of Parallel Distributed Processing

The Appeal of Parallel Distributed Processing

J.L. McClelland, D.E. Rumelhart, and G.E. Hinton

Interdisciplinary Program in Cognitive Science, 강소영 (Kang So-young)

Page 2: The Appeal of Parallel Distributed Processing

Contents

1. Introduction
2. Parallel Distributed Processing
3. Examples of PDP Models
4. Representation and Learning in PDP Models
5. Origins of Parallel Distributed Processing

Page 3: The Appeal of Parallel Distributed Processing

1. Introduction

Multiple Simultaneous Constraints
  Reaching and Grasping
  The Mutual Influence of Syntax and Semantics
  Simultaneous Mutual Constraints in Word Recognition
  Understanding the Interplay of Multiple Sources of Knowledge

Page 4: The Appeal of Parallel Distributed Processing

The Mutual Influence of Syntax and Semantics

Syntactic constraint
  The boy the man chased kissed the girl

Semantic constraint
  I saw the Grand Canyon flying to New York
  I saw the sheep grazing in the field

Mutual constraint within each of these domains
  I like the joke
  I like the drive
  I like to joke
  I like to drive

Page 5: The Appeal of Parallel Distributed Processing

Simultaneous Mutual Constraints in Word Recognition

Selfridge’s example

Paradox: How can we get the process started?

Solution: our perceptual system is capable of exploring all these possibilities without committing itself to one until all of the constraints are taken into account.

Page 6: The Appeal of Parallel Distributed Processing
Page 7: The Appeal of Parallel Distributed Processing

Understanding the Interplay of Multiple Sources of Knowledge

Knowledge structures
  scripts (Schank 1976)
  frames (Minsky 1975)
  schemata (Norman and Bobrow 1976; Rumelhart 1975)

Most everyday situations cannot be rigidly assigned to just a single script.
  Understanding depends on the interplay between a number of different sources of information.
  ex) birthday party at a restaurant

The generative capacity of human understanding in novel situations --> knowledge structures must interact with each other

Page 8: The Appeal of Parallel Distributed Processing

2. Parallel Distributed Processing

Properties of the tasks that people are good at:
  A number of different pieces of information must be kept in mind at once.
  Each plays a part, constraining others and being constrained by them.

Assumption of PDP models:
  Processing takes place through the interactions of a large number of simple processing elements (units), each sending excitatory and inhibitory signals to other units.

Elements of the model: units, activations, and interactions among units
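The assumption on this slide (simple units exchanging excitatory and inhibitory signals) can be illustrated with a small sketch. This is not code from the chapter: the update rule, the weights, the external inputs, and the function names `step` and `run` are all assumptions chosen only to make the idea concrete.

```python
# Minimal sketch of a PDP-style network: simple units exchange
# excitatory (positive) and inhibitory (negative) signals.
# All names, weights, and parameters are illustrative assumptions.

def step(activations, weights, external, rate=0.2):
    """Update every unit in parallel from the same previous state."""
    new = []
    for i, a in enumerate(activations):
        # Net input: external evidence plus weighted signals from the
        # other units. Only positively active units send signals.
        net = external[i] + sum(
            weights[i][j] * max(activations[j], 0.0)
            for j in range(len(activations)) if j != i
        )
        # Move the activation a little toward the net input,
        # then clip it to the range [-1, 1].
        a = a + rate * (net - a)
        new.append(max(-1.0, min(1.0, a)))
    return new

def run(weights, external, steps=100):
    """Start from rest and let the network settle."""
    acts = [0.0] * len(external)
    for _ in range(steps):
        acts = step(acts, weights, external)
    return acts

# Units 0 and 1 excite each other; unit 2 and the pair inhibit each other.
WEIGHTS = [
    [0.0, 0.5, -0.5],
    [0.5, 0.0, -0.5],
    [-0.5, -0.5, 0.0],
]
EXTERNAL = [0.6, 0.4, 0.5]  # hypothetical evidence for each unit

final = run(WEIGHTS, EXTERNAL)
print([round(a, 2) for a in final])
```

In this toy setting the two mutually supporting units end up active and the unit they are inconsistent with is suppressed, even though its own external evidence was stronger than that of unit 1. No unit decides the outcome alone; it emerges from the interactions.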

Page 9: The Appeal of Parallel Distributed Processing

PDP Models: Cognitive Science or Neuroscience

The appeal of PDP:
  Computationally sufficient and psychologically accurate mechanistic accounts of the phenomena of human cognition

PDP models have radically altered the way we think about
  the time course of processing
  the nature of representation
  the mechanisms of learning

Page 10: The Appeal of Parallel Distributed Processing

Microstructure of Cognition

Parallel distributed processing models offer alternatives to serial models of the microstructure of cognition. They do not deny that there is a macrostructure.

Objects referred to in macrostructural models of cognitive processing are seen as approximate descriptions of emergent properties of the microstructure

Page 11: The Appeal of Parallel Distributed Processing

3. Examples of PDP Models

Recent applications of PDP
  Motor control, perception, memory, language

PDP mechanisms are used to provide natural accounts of the exploitation of multiple, simultaneous, mutual constraints.

Page 12: The Appeal of Parallel Distributed Processing

3.1 Motor Control

Hinton’s stick person

Two constraints on the task
  the tip of the forearm must touch the object
  the center of gravity must stay over the foot

Each processor receives two pieces of information
  how far the tip of the hand is from the target
  where the center of gravity is with respect to the foot

The task: find a combination of joint angles that satisfies both constraints
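A rough sketch of the two-constraint reaching problem follows. It is not Hinton's actual procedure: the four-segment figure, the segment lengths, the squared-error cost, and the one-joint-at-a-time local search are all assumptions made for illustration. Each joint only ever looks at the two error signals named on the slide (distance of the tip from the target, and the offset of the center of gravity from the foot).

```python
import math

# Hypothetical stick figure: four unit-length segments, foot fixed at the origin.
SEGMENTS = [1.0, 1.0, 1.0, 1.0]

def pose(angles):
    """Return the tip position and the x-offset of the center of gravity.

    Angles are absolute segment orientations in radians.
    """
    x = y = 0.0
    cog_x = 0.0
    for length, angle in zip(SEGMENTS, angles):
        nx = x + length * math.cos(angle)
        ny = y + length * math.sin(angle)
        cog_x += (x + nx) / 2 * length  # segment midpoint, weighted by length
        x, y = nx, ny
    return (x, y), cog_x / sum(SEGMENTS)

def cost(angles, target):
    """Combine the two constraints into one error value (an assumption)."""
    (tip_x, tip_y), cog_x = pose(angles)
    reach_error = (tip_x - target[0]) ** 2 + (tip_y - target[1]) ** 2
    balance_error = cog_x ** 2  # the foot is at x = 0
    return reach_error + balance_error

def settle(angles, target, step=0.02, sweeps=300):
    """Let each joint nudge its own angle whenever that lowers the error."""
    angles = list(angles)
    for _ in range(sweeps):
        for j in range(len(angles)):
            def cost_with(candidate, j=j):
                trial = angles[:j] + [candidate] + angles[j + 1:]
                return cost(trial, target)
            # Keep the current angle unless a small change is strictly better.
            options = (angles[j], angles[j] + step, angles[j] - step)
            angles[j] = min(options, key=cost_with)
    return angles

START = [math.pi / 2] * 4   # standing upright
TARGET = (1.0, 2.5)         # hypothetical object position

final_angles = settle(START, TARGET)
print(round(cost(START, TARGET), 3), round(cost(final_angles, TARGET), 3))
```

Updating the joints one at a time keeps the combined error from ever increasing, which makes the sketch easy to reason about. A faithful parallel version would let all joint processors adjust at the same time.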

Page 13: The Appeal of Parallel Distributed Processing
Page 14: The Appeal of Parallel Distributed Processing

3.2 Perception

Stereoscopic Vision
  Random Dot Stereogram --> Depth Perception

Perceptual Completion of Familiar Patterns

Completion of Novel Patterns

Page 15: The Appeal of Parallel Distributed Processing

Stereoscopic Vision

Marr and Poggio (1976)
  explain the perception of depth in random-dot stereograms
  Two general principles about the visual world

Page 16: The Appeal of Parallel Distributed Processing

Perceptual Completion of Familiar Patterns

Perception is influenced by familiarity
  Familiar items can be perceived in less time
  Familiarity helps resolve ambiguous lower-level information
  Familiarity lets us fill in missing lower-level information
    phonemic restoration effect

Visual perception of words (McClelland and Rumelhart 1981)
  Assumption of the model: detectors for the visual features

Page 17: The Appeal of Parallel Distributed Processing

Two hypotheses (or activations)
  mutually consistent: support each other
  mutually inconsistent: weaken each other

Two kinds of inconsistency
  between-level inconsistency: between-level inhibition
  mutual exclusion: competitive inhibition
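The support and inhibition relations on this slide can be sketched in a few lines. This is only an illustration in the spirit of the word perception model, not the McClelland and Rumelhart (1981) implementation: the three-word lexicon, the parameter values, and the function `recognize` are assumptions, and top-down feedback from words to letters is left out.

```python
# Hypothetical three-word lexicon; all parameters are illustrative assumptions.
WORDS = ["WORK", "WORD", "FORK"]

def recognize(evidence, steps=30, excite=0.1, inhibit=0.05,
              compete=0.4, decay=0.1):
    """Settle word activations given letter evidence.

    evidence: one dict per letter position, mapping letter -> strength.
    """
    acts = {w: 0.0 for w in WORDS}
    for _ in range(steps):
        new = {}
        for w in WORDS:
            net = 0.0
            for pos, letters in enumerate(evidence):
                for letter, strength in letters.items():
                    if w[pos] == letter:
                        # Consistent letter and word: support.
                        net += excite * strength
                    else:
                        # Between-level inconsistency: between-level inhibition.
                        net -= inhibit * strength
            # Mutual exclusion: competitive inhibition from active rival words.
            rivals = sum(max(acts[o], 0.0) for o in WORDS if o != w)
            net -= compete * rivals
            a = acts[w] + net - decay * acts[w]
            new[w] = max(-0.2, min(1.0, a))  # keep activation bounded
        acts = new  # all word units are updated in parallel
    return acts

# The first three letters are clear; the last is ambiguous between K and D.
EVIDENCE = [{"W": 1.0}, {"O": 1.0}, {"R": 1.0}, {"K": 0.6, "D": 0.4}]

acts = recognize(EVIDENCE)
print(max(acts, key=acts.get), acts)
```

All three words start with some support, but because the rivals inhibit one another, a small initial advantage for the most consistent word grows until the alternatives are pushed below zero.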

Page 18: The Appeal of Parallel Distributed Processing
Page 19: The Appeal of Parallel Distributed Processing
Page 20: The Appeal of Parallel Distributed Processing

Completion of Novel Patterns

Results of the word perception model
  It exhibits perceptual facilitation for pronounceable nonwords as well as words.
  General principles or rules can emerge from the interactions of simple processing elements.
  It does not implement exactly any of the systems of orthographic rules that have been proposed by linguists or psychologists.

PDP models may provide more accurate accounts of the details of human performance than models based on a set of rules representing human competence

Page 21: The Appeal of Parallel Distributed Processing
Page 22: The Appeal of Parallel Distributed Processing

3.3 Retrieving Information From Memory

Content Addressability
Graceful Degradation
Default Assignment
Spontaneous Generalization

Page 23: The Appeal of Parallel Distributed Processing

Jets and Sharks Model

Page 24: The Appeal of Parallel Distributed Processing

4. Representation and Learning in PDP Models

What is the stored knowledge that gives rise to that pattern of activation?

The difference between PDP models and other models of cognitive processes
  Others: knowledge is stored as a static copy of a pattern
  PDP: the patterns themselves are not stored
    what is stored is the connection strengths between units that allow these patterns to be re-created

Page 25: The Appeal of Parallel Distributed Processing

Local Versus Distributed Representation

Distributed Representation

The knowledge about any individual pattern is not stored in the connections of a special unit reserved for that pattern, but is distributed over the connections among a large number of processing units.

Units may be conceptual primitives, or may have no particular meaning as individuals

Pattern Associator --> Hebbian Rule
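A pattern associator trained with a Hebbian rule can be sketched as follows. The two pattern pairs, the learning rate, and the function names are assumptions for illustration; the chapter's own examples differ. Each connection weight is increased in proportion to the product of its input and output activations, so both associations end up stored in the same set of weights.

```python
# Sketch of a pattern associator trained with a Hebbian rule.
# Patterns, sizes, and the learning rate are illustrative assumptions.

def hebbian_train(pairs, lr=0.25):
    """Build one weight matrix from all (input, output) pattern pairs."""
    n_in = len(pairs[0][0])
    n_out = len(pairs[0][1])
    weights = [[0.0] * n_in for _ in range(n_out)]
    for inp, out in pairs:
        for i in range(n_out):
            for j in range(n_in):
                # Hebbian increment: product of output and input activation.
                weights[i][j] += lr * out[i] * inp[j]
    return weights

def recall(weights, inp):
    """Each output unit sums its weighted inputs."""
    return [sum(w * x for w, x in zip(row, inp)) for row in weights]

A_IN, A_OUT = [1, -1, 1, -1], [1, 1, -1, -1]
B_IN, B_OUT = [1, 1, -1, -1], [-1, 1, 1, -1]  # B_IN is orthogonal to A_IN

W = hebbian_train([(A_IN, A_OUT), (B_IN, B_OUT)])

print(recall(W, A_IN))  # re-creates A_OUT
print(recall(W, B_IN))  # re-creates B_OUT
```

With four input units and a learning rate of 0.25, each stored output is re-created exactly, because the two input patterns are orthogonal and therefore do not interfere. No weight belongs to either pair alone.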

Page 26: The Appeal of Parallel Distributed Processing

Attractive Properties of Pattern Associator Models

Uncorrelated patterns do not interact with each other, but more similar ones do.

If we present the same pair of patterns over and over, but each time we add a little random noise to each element of each member of the pair, the system will automatically learn to associate the central tendency of the two patterns and will learn to ignore the noise.

What will be stored will be an average of the similar patterns with the slight variations removed.
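The averaging property can be checked with a small experiment. This is a sketch under stated assumptions: the prototype patterns, the uniform noise of at most 0.3 per element, the 200 presentations, and the scaling of the Hebbian increment are all choices made here, not taken from the chapter.

```python
import random

def noisy(pattern, rng, amount=0.3):
    """Return a copy of the pattern with a little random noise on each element."""
    return [x + rng.uniform(-amount, amount) for x in pattern]

def train_on_noisy_pairs(inp, out, presentations=200, seed=0):
    """Present distorted versions of one pair many times (Hebbian updates)."""
    rng = random.Random(seed)
    n = len(inp)
    weights = [[0.0] * n for _ in out]
    for _ in range(presentations):
        noisy_in, noisy_out = noisy(inp, rng), noisy(out, rng)
        for i, o in enumerate(noisy_out):
            for j, x in enumerate(noisy_in):
                # Hebbian increment, scaled so the weights form an average
                # over all presentations.
                weights[i][j] += o * x / (presentations * n)
    return weights

PROTO_IN = [1, -1, 1, -1]    # hypothetical prototype patterns
PROTO_OUT = [1, 1, -1, -1]

W = train_on_noisy_pairs(PROTO_IN, PROTO_OUT)

# Probe with the clean input prototype, which was never presented exactly.
recalled = [sum(w * x for w, x in zip(row, PROTO_IN)) for row in W]
print([round(r, 2) for r in recalled])
```

The network never sees the clean prototypes, only distorted versions, yet the clean input recalls an output with the same signs as the clean output prototype. The individual distortions average out in the weights.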

Page 27: The Appeal of Parallel Distributed Processing

Extracting the Structure of an Ensemble of Patterns

Distributed Model
  If there are regularities in the correspondences between pairs of patterns, the model will naturally extract these regularities.

Language Learning Model - learning the past tense
  creation of regular past tenses of new verbs
  overregularization of the irregular verbs
  the same phenomena as those shown in children’s past tense acquisition
  we can see how the acquisition of performance that conforms to linguistic rules can emerge from a simple, local, connection strength modulation process

Page 28: The Appeal of Parallel Distributed Processing

5. Origins of Parallel Distributed Processing

Jackson (1869/1958) and Luria (1966)
  distributed, multilevel conceptions of processing systems
  dynamic functional system

Hebb (1949) and Lashley (1950)
  “there are no special cells reserved for special memories”

Rosenblatt (1959, 1962) and Selfridge (1955)
  perceptron
  Pandemonium: importance of interactive processing

Anderson, Grossberg, Longuet-Higgins (1960s, 1970s)
  concept learning
  competitive learning mechanism
  distributed memory models

Page 29: The Appeal of Parallel Distributed Processing

Marr and Poggio (1976)

Morton’s logogen model (1969)
  one of the first models to capture concretely the principle of interaction of different sources of information

Marslen-Wilson (1978)
  empirical demonstrations of interaction between different levels of language processing

Levin’s Proteus model (1976)
  virtues of activation-competition models

Feldman and Ballard (1982)
Hofstadter (1979, 1985)
Sutton and Barto (1981) --> delta rule
Hopfield (1982)