Connectionist Sentence Comprehension and Production System
A model by Dr. Douglas Rohde, M.I.T.
by Dave Cooke, Nov. 6, 2004
Overview
o Introduction
  – A brief overview of Artificial Neural Networks
  – The basic architecture
o Introduce Douglas Rohde's CSCP model
  – Overview
  – Penglish Language
  – Architecture
  – Semantic System
  – Comprehension, Prediction, and Production System
  – Training
  – Testing
  – Conclusions
o Bibliography
A Brief Overview
o Basic definition of an Artificial Neural Network
o A network of interconnected “neurons” inspired by the biological nervous system.
o The function of an Artificial Neural Network is to produce an output pattern from a given input.
o First described by Warren McCulloch and Walter Pitts in 1943 in their seminal paper “A Logical Calculus of the Ideas Immanent in Nervous Activity”.
Artificial neurons are modeled after biological neurons
The architecture of an Artificial Neuron
Architecture -- Structure
o Network Structure
  o Many types of neural network structures
  o Ex. Feedforward, Recurrent
o Feedforward
  o Can be single layered or multi-layered
  o Inputs are propagated forward to the output layer
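The forward propagation described above can be sketched as follows. This is a minimal illustration, not code from the presentation: the sigmoid activation, the function names, and the weight layout (one row of weights per output unit) are all assumptions.

```python
import math


def sigmoid(z):
    """Logistic activation (an assumed choice; the slides do not name one)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weights, biases):
    """Compute one layer: each output unit sums its weighted inputs plus a bias."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def feedforward(inputs, layers):
    """Propagate inputs forward through a list of (weights, biases) layers."""
    activations = inputs
    for weights, biases in layers:
        activations = layer(activations, weights, biases)
    return activations
```

A single-layer network is a list with one `(weights, biases)` pair; a multi-layer network simply has more pairs.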
Architecture -- Recurrent NN
o Recurrent Neural Networks
  o Operate on an input space and an internal state space – they have memory.
o Primary types of recurrent neural networks
  o Simple recurrent
  o Fully recurrent
o Below is an example of a simple recurrent network (SRN)
Architecture -- Learning
o Learning used in NNs
o Learning = change in connection weights
o Supervised networks: the network is told the correct answer
  o Ex. backpropagation, backpropagation through time, reinforcement learning
o Unsupervised networks: the network is not told the correct answer and has to find structure in the input on its own
  o Ex. competitive learning, self-organizing or Kohonen maps
Architecture -- Learning (BPTT)
o Backpropagation Through Time (BPTT) is used in the CSCP Model and SRNs
o In BPTT the network runs ALL of its forward passes, then performs ALL of the backward passes.
o Equivalent to unrolling the network backwards through time
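The "all forward passes, then all backward passes" idea can be sketched on a deliberately tiny case. The network below is a hypothetical one-unit simple recurrent network with a tanh hidden unit, a linear output, and squared error; none of these choices come from the CSCP model itself, and the names `w_x`, `w_h`, and `w_o` are made up for illustration.

```python
import math


def bptt_scalar_srn(xs, targets, w_x, w_h, w_o):
    """Return (loss, (g_x, g_h, g_o)) for a one-unit simple recurrent network."""
    # Forward passes: run every time step first, storing each hidden state.
    hs = [0.0]  # hs[0] is the initial hidden state
    for x in xs:
        hs.append(math.tanh(w_x * x + w_h * hs[-1]))
    ys = [w_o * h for h in hs[1:]]
    loss = 0.5 * sum((y - t) ** 2 for y, t in zip(ys, targets))

    # Backward passes: walk the unrolled network from the last step to the first.
    g_x = g_h = g_o = 0.0
    carry = 0.0  # gradient arriving from step t+1 through the recurrent link
    for t in reversed(range(len(xs))):
        err = ys[t] - targets[t]              # output error at step t
        g_o += err * hs[t + 1]
        d_h = err * w_o + carry               # total gradient on this hidden state
        d_pre = d_h * (1.0 - hs[t + 1] ** 2)  # back through the tanh
        g_x += d_pre * xs[t]
        g_h += d_pre * hs[t]                  # hs[t] is the previous hidden state
        carry = d_pre * w_h
    return loss, (g_x, g_h, g_o)
```

The gradients are accumulated across time steps because the same three weights are reused at every step of the unrolled network.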
The CSCP Model
o Connectionist Sentence Comprehension and Production model
o Primary Goal: learn to comprehend and produce sentences written in the Penglish (Pseudo English) language.
o Secondary Goal: to construct a model that will account for a wide range of human sentence processing behaviours.
Basic Architecture
o A Simple Recurrent NN is used
o Penglish (Pseudo English) was used to train and test the model.
o Consists of 2 separate parts connected by a “message layer”
  o Semantic System (Encoding/Decoding System)
  o CPP system
o Backpropagation Through Time (BPTT) is the learning algorithm.
  o A method for learning temporal tasks
Penglish
o Goal: to produce only sentences that are reasonably valid in English
o Built around the framework of a stochastic context-free grammar (SCFG).
o Given an SCFG it is easy to generate sentences, parse sentences, and perform optimal prediction
o A subset of English; some of the elements used are:
  o 56 verb stems
  o 45 noun stems
  o Adjectives, determiners, adverbs, subordinate clauses
  o Several types of logical ambiguity
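To illustrate why generation is easy given an SCFG, here is a minimal sketch of a sentence generator. The grammar is a hypothetical toy, far smaller than Penglish and not taken from it; the rule probabilities and vocabulary are invented for the example.

```python
import random

# Hypothetical toy grammar; NOT the actual Penglish grammar.
# Each non-terminal maps to a list of (probability, expansion) pairs.
GRAMMAR = {
    "S": [(1.0, ["NP", "VP"])],
    "NP": [(0.7, ["Det", "N"]), (0.3, ["Det", "Adj", "N"])],
    "VP": [(0.6, ["V", "NP"]), (0.4, ["V"])],
    "Det": [(0.5, ["the"]), (0.5, ["a"])],
    "Adj": [(1.0, ["new"])],
    "N": [(0.5, ["teacher"]), (0.5, ["book"])],
    "V": [(0.5, ["saw"]), (0.5, ["forgot"])],
}


def generate(symbol, rng):
    """Expand a symbol into a list of words by sampling rules by probability."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal: an actual word
    probs, expansions = zip(*GRAMMAR[symbol])
    chosen = rng.choices(expansions, weights=probs, k=1)[0]
    words = []
    for child in chosen:
        words.extend(generate(child, rng))
    return words


if __name__ == "__main__":
    print(" ".join(generate("S", random.Random(0))))
```

Because the rules are chosen only by probability, a grammar like this can emit sentences that are well formed but semantically odd, which is why extra constraints are needed on top of the grammar.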
Penglish
o Penglish sentences do not always sound entirely natural even though constraints to avoid semantic violations were implemented
o Example sentences are:
o (1) We had played a trumpet for you
o (2) A answer involves a nice school.
o (3) The new teacher gave me a new book of baseball.
o (4) Houses have had something the mother has forgotten
The CSCP Model
[Figure: CSCP model diagram. Components: Semantic System, CPP System. Annotations: “Start”; “stores all propositions seen for current sentence”.]
Semantic System
Propositions loaded sequentially
Propositions stored in Memory
Semantic System
Error measure
Training (SS)
o Backpropagation
  o Trained separately from, and prior to, the rest of the model.
o The decoder uses standard single-step backpropagation.
o The encoder is trained using BPTT.
o The majority of the running time is in the decoding stage.
Training (SS)
Error is assessed here.
The CPP System
Error measure
Phonologically encoded word.
CPP System (cont.)
Starts here by trying to predict the next word in the sentence.
Goal: to produce the next word in the sentence and pass it to the Word Input Layer.
The CPP System - Training
1. BPTT starts here.
2. Backpropagated to here.
3. Previously recorded output errors are injected here.
4. BPTT
Training
o 16 Penglish training sets
o Set = 250,000 sentences, total = 4 million sentences
o 50,000 weight updates per set = 1 epoch
o Total of 16 epochs.
o The learning rate started at 0.2 for the first epoch and was then gradually reduced over the course of learning.
o After the Semantic System, the CPP system was similarly trained
o Training began with limited-complexity sentences, and complexity increased gradually.
o Training a single network took about 2 days on a 500 MHz Alpha. Total training time took about two months.
o Overall, 3 networks were trained
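The training figures above can be checked with a few lines of arithmetic. The variable names are made up for this sketch, and the sentences-per-update figure is an inference that assumes each sentence in a set is presented exactly once per epoch, which the slides do not state.

```python
SETS = 16
SENTENCES_PER_SET = 250_000
UPDATES_PER_SET = 50_000  # one set = one epoch

total_sentences = SETS * SENTENCES_PER_SET  # matches the "4 million" total
total_updates = SETS * UPDATES_PER_SET      # weight updates over all 16 epochs

# Assumption: every sentence in a set is seen once per epoch.
sentences_per_update = SENTENCES_PER_SET // UPDATES_PER_SET

print(total_sentences, total_updates, sentences_per_update)
```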
Testing
o 50,000 sentences
o 33.8% of testing sentences also appeared in one of the training sets.
o Nearly all of the sentences had 1 or 2 propositions.
o 3 forms of measurement are used in measuring comprehension.
  o Multiple choice measure
  o Reading time measure
  o Grammaticality rating measure
Testing (Multiple Choice)
o Example: “When the owner let go, the dog ran after the mailman.”
  o Expressed as [ran after, theme, ?]
o Possible answers
  o Mailman (correct answer)
  o Owner, dog, girls, cats (distractors)
o Error measure is
o When applying four distractors, the chance performance is 20% correct.
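A multiple-choice test of this kind can be sketched as picking the candidate whose representation is closest to the network's output. The slide's actual error measure is not reproduced here, so squared distance is used purely as a stand-in, and the function names and candidate vectors are hypothetical.

```python
def pick_answer(output, candidates):
    """Return the name of the candidate vector closest to the output.

    Squared distance is a stand-in, NOT the error measure used in the model.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(candidates, key=lambda name: sq_dist(output, candidates[name]))


def chance_rate(n_distractors):
    """Probability of guessing correctly among 1 answer plus n distractors."""
    return 1.0 / (n_distractors + 1)


if __name__ == "__main__":
    # Made-up two-dimensional vectors, for illustration only.
    options = {
        "mailman": [1.0, 0.0],
        "owner": [0.0, 1.0],
        "dog": [0.5, 0.5],
        "girls": [0.0, 0.0],
        "cats": [1.0, 1.0],
    }
    print(pick_answer([0.9, 0.1], options), chance_rate(4))
```

With four distractors there are five options in total, which gives the 20% chance level quoted above.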
Testing (Reading Time)
o Also known as Simulated Reading Time
o It’s a weighted average of 4 components.
  o 1 and 2: “Measure the degree to which the current word was expected”
  o 3rd: “The change in the message that occurred when the current word was read”
  o 4th: “The average level of activation in the message layer”
o The four components are multiplied by scaling factors to achieve average values of close to 1.0 for each of them, and a weighted average is then taken.
o Ranges from 0.4 for easy words to 2.5 or more for very hard words.
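The scale-then-average computation can be sketched as below. The slides give neither the scaling factors nor the weights, so both are caller-supplied parameters here, and the function name is made up for this example.

```python
def simulated_reading_time(components, scales, weights):
    """Scale four raw components, then take their weighted average.

    The scaling factors and weights are hypothetical inputs; the
    presentation does not list the values used in the model.
    """
    if not (len(components) == len(scales) == len(weights) == 4):
        raise ValueError("expected exactly four components, scales, and weights")
    scaled = [c * s for c, s in zip(components, scales)]
    return sum(w * v for w, v in zip(weights, scaled)) / sum(weights)
```

If every scaled component sits at its average value of about 1.0, the result is about 1.0 regardless of the weights, which is what makes the 0.4 to 2.5 range easy to interpret.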
Testing (Grammaticality)
o The Grammaticality Method
  o (1) Prediction accuracy (PE)
    o Indicator of syntactic complexity
    o Involves the point in the sentence at which the worst two consecutive predictions occur.
  o (2) Comprehension performance (CE)
    o Average strict-criterion comprehension error rate on the sentence.
    o Intended to reflect the degree to which the sentence makes sense.
o Simulated ungrammaticality rating (SUR)
  o SUR = (PE - 8) × (CE + 0.5)
  o Combines the two components into a single measure of ungrammaticality
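The SUR formula translates directly into code. The function and argument names are chosen for this sketch; the constants 8 and 0.5 are the ones given in the formula above.

```python
def simulated_ungrammaticality_rating(pe, ce):
    """SUR = (PE - 8) * (CE + 0.5), combining prediction and comprehension error."""
    return (pe - 8.0) * (ce + 0.5)


if __name__ == "__main__":
    # Example inputs are arbitrary, not results from the model.
    print(simulated_ungrammaticality_rating(10.0, 0.5))
```

Because the two terms are multiplied, a sentence scores as highly ungrammatical only when both its prediction error and its comprehension error are large.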
Conclusions
o General Comprehension Results
  o Final networks are able to provide complete, accurate answers
    o Given NO choices: 77%
    o Given 5 choices: 92%
o Sentential Complement Ambiguity
  o Strict criterion error rate: 13.5%
  o Multiple choice: 2%
o Subordinate Clause Ambiguity
  o Ex. Although the teacher saw a book was taken in the school.
  o The intransitive, weak bad, weak good, strong bad, and strong good conditions were all under 20% error rate on multiple choice questions.
Bibliography
1. Artificial Intelligence 4th ed, Luger G.F., Addison Wesley, 2002
2. Artificial Intelligence 2nd ed, Russell & Norvig, Prentice Hall, 2003
3. Neural Networks 2nd ed, Picton P., Palgrave, 2000
4. A connectionist model of sentence comprehension and production, Rohde D., MIT, March 2 2002
5. Finding Structure in Time, Elman J.L., UC San Diego, Cognitive Science, 14, 179-211, 1990
6. Fundamentals of Neural Networks, Fausett L., Pearson, 1994