Cognitive Science and Its Educational Application
Cognitive Science
"the study of intelligence and intelligent systems, with particular reference to intelligent behaviour as computation" (Simon & Kaplan, 1989)
Simon, H. A. & C. A. Kaplan, "Foundations of cognitive science", in Posner, M.I. (ed.) 1989, Foundations of Cognitive Science, MIT Press, Cambridge MA.
Cognitive Science
interdisciplinary study of the acquisition and use of knowledge. It includes as contributing disciplines: artificial intelligence,
psychology, linguistics, philosophy, anthropology, neuroscience, and education.
Cognitive science grew out of three developments: the invention of computers and the attempts to design programs that
could do the kinds of tasks that humans do; the development of information processing psychology where the goal
was to specify the internal processing involved in perception, language, memory, and thought;
and the development of the theory of generative grammar and related offshoots in linguistics.
Cognitive science was a synthesis concerned with the kinds of knowledge that underlie human cognition, the details of human cognitive processing, and the computational modeling of those processes.
Eysenck, M.W. ed. (1990). The Blackwell Dictionary of Cognitive Psychology. Cambridge, Massachusetts: Basil Blackwell Ltd.
Cognitive psychology
is concerned with information processing, and includes a variety of processes such as attention, perception, learning, and memory.
It is also concerned with the structures and representations involved in cognition.
The greatest difference between the approach adopted by cognitive psychologists and by the Behaviorists is that cognitive psychologists are interested in identifying in detail what happens between stimulus and response.
Types of Information Processing
Sequential Processing
Parallel Distributed Processing
Sequential Processing
What is
Cognitive View of Learning
Learning is the result of our attempts to make sense of the world
(Diagram labels: Stimulus, Knowledge, Response)
What does “make sense” mean?
Information Processing Model
(Diagram components: ENVIRONMENT, RECEPTORS, SENSORY REGISTER, SHORT-TERM MEMORY, LONG-TERM MEMORY, RESPONSE GENERATORS, EFFECTORS, EXECUTIVE CONTROL, EXPECTANCIES)
(From R.M. Gagne, 1974)
Sensory Register
Receptor Memory
Information
leaves in 1 to 3 seconds
perceived (organized) information
goes to working memory
Attention
Storage structure: Sensory "store"
Code: Sensory features
Capacity: 12-20 items to huge
Duration: 250 msec. - 4 sec.
Retrieval: Complete
Causes of failure to recall: Masking or decay

Storage structure: Short-term memory
Code: Acoustic, visual, semantic, sensory features identified and named
Capacity: 7 +- 2 items
Duration: About 12 sec.; longer with rehearsal
Retrieval: Complete, with each item being retrieved every 35 msec.
Causes of failure to recall: Displacement, interference, decay

Storage structure: Long-term memory
Code: Semantic, visual knowledge, abstractions, meaning, images
Capacity: Enormous, virtually unlimited
Duration: Indefinite
Retrieval: Specific and general information available given proper cueing
Causes of failure to recall: Interference, organic dysfunctioning, inappropriate cues
TYPES OF KNOWLEDGE
KNOWLEDGE
CONCEPTUAL KNOWLEDGE
PROCEDURAL KNOWLEDGE
Example of a semantic Network
(Diagram. Nodes: Furniture, Table, chair, brown, 4 legs, dog, terrier, Tom. Link labels: isa, colour, shape.)
Pattern-recognition
3 + 5
This means I should add up 3 and 5.
Action-sequence
If I should add up 3 and 5, then let me think... Ah, the answer is 8.
Pattern-recognition: If <the pattern of adding up numbers is seen> Then <carry out the operation to add the numbers>
Action-sequence: If <two numbers are to be added> Then <do the addition>
Composition of Productions
Production 1: If p then q
Production 2: If q then r
Production 3: If r then s
Composition (rehearsed several times; faster; not easily separated)
Composite Production: If p then s
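The composition step above can be sketched in a few lines. This is a minimal illustration, assuming each production is just a (condition, action) pair and that productions chain when one action equals the next condition; the function name `compose` is hypothetical.

```python
def compose(*productions):
    """Collapse a chain of (condition, action) productions into one.

    Each production's action must be the next production's condition,
    e.g. (p, q), (q, r), (r, s) collapses to (p, s).
    """
    for (_, action), (condition, _) in zip(productions, productions[1:]):
        if action != condition:
            raise ValueError(f"cannot chain: {action!r} does not match {condition!r}")
    first_condition = productions[0][0]
    last_action = productions[-1][1]
    return (first_condition, last_action)


production_1 = ("p", "q")
production_2 = ("q", "r")
production_3 = ("r", "s")
composite = compose(production_1, production_2, production_3)
print(composite)
```

The intermediate results q and r no longer appear in the composite, which is one way to read "faster" and "not easily separated".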
Learning conceptual knowledge
Knowledge is stored in long-term memory.
A piece of knowledge is learned if it is linked to other pieces. The more it is rehearsed (linked to others), the stronger the links become, and the more likely it is to be recalled.
Learning Procedural Knowledge
Initially stored as declarative knowledge.
After compilation and composition, it becomes a piece of procedural knowledge.
Stored in long-term memory with the related conceptual knowledge.
Retrieval of Conceptual Knowledge
Activation of a node or several nodes.
Activation spreads through the links among nodes until the required piece is activated.
Nodes connected to the activated node by stronger links are more likely to be recalled.
Forgetting is the effect of interference.
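A rough sketch of spreading activation over a small semantic network follows. The wiring, the link strengths, and the `decay` and `threshold` parameters are all assumptions made for illustration; only the node names come from the semantic network example.

```python
# Hypothetical link strengths between nodes (higher = stronger link).
LINKS = {
    "table": {"furniture": 0.9, "brown": 0.6, "4 legs": 0.3},
    "furniture": {"table": 0.9, "chair": 0.8},
    "chair": {"furniture": 0.8},
    "brown": {"table": 0.6},
    "4 legs": {"table": 0.3},
}


def spread_activation(links, start, decay=0.5, threshold=0.05):
    """Spread activation outward from `start`; weaker links pass on less."""
    activation = {start: 1.0}
    frontier = [start]
    while frontier:
        node = frontier.pop()
        for neighbour, strength in links.get(node, {}).items():
            passed = activation[node] * strength * decay
            # Keep only activation that is above threshold and improves the node.
            if passed > threshold and passed > activation.get(neighbour, 0.0):
                activation[neighbour] = passed
                frontier.append(neighbour)
    return activation


activation = spread_activation(LINKS, "table")
for node, level in sorted(activation.items(), key=lambda item: -item[1]):
    print(f"{node}: {level:.3f}")
```

Nodes reached through stronger links end up with higher activation, which stands in for "more chances to be recalled".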
Retrieval of Procedural Knowledge
Procedures whose condition part matches the situation will be fired.
Several procedures may have the same condition; those with higher strength are more likely to be fired.
Strength of a procedure depends on …..
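One way to picture strength-based retrieval is a weighted random choice among the productions whose condition matches. The sketch below assumes that firing probability is proportional to strength; the strengths and the faulty action are invented for illustration.

```python
import random

# Hypothetical productions sharing one condition; strengths are made up.
PRODUCTIONS = [
    {"condition": "two numbers are to be added",
     "action": "do the addition", "strength": 0.8},
    {"condition": "two numbers are to be added",
     "action": "write the digits side by side", "strength": 0.2},
]


def select_production(productions, situation, rng=random):
    """Pick one matching production, favouring those with higher strength."""
    matching = [p for p in productions if p["condition"] == situation]
    if not matching:
        return None
    weights = [p["strength"] for p in matching]
    return rng.choices(matching, weights=weights, k=1)[0]


rng = random.Random(0)  # seeded so that repeated runs give the same draws
fired = [select_production(PRODUCTIONS, "two numbers are to be added", rng)["action"]
         for _ in range(1000)]
print(fired.count("do the addition"), fired.count("write the digits side by side"))
```

Under this assumption the weaker rule still fires occasionally, so a faulty rule can produce an error even when the correct rule is known.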
Learning of Procedural Knowledge
An Example
Pattern of Maturation – A possible route when
a student learns a rule
CORRECT
Sleeman (1985)
UNPREDICTABLE
CONSISTENT USE of MAL-RULES (incorrect rules)
MODELS AND THEORIES OF PROCEDURAL ERRORS
WHY STUDY ERRORS:
Instruction requires diagnosing well; diagnosing well requires knowing:
What are the errors?
How are errors formed?
Types of Procedural Errors
Slips: careless work. The student intends to perform the appropriate action but fails to do so.
Systematic Errors: due to mistaken or missing knowledge.
Possible Reasons for slips
Loss of information from working memory
Deployment of attention or cognitive control
Competition among the activation levels and triggering conditions of coexisting demons or schemata
Schema
a way of capturing the insight that concepts are defined by a configuration of features, and each of these features involves specifying a value the object has on some attribute.
The schema represents a concept by pairing a class of attribute with a particular value, and stringing all the attributes together.
They are a way of encoding regularities in categories, whether these regularities are propositional or perceptual.
They are also general, rather than specific, so that they can be used in many situations.
Example:
References: Anderson, J.R. (1990). Cognitive psychology and its implications. New York, NY: Freeman.
Possible Reasons for Systematic Errors
Incomplete or misguided learning
WHY STUDY SYSTEMATIC ERRORS:
A deeper insight into the learning process may be reached.
To diagnose students' work and to help them avoid making the same error again.
A computer diagnostic consultant may be developed based on the findings, so that individualized tutoring may become possible.
EXPLANATIONS ON SYSTEMATIC ERRORS
Incorrect or faulty algorithms with the same inputs as the correct algorithms.
Bugs.
Mal-rules.
Repair theory (Brown & VanLehn, 1980).
Deletion theory (Young & O'Shea, 1981).
Misgeneralization theory (Matz, 1982; Sleeman, 1984).
Control-slip explanation of errors.
Bugs
Systematic errors: bugs in the correct procedure
Slight modification or perturbation of a correct procedure (VanLehn, 1984)
A bug describes which problems the student gets wrong, what each wrong answer is, and the steps followed by the student in producing it (VanLehn, 1984)
306 - 138 = 78
80 - 4 = 76
183 - 95 = 88
702 - 11 = 591
3005 - 28 = 1087
7002 - 239 = 4873
Borrowing Across Zero Bug
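Procedure used in Subtraction
(Flowchart. Node labels, translated from Chinese:)
Subtraction
Set up
Write out the top number; write out the bottom number
Convert
Is the bottom number larger than the top number?
Compare
Compare individual digits
New problem
Reverse
Change sign
Columns
Complete column
Borrow?
Subtraction
Borrow
Columns completed
Get top digit
Subtraction facts table
Get top digit
Get bottom digit
Compare digits
Add ten; borrow ten
Find another column
Zero
Borrow one
Subtract one
Mark as nine
Write down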
Use of Bug Theory
A computer programme called Debuggy was designed to mimic errors made by the students (100% correct)
Mal-rules
Incorrect rules made by students in solving a task.
Algebra Task Types
1. Mx = P
2. Mx = N + P
3. Mx = NP
4. Mx + Nx = P
5. Mx + N = P
6. M + Nx = P
7. Mx = Nx + P
8. Mx = N (P*Q)
9. Mx = N (Px + Q)
10. Mx = N + P*Q
11. M + Nx + Px = Q
12. Mx = N + P (Qx+R)
13. Mx = N (x + P) = Q
14. Mx + N = Px + Q
Mal-rules
S1 M*X = N -> X = M/N
S2 pat1 +|- M pat2 = pat3 -> pat1 pat2 = pat3 +|- M
S3 pat1 +|- M pat2 = pat3 -> pat1 +|- pat2 = pat3 +|- M
S4 pat1 = pat2 +|- MX pat3 -> pat1 +|- MX = pat2 pat3
S5 M (NX +|- P) -> M*NX +|- P
S6 M (NX +|- P) -> M*NX +|- M +|- P
S7 M*X + N*X -> M*X*N
S8 M*X + N*X -> M*X + N
S9 M*X + N*X -> M+X+N+X
S10 M*X -> M+X
S11 M+N*X= -> M*N+X=
S12 M+N*X= -> M+N+X=
S13 M*X + N = -> M+X+N=
S14 M*X+N = -> M*X*N=
S15 M*X+N= -> (M+N)*X=
S16 M*X=N*P -> X=M
S17 M*X=N*X+P -> X+X=M+N+P
S18 M*X=N+P -> M*X=N
S19 M*X=N+P -> M*X=P
S20 M*X=N -> X=N
S21 M*X=N -> X=<N/F>/M
S22 M*X=N -> X=N/<M/F>
S23 M*(N*X+P) -> M*X+M*P
S24 2X/2X -> 0
S25 A*1/A -> 0
S26 0*A -> A
Table 2 The Algebra Mal-rules listed by Sleeman (1984), Supplemented by Three from Matz (1982)
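To make the idea of diagnosis with mal-rules concrete, here is a small sketch for the task type Mx = N. It encodes the correct rule together with mal-rules S1 and S20 from the table as functions and reports which of them reproduce a student's answer. The `RULES` table and the `diagnose` function are hypothetical names, not part of any published system.

```python
from fractions import Fraction

# Each rule maps the coefficients of M*X = N to the value it gives for X.
RULES = {
    "correct: M*X = N -> X = N/M": lambda m, n: Fraction(n, m),
    "S1: M*X = N -> X = M/N": lambda m, n: Fraction(m, n),
    "S20: M*X = N -> X = N": lambda m, n: Fraction(n),
}


def diagnose(m, n, student_answer):
    """Return the names of all rules that reproduce the student's answer."""
    return [name for name, rule in RULES.items() if rule(m, n) == student_answer]


# Task: 4x = 12. Assumes M and N are non-zero integers.
print(diagnose(4, 12, Fraction(3)))
print(diagnose(4, 12, Fraction(1, 3)))
print(diagnose(4, 12, Fraction(12)))
```

A single answer can match more than one rule (for example when M equals N), so in practice several tasks would be needed before a consistent mal-rule could be attributed to a student.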
How is the formation of errors explained?
Attempts to explain why errors form
1. Repair Theory: Causes
It is uninteresting to assert merely that an error can be "explained" by a mal-rule.
Questions such as how bugs are caused, and why some bugs are found but other possible ones are not, remain to be answered.
The aim is to develop a theory that can predict what bugs will exist for procedural skills that have not yet been analyzed.
Repair Theory
Impasse
Not quit, find ways to repair
Repaired, remembered
Repair Theory (Brown and VanLehn, 1980)
The student gets stuck (impasse) when executing a possibly incomplete procedure.
The student does not quit, but does a small amount of problem solving, just enough to get unstuck and complete the problem.
The local problem-solving strategy (repair) rarely succeeds in rectifying the broken procedure, and so causes errors.
An example
Suppose a student has never borrowed from zero. The first time he is asked to solve a borrow-from-zero problem, such as (a),
(a) 305 - 48
he will probably proceed as follows:
(b) 305 - 48 = 267
Process the units column by attempting to borrow from the tens column.
An impasse is reached because zero cannot be decremented.
Repair: skip the decrement operation.
The answer is as in (b), a bug called Stops-Borrow-At-Zero.
If the repair instead relocates the decrement operation to a nearby digit that is not zero, the result is as in (c), a bug called Borrow-Across-Zero.
(c) 305 - 48 = 167
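The two repairs can be simulated directly. The sketch below is an illustration only, not the BUGGY or Debuggy program: it assumes the student knows column subtraction with borrowing except for the borrow-from-zero case, and that the minuend is at least as large as the subtrahend. The function name and the repair labels passed as strings are hypothetical.

```python
def subtract_with_repair(minuend, subtrahend, repair):
    """Column subtraction where borrowing from a zero triggers a repair."""
    top = [int(d) for d in str(minuend)][::-1]        # units digit first
    bottom = [int(d) for d in str(subtrahend)][::-1]
    bottom += [0] * (len(top) - len(bottom))
    digits = []
    for i in range(len(top)):
        if top[i] < bottom[i]:                        # a borrow is needed
            j = i + 1
            if top[j] == 0:                           # impasse: zero cannot be decremented
                if repair == "stops-borrow-at-zero":
                    j = None                          # repair: skip the decrement
                elif repair == "borrow-across-zero":
                    while j < len(top) and top[j] == 0:
                        j += 1                        # repair: move to a nearby non-zero digit
                    if j == len(top):
                        j = None
                else:
                    raise ValueError(f"unknown repair: {repair}")
            if j is not None:
                top[j] -= 1
            top[i] += 10
        digits.append(top[i] - bottom[i])
    return int("".join(str(d) for d in reversed(digits)))


print(subtract_with_repair(305, 48, "stops-borrow-at-zero"))
print(subtract_with_repair(305, 48, "borrow-across-zero"))
```

The correct answer to 305 - 48 is 257; the two repairs reproduce the answers given in (b) and (c) respectively.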
2. Deletion Theory (Young & O'Shea, 1981)
BUGGY produces models that behave functionally like the students, but these models are not very convincing as psychological models.
Many of the bugs appear to be very similar (many are connected with borrowing from zero).
Some of the BUGGY data can be analyzed more simply in terms of certain competences being omitted from the ideal model.
Young and O'Shea classified students' errors into 3 categories:
Algorithm errors (124 = 36%):
Borrow when <, e.g. 96-42=44
Take smaller, e.g. 64-34=21
Always borrow, e.g. 96-42=44 and 92-46=46
S>M -> zero, e.g. 72-57=20
One -> 10, e.g. 71-52=18
Add Column 2, e.g. 21-19=22 (add the digits in the second column)
Pattern errors (54 = 16%):
0-N=N
0-N=0
N-N=N
Number-fact errors (127 = 37%)
Impossible to analyze (39 = 11%)
Production system to account for the subtraction errors.
A production system (PS) is a collection of rules of the form:
C => A
When the condition C is satisfied, the action A is meant to apply.
The architecture has three components:
A working memory (WM): a set of elements such as (S EQ M) or (RESULT 5).
A production memory, which holds a collection of production rules.
A conflict resolution method to determine which rule is to be fired when more than one is applicable.
Production System for subtraction by decomposition:
FD: M=m, S=s => FindDiff, NextColumn
B2A: S>M => Borrow
BS1: Borrow => *AddTenToM
BS2: Borrow => *Decrement
CM: M=m, S=s => *Compare
IN: ProcessColumn => *ReadMandS
TS: FindDiff => *TakeAbsDiff
NXT: NextColumn => *ShiftLeft, ProcessColumn
WA: Result=x => *Write=x
DONE: NoMore => *HALT
B2C: S=M => Result 0, NextColumn
AC: Result 1=x => *Carry, Result=x
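The sketch below is a heavily simplified production system in the spirit of the rule list above, written to show how deleting one rule produces a systematic error. It is not Young and O'Shea's actual system: only three rules are kept (named after B2A, FD and DONE), conflict resolution is assumed to be "first applicable rule in the list fires", and the minuend is assumed to be at least as large as the subtrahend.

```python
def run_subtraction(minuend, subtrahend, deleted=frozenset()):
    """Run a tiny production system for column subtraction."""
    top = [int(d) for d in str(minuend)][::-1]        # M digits, units first
    bottom = [int(d) for d in str(subtrahend)][::-1]  # S digits, units first
    bottom += [0] * (len(top) - len(bottom))
    wm = {"col": 0, "result": [], "halt": False}      # working memory

    def in_range():
        return wm["col"] < len(top)

    def borrow():                                     # AddTenToM and Decrement
        top[wm["col"]] += 10
        top[wm["col"] + 1] -= 1

    def find_diff():                                  # TakeAbsDiff, then NextColumn
        wm["result"].append(abs(top[wm["col"]] - bottom[wm["col"]]))
        wm["col"] += 1

    def halt():
        wm["halt"] = True

    # Production memory: (name, condition, action), in priority order.
    rules = [
        ("DONE", lambda: not in_range(), halt),
        ("B2A", lambda: in_range() and bottom[wm["col"]] > top[wm["col"]], borrow),
        ("FD", lambda: in_range(), find_diff),
    ]
    rules = [rule for rule in rules if rule[0] not in deleted]

    while not wm["halt"]:
        for _name, condition, action in rules:
            if condition():
                action()
                break
        else:
            break                                     # no rule applies
    return int("".join(str(d) for d in reversed(wm["result"])))


print(run_subtraction(64, 38))                   # full rule set
print(run_subtraction(64, 38, deleted={"B2A"}))  # borrow rule deleted
```

With the borrow rule deleted, the remaining rules still run to completion, but each column now yields the absolute difference of its digits, so the smaller digit is always taken from the larger.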
3. Misgeneralization
(Overgeneralization)
Overgeneralising from instances, using an "old" operator instead of a more recently introduced one, and regressing under cognitive load (Davis, Jockusch, & McKnight, 1978).
e.g. "+" instead of "*", "*" instead of exponentiation.
Matz (1982) suggested a number of high-level schemata which explain a series of observed errors.
Extrapolation principle: (A*B)^C => A^C * B^C makes students write (A+B)^C => A^C + B^C
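A quick numeric check shows why the extrapolated rule is a mal-rule. It reads C in the extrapolation principle as an exponent, which is an assumption about the slide's lost formatting; the values of a, b and c are arbitrary.

```python
a, b, c = 2, 3, 2

# The correct rule: a power distributes over a product.
product_rule_holds = (a * b) ** c == a ** c * b ** c

# The extrapolated rule: the same pattern applied to a sum.
sum_rule_holds = (a + b) ** c == a ** c + b ** c

print((a * b) ** c, a ** c * b ** c, product_rule_holds)
print((a + b) ** c, a ** c + b ** c, sum_rule_holds)
```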
Sleeman further classified the students' errors into four categories:
Manipulative errors: a variant on a correct rule which has one substage either omitted or replaced by an inappropriate or incorrect operation.
X=6/4 => X=3/4 (dividing 4 by 2 is omitted)
5*X=12 => X= 2 2/12 (2/12 should be 2/5)
Parse errors: what happens when a student "mis-sees," or "mis-parses," an algebraic equation. These errors cannot be explained by omitting a component. Nor does it seem that they can be explained by performing a repair to a core procedure.
e.g. 6*X = 3*X-12; 9*X=12; X=12/9; X=4/3
e.g. 6*X = 3*X+12; X+X=12+3-6; 2*X=9; X=9/2
Clerical errors: just slips, as described above.
"Wild"/unexplained errors: mistakes not explained so far, which may be due to the consistent use of mal-rules that have not yet been identified.
4. Competition of Rules (Payne & Squibb)
Co-occurrence of a slip and a mistake: simultaneous representation of alternative rules (correct or incorrect) that apply in the same situations.
Some notion of rule strength is used to resolve conflicts.
Errors are represented by faulty rules.
An error arises only when weaker, faulty rules are preferred to correct, stronger rules.
Origins of Errors Explained?
Where do the incorrect versions of rules come from?
The mistake-generating mechanisms of misgeneralization and repair have difficulty predicting the development of novel, incorrect rules in problem solvers who already know the correct versions.
Mechanisms that explain the "unlearning":
Deletion (Young and O'Shea, 1981): elements of the condition and/or action specifications of a production rule are deleted. Deletion then becomes internalized as follows: (i) deletion does its work at run time, (ii) the student acquires the mal-rule by induction, i.e., the input/output pattern generated by the "deleted" rule is described and remembered as a mal-rule.
New mal-rules may arise when students attempt to make sense of currently purely syntactic rules, e.g. 3 1/2 + 1 = 4 1/2 (three and a half apples plus one...
5. Perception of problems and errors (Impasse or not)
Correctly perceived and solved
Correctly perceived but not solved
Incorrectly perceived and solved
Incorrectly perceived but not solved
Misperceiving During Learning and Misperceiving During Solving (without impasse)
Misperceived during learning -> mislearned -> mal-rule -> wrong answer
Misperceived during problem solving -> use correct rule or mal-rule -> usually wrong answer
Primary Mal-rules
Rules that explain mal-rules:
"log A" treated as "log times A"
Incorrect use of the distributive law, in addition to treating "log A" as "log times A": log (A × B) treated as log A × log B
Errors due to confusion caused by the logarithm axioms:
log A + log B treated as log A × log B
log A - log B treated as log A / log B
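The difference between the logarithm axioms and the confused versions is easy to see numerically. The values below are chosen arbitrarily so that the base-10 logarithms come out as whole numbers.

```python
import math

a, b = 100.0, 1000.0          # log10(a) = 2, log10(b) = 3

# Correct axioms.
log_of_product = math.log10(a * b)                  # equals log a + log b
log_of_quotient = math.log10(a / b)                 # equals log a - log b

# Values produced by the confused rules.
product_of_logs = math.log10(a) * math.log10(b)     # mal-rule for log a + log b
quotient_of_logs = math.log10(a) / math.log10(b)    # mal-rule for log a - log b

print(math.log10(a) + math.log10(b), log_of_product, product_of_logs)
print(math.log10(a) - math.log10(b), log_of_quotient, quotient_of_logs)
```

The sum of the logs agrees with the log of the product, and the difference with the log of the quotient, while the confused rules give different values.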
Causes of Confusion
Incomplete Learning

Frame pair 1
NAME: Input Pattern
NO. OF TERMS: 2
OPERATOR 1: minus
TERM 1: log of Expression 1
TERM 2: log of Expression 2
NAME: Output Pattern
NO. OF TERMS: 1
OPERATOR 1: (empty)
TERM 1: log of Expression 1 over Expression 2
TERM 2: (empty)

Frame pair 2
NAME: Input Pattern
NO. OF TERMS: 2
OPERATOR 1: minus
TERM 1: log of Expression 1
TERM 2: log of Expression 2
NAME: Output Pattern
NO. OF TERMS: (empty)
OPERATOR 1: division
TERM 1: (empty)
TERM 2: (empty)

Frame pair 3
NAME: Input Pattern
NO. OF TERMS: 2
OPERATOR 1: minus
TERM 1: log of Expression 1
TERM 2: log of Expression 2
NAME: Output Pattern
NO. OF TERMS: 2
OPERATOR 1: division
TERM 1: log of Expression 1
TERM 2: log of Expression 2

* Values of slots underlined are those inherited from the input pattern.
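The slot-inheritance idea can be sketched with dictionaries. This is an illustrative reading of the frames, assuming that an incompletely learned output pattern has empty slots and that each empty slot is filled with the value of the same slot in the input pattern; the key names and the function name are hypothetical.

```python
INPUT_PATTERN = {
    "no_of_terms": 2,
    "operator_1": "minus",
    "term_1": "log of Expression 1",
    "term_2": "log of Expression 2",
}

# Incompletely learned output pattern: only "division" was retained.
INCOMPLETE_OUTPUT = {
    "no_of_terms": None,
    "operator_1": "division",
    "term_1": None,
    "term_2": None,
}


def fill_missing_slots(output_pattern, input_pattern):
    """Fill every empty output slot with the value inherited from the input."""
    return {slot: input_pattern[slot] if value is None else value
            for slot, value in output_pattern.items()}


mal_rule_output = fill_missing_slots(INCOMPLETE_OUTPUT, INPUT_PATTERN)
print(mal_rule_output)
```

The filled pattern reads "log of Expression 1 divided by log of Expression 2", which is the mal-rule that treats log A - log B as log A / log B.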
Two Examples
(Two worked numerical examples; the layout did not survive extraction. Surviving values: 0.4771, 4.771, 0.301, 0.9542, 0.602.)
Conclusion
Origins of Errors
Impasse-Repair
Misgeneralization
Misperceiving
Incomplete Learning
Causes of errors explained?
Sequential Processing versus Parallel Processing
Evidence of parallel processing
Human processing is fast in some cases: pattern recognition, perception
A Perception Example
Some More Examples
Some More Examples
Why does this happen?
What can a Neural Network do?
Lin. http://home.ipoline.com/~timlin/neural/
What is a Neural Network?
Biological Neural Network
Artificial Neural Network
A Neural Network
Pattern Recognition
1. Classification: Given a pattern, find its class;
2. Determine a Pattern: Given a classification and part of a pattern, complete the pattern.
Incomplete pattern:
00100
01100
00100
00000
00100
00100
01110

Completed pattern:
00100
01100
00100
00100
00100
00100
01110

Class: 1
How an ANN Learns from Examples
An artificial neural network starts out blank. There are two basic phases:
Training
Computation (or Recognition)
Training
Data is imposed upon a neural network to force the network to remember the pattern of the training data.
The network remembers the training data pattern by adjusting its internal synaptic connections.
Recognition
part of the input data is not known. The neural network, based on its internal
synaptic connections, will determine the unknown part
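The two phases can be illustrated with a very small associative network. The sketch below uses a Hopfield-style network with Hebbian weights, which is a simpler stand-in and not the Boltzmann machine used by the Attrasoft program. The bitmap for "0" is invented for the example; the "1" bitmap and its damaged version follow the 5 by 7 patterns shown in these slides.

```python
ONE = ["00100", "01100", "00100", "00100", "00100", "00100", "01110"]
ZERO = ["01110", "10001", "10001", "10001", "10001", "10001", "01110"]  # hypothetical
DAMAGED_ONE = ["00100", "01100", "00100", "00000", "00100", "00100", "01110"]


def to_bipolar(rows):
    """Flatten a bitmap into a list of +1 (on) and -1 (off)."""
    return [1 if ch == "1" else -1 for row in rows for ch in row]


def train(patterns):
    """Training: set each connection from how often two units agree (Hebbian rule)."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for pattern in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += pattern[i] * pattern[j]
    return weights


def recall(weights, state, max_steps=5):
    """Recognition: update all units until the state stops changing."""
    for _ in range(max_steps):
        new_state = [1 if sum(w * s for w, s in zip(row, state)) >= 0 else -1
                     for row in weights]
        if new_state == state:
            break
        state = new_state
    return state


weights = train([to_bipolar(ONE), to_bipolar(ZERO)])
restored = recall(weights, to_bipolar(DAMAGED_ONE))
for r in range(7):
    print("".join("1" if v == 1 else "0" for v in restored[r * 5:(r + 1) * 5]))
```

Training only adjusts the connection weights; recognition then uses those weights to fill in the missing part of the damaged pattern.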
Training Phase
two data files are used: Training data file; and Retraining-data-file.
starts with feeding the neural network with the data from the training-data-file.
If initial training is not satisfactory, the network can be trained interactively over and over again by the data in retraining-data-files.
Recognition phase
two data files are used: Recognition data file; and Output data file.
The recognition data file contains the data for neural network computations.
The output data file contains the results of neural network computations.
A '5 by 7' Character Recognition Problem
P is a '5 by 7' character. P has 35 bits. For example, one of the many images of "1" is:
00100
01100
00100
00100
00100
00100
01110
Group C contains eleven neurons. Ten of them represent the ten digits: 0, 1, 2, ..., 9 and the eleventh one represents the "other than digits" class. Only one of them is clamped on and the other ten are clamped off. In particular,
class "0" is 10000 00000 0;
class "1" is 01000 00000 0;
...
class "9" is 00000 00001 0;
class "other than digits" is 00000 00000 1.
V = (C, P).
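A training vector V = (C, P) can be built as in the following sketch. The function names and the label "other" for the eleventh class are assumptions made for illustration; the actual file format expected by the Attrasoft program is not shown here.

```python
CLASSES = [str(digit) for digit in range(10)] + ["other"]   # eleven class neurons


def class_code(label):
    """One neuron clamped on for the given class, the other ten clamped off."""
    return "".join("1" if name == label else "0" for name in CLASSES)


def training_vector(label, rows):
    """Return V = (C, P): the 11-bit class code and the 35-bit character."""
    pattern = "".join(rows)
    if len(pattern) != 35:
        raise ValueError("a '5 by 7' character must have 35 bits")
    return class_code(label), pattern


ONE = ["00100", "01100", "00100", "00100", "00100", "00100", "01110"]
c, p = training_vector("1", ONE)
print(c, p)
```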
Example of training-vectors
Class 0
A Neural Network Example
1. Neural Network: download Attrasoft Boltzmann Machine for Windows 95. Test how characters (5*7) can be recognized.
Running ABM
Initializing ABM: click 'Example/5x7 Character'. You can find 3 files opened: Example1.trn; example1.rtn, example1.rec
Train the neural network: click Run/Train or the "T" button.
Click Run/1-neuron-1-class(One) or the "1" button.
Open the output data file, by clicking the "O" button or "Data/Output File", to check the results.
Check the results; if not satisfactory, then retrain, until all vectors are recognized.
Errors as explained by Neural Network
Challenging Task
Think of an example in learning or recognizing that can be explained by Parallel Distributed Processing