Cognitive Science and Its Educational Application

Cognitive Science

"the study of intelligence and intelligent systems, with particular reference to intelligent behaviour as computation" (Simon & Kaplan, 1989)

Simon, H. A. & Kaplan, C. A. (1989). "Foundations of cognitive science", in Posner, M. I. (ed.), Foundations of Cognitive Science. Cambridge, MA: MIT Press.

Cognitive Science

The interdisciplinary study of the acquisition and use of knowledge. Its contributing disciplines include artificial intelligence, psychology, linguistics, philosophy, anthropology, neuroscience, and education.

Cognitive science grew out of three developments: the invention of computers and the attempts to design programs that could do the kinds of tasks that humans do; the development of information processing psychology, where the goal was to specify the internal processing involved in perception, language, memory, and thought; and the development of the theory of generative grammar and related offshoots in linguistics.

Cognitive science was a synthesis concerned with the kinds of knowledge that underlie human cognition, the details of human cognitive processing, and the computational modeling of those processes.

Eysenck, M.W. ed. (1990). The Blackwell Dictionary of Cognitive Psychology. Cambridge, Massachusetts: Basil Blackwell Ltd.

Cognitive psychology

is concerned with information processing, and includes a variety of processes such as attention, perception, learning, and memory.

It is also concerned with the structures and representations involved in cognition.

The greatest difference between the approach adopted by cognitive psychologists and by the Behaviorists is that cognitive psychologists are interested in identifying in detail what happens between stimulus and response.

Types of Information Processing

Sequential Processing

Parallel Distributed Processing

Sequential Processing

What is

Cognitive View of Learning

Learning is the result of our attempts to make sense of the world

[Diagram: knowledge mediating between stimulus and response]

What does “make sense” mean?

Information Processing Model

[Figure: information processing model. Components: environment, receptors, sensory register, short-term memory, long-term memory, response generators, effectors, executive control, expectancies.]

(From R.M. Gagne, 1974)

Sensory Register

Receptor memory

Information leaves in 1 to 3 seconds

Perceived (organized) information goes to working memory

Attention

Storage structure: Sensory "store"
Code: sensory features
Capacity: 12-20 items to huge
Duration: 250 msec. - 4 sec.
Retrieval: complete
Causes of failure to recall: masking or decay

Storage structure: Short-term memory
Code: acoustic, visual, semantic; sensory features identified and named
Capacity: 7 ± 2 items
Duration: about 12 sec.; longer with rehearsal
Retrieval: complete, with each item being retrieved every 35 msec.
Causes of failure to recall: displacement, interference, decay

Storage structure: Long-term memory
Code: semantic, visual knowledge, abstractions, meaning, images
Capacity: enormous, virtually unlimited
Duration: indefinite
Retrieval: specific and general information available given proper cueing
Causes of failure to recall: interference, organic dysfunctioning, inappropriate cues

TYPES OF KNOWLEDGE

KNOWLEDGE

CONCEPTUAL KNOWLEDGE

PROCEDURAL KNOWLEDGE

Example of a Semantic Network

[Figure: semantic network. Nodes: Table, chair, Furniture, brown, 4 legs, Tom, terrier, dog. Link types: isa, colour, shape.]

Pattern-recognition

[Figure: the sum 3 + 5 written out]

This means I should add up 3 and 5

Action-sequence

If I should add up 3 and 5, then let me think... Ah, the answer is 8.

Pattern-recognition:

If <the pattern of adding up numbers is seen> Then <carry out the operation to add the numbers>

Action-sequence:

If <two numbers are to be added> Then <do the addition>

Composition of Productions

Production 1: If p then q
Production 2: If q then r
Production 3: If r then s

Composition

Rehearsed several times
Faster
Not easily separated

Composite Production: If p then s
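The composition step can be sketched in a few lines of Python. This is only an illustration under simple assumptions: a production is reduced to a condition/action pair of strings, and the names `Production` and `compose` are invented here, not taken from the slides.

```python
from typing import NamedTuple


class Production(NamedTuple):
    condition: str
    action: str


def compose(chain: list[Production]) -> Production:
    """Collapse a chain of productions into one composite production."""
    for first, second in zip(chain, chain[1:]):
        # Each action must be exactly what triggers the next production.
        if first.action != second.condition:
            raise ValueError(f"{first} does not feed {second}")
    return Production(chain[0].condition, chain[-1].action)


chain = [Production("p", "q"), Production("q", "r"), Production("r", "s")]
composite = compose(chain)
print(f"If {composite.condition} then {composite.action}")  # prints "If p then s"
```

The intermediate results q and r disappear from the composite, which is one way to picture why a composed production runs faster and is hard to separate into its steps again.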

Learning conceptual knowledge

Knowledge is stored in long-term memory.

A piece of knowledge is learned if it is linked to other pieces. The more it is rehearsed (linked to others), the stronger the links become, and the more likely it is to be recalled.

Learning Procedural Knowledge

Initially stored as declarative knowledge

After compilation and composition, it becomes a piece of procedural knowledge

Stored in long-term memory with the related conceptual knowledge

Retrieval of Conceptual Knowledge

Activation of a node or several nodes

Spread of activation through links among nodes until the required piece is activated

Nodes connected to the activated node with stronger links will have more chances to be recalled.

Forgetting is the effect of interference.
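A minimal sketch of spreading activation, assuming a small network whose node names echo the semantic network example. The link strengths, the `threshold` parameter and the function name `spread` are invented for illustration.

```python
from collections import deque

# Hypothetical link strengths between 0 and 1 (not from the slides).
LINKS = {
    "Tom": {"terrier": 0.9, "brown": 0.4},
    "terrier": {"dog": 0.8},
    "dog": {"4 legs": 0.6},
    "brown": {},
    "4 legs": {},
}


def spread(start, threshold=0.1):
    """Spread activation outward from `start`, weakening it along each link."""
    activation = {start: 1.0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour, strength in LINKS.get(node, {}).items():
            incoming = activation[node] * strength
            # Keep only activation that is strong enough and improves on
            # what the neighbour already received.
            if incoming > threshold and incoming > activation.get(neighbour, 0.0):
                activation[neighbour] = incoming
                queue.append(neighbour)
    return activation


activation = spread("Tom")
for node, level in sorted(activation.items(), key=lambda item: -item[1]):
    print(f"{node}: {level:.2f}")
```

Nodes reached through stronger links end up with higher activation, which stands in for "more chances to be recalled".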

Retrieval of Procedural Knowledge

Procedures whose condition part matches the situation will be fired.

Several procedures may have the same condition; those with higher strength will have more chances to be fired.

Strength of a procedure depends on …..
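One possible reading of "more chances to be fired" is selection with probability proportional to strength. The sketch below assumes that reading; the rule names, conditions and strength values are hypothetical.

```python
import random


def choose_rule(rules, situation, rng):
    """Pick one matching rule with probability proportional to its strength."""
    matching = [rule for rule in rules if rule["condition"] == situation]
    if not matching:
        return None
    weights = [rule["strength"] for rule in matching]
    return rng.choices(matching, weights=weights, k=1)[0]


# Hypothetical rules: two compete for the same condition.
rules = [
    {"name": "correct-borrow", "condition": "top digit smaller", "strength": 8.0},
    {"name": "faulty-rule", "condition": "top digit smaller", "strength": 2.0},
    {"name": "plain-subtract", "condition": "top digit larger", "strength": 5.0},
]

rng = random.Random(0)  # fixed seed so the run is repeatable
counts = {}
for _ in range(1000):
    name = choose_rule(rules, "top digit smaller", rng)["name"]
    counts[name] = counts.get(name, 0) + 1
print(counts)
```

The stronger rule fires most of the time, but the weaker rule still fires occasionally, so an error can appear even when the correct rule is known.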

Learning of Procedural Knowledge

An Example

Pattern of Maturation: a possible route when a student learns a rule

CORRECT

Sleeman (1985)

UNPREDICTABLE

CONSISTENT USE of MAL-RULES (incorrect rules)

MODELS AND THEORIES OF PROCEDURAL ERRORS

WHY STUDY ERRORS:

Instruction requires diagnosing well; diagnosing well requires knowing:

What are the errors?

How are errors formed?

Types of Procedural Errors

Slips
Careless work
Intend to perform the appropriate action but fail to do so

Systematic Errors
Due to mistaken or missing knowledge

Possible Reasons for slips

Loss of information from working memory

Deployment of attention or cognitive control

Competition among the activation levels and triggering conditions of coexisting demons or schemata

Schema

a way of capturing the insight that concepts are defined by a configuration of features, and each of these features involves specifying a value the object has on some attribute.

The schema represents a concept by pairing a class of attribute with a particular value, and stringing all the attributes together.

They are a way of encoding regularities in categories, whether these regularities are propositional or perceptual.

They are also general, rather than specific, so that they can be used in many situations.

Example:

References: Anderson, J.R. (1990). Cognitive psychology and its implications. New York, NY: Freeman.

Possible Reasons for Systematic Errors

Incomplete or misguided learning

WHY STUDY SYSTEMATIC ERRORS:

A deeper insight into the learning process may be reached.

To diagnose students' work and to help them avoid making the same error again.

A computer diagnostic consultant may be developed based on the findings, so that individualized tutoring may become possible.

EXPLANATIONS ON SYSTEMATIC ERRORS

Incorrect or faulty algorithms with the same inputs as the correct algorithms.
Bugs.
Mal-rules.
Repair theory (Brown & VanLehn, 1980).
Deletion theory (Young & O'Shea, 1981).
Misgeneralization theory (Matz, 1982; Sleeman, 1984).
Control-slip explanation of errors.

Bugs

Systematic errors: bugs in the correct procedure

A slight modification or perturbation of a correct procedure (VanLehn, 1984)

Describes which problems the student gets wrong, what each wrong answer is, and the steps followed by the student in producing it (VanLehn, 1984)

306 - 138 = 78
80 - 4 = 76
183 - 95 = 88
702 - 11 = 591
3005 - 28 = 1087
7002 - 239 = 4873

Borrowing Across Zero Bug

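[Figure: flowchart of the subtraction procedure. Node labels, translated from Chinese: subtraction; set up; write out; convert; bottom number larger than top number?; compare; compare individual digits and signs; new problem; reverse; change; columns; complete column; borrow?; top number; bottom number; borrow; columns complete; take top digit; take bottom digit; subtraction table; compare digits; add ten; borrow ten; find another column; zero; borrow one; subtract one; get nine; mark; write down.]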

Procedure used in Subtraction

Use of Bug Theory

A computer programme called Debuggy was designed to mimic errors made by the students (100% correct)

Mal-rules

Incorrect rules made by students in solving a task.

Algebra Task Types

1. Mx = P
2. Mx = N + P
3. Mx = NP
4. Mx + Nx = P
5. Mx + N = P
6. M + Nx = P
7. Mx = Nx + P
8. Mx = N (P*Q)
9. Mx = N (Px + Q)
10. Mx = N + P*Q
11. M + Nx + Px = Q
12. Mx = N + P (Qx+R)
13. Mx = N (x + P) = Q
14. Mx + N = Px + Q

S11 M+N*X= -> M*N+X=
S12 M+N*X= -> M+N+X=
S13 M*X+N= -> M+X+N=
S14 M*X+N= -> M*X*N=
S15 M*X+N= -> (M+N)*X=
S16 M*X=N*P -> X=M
S17 M*X=N*X+P -> X+X=M+N+P
S18 M*X=N+P -> M*X=N
S19 M*X=N+P -> M*X=P
S20 M*X=N -> X=N
S21 M*X=N -> X=<N/F>/M
S22 M*X=N -> X=N/<M/F>
S23 M*(N*X+P) -> M*X+M*P
S24 2X/2X -> 0
S25 A*1/A -> 0
S26 0*A -> A

Table 2: The algebra mal-rules listed by Sleeman (1984), supplemented by three from Matz (1982)

How Is the Formation of Errors Explained?

Attempts to explain why errors are formed

1. Repair Theory: Causes

It is uninteresting to assert merely that an error can be "explained" by a mal-rule.

Questions such as how the bugs are caused, and why some bugs are found but other possible ones are not, remain unanswered.

The aim is to develop a theory which can be used to predict what bugs will exist for procedural skills that have not yet been analyzed.

Repair Theory

Impasse

Do not quit; find ways to repair

Repaired, remembered

Repair Theory (Brown and VanLehn, 1980)

The student gets stuck (an impasse) when executing a possibly incomplete procedure.

The student does not quit, but does a small amount of problem solving, just enough to get unstuck and complete the problem.

The local problem-solving strategy (repair) rarely succeeds in rectifying the broken procedure, and so causes errors.

An example

Suppose a student has never borrowed from zero. The first time he is asked to solve a borrow-from-zero problem, such as (a),

(a) 305 - 48

he will probably proceed as follows:

process the units column by attempting to borrow from the tens column;
an impasse is reached because zero cannot be decremented;
repair: skip the decrement operation;
the answer is as in (b), a bug called Stops-Borrow-At-Zero.

(b) 305 - 48 = 267

If the repair instead relocates the decrement operation to a nearby digit that is not zero, the result is as in (c), a bug called Borrow-Across-Zero.

(c) 305 - 48 = 167
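The two repairs can be simulated with a short column-by-column subtraction. This is a sketch under stated assumptions, not Brown and VanLehn's model: the function names and the `repair` labels ("none", "skip", "relocate") are invented here, and the top number is assumed to be at least as large as the bottom one.

```python
def borrow(top_digits, col, repair):
    """Decrement the column to the left; `repair` decides what happens at a zero."""
    if col >= len(top_digits):
        return  # nothing left to borrow from
    if top_digits[col] != 0:
        top_digits[col] -= 1
    elif repair == "none":
        # Correct procedure: borrow further left, turning this zero into 9.
        borrow(top_digits, col + 1, repair)
        top_digits[col] = 9
    elif repair == "skip":
        # Repair 1: skip the decrement (Stops-Borrow-At-Zero).
        return
    elif repair == "relocate":
        # Repair 2: decrement the nearest non-zero digit instead
        # (Borrow-Across-Zero); the zeros themselves are left unchanged.
        while col < len(top_digits) and top_digits[col] == 0:
            col += 1
        if col < len(top_digits):
            top_digits[col] -= 1
    else:
        raise ValueError(f"unknown repair: {repair}")


def subtract(top, bottom, repair="none"):
    """Subtract column by column, units first, using the chosen repair."""
    t = [int(d) for d in str(top)][::-1]
    b = [int(d) for d in str(bottom)][::-1]
    b += [0] * (len(t) - len(b))
    result = []
    for i in range(len(t)):
        if t[i] < b[i]:
            borrow(t, i + 1, repair)
            t[i] += 10
        result.append(t[i] - b[i])
    return int("".join(str(d) for d in reversed(result)))


print(subtract(305, 48))              # 257, the correct answer
print(subtract(305, 48, "skip"))      # 267, as in (b)
print(subtract(305, 48, "relocate"))  # 167, as in (c)
```

With the "relocate" repair the same function also gives 78 for 306 - 138, 1087 for 3005 - 28 and 4873 for 7002 - 239.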

2. Deletion Theory (Young & O'Shea, 1981)

BUGGY produces models that behave functionally like the students, but these models are not very convincing as psychological models.

Many of the bugs appear to be very similar (many are connected with borrowing from zero).

Some of the BUGGY data can be analyzed more simply in terms of certain competences being omitted from the ideal model.

Young and O'Shea classified students' errors into 3 categories:

Algorithm errors (124 = 36%)
Borrow when < e.g. 96-42=44
Take smaller e.g. 64-34=21
Always borrow e.g. 96-42=44 and 92-46=46
S>M -> zero e.g. 72-57=20
One -> 10 e.g. 71-52=18
Add Column 2 e.g. 21-19=22 (add the digits in the second column)

Pattern errors (54 = 16%)
0-N=N
0-N=0
N-N=N

Number-fact errors (127 = 37%)

Impossible to analyze (39 = 11%)

Production system to account for the subtraction errors.

A production system (PS) is a collection of rules of the form:

C => A

When the condition C is satisfied, the action A is applied.

The architecture has three components:

A working memory (WM): a set of elements such as (S EQ M) or (RESULT 5).
A production memory, which holds a collection of production rules.
A conflict resolution method to determine which rule is to be fired when more than one is applicable.

Production System for subtraction by decomposition:

FD: M=m, S=s => FindDiff, NextColumn
B2A: S>M => Borrow
BS1: Borrow => *AddTenToM
BS2: Borrow => *Decrement
CM: M=m, S=s => *Compare
IN: ProcessColumn => *ReadMandS
TS: FindDiff => *TakeAbsDiff
NXT: NextColumn => *ShiftLeft, ProcessColumn
WA: Result=x => *Write=x
DONE: NoMore => *HALT
B2C: S=M => Result 0, NextColumn
AC: Result 1=x => *Carry, Result=x
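The idea of deleting a rule can be illustrated with a much-simplified sketch. It is not a production system interpreter: the control flow is hard-coded in Python, only the rule names B2A, BS1 and BS2 are borrowed from the list above, and the helper name `run_subtraction` is hypothetical. Taking the absolute difference in each column mirrors the *TakeAbsDiff action of rule TS.

```python
ALL_RULES = frozenset({"B2A", "BS1", "BS2"})


def run_subtraction(top, bottom, rules=ALL_RULES):
    """Column subtraction in which each borrowing rule can be deleted."""
    m = [int(d) for d in str(top)][::-1]     # minuend digits M, units first
    s = [int(d) for d in str(bottom)][::-1]  # subtrahend digits S
    s += [0] * (len(m) - len(s))
    answer = []
    for col in range(len(m)):
        # B2A: S > M => Borrow
        borrowing = "B2A" in rules and s[col] > m[col]
        # BS1: Borrow => add ten to M
        if borrowing and "BS1" in rules:
            m[col] += 10
        # BS2: Borrow => decrement the next column (it may dip to -1,
        # in which case that column borrows in turn on the next pass)
        if borrowing and "BS2" in rules and col + 1 < len(m):
            m[col + 1] -= 1
        # TS: take the absolute difference of M and S
        answer.append(abs(m[col] - s[col]))
    return int("".join(map(str, reversed(answer))))


print(run_subtraction(64, 38))                       # 26, all rules present
print(run_subtraction(64, 38, ALL_RULES - {"B2A"}))  # 34, never borrows
print(run_subtraction(64, 38, ALL_RULES - {"BS2"}))  # 36, borrows without decrementing
```

In this sketch a single missing rule is enough to produce a consistent wrong answer, which is the point of analysing errors as omitted competences.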

3. Misgeneralization

(Overgeneralization)

Overgeneralising from instances, using an "old" operator instead of a more recently introduced one, and regressing under cognitive load (Davis, Jockusch, & McKnight, 1978), e.g. "+" instead of "*", "*" instead of exponentiation.

Matz (1982) suggested a number of high-level schemata which explain a series of observed errors. Extrapolation principle: (A*B)^C => A^C * B^C leads students to write (A+B)^C => A^C + B^C.
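A quick numeric check shows why the extrapolation fails. It assumes the C in the example above is an exponent, and the values of a, b and c are arbitrary.

```python
a, b, c = 2, 3, 2

# The correct rule for products: (A*B)^C equals A^C * B^C.
product_lhs = (a * b) ** c   # 36
product_rhs = a**c * b**c    # 36

# The extrapolated rule for sums: (A+B)^C is not A^C + B^C.
sum_lhs = (a + b) ** c       # 25
sum_rhs = a**c + b**c        # 13

print(product_lhs == product_rhs)  # True
print(sum_lhs == sum_rhs)          # False
```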

Sleeman further classified the students' errors into four categories:

Manipulative errors: a variant of a correct rule which has one substage either omitted or replaced by an inappropriate or incorrect operation.

X=6/4 => X=3/4 (dividing 4 by 2 is omitted)

5*X=12 => X= 2 2/12 (2/12 should be 2/5)

Parse errors: what happens when a student "mis-sees" or "mis-parses" an algebraic equation. These errors cannot be explained by omitting a component. Nor does it seem that they can be explained by performing a repair to a core procedure.

e.g.
6*X = 3*X-12 => 9*X = 12 => X = 12/9 => X = 4/3
6*X = 3*X+12 => X+X = 12+3-6 => 2*X = 9 => X = 9/2

Clerical errors: just the slips described above.

"Wild" or unexplained errors: mistakes not explained so far, which may be due to the consistent use of mal-rules not yet identified.

4. Competition of Rules (Payne & Squibb)

Co-occurrence of a slip and a mistake: simultaneous representation of alternative rules (correct or incorrect) that apply in the same situations

Some notion of rule strength is used to resolve conflicts

Errors are represented by faulty rules

An error arises only when weaker, faulty rules are preferred to correct, stronger rules

Origins of Errors Explained?

Where do the incorrect versions of rules come from?

The mistake-generating mechanisms of misgeneralization and repair have difficulty predicting the development of novel, incorrect rules in problem solvers who already know the correct versions.

Mechanisms that explain the "unlearning":

Deletion (Young and O'Shea, 1981): elements of the condition and/or action specifications of a production rule are deleted. The deletion then becomes internalized as follows: (i) deletion does its work at run time; (ii) the student acquires the mal-rule by induction, i.e., the input/output pattern generated by the "deleted" rule is described and remembered as a mal-rule.

New mal-rules may arise when students attempt to make sense of currently purely syntactic rules, e.g. 3 1/2 + 1 = 4 1/2 (three and a half apples plus one...

5. Perception of problems and errors(Impasse or not)

Correctly perceived and solved
Correctly perceived but not solved
Incorrectly perceived and solved
Incorrectly perceived but not solved

Misperceiving During Learning and Misperceiving During Solving (without impasse)

Misperceived during learning -> mislearned -> mal-rule -> wrong answer

Misperceived during problem solving -> use of correct rule or mal-rule -> usually wrong answer

Primary Mal-rules

Rules that explain mal-rules:

"log A" treated as "log times A"

Incorrect use of the distributive law in addition to treating "log A" as "log times A": log (A × B) as log A × log B

Errors due to confusion caused by the logarithm axioms: log A + log B as log A × log B; log A - log B as log A / log B
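The difference between the logarithm axioms and the mal-rules can be checked numerically. The sketch uses base-10 logarithms and arbitrary values for A and B; the helper name `close` is invented here.

```python
import math


def close(x, y):
    """Compare floats with a small tolerance."""
    return math.isclose(x, y, rel_tol=1e-9)


a, b = 100.0, 1000.0
log = math.log10

# Correct axioms.
sum_axiom = close(log(a) + log(b), log(a * b))         # log A + log B = log (A × B)
difference_axiom = close(log(a) - log(b), log(a / b))  # log A - log B = log (A / B)

# Mal-rules from the slide.
sum_mal_rule = close(log(a) + log(b), log(a) * log(b))         # 5 versus 6
difference_mal_rule = close(log(a) - log(b), log(a) / log(b))  # -1 versus 2/3

print(sum_axiom, difference_axiom)        # True True
print(sum_mal_rule, difference_mal_rule)  # False False
```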

Causes of Confusion

Incomplete Learning

Input Pattern (the same in all three frames)
NO. OF TERMS: 2
OPERATOR 1: minus
TERM 1: log of Expression 1
TERM 2: log of Expression 2

Output Pattern (first frame)
NO. OF TERMS: 1
OPERATOR 1: (empty)
TERM 1: log of Expression 1 over Expression 2
TERM 2: (empty)

Output Pattern (second frame)
NO. OF TERMS: (empty)
OPERATOR 1: division
TERM 1: (empty)
TERM 2: (empty)

Output Pattern (third frame)
NO. OF TERMS: 2
OPERATOR 1: division
TERM 1: log of Expression 1
TERM 2: log of Expression 2

* Values of slots underlined are those inherited from the input pattern.

Two Examples

[Two worked examples of logarithm errors; the numeric layout did not survive extraction.]

Conclusion

Origins of Errors

Impasse-Repair
Misgeneralization
Misperceiving
Incomplete Learning

Causes of errors explained?

Sequential Processing versus Parallel Processing

Evidence of parallel processing: human processing is fast in some cases:
Pattern recognition
Perception

A Perception Example

Some More Examples

Why does this happen?

What Can a Neural Network Do?

Lin. http://home.ipoline.com/~timlin/neural/

What is a Neural Network?

Biological Neural Network

Artificial Neural Network

A Neural Network

Pattern Recognition

1. Classification: Given a pattern, find its class;

2. Determine a Pattern: Given a classification and part of a pattern, complete the pattern.

00100
01100
00100
00000
00100
00100
01110

00100
01100
00100
00100
00100
00100
01110

1

How an ANN Learns from Examples

Initially the artificial neural network is blank. There are two basic phases:

Training
Computation (or Recognition)

Training

Data is imposed upon a neural network to force the network to remember the pattern of the training data.

The network remembers the training data pattern by adjusting its internal synaptic connections.

Recognition

Part of the input data is not known. The neural network, based on its internal synaptic connections, will determine the unknown part.
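The two phases can be sketched with a small Hopfield-style network trained by a Hebbian rule. This is a simpler relative of the Boltzmann machine, swapped in here to keep the example short; the "0" bitmap, the function names and the choice of update scheme are assumptions made for illustration.

```python
ONE = ["00100", "01100", "00100", "00100", "00100", "00100", "01110"]
ZERO = ["01110", "10001", "10001", "10001", "10001", "10001", "01110"]  # hypothetical bitmap
# A "1" whose fourth row is unknown and left blank.
PARTIAL_ONE = ["00100", "01100", "00100", "00000", "00100", "00100", "01110"]


def to_bipolar(rows):
    """Flatten rows of '0'/'1' characters into a list of -1/+1 values."""
    return [1 if ch == "1" else -1 for row in rows for ch in row]


def train(patterns):
    """Training phase: strengthen links between units that agree in a
    stored pattern and weaken links between units that differ."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for pattern in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += pattern[i] * pattern[j]
    return weights


def recall(weights, state, sweeps=5):
    """Recognition phase: update every unit from its weighted input
    until the state stops changing."""
    state = list(state)
    for _ in range(sweeps):
        new_state = [
            1 if sum(w * s for w, s in zip(row, state)) >= 0 else -1
            for row in weights
        ]
        if new_state == state:
            break
        state = new_state
    return state


weights = train([to_bipolar(ONE), to_bipolar(ZERO)])
completed = recall(weights, to_bipolar(PARTIAL_ONE))
print(completed == to_bipolar(ONE))  # True: the missing row is filled in
```

Nothing in the network stores the bitmap as such; the pattern is held only in the connection strengths, and recognition restores the missing part from them.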

Training Phase

two data files are used: Training data file; and Retraining-data-file.

starts with feeding the neural network with the data from the training-data-file.

If initial training is not satisfactory, the network can be trained interactively over and over again by the data in retraining-data-files.

Recognition phase

two data files are used: Recognition data file; and Output data file.

The recognition data file contains the data for neural network computations.

The output data file contains the results of neural network computations.

A '5 by 7' Character Recognition Problem

P is a '5 by 7' character. P has 35 bits. For example, one of the many images of "1" is:

00100
01100
00100
00100
00100
00100
01110

Group C contains eleven neurons. Ten of them represent the ten digits: 0, 1, 2, ..., 9 and the eleventh one represents the "other than digits" class. Only one of them is clamped on and the other ten are clamped off. In particular,

class "0" is 10000 00000 0;
class "1" is 01000 00000 0;
...
class "9" is 00000 00001 0;
class "other than digits" is 00000 00000 1.

V = (C, P).
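A training vector of this kind could be assembled as follows. This is a sketch that assumes V is simply the 11 class bits followed by the 35 pattern bits; the function names are hypothetical and the actual file format used by the software is not shown in the slides.

```python
CLASSES = [str(digit) for digit in range(10)] + ["other"]


def encode_class(label):
    """Return the 11 class bits C with exactly one neuron clamped on."""
    bits = ["0"] * len(CLASSES)
    bits[CLASSES.index(label)] = "1"
    return "".join(bits)


def training_vector(label, rows):
    """Build V = (C, P): class bits followed by the 5 by 7 pattern bits."""
    pattern = "".join(rows)
    if len(pattern) != 35:
        raise ValueError("expected a 5 by 7 character (35 bits)")
    return encode_class(label) + pattern


ONE = ["00100", "01100", "00100", "00100", "00100", "00100", "01110"]
vector = training_vector("1", ONE)
print(vector[:11], vector[11:])
```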

Example of training-vectors

Class 0

A Neural Network Example

1. Neural Network: download Attrasoft Boltzmann Machine for Windows 95. Test how characters (5*7) can be recognized.

Running ABM

Initializing ABM: click 'Example/5x7 Character'. You can find 3 files opened: Example1.trn, example1.rtn, example1.rec.

Train the neural network: click Run/Train or the "T" button.

Click Run/1-neuron-1-class(One) or the "1" button.

Open the output data file, by clicking the "O" button or "Data/Output File", to check the results.

Check the results; if not satisfactory, retrain until all vectors are recognized.

Errors as explained by Neural Network


Challenging Task

Think of an example in learning or recognizing that can be explained by Parallel Distributed Processing
