9.63 laboratory in cognitive...

9.63 Laboratory in Cognitive Science

Fall 2005�Course 2a- Signal �Detection Theory�

Aude Oliva

Hypothetical Experiment • : How LSD drug

affects rat’s runningspeed ?

• : 40 rats have been trained to run a straight-alley mazefor a food reward. Theyare randomly assigned to

group: LSD injection, and

Question

Method & Subjects

two groups (Experimental

Control group: injection of a placebo).

1

Ben Balas, Charles Kemp

Hypothetical Experiment: Results

• running times is

• Central tendency (mean)

• Dispersion: how the score are spread out about the center (standard deviation)

i

)

The distribution of

characterized by two measurements:

Histograms representing scores for 20 participants in the control (top) and exper mental conditions of the hypothetical LSD experiment

Running time (seconds

Frequency distributions

Normal Distribution (curve) Mean

Normal distribution

2�

Normal Distribution

• ii l i

• i lecti i i l i• • ithin 2 stdev • i

• i

Each side of the normal curve has a point where the curve slightly reverses ts direction: this s the inf ection po nt The nf on po nt s a ways one standard-deviat on from the mean About 68 % of all scores are contained within one standard deviation of the mean 96 % of the scores are contained w99.74 % of the scores are conta ned within 3 stdev

This property of normal curves is extremely useful because if we know an individual’s score and the mean and standard deviation n the distribution of scores, we also know the person relative rank.

IQ Population Distribution

100857055 115 130 145 IQ

14%

10%

6%

2%

Mean

Normal distribution

XX

% o

f pe

ople

with

this

sco

re

Most IQ tests are devised so that the population mean is 100 and the stdev is 15. If a person has an IQ of 115, she has scored higher than % of all people.

3�

Figure by MIT OCW.

0.13% 2.14%

13.59% 34.13% 34.13% 13.59%

-4 -3 -2

-2.67

-1 0 +1 +2 +3 +4

+2.67Standard (z) Score Units

Inflection Point

Proportions of scores in specific areas under the normal curve. The inflection points are one standarddeviation from the mean.

Inflection Point

2.14% 0.13%

IQ Population Distribution

100857055 115 130 145

IQ

14%

10%

6%

2%% o

f pe

ople

with

this

sco

re

Mentally Retarded Mentally Gifted

• imeans and variances

•

• X ? • Y ? •

It is common to compare scores across normal distributions w th different in terms of standard scores or Z-scores

The z-score is the difference between an individual score and the mean expressed in units of standard deviations. So, an IQ of 115 transfers in a z-score of An IQ of 78 translates to a z-score of Grades in courses should be calculated in terms of z-scores, why ?

4�

Normal Curve and Z scores

Figure by MIT OCW.

0.13% 2.14%

13.59% 34.13% 34.13% 13.59%

-4 -3 -2

-2.67

-1 0 +1 +2 +3 +4

+2.67Standard (z) Score Units

Inflection Point

Proportions of scores in specific areas under the normal curve. The inflection points are one standarddeviation from the mean.

Inflection Point

2.14% 0.13%

5

Detection Task – Reaction Time distribution

Histogram distribution of all reaction times for 6 participants

(4200 samples total) for target present case

• Mean = 471 msec • Stdev = 141 msec • 3 stdev from the mean = 874

msec

Detection Task: Reaction Time Distribution

Masking effects on behavioral reaction time. Reaction time distribution of correct go responses as a function of the SOA (10 msec bins) averaged in 16 subjects.

Figure removed due to copyright reasons.

Figures removed due to copyright reasons. Please see: Bacon-Macé, Nadège, et al. Figures 1 and 3A in "The time course of visual processing: Backward masking and natural scene categorisation." Vision Research 45 (2005): 1459-1469.

874 msec

Visual Search: Reaction Time Distribution

Figure removed due to copyright reasons. Please see: Wolfe, Jeremy M., and Aude Oliva, et al. Figures 13 and 16 in "Segmentation of objects from backgrounds in visual search tasks." Vision Research 42 (2002): 2985-3004.�

Signal Detection Theory

•�Starting point of signal detection theory: all reasoning and decision making takes place in the presence of uncertainty

•�Your decision depends on the signal but also your response bias and internal criterion

6

A tumor scenario •�You are a radiologist examining a CT scan, looking

for a tumor. •�The task is hard, so there is some uncertainty:

either there is a tumor (signal present), or there is not (signal absent). Either you see a tumor(response “yes”), either you do not see a tumor (response “no”).

•�There are 4 possible outcomes:

Signal Present Signal Absent

Say “Yes" Hit False Alarm

Say “No" Miss Correct Rejection

Adapted from David Heeger document

Decision making process

•�Two main components: •�(1) The signal or information. You look at

the information in the CT scan. A tumor might be brighter or darker, have a differenttexture, etc. With expertise and additionalinformation (other scans), the likelihood ofgetting a HIT or CORRECT REJECTIONincrease.


7

________________

Decision making process •� Two main components: • (2) Criterion: the second component of the decision process is

very different: it refers to your own judgment or “internal criterion”. For instance, for two doctors:

•� Criterion “life and death” (and money): Increase in False Alarm = decision towards “yes” (tumor present) decision. A false alarm will result in a routine biopsy operation.

•� This doctor has a bias toward “yes”: liberal response �strategy.�

•� Criterion “unnecessary surgery”: surgeries are very bad (expensive, stress). They will miss more tumors and save money to the social system. They will feel that a tumor if there is really one will be picked-up at the next check-up.

•� This doctor has a bias towards “no”: a conservative �response strategy.�


Internal Response and Internal noise

Content removed due to copyright reasons.

Refer to:Heeger, David. Signal Detection Theory. Department of Psychology, New York University, 2003.

8

http://www.cns.nyu.edu/~david/sdt/sdt.html

________________

________________

Probability of Occurrence Curves


Refer to: Heeger, David. Signal Detection Theory. Department of Psychology, New York University, 2003.

Probability of Occurrence Curves


Refer to: Heeger, David. Signal Detection Theory. Department of Psychology, New York University, 2003.

9



10

Hypothetical internal response curves• SDT assumes that your internal

response will vary randomly over trials around an average value, producing a normal curve distribution of internal responses.

• The decision process compare the strength of the internal (sensory) response to an internally set criterion: whenever the internal response is greater than this criterion, response “yes”. Whenever the internal response is less than the criterion, response “no”.

• The decision process is influenced by knowledge of the probability of signal events (cf. Wolfe et al., Nature, 2005) and payoff factors.

• Criterion line divides the graph into 4 sections (hits, misses, false alarm, correct rejections).

• On both HIT and FA, the internal criterion is greater than the criterion.


Effects of Criterion

• If you choose a “low” criterion, you respond “yes” to almost everything (never miss a tumor and have a very high HIT rate, but a lot of unnecessary surgeries).

• If you choose a high criterion, you respond “no” to almost everything.

from David Heeger document

Figure removed due to copyright reasons.Refer to:

Figure removed due to copyright reasons.Refer to:

Heeger, David. Figure 2 in Signal Detection Theory.Department of Psychology, New York University, 2003.

Heeger, David. Figure 2 in Signal Detection Theory.Department of Psychology, New York University, 2003.

______________

______________



SDT and d-prime •� The underlying model of SDT consists of two normal distributions •� one representing a signal (target present) and another representing

"noise." (target absent) •� The willingness of the person to say 'Signal Present' in response to

an ambiguous stimulus is represented by the criterion. •� How well a person can discriminate between Signal Present and

Signal Absent trials is represented by the difference between the means of the two distributions, d'.

Criterion

http://wise.cgu.edu/sdt/models_sdt1.html

Data from an experiment with target present/absent:

Where do I start ?

•�Everything is in the HIT and FA rates. For instance:

HIT = 0.84 FA = 0.16 •�Response bias c = -0.5[z(H)+z(F)] • Sensitivity (discrimination) d’= |z(H)-z(F)|

11

________________________

http://wise.cgu.edu/sdt/models_sdt1.html

D-prime: d' = separation / spread d’= | z (hit rate) - z (false alarm)|

Target absent Target Present

HIT = 0.84 Proportion of “yes” responses given the target is present

FA = 0.16 Proportion of “yes” responses given the target is absent

Only need of HIT and FA

absent present

Response bias

)

d’= z(H)-z(F)

d’= -1 -1

HIT = 0.84 = Area to the right of C is 84 %

FA = 0.16 Area to the right of C is 16 %

Criterion

c = -0.5[z(H)+z(F)] C = - 0.5 [z(0.84) + z(0.16)] C = - 0.5 [-1 + 1] C = 0 (no bias

Sensitivity d’= z(0.84) - z(0.16)

d’ = 2

In Excel: z(X) is NORMINV(x,0,1) e.g. NORMINV(0.84,0,1)

12



HIT = ?? Proportion of “yes” responses given the target is present

FA = ?? Proportion of “yes” responses Given the target is absent

Higher Criterion


FA = (very small) HIT = 0.5 Proportion of “yes” responses Proportion of “yes” responses


given the target is present Given the target is absent

Higher Criterion

13

14

Area under normal curve and z-score

Slide adapted from Ben Backus, Uni. Pennsylvania

Z-scores are measured in standard deviations from the mean. Area to the right of a z-score is the probability that a draw from the normal distribution will be above the z-score

84 % (area to the left)

Conclusion on d’ • d' is a measure of sensitivity. The larger the d'

value, the better your performance. A d' value of zero means that you cannot distinguish trials with the target from trials without the target. A d' of 4.6 indicates a nearly perfect ability to distinguish between trials that included the target and trials that did not include the target.

• C is a measure of response bias. A value greater than 0 indicates a conservative bias (a tendency to say `absent' more than `present') and a value less than 0 indicates a liberal bias. Values close to 0 indicate neutral bias.

1 stdev

•

• that

• i

•

•

•

• Hit l

N=6 /

CogLab 1: Signal Detection Results

d' is a measure of sensitivity. A d' value of zero means

?????? A d' of 4.6 indicates a nearly perfect abil ty to distinguish between trials that included the target and trials that did not include the target. C is a measure of response bias. A value greater than 0 indicates ????? a value less than 0 indicates ????? Values close to 0 indicate ?????

0.00

0.10

0.20

0.30

0.40

0.50

0.60

0.70

0.80

0.90

False A arm

SDT, Perception and Memory "Yes-No" paradigms •

in

ici lil

particil i

matrix. •

i

l i

ial

l diisi

i i i

A research domain where SDT has been successfully applied is the study of memory. Typically in memory experiments, part pants are shown a st of words and ater asked to make a "yes" or "no" statement as to whether they remember seeing an item before. Alternatively,

pants make "old" or "new" responses. The resu ts of the exper ment can be portrayed in what is called a decision

The hit rate is defined as the proportion of "old" responses given for tems that are Old and the false alarm rate is the proportion of "o d" responses g ven to items that are New.

Interactive program: the WISE Project's Signal Theory Tutor

Hypothetica stribution of yes and no response. The dec on criterion C determines whether a “yes” or “no” response will be made. Strong ev dence to the r ght of the criter on will lead to “yes” responses and weak evidence to the “left” will lead to weak responses.

15�

Animal Detection Task (Project)

Target:

Images removed due to copyright reasons. Image removed due to copyright reasons.

Distractor:

Image removed due to copyright reasons.

Method: experiment ready to run in matlab

False Memory (Project)

• Scene memory and visual complexity (clutter)

Images removed due to copyright reasons.

Method: pictures already ranked for complexity. Experiment can be done with�powerpoint.�Also: Memory of comics drawing, memory of places (e.g. for eyewitness testimony)�memory of emotional images, memory under dual-tasks, short term memory�(change blindness paradigm, etc), memory of emotions-faces, etc..�

16

Costs and Utilities of d’

What are the costs of a false alarm and of a miss for the following:

•�A pilot emerges from the fog and estimates whether her position is suitable for landing

•�A doctor estimates whether a fuzzy spot could be a tumor

•�You are screening bags at the airport

Wolfe et al (2005) – Detection of Rare Target - Visual Search (project)

Refer to:�Wolfe, Jeremy M., Todd. S. Horowitz, and Naomi M. Kenner. �“Rare items often missed in visual searches.” Nature 435 (2005): 439-440.�

“Our society relies on accurate performance in visual screening tasks. These are visual search for rare targets: we show here that target rarity leads to disturbingly Inaccurate performance in target detection”

17

What happened? When tools are present

on 50% of trials, observers missed 5-10% of them

When the same tools are present on just 1%

of trials, observers missed

A problem with performance, not searcher competence.

l

Find the tool 30-40% of them

Here, the important errors are Misses

Figure removed due to copyright reasons.

Please see: Wolfe, Jeremy M., Todd. S. Horowitz, and Naomi M. Kenner. Figure 1 in “Rare items often missed in visual searches.” Nature 435 (2005): 439-440.

18

Courtesy of Dr. Jeremy Wolfe. Used with permission.

The Gambler’s Fallacy

Reaction Time

-8 -4 0 4 8

Right after a MISS, RTs jump way up

2000

1000

l

Trial relative to Target Present Trial

(msec)

But I am sure another rare target won’t come again soon…right?

Reaction Time

So I don’t learn from my error

2000

1000

-8 -4 0 4 8

l


(msec)

19�



20

And I do the same after a Hit.

-8 -4 0 4 8


Reaction Time

(msec)

The Gambler’s Fallacy

-8 -4 0 4 8


Reaction Time

(msec)

S



The z-score

(xi − x)zi�= How many standard deviations above the s mean is score zi?

• x = 18, 24, 12, 6 → x = 1.5, 2, 1, 0.5 • z = .4, 1.3, -.4,-1.3 → z = .4, 1.3, -.4, -

1.3

Example use of z-scores

•�How do you combine scores from different people?

•�Mary, Jeff, and Raul see a movie. •�Their ratings of the movie, on a scale from 1 to

10: –�Mary = 7, Jeff = 9, Raul = 5

•�Average score = 7? •�It’s more meaningful to see how those scores

compare to how they typically rate movies.

21

Courtesy of Ruth Rosenholtz. Used with permission.


Example use of z-scores •�Recent ratings from Mary, Jeff, & Raul: •�Mary: 7, 8, 7, 9, 8, 9 7 is pretty low… •�Jeff: 8, 9, 9, 10, 8, 10 9 is average… •�Raul: 1, 2, 2, 4, 5 5 is pretty good…

•�z-scores: –�Mary: m=8, s=.8 z(7) = -1.2 –�Jeff: m=9 z(9) = 0 –�Raul: m=2.8, s=1.5 z(5) = 1.5

•�mean(z) = 0.1 = # of standard deviations above the mean –�It’s probably your average movie, nothing outstanding.

22


9.63 laboratory in cognitive...

Documents