Analyzing the Results of an Experiment…

• Not straightforward.

– Why not?

Variability and random (chance) outcomes

Inferential Statistics

• Statistical analysis appropriate for inferring causal relationships and effects.

• Many different formulas…which one do you use?

Inferential Statistic Selection

• Determine that you are analyzing the results of an experimental manipulation, not a correlation.

• Identify the IV and DV.

• The IV will always be nominal on some level, even when it may seem to be continuous (e.g., low, medium, and high doses of a drug).

Inferential Statistic Selection

• What is the scale of the DV?

Scale of DV     Statistic to use
Nominal         Chi-squared
Ordinal         Mann-Whitney U-test
Continuous      t-test or ANOVA

t-test or ANOVA?

How many levels of the IV are there?

2 levels            More than 2 levels
t-test or ANOVA     ANOVA

There are different forms of t-tests and ANOVAs:

Did the study use a within-group or between-group experimental design?

Only 2 levels of the IV

– Between group: unpaired t-test (or “t for independent samples”), or ANOVA (the basic ANOVA is fitted for between-group designs)

– Within group: paired t-test (or “t for dependent samples”), or within-group ANOVA (often referred to as a “repeated measures ANOVA”)

More than 2 levels of the IV

– Between group: ANOVA

– Within group: repeated measures ANOVA
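The selection steps above can be sketched as a small lookup. This is only an illustration of the decision rules on these slides; the function name `select_test` and its argument values ("nominal", "ordinal", "continuous", "between", "within") are made up for the example.

```python
def select_test(dv_scale, iv_levels=2, design="between"):
    """Suggest an inferential statistic from the scale of the DV,
    the number of IV levels, and the experimental design.

    Hypothetical helper: it only encodes the decision tables on the slides.
    """
    if iv_levels < 2:
        raise ValueError("an experiment needs at least 2 levels of the IV")
    if design not in ("between", "within"):
        raise ValueError("design must be 'between' or 'within'")

    if dv_scale == "nominal":
        return "chi-squared"
    if dv_scale == "ordinal":
        return "Mann-Whitney U-test"
    if dv_scale != "continuous":
        raise ValueError("dv_scale must be nominal, ordinal, or continuous")

    # Continuous DV: the choice depends on IV levels and on the design.
    if iv_levels == 2:
        if design == "between":
            return "unpaired t-test (or ANOVA)"
        return "paired t-test (or repeated measures ANOVA)"
    if design == "between":
        return "ANOVA"
    return "repeated measures ANOVA"


# A drug study with four dose groups, different subjects in each group,
# and a continuous memory score as the DV:
print(select_test("continuous", iv_levels=4, design="between"))
```

The function returns a label only; running the chosen test is a separate step.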

In some ways all inferential statistics are similar.

• They calculate how probable a result this large would be if only random variability, and not the IV, were at work…

• Let’s focus on the basic ANOVA, since it is the statistic you are likely to use most often.

ANOVA

• ANOVA produces an F-value.

• The F value is the ratio of the overall between-group variability to the mean within-group variability:

F = between-group variability (+ chance) / mean within-group variability (+ chance)
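For a one-way, between-group design this ratio is usually written in terms of mean squares. The notation below is the common textbook form, not something taken from these slides: k groups, n_j scores in group j, N scores in total, group means \bar{x}_j, and grand mean \bar{x}.

```latex
F = \frac{MS_{\text{between}}}{MS_{\text{within}}}
  = \frac{\sum_{j=1}^{k} n_j \,(\bar{x}_j - \bar{x})^2 \,/\, (k - 1)}
         {\sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - \bar{x}_j)^2 \,/\, (N - k)}
```

The numerator grows when the group means move apart; the denominator grows when scores within each group are spread out.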

What does this mean?

Let’s suppose:

• Experiment: the IV is marijuana, with four groups

– Control

– Placebo control

– Low dose

– High dose

Dependent variable:

• Performance on a short-term memory task, measured as the number correct out of 10 test items.

• 9 subjects in each group

Possible Outcome 1

Control   Placebo   Low dose   High dose
   4         2         2           2
   5         3         3           3
   6         4         4           5
   5         6         4           3
   5         5         5           4
   6         5         4           4
   4         4         5           4
   3         4         6           6
   7         3         3           5
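As a rough check on the Outcome 1 scores, the F ratio can be computed by hand. This is a minimal sketch of the standard one-way (between-group) ANOVA calculation; the function name `one_way_f` is made up for the example, and the scores are copied column by column from the table above.

```python
from statistics import mean

# Outcome 1 scores, one list per group (columns of the table).
outcome_1 = {
    "control":   [4, 5, 6, 5, 5, 6, 4, 3, 7],
    "placebo":   [2, 3, 4, 6, 5, 5, 4, 4, 3],
    "low dose":  [2, 3, 4, 4, 5, 4, 5, 6, 3],
    "high dose": [2, 3, 5, 3, 4, 4, 4, 6, 5],
}


def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way between-group ANOVA."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)

    # Between-group sum of squares: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: how far each score is from its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


f_value, df_b, df_w = one_way_f(list(outcome_1.values()))
print(f"F({df_b}, {df_w}) = {f_value:.2f}")  # prints "F(3, 32) = 1.50"
```

The group means are 5, 4, 4, and 4, so the between-group mean square (2.25) is not much larger than the within-group mean square (1.5).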

[Figures: histograms of score counts (y-axis: Count) for the control, placebo, low dose, and high dose samples, followed by the population distribution of scores.]

F value relatively low

[Figure: the high dose, low dose, placebo, and control groups, comparing between-group variability with within-group variability.]

Now consider this: Possible Outcome 2

Control   Placebo   Low dose   High dose
   4         2         2           2
   5         3         3           3
   6         4         4           5
   5         6         4           3
   5         5         5           4
   6         5         4           4
   4         4         5           4
   3         4         6           6
   7         3         3           5

[Figures: histograms of score counts (y-axis: Count) for the control, placebo, low dose, and high dose samples.]

F value relatively high

[Figure: the high dose, low dose, placebo, and control groups, comparing between-group variability with within-group variability.]

The high F value reflects

• Logic!

• The distributions of scores are much more clearly separated, and in this case do not overlap at all.

• Low F values indicate highly overlapping score distributions.

So how do we decide if an F value is large enough to conclude that the IV caused the result?

• We consult a table of established probabilities for different F values, given the degrees of freedom (between groups and within groups):

ANOVA Significance table

http://home.comcast.net/~sharov/PopEcol/tables/f001.html
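The table lookup itself cannot be reproduced here without the table. As a stand-in, the sketch below estimates the probability of an F value by shuffling scores between groups, which is a permutation test and not the F table method the slides describe. The function names, the 2000 shuffles, and the fixed random seed are all choices made for this example; the scores are the Outcome 1 data.

```python
import random


def f_ratio(groups):
    """One-way ANOVA F: between-group mean square / within-group mean square."""
    scores = [x for g in groups for x in g]
    grand_mean = sum(scores) / len(scores)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def permutation_p_value(groups, n_shuffles=2000, seed=0):
    """Share of random regroupings whose F is at least as large as the observed F."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    observed = f_ratio(groups)
    pooled = [x for g in groups for x in g]
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)            # pretend group membership is pure chance
        regrouped, start = [], 0
        for size in sizes:
            regrouped.append(pooled[start:start + size])
            start += size
        if f_ratio(regrouped) >= observed - 1e-12:
            hits += 1
    return hits / n_shuffles


outcome_1 = [
    [4, 5, 6, 5, 5, 6, 4, 3, 7],   # control
    [2, 3, 4, 6, 5, 5, 4, 4, 3],   # placebo
    [2, 3, 4, 4, 5, 4, 5, 6, 3],   # low dose
    [2, 3, 5, 3, 4, 4, 4, 6, 5],   # high dose
]
p = permutation_p_value(outcome_1)
print(f"observed F = {f_ratio(outcome_1):.2f}, estimated p = {p:.3f}")
```

If chance regroupings often produce an F as large as the observed one, the observed F is not convincing evidence of an effect of the IV.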

Where is/are the difference(s)?

[Figure: bar chart with a 0 to 70 vertical axis comparing six conditions: Neutral, Positive, Negative, Sex, Drug, Taboo.]

Inferential Statistics

The story of “Scratch”

Why not just use repeated t-tests? Probability pyramiding

• 15 t-tests would be required for this data set (one for each pair of the six conditions).

• Post-hoc tests include corrections for repeated testing of a large data set.
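The arithmetic behind probability pyramiding can be sketched in a few lines. Two assumptions are made here that the slides do not state: each t-test uses alpha = 0.05, and the tests are treated as independent. The Bonferroni adjustment is shown only as one simple example of a correction; the slides do not say which post-hoc test is used.

```python
from math import comb


def familywise_error(n_tests, alpha=0.05):
    """Chance of at least one false positive across n independent tests."""
    return 1 - (1 - alpha) ** n_tests


n_conditions = 6                      # Neutral, Positive, Negative, Sex, Drug, Taboo
n_pairs = comb(n_conditions, 2)       # every pair of conditions needs its own t-test
print(n_pairs)                        # prints 15

risk = familywise_error(n_pairs)
print(f"{risk:.2f}")                  # prints 0.54

# Bonferroni: divide alpha by the number of tests to keep the overall risk near 0.05.
bonferroni_alpha = 0.05 / n_pairs
print(f"{bonferroni_alpha:.4f}")      # prints 0.0033
```

With 15 uncorrected tests, the chance of at least one spurious "significant" difference is a little better than a coin flip.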

[Figure: bar chart with a 0 to 70 vertical axis comparing six conditions: Neutral, Positive, Negative, Sex, Drug, Taboo.]

After all this, where do we stand? We can still be wrong.

Factors that affect “power”:

• Sample size

• One- vs. two-tailed testing

• Effect size
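How the three factors move power can be sketched with a normal (z-test) approximation for comparing two group means. This is a simplification chosen for the example, not the method from the slides: it assumes equal group sizes, a standardized effect size (difference in means divided by the standard deviation), and, for the one-tailed case, an effect in the predicted direction. The function name `approx_power` is hypothetical.

```python
from statistics import NormalDist


def approx_power(effect_size, n_per_group, alpha=0.05, two_tailed=True):
    """Approximate power of a two-group comparison (normal approximation)."""
    z = NormalDist()
    # Expected size of the test statistic when the effect is real.
    shift = effect_size * (n_per_group / 2) ** 0.5
    if two_tailed:
        crit = z.inv_cdf(1 - alpha / 2)
        # Probability of landing beyond either critical value.
        return (1 - z.cdf(crit - shift)) + z.cdf(-crit - shift)
    crit = z.inv_cdf(1 - alpha)
    return 1 - z.cdf(crit - shift)


# Sample size: more subjects per group gives more power.
print(approx_power(0.5, 20), approx_power(0.5, 40))
# One- vs. two-tailed: a one-tailed test has more power in the predicted direction.
print(approx_power(0.5, 20, two_tailed=False), approx_power(0.5, 20))
# Effect size: larger effects are easier to detect.
print(approx_power(0.8, 20), approx_power(0.5, 20))
```

Each printed pair shows the higher-power case first or second as labeled in the comment; the exact values depend on the approximation and should be read as rough guides.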
