
Linear Models -- Foundation

LM Foundation 2

What Type of Variable?

1. Temperature (°F)
2. Habitat complexity (low, medium, high)
3. Home range size (m²)
4. Brood size
5. Forest type (deciduous, mixed, coniferous)
6. Number of docks (on a lake shoreline)
7. Ecoregion (Northern Lakes & Forests, North Central Hardwood Forests, Driftless Area, Southeastern Wisconsin Till Plains, Central Corn Belt Plains)
8. Survived (yes, no)
9. Age (years)
10. Race

Which is the response variable?

1. Can length be used to predict weight?

2. How is weight affected by typical daily ration?

3. Does metabolic rate differ by sex of rabbit?

4. Is gas mileage significantly affected by the weight of the car?

5. Is there a relationship between how much money a person makes and their satisfaction with deer harvest regulations?

6. How is the uptake of heavy metals affected by the sex and age (young, middle, old) of the individual?

7. Is there a relationship between how much money a person makes and how much they weigh?

Linear Models

• A categorization scheme
• All use a common foundation of theory

[Figure: categorization grid with cells numbered 1 to 4 and column labels "1 Factor" and "2 Factors"]

Which Test? Why?

1. Does bird species diversity (number of species) decline as you move away from the equator (increasing latitude)?

2. Does the mean length of the anterior adductor muscle scar on a mussel species differ among five locations?

3. Does whether or not an otter captures a bluegill depend on the total length of the bluegill?

4. Is there a difference in fat reserves (thickness in mm) between wild and domestic seals, sex of the seal, or the interaction between the seal type and sex?

5. Does the relationship between the number of times the word gender was used in a journal volume and the year of the volume differ among three different journals?

Which Test? Why?

1. Does the relationship between resting heart rate and body weight differ among groups of subjects that had or had not ingested caffeine?

2. Does the mean alcohol by volume differ among five different types of beer (pale ales, IPAs, lagers, stouts, and porters)?

3. Does mean alcohol by volume change depending on the weight of malt extract used in the brewing process?


Example Data – Sex & Direction

A sample of 30 males and 30 females was taken to an unfamiliar wooded park and given spatial orientation tests, including pointing to the south. The absolute pointing error (APE), in degrees, was recorded. The results are in the SexDirection.txt file on the webpage. Is there a difference in sense of direction between men and women?

from Sholl, M.J., J.C. Acacio, R.O. Makar, and C. Leon. 2000. The relation of sex and sense of direction to spatial orientation in an unfamiliar environment. Journal of Environmental Psychology. 20:17-28.

Example Data – Sex & Direction

• What are the hypotheses?
  – H0: $\mu_m - \mu_f = 0$; HA: $\mu_m - \mu_f \neq 0$
• Use which hypothesis test?
  – Two-sample t-test
• What is the conclusion from the handout?
  – No significant difference in mean APE between males and females

Competing Models

Characteristic   Full Model   Simple Model
# Parameters     More         Fewer
Relative Fit     Better       Worse
Hypothesis       HA           H0

Competing Models

[Figure: Absolute Error (0 to 150) plotted against Sex (Female, Male)]

                 Full     Simple
# Parameters     More     Fewer
Relative Fit     Better   Worse
Hypothesis       HA       H0

Competing Models – 2-sample T

• H0: $\mu_i = \mu$
  – “The mean for each group equals a single grand mean”
    • i.e., “No difference in group means”

[Figure: Absolute Error (0 to 150) plotted against Sex (Female, Male)]

Competing Models – 2-sample T

• HA: $\mu_i = \mu_i$ (where $\mu_1 \neq \mu_2$)
  – “Each group mean equals a different value”
    • i.e., “Difference in group means”

[Figure: Absolute Error (0 to 150) plotted against Sex (Female, Male)]

Competing Models

[Figure: two panels of Absolute Error (0 to 150) plotted against Sex (Female, Male)]

Characteristic   Full Model   Simple Model
# Parameters     More         Fewer
Relative Fit     Better       Worse
Hypothesis       HA           H0

Is the “benefit” of a better fit worth the “cost” of added complexity?

Measuring Fit

[Figure: two panels of Absolute Error (0 to 150) plotted against Sex (Female, Male)]

Measuring Fit – Notation

$Y_{ij}$ = Y measurement on individual j in group i
$I$ = total number of groups
$n_i$ = number of individuals in group i
$n$ = number of individuals in all groups
$\bar{Y}_{i\cdot}$ = group i sample mean (i.e., group mean)
$\bar{Y}_{\cdot\cdot}$ = sample mean of all individuals (i.e., grand mean)

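Measuring Fit – Notation Examples

• ith Group Sample Mean: $\bar{Y}_{i\cdot} = \frac{1}{n_i} \sum_{j=1}^{n_i} Y_{ij}$
• Grand Sample Mean: $\bar{Y}_{\cdot\cdot} = \frac{1}{n} \sum_{i=1}^{I} \sum_{j=1}^{n_i} Y_{ij}$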
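The two means can be computed directly from grouped data. The following is a minimal Python sketch. The data values and the function names `group_means` and `grand_mean` are made up for illustration and are not the SexDirection.txt measurements.

```python
# Hypothetical toy data (NOT the SexDirection.txt values).
# Each list holds the Y measurements for the individuals in one group.
groups = {
    "Female": [10.0, 30.0, 50.0],
    "Male": [20.0, 60.0, 100.0],
}


def group_means(groups):
    """Return the sample mean of each group (one value per group)."""
    return {name: sum(ys) / len(ys) for name, ys in groups.items()}


def grand_mean(groups):
    """Return the sample mean of all individuals, ignoring group labels."""
    all_ys = [y for ys in groups.values() for y in ys]
    return sum(all_ys) / len(all_ys)


print(group_means(groups))  # {'Female': 30.0, 'Male': 60.0}
print(grand_mean(groups))   # 45.0
```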

Measuring Fit – SS

• SS (sum of squares) measures lack-of-fit of a model to a set of data
• General form: $SS = \sum (\text{data} - \text{model})^2$

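Measuring Fit – SSTotal

[Figure: Absolute Error (0 to 150) plotted against Sex (Female, Male)]

$SS_{Total} = \sum_{i=1}^{I} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{\cdot\cdot})^2 = 115465$

where $Y_{ij}$ is the data and $\bar{Y}_{\cdot\cdot}$ (the grand mean) is the model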

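Measuring Fit – SSWithin

[Figure: Absolute Error (0 to 150) plotted against Sex (Female, Male)]

$SS_{Within} = \sum_{i=1}^{I} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{i\cdot})^2 = 110496$

where $Y_{ij}$ is the data and $\bar{Y}_{i\cdot}$ (the group mean) is the model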

Measuring Fit – SSWithin & SSTotal

[Figure: Absolute Error (0 to 150) plotted against Sex (Female, Male), annotated with SSWithin for each group and SSTotal]

SSTotal = 115465
SSWithin = 110496

Full model ALWAYS fits better!
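To see why the full model can never fit worse than the simple model, both sums of squares can be computed on a small data set. This is a sketch in Python under stated assumptions: the values are hypothetical toy data (not the SexDirection.txt data), and `sum_of_squares` is a helper name introduced here for illustration.

```python
def sum_of_squares(data, model):
    """Lack-of-fit: sum of squared differences between data and model predictions."""
    return sum((y - m) ** 2 for y, m in zip(data, model))


# Hypothetical toy data (NOT the SexDirection.txt values)
female = [10.0, 30.0, 50.0]
male = [20.0, 60.0, 100.0]
all_ys = female + male

grand = sum(all_ys) / len(all_ys)
mean_f = sum(female) / len(female)
mean_m = sum(male) / len(male)

# Simple model: every individual is predicted by the grand mean
ss_total = sum_of_squares(all_ys, [grand] * len(all_ys))

# Full model: every individual is predicted by its own group mean
full_predictions = [mean_f] * len(female) + [mean_m] * len(male)
ss_within = sum_of_squares(all_ys, full_predictions)

print(ss_total, ss_within)  # 5350.0 4000.0
```

Each group mean minimizes the squared deviations within its own group, so SSWithin cannot exceed SSTotal. The two are equal only when every group mean equals the grand mean.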

Measuring Fit – SSTotal Partitions

• SSTotal = SSWithin + SSAmong
• where SSAmong is the
  – Difference in SS between full & simple models
  – Improvement in lack-of-fit when using full model (rather than simple model)
  – Measure of how different the group means are
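As a check on the partition, the SSTotal and SSWithin values reported for the Sex & Direction example can be plugged in. The second line is the standard direct formula for the among-group sum of squares, written in the notation defined earlier. It is included as an assumption about the usual definition, not as a formula recovered from the slide.

```latex
\begin{align*}
SS_{Among} &= SS_{Total} - SS_{Within} = 115465 - 110496 = 4969 \\
SS_{Among} &= \sum_{i=1}^{I} n_i \left( \bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot} \right)^2
\end{align*}
```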

Measuring Fit – SSAmong

[Figure: Absolute Error (0 to 150) plotted against Sex (Female, Male), annotated with SSAmong]

• What would make SSAmong be “large”?
• Must not forget about differences in model complexity!

Measuring Complexity

• df = n − number of predictions
  – “Simple model” dfTotal = n − 1
  – “Full model” dfWithin = n − I
• dfTotal = dfWithin + dfAmong
• dfAmong = I − 1
  – Difference in number of model parameters
  – Added complexity of full model

Fit vs. Complexity

• Factor out the difference in number of parameters from the fit calculation by dividing SS by df
• Result is “mean square” (MS)
• MS are sample variances
  – MSTotal = $s^2$ = total variability among individuals around grand mean
  – MSWithin = $s_p^2$ = pooled variability among individuals around group means
  – MSAmong = variability of group means around the grand mean
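The claim that mean squares are sample variances can be checked numerically. The sketch below uses Python's standard `statistics` module on hypothetical toy data (not the SexDirection.txt values). The variable names are illustrative only.

```python
import math
import statistics

# Hypothetical toy data (NOT the SexDirection.txt values)
female = [10.0, 30.0, 50.0]
male = [20.0, 60.0, 100.0]
all_ys = female + male
n, num_groups = len(all_ys), 2

grand = statistics.mean(all_ys)
ss_total = sum((y - grand) ** 2 for y in all_ys)
ss_within = sum(
    (y - statistics.mean(g)) ** 2 for g in (female, male) for y in g
)
ss_among = ss_total - ss_within

# Mean squares: each SS divided by its df
ms_total = ss_total / (n - 1)
ms_within = ss_within / (n - num_groups)
ms_among = ss_among / (num_groups - 1)

# MSTotal is the ordinary sample variance of all individuals
assert math.isclose(ms_total, statistics.variance(all_ys))

# MSWithin is the pooled variance: group variances weighted by (n_i - 1)
pooled = (
    (len(female) - 1) * statistics.variance(female)
    + (len(male) - 1) * statistics.variance(male)
) / (n - num_groups)
assert math.isclose(ms_within, pooled)

print(ms_total, ms_within, ms_among)
```

With these toy values the SS add up (SSAmong + SSWithin = SSTotal) but the MS do not, because each MS is divided by a different df.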

Fit vs. Complexity – MS

• Suppose that MSAmong = 10
  – Is this “large” if MSWithin = 100?
  – Is this “large” if MSWithin = 1?
• $F = \frac{MS_{Among}}{MS_{Within}}$
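The F ratio for the Sex & Direction example can be worked out from the numbers already reported: SSTotal = 115465, SSWithin = 110496, and a sample of 30 males and 30 females. The Python sketch below just carries out that arithmetic. The variable names are illustrative, and the result inherits any rounding in the reported SS values.

```python
# Values reported for the Sex & Direction example
ss_total = 115465
ss_within = 110496
n = 60          # 30 males + 30 females
num_groups = 2  # I, the number of groups

# Improvement in lack-of-fit from using the full model
ss_among = ss_total - ss_within

# Degrees of freedom for each model and their difference
df_total = n - 1
df_within = n - num_groups
df_among = num_groups - 1

# Mean squares and the F ratio
ms_among = ss_among / df_among
ms_within = ss_within / df_within
f_stat = ms_among / ms_within

print(f"SSAmong = {ss_among}, dfAmong = {df_among}, dfWithin = {df_within}")
print(f"MSAmong = {ms_among:.1f}, MSWithin = {ms_within:.1f}, F = {f_stat:.2f}")
```

Here F comes out near 2.6, so MSAmong is less than three times MSWithin.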

Fit vs. Complexity – F Distribution

• Has numerator and denominator df
  – numerator from dfAmong
  – denominator from dfWithin
• Right-skewed, all positive numbers
• P-value always upper tail

$F = \frac{MS_{Among}}{MS_{Within}}$
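For the two-group case, the upper-tail p-value can be sketched without a statistics library, because an F statistic with 1 numerator df is the square of a t statistic with dfWithin degrees of freedom. The Python sketch below relies on that fact and on Simpson's rule for the integration. The function names `t_pdf` and `f_upper_tail_1df` are hypothetical, and the approach is only valid when dfAmong = 1. In practice, statistical software reports this value directly.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_const - (df + 1) / 2 * math.log1p(t * t / df))


def f_upper_tail_1df(f_stat, df_within, steps=2000):
    """Upper-tail p-value for an F statistic with 1 numerator df.

    Uses F = t^2: integrates the t density from 0 to sqrt(F) with
    Simpson's rule (steps must be even), then p = 1 - 2 * integral.
    """
    upper = math.sqrt(f_stat)
    h = upper / steps
    total = t_pdf(0.0, df_within) + t_pdf(upper, df_within)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df_within)
    integral = total * h / 3
    return 1 - 2 * integral


# F ratio implied by the reported SSTotal = 115465 and SSWithin = 110496
f_stat = (115465 - 110496) / (110496 / 58)
p_value = f_upper_tail_1df(f_stat, 58)
print(f"F = {f_stat:.2f}, p = {p_value:.3f}")
```

The p-value lands above 0.10, which is consistent with the handout's conclusion of no significant difference between males and females.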

Fit vs. Complexity – p-value

• Large p-value?
  • Small F
  • Small MSAmong relative to MSWithin
  • Small SSAmong
  • Full model not “better”
  • Group means do not differ

$F = \frac{MS_{Among}}{MS_{Within}}$, where $MS_{Among} = \frac{SS_{Among}}{df_{Among}}$

[Figure: Absolute Error (0 to 150) plotted against Sex (Female, Male), annotated with SSAmong]

Fit vs. Complexity – p-value

• Small p-value?
  • Large F
  • Large MSAmong relative to MSWithin
  • Large SSAmong
  • Full model is “better”
  • Group means do differ
• Large p-value?
  • Small F
  • Small MSAmong relative to MSWithin
  • Small SSAmong
  • Full model not “better”
  • Group means do not differ

Things To Remember

• Always two models
  – Full model is a separate mean for each group
  – Simple model is a single mean for all groups
• SSTotal partitions into two parts: SSAmong + SSWithin = SSTotal
  – SSAmong is the improvement in lack-of-fit from using the full model
• MS are SS/df and are variances
  – MSTotal is the variance of Y
  – MSWithin is the pooled common variance
• dfAmong is the increase in complexity of the full model
• MSAmong + MSWithin ≠ MSTotal (because of different df)
• F is the ratio MSAmong / MSWithin
• If F is large, then there is evidence for different means (i.e., reject H0)

Linear Models in R – Handout

• Note use of
  – lm()
  – summary()
  – confint()
  – fitPlot()
  – anova()
