Linear Models -- Foundation
LM Foundation 2
What Type of Variable?
1. Temperature (°F)
2. Habitat complexity (low, medium, high)
3. Home range size (m²)
4. Brood size
5. Forest type (deciduous, mixed, coniferous)
6. Number of docks (on a lake shoreline)
7. Ecoregion (Northern Lakes & Forests, North Central Hardwood Forests, Driftless Area, Southeastern Wisconsin Till Plains, Central Corn Belt Plains)
8. Survived (yes, no)
9. Age (years)
10. Race
LM Foundation 3
Which is the response variable?
1. Can length be used to predict weight?
2. How is weight affected by typical daily ration?
3. Does metabolic rate differ by sex of rabbit?
4. Is gas mileage significantly affected by the weight of the car?
5. Is there a relationship between how much money a person makes and their satisfaction with deer harvest regulations?
6. How is the uptake of heavy metals affected by the sex and age (young, middle, old) of the individual?
7. Is there a relationship between how much money a person makes and how much they weigh?
LM Foundation 4
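Linear Models
• A categorization scheme
• All use a common foundation of theory
[Table: four model types, numbered 1 to 4, arranged under the column headings "1 Factor" and "2 Factors"; cell contents not recoverable]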
LM Foundation 5
Which Test? Why?
1. Does bird species diversity (number of species) decline as you move away from the equator (i.e., as latitude increases)?
2. Does the mean length of the anterior adductor muscle scar on a mussel species differ among five locations?
3. Does whether or not an otter captures a bluegill depend on the total length of the bluegill?
4. Is there a difference in fat reserves (thickness in mm) between wild and domestic seals, sex of the seal, or the interaction between the seal type and sex?
5. Does the relationship between the number of times the word gender was used in a journal volume and the year of the volume differ among three different journals?
LM Foundation 6
Which Test? Why?
1. Does the relationship between resting heart rate and body weight differ among groups of subjects that had or had not ingested caffeine?
2. Does the mean alcohol by volume differ among five different types of beer (pale ales, IPAs, lagers, stouts, and porters)?
3. Does mean alcohol by volume change depending on the weight of malt extract used in the brewing process?
LM Foundation 10
Example Data – Sex & Direction
A sample of 30 males and 30 females was taken to an unfamiliar wooded park and given spatial orientation tests, including pointing to the south. The absolute pointing error, in degrees, was recorded. The results are in the SexDirection.txt file on the webpage. Is there a difference in sense of direction between men and women?
from Sholl, M.J., J.C. Acacio, R.O. Makar, and C. Leon. 2000. The relation of sex and sense of direction to spatial orientation in an unfamiliar environment. Journal of Environmental Psychology. 20:17-28.
LM Foundation 11
Example Data – Sex & Direction
• What are the hypotheses?
  – $H_0: \mu_m - \mu_f = 0$
  – $H_A: \mu_m - \mu_f \neq 0$
• Use which hypothesis test?
  – Two-sample t-test
• What is the conclusion from the handout?
  – No significant difference in mean absolute pointing error (APE) between males and females
LM Foundation 12
Competing Models
Characteristic    Full Model    Simple Model
# Parameters      More          Fewer
Relative Fit      Better        Worse
Hypothesis        HA            H0
LM Foundation 13
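Competing Models
[Figure: absolute error (0 to 150) by sex (Female, Male), with the full and simple models labeled]
Full model: more parameters, better fit, HA
Simple model: fewer parameters, worse fit, H0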
LM Foundation 14
Competing Models – 2-sample T
• $H_0: \mu_i = \mu$
  – “The mean for each group equals a single grand mean”
    • i.e., “No difference in group means”
[Figure: absolute error (0 to 150) by sex (Female, Male)]
LM Foundation 15
Competing Models – 2-sample T
• $H_A: \mu_i = \mu_i$ (where $\mu_1 \neq \mu_2$)
  – “Each group mean equals a different value”
    • i.e., “Difference in group means”
[Figure: absolute error (0 to 150) by sex (Female, Male)]
LM Foundation 16
Competing Models
[Figure: two panels of absolute error (0 to 150) by sex (Female, Male)]
Characteristic    Full Model    Simple Model
# Parameters      More          Fewer
Relative Fit      Better        Worse
Hypothesis        HA            H0
Is the “benefit” of a better fit worth the “cost” of added complexity?
LM Foundation 17
Measuring Fit
[Figure: two panels of absolute error (0 to 150) by sex (Female, Male)]
LM Foundation 18
Measuring Fit – Notation
$Y_{ij}$ = Y measurement on individual j in group i
$I$ = total number of groups
$n_i$ = number of individuals in group i
$n$ = number of individuals in all groups
$\bar{Y}_{i\cdot}$ = group i sample mean (i.e., group mean)
$\bar{Y}_{\cdot\cdot}$ = sample mean of all individuals (i.e., grand mean)
LM Foundation 19
Measuring Fit – Notation Examples
• ith group sample mean
• Grand sample mean
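In the notation above, the two sample means can be written as follows, using the standard definitions:

```latex
\bar{Y}_{i\cdot} = \frac{1}{n_i}\sum_{j=1}^{n_i} Y_{ij}
\qquad\qquad
\bar{Y}_{\cdot\cdot} = \frac{1}{n}\sum_{i=1}^{I}\sum_{j=1}^{n_i} Y_{ij}
```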
LM Foundation 20
Measuring Fit – SS
• SS (sum of squares) measures the lack-of-fit of a model to a set of data
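As a minimal sketch of this idea, lack-of-fit can be computed as the sum of squared differences between each observation and the value the model predicts for it. The data values and the helper name `sum_of_squares` below are hypothetical and chosen only for illustration; they are not the SexDirection.txt data.

```python
from statistics import mean


def sum_of_squares(observed, predicted):
    """Sum of squared differences between the data and the model predictions."""
    return sum((y - p) ** 2 for y, p in zip(observed, predicted))


# Hypothetical absolute pointing errors (degrees); NOT the SexDirection.txt data
groups = {
    "Female": [10.0, 40.0, 70.0, 120.0],
    "Male": [5.0, 20.0, 35.0, 60.0],
}
labels = [sex for sex, ys in groups.items() for _ in ys]
values = [y for ys in groups.values() for y in ys]

# Simple model: predict the grand mean for every individual
grand_mean = mean(values)
ss_simple = sum_of_squares(values, [grand_mean] * len(values))

# Full model: predict each individual's own group mean
group_means = {sex: mean(ys) for sex, ys in groups.items()}
ss_full = sum_of_squares(values, [group_means[sex] for sex in labels])

print(f"lack-of-fit, simple model: {ss_simple:.1f}")
print(f"lack-of-fit, full model:   {ss_full:.1f}")
```

A smaller value means the model's predictions sit closer to the data.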
LM Foundation 21
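Measuring Fit – SSTotal
[Figure: absolute error (0 to 150) by sex (Female, Male)]
$SS_{Total} = \sum_{i=1}^{I}\sum_{j=1}^{n_i}(Y_{ij} - \bar{Y}_{\cdot\cdot})^2 = 115465$
(data = $Y_{ij}$; model = $\bar{Y}_{\cdot\cdot}$, the grand mean)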
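LM Foundation 22
Measuring Fit – SSWithin
[Figure: absolute error (0 to 150) by sex (Female, Male)]
$SS_{Within} = \sum_{i=1}^{I}\sum_{j=1}^{n_i}(Y_{ij} - \bar{Y}_{i\cdot})^2 = 110496$
(data = $Y_{ij}$; model = $\bar{Y}_{i\cdot}$, the group mean)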
LM Foundation 23
Measuring Fit – SSWithin & SSTotal
[Figure: absolute error (0 to 150) by sex (Female, Male), with SSWithin marked for each group and SSTotal marked overall]
SSTotal = 115465
SSWithin = 110496
Full model ALWAYS fits better (SSWithin is never larger than SSTotal)!
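A short derivation shows why the full model can never fit worse than the simple model. This is the standard decomposition, sketched here for reference; within each group the cross term vanishes because deviations from the group mean sum to zero.

```latex
\sum_{j=1}^{n_i}\left(Y_{ij}-\bar{Y}_{\cdot\cdot}\right)^2
  = \sum_{j=1}^{n_i}\left(Y_{ij}-\bar{Y}_{i\cdot}\right)^2
  + n_i\left(\bar{Y}_{i\cdot}-\bar{Y}_{\cdot\cdot}\right)^2
\quad\Longrightarrow\quad
SS_{Total} = SS_{Within}
  + \sum_{i=1}^{I} n_i\left(\bar{Y}_{i\cdot}-\bar{Y}_{\cdot\cdot}\right)^2
  \;\ge\; SS_{Within}
```

Equality holds only when every group mean equals the grand mean.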
LM Foundation 24
Measuring Fit – SSTotal Partitions
• SSTotal = SSWithin + SSAmong
• where SSAmong is
  – the difference in SS between full & simple models
  – the improvement in lack-of-fit when using the full model (rather than the simple model)
  – a measure of how different the group means are
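The partition can be checked numerically. The sketch below first applies it to the SS values reported for the sex and direction example, then verifies the identity on a small made-up data set by computing SSAmong directly from the group means. The data values and variable names are hypothetical, not taken from SexDirection.txt.

```python
from statistics import mean

# Values reported on the slides for the sex and direction example
ss_total_slide = 115465
ss_within_slide = 110496
ss_among_slide = ss_total_slide - ss_within_slide
print("SSAmong from the slide values:", ss_among_slide)

# Hypothetical data (NOT the SexDirection.txt data), with unequal group sizes
groups = {
    "Female": [12.0, 48.0, 90.0],
    "Male": [6.0, 30.0, 54.0, 70.0],
}
all_values = [y for ys in groups.values() for y in ys]
grand_mean = mean(all_values)

# Lack-of-fit around the grand mean (simple model)
ss_total = sum((y - grand_mean) ** 2 for y in all_values)
# Lack-of-fit around each group's own mean (full model)
ss_within = sum((y - mean(ys)) ** 2 for ys in groups.values() for y in ys)
# Weighted squared distance of each group mean from the grand mean
ss_among = sum(len(ys) * (mean(ys) - grand_mean) ** 2 for ys in groups.values())

print("SSTotal:           ", round(ss_total, 3))
print("SSWithin + SSAmong:", round(ss_within + ss_among, 3))
```

SSAmong grows when the group means move further from the grand mean, which is why it serves as a measure of how different the group means are.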
LM Foundation 25
Measuring Fit – SSAmong
[Figure: absolute error (0 to 150) by sex (Female, Male), with SSAmong marked]
• What would make SSAmong be “large”?
• Must not forget about differences in model complexity!
LM Foundation 26
Measuring Complexity
• df = n − (number of predictions)
  – “Simple model” dfTotal = n-1
  – “Full model” dfWithin = n-I
• dfTotal = dfWithin + dfAmong
• dfAmong = I-1
  – Difference in number of model parameters
  – Added complexity of full model
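For the sex and direction example, with n = 60 individuals (30 males and 30 females) in I = 2 groups, the degrees of freedom work out as:

```latex
df_{Total} = n - 1 = 59, \qquad
df_{Within} = n - I = 58, \qquad
df_{Among} = df_{Total} - df_{Within} = I - 1 = 1
```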
LM Foundation 27
Fit vs. Complexity
• Factor out the difference in number of parameters from the fit calculation by dividing SS by df
• Result is “mean square” (MS)
• MS are sample variances
  – MSTotal = $s^2$ = total variability among individuals around grand mean
  – MSWithin = $s_p^2$ = pooled variability among individuals around group means
  – MSAmong = variability of group means around the grand mean
LM Foundation 28
Fit vs. Complexity – MS
• Suppose that MSAmong = 10
– Is this “large” if MSWithin = 100?
– Is this “large” if MSWithin = 1?
• $F = \dfrac{MS_{Among}}{MS_{Within}}$
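As a worked example, the mean squares and F can be computed from the SS values reported for the sex and direction data (SSTotal = 115465, SSWithin = 110496, with 30 males and 30 females). The variable names are illustrative only.

```python
n = 60        # 30 males + 30 females
I = 2         # number of groups (sexes)
ss_total = 115465
ss_within = 110496
ss_among = ss_total - ss_within   # improvement in lack-of-fit from the full model

df_within = n - I                 # full model
df_among = I - 1                  # added complexity of the full model

ms_within = ss_within / df_within
ms_among = ss_among / df_among
f_stat = ms_among / ms_within

print(f"MSAmong  = {ms_among:.1f}")
print(f"MSWithin = {ms_within:.1f}")
print(f"F        = {f_stat:.2f}")
```

Whether MSAmong is "large" is judged only relative to MSWithin, which is what the ratio captures.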
LM Foundation 29
Fit vs. Complexity – F Distribution
• Has numerator and denominator df
  – numerator from dfAmong
  – denominator from dfWithin
• Right-skewed, all positive numbers
• P-value always upper tail
$F = \dfrac{MS_{Among}}{MS_{Within}}$
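The upper-tail p-value can be approximated without a statistics library. The sketch below is one possible approach, not the method used in the course handout: it rewrites the upper-tail F probability as a regularized incomplete beta integral and evaluates it with Simpson's rule. The function name `f_upper_tail` and its `steps` parameter are hypothetical; in practice a statistical package would be used instead.

```python
import math


def f_upper_tail(f_obs, df_num, df_den, steps=2000):
    """Approximate the upper-tail p-value P(F >= f_obs) for F(df_num, df_den).

    Uses the identity P(F >= f) = I_x(df_den/2, df_num/2), where I_x is the
    regularized incomplete beta function and x = df_den / (df_den + df_num*f).
    Assumes f_obs > 0 and df_den >= 2 so the integrand is finite on [0, x].
    """
    a, b = df_den / 2.0, df_num / 2.0
    x = df_den / (df_den + df_num * f_obs)
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

    def beta_density(t):
        return math.exp(-log_beta) * t ** (a - 1) * (1 - t) ** (b - 1)

    # Composite Simpson's rule on [0, x]; steps must be even
    h = x / steps
    total = beta_density(0.0) + beta_density(x)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * beta_density(k * h)
    return total * h / 3


# F for the sex and direction example, from the SS values on the slides
f_stat = (115465 - 110496) / (110496 / 58)
p_value = f_upper_tail(f_stat, df_num=1, df_den=58)
print(f"F = {f_stat:.2f}, upper-tail p-value is roughly {p_value:.2f}")
```

Only the upper tail is used because a large F is the only outcome that favors the full model.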
LM Foundation 30
Fit vs. Complexity – p-value
• Large p-value?
  • Small F
  • Small MSAmong relative to MSWithin
  • Small SSAmong
  • Full model not “better”
  • Group means do not differ
$F = \dfrac{MS_{Among}}{MS_{Within}}$
$MS_{Among} = \dfrac{SS_{Among}}{df_{Among}}$
[Figure: absolute error (0 to 150) by sex (Female, Male), with SSAmong marked]
LM Foundation 31
Fit vs. Complexity – p-value
• Small p-value?
  • Large F
  • Large MSAmong relative to MSWithin
  • Large SSAmong
  • Full model is “better”
  • Group means do differ
• Large p-value?
  • Small F
  • Small MSAmong relative to MSWithin
  • Small SSAmong
  • Full model not “better”
  • Group means do not differ
LM Foundation 32
Things To Remember
• Always two models
  – Full model is separate means for each group
  – Simple model is a single mean for all groups
• The SSTotal partitions into two parts -- SSAmong + SSWithin = SSTotal
  – SSAmong is the improvement in lack-of-fit using the full model
• MS are SS/df and are variances
  – MSTotal is variance of Y
  – MSWithin is the pooled common variance
• dfAmong is the increase in complexity of the full model
• MSAmong + MSWithin ≠ MSTotal (because of different df)
• F is the ratio MSAmong / MSWithin
• If F is large then evidence for different means -- i.e., reject H0
LM Foundation 33
Linear Models in R – HO
• Note use of
  – lm()
  – summary()
  – confint()
  – fitPlot()
  – anova()