some questions (and a few answers) about multilevel modelsgelman/presentations/cdctalk.pdf · some...
TRANSCRIPT
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Some questions (and a few answers) aboutmultilevel models
Andrew GelmanDepartment of Statistics and Department of Political Science
Columbia University
3 May 2005
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Themes
I Multilevel models are necessary
I Tools needed to build, fit, check, and understand mlms
I Analogy to linear regression
I Mlm as regression with categorical inputs
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Themes
I Multilevel models are necessary
I Tools needed to build, fit, check, and understand mlms
I Analogy to linear regression
I Mlm as regression with categorical inputs
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Themes
I Multilevel models are necessary
I Tools needed to build, fit, check, and understand mlms
I Analogy to linear regression
I Mlm as regression with categorical inputs
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Themes
I Multilevel models are necessary
I Tools needed to build, fit, check, and understand mlms
I Analogy to linear regression
I Mlm as regression with categorical inputs
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Themes
I Multilevel models are necessary
I Tools needed to build, fit, check, and understand mlms
I Analogy to linear regression
I Mlm as regression with categorical inputs
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Fitting and understanding multilevel models
I Some of my experiences with multilevel models
I Some challenges and solutions
I Lots of time for questionsI Collaborators:
I Iain Pardoe, Dept of Decision Sciences, University of OregonI David Park, Joseph Bafumi, Boris Shor, Dept of Political
Science, Columbia UniversityI Samantha Cook, Zaiying Huang, Jouni Kerman, Shouhao
Zhao, Dept of Statistics, Columbia UniversityI Phillip Price, Energy and Environment Division, Lawrence
Berkeley National Laboratory
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Fitting and understanding multilevel models
I Some of my experiences with multilevel models
I Some challenges and solutions
I Lots of time for questionsI Collaborators:
I Iain Pardoe, Dept of Decision Sciences, University of OregonI David Park, Joseph Bafumi, Boris Shor, Dept of Political
Science, Columbia UniversityI Samantha Cook, Zaiying Huang, Jouni Kerman, Shouhao
Zhao, Dept of Statistics, Columbia UniversityI Phillip Price, Energy and Environment Division, Lawrence
Berkeley National Laboratory
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Fitting and understanding multilevel models
I Some of my experiences with multilevel models
I Some challenges and solutions
I Lots of time for questionsI Collaborators:
I Iain Pardoe, Dept of Decision Sciences, University of OregonI David Park, Joseph Bafumi, Boris Shor, Dept of Political
Science, Columbia UniversityI Samantha Cook, Zaiying Huang, Jouni Kerman, Shouhao
Zhao, Dept of Statistics, Columbia UniversityI Phillip Price, Energy and Environment Division, Lawrence
Berkeley National Laboratory
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Fitting and understanding multilevel models
I Some of my experiences with multilevel models
I Some challenges and solutions
I Lots of time for questionsI Collaborators:
I Iain Pardoe, Dept of Decision Sciences, University of OregonI David Park, Joseph Bafumi, Boris Shor, Dept of Political
Science, Columbia UniversityI Samantha Cook, Zaiying Huang, Jouni Kerman, Shouhao
Zhao, Dept of Statistics, Columbia UniversityI Phillip Price, Energy and Environment Division, Lawrence
Berkeley National Laboratory
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Fitting and understanding multilevel models
I Some of my experiences with multilevel models
I Some challenges and solutions
I Lots of time for questionsI Collaborators:
I Iain Pardoe, Dept of Decision Sciences, University of OregonI David Park, Joseph Bafumi, Boris Shor, Dept of Political
Science, Columbia UniversityI Samantha Cook, Zaiying Huang, Jouni Kerman, Shouhao
Zhao, Dept of Statistics, Columbia UniversityI Phillip Price, Energy and Environment Division, Lawrence
Berkeley National Laboratory
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Plan of talk
I Rodents in NYC: apts within buildings within neighborhoods
I State-level opinions from national polls: mlm andpoststratification
I Mlm when number of groups is small
I Finite-population and superpopulation inference
I Understanding a fitted multilevel regression: Anova, averagepredictive effects, partial pooling, and R2
I Why I don’t use the terms “fixed” and “random” effects
I Questions . . .
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Plan of talk
I Rodents in NYC: apts within buildings within neighborhoods
I State-level opinions from national polls: mlm andpoststratification
I Mlm when number of groups is small
I Finite-population and superpopulation inference
I Understanding a fitted multilevel regression: Anova, averagepredictive effects, partial pooling, and R2
I Why I don’t use the terms “fixed” and “random” effects
I Questions . . .
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Plan of talk
I Rodents in NYC: apts within buildings within neighborhoods
I State-level opinions from national polls: mlm andpoststratification
I Mlm when number of groups is small
I Finite-population and superpopulation inference
I Understanding a fitted multilevel regression: Anova, averagepredictive effects, partial pooling, and R2
I Why I don’t use the terms “fixed” and “random” effects
I Questions . . .
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Plan of talk
I Rodents in NYC: apts within buildings within neighborhoods
I State-level opinions from national polls: mlm andpoststratification
I Mlm when number of groups is small
I Finite-population and superpopulation inference
I Understanding a fitted multilevel regression: Anova, averagepredictive effects, partial pooling, and R2
I Why I don’t use the terms “fixed” and “random” effects
I Questions . . .
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Plan of talk
I Rodents in NYC: apts within buildings within neighborhoods
I State-level opinions from national polls: mlm andpoststratification
I Mlm when number of groups is small
I Finite-population and superpopulation inference
I Understanding a fitted multilevel regression: Anova, averagepredictive effects, partial pooling, and R2
I Why I don’t use the terms “fixed” and “random” effects
I Questions . . .
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Plan of talk
I Rodents in NYC: apts within buildings within neighborhoods
I State-level opinions from national polls: mlm andpoststratification
I Mlm when number of groups is small
I Finite-population and superpopulation inference
I Understanding a fitted multilevel regression: Anova, averagepredictive effects, partial pooling, and R2
I Why I don’t use the terms “fixed” and “random” effects
I Questions . . .
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Plan of talk
I Rodents in NYC: apts within buildings within neighborhoods
I State-level opinions from national polls: mlm andpoststratification
I Mlm when number of groups is small
I Finite-population and superpopulation inference
I Understanding a fitted multilevel regression: Anova, averagepredictive effects, partial pooling, and R2
I Why I don’t use the terms “fixed” and “random” effects
I Questions . . .
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Plan of talk
I Rodents in NYC: apts within buildings within neighborhoods
I State-level opinions from national polls: mlm andpoststratification
I Mlm when number of groups is small
I Finite-population and superpopulation inference
I Understanding a fitted multilevel regression: Anova, averagepredictive effects, partial pooling, and R2
I Why I don’t use the terms “fixed” and “random” effects
I Questions . . .
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
NYC Dept of Health study
I Survey of 16000 apts in 9000 bldgs in 55 neighborhoods inNYC
I Do you have rodents?
I Hierarchical logistic regression:
Pr(yi = 1) = logit−1((Xβ)i + αbldg(i) + γneighborhood(i))
I Try to fit in WinBUGS, but too slow! Solutions:I Fit to subset of the data (900 apts in 500 bldgs)I Fit to all the data, separate model for each neighborhood
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
NYC Dept of Health study
I Survey of 16000 apts in 9000 bldgs in 55 neighborhoods inNYC
I Do you have rodents?
I Hierarchical logistic regression:
Pr(yi = 1) = logit−1((Xβ)i + αbldg(i) + γneighborhood(i))
I Try to fit in WinBUGS, but too slow! Solutions:I Fit to subset of the data (900 apts in 500 bldgs)I Fit to all the data, separate model for each neighborhood
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
NYC Dept of Health study
I Survey of 16000 apts in 9000 bldgs in 55 neighborhoods inNYC
I Do you have rodents?
I Hierarchical logistic regression:
Pr(yi = 1) = logit−1((Xβ)i + αbldg(i) + γneighborhood(i))
I Try to fit in WinBUGS, but too slow! Solutions:I Fit to subset of the data (900 apts in 500 bldgs)I Fit to all the data, separate model for each neighborhood
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
NYC Dept of Health study
I Survey of 16000 apts in 9000 bldgs in 55 neighborhoods inNYC
I Do you have rodents?
I Hierarchical logistic regression:
Pr(yi = 1) = logit−1((Xβ)i + αbldg(i) + γneighborhood(i))
I Try to fit in WinBUGS, but too slow! Solutions:I Fit to subset of the data (900 apts in 500 bldgs)I Fit to all the data, separate model for each neighborhood
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
NYC Dept of Health study
I Survey of 16000 apts in 9000 bldgs in 55 neighborhoods inNYC
I Do you have rodents?
I Hierarchical logistic regression:
Pr(yi = 1) = logit−1((Xβ)i + αbldg(i) + γneighborhood(i))
I Try to fit in WinBUGS, but too slow! Solutions:I Fit to subset of the data (900 apts in 500 bldgs)I Fit to all the data, separate model for each neighborhood
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
NYC Dept of Health study
I Survey of 16000 apts in 9000 bldgs in 55 neighborhoods inNYC
I Do you have rodents?
I Hierarchical logistic regression:
Pr(yi = 1) = logit−1((Xβ)i + αbldg(i) + γneighborhood(i))
I Try to fit in WinBUGS, but too slow! Solutions:I Fit to subset of the data (900 apts in 500 bldgs)I Fit to all the data, separate model for each neighborhood
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
NYC Dept of Health study
I Survey of 16000 apts in 9000 bldgs in 55 neighborhoods inNYC
I Do you have rodents?
I Hierarchical logistic regression:
Pr(yi = 1) = logit−1((Xβ)i + αbldg(i) + γneighborhood(i))
I Try to fit in WinBUGS, but too slow! Solutions:I Fit to subset of the data (900 apts in 500 bldgs)I Fit to all the data, separate model for each neighborhood
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
National opinion trends
1940 1950 1960 1970 1980 1990 2000
5060
7080
Year
Per
cent
age
supp
ort f
or th
e de
ath
pena
lty
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
State-level opinion trends
I Goal: estimating time series within each state
I One poll at a time: small-area estimation
I It works! Validated for pre-election polls
I Combining surveys: model for parallel time series
I Multilevel modeling + poststratification
I Poststratification cells: sex × ethnicity × age × education ×state
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
State-level opinion trends
I Goal: estimating time series within each state
I One poll at a time: small-area estimation
I It works! Validated for pre-election polls
I Combining surveys: model for parallel time series
I Multilevel modeling + poststratification
I Poststratification cells: sex × ethnicity × age × education ×state
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
State-level opinion trends
I Goal: estimating time series within each state
I One poll at a time: small-area estimation
I It works! Validated for pre-election polls
I Combining surveys: model for parallel time series
I Multilevel modeling + poststratification
I Poststratification cells: sex × ethnicity × age × education ×state
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
State-level opinion trends
I Goal: estimating time series within each state
I One poll at a time: small-area estimation
I It works! Validated for pre-election polls
I Combining surveys: model for parallel time series
I Multilevel modeling + poststratification
I Poststratification cells: sex × ethnicity × age × education ×state
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
State-level opinion trends
I Goal: estimating time series within each state
I One poll at a time: small-area estimation
I It works! Validated for pre-election polls
I Combining surveys: model for parallel time series
I Multilevel modeling + poststratification
I Poststratification cells: sex × ethnicity × age × education ×state
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
State-level opinion trends
I Goal: estimating time series within each state
I One poll at a time: small-area estimation
I It works! Validated for pre-election polls
I Combining surveys: model for parallel time series
I Multilevel modeling + poststratification
I Poststratification cells: sex × ethnicity × age × education ×state
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
State-level opinion trends
I Goal: estimating time series within each state
I One poll at a time: small-area estimation
I It works! Validated for pre-election polls
I Combining surveys: model for parallel time series
I Multilevel modeling + poststratification
I Poststratification cells: sex × ethnicity × age × education ×state
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Multilevel modeling of opinions
I Logistic regression: Pr(yi = 1) = logit−1((Xβ)i )
I X includes demographic and geographic predictors
I Group-level model for the 16 age × education predictors
I Group-level model for the 50 state predictors
I Bayesian inference, summarize by posterior simulations of β:Simulation θ1 · · · θ75
1 ** · · · **...
.... . .
...1000 ** · · · **
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Multilevel modeling of opinions
I Logistic regression: Pr(yi = 1) = logit−1((Xβ)i )
I X includes demographic and geographic predictors
I Group-level model for the 16 age × education predictors
I Group-level model for the 50 state predictors
I Bayesian inference, summarize by posterior simulations of β:Simulation θ1 · · · θ75
1 ** · · · **...
.... . .
...1000 ** · · · **
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Multilevel modeling of opinions
I Logistic regression: Pr(yi = 1) = logit−1((Xβ)i )
I X includes demographic and geographic predictors
I Group-level model for the 16 age × education predictors
I Group-level model for the 50 state predictors
I Bayesian inference, summarize by posterior simulations of β:Simulation θ1 · · · θ75
1 ** · · · **...
.... . .
...1000 ** · · · **
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Multilevel modeling of opinions
I Logistic regression: Pr(yi = 1) = logit−1((Xβ)i )
I X includes demographic and geographic predictors
I Group-level model for the 16 age × education predictors
I Group-level model for the 50 state predictors
I Bayesian inference, summarize by posterior simulations of β:Simulation θ1 · · · θ75
1 ** · · · **...
.... . .
...1000 ** · · · **
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Multilevel modeling of opinions
I Logistic regression: Pr(yi = 1) = logit−1((Xβ)i )
I X includes demographic and geographic predictors
I Group-level model for the 16 age × education predictors
I Group-level model for the 50 state predictors
I Bayesian inference, summarize by posterior simulations of β:Simulation θ1 · · · θ75
1 ** · · · **...
.... . .
...1000 ** · · · **
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Multilevel modeling of opinions
I Logistic regression: Pr(yi = 1) = logit−1((Xβ)i )
I X includes demographic and geographic predictors
I Group-level model for the 16 age × education predictors
I Group-level model for the 50 state predictors
I Bayesian inference, summarize by posterior simulations of β:Simulation θ1 · · · θ75
1 ** · · · **...
.... . .
...1000 ** · · · **
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Interlude: why “multilevel” 6= “hierarchical”
I Logistic regression: Pr(yi = 1) = logit−1((Xβ)i )
I X includes demographic and geographic predictors
I Group-level model for the 16 age × education predictors
I Group-level model for the 50 state predictors
I Crossed (nonnested) structure of age, education, state
I Several overlapping “hierarchies”
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Interlude: why “multilevel” 6= “hierarchical”
I Logistic regression: Pr(yi = 1) = logit−1((Xβ)i )
I X includes demographic and geographic predictors
I Group-level model for the 16 age × education predictors
I Group-level model for the 50 state predictors
I Crossed (nonnested) structure of age, education, state
I Several overlapping “hierarchies”
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Interlude: why “multilevel” 6= “hierarchical”
I Logistic regression: Pr(yi = 1) = logit−1((Xβ)i )
I X includes demographic and geographic predictors
I Group-level model for the 16 age × education predictors
I Group-level model for the 50 state predictors
I Crossed (nonnested) structure of age, education, state
I Several overlapping “hierarchies”
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Interlude: why “multilevel” 6= “hierarchical”
I Logistic regression: Pr(yi = 1) = logit−1((Xβ)i )
I X includes demographic and geographic predictors
I Group-level model for the 16 age × education predictors
I Group-level model for the 50 state predictors
I Crossed (nonnested) structure of age, education, state
I Several overlapping “hierarchies”
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Interlude: why “multilevel” 6= “hierarchical”
I Logistic regression: Pr(yi = 1) = logit−1((Xβ)i )
I X includes demographic and geographic predictors
I Group-level model for the 16 age × education predictors
I Group-level model for the 50 state predictors
I Crossed (nonnested) structure of age, education, state
I Several overlapping “hierarchies”
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Interlude: why “multilevel” 6= “hierarchical”
I Logistic regression: Pr(yi = 1) = logit−1((Xβ)i )
I X includes demographic and geographic predictors
I Group-level model for the 16 age × education predictors
I Group-level model for the 50 state predictors
I Crossed (nonnested) structure of age, education, state
I Several overlapping “hierarchies”
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Interlude: why “multilevel” 6= “hierarchical”
I Logistic regression: Pr(yi = 1) = logit−1((Xβ)i )
I X includes demographic and geographic predictors
I Group-level model for the 16 age × education predictors
I Group-level model for the 50 state predictors
I Crossed (nonnested) structure of age, education, state
I Several overlapping “hierarchies”
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Poststratification to estimate state opinions
I Implied inference for θj = logit−1(Xβ) in each of 3264 cells j(e.g., black female, age 18–29, college graduate, Georgia)
I PoststratificationI Within each state s, average over 64 cells:∑
j∈s Njθj
/ ∑j∈s Nj
I Nj = population in cell j (from Census)I 1000 simulation draws propagate to uncertainty for each θj
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Poststratification to estimate state opinions
I Implied inference for θj = logit−1(Xβ) in each of 3264 cells j(e.g., black female, age 18–29, college graduate, Georgia)
I PoststratificationI Within each state s, average over 64 cells:∑
j∈s Njθj
/ ∑j∈s Nj
I Nj = population in cell j (from Census)I 1000 simulation draws propagate to uncertainty for each θj
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Poststratification to estimate state opinions
I Implied inference for θj = logit−1(Xβ) in each of 3264 cells j(e.g., black female, age 18–29, college graduate, Georgia)
I PoststratificationI Within each state s, average over 64 cells:∑
j∈s Njθj
/ ∑j∈s Nj
I Nj = population in cell j (from Census)I 1000 simulation draws propagate to uncertainty for each θj
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Poststratification to estimate state opinions
I Implied inference for θj = logit−1(Xβ) in each of 3264 cells j(e.g., black female, age 18–29, college graduate, Georgia)
I PoststratificationI Within each state s, average over 64 cells:∑
j∈s Njθj
/ ∑j∈s Nj
I Nj = population in cell j (from Census)I 1000 simulation draws propagate to uncertainty for each θj
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Poststratification to estimate state opinions
I Implied inference for θj = logit−1(Xβ) in each of 3264 cells j(e.g., black female, age 18–29, college graduate, Georgia)
I PoststratificationI Within each state s, average over 64 cells:∑
j∈s Njθj
/ ∑j∈s Nj
I Nj = population in cell j (from Census)I 1000 simulation draws propagate to uncertainty for each θj
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Poststratification to estimate state opinions
I Implied inference for θj = logit−1(Xβ) in each of 3264 cells j(e.g., black female, age 18–29, college graduate, Georgia)
I PoststratificationI Within each state s, average over 64 cells:∑
j∈s Njθj
/ ∑j∈s Nj
I Nj = population in cell j (from Census)I 1000 simulation draws propagate to uncertainty for each θj
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
CBS/New York Times pre-election polls from 1988
I Validation study: fit model on poll data and compare toelection results
I Competing estimates:I No pooling: separate estimate within each stateI Complete pooling: no state predictorsI Hierarchical model and poststratify
I Mean absolute state errors:I No pooling: 10.4%I Complete pooling: 5.4%I Hierarchical model with poststratification: 4.5%
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
CBS/New York Times pre-election polls from 1988
I Validation study: fit model on poll data and compare toelection results
I Competing estimates:I No pooling: separate estimate within each stateI Complete pooling: no state predictorsI Hierarchical model and poststratify
I Mean absolute state errors:I No pooling: 10.4%I Complete pooling: 5.4%I Hierarchical model with poststratification: 4.5%
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
CBS/New York Times pre-election polls from 1988
I Validation study: fit model on poll data and compare toelection results
I Competing estimates:I No pooling: separate estimate within each stateI Complete pooling: no state predictorsI Hierarchical model and poststratify
I Mean absolute state errors:I No pooling: 10.4%I Complete pooling: 5.4%I Hierarchical model with poststratification: 4.5%
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
CBS/New York Times pre-election polls from 1988
I Validation study: fit model on poll data and compare toelection results
I Competing estimates:I No pooling: separate estimate within each stateI Complete pooling: no state predictorsI Hierarchical model and poststratify
I Mean absolute state errors:I No pooling: 10.4%I Complete pooling: 5.4%I Hierarchical model with poststratification: 4.5%
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
CBS/New York Times pre-election polls from 1988
I Validation study: fit model on poll data and compare toelection results
I Competing estimates:I No pooling: separate estimate within each stateI Complete pooling: no state predictorsI Hierarchical model and poststratify
I Mean absolute state errors:I No pooling: 10.4%I Complete pooling: 5.4%I Hierarchical model with poststratification: 4.5%
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
CBS/New York Times pre-election polls from 1988
I Validation study: fit model on poll data and compare toelection results
I Competing estimates:I No pooling: separate estimate within each stateI Complete pooling: no state predictorsI Hierarchical model and poststratify
I Mean absolute state errors:I No pooling: 10.4%I Complete pooling: 5.4%I Hierarchical model with poststratification: 4.5%
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
CBS/New York Times pre-election polls from 1988
I Validation study: fit model on poll data and compare toelection results
I Competing estimates:I No pooling: separate estimate within each stateI Complete pooling: no state predictorsI Hierarchical model and poststratify
I Mean absolute state errors:I No pooling: 10.4%I Complete pooling: 5.4%I Hierarchical model with poststratification: 4.5%
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
CBS/New York Times pre-election polls from 1988
I Validation study: fit model on poll data and compare toelection results
I Competing estimates:I No pooling: separate estimate within each stateI Complete pooling: no state predictorsI Hierarchical model and poststratify
I Mean absolute state errors:I No pooling: 10.4%I Complete pooling: 5.4%I Hierarchical model with poststratification: 4.5%
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
CBS/New York Times pre-election polls from 1988
I Validation study: fit model on poll data and compare toelection results
I Competing estimates:I No pooling: separate estimate within each stateI Complete pooling: no state predictorsI Hierarchical model and poststratify
I Mean absolute state errors:I No pooling: 10.4%I Complete pooling: 5.4%I Hierarchical model with poststratification: 4.5%
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
CBS/New York Times pre-election polls from 1988
I Validation study: fit model on poll data and compare toelection results
I Competing estimates:I No pooling: separate estimate within each stateI Complete pooling: no state predictorsI Hierarchical model and poststratify
I Mean absolute state errors:I No pooling: 10.4%I Complete pooling: 5.4%I Hierarchical model with poststratification: 4.5%
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Validation study: comparison of state errors
1988 election outcome vs. poll estimate
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
no pooling of state effects
Estimated Bush support
Act
ual e
lect
ion
outc
ome
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
complete pooling (no state effects)
Estimated Bush support
Act
ual e
lect
ion
outc
ome
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
multilevel model
Estimated Bush support
Act
ual e
lect
ion
outc
ome
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
How many groups do you need to fit a mlm?
I 9000 bldgs, 55 neighborhoods, 50 states: that’s okI But why do mlm with only 4 categories?
I Age 18–29, 30–44, 45–64, 65+I Education less than HS, HS, some college, college grad
I Simple to set up as mlm
I No need to choose a “baseline” category”
I Extends to interactions (16 age × education categories)
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
How many groups do you need to fit a mlm?
I 9000 bldgs, 55 neighborhoods, 50 states: that’s okI But why do mlm with only 4 categories?
I Age 18–29, 30–44, 45–64, 65+I Education less than HS, HS, some college, college grad
I Simple to set up as mlm
I No need to choose a “baseline” category”
I Extends to interactions (16 age × education categories)
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
How many groups do you need to fit a mlm?
I 9000 bldgs, 55 neighborhoods, 50 states: that’s okI But why do mlm with only 4 categories?
I Age 18–29, 30–44, 45–64, 65+I Education less than HS, HS, some college, college grad
I Simple to set up as mlm
I No need to choose a “baseline” category”
I Extends to interactions (16 age × education categories)
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
How many groups do you need to fit a mlm?
I 9000 bldgs, 55 neighborhoods, 50 states: that’s okI But why do mlm with only 4 categories?
I Age 18–29, 30–44, 45–64, 65+I Education less than HS, HS, some college, college grad
I Simple to set up as mlm
I No need to choose a “baseline” category”
I Extends to interactions (16 age × education categories)
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
How many groups do you need to fit a mlm?
I 9000 bldgs, 55 neighborhoods, 50 states: that’s okI But why do mlm with only 4 categories?
I Age 18–29, 30–44, 45–64, 65+I Education less than HS, HS, some college, college grad
I Simple to set up as mlm
I No need to choose a “baseline” category”
I Extends to interactions (16 age × education categories)
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
How many groups do you need to fit a mlm?
I 9000 bldgs, 55 neighborhoods, 50 states: that’s okI But why do mlm with only 4 categories?
I Age 18–29, 30–44, 45–64, 65+I Education less than HS, HS, some college, college grad
I Simple to set up as mlm
I No need to choose a “baseline” category”
I Extends to interactions (16 age × education categories)
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
How many groups do you need to fit a mlm?
I 9000 bldgs, 55 neighborhoods, 50 states: that’s okI But why do mlm with only 4 categories?
I Age 18–29, 30–44, 45–64, 65+I Education less than HS, HS, some college, college grad
I Simple to set up as mlm
I No need to choose a “baseline” category”
I Extends to interactions (16 age × education categories)
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
How many groups do you need to fit a mlm?
I 9000 bldgs, 55 neighborhoods, 50 states: that’s okI But why do mlm with only 4 categories?
I Age 18–29, 30–44, 45–64, 65+I Education less than HS, HS, some college, college grad
I Simple to set up as mlm
I No need to choose a “baseline” category”
I Extends to interactions (16 age × education categories)
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Finite-population and superpopulation estimands
I Consider the 4 coefficients, βage1 , . . . , βage
4
I Finite-population centering:
β̃agej = βage
j − β̄age, for j = 1, . . . , 4
β̃0 = β0 + β̄age
I Adjusted parameters are more precisely estimated
I Especially when # of groups is smallI Sd of group effects
I βagej ∼ N(0, σ2
age), for j=1,. . . ,4I Superpopulation sd: σage
I Finite-population sd:√
13
∑4j=1(β
agej − β̄age)2
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Finite-population and superpopulation estimands
I Consider the 4 coefficients, βage1 , . . . , βage
4
I Finite-population centering:
β̃agej = βage
j − β̄age, for j = 1, . . . , 4
β̃0 = β0 + β̄age
I Adjusted parameters are more precisely estimated
I Especially when # of groups is smallI Sd of group effects
I βagej ∼ N(0, σ2
age), for j=1,. . . ,4I Superpopulation sd: σage
I Finite-population sd:√
13
∑4j=1(β
agej − β̄age)2
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Finite-population and superpopulation estimands
I Consider the 4 coefficients, βage1 , . . . , βage
4
I Finite-population centering:
β̃agej = βage
j − β̄age, for j = 1, . . . , 4
β̃0 = β0 + β̄age
I Adjusted parameters are more precisely estimated
I Especially when # of groups is smallI Sd of group effects
I βagej ∼ N(0, σ2
age), for j=1,. . . ,4I Superpopulation sd: σage
I Finite-population sd:√
13
∑4j=1(β
agej − β̄age)2
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Finite-population and superpopulation estimands
I Consider the 4 coefficients, βage1 , . . . , βage
4
I Finite-population centering:
β̃agej = βage
j − β̄age, for j = 1, . . . , 4
β̃0 = β0 + β̄age
I Adjusted parameters are more precisely estimated
I Especially when # of groups is smallI Sd of group effects
I βagej ∼ N(0, σ2
age), for j=1,. . . ,4I Superpopulation sd: σage
I Finite-population sd:√
13
∑4j=1(β
agej − β̄age)2
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Finite-population and superpopulation estimands
I Consider the 4 coefficients, βage1 , . . . , βage
4
I Finite-population centering:
β̃agej = βage
j − β̄age, for j = 1, . . . , 4
β̃0 = β0 + β̄age
I Adjusted parameters are more precisely estimated
I Especially when # of groups is smallI Sd of group effects
I βagej ∼ N(0, σ2
age), for j=1,. . . ,4I Superpopulation sd: σage
I Finite-population sd:√
13
∑4j=1(β
agej − β̄age)2
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Finite-population and superpopulation estimands
I Consider the 4 coefficients, βage1 , . . . , βage
4
I Finite-population centering:
β̃agej = βage
j − β̄age, for j = 1, . . . , 4
β̃0 = β0 + β̄age
I Adjusted parameters are more precisely estimated
I Especially when # of groups is smallI Sd of group effects
I βagej ∼ N(0, σ2
age), for j=1,. . . ,4I Superpopulation sd: σage
I Finite-population sd:√
13
∑4j=1(β
agej − β̄age)2
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Finite-population and superpopulation estimands
I Consider the 4 coefficients, βage1 , . . . , βage
4
I Finite-population centering:
β̃agej = βage
j − β̄age, for j = 1, . . . , 4
β̃0 = β0 + β̄age
I Adjusted parameters are more precisely estimated
I Especially when # of groups is smallI Sd of group effects
I βagej ∼ N(0, σ2
age), for j=1,. . . ,4I Superpopulation sd: σage
I Finite-population sd:√
13
∑4j=1(β
agej − β̄age)2
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Finite-population and superpopulation estimands
I Consider the 4 coefficients, βage1 , . . . , βage
4
I Finite-population centering:
β̃agej = βage
j − β̄age, for j = 1, . . . , 4
β̃0 = β0 + β̄age
I Adjusted parameters are more precisely estimated
I Especially when # of groups is smallI Sd of group effects
I βagej ∼ N(0, σ2
age), for j=1,. . . ,4I Superpopulation sd: σage
I Finite-population sd:√
13
∑4j=1(β
agej − β̄age)2
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Finite-population and superpopulation estimands
I Consider the 4 coefficients, βage1 , . . . , βage
4
I Finite-population centering:
β̃agej = βage
j − β̄age, for j = 1, . . . , 4
β̃0 = β0 + β̄age
I Adjusted parameters are more precisely estimated
I Especially when # of groups is smallI Sd of group effects
I βagej ∼ N(0, σ2
age), for j=1,. . . ,4I Superpopulation sd: σage
I Finite-population sd:√
13
∑4j=1(β
agej − β̄age)2
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Example of finite-pop and superpop ests
1 2 3 4 5 6 7 8
zero−centered parameters, δkadj
airport, k
δ kadj
−0.
50.
00.
5
1 2 3 4 5 6 7 8
uncentered parameters, δk
airport, k
δ k−
0.5
0.0
0.5
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Redundant parameterization
I Data model: Pr(yi = 1) = logit−1(β0 + βage
age(i) + βstatestate(i)
)I Usual model for the coefficients:
βagej ∼ N(0, σ2
age), for j = 1, . . . , 4
βstatej ∼ N(0, σ2
state), for j = 1, . . . , 50
I Additively redundant model:
βagej ∼ N(µage, σ
2age), for j = 1, . . . , 4
βstatej ∼ N(µstate, σ
2state), for j = 1, . . . , 50
I Why add the redundant µage, µstate?I Iterative algorithm moves more smoothly
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Redundant parameterization
I Data model: Pr(yi = 1) = logit−1(β0 + βage
age(i) + βstatestate(i)
)I Usual model for the coefficients:
βagej ∼ N(0, σ2
age), for j = 1, . . . , 4
βstatej ∼ N(0, σ2
state), for j = 1, . . . , 50
I Additively redundant model:
βagej ∼ N(µage, σ
2age), for j = 1, . . . , 4
βstatej ∼ N(µstate, σ
2state), for j = 1, . . . , 50
I Why add the redundant µage, µstate?I Iterative algorithm moves more smoothly
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Redundant parameterization
I Data model: Pr(yi = 1) = logit−1(β0 + βage
age(i) + βstatestate(i)
)I Usual model for the coefficients:
βagej ∼ N(0, σ2
age), for j = 1, . . . , 4
βstatej ∼ N(0, σ2
state), for j = 1, . . . , 50
I Additively redundant model:
βagej ∼ N(µage, σ
2age), for j = 1, . . . , 4
βstatej ∼ N(µstate, σ
2state), for j = 1, . . . , 50
I Why add the redundant µage, µstate?I Iterative algorithm moves more smoothly
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Redundant parameterization
I Data model: Pr(yi = 1) = logit−1(β0 + βage
age(i) + βstatestate(i)
)I Usual model for the coefficients:
βagej ∼ N(0, σ2
age), for j = 1, . . . , 4
βstatej ∼ N(0, σ2
state), for j = 1, . . . , 50
I Additively redundant model:
βagej ∼ N(µage, σ
2age), for j = 1, . . . , 4
βstatej ∼ N(µstate, σ
2state), for j = 1, . . . , 50
I Why add the redundant µage, µstate?I Iterative algorithm moves more smoothly
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Redundant parameterization
I Data model: Pr(yi = 1) = logit−1(β0 + βage
age(i) + βstatestate(i)
)I Usual model for the coefficients:
βagej ∼ N(0, σ2
age), for j = 1, . . . , 4
βstatej ∼ N(0, σ2
state), for j = 1, . . . , 50
I Additively redundant model:
βagej ∼ N(µage, σ
2age), for j = 1, . . . , 4
βstatej ∼ N(µstate, σ
2state), for j = 1, . . . , 50
I Why add the redundant µage, µstate?I Iterative algorithm moves more smoothly
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Redundant parameterization
I Data model: Pr(yi = 1) = logit−1(β0 + βage
age(i) + βstatestate(i)
)I Usual model for the coefficients:
βagej ∼ N(0, σ2
age), for j = 1, . . . , 4
βstatej ∼ N(0, σ2
state), for j = 1, . . . , 50
I Additively redundant model:
βagej ∼ N(µage, σ
2age), for j = 1, . . . , 4
βstatej ∼ N(µstate, σ
2state), for j = 1, . . . , 50
I Why add the redundant µage, µstate?I Iterative algorithm moves more smoothly
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Motivation for redundant parameterization
80% interval for each chain R−hat
−3
−3
−2
−2
−1
−1
0
0
1
1
2
2
3
3
4
4
1 1.5 2+
1 1.5 2+
1 1.5 2+
1 1.5 2+
1 1.5 2+
1 1.5 2+
mu
eta[1][2][3][4][5][6][7][8][9][10][11][12][13][14][15][16][17][18][19][20][21][22][23][24][25][26][27][28][29][30][31][32][33][34][35][36][37][38][39][40][41][42][43][44][45]
sigma.y
sigma.eta
*
* array truncated for lack of space
medians and 80% intervals
mu
1
3.5
eta
−3
1
111111111 222222222 333333333 444444444 555555555 666666666 777777777 888888888 999999999101010101010101010 121212121212121212 141414141414141414 161616161616161616 181818181818181818 202020202020202020 222222222222222222 242424242424242424 262626262626262626 282828282828282828 303030303030303030 323232323232323232 343434343434343434 363636363636363636 383838383838383838 404040404040404040
*
sigma.y
0.77
0.83
sigma.eta
0
2
deviance
2170
2220
Bugs model at "C:/research/radon/radon.anova.1.txt", 3 chains, each with 100 iterations
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Redundant additive parameterization
I Model
Pr(yi = 1) = logit−1(β0 + βage
age(i) + βstatestate(i)
)βage
j ∼ N(µage, σ2age), for j = 1, . . . , 4
βstatej ∼ N(µstate, σ
2state), for j = 1, . . . , 50
I Identify using centered parameters:
β̃agej = βage
j − β̄age, for j = 1, . . . , 4
β̃statej = βstate
j − β̄state, for j = 1, . . . , 50
I Redefine the constant term:
β̃0 = β0 + β̄age + β̄age
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Redundant additive parameterization
I Model
Pr(yi = 1) = logit−1(β0 + βage
age(i) + βstatestate(i)
)βage
j ∼ N(µage, σ2age), for j = 1, . . . , 4
βstatej ∼ N(µstate, σ
2state), for j = 1, . . . , 50
I Identify using centered parameters:
β̃agej = βage
j − β̄age, for j = 1, . . . , 4
β̃statej = βstate
j − β̄state, for j = 1, . . . , 50
I Redefine the constant term:
β̃0 = β0 + β̄age + β̄age
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Redundant additive parameterization
I Model
Pr(yi = 1) = logit−1(β0 + βage
age(i) + βstatestate(i)
)βage
j ∼ N(µage, σ2age), for j = 1, . . . , 4
βstatej ∼ N(µstate, σ
2state), for j = 1, . . . , 50
I Identify using centered parameters:
β̃agej = βage
j − β̄age, for j = 1, . . . , 4
β̃statej = βstate
j − β̄state, for j = 1, . . . , 50
I Redefine the constant term:
β̃0 = β0 + β̄age + β̄age
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Redundant additive parameterization
I Model
Pr(yi = 1) = logit−1(β0 + βage
age(i) + βstatestate(i)
)βage
j ∼ N(µage, σ2age), for j = 1, . . . , 4
βstatej ∼ N(µstate, σ
2state), for j = 1, . . . , 50
I Identify using centered parameters:
β̃agej = βage
j − β̄age, for j = 1, . . . , 4
β̃statej = βstate
j − β̄state, for j = 1, . . . , 50
I Redefine the constant term:
β̃0 = β0 + β̄age + β̄age
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Redundant multiplicative parameterization
I New model
Pr(yi = 1) = logit−1(β0 + ξageβage
age(i) + ξstateβstatestate(i)
)βage
j ∼ N(µage, σ2age), for j = 1, . . . , 4
βstatej ∼ N(µstate, σ
2state), for j = 1, . . . , 50
I Identify using centered and scaled parameters:
β̃agej = ξage(βage
j − β̄age), for j = 1, . . . , 4
β̃statej = ξstate
(βstate
j − β̄state), for j = 1, . . . , 50
I Faster convergence
I More general model, connections to factor analysis
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Redundant multiplicative parameterization
I New model
Pr(yi = 1) = logit−1(β0 + ξageβage
age(i) + ξstateβstatestate(i)
)βage
j ∼ N(µage, σ2age), for j = 1, . . . , 4
βstatej ∼ N(µstate, σ
2state), for j = 1, . . . , 50
I Identify using centered and scaled parameters:
β̃agej = ξage(βage
j − β̄age), for j = 1, . . . , 4
β̃statej = ξstate
(βstate
j − β̄state), for j = 1, . . . , 50
I Faster convergence
I More general model, connections to factor analysis
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Redundant multiplicative parameterization
I New model
Pr(yi = 1) = logit−1(β0 + ξageβage
age(i) + ξstateβstatestate(i)
)βage
j ∼ N(µage, σ2age), for j = 1, . . . , 4
βstatej ∼ N(µstate, σ
2state), for j = 1, . . . , 50
I Identify using centered and scaled parameters:
β̃agej = ξage(βage
j − β̄age), for j = 1, . . . , 4
β̃statej = ξstate
(βstate
j − β̄state), for j = 1, . . . , 50
I Faster convergence
I More general model, connections to factor analysis
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Redundant multiplicative parameterization
I New model
Pr(yi = 1) = logit−1(β0 + ξageβage
age(i) + ξstateβstatestate(i)
)βage
j ∼ N(µage, σ2age), for j = 1, . . . , 4
βstatej ∼ N(µstate, σ
2state), for j = 1, . . . , 50
I Identify using centered and scaled parameters:
β̃agej = ξage(βage
j − β̄age), for j = 1, . . . , 4
β̃statej = ξstate
(βstate
j − β̄state), for j = 1, . . . , 50
I Faster convergence
I More general model, connections to factor analysis
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
Redundant multiplicative parameterization
I New model
Pr(yi = 1) = logit−1(β0 + ξageβage
age(i) + ξstateβstatestate(i)
)βage
j ∼ N(µage, σ2age), for j = 1, . . . , 4
βstatej ∼ N(µstate, σ
2state), for j = 1, . . . , 50
I Identify using centered and scaled parameters:
β̃agej = ξage(βage
j − β̄age), for j = 1, . . . , 4
β̃statej = ξstate
(βstate
j − β̄state), for j = 1, . . . , 50
I Faster convergence
I More general model, connections to factor analysis
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
MLM and partial pooling
I Goal is to more accurately estimate coefficients that aregrouped
I A reparameterization can change a model(even if it leaves the likelihood unchanged)
I Redundant additive parameterization
I Redundant multiplicative parameterization
I Weakly-informative prior distribution for group-level varianceparameters
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
MLM and partial pooling
I Goal is to more accurately estimate coefficients that aregrouped
I A reparameterization can change a model(even if it leaves the likelihood unchanged)
I Redundant additive parameterization
I Redundant multiplicative parameterization
I Weakly-informative prior distribution for group-level varianceparameters
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
MLM and partial pooling
I Goal is to more accurately estimate coefficients that aregrouped
I A reparameterization can change a model(even if it leaves the likelihood unchanged)
I Redundant additive parameterization
I Redundant multiplicative parameterization
I Weakly-informative prior distribution for group-level varianceparameters
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
MLM and partial pooling
I Goal is to more accurately estimate coefficients that aregrouped
I A reparameterization can change a model(even if it leaves the likelihood unchanged)
I Redundant additive parameterization
I Redundant multiplicative parameterization
I Weakly-informative prior distribution for group-level varianceparameters
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
RodentsOpinionsMLM with few groups
MLM and partial pooling
I Goal is to more accurately estimate coefficients that aregrouped
I A reparameterization can change a model(even if it leaves the likelihood unchanged)
I Redundant additive parameterization
I Redundant multiplicative parameterization
I Weakly-informative prior distribution for group-level varianceparameters
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Displaying and summarizing inferences
I Displaying parameters in groups rather than as a long list
I Analysis of variance
I Average predictive effects
I R2 and partial pooling factors
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Displaying and summarizing inferences
I Displaying parameters in groups rather than as a long list
I Analysis of variance
I Average predictive effects
I R2 and partial pooling factors
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Displaying and summarizing inferences
I Displaying parameters in groups rather than as a long list
I Analysis of variance
I Average predictive effects
I R2 and partial pooling factors
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Displaying and summarizing inferences
I Displaying parameters in groups rather than as a long list
I Analysis of variance
I Average predictive effects
I R2 and partial pooling factors
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Displaying and summarizing inferences
I Displaying parameters in groups rather than as a long list
I Analysis of variance
I Average predictive effects
I R2 and partial pooling factors
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Raw display of inference
mean sd 2.5% 25% 50% 75% 97.5% Rhat n.eff
B.0 0.402 0.147 0.044 0.326 0.413 0.499 0.652 1.024 110
b.female -0.094 0.102 -0.283 -0.162 -0.095 -0.034 0.107 1.001 1000
b.black -1.701 0.305 -2.323 -1.910 -1.691 -1.486 -1.152 1.014 500
b.female.black -0.143 0.393 -0.834 -0.383 -0.155 0.104 0.620 1.007 1000
B.age[1] 0.084 0.088 -0.053 0.012 0.075 0.140 0.277 1.062 45
B.age[2] -0.072 0.087 -0.260 -0.121 -0.054 -0.006 0.052 1.017 190
B.age[3] 0.044 0.077 -0.105 -0.007 0.038 0.095 0.203 1.029 130
B.age[4] -0.057 0.096 -0.265 -0.115 -0.052 0.001 0.133 1.076 32
B.edu[1] -0.148 0.131 -0.436 -0.241 -0.137 -0.044 0.053 1.074 31
B.edu[2] -0.022 0.082 -0.182 -0.069 -0.021 0.025 0.152 1.028 160
B.edu[3] 0.148 0.112 -0.032 0.065 0.142 0.228 0.370 1.049 45
B.edu[4] 0.023 0.090 -0.170 -0.030 0.015 0.074 0.224 1.061 37
B.age.edu[1,1] -0.044 0.133 -0.363 -0.104 -0.019 0.025 0.170 1.018 1000
B.age.edu[1,2] 0.059 0.123 -0.153 -0.011 0.032 0.118 0.353 1.016 580
B.age.edu[1,3] 0.049 0.124 -0.146 -0.023 0.022 0.104 0.349 1.015 280
B.age.edu[1,4] 0.001 0.116 -0.237 -0.061 0.000 0.052 0.280 1.010 1000
B.age.edu[2,1] 0.066 0.152 -0.208 -0.008 0.032 0.124 0.449 1.022 140
B.age.edu[2,2] -0.081 0.127 -0.407 -0.137 -0.055 0.001 0.094 1.057 120
B.age.edu[2,3] -0.004 0.102 -0.226 -0.048 0.000 0.041 0.215 1.008 940
B.age.edu[2,4] -0.042 0.108 -0.282 -0.100 -0.026 0.014 0.157 1.017 170
B.age.edu[3,1] 0.034 0.135 -0.215 -0.030 0.009 0.091 0.361 1.021 230
B.age.edu[3,2] 0.007 0.102 -0.213 -0.039 0.003 0.052 0.220 1.019 610
B.age.edu[3,3] 0.033 0.130 -0.215 -0.029 0.009 0.076 0.410 1.080 61
B.age.edu[3,4] -0.009 0.109 -0.236 -0.064 -0.005 0.043 0.214 1.024 150
B.age.edu[4,1] -0.141 0.190 -0.672 -0.224 -0.086 -0.003 0.100 1.036 270
B.age.edu[4,2] -0.014 0.119 -0.280 -0.059 -0.008 0.033 0.239 1.017 240
B.age.edu[4,3] 0.046 0.132 -0.192 -0.024 0.019 0.108 0.332 1.010 210
B.age.edu[4,4] 0.042 0.142 -0.193 -0.022 0.016 0.095 0.377 1.015 160
B.state[1] 0.201 0.211 -0.131 0.047 0.172 0.326 0.646 1.003 960
B.state[2] 0.466 0.252 0.008 0.310 0.440 0.603 1.047 1.001 1000
B.state[3] 0.393 0.196 0.023 0.268 0.380 0.518 0.814 1.002 1000
B.state[4] -0.164 0.209 -0.607 -0.290 -0.149 -0.041 0.228 1.003 590
B.state[5] -0.054 0.141 -0.322 -0.143 -0.061 0.035 0.229 1.001 1000
B.state[6] 0.126 0.206 -0.313 0.010 0.126 0.256 0.512 1.011 1000
B.state[7] 0.095 0.183 -0.263 -0.023 0.087 0.207 0.466 1.004 490
B.state[8] -0.210 0.207 -0.666 -0.322 -0.194 -0.080 0.155 1.001 1000
B.state[9] -2.648 0.728 -4.291 -3.067 -2.602 -2.187 -1.385 1.007 290
B.state[10] 0.097 0.173 -0.296 -0.010 0.115 0.214 0.402 1.014 270
B.state[11] -0.138 0.173 -0.467 -0.253 -0.148 -0.034 0.240 1.005 1000
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Raw graphical display
80% interval for each chain R−hat
−4
−4
−2
−2
0
0
2
2
1 1.5 2+
1 1.5 2+
1 1.5 2+
1 1.5 2+
1 1.5 2+
1 1.5 2+
B.0 ●●●
b.female ●●●
b.black ●●●
b.female.black ●●●
B.age[1] ●●●
[2] ●●●
[3] ●●●
[4] ●●●
B.edu[1] ●●●
[2] ●●●
[3] ●●●
[4] ●●●
B.age.edu[1,1] ●●●
[1,2] ●●●
[1,3] ●●●
[1,4] ●●●
[2,1] ●●●
[2,2] ●●●
[2,3] ●●●
[2,4] ●●●
[3,1] ●●●
[3,2] ●●●
B.state[1] ●●●
[2] ●●●
[3] ●●●
[4] ●●●
[5] ●●●
[6] ●●●
[7] ●●●
[8] ●●●
[9] ●●●
[10] ●●●
B.region[1] ●●●
[2] ●●●
[3] ●●●
[4] ●●●
[5] ●●●
Sigma.age ●●●
Sigma.edu ●●●
Sigma.age.edu ●●●
Sigma.state ●●●
Sigma.region ●●●
*
*
* array truncated for lack of space
medians and 80% intervals
B.00
0.20.40.6
●●●
b.female−0.3−0.2−0.1
00.1
●●●
b.black−2.5−2
−1.5−1
●●●
b.female.black−1
−0.50
0.5●●●
B.age−0.4−0.2
00.20.4
●●●
111111111
●●●
222222222
●●●
333333333
●●●
444444444
B.edu−0.5
00.5
●●●
111111111
●●●
222222222
●●●
333333333
●●●
444444444
B.age.edu−1
−0.50
0.5●●●
111111111111111111
●●●
222222222
●●●
333333333
●●●
444444444
●●●
222222222111111111
●●●
222222222
●●●
333333333
●●●
444444444
●●●
333333333111111111
●●●
222222222
●●●
333333333
●●●
444444444
●●●
444444444111111111
●●●
222222222
●●●
333333333
●●●
444444444
B.state−4−202
●●●
111111111
●●●
222222222
●●●
333333333
●●●
444444444
●●●
555555555
●●●
666666666
●●●
777777777
●●●
888888888●●●
999999999
●●●
101010101010101010
●●● ●●●
121212121212121212
●●●●●●
141414141414141414
●●● ●●●
161616161616161616
●●● ●●●
181818181818181818
●●● ●●●
202020202020202020
●●● ●●●
222222222222222222
●●● ●●●
242424242424242424
●●● ●●●
262626262626262626
●●●●●●
282828282828282828
●●● ●●●
303030303030303030
●●● ●●●
323232323232323232
●●● ●●●
343434343434343434
●●● ●●●
363636363636363636
●●● ●●●
383838383838383838
●●● ●●●
404040404040404040
*
B.region−1
−0.50
0.51
●●●
111111111
●●●
222222222
●●●
333333333
●●●
444444444
●●●
555555555
Sigma.age0
0.20.40.6
●●●
Sigma.edu0
0.51
●●●
Sigma.age.edu0
0.10.20.3
●●●
Sigma.state0
0.10.20.30.4
●●●
Sigma.region0
0.51
●●●
deviance2580260026202640
●●●
Bugs model at "C:/books/multilevel/election88/model4.bug", 3 chains, each with 2001 iterations
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Better graphical display 1: demographics
femaleblackfemale x black
18−2930−4445−6465+
no h.s.high schoolsome collegecollege grad
18−29 x no h.s.18−29 x high school18−29 x some college18−29 x college grad
30−44 x no h.s.30−44 x high school30−44 x some college30−44 x college grad
45−64 x no h.s.45−64 x high school45−64 x some college45−64 x college grad
65+ x no h.s.65+ x high school65+ x some college65+ x college grad
−2.5
−2.5
−2
−2
−1.5
−1.5
−1
−1
−0.5
−0.5
0
0
0.5
0.5
1
1
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Better graphical display 2: within states
−2.0 −1.0 0.00.0
0.4
0.8
Alaska
linear predictor
Pr
(sup
port
Bus
h)
−2.0 −1.0 0.00.0
0.4
0.8
Arizona
linear predictor
Pr
(sup
port
Bus
h)
−2.0 −1.0 0.00.0
0.4
0.8
Arkansas
linear predictor
Pr
(sup
port
Bus
h)
−2.0 −1.0 0.00.0
0.4
0.8
California
linear predictor
Pr
(sup
port
Bus
h)
−2.0 −1.0 0.00.0
0.4
0.8
Colorado
linear predictor
Pr
(sup
port
Bus
h)
−2.0 −1.0 0.00.0
0.4
0.8
Connecticut
linear predictor
Pr
(sup
port
Bus
h)
−2.0 −1.0 0.00.0
0.4
0.8
Delaware
linear predictor
Pr
(sup
port
Bus
h)
−2.0 −1.0 0.00.0
0.4
0.8
District of Columbia
linear predictor
Pr
(sup
port
Bus
h)Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Better graphical display 3: between states
Northeast
R vote in prev elections
regr
essi
on in
terc
ept
0.5 0.6 0.7
−0.
50.
00.
5
CTDEME
MDMA
NHNJ
NYPA
RI
VTWV
Midwest
R vote in prev elections
regr
essi
on in
terc
ept
0.5 0.6 0.7
−0.
50.
00.
5
IL
IN
IA
KS
MI
MN
MONEND
OH
SDWI
South
R vote in prev elections
regr
essi
on in
terc
ept
0.5 0.6 0.7
−0.
50.
00.
5 AL
AR FLGAKY LAMSNC
OKSCTN
TX
VA
West
R vote in prev elections
regr
essi
on in
terc
ept
0.5 0.6 0.7
−0.
50.
00.
5
AKAZ
CA COHIID
MTNV
NMOR
UT
WAWY
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Anova and multilevel models
I Each row of the Anova table is a variance componentI Goal
I How important is each source of variation?I Estimating and comparing variance componentsI Not testing if a variance component equals 0
I Multilevel regression solves classical Anova problems
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Anova and multilevel models
I Each row of the Anova table is a variance componentI Goal
I How important is each source of variation?I Estimating and comparing variance componentsI Not testing if a variance component equals 0
I Multilevel regression solves classical Anova problems
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Anova and multilevel models
I Each row of the Anova table is a variance componentI Goal
I How important is each source of variation?I Estimating and comparing variance componentsI Not testing if a variance component equals 0
I Multilevel regression solves classical Anova problems
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Anova and multilevel models
I Each row of the Anova table is a variance componentI Goal
I How important is each source of variation?I Estimating and comparing variance componentsI Not testing if a variance component equals 0
I Multilevel regression solves classical Anova problems
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Anova and multilevel models
I Each row of the Anova table is a variance componentI Goal
I How important is each source of variation?I Estimating and comparing variance componentsI Not testing if a variance component equals 0
I Multilevel regression solves classical Anova problems
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Anova and multilevel models
I Each row of the Anova table is a variance componentI Goal
I How important is each source of variation?I Estimating and comparing variance componentsI Not testing if a variance component equals 0
I Multilevel regression solves classical Anova problems
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Anova and multilevel models
I Each row of the Anova table is a variance componentI Goal
I How important is each source of variation?I Estimating and comparing variance componentsI Not testing if a variance component equals 0
I Multilevel regression solves classical Anova problems
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Raw display of inference
mean sd 2.5% 25% 50% 75% 97.5% Rhat n.eff
B.0 0.402 0.147 0.044 0.326 0.413 0.499 0.652 1.024 110
b.female -0.094 0.102 -0.283 -0.162 -0.095 -0.034 0.107 1.001 1000
b.black -1.701 0.305 -2.323 -1.910 -1.691 -1.486 -1.152 1.014 500
b.female.black -0.143 0.393 -0.834 -0.383 -0.155 0.104 0.620 1.007 1000
B.age[1] 0.084 0.088 -0.053 0.012 0.075 0.140 0.277 1.062 45
B.age[2] -0.072 0.087 -0.260 -0.121 -0.054 -0.006 0.052 1.017 190
B.age[3] 0.044 0.077 -0.105 -0.007 0.038 0.095 0.203 1.029 130
B.age[4] -0.057 0.096 -0.265 -0.115 -0.052 0.001 0.133 1.076 32
B.edu[1] -0.148 0.131 -0.436 -0.241 -0.137 -0.044 0.053 1.074 31
B.edu[2] -0.022 0.082 -0.182 -0.069 -0.021 0.025 0.152 1.028 160
B.edu[3] 0.148 0.112 -0.032 0.065 0.142 0.228 0.370 1.049 45
B.edu[4] 0.023 0.090 -0.170 -0.030 0.015 0.074 0.224 1.061 37
B.age.edu[1,1] -0.044 0.133 -0.363 -0.104 -0.019 0.025 0.170 1.018 1000
B.age.edu[1,2] 0.059 0.123 -0.153 -0.011 0.032 0.118 0.353 1.016 580
B.age.edu[1,3] 0.049 0.124 -0.146 -0.023 0.022 0.104 0.349 1.015 280
B.age.edu[1,4] 0.001 0.116 -0.237 -0.061 0.000 0.052 0.280 1.010 1000
B.age.edu[2,1] 0.066 0.152 -0.208 -0.008 0.032 0.124 0.449 1.022 140
B.age.edu[2,2] -0.081 0.127 -0.407 -0.137 -0.055 0.001 0.094 1.057 120
B.age.edu[2,3] -0.004 0.102 -0.226 -0.048 0.000 0.041 0.215 1.008 940
B.age.edu[2,4] -0.042 0.108 -0.282 -0.100 -0.026 0.014 0.157 1.017 170
B.age.edu[3,1] 0.034 0.135 -0.215 -0.030 0.009 0.091 0.361 1.021 230
B.age.edu[3,2] 0.007 0.102 -0.213 -0.039 0.003 0.052 0.220 1.019 610
B.age.edu[3,3] 0.033 0.130 -0.215 -0.029 0.009 0.076 0.410 1.080 61
B.age.edu[3,4] -0.009 0.109 -0.236 -0.064 -0.005 0.043 0.214 1.024 150
B.age.edu[4,1] -0.141 0.190 -0.672 -0.224 -0.086 -0.003 0.100 1.036 270
B.age.edu[4,2] -0.014 0.119 -0.280 -0.059 -0.008 0.033 0.239 1.017 240
B.age.edu[4,3] 0.046 0.132 -0.192 -0.024 0.019 0.108 0.332 1.010 210
B.age.edu[4,4] 0.042 0.142 -0.193 -0.022 0.016 0.095 0.377 1.015 160
B.state[1] 0.201 0.211 -0.131 0.047 0.172 0.326 0.646 1.003 960
B.state[2] 0.466 0.252 0.008 0.310 0.440 0.603 1.047 1.001 1000
B.state[3] 0.393 0.196 0.023 0.268 0.380 0.518 0.814 1.002 1000
B.state[4] -0.164 0.209 -0.607 -0.290 -0.149 -0.041 0.228 1.003 590
B.state[5] -0.054 0.141 -0.322 -0.143 -0.061 0.035 0.229 1.001 1000
B.state[6] 0.126 0.206 -0.313 0.010 0.126 0.256 0.512 1.011 1000
B.state[7] 0.095 0.183 -0.263 -0.023 0.087 0.207 0.466 1.004 490
B.state[8] -0.210 0.207 -0.666 -0.322 -0.194 -0.080 0.155 1.001 1000
B.state[9] -2.648 0.728 -4.291 -3.067 -2.602 -2.187 -1.385 1.007 290
B.state[10] 0.097 0.173 -0.296 -0.010 0.115 0.214 0.402 1.014 270
B.state[11] -0.138 0.173 -0.467 -0.253 -0.148 -0.034 0.240 1.005 1000
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Bayesian AnovaSource df Est. sd of effects
0 0.5 1 1.5
sex 1ethnicity 1
sex * ethnicity 1
age 3education 3
age * education 9
region 3region * state 46
0 0.5 1 1.5
Source df Est. sd of effects0 0.5 1 1.5
sex 1ethnicity 1
sex * ethnicity 1age 3
education 3age * education 9
region 3region * state 46
ethnicity * region 3ethnicity * region * state 46
0 0.5 1 1.5
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Fixed and random effects?
I What are “fixed” and “random” effects?
I Five incompatible definitions:
1. Fixed effects are constant across individuals; randomeffects vary (Leeuw, 1998)
2. Effects are fixed if they are interesting in themselves,random if you care about the population (Searle, 1992)
3. Fixed effects are the entire population, random are asmall sample from a larger population (Tukey, 1960)
4. Random effects are realized values of a random variable(LaMotte, 1983)
5. Fixed effects are estimated using least squares, randomeffects are esitmated using shrinkage (Snijders, 1999)
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Fixed and random effects?
I What are “fixed” and “random” effects?
I Five incompatible definitions:
1. Fixed effects are constant across individuals; randomeffects vary (Leeuw, 1998)
2. Effects are fixed if they are interesting in themselves,random if you care about the population (Searle, 1992)
3. Fixed effects are the entire population, random are asmall sample from a larger population (Tukey, 1960)
4. Random effects are realized values of a random variable(LaMotte, 1983)
5. Fixed effects are estimated using least squares, randomeffects are esitmated using shrinkage (Snijders, 1999)
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Fixed and random effects?
I What are “fixed” and “random” effects?
I Five incompatible definitions:
1. Fixed effects are constant across individuals; randomeffects vary (Leeuw, 1998)
2. Effects are fixed if they are interesting in themselves,random if you care about the population (Searle, 1992)
3. Fixed effects are the entire population, random are asmall sample from a larger population (Tukey, 1960)
4. Random effects are realized values of a random variable(LaMotte, 1983)
5. Fixed effects are estimated using least squares, randomeffects are esitmated using shrinkage (Snijders, 1999)
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Fixed and random effects?
I What are “fixed” and “random” effects?
I Five incompatible definitions:
1. Fixed effects are constant across individuals; randomeffects vary (Leeuw, 1998)
2. Effects are fixed if they are interesting in themselves,random if you care about the population (Searle, 1992)
3. Fixed effects are the entire population, random are asmall sample from a larger population (Tukey, 1960)
4. Random effects are realized values of a random variable(LaMotte, 1983)
5. Fixed effects are estimated using least squares, randomeffects are esitmated using shrinkage (Snijders, 1999)
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Fixed and random effects?
I What are “fixed” and “random” effects?
I Five incompatible definitions:
1. Fixed effects are constant across individuals; randomeffects vary (Leeuw, 1998)
2. Effects are fixed if they are interesting in themselves,random if you care about the population (Searle, 1992)
3. Fixed effects are the entire population, random are asmall sample from a larger population (Tukey, 1960)
4. Random effects are realized values of a random variable(LaMotte, 1983)
5. Fixed effects are estimated using least squares, randomeffects are esitmated using shrinkage (Snijders, 1999)
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Fixed and random effects?
I What are “fixed” and “random” effects?
I Five incompatible definitions:
1. Fixed effects are constant across individuals; randomeffects vary (Leeuw, 1998)
2. Effects are fixed if they are interesting in themselves,random if you care about the population (Searle, 1992)
3. Fixed effects are the entire population, random are asmall sample from a larger population (Tukey, 1960)
4. Random effects are realized values of a random variable(LaMotte, 1983)
5. Fixed effects are estimated using least squares, randomeffects are esitmated using shrinkage (Snijders, 1999)
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Fixed and random effects?
I What are “fixed” and “random” effects?
I Five incompatible definitions:
1. Fixed effects are constant across individuals; randomeffects vary (Leeuw, 1998)
2. Effects are fixed if they are interesting in themselves,random if you care about the population (Searle, 1992)
3. Fixed effects are the entire population, random are asmall sample from a larger population (Tukey, 1960)
4. Random effects are realized values of a random variable(LaMotte, 1983)
5. Fixed effects are estimated using least squares, randomeffects are esitmated using shrinkage (Snijders, 1999)
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
How to think about fixed and random effects
I Ideally, allow all coefficients to vary by groupI Main limitation: complicated models can be overwhelming
I Bayesian multilevel modelingI Simultaneously estimate population parameters and individual
coefficientsI Suppose you are estimating a finite set of effects,
then told they are a sample from a larger populationI No need to change the modelI But estimand of interest might change!
I Separation of modeling, inference, and decision analysis
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
How to think about fixed and random effects
I Ideally, allow all coefficients to vary by groupI Main limitation: complicated models can be overwhelming
I Bayesian multilevel modelingI Simultaneously estimate population parameters and individual
coefficientsI Suppose you are estimating a finite set of effects,
then told they are a sample from a larger populationI No need to change the modelI But estimand of interest might change!
I Separation of modeling, inference, and decision analysis
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
How to think about fixed and random effects
I Ideally, allow all coefficients to vary by groupI Main limitation: complicated models can be overwhelming
I Bayesian multilevel modelingI Simultaneously estimate population parameters and individual
coefficientsI Suppose you are estimating a finite set of effects,
then told they are a sample from a larger populationI No need to change the modelI But estimand of interest might change!
I Separation of modeling, inference, and decision analysis
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
How to think about fixed and random effects
I Ideally, allow all coefficients to vary by groupI Main limitation: complicated models can be overwhelming
I Bayesian multilevel modelingI Simultaneously estimate population parameters and individual
coefficientsI Suppose you are estimating a finite set of effects,
then told they are a sample from a larger populationI No need to change the modelI But estimand of interest might change!
I Separation of modeling, inference, and decision analysis
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
How to think about fixed and random effects
I Ideally, allow all coefficients to vary by groupI Main limitation: complicated models can be overwhelming
I Bayesian multilevel modelingI Simultaneously estimate population parameters and individual
coefficientsI Suppose you are estimating a finite set of effects,
then told they are a sample from a larger populationI No need to change the modelI But estimand of interest might change!
I Separation of modeling, inference, and decision analysis
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
How to think about fixed and random effects
I Ideally, allow all coefficients to vary by groupI Main limitation: complicated models can be overwhelming
I Bayesian multilevel modelingI Simultaneously estimate population parameters and individual
coefficientsI Suppose you are estimating a finite set of effects,
then told they are a sample from a larger populationI No need to change the modelI But estimand of interest might change!
I Separation of modeling, inference, and decision analysis
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
How to think about fixed and random effects
I Ideally, allow all coefficients to vary by groupI Main limitation: complicated models can be overwhelming
I Bayesian multilevel modelingI Simultaneously estimate population parameters and individual
coefficientsI Suppose you are estimating a finite set of effects,
then told they are a sample from a larger populationI No need to change the modelI But estimand of interest might change!
I Separation of modeling, inference, and decision analysis
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
How to think about fixed and random effects
I Ideally, allow all coefficients to vary by groupI Main limitation: complicated models can be overwhelming
I Bayesian multilevel modelingI Simultaneously estimate population parameters and individual
coefficientsI Suppose you are estimating a finite set of effects,
then told they are a sample from a larger populationI No need to change the modelI But estimand of interest might change!
I Separation of modeling, inference, and decision analysis
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
How to think about fixed and random effects
I Ideally, allow all coefficients to vary by groupI Main limitation: complicated models can be overwhelming
I Bayesian multilevel modelingI Simultaneously estimate population parameters and individual
coefficientsI Suppose you are estimating a finite set of effects,
then told they are a sample from a larger populationI No need to change the modelI But estimand of interest might change!
I Separation of modeling, inference, and decision analysis
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Average predictive effects
I What is E (y | x1 = high) − E(y | x1 = low), with all other x ’sheld constant?
u
v
y
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Average predictive effects
I What is E (y | x1 = high) − E(y | x1 = low), with all other x ’sheld constant?
I In general, difference can depend on xI Average over distribution of x in the data
I You can’t just use a central value of x
I Compute APE for each input variable x
I Multilevel factors are categorical input variables
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Average predictive effects
I What is E (y | x1 = high) − E(y | x1 = low), with all other x ’sheld constant?
I In general, difference can depend on xI Average over distribution of x in the data
I You can’t just use a central value of x
I Compute APE for each input variable x
I Multilevel factors are categorical input variables
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Average predictive effects
I What is E (y | x1 = high) − E(y | x1 = low), with all other x ’sheld constant?
I In general, difference can depend on xI Average over distribution of x in the data
I You can’t just use a central value of x
I Compute APE for each input variable x
I Multilevel factors are categorical input variables
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Average predictive effects
I What is E (y | x1 = high) − E(y | x1 = low), with all other x ’sheld constant?
I In general, difference can depend on xI Average over distribution of x in the data
I You can’t just use a central value of x
I Compute APE for each input variable x
I Multilevel factors are categorical input variables
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Average predictive effects
I What is E (y | x1 = high) − E(y | x1 = low), with all other x ’sheld constant?
I In general, difference can depend on xI Average over distribution of x in the data
I You can’t just use a central value of x
I Compute APE for each input variable x
I Multilevel factors are categorical input variables
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Average predictive effects
I What is E (y | x1 = high) − E(y | x1 = low), with all other x ’sheld constant?
I In general, difference can depend on xI Average over distribution of x in the data
I You can’t just use a central value of x
I Compute APE for each input variable x
I Multilevel factors are categorical input variables
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
APE: why you can’t just use a central value of x
−6 −4 −2 0 2 4 6 80.0
0.2
0.4
0.6
0.8
1.0
v
E(y
|u, v
)
u = 1 u = 0
| || || ||| | |||| | ||| ||| | || || || || ||| || ||| | | |
avg pred effect = 0.03
pred effect at E(v) = 0.24
−6 −4 −2 0 2 4 6 80.0
0.2
0.4
0.6
0.8
1.0
v
E(y
|u, v
)
u = 1 u = 0
||| | | || ||| ||| ||| | | |||| || | |||| ||| |||||| ||
avg pred effect = 0.11
pred effect at E(v) = 0.03
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Framework for average predictive effects
I Regresion model, E (y |x , θ)I Predictors come from “input variables”
I Example: regression on age, sex, age × sex, and age2
I 5 linear predictors (including the constant term)I But only 4 inputs
I Compute APE for each input variable, one at a time, with allothers held constant
I Scalar input u: the “input of interest”I Vector v : all other inputs
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Framework for average predictive effects
I Regresion model, E (y |x , θ)I Predictors come from “input variables”
I Example: regression on age, sex, age × sex, and age2
I 5 linear predictors (including the constant term)I But only 4 inputs
I Compute APE for each input variable, one at a time, with allothers held constant
I Scalar input u: the “input of interest”I Vector v : all other inputs
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Framework for average predictive effects
I Regresion model, E (y |x , θ)I Predictors come from “input variables”
I Example: regression on age, sex, age × sex, and age2
I 5 linear predictors (including the constant term)I But only 4 inputs
I Compute APE for each input variable, one at a time, with allothers held constant
I Scalar input u: the “input of interest”I Vector v : all other inputs
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Framework for average predictive effects
I Regresion model, E (y |x , θ)I Predictors come from “input variables”
I Example: regression on age, sex, age × sex, and age2
I 5 linear predictors (including the constant term)I But only 4 inputs
I Compute APE for each input variable, one at a time, with allothers held constant
I Scalar input u: the “input of interest”I Vector v : all other inputs
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Framework for average predictive effects
I Regresion model, E (y |x , θ)I Predictors come from “input variables”
I Example: regression on age, sex, age × sex, and age2
I 5 linear predictors (including the constant term)I But only 4 inputs
I Compute APE for each input variable, one at a time, with allothers held constant
I Scalar input u: the “input of interest”I Vector v : all other inputs
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Framework for average predictive effects
I Regresion model, E (y |x , θ)I Predictors come from “input variables”
I Example: regression on age, sex, age × sex, and age2
I 5 linear predictors (including the constant term)I But only 4 inputs
I Compute APE for each input variable, one at a time, with allothers held constant
I Scalar input u: the “input of interest”I Vector v : all other inputs
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Framework for average predictive effects
I Regresion model, E (y |x , θ)I Predictors come from “input variables”
I Example: regression on age, sex, age × sex, and age2
I 5 linear predictors (including the constant term)I But only 4 inputs
I Compute APE for each input variable, one at a time, with allothers held constant
I Scalar input u: the “input of interest”I Vector v : all other inputs
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Framework for average predictive effects
I Regresion model, E (y |x , θ)I Predictors come from “input variables”
I Example: regression on age, sex, age × sex, and age2
I 5 linear predictors (including the constant term)I But only 4 inputs
I Compute APE for each input variable, one at a time, with allothers held constant
I Scalar input u: the “input of interest”I Vector v : all other inputs
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Defining predictive effects
I predictive effect:
δu(u(1)→u(2), v , θ) = E(y |u(2),v ,θ)−E(y |u(1),v ,θ)
u(2)−u(1)
I Average over:I The transition, u(1) → u(2)
I The other inputs, vI The regression coefficients, θ
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Defining predictive effects
I predictive effect:
δu(u(1)→u(2), v , θ) = E(y |u(2),v ,θ)−E(y |u(1),v ,θ)
u(2)−u(1)
I Average over:I The transition, u(1) → u(2)
I The other inputs, vI The regression coefficients, θ
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Defining predictive effects
I predictive effect:
δu(u(1)→u(2), v , θ) = E(y |u(2),v ,θ)−E(y |u(1),v ,θ)
u(2)−u(1)
I Average over:I The transition, u(1) → u(2)
I The other inputs, vI The regression coefficients, θ
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Defining predictive effects
I predictive effect:
δu(u(1)→u(2), v , θ) = E(y |u(2),v ,θ)−E(y |u(1),v ,θ)
u(2)−u(1)
I Average over:I The transition, u(1) → u(2)
I The other inputs, vI The regression coefficients, θ
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Defining predictive effects
I predictive effect:
δu(u(1)→u(2), v , θ) = E(y |u(2),v ,θ)−E(y |u(1),v ,θ)
u(2)−u(1)
I Average over:I The transition, u(1) → u(2)
I The other inputs, vI The regression coefficients, θ
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Defining predictive effects
I predictive effect:
δu(u(1)→u(2), v , θ) = E(y |u(2),v ,θ)−E(y |u(1),v ,θ)
u(2)−u(1)
I Average over:I The transition, u(1) → u(2)
I The other inputs, vI The regression coefficients, θ
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Average predictive effects for binary inputs
I predictive effect:
δu(u(1)→u(2), v , θ) = E(y |u(2),v ,θ)−E(y |u(1),v ,θ)
u(2)−u(1)
I Binary input u:I predictive effect: δu(0 → 1, v , θ) = E (y |1, v , θ)− E (y |0, v , θ)I Average over v1, . . . , vn in the data (or weighted average if
desired)I Average over θ from inferential simulationsI Standard error of APE from uncertainty in θ
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Average predictive effects for binary inputs
I predictive effect:
δu(u(1)→u(2), v , θ) = E(y |u(2),v ,θ)−E(y |u(1),v ,θ)
u(2)−u(1)
I Binary input u:I predictive effect: δu(0 → 1, v , θ) = E (y |1, v , θ)− E (y |0, v , θ)I Average over v1, . . . , vn in the data (or weighted average if
desired)I Average over θ from inferential simulationsI Standard error of APE from uncertainty in θ
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Average predictive effects for binary inputs
I predictive effect:
δu(u(1)→u(2), v , θ) = E(y |u(2),v ,θ)−E(y |u(1),v ,θ)
u(2)−u(1)
I Binary input u:I predictive effect: δu(0 → 1, v , θ) = E (y |1, v , θ)− E (y |0, v , θ)I Average over v1, . . . , vn in the data (or weighted average if
desired)I Average over θ from inferential simulationsI Standard error of APE from uncertainty in θ
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Average predictive effects for binary inputs
I predictive effect:
δu(u(1)→u(2), v , θ) = E(y |u(2),v ,θ)−E(y |u(1),v ,θ)
u(2)−u(1)
I Binary input u:I predictive effect: δu(0 → 1, v , θ) = E (y |1, v , θ)− E (y |0, v , θ)I Average over v1, . . . , vn in the data (or weighted average if
desired)I Average over θ from inferential simulationsI Standard error of APE from uncertainty in θ
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Average predictive effects for binary inputs
I predictive effect:
δu(u(1)→u(2), v , θ) = E(y |u(2),v ,θ)−E(y |u(1),v ,θ)
u(2)−u(1)
I Binary input u:I predictive effect: δu(0 → 1, v , θ) = E (y |1, v , θ)− E (y |0, v , θ)I Average over v1, . . . , vn in the data (or weighted average if
desired)I Average over θ from inferential simulationsI Standard error of APE from uncertainty in θ
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Average predictive effects for binary inputs
I predictive effect:
δu(u(1)→u(2), v , θ) = E(y |u(2),v ,θ)−E(y |u(1),v ,θ)
u(2)−u(1)
I Binary input u:I predictive effect: δu(0 → 1, v , θ) = E (y |1, v , θ)− E (y |0, v , θ)I Average over v1, . . . , vn in the data (or weighted average if
desired)I Average over θ from inferential simulationsI Standard error of APE from uncertainty in θ
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Scenarios for average predictive effects
I predictive effect:
δu(u(1)→u(2), v , θ) = E(y |u(2),v ,θ)−E(y |u(1),v ,θ)
u(2)−u(1)
I Continuous inputs
I Unordered discrete inputs
I Variance components
I Interactions
I Inputs that are not always active
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Scenarios for average predictive effects
I predictive effect:
δu(u(1)→u(2), v , θ) = E(y |u(2),v ,θ)−E(y |u(1),v ,θ)
u(2)−u(1)
I Continuous inputs
I Unordered discrete inputs
I Variance components
I Interactions
I Inputs that are not always active
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Scenarios for average predictive effects
I predictive effect:
δu(u(1)→u(2), v , θ) = E(y |u(2),v ,θ)−E(y |u(1),v ,θ)
u(2)−u(1)
I Continuous inputs
I Unordered discrete inputs
I Variance components
I Interactions
I Inputs that are not always active
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Scenarios for average predictive effects
I predictive effect:
δu(u(1)→u(2), v , θ) = E(y |u(2),v ,θ)−E(y |u(1),v ,θ)
u(2)−u(1)
I Continuous inputs
I Unordered discrete inputs
I Variance components
I Interactions
I Inputs that are not always active
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Scenarios for average predictive effects
I predictive effect:
δu(u(1)→u(2), v , θ) = E(y |u(2),v ,θ)−E(y |u(1),v ,θ)
u(2)−u(1)
I Continuous inputs
I Unordered discrete inputs
I Variance components
I Interactions
I Inputs that are not always active
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Scenarios for average predictive effects
I predictive effect:
δu(u(1)→u(2), v , θ) = E(y |u(2),v ,θ)−E(y |u(1),v ,θ)
u(2)−u(1)
I Continuous inputs
I Unordered discrete inputs
I Variance components
I Interactions
I Inputs that are not always active
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
R2 for multilevel models
I How much of the variance is “explained” by the model?
I Separate R2 for each level
I Classical R2 = 1− variance of the residualsvariance of the data
I Multilevel model:at each level, k units: θk = (Xβ)k + εk
I At each level: R2 = 1− variance among the (Xβ)k ’svariance among the εk ’s
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
R2 for multilevel models
I How much of the variance is “explained” by the model?
I Separate R2 for each level
I Classical R2 = 1− variance of the residualsvariance of the data
I Multilevel model:at each level, k units: θk = (Xβ)k + εk
I At each level: R2 = 1− variance among the (Xβ)k ’svariance among the εk ’s
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
R2 for multilevel models
I How much of the variance is “explained” by the model?
I Separate R2 for each level
I Classical R2 = 1− variance of the residualsvariance of the data
I Multilevel model:at each level, k units: θk = (Xβ)k + εk
I At each level: R2 = 1− variance among the (Xβ)k ’svariance among the εk ’s
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
R2 for multilevel models
I How much of the variance is “explained” by the model?
I Separate R2 for each level
I Classical R2 = 1− variance of the residualsvariance of the data
I Multilevel model:at each level, k units: θk = (Xβ)k + εk
I At each level: R2 = 1− variance among the (Xβ)k ’svariance among the εk ’s
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
R2 for multilevel models
I How much of the variance is “explained” by the model?
I Separate R2 for each level
I Classical R2 = 1− variance of the residualsvariance of the data
I Multilevel model:at each level, k units: θk = (Xβ)k + εk
I At each level: R2 = 1− variance among the (Xβ)k ’svariance among the εk ’s
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
R2 for multilevel models
I How much of the variance is “explained” by the model?
I Separate R2 for each level
I Classical R2 = 1− variance of the residualsvariance of the data
I Multilevel model:at each level, k units: θk = (Xβ)k + εk
I At each level: R2 = 1− variance among the (Xβ)k ’svariance among the εk ’s
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Bayesian R2
I At each levelI θk = (Xβ)k + εk
I R2 = 1− variance among the (Xβ)k ’svariance among the εk ’s
I Numerator and denominator estimated by their posteriormeans
I Posterior distribution automatically accounts for uncertainty
I Bayesian generalization of classical “adjusted R2”
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Bayesian R2
I At each levelI θk = (Xβ)k + εk
I R2 = 1− variance among the (Xβ)k ’svariance among the εk ’s
I Numerator and denominator estimated by their posteriormeans
I Posterior distribution automatically accounts for uncertainty
I Bayesian generalization of classical “adjusted R2”
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Bayesian R2
I At each levelI θk = (Xβ)k + εk
I R2 = 1− variance among the (Xβ)k ’svariance among the εk ’s
I Numerator and denominator estimated by their posteriormeans
I Posterior distribution automatically accounts for uncertainty
I Bayesian generalization of classical “adjusted R2”
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Bayesian R2
I At each levelI θk = (Xβ)k + εk
I R2 = 1− variance among the (Xβ)k ’svariance among the εk ’s
I Numerator and denominator estimated by their posteriormeans
I Posterior distribution automatically accounts for uncertainty
I Bayesian generalization of classical “adjusted R2”
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Bayesian R2
I At each levelI θk = (Xβ)k + εk
I R2 = 1− variance among the (Xβ)k ’svariance among the εk ’s
I Numerator and denominator estimated by their posteriormeans
I Posterior distribution automatically accounts for uncertainty
I Bayesian generalization of classical “adjusted R2”
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Bayesian R2
I At each levelI θk = (Xβ)k + εk
I R2 = 1− variance among the (Xβ)k ’svariance among the εk ’s
I Numerator and denominator estimated by their posteriormeans
I Posterior distribution automatically accounts for uncertainty
I Bayesian generalization of classical “adjusted R2”
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Example of partial pooling
LAC QUI PARLE
basement
log
rado
n le
vel
0 1
−1
13
AITKIN
basement
log
rado
n le
vel
0 1
−1
13
KOOCHICHING
basement
log
rado
n le
vel
0 1
−1
13
DOUGLAS
basement
log
rado
n le
vel
0 1
−1
13
CLAY
basement
log
rado
n le
vel
0 1
−1
13
STEARNS
basement
log
rado
n le
vel
0 1
−1
13
RAMSEY
basement
log
rado
n le
vel
0 1−
11
3
ST LOUIS
basement
log
rado
n le
vel
0 1
−1
13
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Partial pooling factors
I At each level of the model:I θk = (Xβ)k + εk
I λ = 0 if no pooling of ε’sI λ = 0 if complete pooling of ε’s to 0
I Multilevel generalization of Bayesian pooling factorI Can’t simply compare to the “complete pooling” and “no
pooling” estimatesI “No pooling” estimate doesn’t always exist!
I At each level, our pooling factor is defined based on the meanand variance of the εk ’s
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Partial pooling factors
I At each level of the model:I θk = (Xβ)k + εk
I λ = 0 if no pooling of ε’sI λ = 0 if complete pooling of ε’s to 0
I Multilevel generalization of Bayesian pooling factorI Can’t simply compare to the “complete pooling” and “no
pooling” estimatesI “No pooling” estimate doesn’t always exist!
I At each level, our pooling factor is defined based on the meanand variance of the εk ’s
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Partial pooling factors
I At each level of the model:I θk = (Xβ)k + εk
I λ = 0 if no pooling of ε’sI λ = 0 if complete pooling of ε’s to 0
I Multilevel generalization of Bayesian pooling factorI Can’t simply compare to the “complete pooling” and “no
pooling” estimatesI “No pooling” estimate doesn’t always exist!
I At each level, our pooling factor is defined based on the meanand variance of the εk ’s
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Partial pooling factors
I At each level of the model:I θk = (Xβ)k + εk
I λ = 0 if no pooling of ε’sI λ = 0 if complete pooling of ε’s to 0
I Multilevel generalization of Bayesian pooling factorI Can’t simply compare to the “complete pooling” and “no
pooling” estimatesI “No pooling” estimate doesn’t always exist!
I At each level, our pooling factor is defined based on the meanand variance of the εk ’s
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Partial pooling factors
I At each level of the model:I θk = (Xβ)k + εk
I λ = 0 if no pooling of ε’sI λ = 0 if complete pooling of ε’s to 0
I Multilevel generalization of Bayesian pooling factorI Can’t simply compare to the “complete pooling” and “no
pooling” estimatesI “No pooling” estimate doesn’t always exist!
I At each level, our pooling factor is defined based on the meanand variance of the εk ’s
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Partial pooling factors
I At each level of the model:I θk = (Xβ)k + εk
I λ = 0 if no pooling of ε’sI λ = 0 if complete pooling of ε’s to 0
I Multilevel generalization of Bayesian pooling factorI Can’t simply compare to the “complete pooling” and “no
pooling” estimatesI “No pooling” estimate doesn’t always exist!
I At each level, our pooling factor is defined based on the meanand variance of the εk ’s
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Partial pooling factors
I At each level of the model:I θk = (Xβ)k + εk
I λ = 0 if no pooling of ε’sI λ = 0 if complete pooling of ε’s to 0
I Multilevel generalization of Bayesian pooling factorI Can’t simply compare to the “complete pooling” and “no
pooling” estimatesI “No pooling” estimate doesn’t always exist!
I At each level, our pooling factor is defined based on the meanand variance of the εk ’s
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Partial pooling factors
I At each level of the model:I θk = (Xβ)k + εk
I λ = 0 if no pooling of ε’sI λ = 0 if complete pooling of ε’s to 0
I Multilevel generalization of Bayesian pooling factorI Can’t simply compare to the “complete pooling” and “no
pooling” estimatesI “No pooling” estimate doesn’t always exist!
I At each level, our pooling factor is defined based on the meanand variance of the εk ’s
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Graphical display of a fitted mlmAnalysis of varianceAverage predictive effectsR2 and pooling factors
Partial pooling factors
I At each level of the model:I θk = (Xβ)k + εk
I λ = 0 if no pooling of ε’sI λ = 0 if complete pooling of ε’s to 0
I Multilevel generalization of Bayesian pooling factorI Can’t simply compare to the “complete pooling” and “no
pooling” estimatesI “No pooling” estimate doesn’t always exist!
I At each level, our pooling factor is defined based on the meanand variance of the εk ’s
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Understanding sources of variationMultilevel models even when # groups is small
Understanding sources of variation
I Graphs, not tables, of parameter estimates
I In display, use the grouping infoI Analysis of variance
I Summarize the scale of each batch of predictorsI Go beyond classical null-hypothesis-testing framework
I Average predictive effects for models with nonlinearity andinteractions
I Generalization of R2 (explained variance), defined at eachlevel of the model
I Partial pooling factor, defined at each level
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Understanding sources of variationMultilevel models even when # groups is small
Understanding sources of variation
I Graphs, not tables, of parameter estimates
I In display, use the grouping infoI Analysis of variance
I Summarize the scale of each batch of predictorsI Go beyond classical null-hypothesis-testing framework
I Average predictive effects for models with nonlinearity andinteractions
I Generalization of R2 (explained variance), defined at eachlevel of the model
I Partial pooling factor, defined at each level
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Understanding sources of variationMultilevel models even when # groups is small
Understanding sources of variation
I Graphs, not tables, of parameter estimates
I In display, use the grouping infoI Analysis of variance
I Summarize the scale of each batch of predictorsI Go beyond classical null-hypothesis-testing framework
I Average predictive effects for models with nonlinearity andinteractions
I Generalization of R2 (explained variance), defined at eachlevel of the model
I Partial pooling factor, defined at each level
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Understanding sources of variationMultilevel models even when # groups is small
Understanding sources of variation
I Graphs, not tables, of parameter estimates
I In display, use the grouping infoI Analysis of variance
I Summarize the scale of each batch of predictorsI Go beyond classical null-hypothesis-testing framework
I Average predictive effects for models with nonlinearity andinteractions
I Generalization of R2 (explained variance), defined at eachlevel of the model
I Partial pooling factor, defined at each level
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Understanding sources of variationMultilevel models even when # groups is small
Understanding sources of variation
I Graphs, not tables, of parameter estimates
I In display, use the grouping infoI Analysis of variance
I Summarize the scale of each batch of predictorsI Go beyond classical null-hypothesis-testing framework
I Average predictive effects for models with nonlinearity andinteractions
I Generalization of R2 (explained variance), defined at eachlevel of the model
I Partial pooling factor, defined at each level
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Understanding sources of variationMultilevel models even when # groups is small
Understanding sources of variation
I Graphs, not tables, of parameter estimates
I In display, use the grouping infoI Analysis of variance
I Summarize the scale of each batch of predictorsI Go beyond classical null-hypothesis-testing framework
I Average predictive effects for models with nonlinearity andinteractions
I Generalization of R2 (explained variance), defined at eachlevel of the model
I Partial pooling factor, defined at each level
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Understanding sources of variationMultilevel models even when # groups is small
Understanding sources of variation
I Graphs, not tables, of parameter estimates
I In display, use the grouping infoI Analysis of variance
I Summarize the scale of each batch of predictorsI Go beyond classical null-hypothesis-testing framework
I Average predictive effects for models with nonlinearity andinteractions
I Generalization of R2 (explained variance), defined at eachlevel of the model
I Partial pooling factor, defined at each level
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Understanding sources of variationMultilevel models even when # groups is small
Understanding sources of variation
I Graphs, not tables, of parameter estimates
I In display, use the grouping infoI Analysis of variance
I Summarize the scale of each batch of predictorsI Go beyond classical null-hypothesis-testing framework
I Average predictive effects for models with nonlinearity andinteractions
I Generalization of R2 (explained variance), defined at eachlevel of the model
I Partial pooling factor, defined at each level
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Understanding sources of variationMultilevel models even when # groups is small
Understanding sources of variation
I Graphs, not tables, of parameter estimates
I In display, use the grouping infoI Analysis of variance
I Summarize the scale of each batch of predictorsI Go beyond classical null-hypothesis-testing framework
I Average predictive effects for models with nonlinearity andinteractions
I Generalization of R2 (explained variance), defined at eachlevel of the model
I Partial pooling factor, defined at each level
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Understanding sources of variationMultilevel models even when # groups is small
Multilevel models even when # groups is small
I Forget about “fixed and random effects”; think about“finite-pop and superpop estimands” instead
I Always use the multilevel model, but estimand of interestdepends on context
I For small # groups: use the new half-t prior dist for varianceparameters
I Challenges:I Multivariate models (for example, varying-intercept,
varying-slope models)I Models with deep interaction structuresI Automatic graphical displayI Model building
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Understanding sources of variationMultilevel models even when # groups is small
Multilevel models even when # groups is small
I Forget about “fixed and random effects”; think about“finite-pop and superpop estimands” instead
I Always use the multilevel model, but estimand of interestdepends on context
I For small # groups: use the new half-t prior dist for varianceparameters
I Challenges:I Multivariate models (for example, varying-intercept,
varying-slope models)I Models with deep interaction structuresI Automatic graphical displayI Model building
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Understanding sources of variationMultilevel models even when # groups is small
Multilevel models even when # groups is small
I Forget about “fixed and random effects”; think about“finite-pop and superpop estimands” instead
I Always use the multilevel model, but estimand of interestdepends on context
I For small # groups: use the new half-t prior dist for varianceparameters
I Challenges:I Multivariate models (for example, varying-intercept,
varying-slope models)I Models with deep interaction structuresI Automatic graphical displayI Model building
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Understanding sources of variationMultilevel models even when # groups is small
Multilevel models even when # groups is small
I Forget about “fixed and random effects”; think about“finite-pop and superpop estimands” instead
I Always use the multilevel model, but estimand of interestdepends on context
I For small # groups: use the new half-t prior dist for varianceparameters
I Challenges:I Multivariate models (for example, varying-intercept,
varying-slope models)I Models with deep interaction structuresI Automatic graphical displayI Model building
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Understanding sources of variationMultilevel models even when # groups is small
Multilevel models even when # groups is small
I Forget about “fixed and random effects”; think about“finite-pop and superpop estimands” instead
I Always use the multilevel model, but estimand of interestdepends on context
I For small # groups: use the new half-t prior dist for varianceparameters
I Challenges:I Multivariate models (for example, varying-intercept,
varying-slope models)I Models with deep interaction structuresI Automatic graphical displayI Model building
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Understanding sources of variationMultilevel models even when # groups is small
Multilevel models even when # groups is small
I Forget about “fixed and random effects”; think about“finite-pop and superpop estimands” instead
I Always use the multilevel model, but estimand of interestdepends on context
I For small # groups: use the new half-t prior dist for varianceparameters
I Challenges:I Multivariate models (for example, varying-intercept,
varying-slope models)I Models with deep interaction structuresI Automatic graphical displayI Model building
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Understanding sources of variationMultilevel models even when # groups is small
Multilevel models even when # groups is small
I Forget about “fixed and random effects”; think about“finite-pop and superpop estimands” instead
I Always use the multilevel model, but estimand of interestdepends on context
I For small # groups: use the new half-t prior dist for varianceparameters
I Challenges:I Multivariate models (for example, varying-intercept,
varying-slope models)I Models with deep interaction structuresI Automatic graphical displayI Model building
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Understanding sources of variationMultilevel models even when # groups is small
Multilevel models even when # groups is small
I Forget about “fixed and random effects”; think about“finite-pop and superpop estimands” instead
I Always use the multilevel model, but estimand of interestdepends on context
I For small # groups: use the new half-t prior dist for varianceparameters
I Challenges:I Multivariate models (for example, varying-intercept,
varying-slope models)I Models with deep interaction structuresI Automatic graphical displayI Model building
Andrew Gelman Q’s and A’s on multilevel models
Structured data and multilevel modelsUnderstanding multilevel models and variance components
Conclusions
Understanding sources of variationMultilevel models even when # groups is small
Multilevel models even when # groups is small
I Forget about “fixed and random effects”; think about“finite-pop and superpop estimands” instead
I Always use the multilevel model, but estimand of interestdepends on context
I For small # groups: use the new half-t prior dist for varianceparameters
I Challenges:I Multivariate models (for example, varying-intercept,
varying-slope models)I Models with deep interaction structuresI Automatic graphical displayI Model building
Andrew Gelman Q’s and A’s on multilevel models