advanced quantitative research methodology lecture notes

170
Advanced Quantitative Research Methodology Lecture Notes: Ecological Inference 1 Gary King http://GKing.Harvard.Edu January 28, 2012 1 c Copyright 2008 Gary King, All Rights Reserved. Gary King http://GKing.Harvard.Edu () Advanced Quantitative Research Methodology Lecture Notes: January 28, 2012 1 / 38

Upload: lenga

Post on 04-Jan-2017

292 views

Category:

Documents


4 download

TRANSCRIPT

Page 1: Advanced Quantitative Research Methodology Lecture Notes

Advanced Quantitative Research Methodology LectureNotes: Ecological Inference1

Gary Kinghttp://GKing.Harvard.Edu

January 28, 2012

1 c©Copyright 2008 Gary King, All Rights Reserved.Gary King http://GKing.Harvard.Edu () Advanced Quantitative Research Methodology Lecture Notes: Ecological InferenceJanuary 28, 2012 1 / 38

Page 2: Advanced Quantitative Research Methodology Lecture Notes

Reading

Reading: Gary King. A Solution to the Eco-logical Inference Problem: ReconstructingIndividual Behavior from Aggregate Data.Princeton University Press, 1997

Gary King () Ecological Inference 2 / 38

Page 3: Advanced Quantitative Research Methodology Lecture Notes

Preliminaries

Definition: Ecological Inference is the process of using aggregate (i.e.,“ecological”) data to infer discrete individual-level relationships ofinterest when individual-level data are not available.

History of the Problem:

1. Ogburn and Goltra (1919) in the very first multivariate statisticalanalysis of politics in a political science journal made ecologicalinferences and recognized the problem. The big issue in 1919: are thenewly enfranchised women going to take over the political system?They regressed votes in referenda in Oregon precincts on the percent ofwomen in each precinct. But they worried:

It is also theoretically possible to gerrymander the precincts in such away that there may be a negative correlative even though men andwomen each distribute their votes 50 to 50 on a givenmeasure. . . (Ogburn and Goltra, 1919).

Gary King () Ecological Inference 3 / 38

Page 4: Advanced Quantitative Research Methodology Lecture Notes

Preliminaries

Definition: Ecological Inference is the process of using aggregate (i.e.,“ecological”) data to infer discrete individual-level relationships ofinterest when individual-level data are not available.

History of the Problem:

1. Ogburn and Goltra (1919) in the very first multivariate statisticalanalysis of politics in a political science journal made ecologicalinferences and recognized the problem. The big issue in 1919: are thenewly enfranchised women going to take over the political system?They regressed votes in referenda in Oregon precincts on the percent ofwomen in each precinct. But they worried:

It is also theoretically possible to gerrymander the precincts in such away that there may be a negative correlative even though men andwomen each distribute their votes 50 to 50 on a givenmeasure. . . (Ogburn and Goltra, 1919).

Gary King () Ecological Inference 3 / 38

Page 5: Advanced Quantitative Research Methodology Lecture Notes

Preliminaries

Definition: Ecological Inference is the process of using aggregate (i.e.,“ecological”) data to infer discrete individual-level relationships ofinterest when individual-level data are not available.

History of the Problem:

1. Ogburn and Goltra (1919) in the very first multivariate statisticalanalysis of politics in a political science journal made ecologicalinferences and recognized the problem. The big issue in 1919: are thenewly enfranchised women going to take over the political system?They regressed votes in referenda in Oregon precincts on the percent ofwomen in each precinct. But they worried:

It is also theoretically possible to gerrymander the precincts in such away that there may be a negative correlative even though men andwomen each distribute their votes 50 to 50 on a givenmeasure. . . (Ogburn and Goltra, 1919).

Gary King () Ecological Inference 3 / 38

Page 6: Advanced Quantitative Research Methodology Lecture Notes

Preliminaries

Definition: Ecological Inference is the process of using aggregate (i.e.,“ecological”) data to infer discrete individual-level relationships ofinterest when individual-level data are not available.

History of the Problem:

1. Ogburn and Goltra (1919) in the very first multivariate statisticalanalysis of politics in a political science journal made ecologicalinferences and recognized the problem. The big issue in 1919: are thenewly enfranchised women going to take over the political system?They regressed votes in referenda in Oregon precincts on the percent ofwomen in each precinct. But they worried:

It is also theoretically possible to gerrymander the precincts in such away that there may be a negative correlative even though men andwomen each distribute their votes 50 to 50 on a givenmeasure. . . (Ogburn and Goltra, 1919).

Gary King () Ecological Inference 3 / 38

Page 7: Advanced Quantitative Research Methodology Lecture Notes

Preliminaries

Definition: Ecological Inference is the process of using aggregate (i.e.,“ecological”) data to infer discrete individual-level relationships ofinterest when individual-level data are not available.

History of the Problem:

1. Ogburn and Goltra (1919) in the very first multivariate statisticalanalysis of politics in a political science journal made ecologicalinferences and recognized the problem. The big issue in 1919: are thenewly enfranchised women going to take over the political system?They regressed votes in referenda in Oregon precincts on the percent ofwomen in each precinct. But they worried:

It is also theoretically possible to gerrymander the precincts in such away that there may be a negative correlative even though men andwomen each distribute their votes 50 to 50 on a givenmeasure. . . (Ogburn and Goltra, 1919).

Gary King () Ecological Inference 3 / 38

Page 8: Advanced Quantitative Research Methodology Lecture Notes

Preliminaries

2. Robinson’s (1950) clarified the problem, causing:

(a) several literatures to wither, including studies of local and regionalpolitics through aggregate electoral statistics in favor of survey researchbased on national samples.

(b) the development of a methodological literature devoted to solving theproblem.

3. Hundreds of other articles have helped us understand the problem.

History of Solutions: A 45-year war between supporters of

1. Duncan and Davis (1953): a deterministic solution.2. Goodman (1953, 1959): a statistical solution.3. for 50 years, no other methods used in applications.

Gary King () Ecological Inference 4 / 38

Page 9: Advanced Quantitative Research Methodology Lecture Notes

Preliminaries

2. Robinson’s (1950) clarified the problem, causing:

(a) several literatures to wither, including studies of local and regionalpolitics through aggregate electoral statistics in favor of survey researchbased on national samples.

(b) the development of a methodological literature devoted to solving theproblem.

3. Hundreds of other articles have helped us understand the problem.

History of Solutions: A 45-year war between supporters of

1. Duncan and Davis (1953): a deterministic solution.2. Goodman (1953, 1959): a statistical solution.3. for 50 years, no other methods used in applications.

Gary King () Ecological Inference 4 / 38

Page 10: Advanced Quantitative Research Methodology Lecture Notes

Preliminaries

2. Robinson’s (1950) clarified the problem, causing:

(a) several literatures to wither, including studies of local and regionalpolitics through aggregate electoral statistics in favor of survey researchbased on national samples.

(b) the development of a methodological literature devoted to solving theproblem.

3. Hundreds of other articles have helped us understand the problem.

History of Solutions: A 45-year war between supporters of

1. Duncan and Davis (1953): a deterministic solution.2. Goodman (1953, 1959): a statistical solution.3. for 50 years, no other methods used in applications.

Gary King () Ecological Inference 4 / 38

Page 11: Advanced Quantitative Research Methodology Lecture Notes

Preliminaries

2. Robinson’s (1950) clarified the problem, causing:

(a) several literatures to wither, including studies of local and regionalpolitics through aggregate electoral statistics in favor of survey researchbased on national samples.

(b) the development of a methodological literature devoted to solving theproblem.

3. Hundreds of other articles have helped us understand the problem.

History of Solutions: A 45-year war between supporters of

1. Duncan and Davis (1953): a deterministic solution.2. Goodman (1953, 1959): a statistical solution.3. for 50 years, no other methods used in applications.

Gary King () Ecological Inference 4 / 38

Page 12: Advanced Quantitative Research Methodology Lecture Notes

Preliminaries

2. Robinson’s (1950) clarified the problem, causing:

(a) several literatures to wither, including studies of local and regionalpolitics through aggregate electoral statistics in favor of survey researchbased on national samples.

(b) the development of a methodological literature devoted to solving theproblem.

3. Hundreds of other articles have helped us understand the problem.

History of Solutions: A 45-year war between supporters of

1. Duncan and Davis (1953): a deterministic solution.2. Goodman (1953, 1959): a statistical solution.3. for 50 years, no other methods used in applications.

Gary King () Ecological Inference 4 / 38

Page 13: Advanced Quantitative Research Methodology Lecture Notes

Preliminaries

2. Robinson’s (1950) clarified the problem, causing:

(a) several literatures to wither, including studies of local and regionalpolitics through aggregate electoral statistics in favor of survey researchbased on national samples.

(b) the development of a methodological literature devoted to solving theproblem.

3. Hundreds of other articles have helped us understand the problem.

History of Solutions: A 45-year war between supporters of

1. Duncan and Davis (1953): a deterministic solution.2. Goodman (1953, 1959): a statistical solution.3. for 50 years, no other methods used in applications.

Gary King () Ecological Inference 4 / 38

Page 14: Advanced Quantitative Research Methodology Lecture Notes

Preliminaries

2. Robinson’s (1950) clarified the problem, causing:

(a) several literatures to wither, including studies of local and regionalpolitics through aggregate electoral statistics in favor of survey researchbased on national samples.

(b) the development of a methodological literature devoted to solving theproblem.

3. Hundreds of other articles have helped us understand the problem.

History of Solutions: A 45-year war between supporters of

1. Duncan and Davis (1953): a deterministic solution.

2. Goodman (1953, 1959): a statistical solution.3. for 50 years, no other methods used in applications.

Gary King () Ecological Inference 4 / 38

Page 15: Advanced Quantitative Research Methodology Lecture Notes

Preliminaries

2. Robinson’s (1950) clarified the problem, causing:

(a) several literatures to wither, including studies of local and regionalpolitics through aggregate electoral statistics in favor of survey researchbased on national samples.

(b) the development of a methodological literature devoted to solving theproblem.

3. Hundreds of other articles have helped us understand the problem.

History of Solutions: A 45-year war between supporters of

1. Duncan and Davis (1953): a deterministic solution.2. Goodman (1953, 1959): a statistical solution.

3. for 50 years, no other methods used in applications.

Gary King () Ecological Inference 4 / 38

Page 16: Advanced Quantitative Research Methodology Lecture Notes

Preliminaries

2. Robinson’s (1950) clarified the problem, causing:

(a) several literatures to wither, including studies of local and regionalpolitics through aggregate electoral statistics in favor of survey researchbased on national samples.

(b) the development of a methodological literature devoted to solving theproblem.

3. Hundreds of other articles have helped us understand the problem.

History of Solutions: A 45-year war between supporters of

1. Duncan and Davis (1953): a deterministic solution.2. Goodman (1953, 1959): a statistical solution.3. for 50 years, no other methods used in applications.

Gary King () Ecological Inference 4 / 38

Page 17: Advanced Quantitative Research Methodology Lecture Notes

If you can avoid making ecological inferences, do so!

Some of those who aren’t so lucky:

1. Public policy: Applying the Voting Rights Act.

2. History: Who voted for the Nazi’s?

3. Marketing: What types of people buy your products?

4. Banking: Are banks complying with red-lining laws? Are there areaswith certain types of people who might take out loans but have not?

5. Candidates for office: How do good representatives decide whatpolicies they should favor? How can candidates tailor campaign appealsand target voter groups?

6. Sociology: Do the unemployed commit more crimes or is it just thatthere are more crimes in unemployed areas?

7. Economics: With some exceptions, most theories are based onassumptions about individuals, but most data are on groups.

Gary King () Ecological Inference 5 / 38

Page 18: Advanced Quantitative Research Methodology Lecture Notes

If you can avoid making ecological inferences, do so!Some of those who aren’t so lucky:

1. Public policy: Applying the Voting Rights Act.

2. History: Who voted for the Nazi’s?

3. Marketing: What types of people buy your products?

4. Banking: Are banks complying with red-lining laws? Are there areaswith certain types of people who might take out loans but have not?

5. Candidates for office: How do good representatives decide whatpolicies they should favor? How can candidates tailor campaign appealsand target voter groups?

6. Sociology: Do the unemployed commit more crimes or is it just thatthere are more crimes in unemployed areas?

7. Economics: With some exceptions, most theories are based onassumptions about individuals, but most data are on groups.

Gary King () Ecological Inference 5 / 38

Page 19: Advanced Quantitative Research Methodology Lecture Notes

If you can avoid making ecological inferences, do so!Some of those who aren’t so lucky:

1. Public policy: Applying the Voting Rights Act.

2. History: Who voted for the Nazi’s?

3. Marketing: What types of people buy your products?

4. Banking: Are banks complying with red-lining laws? Are there areaswith certain types of people who might take out loans but have not?

5. Candidates for office: How do good representatives decide whatpolicies they should favor? How can candidates tailor campaign appealsand target voter groups?

6. Sociology: Do the unemployed commit more crimes or is it just thatthere are more crimes in unemployed areas?

7. Economics: With some exceptions, most theories are based onassumptions about individuals, but most data are on groups.

Gary King () Ecological Inference 5 / 38

Page 20: Advanced Quantitative Research Methodology Lecture Notes

If you can avoid making ecological inferences, do so!Some of those who aren’t so lucky:

1. Public policy: Applying the Voting Rights Act.

2. History: Who voted for the Nazi’s?

3. Marketing: What types of people buy your products?

4. Banking: Are banks complying with red-lining laws? Are there areaswith certain types of people who might take out loans but have not?

5. Candidates for office: How do good representatives decide whatpolicies they should favor? How can candidates tailor campaign appealsand target voter groups?

6. Sociology: Do the unemployed commit more crimes or is it just thatthere are more crimes in unemployed areas?

7. Economics: With some exceptions, most theories are based onassumptions about individuals, but most data are on groups.

Gary King () Ecological Inference 5 / 38

Page 21: Advanced Quantitative Research Methodology Lecture Notes

If you can avoid making ecological inferences, do so!Some of those who aren’t so lucky:

1. Public policy: Applying the Voting Rights Act.

2. History: Who voted for the Nazi’s?

3. Marketing: What types of people buy your products?

4. Banking: Are banks complying with red-lining laws? Are there areaswith certain types of people who might take out loans but have not?

5. Candidates for office: How do good representatives decide whatpolicies they should favor? How can candidates tailor campaign appealsand target voter groups?

6. Sociology: Do the unemployed commit more crimes or is it just thatthere are more crimes in unemployed areas?

7. Economics: With some exceptions, most theories are based onassumptions about individuals, but most data are on groups.

Gary King () Ecological Inference 5 / 38

Page 22: Advanced Quantitative Research Methodology Lecture Notes

If you can avoid making ecological inferences, do so!Some of those who aren’t so lucky:

1. Public policy: Applying the Voting Rights Act.

2. History: Who voted for the Nazi’s?

3. Marketing: What types of people buy your products?

4. Banking: Are banks complying with red-lining laws? Are there areaswith certain types of people who might take out loans but have not?

5. Candidates for office: How do good representatives decide whatpolicies they should favor? How can candidates tailor campaign appealsand target voter groups?

6. Sociology: Do the unemployed commit more crimes or is it just thatthere are more crimes in unemployed areas?

7. Economics: With some exceptions, most theories are based onassumptions about individuals, but most data are on groups.

Gary King () Ecological Inference 5 / 38

Page 23: Advanced Quantitative Research Methodology Lecture Notes

If you can avoid making ecological inferences, do so!Some of those who aren’t so lucky:

1. Public policy: Applying the Voting Rights Act.

2. History: Who voted for the Nazi’s?

3. Marketing: What types of people buy your products?

4. Banking: Are banks complying with red-lining laws? Are there areaswith certain types of people who might take out loans but have not?

5. Candidates for office: How do good representatives decide whatpolicies they should favor? How can candidates tailor campaign appealsand target voter groups?

6. Sociology: Do the unemployed commit more crimes or is it just thatthere are more crimes in unemployed areas?

7. Economics: With some exceptions, most theories are based onassumptions about individuals, but most data are on groups.

Gary King () Ecological Inference 5 / 38

Page 24: Advanced Quantitative Research Methodology Lecture Notes

If you can avoid making ecological inferences, do so!Some of those who aren’t so lucky:

1. Public policy: Applying the Voting Rights Act.

2. History: Who voted for the Nazi’s?

3. Marketing: What types of people buy your products?

4. Banking: Are banks complying with red-lining laws? Are there areaswith certain types of people who might take out loans but have not?

5. Candidates for office: How do good representatives decide whatpolicies they should favor? How can candidates tailor campaign appealsand target voter groups?

6. Sociology: Do the unemployed commit more crimes or is it just thatthere are more crimes in unemployed areas?

7. Economics: With some exceptions, most theories are based onassumptions about individuals, but most data are on groups.

Gary King () Ecological Inference 5 / 38

Page 25: Advanced Quantitative Research Methodology Lecture Notes

If you can avoid making ecological inferences, do so!Some of those who aren’t so lucky:

8. Education: Do students who attend private schools through a vouchersystem do as well as students who can afford to attend on their own?

9. Atmospheric physics: How can we tell which types of the vehiclesactually on the roads emit more carbon dioxide and carbon monoxide?

10. Oceanography: How many marine organisms of a certain type werecollected at a given depth, from fishing nets dropped from the surfacedown through a variety of depths.

11. Epidemiology: Does radon cause lung cancer?

12. Changes in public opinion: How to use repeated independentcross-sectional surveys to measure individual change?

Gary King () Ecological Inference 6 / 38

Page 26: Advanced Quantitative Research Methodology Lecture Notes

If you can avoid making ecological inferences, do so!Some of those who aren’t so lucky:

8. Education: Do students who attend private schools through a vouchersystem do as well as students who can afford to attend on their own?

9. Atmospheric physics: How can we tell which types of the vehiclesactually on the roads emit more carbon dioxide and carbon monoxide?

10. Oceanography: How many marine organisms of a certain type werecollected at a given depth, from fishing nets dropped from the surfacedown through a variety of depths.

11. Epidemiology: Does radon cause lung cancer?

12. Changes in public opinion: How to use repeated independentcross-sectional surveys to measure individual change?

Gary King () Ecological Inference 6 / 38

Page 27: Advanced Quantitative Research Methodology Lecture Notes

If you can avoid making ecological inferences, do so!Some of those who aren’t so lucky:

8. Education: Do students who attend private schools through a vouchersystem do as well as students who can afford to attend on their own?

9. Atmospheric physics: How can we tell which types of the vehiclesactually on the roads emit more carbon dioxide and carbon monoxide?

10. Oceanography: How many marine organisms of a certain type werecollected at a given depth, from fishing nets dropped from the surfacedown through a variety of depths.

11. Epidemiology: Does radon cause lung cancer?

12. Changes in public opinion: How to use repeated independentcross-sectional surveys to measure individual change?

Gary King () Ecological Inference 6 / 38

Page 28: Advanced Quantitative Research Methodology Lecture Notes

If you can avoid making ecological inferences, do so!Some of those who aren’t so lucky:

8. Education: Do students who attend private schools through a vouchersystem do as well as students who can afford to attend on their own?

9. Atmospheric physics: How can we tell which types of the vehiclesactually on the roads emit more carbon dioxide and carbon monoxide?

10. Oceanography: How many marine organisms of a certain type werecollected at a given depth, from fishing nets dropped from the surfacedown through a variety of depths.

11. Epidemiology: Does radon cause lung cancer?

12. Changes in public opinion: How to use repeated independentcross-sectional surveys to measure individual change?

Gary King () Ecological Inference 6 / 38

Page 29: Advanced Quantitative Research Methodology Lecture Notes

If you can avoid making ecological inferences, do so!Some of those who aren’t so lucky:

8. Education: Do students who attend private schools through a vouchersystem do as well as students who can afford to attend on their own?

9. Atmospheric physics: How can we tell which types of the vehiclesactually on the roads emit more carbon dioxide and carbon monoxide?

10. Oceanography: How many marine organisms of a certain type werecollected at a given depth, from fishing nets dropped from the surfacedown through a variety of depths.

11. Epidemiology: Does radon cause lung cancer?

12. Changes in public opinion: How to use repeated independentcross-sectional surveys to measure individual change?

Gary King () Ecological Inference 6 / 38

Page 30: Advanced Quantitative Research Methodology Lecture Notes

The Problem: The District Level

Race of

Voting Age Voting Decision

Person Democrat Republican No vote

black ? ? ? 55,054

white ? ? ? 25,706

19,896 10,936 49,928 80,760

The Ecological Inference Problem at the District-Level: The 1990Election to the Ohio State House, District 42. The goal is to inferfrom the marginal entries (each of which is the sum of the corre-sponding row or column) to the cell entries. (Note information in thebounds.)

Gary King () Ecological Inference 7 / 38

Page 31: Advanced Quantitative Research Methodology Lecture Notes

The Problem: The Precinct Level

Race of

Voting Age Voting Decision

Person Democrat Republican No vote

black ? ? ? 221

white ? ? ? 484

130 92 483 705

The Ecological Inference Problem at the Precinct-Level: Precinct Pin District 42 (1 of 131 in the district). The goal is to infer from themargins of a set of tables like this one to the cell entries in each.

Gary King () Ecological Inference 8 / 38

Page 32: Advanced Quantitative Research Methodology Lecture Notes

The best we could do, circa 1996

Estimated Percent of BlacksYear District Voting for the Democratic Candidate1986 12 95.65%

23 100.0629 103.4731 98.9242 108.4145 93.58

1988 12 95.6723 102.6429 105.0031 100.2042 111.0545 97.49

Sample Ecological Inferences: All Ohio State House districts where anAfrican American Democrat ran against a white Republican, 1986–1990 (Source: “Statement of Gordon G. Henderson,” presented as anexhibit in federal court, using Goodman’s regression). Figures above100% are logically impossible.

Gary King () Ecological Inference 9 / 38

Page 33: Advanced Quantitative Research Methodology Lecture Notes

The best we could do, circa 1996: Continued

Estimated Percent of BlacksYear District Voting for the Democratic Candidate1990 12 94.79%

14 97.8316 94.3623 101.0925 98.8329 103.4231 102.1736 101.3537 101.3942 109.6345 97.62

Sample Ecological Inferences: All Ohio State House districts where anAfrican American Democrat ran against a white Republican, 1986–1990 (Source: “Statement of Gordon G. Henderson,” presented as anexhibit in federal court, using Goodman’s regression). Figures above100% are logically impossible.

Gary King () Ecological Inference 10 / 38

Page 34: Advanced Quantitative Research Methodology Lecture Notes

What Information Does The New Method Provide?

Goodman’s Method: One incorrect number (5 standard deviations outside thedeterministic bounds)

The New Method:

Non-minority Turnout in New Jersey Cities and Towns. In contrast to the best existing

methods, which provide one (incorrect) number for the entire state, the method offered

here gives an accurate estimate of white turnout for all 567 minor civil divisions in the

state, a few of which are labeled.

Gary King () Ecological Inference 11 / 38

Page 35: Advanced Quantitative Research Methodology Lecture Notes

What Information Does The New Method Provide?

Goodman’s Method: One incorrect number (5 standard deviations outside thedeterministic bounds)

The New Method:

Non-minority Turnout in New Jersey Cities and Towns. In contrast to the best existing

methods, which provide one (incorrect) number for the entire state, the method offered

here gives an accurate estimate of white turnout for all 567 minor civil divisions in the

state, a few of which are labeled.

Gary King () Ecological Inference 11 / 38

Page 36: Advanced Quantitative Research Methodology Lecture Notes

What Information Does The New Method Provide?

Goodman’s Method: One incorrect number (5 standard deviations outside thedeterministic bounds)

The New Method:

Non-minority Turnout in New Jersey Cities and Towns. In contrast to the best existing

methods, which provide one (incorrect) number for the entire state, the method offered

here gives an accurate estimate of white turnout for all 567 minor civil divisions in the

state, a few of which are labeled.Gary King () Ecological Inference 11 / 38

Page 37: Advanced Quantitative Research Methodology Lecture Notes

Notation

Vote No vote

black βbi 1− βb

i Xi

white βwi 1− βw

i 1− Xi

Ti 1− Ti

Notation for Precinct i (i = 1, . . . , p).

Observed variables:

Ti = voter Turnout in precinct i

Xi = Black proportion of Voting Age Population in precinct i

Unobserved quantities of interest:

βbi = fraction of blacks who vote in precinct i

βwi = fraction of whites who vote in precinct i

Gary King () Ecological Inference 12 / 38

Page 38: Advanced Quantitative Research Methodology Lecture Notes

Notation

Vote No vote

black βbi 1− βb

i Xi

white βwi 1− βw

i 1− Xi

Ti 1− Ti

Notation for Precinct i (i = 1, . . . , p).

Observed variables:

Ti = voter Turnout in precinct i

Xi = Black proportion of Voting Age Population in precinct i

Unobserved quantities of interest:

βbi = fraction of blacks who vote in precinct i

βwi = fraction of whites who vote in precinct i

Gary King () Ecological Inference 12 / 38

Page 39: Advanced Quantitative Research Methodology Lecture Notes

Notation

Vote No vote

black βbi 1− βb

i Xi

white βwi 1− βw

i 1− Xi

Ti 1− Ti

Notation for Precinct i (i = 1, . . . , p).

Observed variables:

Ti = voter Turnout in precinct i

Xi = Black proportion of Voting Age Population in precinct i

Unobserved quantities of interest:

βbi = fraction of blacks who vote in precinct i

βwi = fraction of whites who vote in precinct i

Gary King () Ecological Inference 12 / 38

Page 40: Advanced Quantitative Research Methodology Lecture Notes

Notation

Vote No vote

black βbi 1− βb

i Xi

white βwi 1− βw

i 1− Xi

Ti 1− Ti

Notation for Precinct i (i = 1, . . . , p).

Observed variables:

Ti = voter Turnout in precinct i

Xi = Black proportion of Voting Age Population in precinct i

Unobserved quantities of interest:

βbi = fraction of blacks who vote in precinct i

βwi = fraction of whites who vote in precinct i

Gary King () Ecological Inference 12 / 38

Page 41: Advanced Quantitative Research Methodology Lecture Notes

Notation

Vote No vote

black βbi 1− βb

i Xi

white βwi 1− βw

i 1− Xi

Ti 1− Ti

Notation for Precinct i (i = 1, . . . , p).

Observed variables:

Ti = voter Turnout in precinct i

Xi = Black proportion of Voting Age Population in precinct i

Unobserved quantities of interest:

βbi = fraction of blacks who vote in precinct i

βwi = fraction of whites who vote in precinct i

Gary King () Ecological Inference 12 / 38

Page 42: Advanced Quantitative Research Methodology Lecture Notes

Notation

Vote No vote

black βbi 1− βb

i Xi

white βwi 1− βw

i 1− Xi

Ti 1− Ti

Notation for Precinct i (i = 1, . . . , p).

Observed variables:

Ti = voter Turnout in precinct i

Xi = Black proportion of Voting Age Population in precinct i

Unobserved quantities of interest:

βbi = fraction of blacks who vote in precinct i

βwi = fraction of whites who vote in precinct i

Gary King () Ecological Inference 12 / 38

Page 43: Advanced Quantitative Research Methodology Lecture Notes

Notation

Vote No vote

black βbi 1− βb

i Xi

white βwi 1− βw

i 1− Xi

Ti 1− Ti

Notation for Precinct i (i = 1, . . . , p).

Observed variables:

Ti = voter Turnout in precinct i

Xi = Black proportion of Voting Age Population in precinct i

Unobserved quantities of interest:

βbi = fraction of blacks who vote in precinct i

βwi = fraction of whites who vote in precinct i

Gary King () Ecological Inference 12 / 38

Page 44: Advanced Quantitative Research Methodology Lecture Notes

Notation

An accounting identity (a fact, not an assumption):

Ti = βbi Xi + βw

i (1− Xi )

= βwi + (βb

i − βwi )Xi

Goodman’s regression:Run a regression of Ti on Xi and (1−Xi ) (no constant term). Coefficientsare intended to be:

Bb, District-wide black turnout

Bw , District-wide white turnout

Gary King () Ecological Inference 13 / 38

Page 45: Advanced Quantitative Research Methodology Lecture Notes

Notation

An accounting identity (a fact, not an assumption):

Ti = βbi Xi + βw

i (1− Xi )

= βwi + (βb

i − βwi )Xi

Goodman’s regression:Run a regression of Ti on Xi and (1−Xi ) (no constant term). Coefficientsare intended to be:

Bb, District-wide black turnout

Bw , District-wide white turnout

Gary King () Ecological Inference 13 / 38

Page 46: Advanced Quantitative Research Methodology Lecture Notes

Notation

An accounting identity (a fact, not an assumption):

Ti = βbi Xi + βw

i (1− Xi )

= βwi + (βb

i − βwi )Xi

Goodman’s regression:Run a regression of Ti on Xi and (1−Xi ) (no constant term). Coefficientsare intended to be:

Bb, District-wide black turnout

Bw , District-wide white turnout

Gary King () Ecological Inference 13 / 38

Page 47: Advanced Quantitative Research Methodology Lecture Notes

Notation

An accounting identity (a fact, not an assumption):

Ti = βbi Xi + βw

i (1− Xi )

= βwi + (βb

i − βwi )Xi

Goodman’s regression:

Run a regression of Ti on Xi and (1−Xi ) (no constant term). Coefficientsare intended to be:

Bb, District-wide black turnout

Bw , District-wide white turnout

Gary King () Ecological Inference 13 / 38

Page 48: Advanced Quantitative Research Methodology Lecture Notes

Notation

An accounting identity (a fact, not an assumption):

Ti = βbi Xi + βw

i (1− Xi )

= βwi + (βb

i − βwi )Xi

Goodman’s regression:Run a regression of Ti on Xi and (1−Xi ) (no constant term). Coefficientsare intended to be:

Bb, District-wide black turnout

Bw , District-wide white turnout

Gary King () Ecological Inference 13 / 38

Page 49: Advanced Quantitative Research Methodology Lecture Notes

Notation

An accounting identity (a fact, not an assumption):

Ti = βbi Xi + βw

i (1− Xi )

= βwi + (βb

i − βwi )Xi

Goodman’s regression:Run a regression of Ti on Xi and (1−Xi ) (no constant term). Coefficientsare intended to be:

Bb, District-wide black turnout

Bw , District-wide white turnout

Gary King () Ecological Inference 13 / 38

Page 50: Advanced Quantitative Research Methodology Lecture Notes

Notation

An accounting identity (a fact, not an assumption):

Ti = βbi Xi + βw

i (1− Xi )

= βwi + (βb

i − βwi )Xi

Goodman’s regression:Run a regression of Ti on Xi and (1−Xi ) (no constant term). Coefficientsare intended to be:

Bb, District-wide black turnout

Bw , District-wide white turnout

Gary King () Ecological Inference 13 / 38

Page 51: Advanced Quantitative Research Methodology Lecture Notes

Selected Problems with the Goodman’s Approach

If we follow Goodman’s advice, we won’t apply the model.If we don’t follow Goodman’s advice & apply it anyway:

1. We know parameters are not constant

0 .25 .5 .75 10

.25

.5

.75

1

Xi

Ti

Precincts in Marion County, In-diana: Voter Turnout for theU.S. Senate by Fraction Black,1990.

Gary King () Ecological Inference 14 / 38

Page 52: Advanced Quantitative Research Methodology Lecture Notes

Selected Problems with the Goodman’s Approach

If we follow Goodman’s advice, we won’t apply the model.

If we don’t follow Goodman’s advice & apply it anyway:

1. We know parameters are not constant

0 .25 .5 .75 10

.25

.5

.75

1

Xi

Ti

Precincts in Marion County, In-diana: Voter Turnout for theU.S. Senate by Fraction Black,1990.

Gary King () Ecological Inference 14 / 38

Page 53: Advanced Quantitative Research Methodology Lecture Notes

Selected Problems with the Goodman’s Approach

If we follow Goodman’s advice, we won’t apply the model.If we don’t follow Goodman’s advice & apply it anyway:

1. We know parameters are not constant

0 .25 .5 .75 10

.25

.5

.75

1

Xi

Ti

Precincts in Marion County, In-diana: Voter Turnout for theU.S. Senate by Fraction Black,1990.

Gary King () Ecological Inference 14 / 38

Page 54: Advanced Quantitative Research Methodology Lecture Notes

Selected Problems with the Goodman’s Approach

If we follow Goodman’s advice, we won’t apply the model.If we don’t follow Goodman’s advice & apply it anyway:

1. We know parameters are not constant

0 .25 .5 .75 10

.25

.5

.75

1

Xi

Ti

Precincts in Marion County, In-diana: Voter Turnout for theU.S. Senate by Fraction Black,1990.

Gary King () Ecological Inference 14 / 38

Page 55: Advanced Quantitative Research Methodology Lecture Notes

Selected Problems with the Goodman’s Approach

The accounting identity, Ti = βbi Xi + βw

i (1− Xi ), contains no errorother than due to parameter variation. Thus, all scatter around theregression line is due to parameter variation.

2. Goodman’s model does not take into account information from themethod of bounds or from massive heteroskedasticity in aggregatedata. See the graph.

3. Goodman’s regression is biased in the presence of aggregation bias:C (βb

i ,Xi ) 6= 0 or C (βwi ,Xi ) 6= 0 (True in any regression even if not

ecological.)

Gary King () Ecological Inference 15 / 38

Page 56: Advanced Quantitative Research Methodology Lecture Notes

Selected Problems with the Goodman’s Approach

The accounting identity, Ti = βbi Xi + βw

i (1− Xi ), contains no errorother than due to parameter variation. Thus, all scatter around theregression line is due to parameter variation.

2. Goodman’s model does not take into account information from themethod of bounds or from massive heteroskedasticity in aggregatedata. See the graph.

3. Goodman’s regression is biased in the presence of aggregation bias:C (βb

i ,Xi ) 6= 0 or C (βwi ,Xi ) 6= 0 (True in any regression even if not

ecological.)

Gary King () Ecological Inference 15 / 38

Page 57: Advanced Quantitative Research Methodology Lecture Notes

Selected Problems with the Goodman’s Approach

The accounting identity, Ti = βbi Xi + βw

i (1− Xi ), contains no errorother than due to parameter variation. Thus, all scatter around theregression line is due to parameter variation.

2. Goodman’s model does not take into account information from themethod of bounds or from massive heteroskedasticity in aggregatedata. See the graph.

3. Goodman’s regression is biased in the presence of aggregation bias:C (βb

i ,Xi ) 6= 0 or C (βwi ,Xi ) 6= 0 (True in any regression even if not

ecological.)

Gary King () Ecological Inference 15 / 38

Page 58: Advanced Quantitative Research Methodology Lecture Notes

Selected Problems with the Goodman’s Approach

4. We cannot correct for aggregation bias within Goodman’s framework.

(a) The good idea that doesn’t work: since the coefficients vary with Xi , let’smodel that explicitly, hence using Xi to control for the covariation.

(b) More specifically, even if C (βbi ,Xi ) 6= 0, if we control for Zi it might be

true that C (βbi ,Xi |Zi ) = 0. And if Zi = Xi , its true for sure.

(c) Take Goodman’s regression E (Ti ) = BbXi + Bw (1− Xi )(d) Let Bb = γ0 + γ1Xi and Bw = θ0 + θ1Xi and substitute:

E (Ti ) = (γ0 + γ1Xi )Xi + (θ0 + θ1Xi )(1− Xi )

= θ0 + (γ0 + θ1 − θ0)Xi − (γ1 − θ1)X2i

(e) Model is not identified: Four parameters need to be estimated (γ0, γ1, θ0,and θ1), but only 3 can be estimated (θ0 and coefficients in parens on Xi

and X 2i ).

5. If the number of people differs across precinct, Goodman’s model is notestimating the correct quantity of interest.

Gary King () Ecological Inference 16 / 38

Page 59: Advanced Quantitative Research Methodology Lecture Notes

Selected Problems with the Goodman’s Approach

4. We cannot correct for aggregation bias within Goodman’s framework.

(a) The good idea that doesn’t work: since the coefficients vary with Xi , let’smodel that explicitly, hence using Xi to control for the covariation.

(b) More specifically, even if C (βbi ,Xi ) 6= 0, if we control for Zi it might be

true that C (βbi ,Xi |Zi ) = 0. And if Zi = Xi , its true for sure.

(c) Take Goodman’s regression E (Ti ) = BbXi + Bw (1− Xi )(d) Let Bb = γ0 + γ1Xi and Bw = θ0 + θ1Xi and substitute:

E (Ti ) = (γ0 + γ1Xi )Xi + (θ0 + θ1Xi )(1− Xi )

= θ0 + (γ0 + θ1 − θ0)Xi − (γ1 − θ1)X2i

(e) Model is not identified: Four parameters need to be estimated (γ0, γ1, θ0,and θ1), but only 3 can be estimated (θ0 and coefficients in parens on Xi

and X 2i ).

5. If the number of people differs across precinct, Goodman’s model is notestimating the correct quantity of interest.

Gary King () Ecological Inference 16 / 38

Page 60: Advanced Quantitative Research Methodology Lecture Notes

Selected Problems with the Goodman’s Approach

4. We cannot correct for aggregation bias within Goodman’s framework.

(a) The good idea that doesn’t work: since the coefficients vary with Xi , let’smodel that explicitly, hence using Xi to control for the covariation.

(b) More specifically, even if C (βbi ,Xi ) 6= 0, if we control for Zi it might be

true that C (βbi ,Xi |Zi ) = 0. And if Zi = Xi , its true for sure.

(c) Take Goodman’s regression E (Ti ) = BbXi + Bw (1− Xi )(d) Let Bb = γ0 + γ1Xi and Bw = θ0 + θ1Xi and substitute:

E (Ti ) = (γ0 + γ1Xi )Xi + (θ0 + θ1Xi )(1− Xi )

= θ0 + (γ0 + θ1 − θ0)Xi − (γ1 − θ1)X2i

(e) Model is not identified: Four parameters need to be estimated (γ0, γ1, θ0,and θ1), but only 3 can be estimated (θ0 and coefficients in parens on Xi

and X 2i ).

5. If the number of people differs across precinct, Goodman’s model is notestimating the correct quantity of interest.

Gary King () Ecological Inference 16 / 38

Page 61: Advanced Quantitative Research Methodology Lecture Notes

Selected Problems with the Goodman’s Approach

4. We cannot correct for aggregation bias within Goodman’s framework.

(a) The good idea that doesn’t work: since the coefficients vary with Xi , let’smodel that explicitly, hence using Xi to control for the covariation.

(b) More specifically, even if C (βbi ,Xi ) 6= 0, if we control for Zi it might be

true that C (βbi ,Xi |Zi ) = 0. And if Zi = Xi , its true for sure.

(c) Take Goodman’s regression E (Ti ) = BbXi + Bw (1− Xi )(d) Let Bb = γ0 + γ1Xi and Bw = θ0 + θ1Xi and substitute:

E (Ti ) = (γ0 + γ1Xi )Xi + (θ0 + θ1Xi )(1− Xi )

= θ0 + (γ0 + θ1 − θ0)Xi − (γ1 − θ1)X2i

(e) Model is not identified: Four parameters need to be estimated (γ0, γ1, θ0,and θ1), but only 3 can be estimated (θ0 and coefficients in parens on Xi

and X 2i ).

5. If the number of people differs across precinct, Goodman’s model is notestimating the correct quantity of interest.

Gary King () Ecological Inference 16 / 38

Page 62: Advanced Quantitative Research Methodology Lecture Notes

Selected Problems with the Goodman’s Approach

4. We cannot correct for aggregation bias within Goodman’s framework.

(a) The good idea that doesn’t work: since the coefficients vary with Xi , let’smodel that explicitly, hence using Xi to control for the covariation.

(b) More specifically, even if C (βbi ,Xi ) 6= 0, if we control for Zi it might be

true that C (βbi ,Xi |Zi ) = 0. And if Zi = Xi , its true for sure.

(c) Take Goodman’s regression E (Ti ) = BbXi + Bw (1− Xi )

(d) Let Bb = γ0 + γ1Xi and Bw = θ0 + θ1Xi and substitute:

E (Ti ) = (γ0 + γ1Xi )Xi + (θ0 + θ1Xi )(1− Xi )

= θ0 + (γ0 + θ1 − θ0)Xi − (γ1 − θ1)X2i

(e) Model is not identified: Four parameters need to be estimated (γ0, γ1, θ0,and θ1), but only 3 can be estimated (θ0 and coefficients in parens on Xi

and X 2i ).

5. If the number of people differs across precinct, Goodman’s model is notestimating the correct quantity of interest.

Gary King () Ecological Inference 16 / 38

Page 63: Advanced Quantitative Research Methodology Lecture Notes

Selected Problems with the Goodman’s Approach

4. We cannot correct for aggregation bias within Goodman’s framework.

(a) The good idea that doesn’t work: since the coefficients vary with Xi , let’smodel that explicitly, hence using Xi to control for the covariation.

(b) More specifically, even if C (βbi ,Xi ) 6= 0, if we control for Zi it might be

true that C (βbi ,Xi |Zi ) = 0. And if Zi = Xi , its true for sure.

(c) Take Goodman’s regression E (Ti ) = BbXi + Bw (1− Xi )(d) Let Bb = γ0 + γ1Xi and Bw = θ0 + θ1Xi and substitute:

E (Ti ) = (γ0 + γ1Xi )Xi + (θ0 + θ1Xi )(1− Xi )

= θ0 + (γ0 + θ1 − θ0)Xi − (γ1 − θ1)X2i

(e) Model is not identified: Four parameters need to be estimated (γ0, γ1, θ0,and θ1), but only 3 can be estimated (θ0 and coefficients in parens on Xi

and X 2i ).

5. If the number of people differs across precinct, Goodman’s model is notestimating the correct quantity of interest.

Gary King () Ecological Inference 16 / 38

Page 64: Advanced Quantitative Research Methodology Lecture Notes

Selected Problems with the Goodman’s Approach

4. We cannot correct for aggregation bias within Goodman’s framework.

(a) The good idea that doesn’t work: since the coefficients vary with Xi , let’smodel that explicitly, hence using Xi to control for the covariation.

(b) More specifically, even if C (βbi ,Xi ) 6= 0, if we control for Zi it might be

true that C (βbi ,Xi |Zi ) = 0. And if Zi = Xi , its true for sure.

(c) Take Goodman’s regression E (Ti ) = BbXi + Bw (1− Xi )(d) Let Bb = γ0 + γ1Xi and Bw = θ0 + θ1Xi and substitute:

E (Ti ) = (γ0 + γ1Xi )Xi + (θ0 + θ1Xi )(1− Xi )

= θ0 + (γ0 + θ1 − θ0)Xi − (γ1 − θ1)X2i

(e) Model is not identified: Four parameters need to be estimated (γ0, γ1, θ0,and θ1), but only 3 can be estimated (θ0 and coefficients in parens on Xi

and X 2i ).

5. If the number of people differs across precinct, Goodman’s model is notestimating the correct quantity of interest.

Gary King () Ecological Inference 16 / 38

Page 65: Advanced Quantitative Research Methodology Lecture Notes

Selected Problems with the Goodman’s Approach

4. We cannot correct for aggregation bias within Goodman’s framework.

(a) The good idea that doesn’t work: since the coefficients vary with Xi , let’smodel that explicitly, hence using Xi to control for the covariation.

(b) More specifically, even if C (βbi ,Xi ) 6= 0, if we control for Zi it might be

true that C (βbi ,Xi |Zi ) = 0. And if Zi = Xi , its true for sure.

(c) Take Goodman’s regression E (Ti ) = BbXi + Bw (1− Xi )(d) Let Bb = γ0 + γ1Xi and Bw = θ0 + θ1Xi and substitute:

E (Ti ) = (γ0 + γ1Xi )Xi + (θ0 + θ1Xi )(1− Xi )

= θ0 + (γ0 + θ1 − θ0)Xi − (γ1 − θ1)X2i

(e) Model is not identified: Four parameters need to be estimated (γ0, γ1, θ0,and θ1), but only 3 can be estimated (θ0 and coefficients in parens on Xi

and X 2i ).

5. If the number of people differs across precinct, Goodman’s model is notestimating the correct quantity of interest.

Gary King () Ecological Inference 16 / 38

Page 66: Advanced Quantitative Research Methodology Lecture Notes

Selected Problems with the Goodman’s Approach

4. We cannot correct for aggregation bias within Goodman’s framework.

(a) The good idea that doesn’t work: since the coefficients vary with Xi , let’smodel that explicitly, hence using Xi to control for the covariation.

(b) More specifically, even if C (βbi ,Xi ) 6= 0, if we control for Zi it might be

true that C (βbi ,Xi |Zi ) = 0. And if Zi = Xi , its true for sure.

(c) Take Goodman’s regression E (Ti ) = BbXi + Bw (1− Xi )(d) Let Bb = γ0 + γ1Xi and Bw = θ0 + θ1Xi and substitute:

E (Ti ) = (γ0 + γ1Xi )Xi + (θ0 + θ1Xi )(1− Xi )

= θ0 + (γ0 + θ1 − θ0)Xi − (γ1 − θ1)X2i

(e) Model is not identified: Four parameters need to be estimated (γ0, γ1, θ0,and θ1), but only 3 can be estimated (θ0 and coefficients in parens on Xi

and X 2i ).

5. If the number of people differs across precinct, Goodman’s model is notestimating the correct quantity of interest.

Gary King () Ecological Inference 16 / 38

Page 67: Advanced Quantitative Research Methodology Lecture Notes

Selected Problems with the Goodman’s Approach

4. We cannot correct for aggregation bias within Goodman’s framework.

(a) The good idea that doesn’t work: since the coefficients vary with Xi , let’smodel that explicitly, hence using Xi to control for the covariation.

(b) More specifically, even if C (βbi ,Xi ) 6= 0, if we control for Zi it might be

true that C (βbi ,Xi |Zi ) = 0. And if Zi = Xi , its true for sure.

(c) Take Goodman’s regression E (Ti ) = BbXi + Bw (1− Xi )(d) Let Bb = γ0 + γ1Xi and Bw = θ0 + θ1Xi and substitute:

E (Ti ) = (γ0 + γ1Xi )Xi + (θ0 + θ1Xi )(1− Xi )

= θ0 + (γ0 + θ1 − θ0)Xi − (γ1 − θ1)X2i

(e) Model is not identified: Four parameters need to be estimated (γ0, γ1, θ0,and θ1), but only 3 can be estimated (θ0 and coefficients in parens on Xi

and X 2i ).

5. If the number of people differs across precinct, Goodman’s model is notestimating the correct quantity of interest.

Gary King () Ecological Inference 16 / 38

Page 68: Advanced Quantitative Research Methodology Lecture Notes

The Data

0 .25 .5 .75 10

.25

.5

.75

1

Xi

Ti

A Scattercross Graph of Voter Turnout by Fraction Hispanic

Solve the accounting identity:

Ti = βwi + (βb

i − βwi )Xi

for the unknowns:

βwi =

„Ti

1− Xi

«−

„Xi

1− Xi

«βb

i

Gary King () Ecological Inference 17 / 38

Page 69: Advanced Quantitative Research Methodology Lecture Notes

The Data

0 .25 .5 .75 10

.25

.5

.75

1

Xi

Ti

A Scattercross Graph of Voter Turnout by Fraction Hispanic

Solve the accounting identity:

Ti = βwi + (βb

i − βwi )Xi

for the unknowns:

βwi =

„Ti

1− Xi

«−

„Xi

1− Xi

«βb

i

Gary King () Ecological Inference 17 / 38

Page 70: Advanced Quantitative Research Methodology Lecture Notes

The Data

0 .25 .5 .75 10

.25

.5

.75

1

Xi

Ti

A Scattercross Graph of Voter Turnout by Fraction Hispanic

Solve the accounting identity:

Ti = βwi + (βb

i − βwi )Xi

for the unknowns:

βwi =

„Ti

1− Xi

«−

„Xi

1− Xi

«βb

i

Gary King () Ecological Inference 17 / 38

Page 71: Advanced Quantitative Research Methodology Lecture Notes

The Data

0 .25 .5 .75 10

.25

.5

.75

1

Xi

Ti

A Scattercross Graph of Voter Turnout by Fraction Hispanic

Solve the accounting identity:

Ti = βwi + (βb

i − βwi )Xi

for the unknowns:

βwi =

„Ti

1− Xi

«−

„Xi

1− Xi

«βb

i

Gary King () Ecological Inference 17 / 38

Page 72: Advanced Quantitative Research Methodology Lecture Notes

The Data

0 .25 .5 .75 10

.25

.5

.75

1

Xi

Ti

A Scattercross Graph of Voter Turnout by Fraction Hispanic

Solve the accounting identity:

Ti = βwi + (βb

i − βwi )Xi

for the unknowns:

βwi =

„Ti

1− Xi

«−

„Xi

1− Xi

«βb

i

Gary King () Ecological Inference 17 / 38

Page 73: Advanced Quantitative Research Methodology Lecture Notes

The Data: Continued

Precinct 52: T52 = .19, X52 = .88

βw52 =

T52

1− X52− X52

1− X52βb

52

=.19

1− .88− .88

1− .88βb

52

= 1.58− 7.33βb52

0 .25 .5 .75 10

.25

.5

.75

1

βbi

βwi

Gary King () Ecological Inference 18 / 38

Page 74: Advanced Quantitative Research Methodology Lecture Notes

The Data: Continued

Precinct 52: T52 = .19, X52 = .88

βw52 =

T52

1− X52− X52

1− X52βb

52

=.19

1− .88− .88

1− .88βb

52

= 1.58− 7.33βb52

0 .25 .5 .75 10

.25

.5

.75

1

βbi

βwi

Gary King () Ecological Inference 18 / 38

Page 75: Advanced Quantitative Research Methodology Lecture Notes

The Data: Continued

Precinct 52: T52 = .19, X52 = .88

βw52 =

T52

1− X52− X52

1− X52βb

52

=.19

1− .88− .88

1− .88βb

52

= 1.58− 7.33βb52

0 .25 .5 .75 10

.25

.5

.75

1

βbi

βwi

Gary King () Ecological Inference 18 / 38

Page 76: Advanced Quantitative Research Methodology Lecture Notes

The Data: Continued

Precinct 52: T52 = .19, X52 = .88

βw52 =

T52

1− X52− X52

1− X52βb

52

=.19

1− .88− .88

1− .88βb

52

= 1.58− 7.33βb52

0 .25 .5 .75 10

.25

.5

.75

1

βbi

βwi

Gary King () Ecological Inference 18 / 38

Page 77: Advanced Quantitative Research Methodology Lecture Notes

The Data: Continued

Precinct 52: T52 = .19, X52 = .88

βw52 =

T52

1− X52− X52

1− X52βb

52

=.19

1− .88− .88

1− .88βb

52

= 1.58− 7.33βb52

0 .25 .5 .75 10

.25

.5

.75

1

βbi

βwi

Gary King () Ecological Inference 18 / 38

Page 78: Advanced Quantitative Research Methodology Lecture Notes

The Model for Data Without Aggregation Bias, ButRobust in its Presence

The Goal: Knowledge of βbi and βw

i in each precinct.Begin with the basic accounting identity (not an assumption of linearity):

Ti = βbi Xi + βw

i (1− Xi )

add three assumptions (in the basic version of the model):

1. βbi and βw

i are truncated bivariate normal:

00.2

0.40.6

0.81

0

0.2

0.4

0.6

0.8

1

02

46

8

(a) 0.5 0.5 0.15 0.15 0

βbi

βwi

00.2

0.40.6

0.81

0

0.2

0.4

0.6

0.8

1

02

46

8

(b) 0.1 0.9 0.15 0.15 0

βbi

βwi

00.2

0.40.6

0.81

0

0.2

0.4

0.6

0.8

1

0.1

0.2

0.3

0.4

0.5

0.6

(c) 0.8 0.8 0.6 0.6 0.5

βbi

βwi

(The 5 parameters of this density need to be estimated by forming thelikelihood.)

Gary King () Ecological Inference 19 / 38

Page 79: Advanced Quantitative Research Methodology Lecture Notes

The Model for Data Without Aggregation Bias, ButRobust in its Presence

The Goal: Knowledge of βbi and βw

i in each precinct.

Begin with the basic accounting identity (not an assumption of linearity):

Ti = βbi Xi + βw

i (1− Xi )

add three assumptions (in the basic version of the model):

1. βbi and βw

i are truncated bivariate normal:

00.2

0.40.6

0.81

0

0.2

0.4

0.6

0.8

1

02

46

8

(a) 0.5 0.5 0.15 0.15 0

βbi

βwi

00.2

0.40.6

0.81

0

0.2

0.4

0.6

0.8

1

02

46

8

(b) 0.1 0.9 0.15 0.15 0

βbi

βwi

00.2

0.40.6

0.81

0

0.2

0.4

0.6

0.8

1

0.1

0.2

0.3

0.4

0.5

0.6

(c) 0.8 0.8 0.6 0.6 0.5

βbi

βwi

(The 5 parameters of this density need to be estimated by forming thelikelihood.)

Gary King () Ecological Inference 19 / 38

Page 80: Advanced Quantitative Research Methodology Lecture Notes

The Model for Data Without Aggregation Bias, ButRobust in its Presence

The Goal: Knowledge of βbi and βw

i in each precinct.Begin with the basic accounting identity (not an assumption of linearity):

Ti = βbi Xi + βw

i (1− Xi )

add three assumptions (in the basic version of the model):

1. βbi and βw

i are truncated bivariate normal:

00.2

0.40.6

0.81

0

0.2

0.4

0.6

0.8

1

02

46

8

(a) 0.5 0.5 0.15 0.15 0

βbi

βwi

00.2

0.40.6

0.81

0

0.2

0.4

0.6

0.8

1

02

46

8

(b) 0.1 0.9 0.15 0.15 0

βbi

βwi

00.2

0.40.6

0.81

0

0.2

0.4

0.6

0.8

1

0.1

0.2

0.3

0.4

0.5

0.6

(c) 0.8 0.8 0.6 0.6 0.5

βbi

βwi

(The 5 parameters of this density need to be estimated by forming thelikelihood.)

Gary King () Ecological Inference 19 / 38

Page 81: Advanced Quantitative Research Methodology Lecture Notes

The Model for Data Without Aggregation Bias, ButRobust in its Presence

The Goal: Knowledge of βbi and βw

i in each precinct.Begin with the basic accounting identity (not an assumption of linearity):

Ti = βbi Xi + βw

i (1− Xi )

add three assumptions (in the basic version of the model):

1. βbi and βw

i are truncated bivariate normal:

00.2

0.40.6

0.81

0

0.2

0.4

0.6

0.8

1

02

46

8

(a) 0.5 0.5 0.15 0.15 0

βbi

βwi

00.2

0.40.6

0.81

0

0.2

0.4

0.6

0.8

1

02

46

8

(b) 0.1 0.9 0.15 0.15 0

βbi

βwi

00.2

0.40.6

0.81

0

0.2

0.4

0.6

0.8

1

0.1

0.2

0.3

0.4

0.5

0.6

(c) 0.8 0.8 0.6 0.6 0.5

βbi

βwi

(The 5 parameters of this density need to be estimated by forming thelikelihood.)

Gary King () Ecological Inference 19 / 38

Page 82: Advanced Quantitative Research Methodology Lecture Notes

The Model for Data Without Aggregation Bias, ButRobust in its Presence

The Goal: Knowledge of βbi and βw

i in each precinct.Begin with the basic accounting identity (not an assumption of linearity):

Ti = βbi Xi + βw

i (1− Xi )

add three assumptions (in the basic version of the model):

1. βbi and βw

i are truncated bivariate normal:

00.2

0.40.6

0.81

0

0.2

0.4

0.6

0.8

1

02

46

8

(a) 0.5 0.5 0.15 0.15 0

βbi

βwi

00.2

0.40.6

0.81

0

0.2

0.4

0.6

0.8

1

02

46

8

(b) 0.1 0.9 0.15 0.15 0

βbi

βwi

00.2

0.40.6

0.81

0

0.2

0.4

0.6

0.8

1

0.1

0.2

0.3

0.4

0.5

0.6

(c) 0.8 0.8 0.6 0.6 0.5

βbi

βwi

(The 5 parameters of this density need to be estimated by forming thelikelihood.)

Gary King () Ecological Inference 19 / 38

Page 83: Advanced Quantitative Research Methodology Lecture Notes

The Model for Data Without Aggregation Bias, ButRobust in its Presence

The Goal: Knowledge of βbi and βw

i in each precinct.Begin with the basic accounting identity (not an assumption of linearity):

Ti = βbi Xi + βw

i (1− Xi )

add three assumptions (in the basic version of the model):

1. βbi and βw

i are truncated bivariate normal:

00.2

0.40.6

0.81

0

0.2

0.4

0.6

0.8

1

02

46

8

(a) 0.5 0.5 0.15 0.15 0

βbi

βwi

00.2

0.40.6

0.81

0

0.2

0.4

0.6

0.8

1

02

46

8

(b) 0.1 0.9 0.15 0.15 0

βbi

βwi

00.2

0.40.6

0.81

0

0.2

0.4

0.6

0.8

1

0.1

0.2

0.3

0.4

0.5

0.6

(c) 0.8 0.8 0.6 0.6 0.5

βbi

βwi

(The 5 parameters of this density need to be estimated by forming thelikelihood.)

Gary King () Ecological Inference 19 / 38

Page 84: Advanced Quantitative Research Methodology Lecture Notes

The Model for Data Without Aggregation Bias, ButRobust in its Presence

The Goal: Knowledge of βbi and βw

i in each precinct.Begin with the basic accounting identity (not an assumption of linearity):

Ti = βbi Xi + βw

i (1− Xi )

add three assumptions (in the basic version of the model):

1. βbi and βw

i are truncated bivariate normal:

00.2

0.40.6

0.81

0

0.2

0.4

0.6

0.8

1

02

46

8

(a) 0.5 0.5 0.15 0.15 0

βbi

βwi

00.2

0.40.6

0.81

0

0.2

0.4

0.6

0.8

1

02

46

8

(b) 0.1 0.9 0.15 0.15 0

βbi

βwi

00.2

0.40.6

0.81

0

0.2

0.4

0.6

0.8

1

0.1

0.2

0.3

0.4

0.5

0.6

(c) 0.8 0.8 0.6 0.6 0.5

βbi

βwi

(The 5 parameters of this density need to be estimated by forming thelikelihood.)

Gary King () Ecological Inference 19 / 38

Page 85: Advanced Quantitative Research Methodology Lecture Notes

The Model for Data Without Aggregation Bias, ButRobust in its Presence

2. No aggregation bias (a priori): βbi and βw

i mean independent of Xi .Allows a posteriori aggregation bias (i.e., after conditioning on Ti )

3. No spatial autocorrelation: Ti |Xi are independent over observations.

Gary King () Ecological Inference 20 / 38

Page 86: Advanced Quantitative Research Methodology Lecture Notes

The Model for Data Without Aggregation Bias, ButRobust in its Presence

2. No aggregation bias (a priori): βbi and βw

i mean independent of Xi .Allows a posteriori aggregation bias (i.e., after conditioning on Ti )

3. No spatial autocorrelation: Ti |Xi are independent over observations.

Gary King () Ecological Inference 20 / 38

Page 87: Advanced Quantitative Research Methodology Lecture Notes

The Model for Data Without Aggregation Bias, ButRobust in its Presence

2. No aggregation bias (a priori): βbi and βw

i mean independent of Xi .Allows a posteriori aggregation bias (i.e., after conditioning on Ti )

3. No spatial autocorrelation: Ti |Xi are independent over observations.

Gary King () Ecological Inference 20 / 38

Page 88: Advanced Quantitative Research Methodology Lecture Notes

Deriving the Likelihood Function

1. The story of the model is that we learn things in order

(a) (As in regression), everything is conditional on Xi , which means we learnit first.

(b) Then the world draws βbi and βw

i from a truncated normal, but we don’tget to see them.

(c) Finally, we learn Ti , which is computed via the accounting identitydeterministically: Ti = βb

i Xi + βwi (1− Xi ).

2. The random variable is then T (given X ), which is truncated bivarate normal

3. The five parameters of the truncated bivariate normal need to be estimated:

ψ̆ = {B̆b, B̆w , σ̆b, σ̆w , ρ̆} = {B̆, Σ̆}

These are on the untruncated scale (and not quantities of interest) since:

TN(βbi , β

wi |B̆, Σ̆) = N(βb

i , βwi |B̆, Σ̆)

1(βbi , β

wi )

R(B̆, Σ̆)

where

R(B̆, Σ̆) =

Z 1

0

Z 1

0

N(βb, βw |B̆, Σ̆)dβbdβw (volume above unit square)

Gary King () Ecological Inference 21 / 38

Page 89: Advanced Quantitative Research Methodology Lecture Notes

Deriving the Likelihood Function

1. The story of the model is that we learn things in order

(a) (As in regression), everything is conditional on Xi , which means we learnit first.

(b) Then the world draws βbi and βw

i from a truncated normal, but we don’tget to see them.

(c) Finally, we learn Ti , which is computed via the accounting identitydeterministically: Ti = βb

i Xi + βwi (1− Xi ).

2. The random variable is then T (given X ), which is truncated bivarate normal

3. The five parameters of the truncated bivariate normal need to be estimated:

ψ̆ = {B̆b, B̆w , σ̆b, σ̆w , ρ̆} = {B̆, Σ̆}

These are on the untruncated scale (and not quantities of interest) since:

TN(βbi , β

wi |B̆, Σ̆) = N(βb

i , βwi |B̆, Σ̆)

1(βbi , β

wi )

R(B̆, Σ̆)

where

R(B̆, Σ̆) =

Z 1

0

Z 1

0

N(βb, βw |B̆, Σ̆)dβbdβw (volume above unit square)

Gary King () Ecological Inference 21 / 38

Page 90: Advanced Quantitative Research Methodology Lecture Notes

Deriving the Likelihood Function

1. The story of the model is that we learn things in order

(a) (As in regression), everything is conditional on Xi , which means we learnit first.

(b) Then the world draws βbi and βw

i from a truncated normal, but we don’tget to see them.

(c) Finally, we learn Ti , which is computed via the accounting identitydeterministically: Ti = βb

i Xi + βwi (1− Xi ).

2. The random variable is then T (given X ), which is truncated bivarate normal

3. The five parameters of the truncated bivariate normal need to be estimated:

ψ̆ = {B̆b, B̆w , σ̆b, σ̆w , ρ̆} = {B̆, Σ̆}

These are on the untruncated scale (and not quantities of interest) since:

TN(βbi , β

wi |B̆, Σ̆) = N(βb

i , βwi |B̆, Σ̆)

1(βbi , β

wi )

R(B̆, Σ̆)

where

R(B̆, Σ̆) =

Z 1

0

Z 1

0

N(βb, βw |B̆, Σ̆)dβbdβw (volume above unit square)

Gary King () Ecological Inference 21 / 38

Page 91: Advanced Quantitative Research Methodology Lecture Notes

Deriving the Likelihood Function

1. The story of the model is that we learn things in order

(a) (As in regression), everything is conditional on Xi , which means we learnit first.

(b) Then the world draws βbi and βw

i from a truncated normal, but we don’tget to see them.

(c) Finally, we learn Ti , which is computed via the accounting identitydeterministically: Ti = βb

i Xi + βwi (1− Xi ).

2. The random variable is then T (given X ), which is truncated bivarate normal

3. The five parameters of the truncated bivariate normal need to be estimated:

ψ̆ = {B̆b, B̆w , σ̆b, σ̆w , ρ̆} = {B̆, Σ̆}

These are on the untruncated scale (and not quantities of interest) since:

TN(βbi , β

wi |B̆, Σ̆) = N(βb

i , βwi |B̆, Σ̆)

1(βbi , β

wi )

R(B̆, Σ̆)

where

R(B̆, Σ̆) =

Z 1

0

Z 1

0

N(βb, βw |B̆, Σ̆)dβbdβw (volume above unit square)

Gary King () Ecological Inference 21 / 38

Page 92: Advanced Quantitative Research Methodology Lecture Notes

Deriving the Likelihood Function

1. The story of the model is that we learn things in order

(a) (As in regression), everything is conditional on Xi , which means we learnit first.

(b) Then the world draws βbi and βw

i from a truncated normal, but we don’tget to see them.

(c) Finally, we learn Ti , which is computed via the accounting identitydeterministically: Ti = βb

i Xi + βwi (1− Xi ).

2. The random variable is then T (given X ), which is truncated bivarate normal

3. The five parameters of the truncated bivariate normal need to be estimated:

ψ̆ = {B̆b, B̆w , σ̆b, σ̆w , ρ̆} = {B̆, Σ̆}

These are on the untruncated scale (and not quantities of interest) since:

TN(βbi , β

wi |B̆, Σ̆) = N(βb

i , βwi |B̆, Σ̆)

1(βbi , β

wi )

R(B̆, Σ̆)

where

R(B̆, Σ̆) =

Z 1

0

Z 1

0

N(βb, βw |B̆, Σ̆)dβbdβw (volume above unit square)

Gary King () Ecological Inference 21 / 38

Page 93: Advanced Quantitative Research Methodology Lecture Notes

Deriving the Likelihood Function

1. The story of the model is that we learn things in order

(a) (As in regression), everything is conditional on Xi , which means we learnit first.

(b) Then the world draws βbi and βw

i from a truncated normal, but we don’tget to see them.

(c) Finally, we learn Ti , which is computed via the accounting identitydeterministically: Ti = βb

i Xi + βwi (1− Xi ).

2. The random variable is then T (given X ), which is truncated bivarate normal

3. The five parameters of the truncated bivariate normal need to be estimated:

ψ̆ = {B̆b, B̆w , σ̆b, σ̆w , ρ̆} = {B̆, Σ̆}

These are on the untruncated scale (and not quantities of interest) since:

TN(βbi , β

wi |B̆, Σ̆) = N(βb

i , βwi |B̆, Σ̆)

1(βbi , β

wi )

R(B̆, Σ̆)

where

R(B̆, Σ̆) =

Z 1

0

Z 1

0

N(βb, βw |B̆, Σ̆)dβbdβw (volume above unit square)

Gary King () Ecological Inference 21 / 38

Page 94: Advanced Quantitative Research Methodology Lecture Notes

Deriving the Likelihood Function

1. The story of the model is that we learn things in order

(a) (As in regression), everything is conditional on Xi , which means we learnit first.

(b) Then the world draws βbi and βw

i from a truncated normal, but we don’tget to see them.

(c) Finally, we learn Ti , which is computed via the accounting identitydeterministically: Ti = βb

i Xi + βwi (1− Xi ).

2. The random variable is then T (given X ), which is truncated bivarate normal

3. The five parameters of the truncated bivariate normal need to be estimated:

ψ̆ = {B̆b, B̆w , σ̆b, σ̆w , ρ̆} = {B̆, Σ̆}

These are on the untruncated scale (and not quantities of interest) since:

TN(βbi , β

wi |B̆, Σ̆) = N(βb

i , βwi |B̆, Σ̆)

1(βbi , β

wi )

R(B̆, Σ̆)

where

R(B̆, Σ̆) =

Z 1

0

Z 1

0

N(βb, βw |B̆, Σ̆)dβbdβw (volume above unit square)

Gary King () Ecological Inference 21 / 38

Page 95: Advanced Quantitative Research Methodology Lecture Notes

Deriving the Likelihood Function

1. The story of the model is that we learn things in order

(a) (As in regression), everything is conditional on Xi , which means we learnit first.

(b) Then the world draws βbi and βw

i from a truncated normal, but we don’tget to see them.

(c) Finally, we learn Ti , which is computed via the accounting identitydeterministically: Ti = βb

i Xi + βwi (1− Xi ).

2. The random variable is then T (given X ), which is truncated bivarate normal

3. The five parameters of the truncated bivariate normal need to be estimated:

ψ̆ = {B̆b, B̆w , σ̆b, σ̆w , ρ̆} = {B̆, Σ̆}These are on the untruncated scale (and not quantities of interest) since:

TN(βbi , β

wi |B̆, Σ̆) = N(βb

i , βwi |B̆, Σ̆)

1(βbi , β

wi )

R(B̆, Σ̆)

where

R(B̆, Σ̆) =

Z 1

0

Z 1

0

N(βb, βw |B̆, Σ̆)dβbdβw (volume above unit square)

Gary King () Ecological Inference 21 / 38

Page 96: Advanced Quantitative Research Methodology Lecture Notes

Deriving the Likelihood Function

1. The story of the model is that we learn things in order

(a) (As in regression), everything is conditional on Xi , which means we learnit first.

(b) Then the world draws βbi and βw

i from a truncated normal, but we don’tget to see them.

(c) Finally, we learn Ti , which is computed via the accounting identitydeterministically: Ti = βb

i Xi + βwi (1− Xi ).

2. The random variable is then T (given X ), which is truncated bivarate normal

3. The five parameters of the truncated bivariate normal need to be estimated:

ψ̆ = {B̆b, B̆w , σ̆b, σ̆w , ρ̆} = {B̆, Σ̆}These are on the untruncated scale (and not quantities of interest) since:

TN(βbi , β

wi |B̆, Σ̆) = N(βb

i , βwi |B̆, Σ̆)

1(βbi , β

wi )

R(B̆, Σ̆)

where

R(B̆, Σ̆) =

Z 1

0

Z 1

0

N(βb, βw |B̆, Σ̆)dβbdβw (volume above unit square)

Gary King () Ecological Inference 21 / 38

Page 97: Advanced Quantitative Research Methodology Lecture Notes

Deriving the Likelihood Function

1. The story of the model is that we learn things in order

(a) (As in regression), everything is conditional on Xi , which means we learnit first.

(b) Then the world draws βbi and βw

i from a truncated normal, but we don’tget to see them.

(c) Finally, we learn Ti , which is computed via the accounting identitydeterministically: Ti = βb

i Xi + βwi (1− Xi ).

2. The random variable is then T (given X ), which is truncated bivarate normal

3. The five parameters of the truncated bivariate normal need to be estimated:

ψ̆ = {B̆b, B̆w , σ̆b, σ̆w , ρ̆} = {B̆, Σ̆}These are on the untruncated scale (and not quantities of interest) since:

TN(βbi , β

wi |B̆, Σ̆) = N(βb

i , βwi |B̆, Σ̆)

1(βbi , β

wi )

R(B̆, Σ̆)

where

R(B̆, Σ̆) =

Z 1

0

Z 1

0

N(βb, βw |B̆, Σ̆)dβbdβw (volume above unit square)

Gary King () Ecological Inference 21 / 38

Page 98: Advanced Quantitative Research Methodology Lecture Notes

Deriving the Likelihood Function

1. The story of the model is that we learn things in order

(a) (As in regression), everything is conditional on Xi , which means we learnit first.

(b) Then the world draws βbi and βw

i from a truncated normal, but we don’tget to see them.

(c) Finally, we learn Ti , which is computed via the accounting identitydeterministically: Ti = βb

i Xi + βwi (1− Xi ).

2. The random variable is then T (given X ), which is truncated bivarate normal

3. The five parameters of the truncated bivariate normal need to be estimated:

ψ̆ = {B̆b, B̆w , σ̆b, σ̆w , ρ̆} = {B̆, Σ̆}These are on the untruncated scale (and not quantities of interest) since:

TN(βbi , β

wi |B̆, Σ̆) = N(βb

i , βwi |B̆, Σ̆)

1(βbi , β

wi )

R(B̆, Σ̆)

where

R(B̆, Σ̆) =

Z 1

0

Z 1

0

N(βb, βw |B̆, Σ̆)dβbdβw (volume above unit square)

Gary King () Ecological Inference 21 / 38

Page 99: Advanced Quantitative Research Methodology Lecture Notes

Deriving the Likelihood Function

4. (From simulations of these parameters, we will compute quantities of interest:βb

i , βwi . Details shortly.)

5. The likelihood:

L(ψ̆|T ) ∝∏

Xi∈(0,1)

P(Ti |ψ̆)

=∏

Xi∈(0,1)

(What we observe

What we could have observed

)

=∏

Xi∈(0,1)

(Area above line segment

Volume above square

)

=∏

Xi∈(0,1)

(Area above line

Volume above plane

) (Area above line segment

Area above line

)(

Volume above squareVolume above plane

)=

∏Xi∈(0,1)

N(Ti |µi , σ2i )

S(B̆, Σ̆)

R(B̆, Σ̆)

Gary King () Ecological Inference 22 / 38

Page 100: Advanced Quantitative Research Methodology Lecture Notes

Deriving the Likelihood Function

4. (From simulations of these parameters, we will compute quantities of interest:βb

i , βwi . Details shortly.)

5. The likelihood:

L(ψ̆|T ) ∝∏

Xi∈(0,1)

P(Ti |ψ̆)

=∏

Xi∈(0,1)

(What we observe

What we could have observed

)

=∏

Xi∈(0,1)

(Area above line segment

Volume above square

)

=∏

Xi∈(0,1)

(Area above line

Volume above plane

) (Area above line segment

Area above line

)(

Volume above squareVolume above plane

)=

∏Xi∈(0,1)

N(Ti |µi , σ2i )

S(B̆, Σ̆)

R(B̆, Σ̆)

Gary King () Ecological Inference 22 / 38

Page 101: Advanced Quantitative Research Methodology Lecture Notes

Deriving the Likelihood Function

4. (From simulations of these parameters, we will compute quantities of interest:βb

i , βwi . Details shortly.)

5. The likelihood:

L(ψ̆|T ) ∝∏

Xi∈(0,1)

P(Ti |ψ̆)

=∏

Xi∈(0,1)

(What we observe

What we could have observed

)

=∏

Xi∈(0,1)

(Area above line segment

Volume above square

)

=∏

Xi∈(0,1)

(Area above line

Volume above plane

) (Area above line segment

Area above line

)(

Volume above squareVolume above plane

)=

∏Xi∈(0,1)

N(Ti |µi , σ2i )

S(B̆, Σ̆)

R(B̆, Σ̆)

Gary King () Ecological Inference 22 / 38

Page 102: Advanced Quantitative Research Methodology Lecture Notes

Deriving the Likelihood Function

4. (From simulations of these parameters, we will compute quantities of interest:βb

i , βwi . Details shortly.)

5. The likelihood:

L(ψ̆|T ) ∝∏

Xi∈(0,1)

P(Ti |ψ̆)

=∏

Xi∈(0,1)

(What we observe

What we could have observed

)

=∏

Xi∈(0,1)

(Area above line segment

Volume above square

)

=∏

Xi∈(0,1)

(Area above line

Volume above plane

) (Area above line segment

Area above line

)(

Volume above squareVolume above plane

)=

∏Xi∈(0,1)

N(Ti |µi , σ2i )

S(B̆, Σ̆)

R(B̆, Σ̆)

Gary King () Ecological Inference 22 / 38

Page 103: Advanced Quantitative Research Methodology Lecture Notes

Deriving the Likelihood Function

4. (From simulations of these parameters, we will compute quantities of interest:βb

i , βwi . Details shortly.)

5. The likelihood:

L(ψ̆|T ) ∝∏

Xi∈(0,1)

P(Ti |ψ̆)

=∏

Xi∈(0,1)

(What we observe

What we could have observed

)

=∏

Xi∈(0,1)

(Area above line segment

Volume above square

)

=∏

Xi∈(0,1)

(Area above line

Volume above plane

) (Area above line segment

Area above line

)(

Volume above squareVolume above plane

)=

∏Xi∈(0,1)

N(Ti |µi , σ2i )

S(B̆, Σ̆)

R(B̆, Σ̆)

Gary King () Ecological Inference 22 / 38

Page 104: Advanced Quantitative Research Methodology Lecture Notes

Deriving the Likelihood Function

4. (From simulations of these parameters, we will compute quantities of interest:βb

i , βwi . Details shortly.)

5. The likelihood:

L(ψ̆|T ) ∝∏

Xi∈(0,1)

P(Ti |ψ̆)

=∏

Xi∈(0,1)

(What we observe

What we could have observed

)

=∏

Xi∈(0,1)

(Area above line segment

Volume above square

)

=∏

Xi∈(0,1)

(Area above line

Volume above plane

) (Area above line segment

Area above line

)(

Volume above squareVolume above plane

)=

∏Xi∈(0,1)

N(Ti |µi , σ2i )

S(B̆, Σ̆)

R(B̆, Σ̆)

Gary King () Ecological Inference 22 / 38

Page 105: Advanced Quantitative Research Methodology Lecture Notes

Deriving the Likelihood Function

4. (From simulations of these parameters, we will compute quantities of interest:βb

i , βwi . Details shortly.)

5. The likelihood:

L(ψ̆|T ) ∝∏

Xi∈(0,1)

P(Ti |ψ̆)

=∏

Xi∈(0,1)

(What we observe

What we could have observed

)

=∏

Xi∈(0,1)

(Area above line segment

Volume above square

)

=∏

Xi∈(0,1)

(Area above line

Volume above plane

) (Area above line segment

Area above line

)(

Volume above squareVolume above plane

)

=∏

Xi∈(0,1)

N(Ti |µi , σ2i )

S(B̆, Σ̆)

R(B̆, Σ̆)

Gary King () Ecological Inference 22 / 38

Page 106: Advanced Quantitative Research Methodology Lecture Notes

Deriving the Likelihood Function

4. (From simulations of these parameters, we will compute quantities of interest:βb

i , βwi . Details shortly.)

5. The likelihood:

L(ψ̆|T ) ∝∏

Xi∈(0,1)

P(Ti |ψ̆)

=∏

Xi∈(0,1)

(What we observe

What we could have observed

)

=∏

Xi∈(0,1)

(Area above line segment

Volume above square

)

=∏

Xi∈(0,1)

(Area above line

Volume above plane

) (Area above line segment

Area above line

)(

Volume above squareVolume above plane

)=

∏Xi∈(0,1)

N(Ti |µi , σ2i )

S(B̆, Σ̆)

R(B̆, Σ̆)

Gary King () Ecological Inference 22 / 38

Page 107: Advanced Quantitative Research Methodology Lecture Notes

Deriving the Likelihood Function

where

E (Ti |Xi ) ≡ µi = B̆bXi + B̆w (1− Xi ),

V (Ti |Xi ) ≡ σ2i = (σ̆2

w ) + (2σ̆bw − 2σ̆2w )Xi + (σ̆2

b + σ̆2w − 2σ̆bw )X 2

i ,

S(B̆, Σ̆) =

∫ min“1,

TiXi

”max

“0,

T−(1−Xi )

Xi

” N

(βb

∣∣B̆b +ωi

σiεi , σ̆

2b −

ω2i

σ2i

)dβb

Gary King () Ecological Inference 23 / 38

Page 108: Advanced Quantitative Research Methodology Lecture Notes

Deriving the Likelihood Function

where

E (Ti |Xi ) ≡ µi = B̆bXi + B̆w (1− Xi ),

V (Ti |Xi ) ≡ σ2i = (σ̆2

w ) + (2σ̆bw − 2σ̆2w )Xi + (σ̆2

b + σ̆2w − 2σ̆bw )X 2

i ,

S(B̆, Σ̆) =

∫ min“1,

TiXi

”max

“0,

T−(1−Xi )

Xi

” N

(βb

∣∣B̆b +ωi

σiεi , σ̆

2b −

ω2i

σ2i

)dβb

Gary King () Ecological Inference 23 / 38

Page 109: Advanced Quantitative Research Methodology Lecture Notes

Deriving the Likelihood Function

where

E (Ti |Xi ) ≡ µi = B̆bXi + B̆w (1− Xi ),

V (Ti |Xi ) ≡ σ2i = (σ̆2

w ) + (2σ̆bw − 2σ̆2w )Xi + (σ̆2

b + σ̆2w − 2σ̆bw )X 2

i ,

S(B̆, Σ̆) =

∫ min“1,

TiXi

”max

“0,

T−(1−Xi )

Xi

” N

(βb

∣∣B̆b +ωi

σiεi , σ̆

2b −

ω2i

σ2i

)dβb

Gary King () Ecological Inference 23 / 38

Page 110: Advanced Quantitative Research Methodology Lecture Notes

Deriving the Likelihood Function

where

E (Ti |Xi ) ≡ µi = B̆bXi + B̆w (1− Xi ),

V (Ti |Xi ) ≡ σ2i = (σ̆2

w ) + (2σ̆bw − 2σ̆2w )Xi + (σ̆2

b + σ̆2w − 2σ̆bw )X 2

i ,

S(B̆, Σ̆) =

∫ min“1,

TiXi

”max

“0,

T−(1−Xi )

Xi

” N

(βb

∣∣B̆b +ωi

σiεi , σ̆

2b −

ω2i

σ2i

)dβb

Gary King () Ecological Inference 23 / 38

Page 111: Advanced Quantitative Research Methodology Lecture Notes

Deriving the Likelihood Function

where

E (Ti |Xi ) ≡ µi = B̆bXi + B̆w (1− Xi ),

V (Ti |Xi ) ≡ σ2i = (σ̆2

w ) + (2σ̆bw − 2σ̆2w )Xi + (σ̆2

b + σ̆2w − 2σ̆bw )X 2

i ,

S(B̆, Σ̆) =

∫ min“1,

TiXi

”max

“0,

T−(1−Xi )

Xi

” N

(βb

∣∣B̆b +ωi

σiεi , σ̆

2b −

ω2i

σ2i

)dβb

Gary King () Ecological Inference 23 / 38

Page 112: Advanced Quantitative Research Methodology Lecture Notes

Deriving the Likelihood Function

6. A visual version of the likelihood:

0 .25 .5 .75 10

.25

.5

.75

1

βbi

βwi

Gary King () Ecological Inference 24 / 38

Page 113: Advanced Quantitative Research Methodology Lecture Notes

The Truncated Bivariate Normal Distribution’s Five Parameters Can be Estimated

From Aggregate Data: Intuition

••

••

••

••

••••

••

••

••

••

••

•••

••

••

•••

••••

•••••

•••

••••

•••••

•••••

••

•••

••••

•••••

••

••

••

••

•••

(a) 0.5 0.5 0.2 0.2 -0.95

0 .25 .5 .75 1

0

.25

.5

.75

1

X i

Ti

••

••

••

••

••

••

•••

••

••

•••

••

••

••

••

•••

••

••

••

••

••

••

••

••

••

••

(b) 0.5 0.5 0.2 0.2 0.95

0 .25 .5 .75 1

0

.25

.5

.75

1

X i

Ti

••

••

••

••••

••

••

••

••

•••

••

••

•••••••••

••

••

••

••

••

••

•••

•••

••

••

••

••

••

••

••

••

••

••

••

••

••

••

••

(c) 0.7 0.3 0.2 0.2 -0.64

0 .25 .5 .75 1

0

.25

.5

.75

1

X i

Ti

••

•••

••

••

•••

••

••

••

••

••

••

••

••

••

••

••

••

•••

••

••

••

••

••••

••

•••

••

••

••

•••

••

••

••

••

••

••

••

•••

••••••

••

•••

(d) 0.8 0.5 0.1 0.3 0.4

0 .25 .5 .75 1

0

.25

.5

.75

1

X i

Ti

••

•••

•••

•••

••

••••

••

••

••

•••

•••

••

•••

••

••••••

••

•••

••

••

••

•••

•••

•••

••

••

••

••

••

••

••

••

••

(e) 0.4 0.8 0.3 0.1 -0.48

0 .25 .5 .75 1

0

.25

.5

.75

1

X i

Ti

••

••

••

••

••

••

••

•••

••

••

••

••

••

••

••

••

••

••

•••

••

••

••

•••

••

••

••

••

•••

••

••••

(f) 0.2 0.7 0.2 0.2 0.57

0 .25 .5 .75 1

0

.25

.5

.75

1

X i

Ti

Data were randomly generated from the model with parameter values B̆b, B̆w , σb, σw ,and ρ, at the top of each graph. The solid line is the expected value and dashed lines areat plus and minus one standard deviation.

Gary King () Ecological Inference 25 / 38

Page 114: Advanced Quantitative Research Methodology Lecture Notes

Another view of how the data change with the model

0 .25 .5 .75 1

0

.25

.5

.75

1(a) 0.5 0.7 0.2 0.2 -0.8

βbi

βwi

••

0 .25 .5 .75 1

0

.25

.5

.75

1(d) 0.1 0.1 0.2 0.3 -0.8

βbi

βwi

••

0 .25 .5 .75 1

0

.25

.5

.75

1(b) 0.5 0.7 0.2 0.2 0

βbi

βwi

••

0 .25 .5 .75 1

0

.25

.5

.75

1(e) 0.9 0.9 0.1 0.3 0

βbi

βwi

••

0 .25 .5 .75 1

0

.25

.5

.75

1(c) 0.5 0.7 0.2 0.2 0.8

βbi

βwi

••

0 .25 .5 .75 1

0

.25

.5

.75

1(f) -0.05 -0.05 0.2 0.3 0.8

βbi

βwi

••Observable Implications for Sample Parameter Values. The numbers at the topof each tomography plot are the parameter values for the distribution from whichdata were randomly generated: B̆b, B̆w , σ̆b, σ̆w , and ρ̆.

Gary King () Ecological Inference 26 / 38

Page 115: Advanced Quantitative Research Methodology Lecture Notes

Calculating Quantities of Interest: A story of X-Rays andtomography machines; then how to do it

Rearranging the basic accounting identity gives βwi as a linear function of

βbi :

βwi =

(Ti

1− Xi

)−

(Xi

1− Xi

)βb

i

Thus, knowing Ti and Xi in one precinct narrows the possible values ofβb

i , βwi to one line cut across this figure:

0 .25 .5 .75 10

.25

.5

.75

1

βbi

βwi

A Tomography Plot

Gary King () Ecological Inference 27 / 38

Page 116: Advanced Quantitative Research Methodology Lecture Notes

Calculating Quantities of Interest: A story of X-Rays andtomography machines; then how to do it

Rearranging the basic accounting identity gives βwi as a linear function of

βbi :

βwi =

(Ti

1− Xi

)−

(Xi

1− Xi

)βb

i

Thus, knowing Ti and Xi in one precinct narrows the possible values ofβb

i , βwi to one line cut across this figure:

0 .25 .5 .75 10

.25

.5

.75

1

βbi

βwi

A Tomography Plot

Gary King () Ecological Inference 27 / 38

Page 117: Advanced Quantitative Research Methodology Lecture Notes

Calculating Quantities of Interest: A story of X-Rays andtomography machines; then how to do it

Rearranging the basic accounting identity gives βwi as a linear function of

βbi :

βwi =

(Ti

1− Xi

)−

(Xi

1− Xi

)βb

i

Thus, knowing Ti and Xi in one precinct narrows the possible values ofβb

i , βwi to one line cut across this figure:

0 .25 .5 .75 10

.25

.5

.75

1

βbi

βwi

A Tomography Plot

Gary King () Ecological Inference 27 / 38

Page 118: Advanced Quantitative Research Methodology Lecture Notes

Calculating Quantities of Interest: A story of X-Rays andtomography machines; then how to do it

Rearranging the basic accounting identity gives βwi as a linear function of

βbi :

βwi =

(Ti

1− Xi

)−

(Xi

1− Xi

)βb

i

Thus, knowing Ti and Xi in one precinct narrows the possible values ofβb

i , βwi to one line cut across this figure:

0 .25 .5 .75 10

.25

.5

.75

1

βbi

βwi

A Tomography Plot

Gary King () Ecological Inference 27 / 38

Page 119: Advanced Quantitative Research Methodology Lecture Notes

Calculating Quantities of Interest: A story of X-Rays andtomography machines; then how to do it

05

10152025

P 48

05

10152025

P 115

05

10152025

P 195

05

10152025

P 238

0 .2 .4 .6 .8 1

βbi

Gary King () Ecological Inference 28 / 38

Page 120: Advanced Quantitative Research Methodology Lecture Notes

How to Calculate Quantities of Interest

1. Option 1. Simulate only (district level) aggregate quantities

(a) Algorithm to take one draw of the district-level fraction of blacks whovote:

i. Draw ψ̆ from its posterior or sampling density: an asymptotic normal withmean equal to point estimates and variance the inverse of the -Hessian atthe maximum.

ii. Draw βbi and βw

i from TN(βbi , β

wi |B̆, Σ̆), given the simulated parameters,

ψ̆ = {B̆, Σ̆}.iii. Compute the weighted average of the simulated coefficients (weights based

on precinct population):

B̃b =

pXi=1

Nb+i β̃b

i

Nb++

(b) Problem: We only get knowledge of the district-wide aggregate & its notrobust.

Gary King () Ecological Inference 29 / 38

Page 121: Advanced Quantitative Research Methodology Lecture Notes

How to Calculate Quantities of Interest

1. Option 1. Simulate only (district level) aggregate quantities

(a) Algorithm to take one draw of the district-level fraction of blacks whovote:

i. Draw ψ̆ from its posterior or sampling density: an asymptotic normal withmean equal to point estimates and variance the inverse of the -Hessian atthe maximum.

ii. Draw βbi and βw

i from TN(βbi , β

wi |B̆, Σ̆), given the simulated parameters,

ψ̆ = {B̆, Σ̆}.iii. Compute the weighted average of the simulated coefficients (weights based

on precinct population):

B̃b =

pXi=1

Nb+i β̃b

i

Nb++

(b) Problem: We only get knowledge of the district-wide aggregate & its notrobust.

Gary King () Ecological Inference 29 / 38

Page 122: Advanced Quantitative Research Methodology Lecture Notes

How to Calculate Quantities of Interest

1. Option 1. Simulate only (district level) aggregate quantities(a) Algorithm to take one draw of the district-level fraction of blacks who

vote:

i. Draw ψ̆ from its posterior or sampling density: an asymptotic normal withmean equal to point estimates and variance the inverse of the -Hessian atthe maximum.

ii. Draw βbi and βw

i from TN(βbi , β

wi |B̆, Σ̆), given the simulated parameters,

ψ̆ = {B̆, Σ̆}.iii. Compute the weighted average of the simulated coefficients (weights based

on precinct population):

B̃b =

pXi=1

Nb+i β̃b

i

Nb++

(b) Problem: We only get knowledge of the district-wide aggregate & its notrobust.

Gary King () Ecological Inference 29 / 38

Page 123: Advanced Quantitative Research Methodology Lecture Notes

How to Calculate Quantities of Interest

1. Option 1. Simulate only (district level) aggregate quantities(a) Algorithm to take one draw of the district-level fraction of blacks who

vote:

i. Draw ψ̆ from its posterior or sampling density: an asymptotic normal withmean equal to point estimates and variance the inverse of the -Hessian atthe maximum.

ii. Draw βbi and βw

i from TN(βbi , β

wi |B̆, Σ̆), given the simulated parameters,

ψ̆ = {B̆, Σ̆}.iii. Compute the weighted average of the simulated coefficients (weights based

on precinct population):

B̃b =

pXi=1

Nb+i β̃b

i

Nb++

(b) Problem: We only get knowledge of the district-wide aggregate & its notrobust.

Gary King () Ecological Inference 29 / 38

Page 124: Advanced Quantitative Research Methodology Lecture Notes

How to Calculate Quantities of Interest

1. Option 1. Simulate only (district level) aggregate quantities(a) Algorithm to take one draw of the district-level fraction of blacks who

vote:

i. Draw ψ̆ from its posterior or sampling density: an asymptotic normal withmean equal to point estimates and variance the inverse of the -Hessian atthe maximum.

ii. Draw βbi and βw

i from TN(βbi , β

wi |B̆, Σ̆), given the simulated parameters,

ψ̆ = {B̆, Σ̆}.

iii. Compute the weighted average of the simulated coefficients (weights basedon precinct population):

B̃b =

pXi=1

Nb+i β̃b

i

Nb++

(b) Problem: We only get knowledge of the district-wide aggregate & its notrobust.

Gary King () Ecological Inference 29 / 38

Page 125: Advanced Quantitative Research Methodology Lecture Notes

How to Calculate Quantities of Interest

1. Option 1. Simulate only (district level) aggregate quantities(a) Algorithm to take one draw of the district-level fraction of blacks who

vote:

i. Draw ψ̆ from its posterior or sampling density: an asymptotic normal withmean equal to point estimates and variance the inverse of the -Hessian atthe maximum.

ii. Draw βbi and βw

i from TN(βbi , β

wi |B̆, Σ̆), given the simulated parameters,

ψ̆ = {B̆, Σ̆}.iii. Compute the weighted average of the simulated coefficients (weights based

on precinct population):

B̃b =

pXi=1

Nb+i β̃b

i

Nb++

(b) Problem: We only get knowledge of the district-wide aggregate & its notrobust.

Gary King () Ecological Inference 29 / 38

Page 126: Advanced Quantitative Research Methodology Lecture Notes

How to Calculate Quantities of Interest

1. Option 1. Simulate only (district level) aggregate quantities(a) Algorithm to take one draw of the district-level fraction of blacks who

vote:

i. Draw ψ̆ from its posterior or sampling density: an asymptotic normal withmean equal to point estimates and variance the inverse of the -Hessian atthe maximum.

ii. Draw βbi and βw

i from TN(βbi , β

wi |B̆, Σ̆), given the simulated parameters,

ψ̆ = {B̆, Σ̆}.iii. Compute the weighted average of the simulated coefficients (weights based

on precinct population):

B̃b =

pXi=1

Nb+i β̃b

i

Nb++

(b) Problem: We only get knowledge of the district-wide aggregate & its notrobust.

Gary King () Ecological Inference 29 / 38

Page 127: Advanced Quantitative Research Methodology Lecture Notes

How to Calculate Quantities of Interest

1. Option 1. Simulate only (district level) aggregate quantities(a) Algorithm to take one draw of the district-level fraction of blacks who

vote:

i. Draw ψ̆ from its posterior or sampling density: an asymptotic normal withmean equal to point estimates and variance the inverse of the -Hessian atthe maximum.

ii. Draw βbi and βw

i from TN(βbi , β

wi |B̆, Σ̆), given the simulated parameters,

ψ̆ = {B̆, Σ̆}.iii. Compute the weighted average of the simulated coefficients (weights based

on precinct population):

B̃b =

pXi=1

Nb+i β̃b

i

Nb++

(b) Problem: We only get knowledge of the district-wide aggregate & its notrobust.

Gary King () Ecological Inference 29 / 38

Page 128: Advanced Quantitative Research Methodology Lecture Notes

How to Calculate Quantities of Interest

2. Option 2. use the knowledge that simulations for observation i mustcome from its tomography line:

(a) By the story of the model, if we know Ti , we learn the entire tomographyline (since Xi is known ex ante).

(b) So we will condition on Ti to make a prediction from the tomography line.(c) We could apply the Option 1 algorithm and use rejection sampling

(discard simulations of βbi , β

wi that are not on the tomography line), but

this would take forever.(d) Alternative algorithm for drawing simulations of βb

i and βwi .

i. Find the expression for P(βbi |Ti , ψ̆) analytically, which is a particular

truncated univariate normal (see King, 1997: Appendix C).ii. Draw ψ̆ from its posterior or sampling density (the same multivariate

normal as always).iii. Insert the simulation into P(βb

i |Ti , ψ̆) and draw out one simulated βbi .

iv. Compute βwi , if desired, deterministically from reformulated accounting

identity:

β̃wi =

„Ti

1− Xi

«−

„Xi

1− Xi

«β̃b

i

Gary King () Ecological Inference 30 / 38

Page 129: Advanced Quantitative Research Methodology Lecture Notes

How to Calculate Quantities of Interest

2. Option 2. use the knowledge that simulations for observation i mustcome from its tomography line:

(a) By the story of the model, if we know Ti , we learn the entire tomographyline (since Xi is known ex ante).

(b) So we will condition on Ti to make a prediction from the tomography line.(c) We could apply the Option 1 algorithm and use rejection sampling

(discard simulations of βbi , β

wi that are not on the tomography line), but

this would take forever.(d) Alternative algorithm for drawing simulations of βb

i and βwi .

i. Find the expression for P(βbi |Ti , ψ̆) analytically, which is a particular

truncated univariate normal (see King, 1997: Appendix C).ii. Draw ψ̆ from its posterior or sampling density (the same multivariate

normal as always).iii. Insert the simulation into P(βb

i |Ti , ψ̆) and draw out one simulated βbi .

iv. Compute βwi , if desired, deterministically from reformulated accounting

identity:

β̃wi =

„Ti

1− Xi

«−

„Xi

1− Xi

«β̃b

i

Gary King () Ecological Inference 30 / 38

Page 130: Advanced Quantitative Research Methodology Lecture Notes

How to Calculate Quantities of Interest

2. Option 2. use the knowledge that simulations for observation i mustcome from its tomography line:(a) By the story of the model, if we know Ti , we learn the entire tomography

line (since Xi is known ex ante).

(b) So we will condition on Ti to make a prediction from the tomography line.(c) We could apply the Option 1 algorithm and use rejection sampling

(discard simulations of βbi , β

wi that are not on the tomography line), but

this would take forever.(d) Alternative algorithm for drawing simulations of βb

i and βwi .

i. Find the expression for P(βbi |Ti , ψ̆) analytically, which is a particular

truncated univariate normal (see King, 1997: Appendix C).ii. Draw ψ̆ from its posterior or sampling density (the same multivariate

normal as always).iii. Insert the simulation into P(βb

i |Ti , ψ̆) and draw out one simulated βbi .

iv. Compute βwi , if desired, deterministically from reformulated accounting

identity:

β̃wi =

„Ti

1− Xi

«−

„Xi

1− Xi

«β̃b

i

Gary King () Ecological Inference 30 / 38

Page 131: Advanced Quantitative Research Methodology Lecture Notes

How to Calculate Quantities of Interest

2. Option 2. use the knowledge that simulations for observation i mustcome from its tomography line:(a) By the story of the model, if we know Ti , we learn the entire tomography

line (since Xi is known ex ante).(b) So we will condition on Ti to make a prediction from the tomography line.

(c) We could apply the Option 1 algorithm and use rejection sampling(discard simulations of βb

i , βwi that are not on the tomography line), but

this would take forever.(d) Alternative algorithm for drawing simulations of βb

i and βwi .

i. Find the expression for P(βbi |Ti , ψ̆) analytically, which is a particular

truncated univariate normal (see King, 1997: Appendix C).ii. Draw ψ̆ from its posterior or sampling density (the same multivariate

normal as always).iii. Insert the simulation into P(βb

i |Ti , ψ̆) and draw out one simulated βbi .

iv. Compute βwi , if desired, deterministically from reformulated accounting

identity:

β̃wi =

„Ti

1− Xi

«−

„Xi

1− Xi

«β̃b

i

Gary King () Ecological Inference 30 / 38

Page 132: Advanced Quantitative Research Methodology Lecture Notes

How to Calculate Quantities of Interest

2. Option 2. use the knowledge that simulations for observation i mustcome from its tomography line:(a) By the story of the model, if we know Ti , we learn the entire tomography

line (since Xi is known ex ante).(b) So we will condition on Ti to make a prediction from the tomography line.(c) We could apply the Option 1 algorithm and use rejection sampling

(discard simulations of βbi , β

wi that are not on the tomography line), but

this would take forever.

(d) Alternative algorithm for drawing simulations of βbi and βw

i .

i. Find the expression for P(βbi |Ti , ψ̆) analytically, which is a particular

truncated univariate normal (see King, 1997: Appendix C).ii. Draw ψ̆ from its posterior or sampling density (the same multivariate

normal as always).iii. Insert the simulation into P(βb

i |Ti , ψ̆) and draw out one simulated βbi .

iv. Compute βwi , if desired, deterministically from reformulated accounting

identity:

β̃wi =

„Ti

1− Xi

«−

„Xi

1− Xi

«β̃b

i

Gary King () Ecological Inference 30 / 38

Page 133: Advanced Quantitative Research Methodology Lecture Notes

How to Calculate Quantities of Interest

2. Option 2. use the knowledge that simulations for observation i mustcome from its tomography line:(a) By the story of the model, if we know Ti , we learn the entire tomography

line (since Xi is known ex ante).(b) So we will condition on Ti to make a prediction from the tomography line.(c) We could apply the Option 1 algorithm and use rejection sampling

(discard simulations of βbi , β

wi that are not on the tomography line), but

this would take forever.(d) Alternative algorithm for drawing simulations of βb

i and βwi .

i. Find the expression for P(βbi |Ti , ψ̆) analytically, which is a particular

truncated univariate normal (see King, 1997: Appendix C).ii. Draw ψ̆ from its posterior or sampling density (the same multivariate

normal as always).iii. Insert the simulation into P(βb

i |Ti , ψ̆) and draw out one simulated βbi .

iv. Compute βwi , if desired, deterministically from reformulated accounting

identity:

β̃wi =

„Ti

1− Xi

«−

„Xi

1− Xi

«β̃b

i

Gary King () Ecological Inference 30 / 38

Page 134: Advanced Quantitative Research Methodology Lecture Notes

How to Calculate Quantities of Interest

2. Option 2. use the knowledge that simulations for observation i mustcome from its tomography line:(a) By the story of the model, if we know Ti , we learn the entire tomography

line (since Xi is known ex ante).(b) So we will condition on Ti to make a prediction from the tomography line.(c) We could apply the Option 1 algorithm and use rejection sampling

(discard simulations of βbi , β

wi that are not on the tomography line), but

this would take forever.(d) Alternative algorithm for drawing simulations of βb

i and βwi .

i. Find the expression for P(βbi |Ti , ψ̆) analytically, which is a particular

truncated univariate normal (see King, 1997: Appendix C).

ii. Draw ψ̆ from its posterior or sampling density (the same multivariatenormal as always).

iii. Insert the simulation into P(βbi |Ti , ψ̆) and draw out one simulated βb

i .iv. Compute βw

i , if desired, deterministically from reformulated accountingidentity:

β̃wi =

„Ti

1− Xi

«−

„Xi

1− Xi

«β̃b

i

Gary King () Ecological Inference 30 / 38

Page 135: Advanced Quantitative Research Methodology Lecture Notes

How to Calculate Quantities of Interest

2. Option 2. use the knowledge that simulations for observation i mustcome from its tomography line:(a) By the story of the model, if we know Ti , we learn the entire tomography

line (since Xi is known ex ante).(b) So we will condition on Ti to make a prediction from the tomography line.(c) We could apply the Option 1 algorithm and use rejection sampling

(discard simulations of βbi , β

wi that are not on the tomography line), but

this would take forever.(d) Alternative algorithm for drawing simulations of βb

i and βwi .

i. Find the expression for P(βbi |Ti , ψ̆) analytically, which is a particular

truncated univariate normal (see King, 1997: Appendix C).ii. Draw ψ̆ from its posterior or sampling density (the same multivariate

normal as always).

iii. Insert the simulation into P(βbi |Ti , ψ̆) and draw out one simulated βb

i .iv. Compute βw

i , if desired, deterministically from reformulated accountingidentity:

β̃wi =

„Ti

1− Xi

«−

„Xi

1− Xi

«β̃b

i

Gary King () Ecological Inference 30 / 38

Page 136: Advanced Quantitative Research Methodology Lecture Notes

How to Calculate Quantities of Interest

2. Option 2. use the knowledge that simulations for observation i mustcome from its tomography line:(a) By the story of the model, if we know Ti , we learn the entire tomography

line (since Xi is known ex ante).(b) So we will condition on Ti to make a prediction from the tomography line.(c) We could apply the Option 1 algorithm and use rejection sampling

(discard simulations of βbi , β

wi that are not on the tomography line), but

this would take forever.(d) Alternative algorithm for drawing simulations of βb

i and βwi .

i. Find the expression for P(βbi |Ti , ψ̆) analytically, which is a particular

truncated univariate normal (see King, 1997: Appendix C).ii. Draw ψ̆ from its posterior or sampling density (the same multivariate

normal as always).iii. Insert the simulation into P(βb

i |Ti , ψ̆) and draw out one simulated βbi .

iv. Compute βwi , if desired, deterministically from reformulated accounting

identity:

β̃wi =

„Ti

1− Xi

«−

„Xi

1− Xi

«β̃b

i

Gary King () Ecological Inference 30 / 38

Page 137: Advanced Quantitative Research Methodology Lecture Notes

How to Calculate Quantities of Interest

2. Option 2. use the knowledge that simulations for observation i mustcome from its tomography line:(a) By the story of the model, if we know Ti , we learn the entire tomography

line (since Xi is known ex ante).(b) So we will condition on Ti to make a prediction from the tomography line.(c) We could apply the Option 1 algorithm and use rejection sampling

(discard simulations of βbi , β

wi that are not on the tomography line), but

this would take forever.(d) Alternative algorithm for drawing simulations of βb

i and βwi .

i. Find the expression for P(βbi |Ti , ψ̆) analytically, which is a particular

truncated univariate normal (see King, 1997: Appendix C).ii. Draw ψ̆ from its posterior or sampling density (the same multivariate

normal as always).iii. Insert the simulation into P(βb

i |Ti , ψ̆) and draw out one simulated βbi .

iv. Compute βwi , if desired, deterministically from reformulated accounting

identity:

β̃wi =

„Ti

1− Xi

«−

„Xi

1− Xi

«β̃b

i

Gary King () Ecological Inference 30 / 38

Page 138: Advanced Quantitative Research Methodology Lecture Notes

How to Calculate Quantities of Interest

2. Option 2. use the knowledge that simulations for observation i mustcome from its tomography line:(a) By the story of the model, if we know Ti , we learn the entire tomography

line (since Xi is known ex ante).(b) So we will condition on Ti to make a prediction from the tomography line.(c) We could apply the Option 1 algorithm and use rejection sampling

(discard simulations of βbi , β

wi that are not on the tomography line), but

this would take forever.(d) Alternative algorithm for drawing simulations of βb

i and βwi .

i. Find the expression for P(βbi |Ti , ψ̆) analytically, which is a particular

truncated univariate normal (see King, 1997: Appendix C).ii. Draw ψ̆ from its posterior or sampling density (the same multivariate

normal as always).iii. Insert the simulation into P(βb

i |Ti , ψ̆) and draw out one simulated βbi .

iv. Compute βwi , if desired, deterministically from reformulated accounting

identity:

β̃wi =

„Ti

1− Xi

«−

„Xi

1− Xi

«β̃b

i

Gary King () Ecological Inference 30 / 38

Page 139: Advanced Quantitative Research Methodology Lecture Notes

How to Calculate Quantities of Interest

3. Option 2 will be improved via reparameterization.

φ1 =B̆b − 0.5

σ̆2b + 0.25

φ2 =B̆w − 0.5

σ̆2w + 0.25

both of which reduce correlations, and

φ3 = ln(σ̆b)

φ4 = ln(σ̆w )

φ5 = 0.5 ln

(1 + ρ̆

1− ρ̆

)

4. Option 2 will also be improved by fixing the draws with importanceresampling (just as we did with Amelia, Lecture Notes 4)

Gary King () Ecological Inference 31 / 38

Page 140: Advanced Quantitative Research Methodology Lecture Notes

How to Calculate Quantities of Interest

3. Option 2 will be improved via reparameterization.

φ1 =B̆b − 0.5

σ̆2b + 0.25

φ2 =B̆w − 0.5

σ̆2w + 0.25

both of which reduce correlations, and

φ3 = ln(σ̆b)

φ4 = ln(σ̆w )

φ5 = 0.5 ln

(1 + ρ̆

1− ρ̆

)4. Option 2 will also be improved by fixing the draws with importance

resampling (just as we did with Amelia, Lecture Notes 4)

Gary King () Ecological Inference 31 / 38

Page 141: Advanced Quantitative Research Methodology Lecture Notes

How to Calculate Quantities of Interest

3. Option 2 will be improved via reparameterization.

φ1 =B̆b − 0.5

σ̆2b + 0.25

φ2 =B̆w − 0.5

σ̆2w + 0.25

both of which reduce correlations, and

φ3 = ln(σ̆b)

φ4 = ln(σ̆w )

φ5 = 0.5 ln

(1 + ρ̆

1− ρ̆

)4. Option 2 will also be improved by fixing the draws with importance

resampling (just as we did with Amelia, Lecture Notes 4)

Gary King () Ecological Inference 31 / 38

Page 142: Advanced Quantitative Research Methodology Lecture Notes

How to Calculate Quantities of Interest

3. Option 2 will be improved via reparameterization.

φ1 =B̆b − 0.5

σ̆2b + 0.25

φ2 =B̆w − 0.5

σ̆2w + 0.25

both of which reduce correlations, and

φ3 = ln(σ̆b)

φ4 = ln(σ̆w )

φ5 = 0.5 ln

(1 + ρ̆

1− ρ̆

)4. Option 2 will also be improved by fixing the draws with importance

resampling (just as we did with Amelia, Lecture Notes 4)

Gary King () Ecological Inference 31 / 38

Page 143: Advanced Quantitative Research Methodology Lecture Notes

How to Calculate Quantities of Interest

3. Option 2 will be improved via reparameterization.

φ1 =B̆b − 0.5

σ̆2b + 0.25

φ2 =B̆w − 0.5

σ̆2w + 0.25

both of which reduce correlations, and

φ3 = ln(σ̆b)

φ4 = ln(σ̆w )

φ5 = 0.5 ln

(1 + ρ̆

1− ρ̆

)4. Option 2 will also be improved by fixing the draws with importance

resampling (just as we did with Amelia, Lecture Notes 4)

Gary King () Ecological Inference 31 / 38

Page 144: Advanced Quantitative Research Methodology Lecture Notes

How to Calculate Quantities of Interest

3. Option 2 will be improved via reparameterization.

φ1 =B̆b − 0.5

σ̆2b + 0.25

φ2 =B̆w − 0.5

σ̆2w + 0.25

both of which reduce correlations, and

φ3 = ln(σ̆b)

φ4 = ln(σ̆w )

φ5 = 0.5 ln

(1 + ρ̆

1− ρ̆

)4. Option 2 will also be improved by fixing the draws with importance

resampling (just as we did with Amelia, Lecture Notes 4)

Gary King () Ecological Inference 31 / 38

Page 145: Advanced Quantitative Research Methodology Lecture Notes

How to Calculate Quantities of Interest

3. Option 2 will be improved via reparameterization.

φ1 =B̆b − 0.5

σ̆2b + 0.25

φ2 =B̆w − 0.5

σ̆2w + 0.25

both of which reduce correlations, and

φ3 = ln(σ̆b)

φ4 = ln(σ̆w )

φ5 = 0.5 ln

(1 + ρ̆

1− ρ̆

)4. Option 2 will also be improved by fixing the draws with importance

resampling (just as we did with Amelia, Lecture Notes 4)

Gary King () Ecological Inference 31 / 38

Page 146: Advanced Quantitative Research Methodology Lecture Notes

How to Calculate Quantities of Interest

3. Option 2 will be improved via reparameterization.

φ1 =B̆b − 0.5

σ̆2b + 0.25

φ2 =B̆w − 0.5

σ̆2w + 0.25

both of which reduce correlations, and

φ3 = ln(σ̆b)

φ4 = ln(σ̆w )

φ5 = 0.5 ln

(1 + ρ̆

1− ρ̆

)

4. Option 2 will also be improved by fixing the draws with importanceresampling (just as we did with Amelia, Lecture Notes 4)

Gary King () Ecological Inference 31 / 38

Page 147: Advanced Quantitative Research Methodology Lecture Notes

How to Calculate Quantities of Interest

3. Option 2 will be improved via reparameterization.

φ1 =B̆b − 0.5

σ̆2b + 0.25

φ2 =B̆w − 0.5

σ̆2w + 0.25

both of which reduce correlations, and

φ3 = ln(σ̆b)

φ4 = ln(σ̆w )

φ5 = 0.5 ln

(1 + ρ̆

1− ρ̆

)4. Option 2 will also be improved by fixing the draws with importance

resampling (just as we did with Amelia, Lecture Notes 4)

Gary King () Ecological Inference 31 / 38

Page 148: Advanced Quantitative Research Methodology Lecture Notes

Advantages and Risks of Goodman

(1) (2)Assumptions correct? Yes No

Estimates accurate? Yes No

Advantages and Risks of EI

(1) (2) (3) (4) (5)Assumptions correct? Yes No No No NoBounds Informative? Yes No No No

Diagnostics Informative? Yes No NoQualitative Info. Useful? Yes No

Estimates accurate? Yes Yes Yes Yes No∗

∗Even in this case, EI is more robust than Goodman, and the maximumrisks are knowable ex ante.

Gary King () Ecological Inference 32 / 38

Page 149: Advanced Quantitative Research Methodology Lecture Notes

Advantages and Risks of Goodman

(1) (2)Assumptions correct? Yes No

Estimates accurate? Yes No

Advantages and Risks of EI

(1) (2) (3) (4) (5)Assumptions correct? Yes No No No NoBounds Informative? Yes No No No

Diagnostics Informative? Yes No NoQualitative Info. Useful? Yes No

Estimates accurate? Yes Yes Yes Yes No∗

∗Even in this case, EI is more robust than Goodman, and the maximumrisks are knowable ex ante.

Gary King () Ecological Inference 32 / 38

Page 150: Advanced Quantitative Research Methodology Lecture Notes

Advantages and Risks of Goodman

(1) (2)Assumptions correct? Yes No

Estimates accurate? Yes No

Advantages and Risks of EI

(1) (2) (3) (4) (5)Assumptions correct? Yes No No No NoBounds Informative? Yes No No No

Diagnostics Informative? Yes No NoQualitative Info. Useful? Yes No

Estimates accurate? Yes Yes Yes Yes No∗

∗Even in this case, EI is more robust than Goodman, and the maximumrisks are knowable ex ante.

Gary King () Ecological Inference 32 / 38

Page 151: Advanced Quantitative Research Methodology Lecture Notes

Advantages and Risks of Goodman

(1) (2)Assumptions correct? Yes No

Estimates accurate? Yes No

Advantages and Risks of EI

(1) (2) (3) (4) (5)Assumptions correct? Yes No No No NoBounds Informative? Yes No No No

Diagnostics Informative? Yes No NoQualitative Info. Useful? Yes No

Estimates accurate? Yes Yes Yes Yes No∗

∗Even in this case, EI is more robust than Goodman, and the maximumrisks are knowable ex ante.

Gary King () Ecological Inference 32 / 38

Page 152: Advanced Quantitative Research Methodology Lecture Notes

• •• •

••••

• •

• •

• ••• •

••••• •

•••• •

••

••••

• • ••

•• •••

•• •

••

• ••

••

• ••

•• •• •

•••

••

0 .25 .5 .75 1

0

.25

.5

.75

1(A) Tomography, No Bias

βbi

βwi

0 .25 .5 .75 1

0

.25

.5

.75

1(B) Scatter Plot, No Bias

Xi

Ti

•••

•• •••••

••••

• •

•••

••

••

•••

••

••••

•••

•••

•• ••

• ••

•••• ••

••

• ••

•••

• ••••

••••••

0 .25 .5 .75 1

0

.25

.5

.75

1(C) Tomography, Bias

βbi

βwi

0 .25 .5 .75 1

0

.25

.5

.75

1(D) Scatter Plot, Bias

Xi

Ti

Figure: The Worst of Aggregation Bias: Same Truth, Different Observable Implications. Graphs (A) and (B) represent data

randomly generated from the model with parameters B̆b = B̆w = 0.5, σ̆b = 0.4, σ̆w = 0.1, and ρ̆ = 0.2. Graphs (C) and

(D) represent data with the same values of βbi and βw

i but different aggregate data, created to maximize the degree ofaggregation bias while still leaving the generating distribution unchanged. Graphs (A) and (C) are tomography plots with truecoordinates appearing as black dots, and with contour ellipses estimated from aggregate data only. Graphs (B) and (D) arescatter plots of Xi by Ti with lines representing the true precinct parameters.

Gary King () Ecological Inference 33 / 38

Page 153: Advanced Quantitative Research Methodology Lecture Notes

• •• •

•• ••

• •

• •• ••• ••• •••••

••••

•••

•• • •••••

•• • •••

••••

••

• ••

••

••

•• ••• •• •

••

•••

••

0 .25 .5 .75 1

0

.25

.5

.75

1(A) Tomography, No Bias

βbi

βwi

0 .25 .5 .75 1

0

.25

.5

.75

1(B) Scatter Plot, No Bias

Xi

Ti

••• •

••••••

• ••

••••

•• ••

•••• •

••••••••

•• •• •••

• •••••••

••

•• • ••

• ••• •• ••

•• •••• •

0 .25 .5 .75 1

0

.25

.5

.75

1(C) Tomography, Bias

βbi

βwi

0 .25 .5 .75 1

0

.25

.5

.75

1(D) Scatter Plot, Bias

Xi

Ti

Figure: The Worst of Distributional Violations: Different True Parameters, Same Observable Implications. Graphs (A) and

(B) represent data randomly generated from the model with parameters B̆b = B̆w = 0.5, σ̆b = σ̆w = 0.1, and ρ̆ = 0.

Graphs (C) and (D) represent data with the same values of Xi and Ti but different values of βbi and βw

i , created to minimizethe fit of the truncated bivariate normal distribution while not introducing aggregation bias. Graphs (A) and (C) are tomographyplots with true coordinates appearing as black dots, and with contour ellipses estimated from aggregate data only. Graphs (B)and (D) are scatter plots of Xi by Ti with lines representing the true precinct parameters.

Gary King () Ecological Inference 34 / 38

Page 154: Advanced Quantitative Research Methodology Lecture Notes

Diagnostics

0 .25 .5 .75 1

0

.25

.5

.75

1

βbi

βwi

Evidence of Multiple Modes (King, 1997: Figure 9.7)

0 .25 .5 .75 10

.25

.5

.75

1

βbi

βwi

Evidence of Aggregation Bias (King, 1997: Figure 9.3)

Gary King () Ecological Inference 35 / 38

Page 155: Advanced Quantitative Research Methodology Lecture Notes

Diagnostics

0 .25 .5 .75 1

0

.25

.5

.75

1

βbi

βwi

Evidence of Multiple Modes (King, 1997: Figure 9.7)

0 .25 .5 .75 10

.25

.5

.75

1

βbi

βwi

Evidence of Aggregation Bias (King, 1997: Figure 9.3)

Gary King () Ecological Inference 35 / 38

Page 156: Advanced Quantitative Research Methodology Lecture Notes

Diagnostics

0 .25 .5 .75 1

0

.25

.5

.75

1

βbi

βwi

Evidence of Multiple Modes (King, 1997: Figure 9.7)

0 .25 .5 .75 10

.25

.5

.75

1

βbi

βwi

Evidence of Aggregation Bias (King, 1997: Figure 9.3)

Gary King () Ecological Inference 35 / 38

Page 157: Advanced Quantitative Research Methodology Lecture Notes

Diagnostics

••

••••

• ••• •

••

••

••

••

••

••

•••

• •••

••

•••

•• •

••

••• ••• •

• ••••

•• •

••

•••

••

••

•••

•••

••

••

••

••

••••

• •••

••

••••

•••

•••

••

••••

••••

••

• •••

••

•••

••••

• •••• ••

••

• ••

•••

••

•• •

•••

••

• ••

•••

•••

••••

•••••••

•• ••

• •• ••

••

••••• •

••

•••

••

• ••••

•••

••

••

••

• ••

••

• •••

••••

••

••

•• ••

•••

•••

••

••

0 .25 .5 .75 10

.25

.5

.75

1

X i

Tru

e βb i

• ••

• •••• •• •

•• •••

•• •••• • ••

• ••• ••

•••

•• •• ••• •• • •••••• • ••

•• ••• •• ••••••

•• ••••••• •••• ••••

••• •••••••

••••••••

••• •• •••

••••••• ••••••

••• •••

•••

•• ••••

••• ••••

•••

• ••••••••••

• ••

• • • ••• ••••

• •• •• •

••• ••• •• ••••

•• •

•• ••••••••

•• •• •• ••

• • •••

••••• ••••

••• ••

•• ••• ••

• •••• • ••

• • • •••

• •••

•••

• •••••

•• • ••• ••

••

• •••• •••

•• ••

0 .25 .5 .75 10

.25

.5

.75

1

X i

Tru

e βw i

Evidence of aggregation bias in real data: an alternative display (King:1997: Figure 13.2)

Gary King () Ecological Inference 36 / 38

Page 158: Advanced Quantitative Research Methodology Lecture Notes

Model Extensions

1. Adding a systematic component for the means, with covariates Zi :

B̆bi =

hφ1(σ̆

2b + 0.25) + 0.5

i+ (Z b

i − Z̄ b)αb

B̆wi =

hφ2(σ̆

2w + 0.25) + 0.5

i+ (Zw

i − Z̄w )αw

This bracketed term unwinds the reparameterization.

2. Allowing Zi = Xi works if bounds are informative (recall that it was not identifiedunder Goodman’s unbounded approach)

•0 .25 .5 .75 1 1.25

0

.025

.05

.1

.15

Mea

n A

bsol

ute

Err

or

α

τ= 0

τ= 0.1

τ= 0.2

τ= 0.5

τ= 1τ= ∞

τ= 0

τ= ∞

1. Figure (from King, 1997: Figure 9.6) is mean absolute error of district-levelquantities (averaged over 90 Monte Carlo simulations for each point) by the degreeof aggregation bias (α = αb = αw set at points on the horizontal axis) for modelswith prior N(α|0, τ).

Gary King () Ecological Inference 37 / 38

Page 159: Advanced Quantitative Research Methodology Lecture Notes

Model Extensions

1. Adding a systematic component for the means, with covariates Zi :

B̆bi =

hφ1(σ̆

2b + 0.25) + 0.5

i+ (Z b

i − Z̄ b)αb

B̆wi =

hφ2(σ̆

2w + 0.25) + 0.5

i+ (Zw

i − Z̄w )αw

This bracketed term unwinds the reparameterization.

2. Allowing Zi = Xi works if bounds are informative (recall that it was not identifiedunder Goodman’s unbounded approach)

•0 .25 .5 .75 1 1.25

0

.025

.05

.1

.15

Mea

n A

bsol

ute

Err

or

α

τ= 0

τ= 0.1

τ= 0.2

τ= 0.5

τ= 1τ= ∞

τ= 0

τ= ∞

1. Figure (from King, 1997: Figure 9.6) is mean absolute error of district-levelquantities (averaged over 90 Monte Carlo simulations for each point) by the degreeof aggregation bias (α = αb = αw set at points on the horizontal axis) for modelswith prior N(α|0, τ).

Gary King () Ecological Inference 37 / 38

Page 160: Advanced Quantitative Research Methodology Lecture Notes

Model Extensions

1. Adding a systematic component for the means, with covariates Zi :

B̆bi =

hφ1(σ̆

2b + 0.25) + 0.5

i+ (Z b

i − Z̄ b)αb

B̆wi =

hφ2(σ̆

2w + 0.25) + 0.5

i+ (Zw

i − Z̄w )αw

This bracketed term unwinds the reparameterization.

2. Allowing Zi = Xi works if bounds are informative (recall that it was not identifiedunder Goodman’s unbounded approach)

•0 .25 .5 .75 1 1.25

0

.025

.05

.1

.15

Mea

n A

bsol

ute

Err

or

α

τ= 0

τ= 0.1

τ= 0.2

τ= 0.5

τ= 1τ= ∞

τ= 0

τ= ∞

1. Figure (from King, 1997: Figure 9.6) is mean absolute error of district-levelquantities (averaged over 90 Monte Carlo simulations for each point) by the degreeof aggregation bias (α = αb = αw set at points on the horizontal axis) for modelswith prior N(α|0, τ).

Gary King () Ecological Inference 37 / 38

Page 161: Advanced Quantitative Research Methodology Lecture Notes

Model Extensions

1. Adding a systematic component for the means, with covariates Zi :

B̆bi =

hφ1(σ̆

2b + 0.25) + 0.5

i+ (Z b

i − Z̄ b)αb

B̆wi =

hφ2(σ̆

2w + 0.25) + 0.5

i+ (Zw

i − Z̄w )αw

This bracketed term unwinds the reparameterization.

2. Allowing Zi = Xi works if bounds are informative (recall that it was not identifiedunder Goodman’s unbounded approach)

•0 .25 .5 .75 1 1.25

0

.025

.05

.1

.15

Mea

n A

bsol

ute

Err

or

α

τ= 0

τ= 0.1

τ= 0.2

τ= 0.5

τ= 1τ= ∞

τ= 0

τ= ∞

1. Figure (from King, 1997: Figure 9.6) is mean absolute error of district-levelquantities (averaged over 90 Monte Carlo simulations for each point) by the degreeof aggregation bias (α = αb = αw set at points on the horizontal axis) for modelswith prior N(α|0, τ).

Gary King () Ecological Inference 37 / 38

Page 162: Advanced Quantitative Research Methodology Lecture Notes

Model Extensions

1. Adding a systematic component for the means, with covariates Zi :

B̆bi =

hφ1(σ̆

2b + 0.25) + 0.5

i+ (Z b

i − Z̄ b)αb

B̆wi =

hφ2(σ̆

2w + 0.25) + 0.5

i+ (Zw

i − Z̄w )αw

This bracketed term unwinds the reparameterization.

2. Allowing Zi = Xi works if bounds are informative (recall that it was not identifiedunder Goodman’s unbounded approach)

•0 .25 .5 .75 1 1.25

0

.025

.05

.1

.15

Mea

n A

bsol

ute

Err

or

α

τ= 0

τ= 0.1

τ= 0.2

τ= 0.5

τ= 1τ= ∞

τ= 0

τ= ∞

1. Figure (from King, 1997: Figure 9.6) is mean absolute error of district-levelquantities (averaged over 90 Monte Carlo simulations for each point) by the degreeof aggregation bias (α = αb = αw set at points on the horizontal axis) for modelswith prior N(α|0, τ).

Gary King () Ecological Inference 37 / 38

Page 163: Advanced Quantitative Research Methodology Lecture Notes

Model Extensions

1. Adding a systematic component for the means, with covariates Zi :

B̆bi =

hφ1(σ̆

2b + 0.25) + 0.5

i+ (Z b

i − Z̄ b)αb

B̆wi =

hφ2(σ̆

2w + 0.25) + 0.5

i+ (Zw

i − Z̄w )αw

This bracketed term unwinds the reparameterization.

2. Allowing Zi = Xi works if bounds are informative (recall that it was not identifiedunder Goodman’s unbounded approach)

•0 .25 .5 .75 1 1.25

0

.025

.05

.1

.15

Mea

n A

bsol

ute

Err

or

α

τ= 0

τ= 0.1

τ= 0.2

τ= 0.5

τ= 1τ= ∞

τ= 0

τ= ∞

1. Figure (from King, 1997: Figure 9.6) is mean absolute error of district-levelquantities (averaged over 90 Monte Carlo simulations for each point) by the degreeof aggregation bias (α = αb = αw set at points on the horizontal axis) for modelswith prior N(α|0, τ).

Gary King () Ecological Inference 37 / 38

Page 164: Advanced Quantitative Research Methodology Lecture Notes

Model Extensions

1. Adding a systematic component for the means, with covariates Zi :

B̆bi =

hφ1(σ̆

2b + 0.25) + 0.5

i+ (Z b

i − Z̄ b)αb

B̆wi =

hφ2(σ̆

2w + 0.25) + 0.5

i+ (Zw

i − Z̄w )αw

This bracketed term unwinds the reparameterization.

2. Allowing Zi = Xi works if bounds are informative (recall that it was not identifiedunder Goodman’s unbounded approach)

•0 .25 .5 .75 1 1.25

0

.025

.05

.1

.15

Mea

n A

bsol

ute

Err

or

α

τ= 0

τ= 0.1

τ= 0.2

τ= 0.5

τ= 1τ= ∞

τ= 0

τ= ∞

1. Figure (from King, 1997: Figure 9.6) is mean absolute error of district-levelquantities (averaged over 90 Monte Carlo simulations for each point) by the degreeof aggregation bias (α = αb = αw set at points on the horizontal axis) for modelswith prior N(α|0, τ).

Gary King () Ecological Inference 37 / 38

Page 165: Advanced Quantitative Research Methodology Lecture Notes

Model Extensions

1. Adding a systematic component for the means, with covariates Zi :

B̆bi =

hφ1(σ̆

2b + 0.25) + 0.5

i+ (Z b

i − Z̄ b)αb

B̆wi =

hφ2(σ̆

2w + 0.25) + 0.5

i+ (Zw

i − Z̄w )αw

This bracketed term unwinds the reparameterization.

2. Allowing Zi = Xi works if bounds are informative (recall that it was not identifiedunder Goodman’s unbounded approach)

•0 .25 .5 .75 1 1.25

0

.025

.05

.1

.15

Mea

n A

bsol

ute

Err

or

α

τ= 0

τ= 0.1

τ= 0.2

τ= 0.5

τ= 1τ= ∞

τ= 0

τ= ∞

1. Figure (from King, 1997: Figure 9.6) is mean absolute error of district-levelquantities (averaged over 90 Monte Carlo simulations for each point) by the degreeof aggregation bias (α = αb = αw set at points on the horizontal axis) for modelswith prior N(α|0, τ).

Gary King () Ecological Inference 37 / 38

Page 166: Advanced Quantitative Research Methodology Lecture Notes

Model Extensions

2. In the case with τ = 0 (i.e., no covariate), MAE increases with α, butmaxes out — which is the robustness due to the bounds kicking in.

3. Goodman’s regression, if plotted, would be a straight line increasingwithout bound.

4. In the τ = ∞ (covariate included, no prior) case, MAE doesn’t changewith α. Its higher than τ = 0 case with small α and larger with higherα.

5. As usual with Bayesian analysis, the intermediate cases seem optimal.E.g., τ = 0.5 sacrifices trivially compared to the τ = 0 case when α issmall for a big gain when α is large.

Gary King () Ecological Inference 38 / 38

Page 167: Advanced Quantitative Research Methodology Lecture Notes

Model Extensions

2. In the case with τ = 0 (i.e., no covariate), MAE increases with α, butmaxes out — which is the robustness due to the bounds kicking in.

3. Goodman’s regression, if plotted, would be a straight line increasingwithout bound.

4. In the τ = ∞ (covariate included, no prior) case, MAE doesn’t changewith α. Its higher than τ = 0 case with small α and larger with higherα.

5. As usual with Bayesian analysis, the intermediate cases seem optimal.E.g., τ = 0.5 sacrifices trivially compared to the τ = 0 case when α issmall for a big gain when α is large.

Gary King () Ecological Inference 38 / 38

Page 168: Advanced Quantitative Research Methodology Lecture Notes

Model Extensions

2. In the case with τ = 0 (i.e., no covariate), MAE increases with α, butmaxes out — which is the robustness due to the bounds kicking in.

3. Goodman’s regression, if plotted, would be a straight line increasingwithout bound.

4. In the τ = ∞ (covariate included, no prior) case, MAE doesn’t changewith α. Its higher than τ = 0 case with small α and larger with higherα.

5. As usual with Bayesian analysis, the intermediate cases seem optimal.E.g., τ = 0.5 sacrifices trivially compared to the τ = 0 case when α issmall for a big gain when α is large.

Gary King () Ecological Inference 38 / 38

Page 169: Advanced Quantitative Research Methodology Lecture Notes

Model Extensions

2. In the case with τ = 0 (i.e., no covariate), MAE increases with α, butmaxes out — which is the robustness due to the bounds kicking in.

3. Goodman’s regression, if plotted, would be a straight line increasingwithout bound.

4. In the τ = ∞ (covariate included, no prior) case, MAE doesn’t changewith α. Its higher than τ = 0 case with small α and larger with higherα.

5. As usual with Bayesian analysis, the intermediate cases seem optimal.E.g., τ = 0.5 sacrifices trivially compared to the τ = 0 case when α issmall for a big gain when α is large.

Gary King () Ecological Inference 38 / 38

Page 170: Advanced Quantitative Research Methodology Lecture Notes

Model Extensions

2. In the case with τ = 0 (i.e., no covariate), MAE increases with α, butmaxes out — which is the robustness due to the bounds kicking in.

3. Goodman’s regression, if plotted, would be a straight line increasingwithout bound.

4. In the τ = ∞ (covariate included, no prior) case, MAE doesn’t changewith α. Its higher than τ = 0 case with small α and larger with higherα.

5. As usual with Bayesian analysis, the intermediate cases seem optimal.E.g., τ = 0.5 sacrifices trivially compared to the τ = 0 case when α issmall for a big gain when α is large.

Gary King () Ecological Inference 38 / 38