Loglinear Models for Contingency Tables

Upload: len

Post on 15-Feb-2016


TRANSCRIPT

Page 1: Loglinear  Models for Contingency Tables

Loglinear Models for Contingency Tables

Page 2: Loglinear  Models for Contingency Tables

• Consider an $I \times J$ contingency table that cross-classifies a multinomial sample of n subjects on two categorical responses.

• The cell probabilities are $\{\pi_{ij}\}$ and the expected frequencies are $\{\mu_{ij} = n\pi_{ij}\}$.

• Loglinear model formulas use $\{\mu_{ij} = n\pi_{ij}\}$ rather than $\{\pi_{ij}\}$, so they also apply with Poisson sampling for $N = IJ$ independent cell counts $\{Y_{ij}\}$ having $\{\mu_{ij} = E(Y_{ij})\}$.

• In either case we denote the observed cell counts by $\{n_{ij}\}$.

Page 3: Loglinear  Models for Contingency Tables

Independence Model

Under statistical independence, $\pi_{ij} = \pi_{i+}\pi_{+j}$.

For multinomial sampling, $\mu_{ij} = n\pi_{i+}\pi_{+j}$.

Denote the row variable by X and the column variable by Y. The formula expressing independence is multiplicative.

Page 4: Loglinear  Models for Contingency Tables

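Thus, $\log \mu_{ij}$ has the additive form

$$\log \mu_{ij} = \lambda + \lambda_i^X + \lambda_j^Y$$

for a row effect $\lambda_i^X$ and a column effect $\lambda_j^Y$. This is the loglinear model of independence. As usual, identifiability requires constraints such as $\lambda_I^X = \lambda_J^Y = 0$ or $\sum_i \lambda_i^X = \sum_j \lambda_j^Y = 0$.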
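The additive form of the independence model can be checked numerically. The following is a minimal sketch, assuming the Sex x Party counts from the example later in this deck and the constraint that the last-level parameters equal zero; the variable names are illustrative only.

```python
import math

# Observed counts from the Sex x Party example later in this deck
# (rows: Male, Female; columns: Democrat, Republican).
counts = [[222, 115], [240, 185]]

n = sum(sum(row) for row in counts)
row_tot = [sum(row) for row in counts]
col_tot = [sum(col) for col in zip(*counts)]
I, J = len(row_tot), len(col_tot)

# Fitted values under independence: mu_ij = n * (n_i+ / n) * (n_+j / n).
mu = [[r * c / n for c in col_tot] for r in row_tot]

# Assumed constraint for this sketch: last-level parameters set to zero.
lam = math.log(mu[I - 1][J - 1])
lam_x = [math.log(mu[i][J - 1]) - lam for i in range(I)]
lam_y = [math.log(mu[I - 1][j]) - lam for j in range(J)]

# log(mu_ij) should equal lam + lam_x[i] + lam_y[j] in every cell.
max_err = max(
    abs(math.log(mu[i][j]) - (lam + lam_x[i] + lam_y[j]))
    for i in range(I)
    for j in range(J)
)
print("fitted values:", [[round(m, 2) for m in row] for row in mu])
print("largest additive-form discrepancy:", max_err)
```

A different constraint (for example, sum-to-zero) would change the individual parameter values but not the fitted values.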

Page 5: Loglinear  Models for Contingency Tables
Page 6: Loglinear  Models for Contingency Tables

• The tests using $X^2$ and $G^2$ are also goodness-of-fit tests of this loglinear model.

• Loglinear models for contingency tables are GLMs that treat the N cell counts as independent observations of a Poisson random component.

• Loglinear GLMs identify the data as the N cell counts rather than the individual classifications of the n subjects.

• The expected cell counts link to the explanatory terms using the log link.
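To make the GLM view concrete, the sketch below treats the N = 4 cell counts of a 2 x 2 table as the observations and checks that the closed-form independence fit satisfies the Poisson log-link likelihood equations, X'(n - mu) = 0. The counts are taken from the Sex x Party example later in this deck; the indicator coding of the design matrix is an assumption made for illustration.

```python
import math

# The N = 4 cell counts, ordered (1,1), (1,2), (2,1), (2,2).
counts = [222, 115, 240, 185]

# Assumed coding: intercept, indicator of row 1, indicator of column 1.
X = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 1],
    [1, 0, 0],
]

n = sum(counts)
row_tot = [counts[0] + counts[1], counts[2] + counts[3]]
col_tot = [counts[0] + counts[2], counts[1] + counts[3]]

# Closed-form fitted values for the independence model.
mu = [row_tot[i] * col_tot[j] / n for i in range(2) for j in range(2)]

# Parameters under this coding, read off the log fitted values.
beta = [
    math.log(mu[3]),                    # intercept (cell 2,2)
    math.log(mu[1]) - math.log(mu[3]),  # row-1 effect
    math.log(mu[2]) - math.log(mu[3]),  # column-1 effect
]

# Log link: the linear predictor is the log of the expected count.
eta = [sum(x * b for x, b in zip(row, beta)) for row in X]
link_err = max(abs(math.exp(e) - m) for e, m in zip(eta, mu))

# Poisson score equations: each component of X'(n - mu) should be zero.
score = [
    sum(X[k][p] * (counts[k] - mu[k]) for k in range(4))
    for p in range(3)
]
print("link discrepancy:", link_err)
print("score vector:", score)
```

The score components are the differences between observed and fitted grand total, row-1 total, and column-1 total, which is why the fitted margins match the observed margins.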

Page 7: Loglinear  Models for Contingency Tables

• The model does not distinguish between response and explanatory variables.

• It treats both jointly as responses, modeling $\mu_{ij}$ for combinations of their levels.

• To interpret parameters, however, it is helpful to treat the variables asymmetrically.

Page 8: Loglinear  Models for Contingency Tables

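• We illustrate with the independence model for $I \times 2$ tables.

• In row i, the logit equals

$$\operatorname{logit}[P(Y=1 \mid X=i)] = \log\frac{P(Y=1 \mid X=i)}{P(Y=2 \mid X=i)} = \log\frac{\mu_{i1}}{\mu_{i2}} = (\lambda + \lambda_i^X + \lambda_1^Y) - (\lambda + \lambda_i^X + \lambda_2^Y) = \lambda_1^Y - \lambda_2^Y$$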

Page 9: Loglinear  Models for Contingency Tables

• The final term does not depend on i; that is, logit[P(Y=1 | X=i)] is identical at each level of X.

• Thus, independence implies a model of the form logit[P(Y=1 | X=i)] = $\alpha$.

• In each row, the odds of response in column 1 equal $\exp(\alpha) = \exp(\lambda_1^Y - \lambda_2^Y)$.
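The constant-logit property can be illustrated with a short sketch. It assumes the Sex x Party counts from the example later in this deck and uses the independence fitted values; under that fit, the log odds of column 1 versus column 2 should be the same in every row.

```python
import math

# Rows: Male, Female; columns: Democrat, Republican (from this deck's example).
counts = [[222, 115], [240, 185]]

n = sum(sum(row) for row in counts)
row_tot = [sum(row) for row in counts]
col_tot = [sum(col) for col in zip(*counts)]

# Independence fitted values.
mu = [[r * c / n for c in col_tot] for r in row_tot]

# Logit of response in column 1 within each row of the fitted table.
logits = [math.log(row[0] / row[1]) for row in mu]

# The common value depends only on the column margins.
alpha = math.log(col_tot[0] / col_tot[1])
odds = math.exp(alpha)

print("row logits:", logits)
print("alpha:", alpha, "odds:", odds)
```

Here the common odds equal the ratio of the column totals, 462/300, regardless of the row.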

Page 10: Loglinear  Models for Contingency Tables

• An analogous property holds when J > 2.

• Differences between two parameters for a given variable relate to the log odds of making one response, relative to the other, on that variable.

Page 11: Loglinear  Models for Contingency Tables

Saturated Model

Statistically dependent variables satisfy a more complex loglinear model:

$$\log \mu_{ij} = \lambda + \lambda_i^X + \lambda_j^Y + \lambda_{ij}^{XY}$$

The $\{\lambda_{ij}^{XY}\}$ are association terms that reflect deviations from independence. They represent interactions between X and Y, whereby the effect of one variable on $\mu_{ij}$ depends on the level of the other.

Page 12: Loglinear  Models for Contingency Tables

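Direct relationships exist between log odds ratios and the $\{\lambda_{ij}^{XY}\}$. For example, in a 2 x 2 table the log odds ratio is $\log\theta = \lambda_{11}^{XY} + \lambda_{22}^{XY} - \lambda_{12}^{XY} - \lambda_{21}^{XY}$.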

Page 13: Loglinear  Models for Contingency Tables

Parameter Estimation

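Let $\{\mu_{ij}\}$ denote expected frequencies. Suppose all $\mu_{ij} > 0$ and let $\eta_{ij} = \log \mu_{ij}$. A dot in a subscript denotes the average with respect to that index; for instance, $\eta_{i\cdot} = \frac{1}{J}\sum_j \eta_{ij}$. We set

$$\lambda = \eta_{\cdot\cdot}, \quad \lambda_i^X = \eta_{i\cdot} - \eta_{\cdot\cdot}, \quad \lambda_j^Y = \eta_{\cdot j} - \eta_{\cdot\cdot}, \quad \lambda_{ij}^{XY} = \eta_{ij} - \eta_{i\cdot} - \eta_{\cdot j} + \eta_{\cdot\cdot}$$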

Page 14: Loglinear  Models for Contingency Tables

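The sum of parameters for any index equals zero. That is,

$$\sum_i \lambda_i^X = \sum_j \lambda_j^Y = \sum_i \lambda_{ij}^{XY} = \sum_j \lambda_{ij}^{XY} = 0$$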
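The averaging construction can be sketched in a few lines. This example assumes the saturated model, whose fitted values equal the observed counts, and uses the Sex x Party counts from the example later in this deck. It checks the sum-to-zero property and the link between the association terms and the log odds ratio.

```python
import math

# Saturated model: fitted values equal the observed counts (assumed here).
counts = [[222, 115], [240, 185]]
I, J = len(counts), len(counts[0])

eta = [[math.log(c) for c in row] for row in counts]

# Averages over the dotted index.
eta_row = [sum(eta[i]) / J for i in range(I)]
eta_col = [sum(eta[i][j] for i in range(I)) / I for j in range(J)]
eta_all = sum(sum(row) for row in eta) / (I * J)

# Parameters defined as deviations from averages.
lam = eta_all
lam_x = [eta_row[i] - eta_all for i in range(I)]
lam_y = [eta_col[j] - eta_all for j in range(J)]
lam_xy = [
    [eta[i][j] - eta_row[i] - eta_col[j] + eta_all for j in range(J)]
    for i in range(I)
]

# Sum-to-zero checks over each index.
sums = [sum(lam_x), sum(lam_y)]
sums += [sum(lam_xy[i]) for i in range(I)]
sums += [sum(lam_xy[i][j] for i in range(I)) for j in range(J)]

# Log odds ratio versus the contrast of association terms (2 x 2 case).
log_theta = math.log(counts[0][0] * counts[1][1] / (counts[0][1] * counts[1][0]))
contrast = lam_xy[0][0] + lam_xy[1][1] - lam_xy[0][1] - lam_xy[1][0]

print("largest |sum|:", max(abs(s) for s in sums))
print("log odds ratio:", log_theta, "contrast:", contrast)
```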

Page 15: Loglinear  Models for Contingency Tables

INFERENCE FOR LOGLINEAR MODELS

Chi-Squared Goodness-of-Fit Tests

• As usual, $X^2$ and $G^2$ test whether a model holds by comparing cell fitted values to observed counts:

$$X^2 = \sum_i \sum_j \frac{(n_{ij} - \hat{\mu}_{ij})^2}{\hat{\mu}_{ij}}, \qquad G^2 = 2 \sum_i \sum_j n_{ij} \log\frac{n_{ij}}{\hat{\mu}_{ij}}$$

• Here $n_{ij}$ = observed frequency and $\hat{\mu}_{ij}$ = expected frequency. The df equals the number of cell counts minus the number of model parameters.
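As a worked illustration of the two statistics, the sketch below computes X^2, G^2, and df for the independence model. It assumes the Sex x Party counts from the example that follows in this deck; the parameter count (one intercept, I - 1 row effects, J - 1 column effects) is the usual count of non-redundant parameters for that model.

```python
import math

# Rows: Male, Female; columns: Democrat, Republican.
counts = [[222, 115], [240, 185]]
I, J = len(counts), len(counts[0])

n = sum(sum(row) for row in counts)
row_tot = [sum(row) for row in counts]
col_tot = [sum(col) for col in zip(*counts)]

# Fitted values under the independence model.
mu = [[r * c / n for c in col_tot] for r in row_tot]

# Pearson statistic: sum of (observed - fitted)^2 / fitted.
x2 = sum(
    (counts[i][j] - mu[i][j]) ** 2 / mu[i][j]
    for i in range(I)
    for j in range(J)
)

# Likelihood-ratio statistic: 2 * sum of observed * log(observed / fitted).
g2 = 2 * sum(
    counts[i][j] * math.log(counts[i][j] / mu[i][j])
    for i in range(I)
    for j in range(J)
)

# df = number of cells minus number of non-redundant model parameters.
n_params = 1 + (I - 1) + (J - 1)
df = I * J - n_params

print("X2 =", round(x2, 3), "G2 =", round(g2, 3), "df =", df)
```

Both statistics would be referred to a chi-squared distribution with the computed df.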

Page 16: Loglinear  Models for Contingency Tables

Example for Saturated Model

Observed counts, with expected frequencies in parentheses:

Sex       Party                                 Total
          Democrat          Republican
Male      222 (204.32)      115 (132.68)        337
Female    240 (257.68)      185 (167.32)        425
Total     462               300                 762

Natural logs of the expected frequencies:

Sex       Party                                         Total
          Democrat              Republican
Male      log(204.32) = 5.32    log(132.68) = 4.89      10.21
Female    log(257.68) = 5.55    log(167.32) = 5.12      10.67
Total     10.87                 10.01                   20.88

Page 17: Loglinear  Models for Contingency Tables
Page 18: Loglinear  Models for Contingency Tables
Page 19: Loglinear  Models for Contingency Tables

exp(5.32) = 204.38, exp(4.89) = 132.95, exp(5.55) = 257.24, exp(5.12) = 167.34
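These four values appear to be what results from exponentiating the rounded logs in the table on Page 16. The sketch below reproduces them, assuming sum-to-zero parameters computed from the two-decimal log values by averaging; the interpretation of the slide's numbers is an inference from the arithmetic, not something the slide states.

```python
import math

# Rounded logs of the expected frequencies from the Page 16 table
# (rows: Male, Female; columns: Democrat, Republican).
log_mu = [[5.32, 4.89], [5.55, 5.12]]
I, J = 2, 2

# Averages over rows, columns, and all cells.
eta_row = [sum(log_mu[i]) / J for i in range(I)]
eta_col = [sum(log_mu[i][j] for i in range(I)) / I for j in range(J)]
eta_all = sum(sum(row) for row in log_mu) / (I * J)

# Sum-to-zero parameters of the independence model.
lam = eta_all
lam_x = [eta_row[i] - eta_all for i in range(I)]
lam_y = [eta_col[j] - eta_all for j in range(J)]

# Fitted values: exp(lam + lam_x[i] + lam_y[j]).
fitted = [
    [math.exp(lam + lam_x[i] + lam_y[j]) for j in range(J)]
    for i in range(I)
]
print([[round(f, 2) for f in row] for row in fitted])
```

The small differences from the expected frequencies on Page 16 (for example, 204.38 versus 204.32) come from rounding the logs to two decimals before exponentiating.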

Page 20: Loglinear  Models for Contingency Tables

The full model does not fit.