More Topics in Categorical Data Analysis

Topics covered in this lecture
• Goodness-of-Fit Tests
• Cochran-Armitage Test for Trend in Proportions
• Effect Modifiers and Confounding Factors
  - Stratification
  - Simpson’s Paradox
  - Cochran-Mantel-Haenszel Test
• Measures of Association between two nominal or ordinal variables (X and Y)

Goodness-of-Fit Tests
• There are numerous goodness-of-fit tests in statistics.
• These tests are used to determine whether our sample on a single variable is consistent with a specified distribution.
• For example, we might test whether a numeric random variable has a normal distribution.

Goodness-of-Fit Tests
• Because this lecture concerns categorical (nominal or ordinal) variables, we will consider goodness-of-fit tests for these variable types only.
• To motivate the concept of a goodness-of-fit test, we will consider a few examples where one could be used.
• We will then examine the statistical details of a goodness-of-fit test for these variable types.
• Finally, we will return to the motivating examples and analyze the results.

Example 1: Gender Balance?
• Are the percentages of babies born of each gender in North Carolina equal?

If they are equal, each percentage should be 50%; that is, the distribution across gender should be balanced, or uniform. Do these data suggest this is not the case?

Example 2: Monthly Births in NC

• Is there evidence that births in North Carolina are uniformly distributed throughout the year?

If births are uniformly distributed throughout the year, what percentage of total births should occur in each month?
Answer: 1/12 = .0833, or 8.33%

Example 3: Trimester prenatal care begins in NC
• A researcher believes that 85% of mothers begin prenatal care during the first trimester of pregnancy in NC. They also believe that only 1% of mothers begin prenatal care during the third trimester. Do these data provide evidence that this is not the case?

Goodness-of-Fit Tests
• To conduct a goodness-of-fit test we need to compare the observed frequencies in each category to what we expect to see if the null hypothesis is true.
• The null hypothesis will specify an exact distribution we expect to see across the categories of a nominal or ordinal variable.
• Specifically, the null will state what proportion or percentage of observations we expect to see in each category.

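Goodness-of-Fit Tests
Assuming the variable we are examining has k levels or categories, the null hypothesis is specified as follows:

$$H_0: p_1 = p_1^0,\; p_2 = p_2^0,\; \ldots,\; p_k = p_k^0$$

where $p_i$ is the true proportion in category i and $p_i^0$ is the proportion hypothesized for it, and the alternative says

$$H_a: p_i \neq p_i^0 \text{ for at least one } i$$

or, stated another way, at least one of the hypothesized proportions under the null is incorrect.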

Goodness-of-Fit Tests
The test statistic is the familiar Pearson chi-square statistic, which follows a chi-square distribution with df = k – 1 when the null hypothesis is true. Large values of this statistic indicate the observed frequencies are not consistent with the hypothesized distribution under the null. If the test statistic is large, the p-value will be small, indicating evidence against the null distribution.

Goodness-of-Fit Tests

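Goodness-of-Fit Chi-square Statistic

$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

where $O_i$ is the observed frequency and $E_i$ the expected frequency in category i.

The expected frequency in category i is given by the formula:

$$E_i = n\,p_i^0$$

which is simply the overall sample size n times the hypothesized proportion $p_i^0$ for category i.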
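The two formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own software (the course uses JMP); the function name `gof_chi_square` and the three-category counts are hypothetical and chosen only to show the arithmetic.

```python
def gof_chi_square(observed, null_props):
    """Return (chi-square statistic, degrees of freedom, expected counts)."""
    if len(observed) != len(null_props):
        raise ValueError("need one hypothesized proportion per category")
    if abs(sum(null_props) - 1.0) > 1e-9:
        raise ValueError("hypothesized proportions must sum to 1")
    n = sum(observed)                                  # overall sample size
    expected = [n * p for p in null_props]             # E_i = n * p_i^0
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1, expected           # df = k - 1


# Hypothetical counts for a 3-category variable (illustration only).
stat, df, expected = gof_chi_square([30, 50, 20], [0.25, 0.50, 0.25])
print(stat, df, expected)
```

The p-value would then come from the upper tail of a chi-square distribution with `df` degrees of freedom, which any statistics package can supply.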

Example 1: Gender Balance?

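The expected frequencies based on a sample of n = 10,000 infants are E = 10,000 × .50 = 5,000 for each gender.

Thus the Goodness-of-Fit chi-square statistic is given by:

$$\chi^2 = \frac{(O_{\text{male}} - 5000)^2}{5000} + \frac{(O_{\text{female}} - 5000)^2}{5000} = 10.12$$

where $O_{\text{male}}$ and $O_{\text{female}}$ are the observed counts of male and female infants.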

Example 1: Gender Balance?
The p-value associated with a chi-square statistic value of 10.12 with df = (2 – 1) = 1 is .00147 < α = .05, so we reject the null.

Example 1: Gender Balance?
Thus we conclude that the proportions of infants born of each gender in NC are not equal (p = .0015).

With 95% confidence, we estimate that the percentage of male infants born in NC is between 50.61% and 52.57%, and the percentage of female infants born in NC is between 47.43% and 49.39%.
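The numbers in this example can be checked with a short Python sketch. The observed counts are not given in the text, so the count of 5,159 male births used here is an assumption, back-calculated from the midpoint (51.59%) of the reported confidence interval. The interval is assumed to be the usual large-sample (Wald) interval.

```python
import math

n = 10_000
male = 5_159                 # ASSUMPTION: inferred from the CI midpoint, not stated in the source
female = n - male
expected = n * 0.5           # 5,000 expected per gender under the null

chi2 = (male - expected) ** 2 / expected + (female - expected) ** 2 / expected

# Upper-tail chi-square probability; this erfc shortcut is valid for df = 1 only.
p_value = math.erfc(math.sqrt(chi2 / 2))

# Large-sample 95% confidence interval for the proportion of male births.
p_hat = male / n
half_width = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)
ci_male = (p_hat - half_width, p_hat + half_width)

print(round(chi2, 2), round(p_value, 5), ci_male)
```

Under this assumption the statistic comes out at about 10.11, within rounding of the reported 10.12, and the p-value and interval agree with the reported .00147 and (50.61%, 52.57%).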

Example 2: Monthly Births in NC
Is there evidence that births in North Carolina are uniformly distributed throughout the year?

$$H_0: p_i = 1/12$$ for all 12 months.

p-value < .0001

Example 2: Monthly Births in NC
We have strong evidence to suggest births are not uniformly distributed across the 12 months (p < .0001). The statistical significance of this finding could potentially be due to the difference in the number of days per month.
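One way to account for unequal month lengths is to replace the 1/12 null proportions with proportions based on days per month. The sketch below is only an illustration of that idea, assuming a non-leap year; it is not an analysis performed in the lecture.

```python
# Days per month in a non-leap year (assumption: the data cover a 365-day year).
days = {
    "Jan": 31, "Feb": 28, "Mar": 31, "Apr": 30, "May": 31, "Jun": 30,
    "Jul": 31, "Aug": 31, "Sep": 30, "Oct": 31, "Nov": 30, "Dec": 31,
}
total_days = sum(days.values())

# Null proportion for each month if births are uniform over days rather than months.
null_props = {month: d / total_days for month, d in days.items()}

for month, p in null_props.items():
    print(month, round(p, 4))
```

These proportions could be used in place of 1/12 when computing expected frequencies, so that a rejection reflects something other than month length.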



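Example 3: Trimester prenatal care begins in NC
• A researcher believes that 85% of mothers begin prenatal care during the first trimester of pregnancy in NC. They also believe that only 1% of mothers begin prenatal care during the third trimester.

Because this information was not recorded for 89 mothers/infants in the study, n = 9911. Each expected frequency is therefore 9911 times the hypothesized proportion: 9911 × .85 = 8424.35 for the first trimester and 9911 × .01 = 99.11 for the third trimester.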

Example 3:Trimester prenatal care begins in NC

The chi-square statistic yields a p-value < .0001. Thus we conclude that the hypothesized distribution for the trimester in which prenatal care begins for expectant mothers in NC is not correct.

Commentary on these results
In all three examples using the NC Birth Data we rejected the hypothesized null distribution. However, the observed proportions differ very little from the hypothesized proportions on a percentage-point basis. This is a case where we almost have TOO MUCH POWER: because our sample size is large (n = 10,000), we have a good chance of detecting very small absolute differences!
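The effect of sample size can be seen by holding the observed proportion fixed and varying n. The sketch below uses 51.59% male births (the midpoint of the interval reported in Example 1) against a null of 50%; the smaller sample sizes are hypothetical and included only for comparison.

```python
import math


def chi2_two_category(p_hat, p0, n):
    """Pearson goodness-of-fit statistic for a two-category variable."""
    return n * ((p_hat - p0) ** 2 / p0 + ((1 - p_hat) - (1 - p0)) ** 2 / (1 - p0))


p_hat = 0.5159   # midpoint of the reported 95% interval for male births
results = {}
for n in (100, 1_000, 10_000):          # 100 and 1,000 are hypothetical sample sizes
    chi2 = chi2_two_category(p_hat, 0.5, n)
    p_value = math.erfc(math.sqrt(chi2 / 2))   # upper tail, valid for df = 1 only
    results[n] = (chi2, p_value)
    print(n, round(chi2, 3), round(p_value, 4))
```

The statistic grows in direct proportion to n, so the same 1.59 percentage-point difference is far from significant at n = 100 or n = 1,000 but clearly significant at n = 10,000.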

Cochran-Armitage Test for Trend

• Sometimes we are interested in determining if there is an increasing or decreasing trend in proportions for a DICHOTOMOUS NOMINAL response variable (Y) as a function of an ORDINAL predictor variable (X).

• The Cochran-Armitage Test for Trend allows us to determine if there is statistically significant evidence of this type of trend in our study data.


• We might have a low, medium, and high dose of a drug and wish to determine if the proportion of patients experiencing benefit increases with the ordinal dose variable.
• X = dose of drug (low, medium, high)
• Y = response to treatment (success or failure)
• Our research or alternative hypothesis might be

$$H_a: p_{\text{low}} < p_{\text{medium}} < p_{\text{high}}$$

where p is the proportion of subjects successfully treated


• As a second example, we might consider the 1-year survival probability of a liver cancer patient as a function of their cancer grade at the time of diagnosis.
• X = cancer or tumor grade at diagnosis (1, 2, 3, or 4)
• Y = 1-year survival (died or survived)
• Our research or alternative hypothesis might be a monotone trend in p across the four grades, where p is the proportion of subjects who survive one year after initial diagnosis.

Example: IUD Use and Infertility

In a case-control study of infertility, the length of IUD use in months was thought to be a risk factor. Length of IUD use was coded as an ordinal variable as follows:

1 = < 3 mos.
2 = 3 – 17 mos.
3 = 18 – 35 mos.
4 = 36+ mos.

It is thought that the proportion of cases will tend to increase across the IUD use levels.

Example: IUD Use and Infertility
The results obtained from the study were as follows:

Length of IUD Use   < 3 mos.   3 – 17 mos.   18 – 35 mos.   36+ mos.   Row Totals
Cases                  10          23             20            36      89 (fixed)
Controls               53         200            168           219      640 (fixed)
Column Totals          63         223            188           255      n = 729

The proportion of cases in the IUD use categories (10/63 = 15.9%, 23/223 = 10.3%, 20/188 = 10.6%, 36/255 = 14.1%) does not appear to trend as expected. We probably expect to fail to reject the null hypothesis given this lack of trend.


We will not cover the statistical details of this test, rather we will enter these data in JMP and conduct the test.


The results entered in JMP

You should convince yourself that this data table matches the contingency table shown earlier exactly. The ordinal nature of the IUD Use categories is ensured through the use of the prefixes 1 -, 2 -, 3 -, 4 -.

Example: IUD Use and Infertility
Use Fit Y by X with IUD Use (X) & Group (Y)

There is no evidence of an increasing trend in the cases as a function of the ordinal IUD use variable (p = .6099).
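For readers without JMP, the trend test can be sketched directly in Python. The lecture does not give the formula, so the version below is an assumption: it uses one common textbook form of the Cochran-Armitage statistic with equally spaced scores 1 to 4 and a two-sided normal p-value. The function name `cochran_armitage` is hypothetical.

```python
import math


def cochran_armitage(cases, totals, scores):
    """Return (z, two-sided p) for a linear trend in the proportion of cases."""
    n_all = sum(totals)
    p_bar = sum(cases) / n_all                     # overall proportion of cases
    # Score-weighted departure of observed cases from the no-trend expectation.
    t = sum(s * (r - n * p_bar) for s, r, n in zip(scores, cases, totals))
    s1 = sum(n * s for s, n in zip(scores, totals))
    s2 = sum(n * s * s for s, n in zip(scores, totals))
    variance = p_bar * (1 - p_bar) * (s2 - s1 ** 2 / n_all)
    z = t / math.sqrt(variance)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided


# Counts from the IUD contingency table in this example.
cases = [10, 23, 20, 36]
controls = [53, 200, 168, 219]
totals = [a + b for a, b in zip(cases, controls)]

z, p = cochran_armitage(cases, totals, scores=[1, 2, 3, 4])
print(round(z, 3), round(p, 4))
```

With these counts the sketch gives a p-value of about .61, in line with the p = .6099 reported from JMP; other software may use slightly different variance conventions or scores.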

Confounding and Stratification
Consider the following prospective study where researchers examined the relationship between heavy drinking and lung cancer.

Drinking Status   Lung Cancer   No Lung Cancer   Row Total
Heavy Drinker          33            1667            1700
Non-drinker            27            2273            2300
Column Total           60            3940          n = 4000

RR = 1.65
OR = 1.67
p = .0484

We have evidence to suggest heavy drinking is positively associated with lung cancer (p = .0484).
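The RR, OR, and p-value for this table can be reproduced with the standard 2×2 formulas. This is a sketch of the arithmetic, assuming the p-value comes from Pearson's chi-square test without a continuity correction; the variable names are illustrative.

```python
import math

# Rows: heavy drinker, non-drinker. Columns: lung cancer, no lung cancer.
a, b = 33, 1667
c, d = 27, 2273
n = a + b + c + d

risk_exposed = a / (a + b)            # risk of lung cancer among heavy drinkers
risk_unexposed = c / (c + d)          # risk of lung cancer among non-drinkers
relative_risk = risk_exposed / risk_unexposed
odds_ratio = (a * d) / (b * c)

# Pearson chi-square for a 2x2 table (no continuity correction), df = 1.
chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
p_value = math.erfc(math.sqrt(chi2 / 2))   # upper tail, valid for df = 1 only

print(round(relative_risk, 2), round(odds_ratio, 2), round(p_value, 4))
```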

Confounding and Stratification
However, if we separate smokers and non-smokers at baseline we obtain two tables, both of which show no evidence of an association between drinking status and lung cancer.

Non-smokers: RR = 1.0, OR = 1.0
Smokers: RR = 1.0, OR = 1.0

Smoking is a POSITIVE CONFOUNDER; that is, it is positively associated with both the risk factor (drinking status) and the response (lung cancer).

Confounding and Stratification

Positive Confounders are either:
• positively associated with both the risk factor and the disease
• negatively associated with both the risk factor and the disease

After adjusting for (stratifying on) a positive confounder, the RR and OR will be lower than the unadjusted RR and OR.

Confounding and Stratification

Negative Confounders are either:
• positively associated with the risk factor and negatively associated with the disease
• negatively associated with the risk factor and positively associated with the disease

After adjusting for (stratifying on) a negative confounder, the RR and OR will be higher than the unadjusted RR and OR.

Cochran Mantel Haenszel Test

The Cochran-Mantel-Haenszel (CMH) test allows us to determine if a significant relationship between two factors exists after adjusting for a third factor.

For example, in the previous study the apparent relationship between drinking and lung cancer disappears when we take smoking at baseline into account.


The unadjusted relationship between drinking status and lung cancer is significant (p = .0484) at the α = .05 level.

The relationship between drinking status and lung cancer, adjusted for smoking at baseline, is NOT significant using the CMH test (p = 1.00).


The details of this test are very formula intensive and can be found in some of the optional references for this course.

Rather than cover the details we will look at how to easily implement this test procedure in JMP.


These data are from the prospective lung cancer study. If we cross-tabulate Drinking Status and Lung Cancer we find a significant association using either Pearson’s Chi-Square Test or Fisher’s Exact Test (p = .0484).

To conduct the CMH Test we simply select the test from the Contingency Analysis pull-down menu and then select the variable Smoking at Baseline to stratify on. The CMH test will then give a p-value for the relationship between Drinking Status and Lung Cancer Status adjusting for smoking (p = 1.000).


The p-value from the CMH test is 1.000, indicating that the apparent relationship between heavy drinking and lung cancer has completely disappeared.
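The idea behind the CMH statistic can be sketched as follows. The stratified counts are not given in the text, so the two strata below are HYPOTHETICAL: they were constructed to add up to the pooled drinking and lung cancer table and to have RR = 1.0 within each stratum, as described above. The formula is the usual CMH chi-square without a continuity correction, and the function name `cmh_test` is illustrative.

```python
import math


def cmh_test(strata):
    """strata: list of (a, b, c, d) tables; rows exposed/unexposed, cols disease/no disease.

    Returns (CMH chi-square, p-value with df = 1, Mantel-Haenszel pooled odds ratio).
    """
    diff = var = or_num = or_den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        r1, r2 = a + b, c + d            # row totals
        c1, c2 = a + c, b + d            # column totals
        diff += a - r1 * c1 / n          # observed minus expected count in cell a
        var += r1 * r2 * c1 * c2 / (n * n * (n - 1))
        or_num += a * d / n
        or_den += b * c / n
    chi2 = diff ** 2 / var
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value, or_num / or_den


# HYPOTHETICAL strata (not from the source), consistent with the pooled table.
smokers = (24, 776, 6, 194)          # heavy drinkers 24/800, non-drinkers 6/200
non_smokers = (9, 891, 21, 2079)     # heavy drinkers 9/900, non-drinkers 21/2100

chi2, p_value, or_mh = cmh_test([smokers, non_smokers])
print(chi2, p_value, or_mh)
```

Because the risk of lung cancer is identical for drinkers and non-drinkers within each hypothetical stratum, the observed and expected counts coincide, the statistic is zero, and the adjusted p-value is 1, matching the behavior reported from JMP.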

CMH Test: another example
Consider an experiment looking at the relationship between treatment and response to treatment in a migraine study. Both men and women participated in the study.
For all subjects the results are:

For all subjects we see a significant benefit from active (non-placebo) treatment (p = .0049), with patients receiving active therapy 2.16 times more likely to get better.

CMH Test: another example
Stratifying on gender we obtain:

Results for females still indicate a significant benefit from treatment (p = .0052), but results for males do not (p = .2635). We can use the CMH test to stratify/block on gender.

CMH Test: another example
Stratifying on gender and using the CMH test.

The Cochran-Mantel-Haenszel Test shows that there is still a significant benefit from treatment after adjusting for gender (p = .0040).

Simpson’s Paradox
Consider the following example where again stratifying can produce surprising results. A newspaper might report the results from the following table as “Condom Use Increases the Risk of STDs”.

Condom Use     Developed STD   No STD   Row Total
Yes                 55            40         95
No                  45            60        105
Column Total       100           100      n = 200

Fisher’s Exact test suggests that the risk of developing an STD is greater for those who use condoms (p = .0236). The RR is 1.35 with a 95% CI of (1.02, 1.79).

STD Rates
Condom users = 58% STD rate (55/95)
Non-users = 43% STD rate (45/105)
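The relative risk and its confidence interval can be checked from the table. The lecture does not say how the interval was computed; the sketch below assumes the common large-sample interval built on the log of the RR, which reproduces the reported limits.

```python
import math

# Rows: condom use yes / no. Columns: developed STD / no STD.
a, b = 55, 40
c, d = 45, 60

rate_users = a / (a + b)          # STD rate among condom users
rate_non_users = c / (c + d)      # STD rate among non-users
rr = rate_users / rate_non_users

# Large-sample standard error of log(RR); ASSUMED method, not stated in the source.
se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
ci_low = math.exp(math.log(rr) - 1.96 * se_log_rr)
ci_high = math.exp(math.log(rr) + 1.96 * se_log_rr)

print(round(rate_users, 3), round(rate_non_users, 3))
print(round(rr, 2), round(ci_low, 2), round(ci_high, 2))
```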

Simpson’s Paradox
However, if we stratify on the number of sexual partners (5 or more partners vs. fewer than 5 partners), there appears to be a slight benefit from using condoms (where there wasn’t before) in both strata. However, the benefit is not statistically significant (p > .05) in either stratified table.
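The reversal can be made concrete with a small sketch. The stratified counts are not reproduced in the text, so the two tables below are HYPOTHETICAL: they were chosen only so that they sum to the pooled table (55/40 for condom users, 45/60 for non-users) and show a slight benefit within each stratum.

```python
def risk_ratio(a, b, c, d):
    """RR for a 2x2 table: rows users/non-users, columns STD/no STD."""
    return (a / (a + b)) / (c / (c + d))


# HYPOTHETICAL strata (not from the source), consistent with the pooled table.
strata = {
    "5 or more partners": (50, 20, 30, 10),
    "fewer than 5 partners": (5, 20, 15, 50),
}

# Adding the strata cell by cell recovers the pooled table.
pooled = tuple(sum(table[i] for table in strata.values()) for i in range(4))

stratum_rr = {name: risk_ratio(*table) for name, table in strata.items()}
pooled_rr = risk_ratio(*pooled)

for name, rr in stratum_rr.items():
    print(name, round(rr, 2))
print("pooled", round(pooled_rr, 2))
```

In this illustration condom users are concentrated in the high-risk stratum, so the pooled RR exceeds 1 even though the RR is below 1 within each stratum.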

Summary of Confounding and Stratification

• We need to adjust for and be aware of potential confounding factors when analyzing the results of observational studies.

• The CMH Test provides a means of performing that adjustment to ensure an observed benefit or risk is not affected by or driven by a lurking confounding factor.

• One way to take other factors into account is by using matched pairs or dependent samples to mitigate their potential effects on the results.

• The best way to take other factors into account when studying the potential benefit or risk associated with factors is to use logistic regression, which will be part of next week’s lecture.

Measures of Association

I will include more on measures of association for categorical/ordinal variables in the lecture for next week, along with the introduction to logistic regression.
