Analysis of a Binary Outcome Variable Using the FREQ and the LOGISTIC Procedures

Arthur Li

INTRODUCTION

A common application in the health care industry: relating an outcome Y (e.g., cancer) to an exposure X (e.g., smoking), or to several exposures such as X1 (age) and X2 (gender).

Tools: PROC FREQ and PROC LOGISTIC

One starting point is to create a contingency table.

CONTINGENCY TABLE

                    BREATHING TEST
SMOKING STATUS      ABNORMAL    NORMAL
CURRENT                  131       927
NEVER                     38       741

Source: Forthofer & Lehnen (1981); Agresti (1990)
Subjects: Caucasians who work in certain industrial plants in Houston
Response (Y): breathing test
Explanatory variable (X): smoking status

STUDY DESIGN

Three types of study design in observational studies:
Cross-sectional: X and Y are collected at the same time. Prevalence Ratio = P1 / P0
Cohort: X is collected first. Relative Risk (RR) = P1 / P0
Case-control: Y is collected first. You can’t calculate RR

                  Outcome (Y)
Exposure (X)        1      0
     1              A      B
     0              C      D

$P_1 = \frac{A}{A+B}, \qquad P_0 = \frac{C}{C+D}$

ODDS RATIO

                  Outcome (Y)
Exposure (X)        1      0
     1              A      B
     0              C      D

$\text{Odds}_1 = \frac{A}{B}, \qquad \text{Odds}_0 = \frac{C}{D}$

$\text{Odds Ratio} = \frac{\text{Odds}_1}{\text{Odds}_0} = \frac{AD}{BC}$
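As a minimal sketch of these formulas, the cell counts from the breathing-test contingency table can be plugged in by hand in a DATA step. The variable names (a, b, c, d, p1, p0, ratio, oddsratio) are illustrative and do not come from the slides; whether P1 / P0 is read as a prevalence ratio or a relative risk depends on the study design.

```sas
/* Hand calculation from a 2x2 table; all variable names are illustrative. */
data _null_;
   a = 131;  b = 927;        /* exposed (current smokers): abnormal, normal   */
   c = 38;   d = 741;        /* non-exposed (never smoked): abnormal, normal  */

   p1 = a / (a + b);         /* proportion with the outcome among exposed     */
   p0 = c / (c + d);         /* proportion with the outcome among non-exposed */
   ratio = p1 / p0;          /* P1 / P0                                       */

   odds1 = a / b;
   odds0 = c / d;
   oddsratio = odds1 / odds0;   /* algebraically equal to (a*d)/(b*c)         */

   put p1= 6.4 p0= 6.4 ratio= 6.4 oddsratio= 6.4;
run;
```

The ratio and odds ratio written to the log should agree, up to rounding, with the Col1 risk (2.5383) and odds ratio (2.7557) that PROC FREQ reports for these data.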

OR measures the strength of the association between X and Y. It ranges from 0 to infinity, with 1 as the null value:
OR = 1: no association
OR > 1: the exposed group (X = 1) has higher odds
OR < 1: the non-exposed group (X = 0) has higher odds

To test the association between X and Y:
Use the chi-square statistic
Use the 95% CI for the OR and check whether it includes 1

PROC FREQ

                      BREATHING TEST (Y)
SMOKING STATUS (X)    ABNORMAL (1)    NORMAL (0)
CURRENT (1)           131 (A)         927 (B)
NEVER (0)             38 (C)          741 (D)

data breathTest;
   /* column input: test occupies columns 1-8, neversmk columns 10-16 */
   input test $ 1-8 neversmk $ 10-16 count;
datalines;
abnormal current 131
normal   current 927
abnormal never   38
normal   never   741
;

The data is entered directly from the cell counts of the table.

proc freq data=breathTest;
   weight count;
   tables neversmk*test;
run;

The FREQ Procedure
Table of neversmk by test

neversmk     test
Frequency |
Percent   |
Row Pct   |
Col Pct   | abnormal | normal   |   Total
----------+----------+----------+
current   |      131 |      927 |    1058
          |     7.13 |    50.46 |   57.59
          |    12.38 |    87.62 |
          |    77.51 |    55.58 |
----------+----------+----------+
never     |       38 |      741 |     779
          |     2.07 |    40.34 |   42.41
          |     4.88 |    95.12 |
          |    22.49 |    44.42 |
----------+----------+----------+
Total            169       1668      1837
                9.20      90.80    100.00

PROC FREQ - RELRISK

proc freq data=breathTest;
   weight count;
   tables neversmk*test / relrisk;
run;

The RELRISK option computes the RR for col1 (ABNORMAL), the RR for col2 (NORMAL), and the OR.

Estimates of the Relative Risk (Row1/Row2)

Type of Study                   Value      95% Confidence Limits
----------------------------------------------------------------
Case-Control (Odds Ratio)      2.7557       1.8962       4.0047
Cohort (Col1 Risk)             2.5383       1.7904       3.5987
Cohort (Col2 Risk)             0.9211       0.8960       0.9470

Sample Size = 1837

Odds of having an abnormal test result are about 2.8 times higher for current smokers compared to those who have never smoked (95% CI: 1.9 – 4.0).
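The confidence limits for the odds ratio can be approximated by hand. The sketch below assumes the usual large-sample (Wald) interval computed on the log scale, with standard error sqrt(1/A + 1/B + 1/C + 1/D); the variable names are illustrative and not from the slides.

```sas
/* Wald-type 95% CI for the odds ratio, computed on the log scale. */
data _null_;
   a = 131;  b = 927;  c = 38;  d = 741;

   log_or = log((a * d) / (b * c));
   se     = sqrt(1/a + 1/b + 1/c + 1/d);   /* standard error of log(OR)  */
   z      = probit(0.975);                 /* normal quantile, about 1.96 */

   or_hat = exp(log_or);
   lower  = exp(log_or - z * se);
   upper  = exp(log_or + z * se);

   put or_hat= 8.4 lower= 8.4 upper= 8.4;
run;
```

Under that assumption the limits should land close to the 1.8962 and 4.0047 shown in the RELRISK output.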

PROC FREQ - CHISQ

proc freq data=breathTest;
   weight count;
   tables neversmk*test / relrisk chisq;
run;

Statistics for Table of neversmk by test

Statistic                       DF       Value      Prob
--------------------------------------------------------
Chi-Square                       1     30.2421    <.0001
Likelihood Ratio Chi-Square      1     32.3820    <.0001
Continuity Adj. Chi-Square       1     29.3505    <.0001
Mantel-Haenszel Chi-Square       1     30.2257    <.0001
Phi Coefficient                         0.1283
Contingency Coefficient                 0.1273
Cramer's V                              0.1283
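For a 2x2 table the Pearson chi-square has a closed form, n(AD - BC)^2 divided by the product of the four marginal totals. A hedged hand check of the first row of the CHISQ output, with illustrative variable names:

```sas
/* Pearson chi-square for a 2x2 table from the cell counts. */
data _null_;
   a = 131;  b = 927;  c = 38;  d = 741;
   n = a + b + c + d;

   chisq = n * (a*d - b*c)**2
           / ((a + b) * (c + d) * (a + c) * (b + d));
   p = 1 - probchi(chisq, 1);      /* upper tail, 1 degree of freedom */

   put chisq= 8.4 p= pvalue6.4;
run;
```

The statistic should come out near the 30.2421 reported in the Chi-Square row.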

LOGISTIC REGRESSION MODEL

Use logistic regression to study the association between the “Breathing Test” & “Smoking”

For logistic regression, the MLE (not OLS) is used to estimate the parameters

Why not use a linear probability model, $p = \alpha + \beta X$?
The probability is bounded: $p \in [0, 1]$
The relationship between p and X can be nonlinear

A logistic regression predicts the probability of occurrence of an event by fitting data to a logit function:

$\text{logit}(p) = \log(\text{odds}) = \log\frac{p}{1-p} = \alpha + \beta X$

$p = \frac{\exp(\alpha + \beta X)}{1 + \exp(\alpha + \beta X)}$


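For the breathing-test data: $p = \text{prob(abnormal)}$ and $\text{logit}(p) = \alpha + \beta X$

Reference cell coding: $X = 1$ for current, $X = 0$ for never

$\text{logit}(p \mid \text{never}) = \alpha$

$\text{logit}(p \mid \text{current}) = \alpha + \beta$

$\beta = \log\frac{\text{odds}_1}{\text{odds}_0} = \log(\text{OR}_{1 \text{ vs } 0})$, so $\text{OR} = \exp(\beta)$

β: the increment in log odds for current smokers compared to those who never smoked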


proc logistic data=breathTest;
   class neversmk / param=ref;
   weight count;
   model test = neversmk;
run;

The LOGISTIC Procedure

Model Information

Data Set                      WORK.BREATHTEST
Response Variable             test
Number of Response Levels     2
Weight Variable               count
Model                         binary logit
Optimization Technique        Fisher's scoring

Number of Observations Read          4
Number of Observations Used          4
Sum of Weights Read               1837
Sum of Weights Used               1837

Response Profile

Ordered                    Total          Total
  Value    test        Frequency         Weight
      1    abnormal            2       169.0000
      2    normal              2      1668.0000

Probability modeled is test='abnormal'.

By default, PROC LOGISTIC models the probability of the response level with the lower ordered value.


To model the probability of being “normal”, use any of the following:

proc logistic data=breathTest descending;
   class neversmk / param=ref;
   weight count;
   model test = neversmk;
run;

proc logistic data=breathTest;
   class neversmk / param=ref;
   weight count;
   model test (descending) = neversmk;
run;

proc logistic data=breathTest;
   class neversmk / param=ref;
   weight count;
   model test (event="normal") = neversmk;
run;



Reference Cell Coding

Class Level Information

                         Design
Class        Value       Variables
neversmk     current         1
             never           0

Reference cell coding estimates the difference between the effect of each level and the last level
The result is easy to interpret


Effect Coding

proc logistic data=breathTest;
   class neversmk;
   weight count;
   model test = neversmk;
run;

Class Level Information

                         Design
Class        Value       Variables
neversmk     current         1
             never          -1

Effect coding estimates the difference between the effect of each level and the average effect over all levels





Class Level Information

                         Design
Class        Value       Variables
neversmk     current         1
             never           0

By default, the last ordered value of the classification variable is considered the reference level. The REF= option sets it explicitly:

proc logistic data=breathTest;
   class neversmk (ref="never") / param=ref;
   weight count;
   model test = neversmk;
run;

Model Fit Statistics

                               Intercept
                Intercept            and
Criterion            Only     Covariates
AIC              1130.417       1100.035
SC               1129.803       1098.808
-2 Log L         1128.417       1096.035

Information for model selection: these are goodness-of-fit measures that are used to compare one model to another

Testing Global Null Hypothesis: BETA=0

Test                 Chi-Square     DF     Pr > ChiSq
Likelihood Ratio        32.3820      1         <.0001
Score                   30.2421      1         <.0001
Wald                    28.2434      1         <.0001

H0: all regression coefficients = 0
Similar to the overall F statistic in linear regression
The LRT is more reliable, especially for small N



Type 3 Analysis of Effects

                        Wald
Effect       DF   Chi-Square   Pr > ChiSq
neversmk      1      28.2434       <.0001

Analysis of Maximum Likelihood Estimates

                                     Standard         Wald
Parameter            DF   Estimate      Error   Chi-Square   Pr > ChiSq
Intercept             1    -2.9704     0.1663     318.9365       <.0001
neversmk  current     1     1.0136     0.1907      28.2434       <.0001

Because the NEVERSMK variable has only 1 df, the Type 3 test and the parameter test give identical results

Current smokers have a 1.01 increase in the log odds of having an abnormal test compared to people who never smoked
OR = exp(1.0136) = 2.756
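To connect the estimates back to the contingency table, the fitted probabilities can be recovered with the inverse logit. This is a sketch using the rounded estimates printed above; the variable names are illustrative.

```sas
/* Inverse logit applied to the reported estimates. */
data _null_;
   alpha = -2.9704;     /* Intercept estimate                        */
   beta  =  1.0136;     /* estimate for neversmk = current           */

   p_never   = exp(alpha)        / (1 + exp(alpha));          /* X = 0 */
   p_current = exp(alpha + beta) / (1 + exp(alpha + beta));   /* X = 1 */
   oddsratio = exp(beta);

   put p_never= 6.4 p_current= 6.4 oddsratio= 6.3;
run;
```

The two probabilities should be close to the observed row proportions from PROC FREQ (4.88% abnormal among never smokers, 12.38% among current smokers), which is expected because a model with a single binary predictor reproduces the observed cell proportions.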



Odds Ratio Estimates

                                 Point         95% Wald
Effect                        Estimate     Confidence Limits
neversmk current vs never        2.756      1.896      4.004

Result from PROC FREQ, for comparison: Case-Control (Odds Ratio) 2.7557, 95% confidence limits 1.8962 to 4.0047 (Sample Size = 1837)


proc logistic data=breathTest;
   class neversmk (ref="never") / param=ref;
   weight count;
   model test = neversmk;
   oddsratio 'smoking' neversmk;
run;

Syntax: ODDSRATIO <'label'> variable </ options>;
The ODDSRATIO statement is new in SAS 9.2.

Wald Confidence Interval for Odds Ratios

Label                                    Estimate     95% Confidence Limits
smoking neversmk current vs never           2.756       1.896       4.004


proc logistic data=breathTest;
   class neversmk (ref="never") / param=ref;
   weight count;
   model test = neversmk;
   oddsratio 'smoking' neversmk / cl=pl;
run;

Profile Likelihood Confidence Interval for Odds Ratios

Label                                    Estimate     95% Confidence Limits
smoking neversmk current vs never           2.756       1.916       4.054

The Wald CI is based on a normal approximation
The PL CI is based on the value of the log-likelihood
The PL CI is generally preferred for small sample sizes

CONFOUNDING

[Figure: diagram linking Smoking, Test, and Age]

Omitting Age can lead to over- or under-estimation of the relationship between Smoking and Test

[Figure: Log (odds) by Age group (< 40, ≥ 40) for smokers and non-smokers]

By adjusting for age, you compare smokers and non-smokers at common values of age

INTERACTION

Interaction: the relationship between “Smoking” and “Test” differs depending upon the level of Age

[Figure: Log (odds) by Age group (< 40, ≥ 40) for smokers and non-smokers]

Age is referred to as an effect modifier

INTERACTION & CONFOUNDING

PROC FREQ: analyzes the association of interest when there is only one confounder or one effect modifier

If you want to control for multiple confounders or include multiple effect modifiers in your model, you need to use PROC LOGISTIC

THE PURPOSES AND STRATEGIES FOR MODEL BUILDING

The methods of fitting a regression model differ depending upon your research purpose

Two purposes:
Investigating the essential association between an outcome variable and a set of explanatory variables (epidemiologic field)
Predicting the outcome variable by using a set of explanatory variables

Situations for building a prediction model:
statistical decision making
generating (not testing) hypotheses for a future study

A prediction model needs to be validated in an independent sample to evaluate its usefulness

For building a prediction model, one only needs to consider the interaction effect

Techniques for building a prediction model: forward, backward, and stepwise selection, etc.

The focus of this talk is not on building a prediction model but rather estimating the relationship between a main explanatory variable and an outcome variable


For estimating association, interaction and confounding issues must be considered

Which should be evaluated first? Confounding effect or interaction effect?


Is the association between “Smoking” & “Test” different in the 2 age groups?
   Yes: There is an interaction. Report the age-specific OR
   No: No interaction. Is “Age” a confounder?
      Yes: Report the age-adjusted OR
      No: Report the crude OR


Effect Modification (interaction) can be detected via statistical testing

Confounding effect cannot be tested statistically

Outcome    Main Var    Covariate    OR     P        Include?
Y          X                        2.3    <0.05
Y          X           Z            4.2    0.2      YES
Y          X           Z            2.4    0.01     MAYBE

PROC FREQ: INTERACTION EFFECT

data breathTestAge;
   /* column input: test in columns 1-8, neversmk in 10-16, over40 in 18-20 */
   input test $ 1-8 neversmk $ 10-16 over40 $ 18-20 count;
datalines;
normal   never   no  577
abnormal never   no  34
normal   current no  682
abnormal current no  57
normal   never   yes 164
abnormal never   yes 4
normal   current yes 245
abnormal current yes 74
;


proc freq data=breathTestAge;
   weight count;
   tables over40*neversmk*test / chisq relrisk cmh;
run;

The CMH option produces:
Cochran-Mantel-Haenszel statistics (a test for association between the row and column variables after adjusting for the 3rd variable)
The adjusted Mantel-Haenszel and logit estimates of the odds ratio and relative risks
The Breslow-Day test for homogeneity of odds ratios



Breslow-Day Test for
Homogeneity of the Odds Ratios
------------------------------
Chi-Square        18.0829
DF                      1
Pr > ChiSq         <.0001

Total Sample Size = 1837

The association between smoking status and the breathing test is not the same across the age groups



Statistics for Table 1 of neversmk by test
Controlling for over40=no

Statistic                       DF       Value      Prob
--------------------------------------------------------
Chi-Square                       1      2.4559    0.1171
Likelihood Ratio Chi-Square      1      2.4893    0.1146
Continuity Adj. Chi-Square       1      2.1260    0.1448
Mantel-Haenszel Chi-Square       1      2.4541    0.1172
Phi Coefficient                         0.0427
Contingency Coefficient                 0.0426
Cramer's V                              0.0427

Estimates of the Relative Risk (Row1/Row2)

Type of Study                   Value      95% Confidence Limits
----------------------------------------------------------------
Case-Control (Odds Ratio)      1.4184       0.9144       2.2000
Cohort (Col1 Risk)             1.3861       0.9190       2.0906
Cohort (Col2 Risk)             0.9772       0.9499       1.0054

Sample Size = 1350

Statistics for Table 2 of neversmk by test
Controlling for over40=yes

Statistic                       DF       Value      Prob
--------------------------------------------------------
Chi-Square                       1     35.4510    <.0001
Likelihood Ratio Chi-Square      1     45.1246    <.0001
Continuity Adj. Chi-Square       1     33.9203    <.0001
Mantel-Haenszel Chi-Square       1     35.3782    <.0001
Phi Coefficient                         0.2698
Contingency Coefficient                 0.2605
Cramer's V                              0.2698





Summary Statistics for neversmk by test
Controlling for over40

Cochran-Mantel-Haenszel Statistics (Based on Table Scores)

Statistic    Alternative Hypothesis      DF       Value      Prob
-----------------------------------------------------------------
    1        Nonzero Correlation          1     25.2444    <.0001
    2        Row Mean Scores Differ       1     25.2444    <.0001
    3        General Association          1     25.2444    <.0001

Estimates of the Common Relative Risk (Row1/Row2)

Type of Study     Method                Value      95% Confidence Limits
------------------------------------------------------------------------
Case-Control      Mantel-Haenszel      2.5683       1.7618       3.7441
  (Odds Ratio)    Logit                1.9840       1.3252       2.9702

Cohort            Mantel-Haenszel      2.4174       1.6754       3.4879
  (Col1 Risk)     Logit                1.8475       1.2641       2.7001

These statistics and the adjusted OR are only useful if the OR is homogeneous across the categories of the adjusting variable

PROC LOGISTIC: INTERACTION EFFECT

$X_{smoke} = 1$ for current, $0$ for never; $X_{age40} = 1$ for yes, $0$ for no

$\text{logit}(p) = \alpha + \beta_{smoke} X_{smoke} + \beta_{age40} X_{age40} + \beta_{int} X_{smoke} X_{age40}$

proc logistic data=breathTestAge;
   class neversmk (ref="never") over40 (ref="no") / param=ref;
   weight count;
   model test = neversmk over40 neversmk*over40;
run;



Analysis of Maximum Likelihood Estimates

                                                Standard         Wald
Parameter                       DF   Estimate      Error   Chi-Square   Pr > ChiSq
Intercept                        1    -2.8315     0.1765     257.4193       <.0001
neversmk         current         1     0.3495     0.2240       2.4355       0.1186
over40           yes             1    -0.8820     0.5359       2.7086       0.0998
neversmk*over40  current yes     1     2.1668     0.5691      14.4985       0.0001

Wald Test: the neversmk*over40 row tests the interaction term (chi-square = 14.4985, p = 0.0001)



Likelihood Ratio Test:

$H_1: \text{logit}(p) = \alpha + \beta_{smoke} X_{smoke} + \beta_{age40} X_{age40} + \beta_{int} X_{smoke} X_{age40}$

$H_0: \text{logit}(p) = \alpha + \beta_{smoke} X_{smoke} + \beta_{age40} X_{age40}$

$LR = 2\,[\log L(\text{Full model}) - \log L(\text{Reduced model})] = [-2 \log L(\text{Reduced model})] - [-2 \log L(\text{Full model})]$

$LR \sim \chi^2_{df}, \qquad df = (\#\text{ terms in Full model}) - (\#\text{ terms in Reduced model})$





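Full model (with the neversmk*over40 interaction):

                               Intercept
                Intercept            and
Criterion            Only     Covariates
AIC              1130.417       1055.467
SC               1130.497       1055.785
-2 Log L         1128.417       1047.467

Reduced model (without the interaction):

proc logistic data=breathTestAge;
   class neversmk (ref="never") over40 (ref="no") / param=ref;
   weight count;
   model test = neversmk over40;
run;

                               Intercept
                Intercept            and
Criterion            Only     Covariates
AIC              1130.417       1074.123
SC               1130.497       1074.361
-2 Log L         1128.417       1068.123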





proc logistic data=breathTestAge;
   class neversmk (ref="never") over40 (ref="no") / param=ref;
   weight count;
   model test = neversmk over40 neversmk*over40;
   /* save the fit statistics and global tests of the full model */
   ods output FitStatistics = log2Ratio_full
              GlobalTests   = df_full;
run;

/* store -2 Log L of the full model in a macro variable */
data _null_;
   set log2Ratio_full;
   if Criterion = '-2 Log L';
   call symput('neg2L_full', InterceptAndCovariates);
run;

/* store the DF of the full model's likelihood ratio test */
data _null_;
   set df_full;
   if Test = 'Likelihood Ratio';
   call symput('df_full', DF);
run;


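proc logistic data=breathTestAge;
   class neversmk (ref="never") over40 (ref="no") / param=ref;
   weight count;
   model test = neversmk over40;
   /* save the fit statistics and global tests of the reduced model */
   ods output FitStatistics = log2Ratio_reduce
              GlobalTests   = df_reduce;
run;

/* store -2 Log L of the reduced model in a macro variable */
data _null_;
   set log2Ratio_reduce;
   if Criterion = '-2 Log L';
   call symput('neg2L_reduce', InterceptAndCovariates);
run;

/* store the DF of the reduced model's likelihood ratio test */
data _null_;
   set df_reduce;
   if Test = 'Likelihood Ratio';
   call symput('df_reduce', DF);
run;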


data result;
   LR = &neg2L_reduce - &neg2L_full;
   df = &df_full - &df_reduce;
   p = 1 - probchi(LR, df);
   label LR = 'Likelihood Ratio';
run;

proc print data=result label noobs;
   title "Likelihood ratio test";
run;

Likelihood ratio test

Likelihood
  Ratio       df         p
 20.6558       1    .000005497


proc logistic data=breathTestAge;
   class neversmk (ref="never") over40 (ref="no") / param=ref;
   weight count;
   model test = neversmk over40 neversmk*over40;
   oddsratio neversmk / at (over40='no');
   oddsratio neversmk / at (over40='yes');
run;

Wald Confidence Interval for Odds Ratios

Label                                          Estimate     95% Confidence Limits
neversmk current vs never at over40=no            1.418       0.914       2.200
neversmk current vs never at over40=yes          12.383       4.441      34.525
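The age-specific odds ratios follow directly from the interaction model: with reference coding, the log odds ratio for smoking is the smoking coefficient when over40 = no, and the smoking coefficient plus the interaction coefficient when over40 = yes. A minimal sketch using the rounded estimates from the interaction model (variable names are illustrative):

```sas
/* Age-specific odds ratios from the interaction model's coefficients. */
data _null_;
   b_smoke = 0.3495;    /* neversmk current                    */
   b_int   = 2.1668;    /* neversmk*over40 current yes         */

   or_age_no  = exp(b_smoke);           /* OR for smoking when over40 = no  */
   or_age_yes = exp(b_smoke + b_int);   /* OR for smoking when over40 = yes */

   put or_age_no= 8.3 or_age_yes= 8.3;
run;
```

Because the inputs are rounded to four decimals, the results should match the ODDSRATIO estimates (1.418 and 12.383) only up to rounding.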

NURSE HEALTH STUDY

NHS: nurses aged 30 to 55 who were enrolled in 1976

Part of the study investigated the association between oral contraceptive (OC) use and breast cancer (BC)

data nurse_study;
   input bc age oc count;
datalines;
1 0 1 71
0 0 1 28418
1 0 0 35
0 0 0 12267
1 1 1 143
0 1 1 20661
1 1 0 321
0 1 0 44424
;

                           BREAST CANCER
             AGE 30 – 39 (0)            AGE 40 – 55 (1)
OC USE       CASE (1)   CONTROL (0)     CASE (1)   CONTROL (0)
YES (1)      71         28418           143        20661
NO (0)       35         12267           321        44424

proc freq data=nurse_study order=data;
   weight count;
   tables age*oc*bc / chisq relrisk cmh;
run;

Breslow-Day Test for
Homogeneity of the Odds Ratios
------------------------------
Chi-Square         0.1521
DF                      1
Pr > ChiSq         0.6966

There is no interaction
Check for confounding

Summary Statistics for oc by bc
Controlling for age

Cochran-Mantel-Haenszel Statistics (Based on Table Scores)

Statistic    Alternative Hypothesis      DF       Value      Prob
-----------------------------------------------------------------
    1        Nonzero Correlation          1      0.4361    0.5090
    2        Row Mean Scores Differ       1      0.4361    0.5090
    3        General Association          1      0.4361    0.5090

Estimates of the Common Relative Risk (Row1/Row2)

Type of Study     Method                Value      95% Confidence Limits
------------------------------------------------------------------------
Case-Control      Mantel-Haenszel      0.9419       0.7882       1.1256
  (Odds Ratio)    Logit                0.9415       0.7882       1.1246



proc freq data=nurse_study order=data;
   weight count;
   tables oc*bc / chisq relrisk;
run;

Statistics for Table of oc by bc

Statistic                       DF       Value      Prob
--------------------------------------------------------
Chi-Square                       1     17.8881    <.0001
Likelihood Ratio Chi-Square      1     18.1401    <.0001
Continuity Adj. Chi-Square       1     17.5337    <.0001
Mantel-Haenszel Chi-Square       1     17.8879    <.0001
Phi Coefficient                        -0.0130
Contingency Coefficient                 0.0130
Cramer's V                             -0.0130

Estimates of the Relative Risk (Row1/Row2)

Type of Study                   Value      95% Confidence Limits
----------------------------------------------------------------
Case-Control (Odds Ratio)      0.6944       0.5858       0.8230
Cohort (Col1 Risk)             0.6957       0.5874       0.8239
Cohort (Col2 Risk)             1.0019       1.0010       1.0028

Unadjusted OR = 0.69, adjusted OR = 0.94: Age is a confounder

In this situation, the age-adjusted statistics and the age-adjusted odds ratio should be reported

After adjusting for age, there is no significant association between using OC and having BC (p = 0.51; age-adjusted OR = 0.94, 95% CI = 0.79 – 1.13)
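The gap between the crude and the age-adjusted odds ratio can be reproduced by hand from the cell counts. The sketch below uses the standard Mantel-Haenszel estimator, the sum over strata of A*D/n divided by the sum of B*C/n; the data set name strata and all variable names are illustrative, and the counts are the ones entered in the nurse_study DATA step.

```sas
/* One row per age stratum: a, b = OC users (cases, controls);
   c, d = non-users (cases, controls). Names are illustrative. */
data strata;
   input age a b c d;
   n = a + b + c + d;
datalines;
0 71 28418 35 12267
1 143 20661 321 44424
;
run;

data _null_;
   set strata end=last;
   mh_num + a * d / n;      /* sum statements keep running totals */
   mh_den + b * c / n;
   ta + a;  tb + b;  tc + c;  td + d;   /* pooled (collapsed) table */
   if last then do;
      crude_or = (ta * td) / (tb * tc);
      mh_or    = mh_num / mh_den;
      put crude_or= 8.4 mh_or= 8.4;
   end;
run;
```

The two values should agree, up to rounding, with the crude odds ratio (0.6944) and the Mantel-Haenszel odds ratio (0.9419) in the PROC FREQ output.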

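$\text{logit}(p) = \alpha + \beta_{OC} X_{OC} + \beta_{age} X_{age}$

proc logistic data=nurse_study descending;
   weight count;
   model bc = oc age;
run;

Analysis of Maximum Likelihood Estimates

                                 Standard         Wald
Parameter     DF     Estimate       Error   Chi-Square   Pr > ChiSq
Intercept      1      -5.9083      0.1156    2612.5788       <.0001
oc             1      -0.0602      0.0911       0.4360       0.5090
age            1       0.9835      0.1133      75.3707       <.0001

Odds Ratio Estimates

              Point         95% Wald
Effect     Estimate     Confidence Limits
oc            0.942      0.788      1.126
age           2.674      2.141      3.338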

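proc logistic data=nurse_study descending;
   weight count;
   model bc = oc;
run;

Analysis of Maximum Likelihood Estimates

                                 Standard         Wald
Parameter     DF     Estimate       Error   Chi-Square   Pr > ChiSq
Intercept      1      -5.0704      0.0532    9095.8096       <.0001
oc             1      -0.3646      0.0867      17.6834       <.0001

Odds Ratio Estimates

              Point         95% Wald
Effect     Estimate     Confidence Limits
oc            0.694      0.586      0.823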

CONCLUSION

Analyzing variables with dichotomized outcomes by using the FREQ and LOGISTIC procedures is a common task for statisticians in the health care industry

Simply knowing how to use the procedures is not sufficient

Understanding the goal of model building and following correct model-building steps are extremely important in order to obtain accurate and unbiased results
