
RESEARCH NOTE

What Are Degrees of Freedom?
Shanta Pandey and Charlotte Lyn Bright

As we were teaching a multivariate statistics course for doctoral students, one of the students in the class asked, "What are degrees of freedom? I know it is not good to lose degrees of freedom, but what are they?" Other students in the class waited for a clear-cut response. As we tried to give a textbook answer, we were not satisfied and we did not get the sense that our students understood. We looked through our statistics books to determine whether we could find a clearer way to explain this term to social work students. The wide variety of language used to define degrees of freedom is enough to confuse any social worker! Definitions range from the broad, "Degrees of freedom are the number of values in a distribution that are free to vary for any particular statistic" (Healey, 1990, p. 214), to the technical:

Statisticians start with the number of terms in the sum [of squares], then subtract the number of mean values that were calculated along the way. The result is called the degrees of freedom, for reasons that reside, believe it or not, in the theory of thermodynamics. (Norman & Streiner, 2003, p. 43)

Authors who have tried to be more specific have defined degrees of freedom in relation to sample size (Trochim, 2005; Weinbach & Grinnell, 2004), cell size (Salkind, 2004), the number of relationships in the data (Walker, 1940), and the difference in dimensionalities of the parameter spaces (Good, 1973). The most common definition includes the number of pieces of information that are free to vary (Healey, 1990; Jaccard & Becker, 1990; Pagano, 2004; Warner, 2008; Wonnacott & Wonnacott, 1990). These specifications do not seem to augment students' understanding of this term. Hence, degrees of freedom are conceptually difficult but are important to report in order to understand statistical analysis. For example, without degrees of freedom, we are unable to calculate or to understand any underlying population variability. Also, in bivariate and multivariate analyses, degrees of freedom are a function of sample size, number of variables, and number of parameters to be estimated; therefore, degrees of freedom are also associated with statistical power. This research note is intended to comprehensively define degrees of freedom, to explain how they are calculated, and to give examples of the different types of degrees of freedom in some commonly used analyses.

DEGREES OF FREEDOM DEFINED
In any statistical analysis the goal is to understand how the variables (or parameters to be estimated) and observations are linked. Hence, degrees of freedom are a function of both sample size (N) (Trochim, 2005) and the number of independent variables (k) in one's model (Toothaker & Miller, 1996; Walker, 1940; Yu, 1997). The degrees of freedom are equal to the number of independent observations (N), or the number of subjects in the data, minus the number of parameters (k) estimated (Toothaker & Miller, 1996; Walker, 1940). A parameter (for example, a slope) to be estimated is related to the value of an independent variable and included in a statistical equation (an additional parameter is estimated for an intercept in a general linear model). A researcher may estimate parameters using different amounts or pieces of information, and the number of independent pieces of information he or she uses to estimate a statistic or a parameter is called the degrees of freedom (df) (HyperStat Online, n.d.). For example, a researcher records the income of N individuals from a community. Here he or she has N independent pieces of information (that is, N points of income) and one variable called income (k); in subsequent analysis of this data set, degrees of freedom are associated with both N and k. For instance, if this researcher wants to calculate the sample variance to understand the extent to which incomes vary in this community, the degrees of freedom equal N - k.



The relationship between sample size and degrees of freedom is positive: as sample size increases, so do the degrees of freedom. On the other hand, the relationship between the degrees of freedom and the number of parameters to be estimated is negative. In other words, the degrees of freedom decrease as the number of parameters to be estimated increases. That is why some statisticians define degrees of freedom as the number of independent values that are left after the researcher has applied all the restrictions (Rosenthal, 2001; Runyon & Haber, 1991); therefore, degrees of freedom vary from one statistical test to another (Salkind, 2004). For the purpose of clarification, let us look at some examples.

A Single Observation with One Parameter to Be Estimated
If a researcher has measured income (k = 1) for one observation (N = 1) from a community, the mean sample income is the same as the value of this observation. With this value, the researcher has some idea of the mean income of this community but does not know anything about the population spread or variability (Wonnacott & Wonnacott, 1990). Also, the researcher has only one independent observation (income) with a parameter that he or she needs to estimate. The degrees of freedom here are equal to N - k. Thus, there is no degree of freedom in this example (1 - 1 = 0). In other words, the data point has no freedom to vary, and the analysis is limited to the presentation of the value of this data point (Wonnacott & Wonnacott, 1990; Yu, 1997). For us to understand data variability, N must be larger than 1.

Multiple Observations (N) with One Parameter to Be Estimated
Suppose there are N observations for income. To examine the variability in income, we need to estimate only one parameter (that is, the sample variance) for income (k), leaving the degrees of freedom of N - k. Because we know that we have only one parameter to estimate, we may say that we have a total of N - 1 degrees of freedom. Therefore, all univariate sample characteristics that are computed with the sum of squares, including the standard deviation and variance, have N - 1 degrees of freedom (Warner, 2008).
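To make this bookkeeping concrete, here is a minimal Python sketch (the income values are hypothetical, not taken from the article's data) showing that the sample variance spends one degree of freedom on the estimated mean and is therefore computed over N - 1 independent deviations:

```python
import numpy as np

# Hypothetical incomes for N = 5 community members (illustrative only)
incomes = np.array([28_000, 35_000, 41_000, 52_000, 60_000], dtype=float)

N = incomes.size          # number of independent observations
k = 1                     # one parameter (the mean) is estimated along the way
df = N - k                # degrees of freedom left for the sample variance

# ddof=1 tells NumPy to divide the sum of squares by N - 1 rather than N
sample_variance = np.var(incomes, ddof=1)
manual_variance = np.sum((incomes - incomes.mean()) ** 2) / df

print(df, sample_variance, manual_variance)  # the two variance estimates agree
```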

Degrees of freedom vary from one statistical test to another as we move from univariate to bivariate and multivariate statistical analysis, depending on the nature of restrictions applied, even when sample size remains unchanged. In the examples that follow, we explain how degrees of freedom are calculated in some of the commonly used bivariate and multivariate analyses.

Two Samples with One Parameter (or t Test)
Suppose that the researcher has two samples, men and women, or n1 + n2 observations. Here, one can use an independent samples t test to analyze whether the mean incomes of these two groups are different. In the comparison of income variability between these two independent means (or k number of means), the researcher will have n1 + n2 - 2 degrees of freedom. The total degrees of freedom are the sum of the number of cases in group 1 and group 2 minus the number of groups. As a case in point, see the SAS and SPSS outputs of a t test comparing the literacy rate (LITERACY, dependent variable) of poor and rich countries (GNPSPLIT, independent variable) in Table 1. All in all, the SAS output has four different values of degrees of freedom (two of which are also given by SPSS). We review each of them in the following paragraphs.

The first value for degrees of freedom under t tests is 100 (reported by both SAS and SPSS). The two groups of countries (rich and poor) are assumed to have equal variances in their literacy rate, the dependent variable. This first value of degrees of freedom is calculated as n1 + n2 - 2 (the sum of the sample size of each group compared in the t test minus the number of groups being compared), that is, 64 + 38 - 2 = 100.
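As a rough sketch of the same calculation, the code below (with simulated literacy scores rather than the article's World Data) shows the n1 + n2 - 2 bookkeeping alongside scipy's equal-variances t test:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
poor = rng.normal(loc=47, scale=26, size=64)  # simulated group 1 (n1 = 64)
rich = rng.normal(loc=89, scale=18, size=38)  # simulated group 2 (n2 = 38)

n1, n2 = poor.size, rich.size
df_equal_var = n1 + n2 - 2                    # 64 + 38 - 2 = 100

# Equal-variances (pooled) t test; its reference distribution has df_equal_var df
t_stat, p_value = stats.ttest_ind(poor, rich, equal_var=True)
print(df_equal_var, t_stat, p_value)
```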

For the test of equality of variances, both SAS and SPSS use an F test. SAS uses two different values of degrees of freedom and reports a folded F statistic. The numerator degrees of freedom are calculated as n1 - 1, that is, 64 - 1 = 63. The denominator degrees of freedom are calculated as n2 - 1, or 38 - 1 = 37. These degrees of freedom are used in testing the assumption that the variances in the two groups (rich and poor countries, in our example) are not significantly different. These two values are included in the calculations computed within the statistical program and are reported on the SAS output as shown in Table 1. SPSS, however, computes Levene's weighted F statistic (see Table 1) and uses k - 1 and N - k degrees of freedom, where k stands for the number of groups being compared and N stands for the total number of observations in the sample.



Table 1: Independent Samples t Test of Literacy Rate (LITERACY) by GNPSPLIT

Group statistics: GNPSPLIT = 0 (poor), n = 64, M ≈ 46.56, SD ≈ 25.65; GNPSPLIT = 1 (rich), n = 38, M ≈ 88.97, SD ≈ 18.07.
Levene's test for equality of variances: F = 14.266, p < .001.
Degrees of freedom for the t test: equal variances assumed, df = 100; equal variances not assumed, df ≈ 96.97 (SAS reports 97; SPSS reports 96.967).
[Only these values of the original SAS and SPSS output are legible in this copy; the t statistics and remaining entries are not reproduced.]


Therefore, the degrees of freedom associated with Levene's F statistic are the same (that is, k - 1 = 2 - 1 = 1 and N - k = 102 - 2 = 100) as the degrees of freedom associated with the equal-variances t test discussed earlier, and SPSS does not report them separately.

If the assumption of equal variance is violated and the two groups have different variances, as is the case in this example, where the folded F test or Levene's weighted F statistic is significant, indicating that the two groups have significantly different variances, the value for degrees of freedom (100) is no longer accurate. Therefore, we need to estimate the correct degrees of freedom (SAS Institute, 1985; also see Satterthwaite, 1946, for the computations involved in this estimation).

We can estimate the degrees of freedom according to Satterthwaite's (1946) method by using the following formula:

$$df_{\text{Satterthwaite}} = \frac{(n_1 - 1)(n_2 - 1)}{(n_1 - 1)\left(1 - \dfrac{S_1^2\, n_2}{S_1^2\, n_2 + S_2^2\, n_1}\right)^2 + (n_2 - 1)\left(\dfrac{S_1^2\, n_2}{S_1^2\, n_2 + S_2^2\, n_1}\right)^2}$$

where n_1 = sample size of group 1, n_2 = sample size of group 2, and S_1 and S_2 are the standard deviations of groups 1 and 2, respectively. By inserting subgroup data from Table 1, we arrive at the more accurate degrees of freedom as follows:

$$df_{\text{Satterthwaite}} = \frac{(64 - 1)(38 - 1)}{(64 - 1)\left(1 - \dfrac{25.65^2 \times 38}{25.65^2 \times 38 + 18.07^2 \times 64}\right)^2 + (38 - 1)\left(\dfrac{25.65^2 \times 38}{25.65^2 \times 38 + 18.07^2 \times 64}\right)^2}$$

$$= \frac{2331}{63\left(1 - \dfrac{25000.96}{45898.55}\right)^2 + 37\left(\dfrac{25000.96}{45898.55}\right)^2} = \frac{2331}{63(1 - .5447)^2 + 37(.5447)^2} = \frac{2331}{63 \times .2073 + 37 \times .2967} = \frac{2331}{24.038} \approx 96.97$$

Because the assumption of equality of variances is violated in the previous analysis, the Satterthwaite value for degrees of freedom, 96.97 (SAS rounds it to 97), is accurate, and our earlier value, 100, is not. Fortunately, it is no longer necessary to hand calculate this, as major statistical packages such as SAS and SPSS provide the correct value for degrees of freedom when the assumption of equal variance is violated and equal variances are not assumed. This is the fourth value for degrees of freedom in our example, which appears in Table 1 as 97 in SAS and 96.967 in SPSS. Again, this value is the correct number to report, as the assumption of equal variances is violated in our example.
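The hand calculation above can be checked with a short Python function; the group sizes and standard deviations are the ones quoted in the text (64 and 38; 25.65 and 18.07), and the function name is ours, not part of any package:

```python
def satterthwaite_df(n1, s1, n2, s2):
    """Approximate t-test degrees of freedom when equal variances are not assumed."""
    # Share of the combined sampling variance contributed by group 1
    c = (s1**2 * n2) / (s1**2 * n2 + s2**2 * n1)
    return ((n1 - 1) * (n2 - 1)) / ((n1 - 1) * (1 - c) ** 2 + (n2 - 1) * c ** 2)

# Values from Table 1 as quoted in the text; prints roughly 96.97
print(satterthwaite_df(n1=64, s1=25.65, n2=38, s2=18.07))
```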

Comparing the Means of g Groups with One Parameter (Analysis of Variance)
What if we have more than two groups to compare? Let us assume that we have n1 + . . . + ng observations, or countries, grouped by political freedom (FREEDOMX) and that we are interested in differences in their literacy rates (LITERACY, the dependent variable). We can test the variability of the g means by using analysis of variance (ANOVA). The ANOVA procedure produces three different types of degrees of freedom, calculated as follows:

• The first type of degrees of freedom is called the between-groups degrees of freedom or model degrees of freedom and can be determined by using the number of group means we want to compare. The ANOVA procedure tests the assumption that the g groups have equal means and that the population mean is not statistically different from the individual group means. This assumption reflects the null hypothesis, which is that there is no statistically significant difference between literacy rates in the g groups of countries (μ1 = μ2 = . . . = μg). The alternative hypothesis is that the g sample means are significantly different from one another. There are g - 1 model degrees of freedom for testing the null hypothesis and for assessing variability among the g means. This value of model degrees of freedom is used in the numerator for calculating the F ratio in ANOVA.

• The second type of degrees of freedom, called the within-groups degrees of freedom or error degrees of freedom, is derived by subtracting the model degrees of freedom from the corrected total degrees of freedom.



The within-groups degrees of freedom equal the total number of observations minus the number of groups to be compared, n1 + . . . + ng - g. This value also accounts for the denominator degrees of freedom for calculating the F statistic in an ANOVA.

• Calculating the third type of degrees of freedom is straightforward. We know that the sum of deviations from the mean, Σ(Y - Ȳ), equals 0. We also know that the total sum of squares, Σ(Y - Ȳ)², is nothing but the sum of the N squared deviations from the mean. Therefore, to estimate the total sum of squares Σ(Y - Ȳ)², we need to know only N - 1 of the deviations from the mean. Therefore, with the total sample size we can obtain the total degrees of freedom, or corrected total degrees of freedom, by using the formula N - 1.

In Table 2, we show the SAS and SPSS output with these three different values of degrees of freedom using the ANOVA procedure. The dependent variable, literacy rate, is continuous, and the independent variable, political freedom or FREEDOMX, is nominal. Countries are classified into three groups on the basis of the amount of political freedom each country enjoys: Countries that enjoy high political freedom are coded as 1 (n = 32), countries that enjoy moderate political freedom are coded as 2 (n = 34), and countries that enjoy no political freedom are coded as 3 (n = 36). The mean literacy rates (dependent variable) of these groups of countries are examined. The null hypothesis tests the assumption that there is no significant difference in the literacy rates of these countries according to their level of political freedom.

The first of the three degrees of freedom, the between-groups degrees of freedom, equals g - 1. Because there are three groups of countries in this analysis, we have 3 - 1 = 2 degrees of freedom. This accounts for the numerator degrees of freedom in estimating the F statistic. Second, the within-groups degrees of freedom, which account for the denominator degrees of freedom for calculating the F statistic in ANOVA, equal n1 + . . . + ng - g. These degrees of freedom are calculated as 32 + 34 + 36 - 3 = 99. Finally, the third degrees of freedom, the total degrees of freedom, are calculated as N - 1 (102 - 1 = 101). When reporting F values and their respective degrees of freedom, researchers should report them as follows: The independent and the dependent variables are significantly related [F(2, 99) = 16.64, p < .0001].
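A minimal sketch of this bookkeeping in Python (with simulated literacy rates rather than the article's World Data, using the example's group sizes of 32, 34, and 36):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
groups = [rng.normal(loc=m, scale=25, size=n)          # simulated literacy rates
          for m, n in [(90, 32), (70, 34), (55, 36)]]  # high, moderate, no freedom

g = len(groups)                  # number of groups
N = sum(len(x) for x in groups)  # total number of observations

df_between = g - 1               # model (numerator) df: 3 - 1 = 2
df_within = N - g                # error (denominator) df: 102 - 3 = 99
df_total = N - 1                 # corrected total df: 102 - 1 = 101

f_stat, p_value = stats.f_oneway(*groups)
print(df_between, df_within, df_total, f_stat, p_value)
```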

Degrees of Freedom in Multiple Regression Analysis
We skip to multiple regression because degrees of freedom are the same in ANOVA and in simple regression. In multiple regression analysis, there is more than one independent variable and one dependent variable. Here, a parameter stands for the relationship between a dependent variable (Y) and each independent variable (X). One must understand four different types of degrees of freedom in multiple regression.

• The first type is the model (regression) degrees of freedom. Model degrees of freedom are associated with the number of independent variables in the model and can be understood as follows:

A null model, or a model without independent variables, will have zero parameters to be estimated. Therefore, the predicted Y is equal to the mean of Y, and the degrees of freedom equal 0.

A model with one independent variable has one predictor, or one piece of useful information (k = 1), for estimation of variability in Y. This model must also estimate the point where the regression line originates, or an intercept. Hence, in a model with one predictor, there are (k + 1) parameters (k regression coefficients plus an intercept) to be estimated, with k signifying the number of predictors. Therefore, there are [(k + 1) - 1], or k, degrees of freedom for testing this regression model.

Accordingly, a multiple regression model with more than one independent variable has some more useful information for estimating the variability in the dependent variable, and the model degrees of freedom increase as the number of independent variables increases. The null hypothesis is that all of the predictors have the same regression coefficient of zero, and thus there is only one common coefficient to be estimated (Dallal, 2003). The alternative hypothesis is that the regression coefficients are not zero and that each variable explains a different amount of variance in the dependent variable. Thus, the researcher must estimate k coefficients plus the intercept.



Table 2: ANOVA of Literacy Rate (LITERACY) by Political Freedom (FREEDOMX)

Source            Sum of Squares    df    Mean Square        F        p
Between groups        24253.483      2      12126.741     16.638   <.001
Within groups         72156.096     99        728.849
Total                 96409.578    101

[The descriptive statistics and post hoc subsets shown in the original SAS and SPSS output are not legible in this copy and are not reproduced.]


Therefore, there are (k + 1) - 1, or k, degrees of freedom for testing the null hypothesis (Dallal, 2003). In other words, the model degrees of freedom equal the number of useful pieces of information available for estimation of variability in the dependent variable.

• The second type is the residual, or error, degrees of freedom. Residual degrees of freedom in multiple regression involve information on both sample size and predictor variables. In addition, we also need to account for the intercept. For example, if our sample size equals N, we need to estimate k + 1 parameters, or one regression coefficient for each of the predictor variables (k) plus one for the intercept. The residual degrees of freedom are calculated as N - (k + 1). This is the same as the formula for the error, or within-groups, degrees of freedom in the ANOVA. It is important to note that increasing the number of predictor variables has implications for the residual degrees of freedom. Each additional parameter to be estimated costs one residual degree of freedom (Dallal, 2003). The remaining residual degrees of freedom are used to estimate variability in the dependent variable.

• The third type of degrees of freedom is the total, or corrected total, degrees of freedom. As in ANOVA, this is calculated as N - 1.
• Finally, the fourth type of degrees of freedom, which SAS (and not SPSS) reports under the parameter estimates in multiple regression, is worth mentioning. Here, the null hypothesis is that there is no relationship between each independent variable and the dependent variable. The degree of freedom is always 1 for each relationship, and therefore some statistical software, such as SPSS, does not bother to report it.

In the example of multiple regression analysis (see Table 3), there are four different values of degrees of freedom. The first is the regression degrees of freedom. This is estimated as (k + 1) - 1, or (6 + 1) - 1 = 6, where k is the number of independent variables in the model. Second, the residual degrees of freedom are estimated as N - (k + 1). Its value here is 99 - (6 + 1) = 92. Third, the total degrees of freedom are calculated as N - 1 (or 99 - 1 = 98). Finally, the degrees of freedom shown under parameter estimates for each parameter always equal 1, as explained above. F values and the respective degrees of freedom from the current regression output should be reported as follows: The regression model is statistically significant, with F(6, 92) = 44.86, p < .0001.
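The same arithmetic can be written as a tiny helper function (the name is ours, not part of any package), here applied to the example's N = 99 usable observations and k = 6 predictors:

```python
def regression_dfs(n_obs, n_predictors):
    """Return (model, residual, corrected total) degrees of freedom for a regression."""
    df_model = (n_predictors + 1) - 1         # k coefficients plus an intercept, minus 1
    df_residual = n_obs - (n_predictors + 1)  # one df is spent per estimated parameter
    df_total = n_obs - 1
    return df_model, df_residual, df_total

print(regression_dfs(n_obs=99, n_predictors=6))  # (6, 92, 98), matching the text
```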

Degrees of Freedom in a Nonparametric Test
Pearson's chi-square, or simply the chi-square statistic, is an example of a nonparametric test that is widely used to examine the association between two nominal-level variables. According to Weiss (1968), "the number of degrees of freedom to be associated with a chi-square statistic is equal to the number of independent components that entered into its calculation" (p. 262). He further explained that each cell in a chi-square statistic represents a single component and that an independent component is one where neither observed nor expected values are determined by the frequencies in other cells. In other words, in a contingency table, one row and one column are fixed, and the remaining cells are independent and are free to vary. Therefore, the chi-square distribution has (r - 1) × (c - 1) degrees of freedom, where r is the number of rows and c is the number of columns in the analysis (Cohen, 1988; Walker, 1940; Weiss, 1968). We subtract one from both the number of rows and the number of columns simply because, by knowing the values in the other cells, we can tell the values in the last cells for both rows and columns; therefore, these last cells are not independent.

As an example, we ran a chi-square test to examine whether the gross national product (GNP) per capita of a country (GNPSPLIT) is related to its level of political freedom (FREEDOMX). Countries (GNPSPLIT) are divided into two categories: rich countries, or countries with high GNP per capita (coded as 1), and poor countries, or countries with low GNP per capita (coded as 0); political freedom (FREEDOMX) has three levels: free (coded as 1), partly free (coded as 2), and not free (coded as 3) (see Table 4). In this analysis, the degrees of freedom are (2 - 1) × (3 - 1) = 2. In other words, by knowing the number of rich countries, we would automatically know the number of poor countries. But by knowing the number of countries that are free, we would not know the number of countries that are partly free and not free.



Table 3: Multiple Regression of Literacy Rate (LITERACY) on Six Predictors

Source        Sum of Squares    df    Mean Square        F        p
Regression        70569.721      6      11761.620     44.861   <.0001
Residual          24120.239     92        262.177
Total             94689.960     98

Model fit: R² ≈ .745, adjusted R² ≈ .729, root MSE ≈ 16.19.
[The individual parameter estimates, standard errors, and t values shown in the original SAS and SPSS output are not legible in this copy and are not reproduced.]


Table 4: Cross-Tabulation of GNPSPLIT by FREEDOMX (Count and % of Total)

                        FREEDOMX
GNPSPLIT       1 (free)        2 (partly free)   3 (not free)    Total
0 (poor)       35 (28.2%)      31 (25.0%)        10 (8.1%)       76 (61.3%)
1 (rich)       13 (10.5%)      10 (8.1%)         25 (20.2%)      48 (38.7%)
Total          48 (38.7%)      41 (33.1%)        35 (28.2%)      124 (100.0%)

Chi-square tests: Pearson chi-square = 22.071, df = 2, p < .001; likelihood ratio = 22.018, df = 2, p < .001; linear-by-linear association = 14.857, df = 1, p < .001; N of valid cases = 124.


Here, we need to know two of the three components (for instance, the number of countries that are free and partly free) so that we will know the number of countries that are not free. Therefore, in this analysis there are two independent components that are free to vary, and thus the degrees of freedom are 2.
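A short Python sketch using the Table 4 cell counts (as reconstructed above) confirms that the test has (r - 1) × (c - 1) = 2 degrees of freedom for this 2 × 3 layout:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Observed counts: rows = GNPSPLIT (poor, rich); columns = FREEDOMX (free, partly free, not free)
table = np.array([[35, 31, 10],
                  [13, 10, 25]])

chi2, p_value, dof, expected = chi2_contingency(table)
r, c = table.shape
print(dof, (r - 1) * (c - 1))   # both equal 2
print(round(chi2, 3), p_value)  # Pearson chi-square, roughly 22.07
```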

Readers may note that there are three values under degrees of freedom in Table 4. The first two values are calculated the same way as discussed earlier, have the same values, and are reported most widely. These are the values associated with the Pearson chi-square and likelihood ratio chi-square tests. The final test is rarely used. We explain it briefly. The Mantel-Haenszel chi-square statistic tests the hypothesis that the relationship between the two variables (row and column variables) is linear; it is calculated as (N - 1) × r², where r is the Pearson product-moment correlation between the row variable and the column variable (SAS Institute, 1990). Its degree of freedom is always 1, and it is useful only when both row and column variables are ordinal.

CONCLUSION
Yu (1997) noted that "degree of freedom is an intimate stranger to statistics students" (p. 1). This research note has attempted to decrease the strangeness of this relationship with an introduction to the logic of the use of degrees of freedom to correctly interpret statistical results. More advanced researchers, however, will note that the information provided in this article is limited and fairly elementary. As degrees of freedom vary by statistical test (Salkind, 2004), space prohibits a more comprehensive demonstration. Anyone with a desire to learn more about degrees of freedom in statistical calculations is encouraged to consult more detailed resources, such as Good (1973), Walker (1940), and Yu (1997).

Finally, for illustrative purposes we used World Data, which report information at the country level. In our analysis, we have treated each country as an independent unit of analysis. Also, in the analysis, each country is given the same weight irrespective of its population size or area. We have ignored limitations that are inherent in the use of such data. We warn readers to ignore the statistical findings of our analysis and take away only the discussion that pertains to degrees of freedom.

REFERENCES
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum.
Dallal, G. E. (2003). Degrees of freedom. Retrieved May 30, 2006, from http://www.tufts.edu/~gdallal/dof.htm
Good, I. J. (1973). What are degrees of freedom? American Statistician, 27, 227-228.
Healey, J. F. (1990). Statistics: A tool for social research (2nd ed.). Belmont, CA: Wadsworth.
HyperStat Online. (n.d.). Degrees of freedom. Retrieved May 30, 2006, from http://davidmlane.com/hyperstat/A42408.html
Jaccard, J., & Becker, M. A. (1990). Statistics for the behavioral sciences (2nd ed.). Belmont, CA: Wadsworth.
Norman, G. R., & Streiner, D. L. (2003). PDQ statistics (3rd ed.). Hamilton, Ontario, Canada: BC Decker.
Pagano, R. R. (2004). Understanding statistics in the behavioral sciences (7th ed.). Belmont, CA: Wadsworth.
Rosenthal, J. A. (2001). Statistics and data interpretation for the helping professions. Belmont, CA: Wadsworth.
Runyon, R. P., & Haber, A. (1991). Fundamentals of behavioral statistics (7th ed.). New York: McGraw-Hill.
Salkind, N. J. (2004). Statistics for people who (think they) hate statistics (2nd ed.). Thousand Oaks, CA: Sage Publications.
SAS Institute. (1985). SAS user's guide: Statistics, version 5.1. Cary, NC: SAS Institute.
SAS Institute. (1990). SAS procedures guide, version 6 (3rd ed.). Cary, NC: SAS Institute.
Satterthwaite, F. E. (1946). An approximate distribution of estimates of variance components. Biometrics Bulletin, 2, 110-114.
Toothaker, L. E., & Miller, L. (1996). Introductory statistics for the behavioral sciences (2nd ed.). Pacific Grove, CA: Brooks/Cole.
Trochim, W.M.K. (2005). Research methods: The concise knowledge base. Cincinnati: Atomic Dog.
Walker, H. M. (1940). Degrees of freedom. Journal of Educational Psychology, 31, 253-269.
Warner, R. M. (2008). Applied statistics. Thousand Oaks, CA: Sage Publications.
Weinbach, R. W., & Grinnell, R. M., Jr. (2004). Statistics for social workers (6th ed.). Boston: Pearson.
Weiss, R. S. (1968). Statistics in social research: An introduction. New York: John Wiley & Sons.
Wonnacott, T. H., & Wonnacott, R. J. (1990). Introductory statistics (5th ed.). New York: John Wiley & Sons.
Yu, C. H. (1997). Illustrating degrees of freedom in terms of sample size and dimensionality. Retrieved November 1, 2007, from http://www.creative-wisdom.com/computer/sas/df.html

Shanta Pandey, PhD, is associate professor, and Charlotte Lyn Bright, MSW, is a doctoral student, George Warren Brown School of Social Work, Washington University, St. Louis. The authors are thankful to a student whose curious mind inspired them to work on this research note. Correspondence concerning this article should be sent to Shanta Pandey, George Warren Brown School of Social Work, Washington University, St. Louis, MO 63130; e-mail: [email protected].

Original manuscript received August 29, 2006
Final revision received November 15, 2007
Accepted January 16, 2008
