Chapter 4: Variability

Upload: logan-hubbard

Post on 18-Jan-2018

DESCRIPTION

Central Tendency and Variability Central tendency describes the central point of the distribution, and variability describes how the scores are scattered around that central point. Together, central tendency and variability are the two primary values that are used to describe a distribution of scores.

TRANSCRIPT

Variability

• The goal for variability is to obtain a measure of how spread out the scores are in a distribution.

• A measure of variability usually accompanies a measure of central tendency as basic descriptive statistics for a set of scores.

Central Tendency and Variability

• Central tendency describes the central point of the distribution, and variability describes how the scores are scattered around that central point.

• Together, central tendency and variability are the two primary values that are used to describe a distribution of scores.

Variability

• Variability serves both as a descriptive measure and as an important component of most inferential statistics.

• As a descriptive statistic, variability measures the degree to which the scores are spread out or clustered together in a distribution.

• In the context of inferential statistics, variability provides a measure of how accurately any individual score or sample represents the entire population.

Variability (cont'd.)

• When the population variability is small, all of the scores are clustered close together and any individual score or sample will necessarily provide a good representation of the entire set.

• On the other hand, when variability is large and scores are widely spread, it is easy for one or two extreme scores to give a distorted picture of the general population.

Measuring Variability

• Variability can be measured with
 – the range
 – the standard deviation/variance

• In both cases, variability is determined by measuring distance:
 – distance between two scores (range)
 – distance between a score and the mean (standard deviation/variance)

The Range (p. 106)

• The range is the total distance covered by the distribution, from the highest score to the lowest score (using the upper and lower real limits of the range).

range = URL for Xmax – LRL for Xmin

The Range (cont'd.)

• Alternative definitions of range:
 – When scores are whole numbers or discrete variables with numerical scores, the range tells us the number of measurement categories.
   range = Xmax – Xmin + 1
 – Alternatively, the range can be defined as the difference between the largest score and the smallest score (a commonly used definition, especially for discrete variables).
   range = Xmax – Xmin
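
The three range definitions can be compared side by side. The sketch below is only an illustration: the function names and the scores are hypothetical, and the real-limits version assumes scores measured in whole units, so each real limit lies half a unit beyond the score.

```python
def range_real_limits(scores, unit=1.0):
    """URL for Xmax minus LRL for Xmin (assumes a measurement unit of `unit`)."""
    half = unit / 2
    return (max(scores) + half) - (min(scores) - half)


def range_categories(scores):
    """Number of measurement categories for whole-number scores: Xmax - Xmin + 1."""
    return max(scores) - min(scores) + 1


def range_simple(scores):
    """Difference between the largest and the smallest score: Xmax - Xmin."""
    return max(scores) - min(scores)


scores = [3, 7, 12, 8, 5]  # hypothetical whole-number scores
print(range_real_limits(scores))  # 12.5 - 2.5 = 10.0
print(range_categories(scores))   # 12 - 3 + 1 = 10
print(range_simple(scores))       # 12 - 3 = 9
```

For whole-number scores the first two definitions give the same value; the third is always one unit smaller.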

The Standard Deviation

• Standard deviation measures the standard (average) distance between a score and the mean.

• The calculation of standard deviation can be summarized as a four-step process:

Dispersion: Deviation (p. 107)

• deviation = X – μ

• Finding the mean of the deviations is not a good measure of variability. Why? Because the mean serves as a balance point for the distribution: Σ(X – μ) = 0 (see Example 4.1, p. 107), so the mean deviation is always zero no matter what.

• So we need to get rid of the +/– signs:
 – sum of squared deviations (but this also squares the unit of measurement, e.g. dollars squared, age squared)
 – mean squared deviation (a standard/average measure)
 – take the square root (the unit of measurement is back to normal, e.g. dollars, age)

Example 4.1 (p.107)

• N = 4

         X    X – μ
         8      5
         1     -2
         3      0
         0     -3
sum  =  12      0
mean =   3
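
A quick check of Example 4.1 shows why the mean deviation cannot serve as a measure of variability. This is a minimal sketch using the four scores above; the variable names are illustrative only.

```python
scores = [8, 1, 3, 0]  # Example 4.1, N = 4

mu = sum(scores) / len(scores)         # population mean
deviations = [x - mu for x in scores]  # X - mu for each score

print(mu)               # 3.0
print(deviations)       # [5.0, -2.0, 0.0, -3.0]
print(sum(deviations))  # 0.0, the positive and negative deviations cancel
```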

The Standard Deviation (cont'd.)

1. Compute the deviation (distance from the mean) for each score.

2. Square each deviation.

3. Compute the mean of the squared deviations. For a population, this involves summing the squared deviations (sum of squares, SS) and then dividing by N. The resulting value is called the variance or mean square and measures the average squared distance from the mean.

For samples, variance is computed by dividing the sum of the squared deviations (SS) by n – 1, rather than N. The value n – 1 is known as the degrees of freedom (df) and is used so that the sample variance will provide an unbiased estimate of the population variance.

4. Finally, take the square root of the variance to obtain the standard deviation.
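
The four steps can be sketched as one small function. This is an illustration rather than a reference implementation: the function name and the `sample` flag are assumptions, and the scores are those of Example 4.2 in this chapter.

```python
import math


def standard_deviation(scores, sample=False):
    """Four-step standard deviation; divides SS by n - 1 when sample=True."""
    n = len(scores)
    mean = sum(scores) / n

    deviations = [x - mean for x in scores]  # Step 1: deviation of each score
    squared = [d ** 2 for d in deviations]   # Step 2: square each deviation
    ss = sum(squared)                        # sum of squares (SS)
    divisor = n - 1 if sample else n         # Step 3: N for a population, n - 1 for a sample
    variance = ss / divisor
    return math.sqrt(variance)               # Step 4: square root of the variance


scores = [1, 9, 5, 8, 7]  # Example 4.2 scores, SS = 40
print(standard_deviation(scores))               # sqrt(40 / 5) = sqrt(8)
print(standard_deviation(scores, sample=True))  # sqrt(40 / 4) = sqrt(10)
```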

Measures of Dispersion

• Range

• Variance

• Standard Deviation

Computing the Variance

Steps in computing the variance:

Step 1: Find the mean.
Step 2: Find the difference between each observation and the mean, and square that difference.
Step 3: Sum all the squared differences found in Step 2.
Step 4: Divide the sum of the squared differences by the number of items in the population.

Population Variance

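$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2 = \frac{1}{N}\sum_{i=1}^{N} x_i^2 - \mu^2$$

Sum of Squares:

$$SS = \sum (X - \mu)^2 = \sum X^2 - \frac{(\sum X)^2}{N}$$

$$\sigma^2 = \frac{SS}{N}$$

Variance = mean squared deviation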

Example 4.2 (p. 109)
• N = 5
• X = 1, 9, 5, 8, 7

         X    X – μ   (X – μ)²
         1     -5       25
         9      3        9
         5     -1        1
         8      2        4
         7      1        1
sum  =  30      0       40    (Sum of Squares, SS)
mean =   6               8    (Variance = SS/N)

Ex 4 (p. 110)
• N = 5, X = 4, 0, 7, 1, 3
• Calculate the variance

         X    X – μ   (X – μ)²
         4      1        1
         0     -3        9
         7      4       16
         1     -2        4
         3      0        0
sum  =  15      0       30    (Sum of Squares, SS)
mean =   3               6    (Variance = SS/N)

Example 4.3 (p. 111)
• N = 4
• X = 1, 0, 6, 1

         X    X – μ   (X – μ)²    X²
         1     -1        1         1
         0     -2        4         0
         6      4       16        36
         1     -1        1         1
sum  =   8      0       22        38    (Sum of Squares, SS = 22)
mean =   2               5.5            (Variance = SS/N)

$$SS = \sum X^2 - \frac{(\sum X)^2}{N} = 38 - \frac{8^2}{4} = 38 - 16 = 22$$
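
The definitional and computational formulas for SS can be checked against each other with the scores of Example 4.3. A minimal sketch; the function names are illustrative and not taken from the textbook.

```python
def ss_definitional(scores):
    """SS as the sum of squared deviations from the mean."""
    mu = sum(scores) / len(scores)
    return sum((x - mu) ** 2 for x in scores)


def ss_computational(scores):
    """SS as sum(X^2) - (sum X)^2 / N, with no deviations needed."""
    n = len(scores)
    return sum(x * x for x in scores) - sum(scores) ** 2 / n


scores = [1, 0, 6, 1]  # Example 4.3, N = 4
print(ss_definitional(scores))                 # 22.0
print(ss_computational(scores))                # 38 - 64/4 = 22.0
print(ss_computational(scores) / len(scores))  # population variance = 5.5
```

The computational formula is mainly a convenience when the mean is not a whole number and the deviations would be awkward to square by hand.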

Sample Variance

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2 = \frac{1}{n-1}\left(\sum_{i=1}^{n} x_i^2 - n\bar{x}^2\right)$$

$$\sum (X - \bar{X})^2 = \sum X^2 - n\bar{X}^2$$

$$SS = \sum (X - M)^2 = \sum X^2 - \frac{(\sum X)^2}{n}$$

$$M = \bar{X}$$

Sample Standard Deviation

$$s = \sqrt{s^2} = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$

where:
 s² is the sample variance
 x is the value of each observation in the sample
 x̄ is the mean of the sample
 n is the number of observations in the sample

degrees of freedom: df

• Remember: the population mean won't change no matter what, but the sample mean will change whenever you collect a new sample!

• Example 4.6 (p. 117)
• n = 3, M = 5, X = 2, 9, ?
• The 3rd score is determined by M = ΣX/n.
• Only n – 1 scores are free to vary, i.e. they are independent of each other and can take any value.
• degrees of freedom = n – 1
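
Example 4.6 can be worked through directly: once the sample mean is fixed, the last score is no longer free. The helper below is a hypothetical illustration, not something from the textbook.

```python
def last_score(known_scores, n, mean):
    """Score forced by the sample mean: M = sum(X) / n, so sum(X) = n * M."""
    return n * mean - sum(known_scores)


# Example 4.6: n = 3, M = 5, and the first two scores are 2 and 9.
print(last_score([2, 9], n=3, mean=5))  # 15 - 11 = 4
```

Two scores were free to take any value; the third had to be 4, which is why df = n – 1 = 2.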

degrees of freedom: df=1, n=2

$$X_1 - M = X_1 - \frac{X_1 + X_2}{2} = \frac{X_1 - X_2}{2} = c$$

$$X_2 - M = X_2 - \frac{X_1 + X_2}{2} = \frac{X_2 - X_1}{2} = -c$$

degrees of freedom: df=1, n=2 (cont.)

$$SS = \sum_{i=1}^{2}(X_i - M)^2 = c^2 + c^2 = 2c^2$$

$$s^2 = \frac{SS}{n - 1} = \frac{c^2 + c^2}{1} = 2c^2$$

degrees of freedom: df=2, n=3

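$$X_1 - M = X_1 - \frac{X_1 + X_2 + X_3}{3} = \frac{2X_1 - X_2 - X_3}{3} = c_1$$

$$X_2 - M = X_2 - \frac{X_1 + X_2 + X_3}{3} = \frac{-X_1 + 2X_2 - X_3}{3} = c_2$$

$$X_3 - M = X_3 - \frac{X_1 + X_2 + X_3}{3} = \frac{-X_1 - X_2 + 2X_3}{3} = -(c_1 + c_2)$$

$$SS = \sum_{i=1}^{3}(X_i - M)^2 = d_1^2 + d_2^2$$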

degrees of freedom: df=2, n=3 (cont.)

• Actually, SS can be reformulated into 2 distinct “distances”:

$$SS = \sum_{i=1}^{3}(X_i - M)^2 = d_1^2 + d_2^2$$

$$d_1 = \frac{c_1 - c_2}{\sqrt{2}}, \qquad d_2 = \frac{3(c_1 + c_2)}{\sqrt{6}}$$
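
The n = 3 decomposition can be verified numerically. The sketch assumes the two distances are d1 = (c1 – c2)/√2 and d2 = 3(c1 + c2)/√6, where c1 and c2 are the deviations of the first two scores; the sample values and function names are hypothetical.

```python
import math


def ss_direct(scores):
    """SS as the sum of squared deviations from the sample mean."""
    m = sum(scores) / len(scores)
    return sum((x - m) ** 2 for x in scores)


def ss_from_two_distances(scores):
    """SS for n = 3 rebuilt from two independent distances d1 and d2."""
    x1, x2, x3 = scores
    m = (x1 + x2 + x3) / 3
    c1 = x1 - m  # deviation of the first score
    c2 = x2 - m  # deviation of the second score; the third is -(c1 + c2)
    d1 = (c1 - c2) / math.sqrt(2)
    d2 = 3 * (c1 + c2) / math.sqrt(6)
    return d1 ** 2 + d2 ** 2


scores = [2, 9, 4]  # hypothetical sample with M = 5
print(ss_direct(scores))              # 9 + 16 + 1 = 26.0
print(ss_from_two_distances(scores))  # 24.5 + 1.5, equal up to rounding
```

Three deviations go in, but only two independent distances are needed to rebuild SS, which matches df = 2.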

degrees of freedom: df=n-1

$$SS = \sum_{i=1}^{n}(X_i - M)^2 = d_1^2 + d_2^2 + \dots + d_{n-1}^2$$

• With n scores in the sample, SS can be reformulated into n – 1 distinct “distances”, so the “mean” variation has n – 1 degrees of freedom:

$$s^2 = \frac{SS}{n - 1} = \frac{d_1^2 + d_2^2 + \dots + d_{n-1}^2}{n - 1}$$

Example 4.5 (p. 116)
• n = 7
• X = 1, 6, 4, 3, 8, 7, 6

         X    X – M   (X – M)²    X²
         1     -4       16         1
         6      1        1        36
         4     -1        1        16
         3     -2        4         9
         8      3        9        64
         7      2        4        49
         6      1        1        36
sum  =  35      0       36       211    (Sum of Squares, SS = 36)
mean =   5               6              (Variance = SS/(n – 1) = 36/6)

Ex 2 (p. 118)
• N = 5, X = 1, 5, 7, 3, 4
• What if instead this is a sample of n = 5?

         X    X – μ   (X – μ)²    X²
         1     -3        9         1
         5      1        1        25
         7      3        9        49
         3     -1        1         9
         4      0        0        16
sum  =  20      0       20       100    (Sum of Squares, SS = 20)
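
The population and sample answers to this exercise differ only in the divisor. A minimal sketch using Python's standard `statistics` module, where `pvariance` divides SS by N and `variance` divides SS by n – 1:

```python
import statistics

scores = [1, 5, 7, 3, 4]  # Ex 2 scores, SS = 20

# Treated as a population: SS / N = 20 / 5 = 4
population_variance = statistics.pvariance(scores)

# Treated as a sample: SS / (n - 1) = 20 / 4 = 5
sample_variance = statistics.variance(scores)

print(population_variance, sample_variance)
```

The same scores give a larger variance when treated as a sample, because SS is divided by df = 4 instead of N = 5.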

Unbiased, Biased

• If the average value of all possible sample statistics equals the population parameter, the statistic is unbiased.

• e.g. we select 100 samples (n = 5), calculate 100 sample means and variances, then calculate the average mean (ΣM/100) and the average variance (Σs²/100). If

$$\frac{\sum M}{100} = \mu, \qquad \frac{\sum s^2}{100} = \sigma^2$$

 the statistics are unbiased.

• Otherwise, the sample statistic is biased.

Example 4.7 (p. 120)
• Population: N = 6, X = 0, 0, 3, 3, 9, 9; μ = 4, σ² = 14
• Select 9 samples with n = 2 (sample statistics in Table 4.1)

Sample      M       SS/n    SS/(n-1)
  1         0        0        0
  2         1.5      2.25     4.5
  3         4.5     20.25    40.5
  4         1.5      2.25     4.5
  5         3        0        0
  6         6        9       18
  7         4.5     20.25    40.5
  8         6        9       18
  9         9        0        0
sum  =     36       63      126
mean =      4        7       14
         Unbiased  Biased  Unbiased
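
The table can be reproduced by enumeration. The sketch assumes the nine samples are the ordered pairs drawn with replacement from the three distinct values 0, 3 and 9, which matches the M column above; the helper names are illustrative.

```python
from itertools import product


def mean(values):
    return sum(values) / len(values)


def sum_of_squares(values):
    m = mean(values)
    return sum((x - m) ** 2 for x in values)


# Assumption: ordered samples of n = 2, drawn with replacement from 0, 3, 9.
samples = list(product([0, 3, 9], repeat=2))

sample_means = [mean(s) for s in samples]
biased = [sum_of_squares(s) / 2 for s in samples]          # SS / n
unbiased = [sum_of_squares(s) / (2 - 1) for s in samples]  # SS / (n - 1)

print(mean(sample_means))  # 4.0, equals mu
print(mean(biased))        # 7.0, underestimates sigma^2 = 14
print(mean(unbiased))      # 14.0, equals sigma^2
```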

Visualize your data set by μ, σ or M, s
• using a histogram or other graphs
• If your score = 85, is it high enough to get the scholarship?

Properties of the Standard Deviation

• If a constant is added to every score in a distribution, the standard deviation will not be changed.

• If you visualize the scores in a frequency distribution histogram, then adding a constant will move each score so that the entire distribution is shifted to a new location.

• The center of the distribution (the mean) changes, but the standard deviation remains the same.

$$E(c + X) = c + \mu, \qquad V(c + X) = V(X)$$

Properties of the Standard Deviation (cont'd.)

• If each score is multiplied by a constant, the standard deviation will be multiplied by the same constant.

• Multiplying by a constant will multiply the distance between scores, and because the standard deviation is a measure of distance, it will also be multiplied.

$$E(cX) = c\mu, \qquad V(cX) = c^2 V(X)$$
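
Both properties are easy to confirm numerically. The sketch below uses the Example 4.2 scores and an arbitrary constant chosen only for illustration; `statistics.pstdev` is the standard-library population standard deviation.

```python
import math
import statistics

scores = [1, 9, 5, 8, 7]  # Example 4.2 scores
c = 10                    # arbitrary constant, chosen for illustration

shifted = [x + c for x in scores]  # add a constant to every score
scaled = [x * c for x in scores]   # multiply every score by a constant

sd = statistics.pstdev(scores)
sd_shifted = statistics.pstdev(shifted)
sd_scaled = statistics.pstdev(scaled)

print(math.isclose(sd_shifted, sd))     # True: adding c leaves sd unchanged
print(math.isclose(sd_scaled, c * sd))  # True: multiplying by c multiplies sd by c
```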

The Mean and Standard Deviation as Descriptive Statistics

• If you are given numerical values for the mean and the standard deviation, you should be able to construct a visual image (or a sketch) of the distribution of scores.

• As a general rule, about 70% of the scores will be within one standard deviation of the mean, and about 95% of the scores will be within a distance of two standard deviations of the mean.
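
For comparison, the exact proportions for a normal distribution can be computed with the standard library. This assumes the scores are normally distributed, which the slide does not state; the function name is illustrative.

```python
from statistics import NormalDist


def share_within(k, mean=0.0, sd=1.0):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(mean + k * sd) - dist.cdf(mean - k * sd)


print(round(share_within(1), 4))  # 0.6827, roughly the 70% quoted above
print(round(share_within(2), 4))  # 0.9545, roughly the 95% quoted above
```

For distributions that are far from normal, the proportions can differ noticeably, so the rule is best read as a rough guide for sketching.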

The Empirical Rule

Example 4.8 (p. 124-125)

• Is there any consistent difference between the two treatments?

• Experiment A: 2 sets of data; the sd is quite small, so ∆M = 5 is distinct and easy to see.

• Experiment B: 2 sets of data; the sd is quite large, so with ∆M = 5 it is difficult to discern the difference.

• Error variance: the result of unsystematic differences (unexplained and uncontrolled differences between scores), e.g. static noise on a radio.

Demo 4.1 (p. 129)
• Sample: n = 6, X = 10, 7, 6, 10, 6, 15
• Compute the variance and standard deviation.

ΣX = 54, ΣX² = 546

$$SS = \sum (X - M)^2 = \sum X^2 - \frac{(\sum X)^2}{n}$$

SS = 546 – (54²/6) = 546 – 486 = 60
s² = SS/(n – 1) = 60/5 = 12
s = √12 = 3.4641