8.19
$\bar{X} = 177$, $\mu = 170$, $S = 16$, $N = 25$
$Z = \dfrac{177 - 170}{16 / \sqrt{25}} = \dfrac{7}{16 / 5} = \dfrac{7}{3.2} = 2.188$
Critical Value = 1.64
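The arithmetic in 8.19 can be checked with a few lines of Python. This is only a sketch: the function name `one_sample_statistic` is illustrative and does not come from the slides.

```python
import math

def one_sample_statistic(sample_mean, hypothesized_mean, sd, n):
    """Return (sample mean - hypothesized mean) / (sd / sqrt(n))."""
    standard_error = sd / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error

# Values from problem 8.19: 7 / 3.2 = 2.1875
z = one_sample_statistic(177, 170, 16, 25)
print(round(z, 3))
```

The same function applies to any one-sample problem stated as summary values (mean, hypothesized mean, standard deviation, sample size).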
8.25
$\bar{X} = 130.1$, $\mu = 120$, $S = 21.2$, $N = 100$
$t = \dfrac{130.1 - 120}{21.2 / \sqrt{100}} = \dfrac{10.1}{21.2 / 10} = \dfrac{10.1}{2.12} = 4.764$
Critical Value = 1.662
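9.10 Pre/post
        Before   After   Difference (D)   D squared
Total   108.2    108.4   3                105
$t_{paired} = t_p = \dfrac{\bar{d} - 0}{\text{standard error of } \bar{d}} = \dfrac{\bar{d} - 0}{\sqrt{S_d^2 / N}}$
$\bar{d} = \sum D / N$
$\sum d^2 = \sum D^2 - (\sum D)^2 / N$
$S_d^2 = \sum d^2 / (N - 1)$
N = 15
$\bar{d}$ = 3/15 = 0.2
$\sum d^2$ = 105 - 9/15 = 104.4
$S_d^2$ = 104.4 / (15 - 1) = 7.457
$t_p = 0.2 / \sqrt{7.457 / 15} = 0.2 / \sqrt{0.497} = 0.2 / 0.705 = 0.284$
df = N - 1 = 14
Critical value ($\alpha = 0.05$) = 1.7613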
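9.11 Pre/post
        Before   After   Difference (D)   D squared
Total   114.2    114.7   15               279
$t_{paired} = t_p = \dfrac{\bar{d} - 0}{\text{standard error of } \bar{d}} = \dfrac{\bar{d} - 0}{\sqrt{S_d^2 / N}}$
$\bar{d} = \sum D / N$
$\sum d^2 = \sum D^2 - (\sum D)^2 / N$
$S_d^2 = \sum d^2 / (N - 1)$
N = 30
$\bar{d}$ = 15/30 = 0.5
$\sum d^2$ = 279 - 225/30 = 271.5
$S_d^2$ = 271.5 / (30 - 1) = 9.362
$t_p = 0.5 / \sqrt{9.362 / 30} = 0.5 / \sqrt{0.312} = 0.5 / 0.559 = 0.894$
df = N - 1 = 29
Critical value ($\alpha = 0.05$) = 1.6991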
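The paired-t steps above can be sketched in Python from the two column totals alone. The function name `paired_t_from_sums` is an illustrative assumption. Without intermediate rounding the result for 9.11 is about 0.895, against 0.894 in the hand calculation.

```python
import math

def paired_t_from_sums(sum_d, sum_d_squared, n):
    """Paired t statistic from sum(D), sum(D^2) and the number of pairs."""
    mean_d = sum_d / n                         # d-bar = sum(D) / N
    ss_d = sum_d_squared - sum_d ** 2 / n      # sum(d^2) = sum(D^2) - (sum(D))^2 / N
    var_d = ss_d / (n - 1)                     # S_d^2 = sum(d^2) / (N - 1)
    standard_error = math.sqrt(var_d / n)      # sqrt(S_d^2 / N)
    return mean_d / standard_error, n - 1      # (t, degrees of freedom)

# Totals from problem 9.11
t, df = paired_t_from_sums(sum_d=15, sum_d_squared=279, n=30)
print(round(t, 3), df)
```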
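Pooled estimate of the SED (SEDp)
Estimate of the SEDp of $\bar{x}_E - \bar{x}_C = \sqrt{S_p^2 \left( \dfrac{1}{N_s} + \dfrac{1}{N_{ns}} \right)}$
$S_p^2$ = Pooled estimate of the variance
$S_p^2 = \dfrac{\sum (x_s - \bar{x}_s)^2 + \sum (x_{ns} - \bar{x}_{ns})^2}{N_s + N_{ns} - 2}$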
9.22
Pooled estimate of the SED (SEDp)
Estimate of the SEDp of $\bar{x}_E - \bar{x}_C = \sqrt{S_p^2 \left( \dfrac{1}{14} + \dfrac{1}{18} \right)}$
$S_p^2 = \dfrac{1399.5 + 1814.5}{14 + 18 - 2} = \dfrac{3214}{30} = 107.133$
9.22
t-Test (Two Tailed)
$t = \dfrac{\bar{x}_s - \bar{x}_{ns} - 0}{\sqrt{S_p^2 \left[ (1/N_s) + (1/N_{ns}) \right]}}$
$df = N_s + N_{ns} - 2$
9.22
t-Test (Two Tailed)
$t = \dfrac{83.5 - 72.5}{\sqrt{107.133 \left[ (1/14) + (1/18) \right]}} = \dfrac{11}{\sqrt{107.133 (0.071 + 0.055)}} = \dfrac{11}{\sqrt{107.133 (0.126)}} = \dfrac{11}{\sqrt{13.498}} = \dfrac{11}{3.674} = 2.994$
$df = 14 + 18 - 2 = 30$
Critical value = 2.0423
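The pooled-variance calculation for 9.22 can be sketched in Python as below. The function name `pooled_t` and its argument names are illustrative assumptions. Carrying full precision gives t of about 2.98 rather than 2.994, because the hand calculation rounds 1/14 and 1/18 to three decimals; either value exceeds the critical value of 2.0423.

```python
import math

def pooled_t(mean_1, mean_2, ss_1, ss_2, n_1, n_2):
    """Two-sample t statistic using the pooled estimate of the variance.

    ss_1 and ss_2 are the sums of squared deviations within each group.
    """
    df = n_1 + n_2 - 2
    pooled_variance = (ss_1 + ss_2) / df                    # Sp^2
    sed = math.sqrt(pooled_variance * (1 / n_1 + 1 / n_2))  # SEDp
    return (mean_1 - mean_2) / sed, df

# Values from problem 9.22
t, df = pooled_t(83.5, 72.5, 1399.5, 1814.5, 14, 18)
print(round(t, 3), df)
```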
ANOVA
Analysis of Variance
• Allows the statistician to analyze multiple data sets.
• Number of pairwise comparisons when N groups are taken two at a time: N(N-1)/2
• If individual z tests were performed on each pair drawn from a large number of groups, the number of calculations would be prohibitive.
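To see how quickly the number of pairwise tests grows, the count N(N-1)/2 can be tabulated. This is a small sketch; `pairwise_comparisons` is an illustrative name.

```python
import math

def pairwise_comparisons(groups):
    """Number of distinct pairs among `groups` groups: N(N - 1) / 2."""
    return groups * (groups - 1) // 2

for k in (3, 4, 10, 20):
    # math.comb(k, 2) is the same count from the standard library
    assert pairwise_comparisons(k) == math.comb(k, 2)
    print(k, "groups:", pairwise_comparisons(k), "pairwise tests")
```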
Assumptions underlying the use of ANOVA
1. The individuals in the various subgroups should be selected on the basis of random sampling from normally distributed populations.
2. The variance of the subgroups should be homogeneous. ($H_0$: $\sigma_1 = \sigma_2 = \dots = \sigma_n$)
3. The samples comprising the groups should be independent.
Single classification ANOVA
        Group A   Group B   Group C   Group A   Group B   Group C
        X         X         X         X2        X2        X2
        12        18        6         144       324       36
        18        17        4         324       289       16
        16        16        14        256       256       196
        8         18        4         64        324       16
        6         12        6         36        144       36
        12        17        12        144       289       144
        10        10        14        100       100       196
Sum     82        108       60        1068      1726      640
Mean    11.71     15.43     8.57
Grand mean: $\bar{X}_t = 11.90$
Values needed for ANOVA
• The Total Sum of the Squares: $\sum x_t^2 = \sum X^2 - (\sum X)^2 / N$
• The “Between” Sum of Squares: $\sum x_b^2 = \sum n (\bar{X} - \bar{X}_T)^2$
• The “Within” Sum of Squares: $\sum x^2 = \sum X^2 - (\sum X)^2 / n$ for each group, or $\sum x_w^2 = \sum x_t^2 - \sum x_b^2$
• The Degrees of Freedom: (number of groups - 1) plus the sum over groups of (N within the group - 1)
Values needed for ANOVA
• The Total Sum of the Squares: $\sum x_t^2 = \sum X^2 - (\sum X)^2 / N = 1068 + 1726 + 640 - [(82 + 108 + 60)^2 / 21] = 457.8$
• The “Between” Sum of Squares: $\sum x_b^2 = \sum [(\sum X)^2 / n] - (\sum X_t)^2 / N = [(82)^2/7 + (108)^2/7 + (60)^2/7] - (250)^2/21 = 165.0$
• The “Within” Sum of Squares: $\sum x^2 = \sum X^2 - (\sum X)^2 / n$ for each group, or $\sum x_w^2 = \sum x_t^2 - \sum x_b^2 = 457.8 - 165.0 = 292.8$
• The Degrees of Freedom: (number of groups - 1) plus the sum over groups of (N within the group - 1) = 3 - 1 + (7 - 1 + 7 - 1 + 7 - 1) = 2 + 18 = 20
ANOVA Table
Source of variation   df   Sum of Squares   Mean Square
“Between” Groups      2    165.0            82.5
“Within” Groups       18   292.8            16.3
Total                 20   457.8
The F-Test
$F = \dfrac{\text{mean square for “between” groups}}{\text{mean square for “within” groups}} = \dfrac{82.5}{16.3} = 5.06$
“Between” df = 2, “Within” df = 18
Value of F needed for significance at the 5% level = 3.55
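The single-classification ANOVA above can be reproduced from the raw scores with a short Python sketch. The function name `one_way_anova` and its return layout are illustrative assumptions. With full precision F comes out near 5.07, against 5.06 from the rounded mean squares.

```python
def one_way_anova(groups):
    """Single-classification ANOVA using the sum-of-squares shortcut formulas."""
    all_values = [x for group in groups for x in group]
    n_total = len(all_values)
    correction = sum(all_values) ** 2 / n_total                  # (sum X)^2 / N
    ss_total = sum(x * x for x in all_values) - correction
    ss_between = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    ss_within = ss_total - ss_between
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return ss_total, ss_between, ss_within, df_between, df_within, f_ratio

group_a = [12, 18, 16, 8, 6, 12, 10]
group_b = [18, 17, 16, 18, 12, 17, 10]
group_c = [6, 4, 14, 4, 6, 12, 14]
result = one_way_anova([group_a, group_b, group_c])
print([round(value, 2) for value in result])
```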
Page 325
Tests after the F test
• $F = \dfrac{(\bar{X}_1 - \bar{X}_2)^2}{s_w^2 (N_1 + N_2) / (N_1 N_2)}$
• A vs. B: $F = \dfrac{(11.71 - 15.43)^2}{16.3 (14) / 49} = \dfrac{(3.72)^2}{4.66} = 2.97$
• A vs. C: $F = \dfrac{(11.71 - 8.57)^2}{16.3 (14) / 49} = \dfrac{(3.14)^2}{4.66} = 2.12$
• B vs. C: $F = \dfrac{(15.43 - 8.57)^2}{16.3 (14) / 49} = \dfrac{(6.86)^2}{4.66} = 10.1$
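The three pairwise comparisons can be checked with the sketch below, which plugs the rounded group means and the "within" mean square into the formula above. The names `pairwise_f`, `means`, and `results` are illustrative assumptions.

```python
def pairwise_f(mean_1, mean_2, ms_within, n_1, n_2):
    """F for one pair of groups after a significant overall F test."""
    denominator = ms_within * (n_1 + n_2) / (n_1 * n_2)
    return (mean_1 - mean_2) ** 2 / denominator

means = {"A": 11.71, "B": 15.43, "C": 8.57}   # rounded group means
ms_within, group_size = 16.3, 7

results = {}
for first, second in [("A", "B"), ("A", "C"), ("B", "C")]:
    results[(first, second)] = pairwise_f(
        means[first], means[second], ms_within, group_size, group_size
    )
    print(first, "vs.", second, round(results[(first, second)], 2))
```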
Page 181
X
        A    B    C    D
        1    7    9    8
        1    7    6    6
        1    7    5    4
        1    7    3    1
        1    7    2    1
Sum     5    35   25   20
Page 181
X
        A    X2   B    X2   C    X2   D    X2
        1    1    7    49   9    81   8    64
        1    1    7    49   6    36   6    36
        1    1    7    49   5    25   4    16
        1    1    7    49   3    9    1    1
        1    1    7    49   2    4    1    1
Sum X   5         35        25        20
Sum X2       5         245       155       118
$\bar{X}_a = 1$, $\bar{X}_b = 7$, $\bar{X}_c = 5$, $\bar{X}_d = 4$
$\bar{X}_t = 4.25$
$\sum X_t = 85$
$\sum X_t^2 = 523$
Values needed for ANOVA
• The Total Sum of the Squares: $\sum x_t^2 = \sum X^2 - (\sum X)^2 / N = [5 + 245 + 155 + 118] - [(5 + 35 + 25 + 20)^2 / 20] = 161.75$
• The “Between” Sum of Squares: $\sum x_b^2 = \sum [(\sum X)^2 / n] - (\sum X_t)^2 / N = [(5)^2/5 + (35)^2/5 + (25)^2/5 + (20)^2/5] - (85)^2/20 = 93.75$
• The “Within” Sum of Squares: $\sum x^2 = \sum X^2 - (\sum X)^2 / n$ for each group, or $\sum x_w^2 = \sum x_t^2 - \sum x_b^2 = 161.75 - 93.75 = 68$
• The Degrees of Freedom: (number of groups - 1) plus the sum over groups of (N within the group - 1) = 4 - 1 + (5 - 1 + 5 - 1 + 5 - 1 + 5 - 1) = 3 + 16 = 19
ANOVA Table
Source of variation   df   Sum of Squares   Mean Square
“Between” Groups      3    93.75            31.25
“Within” Groups       16   68               4.25
Total                 19   161.75
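F = 31.25/4.25 = 7.35
Tukey’s HSD test: $\alpha = 0.05$, k = 4, n - k = 16; Appendix C: q = 4.05
$HSD = q \sqrt{\dfrac{\text{mean square “within”}}{\text{group size}}} = 4.05 \sqrt{\dfrac{4.25}{5}} = 4.05 (0.922) = 3.73$
Pair   Mean Difference
A-B    6
A-C    4
A-D    3
B-C    2
B-D    3
C-D    1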
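As a sketch of how the HSD value is used, the code below compares every pairwise mean difference with the HSD threshold. The value q = 4.05 is the one the slide reads from Appendix C. The function and variable names are illustrative assumptions.

```python
import math
from itertools import combinations

def tukey_hsd(q, ms_within, group_size):
    """Honestly significant difference: q * sqrt(MS within / group size)."""
    return q * math.sqrt(ms_within / group_size)

means = {"A": 1, "B": 7, "C": 5, "D": 4}      # group means from the data table
hsd = tukey_hsd(q=4.05, ms_within=4.25, group_size=5)

# A pair differs significantly when its absolute mean difference exceeds HSD
significant = [
    (first, second)
    for first, second in combinations(means, 2)
    if abs(means[first] - means[second]) > hsd
]
print(round(hsd, 2), significant)
```

With these values only the A-B and A-C differences (6 and 4) exceed the HSD of about 3.73.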
CHAPTER 11
Inferences Regarding Proportions
OUTLINE
11.1 INFERENCES WITH QUALITATIVE DATA
Discusses the problem of inference in qualitative data
11.2 MEAN AND STANDARD DEVIATION OF THE BINOMIAL DISTRIBUTION
Explains how to compute a mean and a standard deviation for the binomial distribution
11.3 APPROXIMATION OF THE NORMAL TO THE BINOMIAL DISTRIBUTION
Shows that, using the normal approximation, it is possible to compute a Z score for a number of successes
11.4 TEST OF SIGNIFICANCE OF A BINOMIAL PROPORTION
Gives instructions on how to test hypotheses regarding proportions if the distribution of the proportion of successes is known
11.5 TEST OF SIGNIFICANCE OF THE DIFFERENCE BETWEEN TWO PROPORTIONS
Illustrates that, because the difference between two proportions is approximately normally distributed, a hypothesis test for the difference may be easily set up
11.6 CONFIDENCE INTERVALS
Discusses and illustrates confidence intervals for $\pi$ and $\pi_1 - \pi_2$
LEARNING OBJECTIVES
• 1. Compute the mean and the standard deviation of a binomial distribution
• 2. Compute Z scores for specific points on a binomial distribution
• 3. Perform significance tests of a binomial proportion and of the difference between two binomial proportions
• 4. Calculate confidence intervals for a binomial proportion and for the difference between two proportions
INFERENCES WITH QUALITATIVE DATA
• A. Qualitative data: data for which individual quantitative measurements are not available but that relate to the presence or absence of some characteristic
• B. p: the estimate of the true proportion, $\pi$, of individuals who possess a certain characteristic
• C. To best understand the difference between the distribution of binomial events (x) and the distribution of the binomial proportion (p):
– 1. Compare these distributions with those in the approximately analogous quantitative situation
– 2. The x’s of a binomial distribution have a mean and a standard error
MEAN AND STANDARD DEVIATION OF THE BINOMIAL DISTRIBUTION
• A. The probability of x successful outcomes in n independent trials is given by:
– 1. $P(x) = \binom{n}{x} \pi^x (1 - \pi)^{n - x}$
• where $\pi$ is the probability of a success in one individual trial
• $P(x)$ will be used to designate the probability of x successful outcomes
• B. In a binomial distribution the mean for the number of successes, x, is $\mu = n\pi$
• and the standard deviation is $\sigma = \sqrt{n\pi(1 - \pi)}$
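The binomial probability, mean, and standard deviation can be sketched in Python as follows. The function names and the values n = 25 and a success probability of 0.2 are assumptions made for this illustration.

```python
import math

def binomial_probability(x, n, pi):
    """Probability of exactly x successes in n independent trials."""
    return math.comb(n, x) * pi ** x * (1 - pi) ** (n - x)

def binomial_mean_sd(n, pi):
    """Mean n*pi and standard deviation sqrt(n*pi*(1 - pi)) of the number of successes."""
    return n * pi, math.sqrt(n * pi * (1 - pi))

n, pi = 25, 0.2                      # illustrative values, not from the slides
mean, sd = binomial_mean_sd(n, pi)
total = sum(binomial_probability(x, n, pi) for x in range(n + 1))
print(mean, round(sd, 3), round(total, 6))   # the probabilities sum to 1
```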
APPROXIMATION OF THE NORMAL TO THE BINOMIAL DISTRIBUTION
• A. The normal distribution is a reasonable approximation to the binomial distribution when n is large
• B. We can find a point on the Z distribution that corresponds to a point x on the binomial distribution by using
$Z = \dfrac{x - n\pi}{\sqrt{n\pi(1 - \pi)}}$
APPROXIMATION OF THE NORMAL TO THE BINOMIAL DISTRIBUTION
• C. Because we are using a normal (continuous) distribution to approximate a discrete one, we may apply the continuity correction to achieve an adjustment. The correction is made by subtracting ½ from the absolute value of the numerator, that is,
$Z = \dfrac{|x - n\pi| - \frac{1}{2}}{\sqrt{n\pi(1 - \pi)}}$
$Z = \dfrac{|9 - 5| - \frac{1}{2}}{\sqrt{25(.2)(.8)}} = \dfrac{3.5}{2} = 1.75$
• D. When n is very large and $\pi$ is very small, another important distribution, the Poisson distribution, is a good approximation to the binomial
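The continuity-corrected Z score can be sketched in Python as below. The function name is illustrative. The inputs (9 successes in 25 trials with success probability 0.2) are assumptions chosen so that the mean is 5 and the standard deviation is 2.

```python
import math

def z_with_continuity_correction(x, n, pi):
    """Normal-approximation Z score for x successes, with the 1/2 correction."""
    numerator = abs(x - n * pi) - 0.5
    return numerator / math.sqrt(n * pi * (1 - pi))

# (|9 - 5| - 0.5) / sqrt(25 * 0.2 * 0.8) = 3.5 / 2
z = z_with_continuity_correction(x=9, n=25, pi=0.2)
print(round(z, 2))
```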
TEST OF SIGNIFICANCE OF A BINOMIAL PROPORTION
• A. The mean of the distribution of a binomial proportion p is given by the population parameter
$\pi = \dfrac{\text{number of successes in population}}{\text{number of cases in population}}$
• where the sample proportion is $p = \dfrac{x}{n}$
• and the standard error of p is given by
$SE(p) = \sqrt{\dfrac{\pi(1 - \pi)}{n}}$
• B. When p appears to be normally distributed, provided n is reasonably large, we can find the Z score corresponding to a particular p and perform a test of significance
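A test of a single proportion can be sketched as below, using Z = (p - hypothesized proportion) / SE(p) with the standard error computed under the hypothesized value. The function name and the numbers (30 successes in 100 trials against a hypothesized proportion of 0.20) are hypothetical and only illustrate the calculation.

```python
import math

def z_for_proportion(x, n, pi_0):
    """Z score for an observed proportion x/n against a hypothesized proportion pi_0."""
    p = x / n
    standard_error = math.sqrt(pi_0 * (1 - pi_0) / n)   # SE(p) under the hypothesis
    return (p - pi_0) / standard_error

# Hypothetical data: (0.30 - 0.20) / sqrt(0.2 * 0.8 / 100) = 0.10 / 0.04
z = z_for_proportion(x=30, n=100, pi_0=0.20)
print(round(z, 2))
```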
TEST OF SIGNIFICANCE OF THE DIFFERENCE BETWEEN TWO PROPORTIONS
• A. In order to compare proportions from two different samples we must:
– 1. assume that the proportions are equal, that is, $\pi_1 = \pi_2$, in estimating $SE(p_1 - p_2)$
– 2. learn if $p_1$, the proportion with the given characteristic in one sample, differs significantly from $p_2$, the proportion with the same characteristic in the second sample
• B. Three things that must be known to determine if the proportions are significantly different:
– 1. the distribution of the differences: $p_1 - p_2$
– 2. the mean: $\pi_1 - \pi_2$
– 3. the standard error of this distribution: $SE(p_1 - p_2)$
• C. Statisticians have shown that $p_1 - p_2$ follows a nearly normal distribution
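TEST OF SIGNIFICANCE OF THE DIFFERENCE BETWEEN TWO PROPORTIONS
• D. The standard error is estimated by
$SE(p_1 - p_2) = \sqrt{\dfrac{p_t q_t}{n_1} + \dfrac{p_t q_t}{n_2}}$
• where $p_t = \dfrac{x_1 + x_2}{n_1 + n_2}$ and $q_t = 1 - p_t$
• and $p_1 = \dfrac{x_1}{n_1}$
• and $p_2 = \dfrac{x_2}{n_2}$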
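TEST OF SIGNIFICANCE OF THE DIFFERENCE BETWEEN TWO PROPORTIONS
• Knowing the mean and the standard error of the distribution of differences, we can calculate a Z score:
$Z = \dfrac{(p_1 - p_2) - (\pi_1 - \pi_2)}{SE(p_1 - p_2)}$
• If $\pi_1 \neq \pi_2$, the formula for $SE(p_1 - p_2)$ is
$SE(p_1 - p_2) = \sqrt{\dfrac{\pi_1(1 - \pi_1)}{n_1} + \dfrac{\pi_2(1 - \pi_2)}{n_2}}$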
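The two-proportion test under the assumption of equal population proportions can be sketched as follows, using the pooled proportion in the standard error. The function name and the counts (30 of 100 versus 20 of 100) are hypothetical and serve only to illustrate the arithmetic.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Z for the difference of two sample proportions, pooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)          # pooled proportion
    q_pooled = 1 - p_pooled
    standard_error = math.sqrt(p_pooled * q_pooled / n1 + p_pooled * q_pooled / n2)
    # Under the null hypothesis the difference of population proportions is 0
    return (p1 - p2) / standard_error

z = two_proportion_z(x1=30, n1=100, x2=20, n2=100)   # hypothetical counts
print(round(z, 3))
```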
CONFIDENCE INTERVALS
• A. Although hypothesis testing is useful, we often go a step further to learn:
– 1. the true proportion
– 2. the true difference in proportion between the baseline data and the revised data
• B. To answer these questions we compute confidence intervals for $\pi$ and for $\pi_1 - \pi_2$ by employing a method similar to the one used for computing confidence intervals for $\mu$ and $\mu_1 - \mu_2$
CONFIDENCE INTERVALS
• C. Confidence interval for $\pi$
• Chapter 8 version: $\bar{x} \pm Z \dfrac{\sigma}{\sqrt{n}}$
• Similar version: $p \pm Z \sqrt{\dfrac{\pi(1 - \pi)}{n}}$
• This expression presents a dilemma: it requires that we know $\pi$, which is an unknown. The solution is to have a sufficiently large sample size, permitting the use of p as an estimate of $\pi$
• The expression then becomes $p \pm Z \sqrt{\dfrac{p(1 - p)}{n}}$
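CONFIDENCE INTERVALS
• D. Confidence interval for $\pi_1 - \pi_2$
• The confidence interval for the difference of two means is:
CI for $\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm Z \cdot SE(\bar{x}_1 - \bar{x}_2)$
• The confidence interval for the difference of two proportions is similar:
CI for $\pi_1 - \pi_2 = (p_1 - p_2) \pm Z \sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}$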
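Both confidence intervals can be sketched in Python as below. The function names and the sample proportions (0.30 and 0.20 with 100 cases each) are hypothetical. Z = 1.96 is the usual multiplier for a 95% interval.

```python
import math

def ci_proportion(p, n, z=1.96):
    """Confidence interval for one proportion: p +/- Z * sqrt(p(1 - p) / n)."""
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin

def ci_difference(p1, n1, p2, n2, z=1.96):
    """Confidence interval for the difference of two proportions."""
    standard_error = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    difference = p1 - p2
    return difference - z * standard_error, difference + z * standard_error

low, high = ci_proportion(0.30, 100)                  # hypothetical sample
d_low, d_high = ci_difference(0.30, 100, 0.20, 100)   # hypothetical samples
print(round(low, 3), round(high, 3))
print(round(d_low, 3), round(d_high, 3))
```

In this hypothetical case the interval for the difference includes 0, which is consistent with a difference that is not significant at the 5% level.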
CONCLUSION
The normal approximation to the binomial distribution is a useful statistical tool. It helps answer questions regarding qualitative data involving proportions where individuals are classified into two categories. With an understanding of the distribution of the binomial proportion p and of the distribution of the difference between two proportions we can perform tests of significance and calculate confidence intervals.