Statistics Problems
7/22/2019 Statistics Problems
Statistics Problems
Group:
Name                    Roll no
Anuth Siddharth         127
Abir Banerjee           116
Ninad Tatke             175
Maulik Chandarana       168
Madhurima Chatterjee    159
Tulsi Zaveri            214
Daniel Fernandes        135
Deepika Singh           136
Confidence Interval (Single Population) 4 problems
Problem: 1
A ketchup manufacturer is in the process of deciding whether to promote a new extra-spicy brand. The company's marketing-research department used a national telephone survey of 6000 households and found that the extra-spicy ketchup would be purchased by 335 of them. A much more extensive study made 2 years ago showed that 5 percent of the households would purchase the brand then. At a 2 percent significance level, should the company conclude that there is an increased interest in the extra-spicy flavor?
Solution:
n = 6000
p = 335/6000 = 0.05583
H0: p = 0.05    H1: p > 0.05    α = 0.02
The upper limit of the acceptance region is z = 2.05, or
p = pH0 + z*sqrt(pH0*qH0/n) = 0.05 + 2.05*sqrt(0.05*0.95/6000) = 0.05577
Because the observed z value = (p - pH0)/sqrt(pH0*qH0/n)
= (0.05583 - 0.05)/sqrt(0.05*0.95/6000)
= 2.07
> 2.05 (or p > 0.05577), we should reject H0. The current interest is significantly greater than the interest of 2 years ago.
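The calculation in Problem 1 can be reproduced with a short script. This is only a sketch: the function name `one_proportion_z_test` is made up for illustration, and the critical value is taken from the standard library's normal distribution rather than from a printed table.

```python
import math
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0, alpha):
    """Right-tailed z-test of H0: p = p0 against H1: p > p0 (hypothetical helper)."""
    p_hat = successes / n
    se = math.sqrt(p0 * (1 - p0) / n)         # standard error under H0
    z = (p_hat - p0) / se                     # observed z value
    z_crit = NormalDist().inv_cdf(1 - alpha)  # upper limit of the acceptance region
    return p_hat, z, z_crit, z > z_crit

# Ketchup survey: 335 of 6000 households, H0: p = 0.05, alpha = 0.02
p_hat, z, z_crit, reject = one_proportion_z_test(335, 6000, 0.05, 0.02)
print(f"p_hat={p_hat:.5f}  z={z:.2f}  z_crit={z_crit:.2f}  reject H0: {reject}")
```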
Problem: 2
Steve Cutter sells Big Blade lawn mowers in his hardware store, and he is interested in comparing the reliability of the mowers he sells with the reliability of Big Blade mowers sold nationwide. Steve knows that only 15 percent of all Big Blade mowers sold nationwide require repairs during the first year of ownership. A sample of 120 of Steve's customers revealed that exactly 22 of them required mower repairs in the first year of ownership. At the 0.02 level of significance, is there evidence that Steve's Big Blade mowers differ in reliability from those sold nationwide?
Solution:
n = 120
p = 22/120 = 0.1833
H0: p = 0.15
H1: p ≠ 0.15
α = 0.02
The limits of the acceptance region are z = ±2.33, or
p = pH0 ± z*sqrt(pH0*qH0/n) = 0.15 ± 2.33*sqrt(0.15*0.85/120)
= (0.0741, 0.2259)
Because the observed z value = (p - pH0)/sqrt(pH0*qH0/n)
= (0.1833 - 0.15)/sqrt(0.15*0.85/120)
= 1.02
< 2.33 (or 0.0741 < p < 0.2259), we do not reject H0. Steve's mowers do not differ significantly in reliability from those sold nationwide.
PROBLEM 3-
In a mobile phone manufacturing company, a random sample of 81 phones is taken, producing a sample mean of 47 and a sample standard deviation of 5.89. Construct a 90% confidence interval assuming that the number of camera phones among normal phones is normally distributed. Find the interval width.
Answer-
Here:
s = 5.89
x̄ = 47
n = 81
Formula- x̄ ± Z*s/sqrt(n)
= 47 + 1.65*5.89/sqrt(81)
(where 1.65 is the Z value for an area of .45 in the given table)
= 48.08 (upper limit)
Now,
= 47 - 1.65*5.89/sqrt(81) = 45.92 (lower limit)
Answer- We are 90% confident that the number of camera phones among normal phones will lie between 45.92 and 48.08. The interval width is 2*(1.65*5.89/sqrt(81)) = 2.16.
PROBLEM 4-
In a new food home delivery service business, there is an average loss of 12 dollars. Suppose this resulted from a random sample of 25 households, where the SD is 21 dollars. Compute a 98% confidence interval on this sample result. How wide is the interval?
Answer-
Here:
s = 21
x̄ = 12
n = 25
Formula- x̄ ± Z*s/sqrt(n)
= 12 + 2.33*21/sqrt(25)
= 21.79 (upper limit)
Now,
= 12 - 2.33*21/sqrt(25)
= 2.21 (lower limit)
Answer- We are 98% confident that the mean loss lies between 2.21 and 21.79 dollars. The interval width is 2*(2.33*21/sqrt(25)) = 19.57.
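The interval formula used in Problems 3 and 4 can be checked with a few lines of code. This is a minimal sketch: `z_confidence_interval` is a hypothetical helper name, and the Z value is passed in by hand exactly as it is read from the table in the text.

```python
import math

def z_confidence_interval(mean, sd, n, z):
    """Return (lower, upper, width) for mean +/- z * sd / sqrt(n)."""
    margin = z * sd / math.sqrt(n)
    return mean - margin, mean + margin, 2 * margin

# Problem 4 inputs: mean 12, SD 21, n = 25, Z = 2.33 for 98% confidence
lower, upper, width = z_confidence_interval(12, 21, 25, 2.33)
print(f"{lower:.2f} to {upper:.2f}, width {width:.2f}")
```

Calling the same function with (47, 5.89, 81, 1.65) gives the interval for Problem 3.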
Hypothesis Testing (Single Population) 4
1. Hypothesis Testing (Single Population)
Problem 1
An insurance company is reviewing its current policy rates. When originally setting the rates they believed that the average claim amount was $1,800. They are concerned that the true mean is actually higher than this, because they could potentially lose a lot of money. They randomly select 40 claims, and calculate a sample mean of $1,950. Assuming that the standard deviation of claims is $500, and setting α = 0.05, test to see if the insurance company should be concerned.
Solution
n = 40
x̄ = 1950    α = 0.05
σ = 500
μ0 = 1800
H0: μ ≤ 1800    H1: μ > 1800
z critical = 1.645 (one-tailed test, because H1 is μ > 1800)
z = (x̄ - μ0)/(σ/sqrt(n))
= (1950 - 1800)/(500/sqrt(40))
= 1.897
Answer
Reject H0, as 1.897 exceeds 1.645 and falls in the rejection region. The test gives statistically significant evidence that the true mean claim amount is higher than $1,800, so the insurance company should be concerned about its current policy rates.
Problem 2
A car manufacturer claimed that their car averaged at least 31 miles per gallon of gasoline. A sample of 9 cars was selected and each car was driven with one gallon of regular gasoline. The sample showed a mean of 29.43 miles with a standard deviation of 3 miles. α = 0.05. What do you conclude about the manufacturer's claim?
Solution
n = 9
x̄ = 29.43    α = 0.05
s = 3
μ0 = 31
H0: μ ≥ 31    H1: μ < 31
t critical = -1.860 (one-tailed test, 8 degrees of freedom)
t = (x̄ - μ0)/(s/sqrt(n)) = (29.43 - 31)/(3/sqrt(9))
= -1.57
Answer
We cannot reject H0. There is insufficient evidence to doubt the manufacturer's claim concerning the gas mileage.
Problem 3
General Electric has developed a new bulb whose design specifications call for a light output of 960 lumens compared to an earlier model that produced only 750 lumens. The company's data indicate that the standard deviation of light output for this type of bulb is 18.4 lumens. From a sample of 20 new bulbs, the testing committee found an average light output of 954 lumens per bulb. At a 0.05 significance level, can General Electric conclude that its new bulb is producing the specified 960 lumen output?
Solution
σ = 18.4
n = 20
x̄ = 954
μ0 = 960
α = 0.05
H0: μ = 960
H1: μ < 960    z critical = -1.65
z = (x̄ - μ0)/(σ/sqrt(n)) = (954 - 960)/(18.4/sqrt(20))
= -1.46
Answer
Do not reject H0. The new bulb is meeting specifications.
Problem 4
BSNL provides telephone services in Coimbatore. According to the company's records the average length of calls placed through the company is 11.44 minutes. The company wants to check if the mean length of the current calls is different from 11.44 minutes. A sample of 150 such calls placed through this company gave a mean length of 12.71 minutes with a standard deviation of 2.65 minutes. Can you conclude that the mean length of all current calls is different from 11.44 minutes? Use α = 0.05.
Solution
s = 2.65
n = 150
x̄ = 12.71
μ0 = 11.44
α = 0.05    H0: μ = 11.44
H1: μ ≠ 11.44
z critical = ±1.96 (two-tailed test)
z = (x̄ - μ0)/(s/sqrt(n)) = (12.71 - 11.44)/(2.65/sqrt(150))
= 5.87
Answer
Reject H0. It is concluded that the mean length of current calls is different from 11.44 minutes.
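As a cross-check of Problem 4, the test statistic and a two-tailed critical value can be computed directly. This is a sketch under the assumption that a z-test is appropriate for n = 150; the helper name `one_sample_z_test` is illustrative only.

```python
import math
from statistics import NormalDist

def one_sample_z_test(sample_mean, mu0, sd, n, alpha):
    """Two-tailed z-test of H0: mu = mu0 (hypothetical helper)."""
    z = (sample_mean - mu0) / (sd / math.sqrt(n))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    return z, z_crit, abs(z) > z_crit

# BSNL calls: mean 12.71, hypothesized mean 11.44, s = 2.65, n = 150
z, z_crit, reject = one_sample_z_test(12.71, 11.44, 2.65, 150, 0.05)
print(f"z={z:.2f}  z_crit=+/-{z_crit:.2f}  reject H0: {reject}")
```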
Confidence Interval (Two Populations) 9
Problem 1: Small Samples
Suppose that simple random samples of college freshman are selected from
two universities - 15 students from school A and 20 students from school B.
On a standardized test, the sample from school A has an average score of
1000 with a standard deviation of 100. The sample from school B has an
average score of 950 with a standard deviation of 90.
What is the 90% confidence interval for the difference in test scores at the two
schools, assuming that test scores came from normal distributions in both
schools? (Hint: Since the sample sizes are small, use a t score as the critical
value.)
(A) 50 ± 1.70 (B) 50 ± 28.49 (C) 50 ± 32.74 (D) 50 ± 55.66 (E) None of the above
Solution
The correct answer is (D). The approach that we used to solve this problem is
valid when the following conditions are met.
The sampling method must be simple random sampling. This
condition is satisfied; the problem statement says that we used simple
random sampling.
The samples must be independent. Since responses from one
sample did not affect responses from the other sample, the samples
are independent.
The sampling distribution should be approximately normally
distributed. The problem states that test scores in each population are
normally distributed, so the difference between test scores will also be
normally distributed.
Since the above requirements are satisfied, we can use the following four-
step approach to construct a confidence interval.
Identify a sample statistic. Since we are trying to estimate the
difference between population means, we choose the difference
between sample means as the sample statistic. Thus, x1 - x2 = 1000 -
950 = 50.
Select a confidence level. In this analysis, the confidence level is
defined for us in the problem. We are working with a 90% confidence
level.
Find the margin of error. The key steps for computing the margin of
error when the sampling distribution is approximately normal are
shown below.
Find standard deviation. Using the sample standard
deviations, we estimate the standard deviation of the difference
between sample means (SD).
SD = sqrt[ s1^2/n1 + s2^2/n2 ]
SD = sqrt[ (100)^2/15 + (90)^2/20 ] = sqrt(10,000/15 + 8100/20) = sqrt(666.67 + 405) = 32.74
Find critical value. The critical value is a factor used to
compute the margin of error. Because the sample sizes are
small, we express the critical value as a t score rather than a z
score. To find the critical value, we take these steps.
Compute alpha (α): α = 1 - (confidence level / 100) = 1 - 90/100 = 0.10
Find the critical probability (p*): p* = 1 - α/2 = 1 - 0.10/2 = 0.95
Find the degrees of freedom (DF):
DF = (s1^2/n1 + s2^2/n2)^2 / { [ (s1^2/n1)^2 / (n1 - 1) ] + [ (s2^2/n2)^2 / (n2 - 1) ] }
DF = (100^2/15 + 90^2/20)^2 / { [ (100^2/15)^2 / 14 ] + [ (90^2/20)^2 / 19 ] }
DF = (666.67 + 405)^2 / (31746.03 + 8632.89) = 1148469.4 / 40378.92 = 28.44
Rounding off to the nearest whole number, we conclude that there are
28 degrees of freedom.
The critical value is the t score having 28 degrees
of freedom and a cumulative probability equal to 0.95.
From the t Distribution Calculator, we find that the critical
value is 1.7.
Compute margin of error (ME): ME = critical value *
standard deviation = 1.7 * 32.74 = 55.66
Specify the confidence interval. The range of the confidence interval is
defined by the sample statistic ± margin of error. And the uncertainty is
denoted by the confidence level.
Therefore, the 90% confidence interval is -5.66 to 105.66. That is, we are 90%
confident that the true difference in population means is in the range defined
by 50 ± 55.66.
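The standard deviation, degrees of freedom, and margin of error for Problem 1 can be recomputed as a check. This is a sketch: `welch_se_df` is a hypothetical helper name, and the critical value 1.7 is simply copied from the text instead of being looked up by the code.

```python
import math

def welch_se_df(s1, n1, s2, n2):
    """Standard deviation of the difference in means and its degrees of freedom."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return se, df

# School A: s = 100, n = 15; School B: s = 90, n = 20
se, df = welch_se_df(100, 15, 90, 20)
t_crit = 1.7                # t score for 28 df, cumulative probability 0.95 (from the text)
diff = 1000 - 950           # difference between sample means
margin = t_crit * se
print(f"SD={se:.2f}  DF={df:.2f}  interval: {diff - margin:.2f} to {diff + margin:.2f}")
```

The margin printed here can differ from the hand calculation in the last decimal place because the code does not round SD before multiplying.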
Problem 2: Large Samples
The local baseball team conducts a study to find the amount spent on
refreshments at the ball park. Over the course of the season they gather
simple random samples of 50 men and 100 women. For men, the average
expenditure was $20, with a standard deviation of $3. For women, it was $15,
with a standard deviation of $2.
What is the 99% confidence interval for the spending difference between men
and women? Assume that the two populations are independent and normally
distributed.
(A) $5 ± $0.47 (B) $5 ± $1.21 (C) $5 ± $2.58 (D) $5 ± $5.00 (E) None of the above
Solution
The correct answer is (B). The approach that we used to solve this problem is
valid when the following conditions are met.
The sampling method must be simple random sampling. This condition
is satisfied; the problem statement says that we used simple random
sampling.
The samples must be independent. Again, the problem statement
satisfies this condition.
The sampling distribution should be approximately normally distributed.
The problem states that spending in each population is normally
distributed, so the difference in spending will also be normally
distributed.
Since the above requirements are satisfied, we can use the following four-
step approach to construct a confidence interval.
Identify a sample statistic. Since we are trying to estimate the
difference between population means, we choose the difference
between sample means as the sample statistic. Thus, x1 - x2 = $20 -
$15 = $5.
Select a confidence level. In this analysis, the confidence level is
defined for us in the problem. We are working with a 99% confidence
level.
Find the margin of error. The key steps for computing the margin of
error when the sampling distribution is approximately normal are
shown below.
Find standard deviation. Since we do not know the
standard deviation of the populations, we use the sample
standard deviations to estimate the standard deviation of the
difference between sample means (SD).
SD = sqrt[ s1^2/n1 + s2^2/n2 ]
SD = sqrt[ (3)^2/50 + (2)^2/100 ] = sqrt(9/50 + 4/100) = sqrt(0.18 + 0.04) = 0.47
Find critical value. The critical value is a factor used to
compute the margin of error. Because the sample sizes are
large enough, we express the critical value as a z score. To find
the critical value, we take these steps.
Compute alpha (α): α = 1 - (confidence level / 100) = 1 - 99/100 = 0.01
Find the critical probability (p*): p* = 1 - α/2 = 1 - 0.01/2 = 0.995
The critical value is the z score having a
cumulative probability equal to 0.995. From the Normal
Distribution Calculator, we find that the critical value is
2.58.
Compute margin of error (ME): ME = critical value *
standard deviation = 2.58 * 0.47 = 1.21
Specify the confidence interval. The range of the confidence interval is
defined by the sample statistic ± margin of error. And the uncertainty is
denoted by the confidence level.
Therefore, the 99% confidence interval is $3.79 to $6.21. That is, we are 99%
confident that men outspend women at the ballpark by about $5 ± $1.21.
Problem: 3
A large hotel chain is trying to decide whether to convert more of its rooms to non-smoking rooms. In a random sample of 400 guests last year, 166 had requested non-smoking rooms. This year 205 guests in a sample of 380 preferred the non-smoking rooms. Would you recommend that the hotel chain convert more rooms to non-smoking? Support your recommendation by testing the appropriate hypothesis at the 0.01 level of significance.
Solution:
n1 = 400
p1 = 0.415
n2 = 380
p2 = 0.5395
H0: p1 = p2
H1: p1
Problem:4
Two different areas of a large eastern city are being considered as sites for day care centres. Of 200 households surveyed in one section, the proportion in which mothers worked full time was 0.52. In another section, 40 percent of the 150 households surveyed had mothers working at full time jobs. At the 0.04 level of significance, is there a significant difference in the proportions of working mothers in the two areas of the city?
Solution:
n1 = 200
p1 = 0.52
n2 = 150
p2 = 0.40
H0: p1 = p2
H1: p1 ≠ p2
α = 0.04
p = (n1 p1 + n2 p2)/(n1 + n2) = (200(0.52) + 150(0.40))/(200 + 150)
= 0.4686
σ(p1-p2) = sqrt(p q (1/n1 + 1/n2)) = sqrt(0.4686(0.5314)(1/200 + 1/150)) = 0.0539
The limits of the acceptance region are z = ±2.05, or
p1 - p2 = 0 ± z σ(p1-p2) = ±2.05(0.0539) = ±0.1105
Because the observed z value = (p1 - p2)/σ(p1-p2) = (0.52 - 0.40)/0.0539
= 2.23
> 2.05, we reject H0. The proportions of working mothers in the two areas differ significantly.
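The pooled-proportion arithmetic in Problem 4 can be verified with a short function. This is only a sketch; `two_proportion_z` is a hypothetical name, and the sample sizes 200 and 150 are the ones used in the solution.

```python
import math

def two_proportion_z(p1, n1, p2, n2):
    """Pooled proportion, standard error of p1 - p2, and observed z value."""
    pooled = (n1 * p1 + n2 * p2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return pooled, se, (p1 - p2) / se

pooled, se, z = two_proportion_z(0.52, 200, 0.40, 150)
print(f"pooled p={pooled:.4f}  sigma={se:.4f}  z={z:.2f}")
```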
Problem: 5
ABC Airlines wants to find out whether to include more non-vegetarian food items in its menu. In a random sample of 600 guests last year, 200 preferred the inclusion of more non-vegetarian items. This year 300 guests in a sample of 500 preferred non-vegetarian items. Help the airline decide whether it should include more non-vegetarian items or not. Support your recommendation by testing the appropriate hypothesis at the 0.01 level of significance.
Solution:
n1 = 600
p1 = 0.333
n2 = 500
p2 = 0.600
H0: p1 = p2
H1: p1
Problem:6
In a school, two different classes are being considered for the number one rank. Of the 200 students surveyed in one section, the proportion of students who obtained full marks was 0.52. In another section, 40 percent of the 150 students surveyed obtained full marks. At the 0.04 level of significance, is there a significant difference in the proportions of students getting full marks?
Solution:
n1 = 200
p1 = 0.52
n2 = 150
p2 = 0.40
H0: p1 = p2
H1: p1 ≠ p2
α = 0.04
p = (n1 p1 + n2 p2)/(n1 + n2) = (200(0.52) + 150(0.40))/(200 + 150)
= 0.4686
σ(p1-p2) = sqrt(p q (1/n1 + 1/n2)) = sqrt(0.4686(0.5314)(1/200 + 1/150)) = 0.0539
The limits of the acceptance region are z = ±2.05, or
p1 - p2 = 0 ± z σ(p1-p2) = ±2.05(0.0539) = ±0.1105
Because the observed z value = (p1 - p2)/σ(p1-p2) = (0.52 - 0.40)/0.0539
= 2.23
> 2.05, we reject H0.
The proportions of students obtaining full marks in the two classes differ significantly.
Hypothesis Testing (Two Populations) 9
Problem 1
A machine produced 20 defective articles in a batch of 400. After overhauling, it produced 10 defectives in a batch of 300. Has the machine improved?
n1 = 400, n2 = 300
p1 = 20/400 = 0.05, p2 = 10/300 = 0.0333
a) Statement of null and alternate hypothesis:
H0: P1 = P2
H1: P1 > P2
b) Level of significance:
Let α = 0.05 be the level of significance. According to H1, we use a one-tailed test.
c) Test statistic and observed value:
z = (p1 - p2)/sqrt(P Q (1/n1 + 1/n2))
P = (20 + 10)/700 = 3/70
Q = 67/70
z0 = (0.05 - 0.0333)/sqrt((3/70)(67/70)(1/400 + 1/300))
= 1.08
d) Expected (critical) value of the statistic:
z = (p1 - p2)/sqrt(P Q (1/n1 + 1/n2))
has a standard normal distribution for n1 and n2 ≥ 30. From the normal distribution table, ze = 1.645 for α = 0.05.
e) Decision and conclusion:
z0 = 1.08, ze = 1.645. Since z0 < ze, accept H0 and conclude that the machine has not improved significantly due to overhauling.
Problem 2
A study was conducted to investigate the effectiveness of hypnotism in reducing pain. Results for randomly selected subjects are shown in the table.
The "before" value is matched to an "after" value.
TABLE 1
Subject: A B C D E F G H
Before 6.6 6.5 9.0 10.3 11.3 8.1 6.3 11.6
After 6.8 2.4 7.4 8.5 8.1 6.1 3.4 2.0
Are the sensory measurements, on average, lower after hypnotism? Test at a
5% significance level.
Corresponding "before" and "after" values form matched pairs.
TABLE 2
After Data   Before Data   Difference
6.8          6.6            0.2
2.4          6.5           -4.1
7.4          9             -1.6
8.5          10.3          -1.8
8.1          11.3          -3.2
6.1          8.1           -2
3.4          6.3           -2.9
2            11.6          -9.6
The data for the test are the differences: {0.2, -4.1, -1.6, -1.8, -3.2, -2, -2.9, -9.6}
The sample mean and sample standard deviation of the differences are: x̄d = -3.13 and sd = 2.91. Verify these values.
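One way to verify the mean and standard deviation of the differences is with Python's standard `statistics` module. This is a small sketch using the before/after values from Table 1; the variable names are illustrative.

```python
from statistics import mean, stdev

before = [6.6, 6.5, 9.0, 10.3, 11.3, 8.1, 6.3, 11.6]
after = [6.8, 2.4, 7.4, 8.5, 8.1, 6.1, 3.4, 2.0]

# Difference = after - before, rounded to one decimal like the table
differences = [round(a - b, 1) for a, b in zip(after, before)]
x_d = mean(differences)    # sample mean of the differences
s_d = stdev(differences)   # sample standard deviation (n - 1 in the denominator)
print(differences, round(x_d, 2), round(s_d, 2))
```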
Let μd be the population mean for the differences. We use the subscript d to
denote "differences."
Random Variable: X̄d = the average difference of the sensory measurements
H0: μd ≥ 0
There is no improvement. (μd is the population mean of the differences.)
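Ha: μd < 0
Compare α and the p-value: α > p-value.
Make a decision: Since α > p-value, reject H0.
This means that μd < 0: on average, the sensory measurements are lower after hypnotism.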
PROBLEM 3-
A weight reducing cream manufacturing company wanted to see whether the usage of the cream is beneficial or not. They are sceptical about the launch of the same and hence they sampled monthly usage by 6 of its users before and after using the same. At a significance level of .02, find out the change. The results are as follows-
EMPLOYEE            1    2    3    4    5    6
MONTH BEFORE USE  219  205  226  198  209  216
MONTH AFTER USE   235  186  240  203  221  205
H0: μD = 0
H1: μD ≠ 0
D̄ = ΣDi/n
SD = sqrt(Σ(Di - D̄)^2/(n - 1))
Formula to be used- t = (D̄ - μD)/(SD/sqrt(n))
Di = 16, -19, 14, 5, 12, -11
ΣDi = 17
D̄ = 17/6 = 2.83
SD = sqrt(Σ(Di - D̄)^2/(n - 1))
= sqrt(1054.83/5)
= 14.52
Here, t test is conducted,
t = (D̄ - μD)/(SD/sqrt(n))
= 2.83/(14.52/2.45)
= 0.478
This is well below the critical value of 3.365 (t table, 5 degrees of freedom, two-tailed .02 level). So, H0 cannot be rejected. Hence we can say that there is no significant change.
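The paired-difference statistics for Problem 3 can be recomputed from the table. This is a minimal sketch assuming the differences are taken as after minus before; the variable names are illustrative.

```python
import math
from statistics import mean, stdev

before = [219, 205, 226, 198, 209, 216]
after = [235, 186, 240, 203, 221, 205]

diffs = [a - b for a, b in zip(after, before)]   # Di = after - before
d_bar = mean(diffs)                              # mean difference
s_d = stdev(diffs)                               # sample SD with n - 1
t = d_bar / (s_d / math.sqrt(len(diffs)))        # paired t statistic for mu_D = 0
print(diffs, round(d_bar, 2), round(s_d, 2), round(t, 3))
```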
PROBLEM 4-
Two different telecom companies are trying to learn the usage of their free calls at night. The first company sampled 90 people and produced an average of 8.5 hours of usage and a sample SD of 1.8 hours. The second company sampled 80 people, producing an average of 7.9 hours of usage and a sample SD of 2.1 hours. At the .05 level of significance, does the 2nd company have less usage?
Here,
N1 = 90    N2 = 80
X1 = 8.5   X2 = 7.9
S1 = 1.8   S2 = 2.1
H0: μ1 - μ2 = 0
H1: μ1 - μ2 > 0
Z test = [(X1 - X2) - (μ1 - μ2)]/sqrt(S1^2/N1 + S2^2/N2)
= (8.5 - 7.9)/sqrt((1.8)^2/90 + (2.1)^2/80)
= 0.6/0.3019
= 1.99
Since 1.99 is greater than the one-tailed critical value of 1.645, H0 is rejected: the 2nd company has significantly less usage.
Problem : 5
Within a school district, students were randomly assigned to one of two Math
teachers - Mrs. Smith and Mrs. Jones. After the assignment, Mrs. Smith had
30 students, and Mrs. Jones had 25 students.
At the end of the year, each class took the same standardized test. Mrs.
Smith's students had an average test score of 78, with a standard deviation of
10; and Mrs. Jones' students had an average test score of 85, with a standard
deviation of 15.
Test the hypothesis that Mrs. Smith and Mrs. Jones are equally effective
teachers. Use a 0.10 level of significance. (Assume that student performance
is approximately normal.)
Solution:
The solution to this problem takes four steps: (1) state the hypotheses, (2)
formulate an analysis plan, (3) analyze sample data, and (4) interpret results.
We work through those steps below:
Null hypothesis: μ1 - μ2 = 0
Alternative hypothesis: μ1 - μ2 ≠ 0
For this analysis, the significance level is 0.10. Using sample data, we
will conduct a two-sample t-test of the null hypothesis.
Using sample data, we compute the standard error (SE), degrees of
freedom (DF), and the t-score test statistic (t).
SE = sqrt[ (s1^2/n1) + (s2^2/n2) ]
SE = sqrt[ (10^2/30) + (15^2/25) ] = sqrt(3.33 + 9) = sqrt(12.33) = 3.51
DF = (s1^2/n1 + s2^2/n2)^2 / { [ (s1^2/n1)^2 / (n1 - 1) ] + [ (s2^2/n2)^2 / (n2 - 1) ] }
DF = (10^2/30 + 15^2/25)^2 / { [ (10^2/30)^2 / (29) ] + [ (15^2/25)^2 / (24) ] }
DF = (3.33 + 9)^2 / { [ (3.33)^2 / (29) ] + [ (9)^2 / (24) ] } = 152.03 / (0.382 + 3.375) = 152.03/3.757 = 40.47
t = [ (x1 - x2) - d ] / SE = [ (78 - 85) - 0 ] / 3.51 = -7/3.51 = -1.99
where s1 is the standard deviation of sample 1, s2 is the standard
deviation of sample 2, n1 is the size of sample 1, n2 is the size of
sample 2, x1 is the mean of sample 1, x2 is the mean of sample 2, d is
the hypothesized difference between the population means, and SE is
the standard error.
Since we have a two-tailed test, the P-value is the probability that a t-
score having 40 degrees of freedom is more extreme than -1.99; that
is, less than -1.99 or greater than 1.99.
We use the t Distribution Calculator to find P(t < -1.99) = 0.027, and P(t
> 1.99) = 0.027. Thus, the P-value = 0.027 + 0.027 = 0.054.
Since the P-value (0.054) is less than the significance level (0.10), we cannot
accept the null hypothesis.
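The whole test for Problem 5 can be sketched in code without a statistics package. The P-value below is not read from a calculator: it is approximated by integrating the t density with Simpson's rule, which is a swapped-in numerical technique and not the method used in the text. The function names are hypothetical, and 40 degrees of freedom are used as in the solution.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """Two-tailed P-value via Simpson's rule on [0, |t|] (steps must be even)."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3          # P(0 < T < |t|)
    return 2 * (0.5 - central)

# Mrs. Smith: mean 78, s = 10, n = 30; Mrs. Jones: mean 85, s = 15, n = 25
v1, v2 = 10 ** 2 / 30, 15 ** 2 / 25
se = math.sqrt(v1 + v2)
df = (v1 + v2) ** 2 / (v1 ** 2 / 29 + v2 ** 2 / 24)
t = (78 - 85) / se
p_value = two_tailed_p(t, 40)
print(f"SE={se:.2f}  DF={df:.2f}  t={t:.2f}  P-value={p_value:.3f}")
```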
Problem 6
In a restaurant there are 2 different departments, i.e. housekeeping and
maintenance. The housekeeping department had 45 waiters and the maintenance
department had 55 waiters.
At the end of the year, each department took the same standardized test to
measure its performance. The housekeeping department had an average test
score of 65, with a standard deviation of 10; and the maintenance department had an
average test score of 75, with a standard deviation of 15.
Test the hypothesis that the housekeeping department and the maintenance
department are equally effective. Use a 0.10 level of significance. (Assume
that waiters' performance is approximately normal.)
Solution:
The solution to this problem takes four steps: (1) state the hypotheses, (2)
formulate an analysis plan, (3) analyze sample data, and (4) interpret results.
We work through those steps below:
Null hypothesis: μ1 - μ2 = 0
Alternative hypothesis: μ1 - μ2 ≠ 0
For this analysis, the significance level is 0.10. Using sample data, we
will conduct a two-sample t-test of the null hypothesis.
Using sample data, we compute the standard error (SE), degrees of
freedom (DF), and the t-score test statistic (t).
SE = sqrt[ (s1^2/n1) + (s2^2/n2) ]
SE = sqrt[ (10^2/45) + (15^2/55) ] = sqrt(2.22 + 4.09) = 2.51
DF = (s1^2/n1 + s2^2/n2)^2 / { [ (s1^2/n1)^2 / (n1 - 1) ] + [ (s2^2/n2)^2 / (n2 - 1) ] }
DF = (10^2/45 + 15^2/55)^2 / { [ (10^2/45)^2 / (44) ] + [ (15^2/55)^2 / (54) ] }
= 94.41
t = [ (x1 - x2) - d ] / SE = [ (65 - 75) - 0 ] / 2.51 = -3.98
where s1 is the standard deviation of sample 1, s2 is the standard
deviation of sample 2, n1 is the size of sample 1, n2 is the size of
sample 2, x1 is the mean of sample 1, x2 is the mean of sample 2, d is
the hypothesized difference between the population means, and SE is
the standard error.
Since we have a two-tailed test, the P-value is the probability that a t-
score having 94 degrees of freedom is more extreme than -3.98; that
is, less than -3.98 or greater than 3.98.
We use the t Distribution Calculator to find that P(t < -3.98) and P(t
> 3.98) are each less than 0.001. Thus, the P-value is less than 0.002.
Since the P-value is less than the significance level (0.10), we cannot
accept the null hypothesis.
Problem 7
A credit-insurance organization has developed a new high-tech method of training new sales personnel. The company sampled 16 employees who were trained the original way and found average daily sales to be $688 and the sample standard deviation was $32.63. They also sampled 11 employees who were trained using the new method and found average daily sales to be $706 and the sample standard deviation was $24.84. At alpha = 0.05, can the company conclude that average daily sales have increased under the new plan?
Solution
n1 = 16    n2 = 11    n = n1 + n2 = 27
x̄1 = 688    x̄2 = 706
s1 = 32.63    s2 = 24.84
α = 0.05    H0: μ1 - μ2 = 0
H1: μ1 - μ2 < 0    t critical = -1.708 (25 degrees of freedom)
sp^2 = [(n1 - 1)s1^2 + (n2 - 1)s2^2]/(n1 + n2 - 2)
= [15(32.63)^2 + 10(24.84)^2]/25
= 885.64
t = (x̄1 - x̄2)/sqrt(sp^2 (1/n1 + 1/n2))
= (688 - 706)/sqrt(885.64(1/16 + 1/11))
= -1.544
Answer
Do not reject H0. Average daily sales have not increased significantly.
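The pooled-variance figure of 885.64 in Problem 7 and the resulting t statistic can be reproduced as follows. This is a sketch that assumes the equal-variance (pooled) two-sample t-test, which is what the 885.64 value corresponds to; `pooled_t` is a hypothetical helper name.

```python
import math

def pooled_t(x1, s1, n1, x2, s2, n2):
    """Pooled variance and t statistic for H0: mu1 - mu2 = 0."""
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    t = (x1 - x2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return sp2, t

# Original training: mean 688, s = 32.63, n = 16; new method: mean 706, s = 24.84, n = 11
sp2, t = pooled_t(688, 32.63, 16, 706, 24.84, 11)
t_crit = -1.708                      # table value quoted in the text
print(f"sp^2={sp2:.2f}  t={t:.3f}  reject H0: {t < t_crit}")
```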
Problem 8
Block Enterprises, a manufacturer of chips for computers, is in the process of deciding whether to replace its current semi automated assembly line with a fully automated assembly line. Block has gathered some preliminary test data about hourly chip production, which is summarized in the following table, and it would like to know whether it should upgrade its assembly line. At α = 0.02, state and test the hypothesis to help Block decide.
                        x̄     s     n
Semi automatic Line    198    32   150
Automatic Line         206    29   200
Solution
x̄1 = 198    s1 = 32    n1 = 150
x̄2 = 206    s2 = 29    n2 = 200    α = 0.02
H0: μ1 - μ2 = 0    H1: μ1 - μ2 < 0
z critical = -2.06
z = (x̄1 - x̄2)/sqrt(s1^2/n1 + s2^2/n2)
= (198 - 206)/sqrt((32)^2/150 + (29)^2/200)
= -2.41
Answer
H0 is rejected. Block should upgrade to an automatic line.
One Way Anova 1
Problem:
A quality control supervisor for an automobile manufacturer is concerned with uniformity in the number of defects in cars coming off the assembly line. If one assembly line has significantly more variability in the number of defects, then changes have to be made. The supervisor has collected the following data:
Number of Defects
               Assembly Line A   Assembly Line B
Mean                 10                11
Variance              9                25
Sample Size          20                16
Does Assembly line B have significantly more variability in the number of defects? Test at the 0.05 significance level.
Solution:
H0: σB^2 = σA^2
H1: σB^2 > σA^2
Observed F = sB^2/sA^2
= 25/9 = 2.778
Fcrit = F0.05(15,19)
= 2.23
Since 2.778 > 2.23, we reject H0; assembly line B does have significantly more variability in the number of defects, so some changes have to be made.
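The variance-ratio test above amounts to a single division and a comparison. In this sketch the critical value 2.23 is copied from the text rather than computed, and the variable names are illustrative.

```python
# Sample variances and sizes from the table
var_a, n_a = 9, 20     # Assembly Line A
var_b, n_b = 25, 16    # Assembly Line B

f_observed = var_b / var_a           # larger suspected variance in the numerator
dof = (n_b - 1, n_a - 1)             # (numerator df, denominator df)
f_critical = 2.23                    # F0.05(15, 19), as quoted in the text
reject_h0 = f_observed > f_critical
print(f"F={f_observed:.3f}  df={dof}  reject H0: {reject_h0}")
```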
Chi Sq Test 3
Chi-Square Goodness of Fit Test
Problem
English test grade distributions have changed from last year, with grade B's somewhat lower. Is this significant?
English test results   Grade A  Grade B  Grade C  Grade D  Grade E
This year, O              23       32       20       15       10
Last year                 25       20       15       25       10
Solution:
H0: the grade distribution has not changed from last year.
The table below shows the calculation. First, the expected values are created by scaling last year's results to be equivalent to this year. Then the test statistic is calculated as SUM((O - E)^2/E).
English test results   Grade A  Grade B  Grade C  Grade D  Grade E   Sum
This year, O              23       32       20       15       10     100
Last year                 25       20       15       25       10      95
Scaled last year, E       26       21       16       26       11     100
(O - E)                  -3.3     10.9      4.2    -11.3     -0.5
(O - E)^2                11.0    119.8     17.7    128.0      0.3
(O - E)^2/E               0.4      5.7      1.1      4.9      0.0    12.1
Chi-square is found to be 12.1 and the degrees of freedom are (5-1) = 4 (there are five possible grades). Looking this up in the Chi Square table shows the probability is between 5% (9.49) and 1% (13.28), so H0 is rejected at the 5% level and a significant change can be claimed.
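The goodness-of-fit calculation can be reproduced in a few lines. This sketch follows the scaling step described in the solution; the variable names are illustrative, and the expected counts are kept unrounded, so individual cells differ slightly from the rounded table entries.

```python
observed = [23, 32, 20, 15, 10]      # this year, grades A to E
last_year = [25, 20, 15, 25, 10]

# Scale last year's counts so that they sum to this year's total
scale = sum(observed) / sum(last_year)
expected = [count * scale for count in last_year]

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
dof = len(observed) - 1
print(f"chi-square={chi_square:.1f}  degrees of freedom={dof}")
```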
Chi-Square Test of Independence
Problem
A year group in school chooses between drama and history as below. Is there any difference between boys' and girls' choices?
          Chose drama   Chose history
Boys          43             55
Girls         52             54
Solution:
Observed
          Chose drama   Chose history   Total
Boys          43             55            98
Girls         52             54           106
Total         95            109           204
Expected = (row tot * col tot)/overall tot
          Chose drama   Chose history   Total
Boys         45.6           52.4           98
Girls        49.4           56.6          106
Total         95            109           204
(observed - expected)^2/expected
          Chose drama   Chose history   Total
Boys          0.2            0.1
Girls         0.1            0.1
Total                                     0.55
Chi-square is 0.55. There are (2-1)*(2-1) = 1 degree of freedom. Checking the Chi Square table shows 0.55 is between 0.004 (the 95% value) and 3.84 (the 5% critical value). Because 0.55 is well below 3.84, the hypothesis of independence cannot be rejected: the data show no significant difference between boys' and girls' choices.
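The expected counts and the chi-square total can be recomputed with a short Python sketch. The names are illustrative, and the 5% critical value 3.84 is taken from the solution rather than computed.

```python
observed = [[43, 55],   # boys: drama, history
            [52, 54]]   # girls: drama, history

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell = (row total * column total) / overall total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi_square = sum(
    (o - e) ** 2 / e
    for o_row, e_row in zip(observed, expected)
    for o, e in zip(o_row, e_row)
)
dof = (len(row_totals) - 1) * (len(col_totals) - 1)

crit_5pct = 3.84  # 1 degree of freedom, as quoted in the solution
print(f"chi-square = {chi_square:.2f}, dof = {dof}, reject H0: {chi_square > crit_5pct}")
```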
Chi-Square Test Equality of Proportions
Problem
A wholesale merchant received a shipment of goods which is claimed to contain 5% defective items. The merchant decided to verify this. He drew a sample of 15 items and found 3 defective items. Test the claim. Use alpha = 0.05.
Solution
Let p = the proportion of defectives in the whole shipment.
Claimed value: p0 = 5/100 = 0.05
Sample: n = 15, observed proportion = x/n = 3/15 = 0.2
a. Statement of null and alternate hypothesis:
H0: p = 0.05
H1: p > 0.05
b. Level of significance:
Given alpha = 0.05. Because H1 is p > 0.05, we use a right-tailed test.
c. Test statistic and observed value:
x0 = 3, the number of defectives in the sample
d. Probability, under H0, of a value at least as large as the one observed:
P(X >= 3) = SUM(x = 3 to 15) C(15, x) * (0.05)^x * (0.95)^(15 - x), with p0 = 0.05
= 0.0362
e. Decision & Conclusion:
x0 = 3 lies in the rejection region, since 0.0362 is less than 0.05. Therefore we reject H0 and conclude that the shipment contains more than 5% defective items, and hence the merchant is advised to reject the shipment.
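The tail probability in step d is a plain binomial sum, so it can be checked with Python's standard library. The helper name binomial_upper_tail is made up for this sketch.

```python
from math import comb

def binomial_upper_tail(n, p, k):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k, n + 1))

n, p0, x0 = 15, 0.05, 3   # sample size, claimed defect rate, defectives found
alpha = 0.05

p_value = binomial_upper_tail(n, p0, x0)
print(f"P(X >= {x0}) = {p_value:.4f}, reject H0: {p_value < alpha}")
```

With only 15 items and p0 = 0.05 the expected number of defectives is below 1, so summing the exact binomial probabilities is safer here than a normal approximation.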
Binomial Dist 1
1. Binomial Distribution
Problem
Harley Davidson, director of quality control for the Kyoto Motor Company, is conducting his monthly spot check of automatic transmissions. In this procedure, 10 transmissions are removed from the pool of components and are checked for manufacturing defects. Historically, only 2 percent of the transmissions have flaws.
a. What is the probability that Harley's sample contains more than two transmissions with manufacturing flaws?
b. What is the probability that none of the selected transmissions has any manufacturing flaws?
Solution
n = 10
p = 0.02
1 - p = 0.98
Formula: P(X = x) = C(n, x) * p^x * (1 - p)^(n - x)
P(X = 0) = C(10, 0) * (0.02)^0 * (0.98)^10
= 0.8170
P(X = 1) = C(10, 1) * (0.02)^1 * (0.98)^9
= 0.16674
P(X = 2) = C(10, 2) * (0.02)^2 * (0.98)^8
= 0.0153
P(X > 2) = 1 - P(X <= 2)
= 1 - [P(X = 0) + P(X = 1) + P(X = 2)]
= 1 - [0.8170 + 0.16674 + 0.0153]
= 1 - [0.9991]
= 0.0009
Answer
a. The probability that Harley's sample contains more than two transmissions with manufacturing flaws is 0.0009.
b. The probability that none of the selected transmissions has any manufacturing flaws is 0.8170.
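Both answers follow directly from the binomial formula and can be verified with a small Python sketch. The function name binomial_pmf is illustrative.

```python
from math import comb

def binomial_pmf(n, p, x):
    """Return P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.02   # transmissions sampled, historical flaw rate

p_none = binomial_pmf(n, p, 0)
p_at_most_two = sum(binomial_pmf(n, p, x) for x in range(3))
p_more_than_two = 1 - p_at_most_two

print(f"P(X = 0) = {p_none:.4f}")
print(f"P(X > 2) = {p_more_than_two:.4f}")
```

Working from the complement needs only three terms instead of the eight that a direct sum over X = 3 to 10 would require.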
Poisson Dist 1
Poisson Distribution
Problem
Southwestern Electronics has developed a new calculator that performs a series of functions not yet performed by any other calculator. The marketing department is planning to demonstrate this calculator to a group of potential customers, but is worried about some initial problems, which have resulted in 4 percent of the new calculators developing mathematical inconsistencies. The marketing VP is planning on randomly selecting a group of calculators for this demonstration and is worried about the chances of selecting a calculator that could start malfunctioning. He believes that whether or not a calculator functions is a Bernoulli process and he is convinced that the probability of malfunction is really about 0.04.
Assuming that the VP selects exactly 50 calculators to use in the demonstration, and using the Poisson distribution as an approximation of the binomial, what is the chance of getting at least three calculators that malfunction?
No calculators malfunctioning?
Solution
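n = 50
p = 0.04
lambda = np = 2
e^(-lambda) = e^(-2) = 0.13533
Formula: P(X = x) = lambda^x * e^(-lambda) / x!
P(X = 0) = 2^0 * e^(-2) / 0! = 0.13533
P(X = 1) = 2^1 * e^(-2) / 1! = 0.27066
P(X = 2) = 2^2 * e^(-2) / 2! = 0.27066
P(X = 3) = 2^3 * e^(-2) / 3! = 0.18044
P(X >= 3) = 1 - P(X <= 2)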
= 1 - [P(X = 0) + P(X = 1) + P(X = 2)]
= 1 - [0.13533 + 0.27066 + 0.27066]
= 0.32335
Answer
The chance of getting at least three calculators that malfunction is 32.33%.
The chance of no calculators malfunctioning is 13.53%.
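The Poisson approximation can be reproduced with Python's math module. This is a sketch with an illustrative function name; the unrounded result may differ from the hand calculation in the fifth decimal place because the worked solution rounds each term first.

```python
from math import exp, factorial

def poisson_pmf(lam, x):
    """Return P(X = x) for X ~ Poisson(lam)."""
    return lam**x * exp(-lam) / factorial(x)

n, p = 50, 0.04
lam = n * p   # mean of the approximating Poisson distribution

p_none = poisson_pmf(lam, 0)
p_at_least_three = 1 - sum(poisson_pmf(lam, x) for x in range(3))

print(f"lambda = {lam:.0f}")
print(f"P(X = 0)  = {p_none:.4f}")
print(f"P(X >= 3) = {p_at_least_three:.4f}")
```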
Normal Dis 1
Normal Distribution
Problem
Regulations concerning the maximum number of people who can occupy a lift are to be set. The total weight of 8 people chosen at random follows a normal distribution with a mean of 550kg and a standard deviation of 150kg. What is the probability that the total weight of 8 people exceeds 600kg?
Solution
The mean is 550kg and we are interested in the area that is greater than 600kg.
z = ( x - m ) / s
Here x = 600kg
m, the mean = 550kg
s, the standard deviation = 150kg
z = ( 600 - 550 ) / 150
z = 50 / 150
z = 0.33
Looking in the table, go down to the row for z = 0.3 and across to the column under 0.03.
The number in the table is the tail area for z = 0.33, which is 0.3707.
This is the probability that the weight will exceed 600kg.
Therefore, the probability that the total weight of 8 people exceeds 600kg is 0.37, correct to 2 figures.
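The table lookup can be replaced by a direct calculation, since the standard normal upper tail is P(Z > z) = 0.5 * erfc(z / sqrt(2)). The sketch below uses only Python's math module; the function name upper_tail is illustrative, and z is rounded to two decimals to mirror the table lookup above.

```python
from math import erfc, sqrt

def upper_tail(z):
    """Return P(Z > z) for a standard normal variable Z."""
    return 0.5 * erfc(z / sqrt(2))

mean, sd, limit = 550.0, 150.0, 600.0   # kg

# Round z to two decimals, as a printed z table would require.
z = round((limit - mean) / sd, 2)
p_exceeds = upper_tail(z)

print(f"z = {z:.2f}, P(total weight > {limit:.0f} kg) = {p_exceeds:.4f}")
```

Without the rounding step, z = 1/3 gives a slightly smaller tail area, but the answer is still 0.37 to 2 figures.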