Statistics Problems

7/22/2019

Group:

Name                   Roll no.

Anuth Siddharth        127
Abir Banerjee          116
Ninad Tatke            175
Maulik Chandarana      168
Madhurima Chatterjee   159
Tulsi Zaveri           214
Daniel Fernandes       135
Deepika Singh          136


Confidence Interval (Single Population) 4 problems

Problem: 1

A ketchup manufacturer is in the process of deciding whether to promote a new extra-spicy brand. The company's marketing-research department used a national telephone survey of 6000 households and found that the extra-spicy ketchup would be purchased by 335 of them. A much more extensive study made 2 years ago showed that 5 percent of the households would purchase the brand then. At a 2 percent significance level, should the company conclude that there is an increased interest in the extra-spicy flavor?

Solution:

n = 6000

p̄ = 335/6000 = 0.05583

H0: p = 0.05    H1: p > 0.05    α = 0.02

The upper limit of the acceptance region is z = 2.05, or

p̄ = pH0 + z*sqrt(pH0*qH0/n) = 0.05 + 2.05*sqrt((0.05*0.95)/6000) = 0.05577

Because the observed z value = (p̄ - pH0)/sqrt(pH0*qH0/n)

= (0.05583 - 0.05)/sqrt((0.05*0.95)/6000)

= 2.07

> 2.05 (or p̄ > 0.05577), we should reject H0. The current interest is significantly greater than the interest of 2 years ago.
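The calculation above can be checked with a few lines of Python. This is a minimal sketch using only the standard library. The function name `proportion_z_test` is an illustrative assumption, and the critical value is computed rather than read from a table, so it shows 2.05 only after rounding.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(successes, n, p0, alpha):
    """Right-tailed z test of H0: p = p0 against H1: p > p0."""
    p_bar = successes / n
    se = sqrt(p0 * (1 - p0) / n)              # standard error under H0
    z = (p_bar - p0) / se                     # observed z value
    z_crit = NormalDist().inv_cdf(1 - alpha)  # upper limit of acceptance region
    return z, z_crit, z > z_crit

z, z_crit, reject = proportion_z_test(335, 6000, 0.05, 0.02)
print(f"z = {z:.2f}, critical z = {z_crit:.2f}, reject H0: {reject}")
```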


Problem: 2

Steve Cutter sells Big Blade lawn mowers in his hardware store, and he is interested in comparing the reliability of the mowers he sells with the reliability of Big Blade mowers sold nationwide. Steve knows that only 15 percent of all Big Blade mowers sold nationwide require repairs during the first year of ownership. A sample of 120 of Steve's customers revealed that exactly 22 of them required mower repairs in the first year of ownership. At the 0.02 level of significance, is there evidence that Steve's Big Blade mowers differ in reliability from those sold nationwide?

Solution:

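n = 120

p̄ = 22/120 = 0.1833

H0: p = 0.15
H1: p ≠ 0.15

α = 0.02

The limits of the acceptance region are z = ±2.33, or

p̄ = pH0 ± z*sqrt(pH0*qH0/n) = 0.15 ± 2.33*sqrt((0.15*0.85)/120)

= (0.0741, 0.2259)

Because the observed z value = (p̄ - pH0)/sqrt(pH0*qH0/n)

= (0.1833 - 0.15)/sqrt((0.15*0.85)/120)

= 1.02

< 2.33 (or p̄ = 0.1833 lies inside the acceptance region), we should not reject H0. Steve's mowers do not differ significantly in reliability from those sold nationwide.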


PROBLEM 3-

In a mobile phone manufacturing company, a random sample of 81 phones is taken, producing a sample mean of 47 and a sample standard deviation of 5.89. Construct a 90% confidence interval assuming that the number of camera phones among normal phones is evenly distributed. Find the interval width.

Answer-

Here:

s = 5.89

x̄ = 47

n = 81

Formula- x̄ ± z*s/sqrt(n)

= 47 + 1.65*5.89/sqrt(81)

(where 1.65 is the z value that the given table lists for an area of 0.45)

= 48.08 (upper limit)

Now,

= 47 - 1.65*5.89/sqrt(81) = 45.92 (lower limit)

Answer- We are 90% confident that the mean number of camera phones among normal phones lies between 45.92 and 48.08. The interval width is 48.08 - 45.92 = 2.16.
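A minimal sketch of the interval formula x̄ ± z*s/sqrt(n) in Python follows. The helper name `mean_confidence_interval` is an assumption made for illustration, and the z value is computed (about 1.645) instead of taken from the table, so the limits agree with the hand calculation only to two decimals.

```python
from math import sqrt
from statistics import NormalDist

def mean_confidence_interval(mean, sd, n, confidence):
    """Large-sample z interval for a population mean."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    margin = z * sd / sqrt(n)
    return mean - margin, mean + margin

low, high = mean_confidence_interval(47, 5.89, 81, 0.90)
print(f"90% interval: {low:.2f} to {high:.2f}, width {high - low:.2f}")
```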


PROBLEM 4-

In a new food home delivery service business, there is a loss of 12 dollars. Suppose this resulted from a random sample of 25 households, where the SD is $21. Compute a 98% confidence interval on this sample result. How wide is the interval?

Answer-

Here:

s = 21

x̄ = 12

n = 25

Formula- x̄ ± z*s/sqrt(n)

= 12 + 2.33*21/sqrt(25)

= 21.79 (upper limit)

Now,

= 12 - 2.33*21/sqrt(25)

= 2.21 (lower limit)

Answer- We are 98% confident that the mean loss lies between $2.21 and $21.79. The interval is 2*9.786 = 19.57 dollars wide.


Hypothesis Testing (Single Population) 4

Problem 1

An insurance company is reviewing its current policy rates. When originally setting the rates they believed that the average claim amount was $1,800. They are concerned that the true mean is actually higher than this, because they could potentially lose a lot of money. They randomly select 40 claims, and calculate a sample mean of $1,950. Assuming that the standard deviation of claims is $500, and setting α = 0.05, test to see if the insurance company should be concerned.

Solution

n = 40

x̄ = 1950    σ = 500    α = 0.05

μ0 = 1800

H0: μ ≤ 1800    H1: μ > 1800

Critical value: z = 1.645 (one-tailed test, since H1 is μ > 1800)

z = (x̄ - μ0)/(σ/sqrt(n)) = (1950 - 1800)/(500/sqrt(40))

= 1.897

Answer

Reject H0, as 1.897 > 1.645 falls in the rejection region. The sample gives statistically significant evidence that the mean claim exceeds $1,800, so the insurance company should be concerned about its current policies.


Problem 2

A car manufacturer claimed that their car averaged at least 31 miles per gallon of gasoline. A sample of 9 cars was selected and each car was driven with one gallon of regular gasoline. The sample showed a mean of 29.43 miles with a standard deviation of 3 miles. α = 0.05. What do you conclude about the manufacturer's claim?

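Solution

n = 9

x̄ = 29.43    s = 3    α = 0.05

μ0 = 31

H0: μ ≥ 31    H1: μ < 31

Critical value: t = -1.860 (8 degrees of freedom)

t = (x̄ - μ0)/(s/sqrt(n)) = (29.43 - 31)/(3/sqrt(9))

= -1.57

Answer

We cannot reject H0. There is insufficient evidence to doubt the manufacturer's claim concerning the gas mileage.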


Problem 3

General Electric has developed a new bulb whose design specifications call for a light output of 960 lumens compared to an earlier model that produced only 750 lumens. The company's data indicate that the standard deviation of light output for this type of bulb is 18.4 lumens. From a sample of 20 new bulbs, the testing committee found an average light output of 954 lumens per bulb. At a 0.05 significance level, can General Electric conclude that its new bulb is producing the specified 960 lumen output?

Solution

σ = 18.4

n = 20

x̄ = 954

μ0 = 960

α = 0.05

H0: μ = 960    H1: μ < 960

Critical value: z = -1.65

z = (x̄ - μ0)/(σ/sqrt(n)) = (954 - 960)/(18.4/sqrt(20)) = -1.46

Answer

Do not reject H0. The new bulb is meeting specifications.


Problem 4

BSNL provides telephone services in Coimbatore. According to the company's records the average length of calls placed through the company is 11.44 minutes. The company wants to check if the mean length of the current calls is different from 11.44 minutes. A sample of 150 such calls placed through this company gave a mean length of 12.71 minutes with a standard deviation of 2.65 minutes. Can you conclude that the mean length of all current calls is different from 11.44 minutes? Use α = 0.05.

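Solution

s = 2.65

n = 150

x̄ = 12.71

μ0 = 11.44

α = 0.05

H0: μ = 11.44    H1: μ ≠ 11.44

Critical values: z = ±1.96 (two-tailed test)

z = (x̄ - μ0)/(s/sqrt(n)) = (12.71 - 11.44)/(2.65/sqrt(150))

= 5.87

Answer

Reject H0. It is concluded that the mean length of current calls is different from 11.44 minutes.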
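All four single-population problems use the same standardized statistic, (sample mean - hypothesized mean)/(standard deviation/sqrt(n)). The sketch below recomputes it for each problem. The function name and the dictionary labels are assumptions introduced for illustration. Critical values are not computed here, because the t value for Problem 2 comes from a table.

```python
from math import sqrt

def standardized_statistic(sample_mean, mu0, sd, n):
    """Return (sample_mean - mu0) / (sd / sqrt(n))."""
    return (sample_mean - mu0) / (sd / sqrt(n))

# (sample mean, hypothesized mean, standard deviation, sample size)
problems = {
    "insurance claims": (1950, 1800, 500, 40),
    "car mileage": (29.43, 31, 3, 9),
    "bulb output": (954, 960, 18.4, 20),
    "call length": (12.71, 11.44, 2.65, 150),
}

results = {name: standardized_statistic(*args) for name, args in problems.items()}
for name, value in results.items():
    print(f"{name}: {value:.3f}")
```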



Confidence Interval (Two Populations) 9

Problem 1: Small Samples

Suppose that simple random samples of college freshmen are selected from two universities - 15 students from school A and 20 students from school B. On a standardized test, the sample from school A has an average score of 1000 with a standard deviation of 100. The sample from school B has an average score of 950 with a standard deviation of 90.

What is the 90% confidence interval for the difference in test scores at the two schools, assuming that test scores came from normal distributions in both schools? (Hint: Since the sample sizes are small, use a t score as the critical value.)

(A) 50 ± 1.70 (B) 50 ± 28.49 (C) 50 ± 32.74 (D) 50 ± 55.66 (E) None of the above

Solution

The correct answer is (D). The approach that we used to solve this problem is

valid when the following conditions are met.

The sampling method must be simple random sampling. This

condition is satisfied; the problem statement says that we used simple

random sampling.

The samples must be independent. Since responses from one

sample did not affect responses from the other sample, the samples

are independent.

The sampling distribution should be approximately normally

distributed. The problem states that test scores in each population are

normally distributed, so the difference between test scores will also be

normally distributed.

Since the above requirements are satisfied, we can use the following four-

step approach to construct a confidence interval.

Identify a sample statistic. Since we are trying to estimate the difference between population means, we choose the difference between sample means as the sample statistic. Thus, x1 - x2 = 1000 - 950 = 50.

Select a confidence level. In this analysis, the confidence level is defined for us in the problem. We are working with a 90% confidence level.

Find the margin of error. The key steps for computing the margin of error when the sampling distribution is approximately normal are shown below.

Find standard deviation. Using the sample standard deviations, we estimate the standard deviation of the difference between sample means (SD).

SD = sqrt [ s1^2 / n1 + s2^2 / n2 ]
SD = sqrt [ (100)^2 / 15 + (90)^2 / 20 ]
SD = sqrt (10,000/15 + 8100/20) = sqrt (666.67 + 405) = 32.74

Find critical value. The critical value is a factor used to compute the margin of error. Because the sample sizes are small, we express the critical value as a t score rather than a z score. To find the critical value, we take these steps.

Compute alpha (α): α = 1 - (confidence level / 100) = 1 - 90/100 = 0.10

Find the critical probability (p*): p* = 1 - α/2 = 1 - 0.10/2 = 0.95

Find the degrees of freedom (df):

DF = (s1^2/n1 + s2^2/n2)^2 / { [ (s1^2 / n1)^2 / (n1 - 1) ] + [ (s2^2 / n2)^2 / (n2 - 1) ] }
DF = (100^2/15 + 90^2/20)^2 / { [ (100^2/15)^2 / 14 ] + [ (90^2/20)^2 / 19 ] }
DF = (666.67 + 405)^2 / (31746.03 + 8632.89) = 1148469.4 / 40378.92 = 28.44

Rounding off to the nearest whole number, we conclude that there are 28 degrees of freedom.

The critical value is the t score having 28 degrees of freedom and a cumulative probability equal to 0.95. From the t Distribution Calculator, we find that the critical value is 1.7.

Compute margin of error (ME): ME = critical value * standard deviation = 1.7 * 32.74 = 55.66

Specify the confidence interval. The range of the confidence interval is defined by the sample statistic ± margin of error, and the uncertainty is denoted by the confidence level.

Therefore, the 90% confidence interval is -5.66 to 105.66. That is, we are 90% confident that the true difference in population means is in the range defined by 50 ± 55.66.
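The standard deviation, degrees of freedom, and margin of error for this small-sample interval can be reproduced as follows. This is a sketch under stated assumptions: the helper name `difference_interval` is hypothetical, and the critical value 1.7 is passed in as quoted from the solution rather than computed, since the Python standard library has no t distribution.

```python
from math import sqrt

def difference_interval(mean1, s1, n1, mean2, s2, n2, t_crit):
    """Interval for mu1 - mu2 using unpooled variances."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    sd = sqrt(v1 + v2)                       # SD of the difference in sample means
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    margin = t_crit * sd
    diff = mean1 - mean2
    return sd, df, (diff - margin, diff + margin)

# t_crit = 1.7 is the value quoted in the solution for 28 degrees of freedom.
sd, df, (low, high) = difference_interval(1000, 100, 15, 950, 90, 20, t_crit=1.7)
print(f"SD = {sd:.2f}, DF = {df:.2f}, interval = {low:.2f} to {high:.2f}")
```

Because the code keeps SD unrounded, its margin differs from the hand-computed 55.66 in the second decimal place.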


13/40

Problem 2: Large Samples

The local baseball team conducts a study to find the amount spent on

refreshments at the ball park. Over the course of the season they gather

simple random samples of 50 men and 100 women. For men, the average

expenditure was $20, with a standard deviation of $3. For women, it was $15,

with a standard deviation of $2.

What is the 99% confidence interval for the spending difference between men

and women? Assume that the two populations are independent and normally

distributed.

(A) $5 ± $0.47 (B) $5 ± $1.21 (C) $5 ± $2.58 (D) $5 ± $5.00 (E) None of the above

Solution

The correct answer is (B). The approach that we used to solve this problem is

valid when the following conditions are met.

The sampling method must be simple random sampling. This condition

is satisfied; the problem statement says that we used simple random

sampling.

The samples must be independent. Again, the problem statement

satisfies this condition.

The sampling distribution should be approximately normally distributed. The problem states that spending in each population is normally distributed, so the difference between sample means will also be normally distributed.

Since the above requirements are satisfied, we can use the following four-

step approach to construct a confidence interval.

Identify a sample statistic. Since we are trying to estimate the difference between population means, we choose the difference between sample means as the sample statistic. Thus, x1 - x2 = $20 - $15 = $5.

Select a confidence level. In this analysis, the confidence level is defined for us in the problem. We are working with a 99% confidence level.

Find the margin of error. The key steps for computing the margin of error when the sampling distribution is approximately normal are shown below.

Find standard deviation. Since we do not know the standard deviation of the populations, we use the sample standard deviations to estimate the standard deviation of the difference between sample means (SD).

SD = sqrt [ s1^2 / n1 + s2^2 / n2 ]
SD = sqrt [ (3)^2 / 50 + (2)^2 / 100 ] = sqrt (9/50 + 4/100) = sqrt (0.18 + 0.04) = 0.47

Find critical value. The critical value is a factor used to compute the margin of error. Because the sample sizes are large enough, we express the critical value as a z score. To find the critical value, we take these steps.

Compute alpha (α): α = 1 - (confidence level / 100) = 1 - 99/100 = 0.01

Find the critical probability (p*): p* = 1 - α/2 = 1 - 0.01/2 = 0.995

The critical value is the z score having a cumulative probability equal to 0.995. From the Normal Distribution Calculator, we find that the critical value is 2.58.

Compute margin of error (ME): ME = critical value * standard deviation = 2.58 * 0.47 = 1.21

Specify the confidence interval. The range of the confidence interval is defined by the sample statistic ± margin of error, and the uncertainty is denoted by the confidence level.

Therefore, the 99% confidence interval is $3.79 to $6.21. That is, we are 99% confident that men outspend women at the ballpark by about $5 ± $1.21.
Problem: 3

A large hotel chain is trying to decide whether to convert more of its rooms to non-smoking rooms. In a random sample of 400 guests last year, 166 had requested non-smoking rooms. This year 205 guests in a sample of 380 preferred the non-smoking rooms. Would you recommend that the hotel chain convert more rooms to non-smoking? Support your recommendation by testing the appropriate hypothesis at the 0.01 level of significance.

Solution:

n1 = 400

p1 = 166/400 = 0.415

n2 = 380

p2 = 205/380 = 0.5395

H0: p1 = p2

H1: p1 < p2


Problem: 4

Two different areas of a large eastern city are being considered as sites for day care centres. Of 200 households surveyed in one section, the proportion in which mothers worked full time was 0.52. In another section, 40 percent of the 150 households surveyed had mothers working at full-time jobs. At the 0.04 level of significance, is there a significant difference in the proportions of working mothers in the two areas of the city?

Solution:

n1 = 200

p1 = 0.52

n2 = 150

p2 = 0.40

H0: p1 = p2

H1: p1 ≠ p2

α = 0.04

p̂ = (n1 p1 + n2 p2)/(n1 + n2) = (200(0.52) + 150(0.40))/(200 + 150)

= 0.4686

σ(p1 - p2) = sqrt(p̂ q̂ (1/n1 + 1/n2)) = sqrt(0.4686(0.5314)(1/200 + 1/150)) = 0.0539

The limits of the acceptance region are z = ±2.05, or

p1 - p2 = 0 ± z*σ(p1 - p2) = ±2.05(0.0539) = ±0.1105

Because the observed z value = (p1 - p2)/σ(p1 - p2) = (0.52 - 0.40)/0.0539

= 2.23

> 2.05, we reject H0. The proportions of working mothers in the two areas differ significantly.


Problem: 5

ABC Airlines wants to find out whether to include more non-vegetarian food items in its menu. In a random sample of 600 guests last year, 200 favoured the inclusion of more non-vegetarian items. This year 300 guests in a sample of 500 preferred non-vegetarian items. Help the airline decide whether it should include more non-vegetarian items. Support your recommendation by testing the appropriate hypothesis at the 0.01 level of significance.

Solution:

n1 = 600

p1 = 200/600 = 0.333

n2 = 500

p2 = 300/500 = 0.600

H0: p1 = p2

H1: p1 < p2


Problem: 6

In a school, two different classes are being considered for the number one rank. Of the 200 students surveyed in one section, the proportion of students who obtained full marks was 0.52. In another section, 40 percent of the 150 students surveyed obtained full marks. At the 0.04 level of significance, is there a significant difference in the proportions of students getting full marks?

Solution:

n1 = 200

p1 = 0.52

n2 = 150

p2 = 0.40

H0: p1 = p2

H1: p1 ≠ p2

α = 0.04

p̂ = (n1 p1 + n2 p2)/(n1 + n2) = (200(0.52) + 150(0.40))/(200 + 150)

= 0.4686

σ(p1 - p2) = sqrt(p̂ q̂ (1/n1 + 1/n2)) = sqrt(0.4686(0.5314)(1/200 + 1/150)) = 0.0539

The limits of the acceptance region are z = ±2.05, or

p1 - p2 = 0 ± z*σ(p1 - p2) = ±2.05(0.0539) = ±0.1105

Because the observed z value = (p1 - p2)/σ(p1 - p2) = (0.52 - 0.40)/0.0539

= 2.23

> 2.05, we reject H0.

The proportions of students obtaining full marks differ significantly.
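The pooled two-proportion z test used in this solution can be sketched in Python as below. The function name `two_proportion_z` is an assumption for illustration, and the two-tailed critical value is computed from the standard normal distribution rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(p1, n1, p2, n2):
    """Pooled z statistic for H0: p1 = p2."""
    pooled = (n1 * p1 + n2 * p2) / (n1 + n2)               # combined proportion
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))   # standard error of p1 - p2
    return pooled, se, (p1 - p2) / se

alpha = 0.04
pooled, se, z = two_proportion_z(0.52, 200, 0.40, 150)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)               # two-tailed limit
print(f"pooled = {pooled:.4f}, SE = {se:.4f}, z = {z:.2f}, limit = ±{z_crit:.2f}")
print("reject H0" if abs(z) > z_crit else "do not reject H0")
```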


Hypothesis Testing (Two Populations) 9

Problem 1

A machine produced 20 defective articles in a batch of 400. After overhauling, it produced 10 defectives in a batch of 300. Has the machine improved?

n1 = 400, n2 = 300

p1 = 20/400 = 0.05    p2 = 10/300 = 0.0333

a) Statement of null and alternate hypothesis:
H0: P1 = P2
H1: P1 > P2

b) Level of significance:
Let α = 0.05 be the level of significance. According to H1, we use a one-tailed test.

c) Test statistic and observed value:

z = (p1 - p2) / sqrt(P Q (1/n1 + 1/n2))

P = (20 + 10)/700 = 3/70

Q = 1 - P = 67/70

z0 = (0.05 - 0.0333) / sqrt((3/70)(67/70)(1/400 + 1/300))

= 1.08

d) Expected value of statistic:

z = (p1 - p2) / sqrt(P Q (1/n1 + 1/n2)) has a standard normal distribution for n1 and n2 ≥ 30. From the normal distribution table, ze = 1.645 for α = 0.05.

e) Decision and conclusion:
z0 = 1.08 and ze = 1.645. Since z0 < ze, we accept H0 and conclude that the machine has not improved due to overhauling.


Problem 2

A study was conducted to investigate the effectiveness of hypnotism in reducing pain. Results for randomly selected subjects are shown in the table.

The "before" value is matched to an "after" value.

TABLE 1

Subject: A B C D E F G H

Before 6.6 6.5 9.0 10.3 11.3 8.1 6.3 11.6

After 6.8 2.4 7.4 8.5 8.1 6.1 3.4 2.0

Are the sensory measurements, on average, lower after hypnotism? Test at a

5% significance level.

Corresponding "before" and "after" values form matched pairs.

TABLE 2

After Data   Before Data   Difference
6.8          6.6            0.2
2.4          6.5           -4.1
7.4          9             -1.6
8.5          10.3          -1.8
8.1          11.3          -3.2
6.1          8.1           -2
3.4          6.3           -2.9
2            11.6          -9.6

The data for the test are the differences: {0.2, -4.1, -1.6, -1.8, -3.2, -2, -2.9, -9.6}

The sample mean and sample standard deviation of the differences are: x̄d = -3.13 and sd = 2.91. Verify these values.

Let μd be the population mean for the differences. We use the subscript d to denote "differences."

Random Variable: X̄d = the average difference of the sensory measurements

Ho: μd ≥ 0

There is no improvement. (μd is the population mean of the differences.)

Ha: μd < 0

Comparing α with the p-value of the test gives α > p-value.

Make a decision: Since α > p-value, reject Ho.

This means that μd < 0: the sensory measurements are, on average, lower after hypnotism.
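The "verify these values" step can be carried out with the standard library. The sketch below recomputes the differences, their mean and standard deviation, and the paired t statistic. The critical value -1.895 (7 degrees of freedom, one-tailed, α = 0.05) is a standard table value that does not appear in the source, so it is an assumption of this example.

```python
from math import sqrt
from statistics import mean, stdev

before = [6.6, 6.5, 9.0, 10.3, 11.3, 8.1, 6.3, 11.6]
after = [6.8, 2.4, 7.4, 8.5, 8.1, 6.1, 3.4, 2.0]

# Matched-pair differences, after minus before, as in Table 2.
diffs = [a - b for a, b in zip(after, before)]
d_bar = mean(diffs)
s_d = stdev(diffs)                        # sample standard deviation (n - 1)
t = d_bar / (s_d / sqrt(len(diffs)))      # paired t statistic under mu_d = 0

t_crit = -1.895                           # assumed table value: df = 7, left tail, alpha = 0.05
print(f"mean = {d_bar:.2f}, sd = {s_d:.2f}, t = {t:.2f}")
print("reject Ho" if t < t_crit else "do not reject Ho")
```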


PROBLEM 3-

A weight reducing cream manufacturing company wanted to see whether the usage of the cream is beneficial or not. They are sceptical about launching it, and hence they sampled 6 of its users in the month before and the month after using the cream. At a significance level of 0.02, find out whether there is a change. The results are as follows-

EMPLOYEE            1     2     3     4     5     6
MONTH BEFORE USE    219   205   226   198   209   216
MONTH AFTER USE     235   186   240   203   221   205

H0: μD = 0

H1: μD > 0

D̄ = ΣDi/n

SD = sqrt(Σ(Di - D̄)^2/(n - 1))

Formula to be used- t = (D̄ - μD)/(SD/sqrt(n))

Di = 16, -19, 14, 5, 12, -11

ΣDi = 17

D̄ = 17/6 = 2.83

SD = sqrt(Σ(Di - D̄)^2/(n - 1))

= sqrt(1054.83/5)

= 14.52

Here, a t test is conducted:

t = (D̄ - μD)/(SD/sqrt(n))

= 2.83/(14.52/2.449)

= 0.478

So, H0 cannot be rejected. Hence we can say that there is no significant change.


PROBLEM 4-

Two different telecom companies are trying to find out the usage of their free calls at night. The first company sampled 90 people and produced an average of 8.5 hours of relief and a sample SD of 1.8 hours. The second company sampled 80 people, producing an average of 7.9 hours of relief and a sample SD of 2.1 hours. At the .05 level of significance, does the 2nd company have less usage?

H0: μ1 - μ2 = 0

H1: μ1 - μ2 > 0

Here,

n1 = 90    n2 = 80

x̄1 = 8.5    x̄2 = 7.9

s1 = 1.8    s2 = 2.1

z = [(x̄1 - x̄2) - (μ1 - μ2)] / sqrt(s1^2/n1 + s2^2/n2)

= (8.5 - 7.9) / sqrt((1.8)^2/90 + (2.1)^2/80)

= 0.6/0.3019

= 1.99

Since 1.99 exceeds the one-tailed critical value of 1.645 at the .05 level, H0 is rejected.



Problem : 5

Within a school district, students were randomly assigned to one of two Math

teachers - Mrs. Smith and Mrs. Jones. After the assignment, Mrs. Smith had

30 students, and Mrs. Jones had 25 students.

At the end of the year, each class took the same standardized test. Mrs.

Smith's students had an average test score of 78, with a standard deviation of

10; and Mrs. Jones' students had an average test score of 85, with a standard

deviation of 15.

Test the hypothesis that Mrs. Smith and Mrs. Jones are equally effective

teachers. Use a 0.10 level of significance. (Assume that student performance

is approximately normal.)

Solution:

The solution to this problem takes four steps: (1) state the hypotheses, (2)

formulate an analysis plan, (3) analyze sample data, and (4) interpret results.

We work through those steps below:

Null hypothesis: μ1 - μ2 = 0

Alternative hypothesis: μ1 - μ2 ≠ 0

For this analysis, the significance level is 0.10. Using sample data, we will conduct a two-sample t-test of the null hypothesis.

Using sample data, we compute the standard error (SE), degrees of freedom (DF), and the t-score test statistic (t).

SE = sqrt[ (s1^2/n1) + (s2^2/n2) ]

SE = sqrt[ (10^2/30) + (15^2/25) ] = sqrt(3.33 + 9) = sqrt(12.33) = 3.51

DF = (s1^2/n1 + s2^2/n2)^2 / { [ (s1^2 / n1)^2 / (n1 - 1) ] + [ (s2^2 / n2)^2 / (n2 - 1) ] }

DF = (10^2/30 + 15^2/25)^2 / { [ (10^2 / 30)^2 / (29) ] + [ (15^2 / 25)^2 / (24) ] }

DF = (3.33 + 9)^2 / { [ (3.33)^2 / (29) ] + [ (9)^2 / (24) ] } = 152.03 / (0.382 + 3.375) = 152.03/3.757 = 40.47

t = [ (x1 - x2) - d ] / SE = [ (78 - 85) - 0 ] / 3.51 = -7/3.51 = -1.99

where s1 is the standard deviation of sample 1, s2 is the standard

deviation of sample 2, n1 is the size of sample 1, n2 is the size of

sample 2, x1 is the mean of sample 1, x2 is the mean of sample 2, d is

the hypothesized difference between the population means, and SE is

the standard error.

Since we have a two-tailed test, the P-value is the probability that a t-

score having 40 degrees of freedom is more extreme than -1.99; that

is, less than -1.99 or greater than 1.99.

We use the t Distribution Calculator to find P(t < -1.99) = 0.027, and P(t > 1.99) = 0.027. Thus, the P-value = 0.027 + 0.027 = 0.054.

Since the P-value (0.054) is less than the significance level (0.10), we cannot accept the null hypothesis.
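The SE, DF, and t computations for the two-sample t-test can be sketched as follows. The function name `welch_t` is an assumption chosen for illustration. The P-value lookup is left to a t table or calculator, as in the solution, because the Python standard library does not provide a t distribution.

```python
from math import sqrt

def welch_t(mean1, s1, n1, mean2, s2, n2, hypothesized_diff=0.0):
    """Standard error, degrees of freedom and t score for two independent samples."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    se = sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    t = ((mean1 - mean2) - hypothesized_diff) / se
    return se, df, t

# Mrs. Smith: 30 students, mean 78, SD 10. Mrs. Jones: 25 students, mean 85, SD 15.
se, df, t = welch_t(78, 10, 30, 85, 15, 25)
print(f"SE = {se:.2f}, DF = {df:.2f}, t = {t:.2f}")
```

Without intermediate rounding the degrees of freedom come out near 40.48 rather than 40.47. Either way the value rounds down to 40.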
Problem 6

In a restaurant there are 2 different departments, i.e. housekeeping and maintenance. The housekeeping department had 45 waiters and the maintenance department had 55 waiters.

At the end of the year, each department took the same standardized test to measure its performance. The housekeeping department had an average test score of 65, with a standard deviation of 10; and the maintenance department had an average test score of 75, with a standard deviation of 15.

Test the hypothesis that the housekeeping department and the maintenance department are equally effective. Use a 0.10 level of significance. (Assume that the waiters' performance is approximately normal.)

Solution:

The solution to this problem takes four steps: (1) state the hypotheses, (2)

formulate an analysis plan, (3) analyze sample data, and (4) interpret results.

We work through those steps below:

Null hypothesis: μ1 - μ2 = 0

Alternative hypothesis: μ1 - μ2 ≠ 0

For this analysis, the significance level is 0.10. Using sample data, we will conduct a two-sample t-test of the null hypothesis.

Using sample data, we compute the standard error (SE), degrees of freedom (DF), and the t-score test statistic (t).

SE = sqrt[ (s1^2/n1) + (s2^2/n2) ]

SE = sqrt[ (10^2/45) + (15^2/55) ] = sqrt(2.22 + 4.09) = 2.51

DF = (s1^2/n1 + s2^2/n2)^2 / { [ (s1^2 / n1)^2 / (n1 - 1) ] + [ (s2^2 / n2)^2 / (n2 - 1) ] }

DF = (10^2/45 + 15^2/55)^2 / { [ (10^2 / 45)^2 / (44) ] + [ (15^2 / 55)^2 / (54) ] }

= 39.86 / (0.112 + 0.310) = 94.41

t = [ (x1 - x2) - d ] / SE = [ (65 - 75) - 0 ] / 2.51 = -3.98

where s1 is the standard deviation of sample 1, s2 is the standard deviation of sample 2, n1 is the size of sample 1, n2 is the size of sample 2, x1 is the mean of sample 1, x2 is the mean of sample 2, d is the hypothesized difference between the population means, and SE is the standard error.

Since we have a two-tailed test, the P-value is the probability that a t-score having 94 degrees of freedom is more extreme than -3.98; that is, less than -3.98 or greater than 3.98.

A t score of 3.98 lies well beyond the two-tailed 0.001 critical value for 94 degrees of freedom (about 3.4), so P(t < -3.98) and P(t > 3.98) are each less than 0.0005. Thus, the P-value is less than 0.001.

Since the P-value is less than the significance level (0.10), we cannot accept the null hypothesis.
Problem 7

A credit-insurance organization has developed a new high-tech method of training new sales personnel. The company sampled 16 employees who were trained the original way and found average daily sales to be $688 and the sample standard deviation was $32.63. They also sampled 11 employees who were trained using the new method and found average daily sales to be $706 and the sample standard deviation was $24.84. At alpha = 0.05, can the company conclude that average daily sales have increased under the new plan?

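Solution

n1 = 16    n2 = 11    n = n1 + n2 = 27

x̄1 = 688    x̄2 = 706

s1 = 32.63    s2 = 24.84

α = 0.05

H0: μ1 - μ2 = 0    H1: μ1 - μ2 < 0

Critical value: t = -1.708 (n1 + n2 - 2 = 25 degrees of freedom)

sp^2 = [ (n1 - 1) s1^2 + (n2 - 1) s2^2 ] / (n1 + n2 - 2)

= [ 15 (32.63)^2 + 10 (24.84)^2 ] / 25

= 885.64

t = (x̄1 - x̄2) / sqrt(sp^2 (1/n1 + 1/n2))

= (688 - 706) / sqrt(885.64 (1/16 + 1/11))

= -1.545

Answer

Do not reject H0. Average daily sales have not increased significantly.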


Problem 8

Block Enterprises, a manufacturer of chips for computers, is in the process of deciding whether to replace its current semi-automated assembly line with a fully automated assembly line. Block has gathered some preliminary test data about hourly chip production, which is summarized in the following table, and it would like to know whether it should upgrade its assembly line. At α = 0.02, state and test the hypothesis to help Block decide.

                       x̄      s     n
Semi automatic Line    198    32    150
Automatic Line         206    29    200

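Solution

x̄1 = 198    s1 = 32    n1 = 150

x̄2 = 206    s2 = 29    n2 = 200

α = 0.02

H0: μ1 - μ2 = 0    H1: μ1 - μ2 < 0

Critical value: z = -2.06

z = (x̄1 - x̄2) / sqrt(s1^2/n1 + s2^2/n2)

= (198 - 206) / sqrt(32^2/150 + 29^2/200)

= -2.408

Answer

H0 is rejected. Block should upgrade to an automatic line.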


One Way Anova 1

Problem:

A quality control supervisor for an automobile manufacturer is concerned with uniformity in the number of defects in cars coming off the assembly line. If one assembly line has significantly more variability in the number of defects, then changes have to be made. The supervisor has collected the following data:

Number of Defects

               Assembly Line A   Assembly Line B
Mean           10                11
Variance       9                 25
Sample Size    20                16

Does Assembly line B have significantly more variability in the number of defects? Test at the 0.05 significance level.

Solution:

H0: σB^2 = σA^2

H1: σB^2 > σA^2

Observed F = sB^2/sA^2

= 25/9 = 2.778

Fcrit = F0.05(15,19)

= 2.23

Since 2.778 > 2.23, we reject H0; assembly line B does have significantly more variability in the number of defects, so some changes have to be made.
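The variance-ratio test above reduces to one division and a table lookup. A minimal sketch follows. The function name `variance_ratio` is hypothetical, and the critical value 2.23 is taken from the solution, not computed.

```python
def variance_ratio(var_num, n_num, var_den, n_den):
    """Observed F and its (numerator, denominator) degrees of freedom."""
    return var_num / var_den, (n_num - 1, n_den - 1)

# Line B (variance 25, n = 16) over line A (variance 9, n = 20).
f_obs, dof = variance_ratio(25, 16, 9, 20)
f_crit = 2.23  # F0.05(15, 19), as quoted in the solution
print(f"F = {f_obs:.3f} with df {dof}; reject H0: {f_obs > f_crit}")
```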


Chi Sq Test 3

Chi-Square Goodness of Fit Test

Problem

English test grade distributions have changed from last year, with grade B's somewhat lower. Is this significant?

English test results   Grade A   Grade B   Grade C   Grade D   Grade E
This year, O           23        32        20        15        10
Last year              25        20        15        25        10

Solution:

H0: the grade distribution is unchanged from last year.

The table below shows the calculation. First, the expected values are created by scaling last year's results to be equivalent to this year. Then the test statistic is calculated as SUM((O - E)^2/E).

English test results   Grade A   Grade B   Grade C   Grade D   Grade E   Sum
This year, O           23        32        20        15        10        100
Last year              25        20        15        25        10        95
Scaled last year, E    26        21        16        26        11        100
(O - E)                -3.3      10.9      4.2       -11.3     -0.5
(O - E)^2              11.0      119.8     17.7      128.0     0.3
(O - E)^2/E            0.4       5.7       1.1       4.9       0.0       12.1

Chi-square is found to be 12.1 and the degrees of freedom are (5-1) = 4 (there are five possible grades). Looking this up in the Chi Square table shows the probability is between 5% (9.49) and 1% (13.28), so H0 is rejected at the 5% level and a significant change can be claimed.
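The scaling step and the SUM((O - E)^2/E) statistic can be reproduced in a few lines. This sketch uses unrounded expected values, so individual cells differ slightly from the rounded table entries. The critical values 9.49 and 13.28 are those quoted in the solution.

```python
observed = [23, 32, 20, 15, 10]    # this year, grades A to E
last_year = [25, 20, 15, 25, 10]

# Scale last year's counts so that they sum to this year's total.
scale = sum(observed) / sum(last_year)
expected = [count * scale for count in last_year]

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
dof = len(observed) - 1
print(f"chi-square = {chi_square:.1f} with {dof} degrees of freedom")
print("significant at 5%:", chi_square > 9.49)
print("significant at 1%:", chi_square > 13.28)
```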


Chi-Square Test of Independence

Problem

A year group in school chooses between drama and history as below. Is there any difference between boys' and girls' choices?

         Chose drama   Chose history
Boys     43            55
Girls    52            54

Solution:

Observed
         Chose drama   Chose history   Total
Boys     43            55              98
Girls    52            54              106
Total    95            109             204

Expected = (row tot * col tot)/overall tot
         Chose drama   Chose history   Total
Boys     45.6          52.4            98
Girls    49.4          56.6            106
Total    95            109             204

(observed - expected)^2/expected
         Chose drama   Chose history   Total
Boys     0.2           0.1
Girls    0.1           0.1
Total                                  0.55

Chi-square is 0.55. There are (2-1)*(2-1) = 1 degree of freedom. Checking the Chi Square table shows 0.55 is between 0.004 and 3.84, well below the 5% critical value of 3.84, so the hypothesis of independence cannot be rejected: the data show no significant difference between boys' and girls' choices.
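The expected counts and the chi-square total for the 2 x 2 table can be recomputed as follows. This is a sketch with no external libraries, and the 5% critical value 3.84 is the one quoted in the solution.

```python
observed = [[43, 55],   # boys: drama, history
            [52, 54]]   # girls: drama, history

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Expected = (row total * column total) / overall total
expected = [[r * c / total for c in col_totals] for r in row_totals]

chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
dof = (len(observed) - 1) * (len(observed[0]) - 1)
print(f"chi-square = {chi_square:.2f}, df = {dof}, significant at 5%: {chi_square > 3.84}")
```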



Chi-Square Test Equality of Proportions

Problem

A wholesale merchant received a shipment of goods which is claimed to contain 5% defective items. The merchant decided to verify this. He drew a sample of 15 items and found 3 defective items. Test the claim. Use α = 0.05.

Solution

Given: p = the proportion of defectives in the whole shipment

Claimed value: p = p0 = 5/100 = 0.05

α = 0.05

n = 15, sample proportion P = x/n = 3/15

a. Statement of null and alternate hypothesis

H0: p = 0.05
H1: p > 0.05

b. Level of significance:

Given α = 0.05. Because H1 is p > 0.05, we use a right-tailed test.

c. Test statistic and observed value:

x0 = 3, the number of defectives

P = 3/15 = 0.2

d. Probability, under H0, of a result at least as extreme as the observed value:

P(X ≥ 3) = SUM(x = 3 to 15) C(15, x) * p0^x * (1 - p0)^(15 - x), with p0 = 0.05
         = 0.0362

e. Decision & Conclusion:

x0 = 3 lies in the rejection region, since 0.0362 is less than 0.05. Therefore we reject H0 and conclude that the shipment contains more than 5% defective items, and hence the merchant is advised to reject the shipment.
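The tail probability in step d can be verified numerically. The sketch below uses only Python's standard library; the function name binomial_upper_tail is an illustrative helper, not part of the original solution.

```python
from math import comb


def binomial_upper_tail(n, p, k):
    """Return P(X >= k) for X ~ Binomial(n, p) by summing the exact probabilities."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k, n + 1))


ALPHA = 0.05          # significance level from the problem
n, p0, x0 = 15, 0.05, 3  # sample size, claimed proportion, observed defectives

p_value = binomial_upper_tail(n, p0, x0)
reject_h0 = p_value < ALPHA

print(f"P(X >= {x0}) = {p_value:.4f}")
print("reject H0:", reject_h0)
```

Because the sample is small (n = 15), summing the exact binomial probabilities is preferable to a normal approximation here.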

Binomial Dist 1

1. Binomial Distribution

Problem



Harley Davidson, director of quality control for the Kyoto Motor Company, is conducting his monthly spot check of automatic transmissions. In this procedure, 10 transmissions are removed from the pool of components and are checked for manufacturing defects. Historically, only 2 percent of the transmissions have flaws.

a. What is the probability that Harley's sample contains more than two transmissions with manufacturing flaws?

b. What is the probability that none of the selected transmissions has any manufacturing flaws?

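Solution

n = 10

p = 0.02

1 - p = 0.98

Formula: P(X = x) = C(n, x) * p^x * (1 - p)^(n - x), where C(n, x) = n!/(x!(n - x)!)

P(X = 0) = C(10, 0) * (0.02)^0 * (0.98)^10
         = 0.8170

P(X = 1) = C(10, 1) * (0.02)^1 * (0.98)^9
         = 0.16674

P(X = 2) = C(10, 2) * (0.02)^2 * (0.98)^8
         = 0.0153

P(X > 2) = 1 - P(X ≤ 2)
         = 1 - [P(X = 0) + P(X = 1) + P(X = 2)]
         = 1 - [0.8170 + 0.16674 + 0.0153]
         = 1 - [0.9991]
         = 0.0009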

Answer

a. The probability that Harley's sample contains more than two transmissions with manufacturing flaws is 0.0009.

b. The probability that none of the selected transmissions has any manufacturing flaws is 0.8170.
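Both answers follow from the binomial probability formula with n = 10 and p = 0.02, and can be checked with a short script. This is a sketch using only the standard library; binomial_pmf is an illustrative helper name.

```python
from math import comb


def binomial_pmf(n, p, x):
    """Return P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 10, 0.02  # sample size and historical flaw rate from the problem

# b. Probability that no transmission in the sample is flawed.
p_none = binomial_pmf(n, p, 0)

# a. Probability of more than two flawed transmissions, via the complement.
p_at_most_two = sum(binomial_pmf(n, p, x) for x in range(3))
p_more_than_two = 1 - p_at_most_two

print(f"P(X = 0) = {p_none:.4f}")
print(f"P(X > 2) = {p_more_than_two:.4f}")
```

Using the complement means only three terms have to be summed instead of the eight terms for X = 3 to 10.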



Poisson Dist 1

Poisson Distribution

Problem

Southwestern Electronics has developed a new calculator that performs a series of functions not yet performed by any other calculator. The marketing department is planning to demonstrate this calculator to a group of potential customers, but is worried about some initial problems, which have resulted in 4 percent of the new calculators developing mathematical inconsistencies. The marketing VP is planning on randomly selecting a group of calculators for this demonstration and is worried about the chances of selecting a calculator that could start malfunctioning. He believes that whether or not a calculator functions is a Bernoulli process, and he is convinced that the probability of malfunction is really about 0.04.

Assuming that the VP selects exactly 50 calculators to use in the demonstration, and using the Poisson distribution as an approximation of the binomial, what is the chance of getting at least three calculators that malfunction?

No calculators malfunctioning?

Solution

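n = 50

p = 0.04

λ = np = 2

e^(-λ) = e^(-2) = 0.13533

Formula: P(X = x) = (λ^x * e^(-λ)) / x!

P(X = 0) = (2^0 * e^(-2)) / 0! = 0.13533

P(X = 1) = (2^1 * e^(-2)) / 1! = 0.27066

P(X = 2) = (2^2 * e^(-2)) / 2! = 0.27066

P(X = 3) = (2^3 * e^(-2)) / 3! = 0.18044

P(X ≥ 3) = 1 - P(X ≤ 2)
         = 1 - [P(X = 0) + P(X = 1) + P(X = 2)]
         = 1 - [0.13533 + 0.27066 + 0.27066]
         = 0.32335

Answer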

The chance of getting at least three calculators that malfunction is 32.33%

The chance of no calculators malfunctioning is 13.53%
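The Poisson figures can be checked numerically, and the same script can show how close the approximation is to the exact binomial result. This is a sketch using only the standard library; the helper names poisson_pmf and binomial_pmf are illustrative, and the binomial comparison is an addition that the original solution does not make.

```python
from math import comb, exp, factorial


def poisson_pmf(lam, x):
    """Return P(X = x) for X ~ Poisson(lam)."""
    return lam**x * exp(-lam) / factorial(x)


def binomial_pmf(n, p, x):
    """Return P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 50, 0.04  # calculators selected and malfunction probability from the problem
lam = n * p      # Poisson mean used for the approximation

poisson_none = poisson_pmf(lam, 0)
poisson_at_least_three = 1 - sum(poisson_pmf(lam, x) for x in range(3))

# Exact binomial tail, shown only to gauge the quality of the approximation.
binomial_at_least_three = 1 - sum(binomial_pmf(n, p, x) for x in range(3))

print(f"Poisson  P(X = 0)  = {poisson_none:.4f}")
print(f"Poisson  P(X >= 3) = {poisson_at_least_three:.4f}")
print(f"Binomial P(X >= 3) = {binomial_at_least_three:.4f}")
```

With n = 50 and p = 0.04 the two tail probabilities are very close, which supports using the Poisson approximation in this problem.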



Normal Dis 1

Normal Distribution

Problem

Regulations concerning the maximum number of people who can occupy a lift are to be set. The total weight of 8 people chosen at random follows a normal distribution with a mean of 550kg and a standard deviation of 150kg. What's the probability that the total weight of 8 people exceeds 600kg?

Solution

The mean is 550kg and we are interested in the area that is greater than 600kg.

z = ( x - m ) / s

Here x = 600kg
m, the mean = 550kg
s, the standard deviation = 150kg

z = ( 600 - 550 ) / 150
z = 50 / 150
z = 0.33

Look in the table at the row for z = 0.3 and across to the column under 0.03.

The number in the table is the tail area for z=0.33 which is 0.3707.

This is the probability that the weight will exceed 600kg.

Therefore, the probability that the total weight of 8 people exceeds 600kg is 0.37 correct to 2 figures.
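The table lookup can be replaced by a direct calculation of the upper-tail area. The sketch below uses the complementary error function from Python's standard library; normal_upper_tail is an illustrative helper name. It evaluates the tail both at the rounded z = 0.33 used with the table and at the unrounded z = 50/150.

```python
from math import erfc, sqrt


def normal_upper_tail(z):
    """Return P(Z > z) for a standard normal variable Z."""
    return 0.5 * erfc(z / sqrt(2))


x, mean, sd = 600, 550, 150  # threshold, mean and standard deviation in kg

z_exact = (x - mean) / sd    # unrounded z value
z_table = round(z_exact, 2)  # z rounded to two decimals, as read from a table

tail_table = normal_upper_tail(z_table)
tail_exact = normal_upper_tail(z_exact)

print(f"z (table) = {z_table:.2f}, tail area = {tail_table:.4f}")
print(f"z (exact) = {z_exact:.4f}, tail area = {tail_exact:.4f}")
```

Rounding z to two decimals changes the tail area only in the third decimal place, so both versions agree with the answer of 0.37.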
