Statistics problems

Upload: anuth-siddharth

Post on 10-Feb-2018


  • 7/22/2019 Statstictics Problems

    1/40

    Statistics Problems

    Group:

    Name                   Roll no.
    Anuth Siddharth        127
    Abir Banerjee          116
    Ninad Tatke            175
    Maulik Chandarana      168
    Madhurima Chatterjee   159
    Tulsi Zaveri           214
    Daniel Fernandes       135
    Deepika Singh          136

    Confidence Interval (Single Population): 4 problems

    Problem: 1

    A ketchup manufacturer is in the process of deciding whether to promote a new extra-spicy brand. The company's marketing-research department used a national telephone survey of 6000 households and found that the extra-spicy ketchup would be purchased by 335 of them. A much more extensive study made 2 years ago showed that 5 percent of the households would purchase the brand then. At a 2 percent significance level, should the company conclude that there is an increased interest in the extra-spicy flavor?

    Solution:

    n = 6000

    p̂ = 335/6000 = 0.05583

    H0: p = 0.05    H1: p > 0.05    α = 0.02

    The upper limit of the acceptance region is z = 2.05, or

    p̂ = pH0 + z*sqrt(pH0*qH0/n) = 0.05 + 2.05*sqrt(0.05*0.95/6000) = 0.05577

    The observed z value = (p̂ - pH0)/sqrt(pH0*qH0/n) = (0.05583 - 0.05)/sqrt(0.05*0.95/6000) = 2.07

    Because 2.07 > 2.05 (or p̂ = 0.05583 > 0.05577), we should reject H0. The current interest is significantly greater than the interest of 2 years ago.
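The ketchup test above can be re-computed in a few lines. This is a minimal sketch, not part of the original solution: the function name `one_proportion_z` is illustrative, and the critical value 2.05 is simply the table value quoted in the solution.

```python
import math

def one_proportion_z(successes, n, p0):
    """Return (p_hat, z) for a one-sample test of a proportion against p0."""
    p_hat = successes / n
    se = math.sqrt(p0 * (1 - p0) / n)  # standard error under H0
    return p_hat, (p_hat - p0) / se

p_hat, z = one_proportion_z(335, 6000, 0.05)
z_crit = 2.05  # upper-tail table value for alpha = 0.02, as quoted in the solution
print(round(p_hat, 5), round(z, 2), z > z_crit)
```

The last printed value is the reject/do-not-reject decision for the upper-tailed test.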

    Problem: 2

    Steve Cutter sells Big Blade lawn mowers in his hardware store, and he is interested in comparing the reliability of the mowers he sells with the reliability of Big Blade mowers sold nationwide. Steve knows that only 15 percent of all Big Blade mowers sold nationwide require repairs during the first year of ownership. A sample of 120 of Steve's customers revealed that exactly 22 of them required mower repairs in the first year of ownership. At the 0.02 level of significance, is there evidence that Steve's Big Blade mowers differ in reliability from those sold nationwide?

    Solution:

    n = 120

    p̂ = 22/120 = 0.1833

    H0: p = 0.15    H1: p ≠ 0.15    α = 0.02

    The limits of the acceptance region are z = ±2.33, or

    p̂ = pH0 ± z*sqrt(pH0*qH0/n) = 0.15 ± 2.33*sqrt(0.15*0.85/120) = (0.0741, 0.2259)

    The observed z value = (p̂ - pH0)/sqrt(pH0*qH0/n) = (0.1833 - 0.15)/sqrt(0.15*0.85/120) = 1.02

    Because -2.33 < 1.02 < 2.33 (or 0.0741 < p̂ < 0.2259), we do not reject H0. There is no significant evidence that Steve's mowers differ in reliability from those sold nationwide.
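The acceptance region for the mower problem can also be obtained without a z table. The sketch below is an illustration under stated assumptions: `acceptance_limits` is a hypothetical helper, and it uses the exact normal quantile (about 2.326) from Python's `statistics.NormalDist` rather than the rounded table value 2.33, so the limits agree with the solution only to about three decimals.

```python
import math
from statistics import NormalDist

def acceptance_limits(p0, n, alpha):
    """Two-tailed acceptance region for a sample proportion under H0: p = p0."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 2.33 for alpha = 0.02
    half_width = z * math.sqrt(p0 * (1 - p0) / n)
    return p0 - half_width, p0 + half_width

low, high = acceptance_limits(0.15, 120, 0.02)
p_hat = 22 / 120
print(round(low, 4), round(high, 4), low < p_hat < high)
```

If the sample proportion falls between the two limits, H0 is not rejected.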

    PROBLEM 3-

    In a mobile phone manufacturing company, a random sample of 81 phones is taken, producing a sample mean of 47 and a sample standard deviation of 5.89. Construct a 90% confidence interval, assuming that the number of camera phones among normal phones is normally distributed. What is the interval width?

    Answer-

    Here:

    s = 5.89

    x̄ = 47

    n = 81

    Formula- x̄ ± z*s/sqrt(n)

    Upper limit = 47 + 1.65*5.89/sqrt(81) = 47 + 1.08 = 48.08

    (1.65 is the z value that the table gives for an area of 0.45 on each side of the mean.)

    Lower limit = 47 - 1.65*5.89/sqrt(81) = 47 - 1.08 = 45.92

    Interval width = 48.08 - 45.92 = 2.16

    Answer- We are 90% confident that the mean number of camera phones among normal phones lies between 45.92 and 48.08.
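The interval for the phone sample can be checked with a short script. This is a sketch only: `z_interval` is an illustrative name, and the code uses the exact 90% quantile (about 1.645) instead of the rounded table value 1.65, which leaves the limits unchanged at two decimals.

```python
import math
from statistics import NormalDist

def z_interval(mean, sd, n, confidence):
    """Confidence interval for a mean using a normal critical value."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.645 for 90%
    margin = z * sd / math.sqrt(n)
    return mean - margin, mean + margin

low, high = z_interval(47, 5.89, 81, 0.90)
print(round(low, 2), round(high, 2))
```

Calling `z_interval(12, 21, 25, 0.98)` applies the same formula to a sample with mean 12, standard deviation 21 and n = 25.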

    PROBLEM 4-

    In a new food home-delivery business, the mean loss is 12 dollars. Suppose this result came from a random sample of 25 households with a standard deviation of 21 dollars. Compute a 98% confidence interval for this sample result. How wide is the interval?

    Answer-

    Here:

    s = 21

    x̄ = 12

    n = 25

    Formula- x̄ ± z*s/sqrt(n)

    Upper limit = 12 + 2.33*21/sqrt(25) = 12 + 9.79 = 21.79

    Lower limit = 12 - 2.33*21/sqrt(25) = 12 - 9.79 = 2.21

    Interval width = 21.79 - 2.21 = 19.57 (from 2 * 9.786 before rounding)

    Answer- We are 98% confident that the mean loss lies between 2.21 and 21.79 dollars.

    Hypothesis Testing (Single Population) 4

    1. Hypothesis Testing (Single Population)

    Problem 1

    An insurance company is reviewing its current policy rates. When originally setting the rates, they believed that the average claim amount was $1,800. They are concerned that the true mean is actually higher than this, because they could potentially lose a lot of money. They randomly select 40 claims and calculate a sample mean of $1,950. Assuming that the standard deviation of claims is $500, and with α = 0.05, test to see if the insurance company should be concerned.

    Solution

    n = 40

    x̄ = 1950    σ = 500    α = 0.05

    μ0 = 1800

    H0: μ ≤ 1800    H1: μ > 1800

    z critical = 1.645 (one-tailed test, because H1 is directional)

    z = (x̄ - μ0)/(σ/sqrt(n)) = (1950 - 1800)/(500/sqrt(40)) = 150/79.06 = 1.897

    Answer

    Reject H0, as 1.897 > 1.645 falls in the rejection region. At the 0.05 level there is evidence that the mean claim exceeds $1,800, so the insurance company should be concerned about its current rates. (Against the two-tailed value of 1.96 the result would not be significant, but a two-tailed test does not match H1: μ > 1800.)
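For the directional alternative H1: μ > 1800 in the insurance problem, the test statistic and its upper-tail p-value can be computed as follows. This is a minimal sketch; `z_test_mean` is an illustrative helper and assumes the population standard deviation is known, as the problem states.

```python
import math
from statistics import NormalDist

def z_test_mean(xbar, mu0, sigma, n):
    """Return (z, upper-tail p-value) for H1: mu > mu0 with known sigma."""
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value

z, p = z_test_mean(1950, 1800, 500, 40)
print(round(z, 3), round(p, 3), p < 0.05)
```

Comparing the p-value with α is equivalent to comparing z with the one-tailed critical value.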

    Problem 2

    A car manufacturer claimed that their car averaged at least 31 miles per gallon of gasoline. A sample of 9 cars was selected and each car was driven with one gallon of regular gasoline. The sample showed a mean of 29.43 miles with a standard deviation of 3 miles. With α = 0.05, what do you conclude about the manufacturer's claim?

    Solution

    n = 9

    x̄ = 29.43    s = 3    α = 0.05

    μ0 = 31

    H0: μ ≥ 31    H1: μ < 31

    t critical = -1.860 (df = n - 1 = 8)

    t = (x̄ - μ0)/(s/sqrt(n)) = (29.43 - 31)/(3/sqrt(9)) = -1.57

    Answer

    We cannot reject H0, since -1.57 > -1.860. There is insufficient evidence to doubt the manufacturer's claim concerning the gas mileage.
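The t statistic for the mileage claim is a one-line calculation. The sketch below uses an illustrative helper `t_statistic`; the critical value -1.860 is taken from the solution (t table, 8 degrees of freedom) rather than computed, since the Python standard library has no t distribution.

```python
import math

def t_statistic(xbar, mu0, s, n):
    """One-sample t statistic with sample standard deviation s."""
    return (xbar - mu0) / (s / math.sqrt(n))

t = t_statistic(29.43, 31, 3, 9)
t_crit = -1.860  # lower-tail table value for df = 8, alpha = 0.05
print(round(t, 2), t < t_crit)
```

H0 is rejected only when the statistic falls below the critical value.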

    Problem 3

    General Electric has developed a new bulb whose design specifications call for a light output of 960 lumens compared to an earlier model that produced only 750 lumens. The company's data indicate that the standard deviation of light output for this type of bulb is 18.4 lumens. From a sample of 20 new bulbs, the testing committee found an average light output of 954 lumens per bulb. At a 0.05 significance level, can General Electric conclude that its new bulb is producing the specified 960 lumen output?

    Solution

    σ = 18.4

    n = 20

    x̄ = 954

    μ0 = 960

    α = 0.05

    H0: μ = 960    H1: μ < 960

    z critical = -1.65

    z = (x̄ - μ0)/(σ/sqrt(n)) = (954 - 960)/(18.4/sqrt(20)) = -6/4.114 = -1.46

    Answer

    Do not reject H0, since -1.46 > -1.65. The new bulb is meeting specifications.

    Problem 4

    BSNL provides telephone services in Coimbatore. According to the company's records, the average length of calls placed through the company is 11.44 minutes. The company wants to check if the mean length of the current calls is different from 11.44 minutes. A sample of 150 such calls placed through this company gave a mean length of 12.71 minutes with a standard deviation of 2.65 minutes. Can you conclude that the mean length of all current calls is different from 11.44 minutes? Use α = 0.05.

    Solution

    s = 2.65

    n = 150

    x̄ = 12.71

    μ0 = 11.44

    α = 0.05

    H0: μ = 11.44    H1: μ ≠ 11.44

    z critical = ±1.96 (two-tailed test)

    z = (x̄ - μ0)/(s/sqrt(n)) = (12.71 - 11.44)/(2.65/sqrt(150)) = 1.27/0.2164 = 5.87

    Answer

    Reject H0, since 5.87 > 1.96. It is concluded that the mean length of current calls is different from 11.44 minutes.

    Confidence Interval (Two Populations) 9

    Problem 1: Small Samples

    Suppose that simple random samples of college freshman are selected from

    two universities - 15 students from school A and 20 students from school B.

    On a standardized test, the sample from school A has an average score of

    1000 with a standard deviation of 100. The sample from school B has an

    average score of 950 with a standard deviation of 90.

    What is the 90% confidence interval for the difference in test scores at the two schools, assuming that test scores came from normal distributions in both schools? (Hint: Since the sample sizes are small, use a t score as the critical value.)

    (A) 50 ± 1.70 (B) 50 ± 28.49 (C) 50 ± 32.74 (D) 50 ± 55.66 (E) None of the above

    Solution

    The correct answer is (D). The approach that we used to solve this problem is

    valid when the following conditions are met.

    The sampling method must be simple random sampling. This

    condition is satisfied; the problem statement says that we used simple

    random sampling.

    The samples must be independent. Since responses from one

    sample did not affect responses from the other sample, the samples

    are independent.

    The sampling distribution should be approximately normally

    distributed. The problem states that test scores in each population are

    normally distributed, so the difference between test scores will also be

    normally distributed.

    Since the above requirements are satisfied, we can use the following four-

    step approach to construct a confidence interval.

    Identify a sample statistic. Since we are trying to estimate the difference between population means, we choose the difference between sample means as the sample statistic. Thus, x1 - x2 = 1000 - 950 = 50.

    Select a confidence level. In this analysis, the confidence level is defined for us in the problem. We are working with a 90% confidence level.

    Find the margin of error. The key steps for computing the margin of error when the sampling distribution is approximately normal are shown below.

    Find standard deviation. Using the sample standard deviations, we estimate the standard deviation of the difference between sample means (SD).

    SD = sqrt[ s1^2/n1 + s2^2/n2 ]
    SD = sqrt[ (100)^2/15 + (90)^2/20 ] = sqrt(10,000/15 + 8100/20) = sqrt(666.67 + 405) = 32.74

    Find critical value. The critical value is a factor used to compute the margin of error. Because the sample sizes are small, we express the critical value as a t score rather than a z score. To find the critical value, we take these steps.

    Compute alpha (α): α = 1 - (confidence level / 100) = 1 - 90/100 = 0.10

    Find the critical probability (p*): p* = 1 - α/2 = 1 - 0.10/2 = 0.95

    Find the degrees of freedom (DF):

    DF = (s1^2/n1 + s2^2/n2)^2 / { [ (s1^2/n1)^2 / (n1 - 1) ] + [ (s2^2/n2)^2 / (n2 - 1) ] }
    DF = (100^2/15 + 90^2/20)^2 / { [ (100^2/15)^2 / 14 ] + [ (90^2/20)^2 / 19 ] }
    DF = (666.67 + 405)^2 / (31746.03 + 8632.89) = 1148469.4 / 40378.92 = 28.44

    Rounding off to the nearest whole number, we conclude that there are 28 degrees of freedom.

    The critical value is the t score having 28 degrees of freedom and a cumulative probability equal to 0.95. From the t Distribution Calculator, we find that the critical value is 1.7.

    Compute margin of error (ME): ME = critical value * standard deviation = 1.7 * 32.74 = 55.66

    Specify the confidence interval. The range of the confidence interval is defined by the sample statistic ± margin of error, and the uncertainty is denoted by the confidence level.

    Therefore, the 90% confidence interval is -5.66 to 105.66. That is, we are 90% confident that the true difference in population means is in the range defined by 50 ± 55.66.
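The standard deviation, degrees of freedom and margin of error for the two schools can be re-derived with the sketch below. The helper name `welch_se_df` is illustrative, and the critical value 1.7 is the t score quoted in the solution, not computed here. Because the code does not round the standard deviation to 32.74 first, the margin comes out near 55.65 rather than 55.66.

```python
import math

def welch_se_df(s1, n1, s2, n2):
    """Standard error of the difference in means and its approximate DF."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return se, df

se, df = welch_se_df(100, 15, 90, 20)
t_crit = 1.7  # t score for 28 df and cumulative probability 0.95 (from the solution)
margin = t_crit * se
diff = 1000 - 950
print(round(se, 2), round(df, 2), round(diff - margin, 2), round(diff + margin, 2))
```

The last two printed values are the lower and upper limits of the 90% interval.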

    Problem 2: Large Samples

    The local baseball team conducts a study to find the amount spent on

    refreshments at the ball park. Over the course of the season they gather

    simple random samples of 50 men and 100 women. For men, the average

    expenditure was $20, with a standard deviation of $3. For women, it was $15,

    with a standard deviation of $2.

    What is the 99% confidence interval for the spending difference between men

    and women? Assume that the two populations are independent and normally

    distributed.

    (A) $5 ± $0.47 (B) $5 ± $1.21 (C) $5 ± $2.58 (D) $5 ± $5.00 (E) None of the above

    Solution

    The correct answer is (B). The approach that we used to solve this problem is

    valid when the following conditions are met.

    The sampling method must be simple random sampling. This condition

    is satisfied; the problem statement says that we used simple random

    sampling.

    The samples must be independent. Again, the problem statement

    satisfies this condition.

    The sampling distribution should be approximately normally distributed. The problem states that spending in each population is normally distributed, so the difference in spending will also be normally distributed.

    Since the above requirements are satisfied, we can use the following four-

    step approach to construct a confidence interval.

    Identify a sample statistic. Since we are trying to estimate the difference between population means, we choose the difference between sample means as the sample statistic. Thus, x1 - x2 = $20 - $15 = $5.

    Select a confidence level. In this analysis, the confidence level is defined for us in the problem. We are working with a 99% confidence level.

    Find the margin of error. The key steps for computing the margin of error when the sampling distribution is approximately normal are shown below.

    Find standard deviation. Since we do not know the standard deviation of the populations, we use the sample standard deviations to estimate the standard deviation of the difference between sample means (SD).

    SD = sqrt[ s1^2/n1 + s2^2/n2 ]
    SD = sqrt[ (3)^2/50 + (2)^2/100 ] = sqrt(9/50 + 4/100) = sqrt(0.18 + 0.04) = 0.47

    Find critical value. The critical value is a factor used to compute the margin of error. Because the sample sizes are large enough, we express the critical value as a z score. To find the critical value, we take these steps.

    Compute alpha (α): α = 1 - (confidence level / 100) = 1 - 99/100 = 0.01

    Find the critical probability (p*): p* = 1 - α/2 = 1 - 0.01/2 = 0.995

    The critical value is the z score having a cumulative probability equal to 0.995. From the Normal Distribution Calculator, we find that the critical value is 2.58.

    Compute margin of error (ME): ME = critical value * standard deviation = 2.58 * 0.47 = 1.21

    Specify the confidence interval. The range of the confidence interval is defined by the sample statistic ± margin of error, and the uncertainty is denoted by the confidence level.

    Therefore, the 99% confidence interval is $3.79 to $6.21. That is, we are 99% confident that men outspend women at the ballpark by about $5 ± $1.21.
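The four steps for the ballpark problem collapse into one small function. This is a sketch under stated assumptions: `diff_means_interval` is a hypothetical name, and the z score is taken from `statistics.NormalDist` (about 2.576) instead of the rounded 2.58 used in the solution.

```python
import math
from statistics import NormalDist

def diff_means_interval(x1, s1, n1, x2, s2, n2, confidence):
    """Large-sample confidence interval for the difference of two means."""
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * se
    diff = x1 - x2
    return diff - margin, diff + margin

low, high = diff_means_interval(20, 3, 50, 15, 2, 100, 0.99)
print(round(low, 2), round(high, 2))
```

The two printed numbers are the limits of the 99% interval in dollars.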

    Problem: 3

    A large hotel chain is trying to decide whether to convert more of its rooms to non-smoking rooms. In a random sample of 400 guests last year, 166 had requested non-smoking rooms. This year 205 guests in a sample of 380 preferred the non-smoking rooms. Would you recommend that the hotel chain convert more rooms to non-smoking? Support your recommendation by testing the appropriate hypothesis at the 0.01 level of significance.

    Solution:

    n1 = 400    p1 = 166/400 = 0.415

    n2 = 380    p2 = 205/380 = 0.5395

    H0: p1 = p2

    H1: p1 < p2

    α = 0.01
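For the hotel data, the test statistic can be computed with the usual pooled two-proportion z test. The sketch below is an illustration, not the original worked solution: it assumes the lower-tailed alternative p1 < p2, uses a hypothetical helper `two_proportion_z`, and takes -2.33 as the table value for α = 0.01.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic for H0: p1 = p2, from success counts x1 and x2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return p1, p2, (p1 - p2) / se

p1, p2, z = two_proportion_z(166, 400, 205, 380)
z_crit = -2.33  # lower-tail table value for alpha = 0.01
print(round(p1, 4), round(p2, 4), round(z, 2), z < z_crit)
```

A statistic below the critical value would indicate that the proportion preferring non-smoking rooms has increased.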

    Problem:4

    Two different areas of a large eastern city are being considered as sites for day care centres. Of 200 households surveyed in one section, the proportion in which mothers worked full time was 0.52. In another section, 40 percent of the 150 households surveyed had mothers working at full-time jobs. At the 0.04 level of significance, is there a significant difference in the proportions of working mothers in the two areas of the city?

    Solution:

    n1 = 200    p1 = 0.52

    n2 = 150    p2 = 0.40

    H0: p1 = p2

    H1: p1 ≠ p2

    α = 0.04

    p = (n1 p1 + n2 p2)/(n1 + n2) = (200(0.52) + 150(0.40))/(200 + 150) = 0.4686

    σ(p1-p2) = sqrt(p q (1/n1 + 1/n2)) = sqrt(0.4686(0.5314)(1/200 + 1/150)) = 0.0539

    The limits of the acceptance region are z = ±2.05, or

    p1 - p2 = 0 ± z*σ(p1-p2) = ±2.05(0.0539) = ±0.1105

    The observed z value = (p1 - p2)/σ(p1-p2) = (0.52 - 0.40)/0.0539 = 2.23

    Because 2.23 > 2.05 (or p1 - p2 = 0.12 > 0.1105), we reject H0. The proportions of working mothers in the two areas differ significantly.

    Problem: 5

    ABC Airlines wants to find out whether to include more non-vegetarian food items in its menu. In a random sample of 600 guests last year, 200 favoured the inclusion of more non-vegetarian items. This year 300 guests in a sample of 500 preferred non-vegetarian items. Help the airline decide whether it should include more non-vegetarian items. Support your recommendation by testing the appropriate hypothesis at the 0.01 level of significance.

    Solution:

    n1 = 600    p1 = 200/600 = 0.333

    n2 = 500    p2 = 300/500 = 0.600

    H0: p1 = p2

    H1: p1 < p2

    α = 0.01

    Problem:6

    In a school, two different classes are being considered for the number one rank. Of the 200 students surveyed in one section, the proportion who obtained full marks was 0.52. In another section, 40 percent of the 150 students surveyed obtained full marks. At the 0.04 level of significance, is there a significant difference in the proportions of students getting full marks?

    Solution:

    n1 = 200    p1 = 0.52

    n2 = 150    p2 = 0.40

    H0: p1 = p2

    H1: p1 ≠ p2

    α = 0.04

    p = (n1 p1 + n2 p2)/(n1 + n2) = (200(0.52) + 150(0.40))/(200 + 150) = 0.4686

    σ(p1-p2) = sqrt(p q (1/n1 + 1/n2)) = sqrt(0.4686(0.5314)(1/200 + 1/150)) = 0.0539

    The limits of the acceptance region are z = ±2.05, or

    p1 - p2 = 0 ± z*σ(p1-p2) = ±2.05(0.0539) = ±0.1105

    The observed z value = (p1 - p2)/σ(p1-p2) = (0.52 - 0.40)/0.0539 = 2.23

    Because 2.23 > 2.05, we reject H0. The proportions of students obtaining full marks differ significantly.

    Hypothesis Testing (Two Populations) 9

    Problem 1

    A machine produced 20 defective articles in a batch of 400. After overhauling, it produced 10 defectives in a batch of 300. Has the machine improved?

    n1 = 400, n2 = 300

    p1 = 20/400 = 0.05    p2 = 10/300 = 0.033

    a) Statement of null and alternate hypothesis:

    H0: P1 = P2

    H1: P1 > P2

    b) Level of significance:

    Let α = 0.05 be the level of significance. According to H1, we use a one-tailed test.

    c) Test statistic and observed value:

    z = (p1 - p2) / sqrt(P Q (1/n1 + 1/n2))

    P = (20 + 10)/700 = 3/70

    Q = 1 - P = 67/70

    z0 = (0.05 - 0.033) / sqrt((3/70)(67/70)(1/400 + 1/300)) = 0.017/0.01547 = 1.10

    d) Expected value of the statistic:

    z = (p1 - p2) / sqrt(P Q (1/n1 + 1/n2)) has a standard normal distribution for n1 and n2 ≥ 30. From the normal distribution table, ze = 1.645 for α = 0.05.

    e) Decision and conclusion:

    z0 = 1.10 and ze = 1.645. Since z0 < ze, we accept H0 and conclude that the machine has not improved due to overhauling.

    Problem 2

    A study was conducted to investigate the effectiveness of hypnotism in reducing pain. Results for randomly selected subjects are shown in the table.

    The "before" value is matched to an "after" value.

    TABLE 1

    Subject: A B C D E F G H

    Before 6.6 6.5 9.0 10.3 11.3 8.1 6.3 11.6

    After 6.8 2.4 7.4 8.5 8.1 6.1 3.4 2.0

    Are the sensory measurements, on average, lower after hypnotism? Test at a

    5% significance level.

    Corresponding "before" and "after" values form matched pairs.

    TABLE 2

    After Data    Before Data    Difference
    6.8           6.6             0.2
    2.4           6.5            -4.1
    7.4           9.0            -1.6
    8.5           10.3           -1.8
    8.1           11.3           -3.2
    6.1           8.1            -2.0
    3.4           6.3            -2.9
    2.0           11.6           -9.6

    The data for the test are the differences: {0.2, -4.1, -1.6, -1.8, -3.2, -2, -2.9, -9.6}

    The sample mean and sample standard deviation of the differences are: x̄d = -3.13 and sd = 2.91. Verify these values.

    Let μd be the population mean for the differences. We use the subscript d to denote "differences."
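The mean and standard deviation of the differences can be verified directly from Table 1. This is a minimal sketch using Python's standard `statistics` module; `stdev` is the sample standard deviation (n - 1 in the denominator), which is the one the problem uses.

```python
from statistics import mean, stdev

before = [6.6, 6.5, 9.0, 10.3, 11.3, 8.1, 6.3, 11.6]
after = [6.8, 2.4, 7.4, 8.5, 8.1, 6.1, 3.4, 2.0]

# Difference = after - before, matching Table 2
diffs = [a - b for a, b in zip(after, before)]
xd = mean(diffs)
sd = stdev(diffs)  # sample standard deviation
print(f"{xd:.3f} {sd:.2f}")
```

The unrounded mean is -3.125, which the problem reports as -3.13.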

    Random Variable: Xd = the average difference of the sensory measurements

    H0: μd ≥ 0

    There is no improvement. (μd is the population mean of the differences.)

    Ha: μd < 0

    There is improvement.

    Compare α and the p-value: α > p-value.

    Make a decision: Since α > p-value, reject H0.

    This means that μd < 0.

    PROBLEM 3-

    A company that manufactures a weight-reducing cream wanted to see whether using the cream is beneficial. They are sceptical about the launch, so they sampled 6 of its users in the month before and the month after using the cream. At a significance level of 0.02, find out whether there is a change. The results are as follows-

    EMPLOYEE            1     2     3     4     5     6
    MONTH BEFORE USE    219   205   226   198   209   216
    MONTH AFTER USE     235   186   240   203   221   205

    H0: μD = 0

    H1: μD ≠ 0

    D̄ = ΣDi/n

    SD = sqrt[ Σ(Di - D̄)^2/(n - 1) ]

    Formula to be used- t = (D̄ - μD)/(SD/sqrt(n))

    Di = 16, -19, 14, 5, 12, -11

    ΣDi = 17

    D̄ = 17/6 = 2.83

    SD = sqrt[ Σ(Di - D̄)^2/(n - 1) ] = sqrt(1054.83/5) = 14.52

    Here, a t test is conducted:

    t = (D̄ - μD)/(SD/sqrt(n)) = (2.83 - 0)/(14.52/2.449) = 2.83/5.93 = 0.48

    So, H0 cannot be rejected. Hence we can say that there is no significant change.
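The paired t statistic for the cream data can be recomputed as below. This is a sketch only; `paired_t` is an illustrative helper that takes the differences as after minus before, as the worked solution does.

```python
import math
from statistics import mean, stdev

def paired_t(before, after):
    """Mean difference, its sample SD, and the paired t statistic (H0: mean difference = 0)."""
    diffs = [a - b for a, b in zip(after, before)]
    d_bar = mean(diffs)
    s_d = stdev(diffs)
    return d_bar, s_d, d_bar / (s_d / math.sqrt(len(diffs)))

before = [219, 205, 226, 198, 209, 216]
after = [235, 186, 240, 203, 221, 205]
d_bar, s_d, t = paired_t(before, after)
print(round(d_bar, 2), round(s_d, 2), round(t, 2))
```

A statistic this close to zero is well inside the acceptance region for any usual significance level with 5 degrees of freedom.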

    PROBLEM 4-

    Two different telecom companies are trying to find out the usage of their free calls at night. The first company sampled 90 people and found an average of 8.5 hours of usage and a sample SD of 1.8 hours. The second company sampled 80 people and found an average of 7.9 hours of usage and a sample SD of 2.1 hours. At the 0.05 level of significance, does the second company have less usage?

    H0: μ1 - μ2 = 0

    H1: μ1 - μ2 > 0

    Here,

    n1 = 90     n2 = 80

    x̄1 = 8.5    x̄2 = 7.9

    s1 = 1.8    s2 = 2.1

    z = [(x̄1 - x̄2) - (μ1 - μ2)] / sqrt(s1^2/n1 + s2^2/n2)

      = (8.5 - 7.9) / sqrt((1.8)^2/90 + (2.1)^2/80)

      = 0.6/0.302

      = 1.99

    Since 1.99 exceeds the one-tailed critical value of 1.645, H0 is rejected. The second company has significantly less usage.

    Problem : 5

    Within a school district, students were randomly assigned to one of two Math

    teachers - Mrs. Smith and Mrs. Jones. After the assignment, Mrs. Smith had

    30 students, and Mrs. Jones had 25 students.

    At the end of the year, each class took the same standardized test. Mrs.

    Smith's students had an average test score of 78, with a standard deviation of

    10; and Mrs. Jones' students had an average test score of 85, with a standard

    deviation of 15.

    Test the hypothesis that Mrs. Smith and Mrs. Jones are equally effective

    teachers. Use a 0.10 level of significance. (Assume that student performance

    is approximately normal.)

    Solution:

    The solution to this problem takes four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3) analyze sample data, and (4) interpret results. We work through those steps below:

    Null hypothesis: μ1 - μ2 = 0

    Alternative hypothesis: μ1 - μ2 ≠ 0

    For this analysis, the significance level is 0.10. Using sample data, we will conduct a two-sample t-test of the null hypothesis.

    Using sample data, we compute the standard error (SE), degrees of freedom (DF), and the t-score test statistic (t).

    SE = sqrt[ (s1^2/n1) + (s2^2/n2) ]

    SE = sqrt[ (10^2/30) + (15^2/25) ] = sqrt(3.33 + 9) = sqrt(12.33) = 3.51

    DF = (s1^2/n1 + s2^2/n2)^2 / { [ (s1^2/n1)^2 / (n1 - 1) ] + [ (s2^2/n2)^2 / (n2 - 1) ] }

    DF = (10^2/30 + 15^2/25)^2 / { [ (10^2/30)^2 / 29 ] + [ (15^2/25)^2 / 24 ] }

    DF = (3.33 + 9)^2 / { [ (3.33)^2 / 29 ] + [ (9)^2 / 24 ] } = 152.03 / (0.382 + 3.375) = 152.03/3.757 = 40.47

    t = [ (x1 - x2) - d ] / SE = [ (78 - 85) - 0 ] / 3.51 = -7/3.51 = -1.99

    where s1 is the standard deviation of sample 1, s2 is the standard

    deviation of sample 2, n1 is the size of sample 1, n2 is the size of

    sample 2, x1 is the mean of sample 1, x2 is the mean of sample 2, d is

    the hypothesized difference between the population means, and SE is

    the standard error.

    Since we have a two-tailed test, the P-value is the probability that a t-

    score having 40 degrees of freedom is more extreme than -1.99; that

    is, less than -1.99 or greater than 1.99.

    We use the t Distribution Calculator to find P(t < -1.99) = 0.027, and P(t > 1.99) = 0.027. Thus, the P-value = 0.027 + 0.027 = 0.054.

    Since the P-value (0.054) is less than the significance level (0.10), we reject the null hypothesis.

    Problem 6

    In a restaurant there are 2 different departments, i.e. housekeeping and maintenance. The housekeeping department had 45 waiters and the maintenance department had 55 waiters.

    At the end of the year, each department took the same standardized test to measure its performance. The housekeeping department had an average test score of 65, with a standard deviation of 10; and the maintenance department had an average test score of 75, with a standard deviation of 15.

    Test the hypothesis that the housekeeping department and the maintenance department are equally effective. Use a 0.10 level of significance. (Assume that the waiters' performance is approximately normal.)

    Solution:

    The solution to this problem takes four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3) analyze sample data, and (4) interpret results. We work through those steps below:

    Null hypothesis: μ1 - μ2 = 0

    Alternative hypothesis: μ1 - μ2 ≠ 0

    For this analysis, the significance level is 0.10. Using sample data, we will conduct a two-sample t-test of the null hypothesis.

    Using sample data, we compute the standard error (SE), degrees of freedom (DF), and the t-score test statistic (t).

    SE = sqrt[ (s1^2/n1) + (s2^2/n2) ]

    SE = sqrt[ (10^2/45) + (15^2/55) ] = sqrt(2.22 + 4.09) = 2.51

    DF = (s1^2/n1 + s2^2/n2)^2 / { [ (s1^2/n1)^2 / (n1 - 1) ] + [ (s2^2/n2)^2 / (n2 - 1) ] }

    DF = (10^2/45 + 15^2/55)^2 / { [ (10^2/45)^2 / 44 ] + [ (15^2/55)^2 / 54 ] } = 39.86 / (0.112 + 0.310) = 94.41

    t = [ (x1 - x2) - d ] / SE = [ (65 - 75) - 0 ] / 2.51 = -3.98

    where s1 is the standard deviation of sample 1, s2 is the standard deviation of sample 2, n1 is the size of sample 1, n2 is the size of sample 2, x1 is the mean of sample 1, x2 is the mean of sample 2, d is the hypothesized difference between the population means, and SE is the standard error.

    Since we have a two-tailed test, the P-value is the probability that a t-score having 94 degrees of freedom is more extreme than -3.98; that is, less than -3.98 or greater than 3.98.

    With 94 degrees of freedom, P(t < -3.98) and P(t > 3.98) are each less than 0.001, so the P-value is less than 0.002.

    Since the P-value is less than the significance level (0.10), we reject the null hypothesis.
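The standard error, degrees of freedom and t statistic for the two departments can be checked with the sketch below. `welch_t` is an illustrative helper. The P-value line is an approximation: it assumes that, with degrees of freedom this large, the standard normal curve is close enough to the t distribution to judge significance, since the Python standard library has no t distribution.

```python
import math
from statistics import NormalDist

def welch_t(x1, s1, n1, x2, s2, n2):
    """Two-sample t statistic with unequal variances, plus SE and approximate DF."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return se, df, (x1 - x2) / se

se, df, t = welch_t(65, 10, 45, 75, 15, 55)
# Assumption: normal approximation to the t distribution for the two-tailed P-value
p_approx = 2 * NormalDist().cdf(-abs(t))
print(round(se, 2), round(df, 2), round(t, 2), p_approx < 0.10)
```

The same function reproduces the teacher comparison with `welch_t(78, 10, 30, 85, 15, 25)`.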

    Problem 7

    A credit-insurance organization has developed a new high-tech method of training new sales personnel. The company sampled 16 employees who were trained the original way and found average daily sales to be $688 and the sample standard deviation was $32.63. They also sampled 11 employees who were trained using the new method and found average daily sales to be $706 and the sample standard deviation was $24.84. At α = 0.05, can the company conclude that average daily sales have increased under the new plan?

    Solution

    n1 = 16    n2 = 11    n = n1 + n2 = 27

    x̄1 = 688    x̄2 = 706

    s1 = 32.63    s2 = 24.84

    α = 0.05

    H0: μ1 - μ2 = 0    H1: μ1 - μ2 < 0

    t critical = -1.708 (df = n1 + n2 - 2 = 25)

    sp^2 = [ (n1 - 1)s1^2 + (n2 - 1)s2^2 ] / (n1 + n2 - 2)

         = [ 15(32.63)^2 + 10(24.84)^2 ] / 25

         = 885.64

    t = (x̄1 - x̄2) / sqrt( sp^2 (1/n1 + 1/n2) )

      = (688 - 706) / sqrt( 885.64 (1/16 + 1/11) )

      = -1.544

    Answer

    Do not reject H0, since -1.544 > -1.708. Average daily sales have not increased significantly.
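The pooled variance and t statistic for the training problem can be reproduced as follows. This is a sketch under the equal-variance assumption that the pooled estimate implies; `pooled_t` is an illustrative name.

```python
import math

def pooled_t(x1, s1, n1, x2, s2, n2):
    """Pooled-variance two-sample t statistic (assumes equal population variances)."""
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    t = (x1 - x2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return sp2, t

sp2, t = pooled_t(688, 32.63, 16, 706, 24.84, 11)
print(round(sp2, 2), round(t, 3))
```

The statistic is then compared with the lower-tail t value for n1 + n2 - 2 = 25 degrees of freedom.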

    Problem 8

    Block Enterprises, a manufacturer of chips for computers, is in the process of deciding whether to replace its current semi-automated assembly line with a fully automated assembly line. Block has gathered some preliminary test data about hourly chip production, which is summarized in the following table, and it would like to know whether it should upgrade its assembly line. At α = 0.02, state and test the hypothesis to help Block decide.

                             Mean (x̄)   Std. dev. (s)   n
    Semi-automatic line      198         32              150
    Automatic line           206         29              200

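    Solution

    $\bar{x}_1 = 198$, $s_1 = 32$, $n_1 = 150$

    $\bar{x}_2 = 206$, $s_2 = 29$, $n_2 = 200$

    $\alpha = 0.02$

    $H_0: \mu_1 - \mu_2 = 0$

    $H_1: \mu_1 - \mu_2 < 0$

    The lower-tail critical value is $z_{crit} = -2.06$.

    $z = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{s_1^2/n_1 + s_2^2/n_2}} = \frac{198 - 206}{\sqrt{32^2/150 + 29^2/200}} = -2.409$

    Answer

    Since -2.409 < -2.06, H0 is rejected. Block should upgrade to an automatic line.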
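
    As a cross-check for Problem 8, the large-sample z statistic and the one-tailed critical value can be computed with Python's standard library. This is a sketch that assumes the two lines are independent samples and that the sample standard deviations stand in for the population values; `two_sample_z` is an illustrative name.

```python
import math
from statistics import NormalDist

def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """z statistic for a difference of means from two large samples."""
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (mean1 - mean2) / se

alpha = 0.02
# Sample 1: semi-automatic line; sample 2: automatic line.
z_stat = two_sample_z(198, 32, 150, 206, 29, 200)

std_normal = NormalDist()               # mean 0, standard deviation 1
z_crit = std_normal.inv_cdf(alpha)      # lower-tail critical value
p_value = std_normal.cdf(z_stat)        # lower-tail p-value
reject_h0 = z_stat < z_crit

print(f"z = {z_stat:.3f}, critical z = {z_crit:.3f}, p = {p_value:.4f}")
print("reject H0" if reject_h0 else "do not reject H0")
```

    The computed critical value is close to the tabulated -2.06, and the statistic lies beyond it, so the decision to reject H0 is unchanged.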

    One Way Anova 1

    Problem:

    A quality control supervisor for an automobile manufacturer is concerned with uniformity in the number of defects in cars coming off the assembly line. If one assembly line has significantly more variability in the number of defects, then changes have to be made. The supervisor has collected the following data:

                     Number of Defects
                     Assembly Line A    Assembly Line B
    Mean             10                 11
    Variance         9                  25
    Sample Size      20                 16

    Does Assembly line B have significantly more variability in the number of defects? Test at the 0.05 significance level.

    Solution:

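    $H_0: \sigma_B^2 = \sigma_A^2$

    $H_1: \sigma_B^2 > \sigma_A^2$

    Observed $F = s_B^2 / s_A^2 = 25/9 = 2.778$

    $F_{crit} = F_{0.05}(15, 19) = 2.23$, with $n_B - 1 = 15$ numerator and $n_A - 1 = 19$ denominator degrees of freedom.

    Since 2.778 > 2.23, we reject H0; assembly line B does have significantly more variability in the number of defects, so some changes have to be made.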
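
    The variance-ratio test above reduces to a few lines of arithmetic. The sketch below assumes the larger-variance line (B) goes in the numerator, as in the solution, and takes the critical value 2.23 from the F table rather than computing it; `variance_ratio` is an illustrative name.

```python
def variance_ratio(var_num, n_num, var_den, n_den):
    """Return the F statistic and its (numerator, denominator) degrees of freedom."""
    return var_num / var_den, (n_num - 1, n_den - 1)

# Line B (variance 25, n = 16) over line A (variance 9, n = 20).
f_stat, dfs = variance_ratio(25, 16, 9, 20)

F_CRIT = 2.23  # tabulated F(0.05; 15, 19), taken from the solution
reject_h0 = f_stat > F_CRIT

print(f"F = {f_stat:.3f} with df = {dfs}, reject H0: {reject_h0}")
```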

    Chi Sq Test 3

    Chi-Square Goodness of Fit Test

    Problem

    English test grade distributions have changed from last year, with grade B's somewhat lower. Is this significant?

    English test results

                    Grade A   Grade B   Grade C   Grade D   Grade E
    This year, O    23        32        20        15        10
    Last year       25        20        15        25        10

    Solution:

    H0: this year's grades follow the same distribution as last year's.

    The table below shows the calculation. First, the expected values are created by scaling last year's results (total 95) to this year's total (100). Then the test statistic is calculated as SUM((O - E)^2/E).

    English test results

                           Grade A   Grade B   Grade C   Grade D   Grade E   Sum
    This year, O           23        32        20        15        10        100
    Last year              25        20        15        25        10        95
    Scaled last year, E    26.3      21.1      15.8      26.3      10.5      100
    (O - E)                -3.3      10.9      4.2       -11.3     -0.5
    (O - E)^2              11.0      119.8     17.7      128.0     0.3
    (O - E)^2/E            0.4       5.7       1.1       4.9       0.0       12.1

    Chi-square is found to be 12.1 and the degrees of freedom are (5-1) = 4 (there are five possible grades). Looking this up in the Chi Square table shows that 12.1 lies between the 5% critical value (9.49) and the 1% critical value (13.28), so H0 is rejected at the 5% level and a significant change can be claimed.
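
    The goodness-of-fit calculation can be reproduced directly from the two rows of counts. This is a minimal sketch: the variable names are illustrative, and the two critical values are the table entries quoted in the solution, not values computed by the script.

```python
observed = [23, 32, 20, 15, 10]    # this year, grades A to E
last_year = [25, 20, 15, 25, 10]   # last year, grades A to E

# Scale last year's counts so they sum to this year's total.
scale = sum(observed) / sum(last_year)
expected = [scale * count for count in last_year]

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

CRIT_5_PERCENT = 9.49    # chi-square table, df = 4
CRIT_1_PERCENT = 13.28   # chi-square table, df = 4

print("expected:", [round(e, 1) for e in expected])
print(f"chi-square = {chi_sq:.1f}, df = {df}")
print("significant at 5%:", chi_sq > CRIT_5_PERCENT)
print("significant at 1%:", chi_sq > CRIT_1_PERCENT)
```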

    Chi-Square Test of Independence

    Problem

    A year group in school chooses between drama and history as below. Is there any difference between boys' and girls' choices?

             Chose drama   Chose history
    Boys     43            55
    Girls    52            54

    Solution:

    Observed

             Chose drama   Chose history   Total
    Boys     43            55              98
    Girls    52            54              106
    Total    95            109             204

    Expected = (row tot * col tot)/overall tot

             Chose drama   Chose history   Total
    Boys     45.6          52.4            98
    Girls    49.4          56.6            106
    Total    95            109             204

    (observed - expected)^2/expected

             Chose drama   Chose history   Total
    Boys     0.2           0.1
    Girls    0.1           0.1
    Total                                  0.55

    Chi-square is 0.55. There are (2-1)*(2-1) = 1 degree of freedom. Checking the Chi Square table shows 0.55 lies between 0.004 and 3.84, well below the 5% critical value of 3.84, so the null hypothesis of independence cannot be rejected: the data show no significant difference between boys' and girls' choices.
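
    The expected counts and the chi-square statistic for the 2 x 2 table can be recomputed as follows. This is a sketch with illustrative variable names; it applies no continuity correction, matching the hand calculation, and the 3.84 threshold is the tabulated 5% value for 1 degree of freedom.

```python
observed = [
    [43, 55],   # boys: drama, history
    [52, 54],   # girls: drama, history
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count = (row total * column total) / overall total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

CRIT_5_PERCENT = 3.84   # chi-square table, df = 1
print(f"chi-square = {chi_sq:.2f}, df = {df}")
print("significant at 5%:", chi_sq > CRIT_5_PERCENT)
```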

    Chi-Square Test Equality of Proportions

    Problem

    A wholesale merchant received a shipment of goods which is claimed to contain 5% defective items. The merchant decided to verify this. He drew a sample of 15 items and found 3 defective items. Test the claim. Use α = 0.05.

    Solution

    Given p = the proportion of defectives in the whole shipment

    Claimed value: p = p0 = 5/100 = 0.05

    n = 15, sample proportion p̂ = x/n = 3/15

    a. Statement of null and alternate hypothesis:

    H0: p = 0.05

    H1: p > 0.05

    b. Level of significance:

    Given α = 0.05. Because H1 is p > 0.05, we use a right-tailed test.

    c. Test statistic and observed value:

    x0 = 3, the number of defectives

    p̂ = 3/15 = 0.2

    d. Probability of a result at least this extreme under H0:

    $P(X \ge 3) = \sum_{x=3}^{15} \binom{15}{x} p_0^x (1 - p_0)^{15 - x}$, with $p_0 = 0.05$

    = 0.0362

    e. Decision & Conclusion:

    x0 = 3 lies in the rejection region, since 0.0362 is less than 0.05. Therefore we reject H0 and conclude that the shipment contains more than 5% defective items, and hence the merchant is advised to reject the shipment.
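
    The tail probability in step d is an exact binomial sum, so it can be checked without tables. The sketch below assumes items are defective independently with probability p0; `binom_upper_tail` is an illustrative helper name.

```python
from math import comb

def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k, n + 1))

alpha = 0.05
p_value = binom_upper_tail(3, 15, 0.05)   # 3 or more defectives in 15 items
reject_h0 = p_value < alpha

print(f"P(X >= 3) = {p_value:.4f}, reject H0: {reject_h0}")
```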

    Binomial Dist 1

    1. Binomial Distribution

    Problem

    Harley Davidson, director of quality control for the Kyoto Motor Company, is conducting his monthly spot check of automatic transmissions. In this procedure, 10 transmissions are removed from the pool of components and are checked for manufacturing defects. Historically, only 2 percent of the transmissions have flaws.

    a. What is the probability that Harley's sample contains more than two transmissions with manufacturing flaws?

    b. What is the probability that none of the selected transmissions has any manufacturing flaws?

    Solution

    n = 10

    p = 0.02

    1 - p = 0.98

    Formula: $P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}$

    $P(X = 0) = \binom{10}{0} (0.02)^0 (0.98)^{10} = 0.8171$

    $P(X = 1) = \binom{10}{1} (0.02)^1 (0.98)^9 = 0.1667$

    $P(X = 2) = \binom{10}{2} (0.02)^2 (0.98)^8 = 0.0153$

    $P(X > 2) = 1 - P(X \le 2)$

    = 1 - [P(X = 0) + P(X = 1) + P(X = 2)]

    = 1 - [0.8171 + 0.1667 + 0.0153]

    = 1 - 0.9991

    = 0.0009

    Answer

    a. The probability that Harley's sample contains more than two transmissions with manufacturing flaws is 0.0009.

    b. The probability that none of the selected transmissions has any manufacturing flaws is 0.8171.
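
    Both answers follow from the binomial probability mass function, which is easy to evaluate exactly. This is a small sketch assuming each transmission is flawed independently with probability 0.02; `binom_pmf` is an illustrative name.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.02
probs = [binom_pmf(x, n, p) for x in range(3)]   # P(X = 0), P(X = 1), P(X = 2)

p_none = probs[0]
p_more_than_two = 1 - sum(probs)

for x, prob in enumerate(probs):
    print(f"P(X = {x}) = {prob:.4f}")
print(f"P(X > 2) = {p_more_than_two:.4f}")
```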

    Poisson Dist 1

    Poisson Distribution

    Problem

    Southwestern Electronics has developed a new calculator that performs a series of functions not yet performed by any other calculator. The marketing department is planning to demonstrate this calculator to a group of potential customers, but is worried about some initial problems, which have resulted in 4 percent of the new calculators developing mathematical inconsistencies. The marketing VP is planning on randomly selecting a group of calculators for this demonstration and is worried about the chances of selecting a calculator that could start malfunctioning. He believes that whether or not a calculator functions is a Bernoulli process and he is convinced that the probability of malfunction is really about 0.04.

    Assuming that the VP selects exactly 50 calculators to use in the demonstration, and using the Poisson distribution as an approximation of the binomial, what is the chance of getting at least three calculators that malfunction?

    No calculators malfunctioning?

    Solution

    n = 50

    p = 0.04

    $\lambda = np = 2$

    $e^{-\lambda} = e^{-2} = 0.13534$

    Formula: $P(X = x) = \frac{\lambda^x e^{-\lambda}}{x!}$

    $P(X = 0) = \frac{2^0 e^{-2}}{0!} = 0.13534$

    $P(X = 1) = \frac{2^1 e^{-2}}{1!} = 0.27067$

    $P(X = 2) = \frac{2^2 e^{-2}}{2!} = 0.27067$

    $P(X = 3) = \frac{2^3 e^{-2}}{3!} = 0.18045$

    $P(X \ge 3) = 1 - P(X \le 2)$

    = 1 - [P(X = 0) + P(X = 1) + P(X = 2)]

    = 1 - [0.13534 + 0.27067 + 0.27067]

    = 0.32332

    Answer

    The chance of getting at least three calculators that malfunction is 32.33%.

    The chance of no calculators malfunctioning is 13.53%.
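
    The Poisson probabilities can be verified numerically. The sketch below uses the problem's approximation with the rate set to n times p; `poisson_pmf` is an illustrative helper, not something from the original solution.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam)."""
    return lam**x * exp(-lam) / factorial(x)

n, p = 50, 0.04
lam = n * p   # Poisson rate used to approximate Binomial(n, p)

p_none = poisson_pmf(0, lam)
p_at_least_three = 1 - sum(poisson_pmf(x, lam) for x in range(3))

print(f"lambda = {lam}")
print(f"P(X = 0)  = {p_none:.4f}")
print(f"P(X >= 3) = {p_at_least_three:.4f}")
```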

    Normal Dis 1

    Normal Distribution

    Problem

    Regulations concerning the maximum number of people who can occupy a lift are to be set. The total weight of 8 people chosen at random follows a normal distribution with a mean of 550kg and a standard deviation of 150kg. What is the probability that the total weight of 8 people exceeds 600kg?

    Solution

    The mean is 550kg and we are interested in the area that is greater than 600kg.

    z = ( x - m ) / s

    Here x = 600kg

    m, the mean = 550kg

    s, the standard deviation = 150kg

    z = ( 600 - 550 ) / 150

    z = 50 / 150

    z = 0.33

    Looking in the table for z = 0.3, and across under 0.03.

    The number in the table is the tail area for z=0.33 which is 0.3707.

    This is the probability that the weight will exceed 600kg.

    Therefore, the probability that the total weight of 8 people exceeds 600kg is 0.37 correct to 2 figures.
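
    The table lookup can be cross-checked with Python's standard library. This is a sketch using `statistics.NormalDist`; because it does not round z to 0.33 first, the unrounded tail area differs slightly from the table value in the third decimal place, but it rounds to the same two-figure answer.

```python
from statistics import NormalDist

total_weight = NormalDist(mu=550, sigma=150)   # total weight of 8 people, in kg

z = (600 - total_weight.mean) / total_weight.stdev
p_exceeds = 1 - total_weight.cdf(600)          # upper-tail area above 600 kg

print(f"z = {z:.2f}")
print(f"P(total weight > 600 kg) = {p_exceeds:.2f}")
```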