
INSTRUCTOR'S SOLUTIONS MANUAL

to Accompany

James T. McClave, P. George Benson, and Terry Sincich's
STATISTICS FOR BUSINESS AND ECONOMICS
Tenth Edition

Nancy S. Boudreau
Bowling Green State University

Upper Saddle River, New Jersey    Columbus, Ohio


Contents

Preface
Chapter 1   Statistics, Data, and Statistical Thinking
Chapter 2   Methods for Describing Sets of Data
            The Kentucky Milk Case
Chapter 3   Probability
Chapter 4   Random Variables and Probability Distributions
            The Furniture Fire Case
Chapter 5   Inferences Based on a Single Sample: Estimation with Confidence Intervals
Chapter 6   Inferences Based on a Single Sample: Tests of Hypothesis
Chapter 7   Inferences Based on Two Samples: Confidence Intervals and Tests of Hypotheses
            The Kentucky Milk Case Part II
Chapter 8   Design of Experiments and Analysis of Variance
Chapter 9   Categorical Data Analysis
            Discrimination in the Work Place
Chapter 10  Simple Linear Regression
Chapter 11  Multiple Regression and Model Building
            The Condo Sales Case
Chapter 12  Methods for Quality Improvement
Chapter 13  Time Series: Descriptive Analyses, Models, and Forecasting
            The Gasket Manufacturing Case
Chapter 14  Nonparametric Statistics


Preface

This solutions manual is designed to accompany the text Statistics for Business and Economics, Tenth Edition, by James T. McClave, P. George Benson, and Terry Sincich. It provides answers to most even-numbered exercises for each chapter in the text. Other methods of solution may also be appropriate; however, the author has presented one that she believes to be most instructive to the beginning Statistics student.

This manual is provided to help instructors save time in preparing presentations of the solutions and to possibly provide another point of view regarding their meaning.

Some of the exercises are subjective in nature. Subjective decisions regarding these exercises have been made and are explained by the author. Solutions based on these decisions are presented; the solution to this type of exercise is often most instructive. When an alternative interpretation of an exercise may occur, the author has often addressed it and given justification for the approach taken.

I would like to thank Kelly Barber for creating the art work and for typing this work.

Nancy S. Boudreau
Bowling Green State University
Bowling Green, Ohio


Chapter 1   Statistics, Data, and Statistical Thinking

1.2 Descriptive statistics utilizes numerical and graphical methods to look for patterns, to summarize, and to present the information in a set of data. Inferential statistics utilizes sample data to make estimates, decisions, predictions, or other generalizations about a larger set of data.

1.4 The first element of inferential statistics is the population of interest. The population is a set of existing units. The second element is one or more variables that are to be investigated. A variable is a characteristic or property of an individual population unit. The third element is the sample. A sample is a subset of the units of a population. The fourth element is the inference about the population based on information contained in the sample. A statistical inference is an estimate, prediction, or generalization about a population based on information contained in a sample. The fifth and final element of inferential statistics is the measure of reliability for the inference. The reliability of an inference is how confident one is that the inference is correct.

1.6 Quantitative data are measurements that are recorded on a meaningful numerical scale. Qualitative data are measurements that are not numerical in nature; they can only be classified into one of a group of categories.

1.8 A population is a set of existing units such as people, objects, transactions, or events. A sample is a subset of the units of a population.

1.10 An inference without a measure of reliability is nothing more than a guess. A measure of reliability separates statistical inference from fortune telling or guessing. Reliability gives a measure of how confident one is that the inference is correct.

1.12 Statistical thinking involves applying rational thought processes to critically assess data and inferences made from the data. It involves not taking all data and inferences presented at face value, but rather making sure the inferences and data are valid.

1.14 a. The two variables measured are type of credit card used and amount of purchase. Type of credit card used is qualitative. It has no meaningful number associated with it, only the name of the card used. Amount of purchase is quantitative. It has a meaningful number associated with it.

b. In Study 1, it says that all purchases were tracked. Thus, the data represent a population.

1.16 a. High school GPA is a number, usually between 0.0 and 4.0. Therefore, it is quantitative.

b. Honors/awards would have responses that name things. Therefore, it would be qualitative.


c. The scores on the SATs are numbers between 200 and 800. Therefore, it is quantitative.

d. Gender is either male or female. Therefore, it is qualitative.

e. Parents' income is a number: $25,000, $45,000, etc. Therefore, it is quantitative.

f. Age is a number: 17, 18, etc. Therefore, it is quantitative.

1.18 a. 1. The variable of interest is the status of a company's e-commerce strategy. Since a company either has an e-commerce strategy or not, the variable is qualitative.

2. The variable of interest is when the company will implement an e-commerce plan. Since the time of implementation will be a date, this variable will be qualitative.

3. The variable of interest is whether or not the company is delivering products over the internet. Since the company is either delivering products or not, the variable is qualitative.

4. The variable of interest is the company's total revenue in the last fiscal year. Since this is a meaningful number, this variable is quantitative.

b. Since there are many more than 154 companies in the U.S., this represents a sample rather than a population.

1.20 a. The population of interest is the collection of computer security personnel at all U.S. corporations and government agencies.

b. Surveys were sent to computer security personnel at all U.S. corporations and government agencies. However, in 2006, only 616 organizations responded to the survey. There could be nonresponse bias. Often, only those subjects with strong opinions will respond to a survey. Thus, the responses may not reflect what the population as a whole thinks.

c. The variable measured in the survey is whether or not there was unauthorized use of computer systems at the firms during the year. Since the responses will be either Yes or No, the variable is qualitative.

d. If we assume that the responses were a random sample from the population, we could infer that about 52% of all computer security personnel will admit to unauthorized use of computer systems at their firms during the year.

1.22 a. The data collection method used is a designed experiment.

b. The experimental units in the study are the 50,000 smokers.

    c. The variable of interest is the age at which the scanning method first detects a tumor. Since this is a meaningful number, this variable is quantitative.


d. The population of interest is the set of all smokers in the U.S. The sample of interest is the set of 50,000 smokers surveyed.

e. The researchers want to compare the age at first detection for the two methods to see if one is more sensitive than the other.

1.24 a. The variable of interest to the researchers is the rating of highway bridges.

b. Since the rating of a bridge can be categorized as one of three possible values, it is qualitative.

c. The data set analyzed is a population since all highway bridges in the U.S. were categorized.

d. The data were collected observationally. Each bridge was observed in its natural setting.

1.26 a. The population of interest is the set of all New York accounting firms employing two or more professionals. There are two variables of interest: whether or not the firm uses audit sampling methods, and if so, whether or not it uses random sampling. The sample is the set of 163 firms whose responses were usable. The inference of interest to the New York Society of CPAs is the proportion of all New York accounting firms employing two or more professionals that use sampling methods in auditing their clients.

b. The four responses that were unusable could have been returned blank or could have been filled out incorrectly.

c. Any time a survey is mailed, it is questionable whether the returned questionnaires represent a random sample. Oftentimes, only those with very strong opinions return the surveys. In such a case, the returned surveys would not be representative of the entire population.

1.28 a. The experimental units in this study are the 24 projects.

b. The population from which the sample was selected is the set of all new software development projects.

c. The variable of interest in this project is the outcome of reusing previously developed software for the new software development projects.

d. In the sample, 9 of the 24 projects were judged failures. This is (9/24)(100%) = 37.5%. We could infer that approximately 37.5% of all projects would be judged failures.

1.30 a. The process being studied is the process of filling beverage cans with soft drink at CCSB's Wakefield plant.

b. The variable of interest is the amount of carbon dioxide added to each can of beverage.

c. The sampling plan was to monitor five filled cans every 15 minutes. The sample is the total number of cans selected.


d. The company's immediate interest is learning about the process of filling beverage cans with soft drink at CCSB's Wakefield plant. To do this, they are measuring the amount of carbon dioxide added to a can of beverage to make an inference about the process of filling beverage cans. In particular, they might use the mean amount of carbon dioxide added to the sampled cans of beverage to estimate the mean amount of carbon dioxide added to all the cans on the process line.

e. The technician would then be dealing with a population. The cans of beverage have already been processed. He/she is now interested in the outputs.


Chapter 2   Methods for Describing Sets of Data

2.2 a. To find the frequency for each class, count the number of times each letter occurs. The frequencies for the three classes are:

Class   Frequency
X       8
Y       9
Z       3
Total   20

b. The relative frequency for each class is found by dividing the frequency by the total sample size. The relative frequency for class X is 8/20 = .40. The relative frequency for class Y is 9/20 = .45. The relative frequency for class Z is 3/20 = .15.

Class   Frequency   Relative Frequency
X       8           .40
Y       9           .45
Z       3           .15
Total   20          1.00

c. The frequency bar chart is: [bar chart omitted from this transcript]

d. The pie chart for the frequency distribution is: [pie chart omitted from this transcript]
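The frequency and relative frequency tallies above can be reproduced with a short Python sketch (not part of the original manual). The 20-letter sample below is hypothetical, constructed only to match the reported counts (X = 8, Y = 9, Z = 3):

from collections import Counter

sample = list("XXXXXXXX" + "YYYYYYYYY" + "ZZZ")  # hypothetical data: 8 X's, 9 Y's, 3 Z's

freq = Counter(sample)
n = len(sample)
for cls in sorted(freq):
    # relative frequency = class frequency / total sample size
    print(f"{cls}: frequency = {freq[cls]}, relative frequency = {freq[cls] / n:.2f}")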


2.4 a. The variable summarized in the table is reason for requesting the installation of the passenger-side on-off switch. The values this variable could assume are: Infant, Child, Medical, Infant & Medical, Child & Medical, Infant & Child, and Infant & Child & Medical. Since the responses name something, the variable is qualitative.

b. The relative frequencies are found by dividing the number of requests for each category by the total number of requests. For the category Infant, the relative frequency is 1,852/30,337 = .061. The rest of the relative frequencies are found in the table below:

Reason                     Number of Requests   Relative Frequency
Infant                     1,852                1,852/30,337 = .061
Child                      17,148               17,148/30,337 = .565
Medical                    8,377                8,377/30,337 = .276
Infant & Medical           44                   44/30,337 = .0014
Child & Medical            903                  903/30,337 = .030
Infant & Child             1,878                1,878/30,337 = .062
Infant & Child & Medical   135                  135/30,337 = .0045
TOTAL                      30,337               .9999

c. Using MINITAB, a pie chart of the data is:

[Pie Chart of Reason: Child 56.5%; Medical 27.6%; Infant & Child 6.2%; Infant 6.1%; Child & Medical 3.0%; Infant & Child & Medical 0.4%; Infant & Medical 0.1%]

    d. There are 4 categories where Medical is mentioned as a reason: Medical, Infant & Medical, Child & Medical, and Infant & Child & Medical. The sum of the frequencies for these 4 categories is 8,377 + 44 + 903 + 135 = 9,459. The proportion listing Medical as one of the reasons is 9,459/30,337 = .312.


2.6 a. To find relative frequencies, we divide the frequencies of each category by the total number of incidents. The relative frequencies of the number of incidents for each of the cause categories are:

Management System Cause Category   Number of Incidents   Relative Frequency
Engineering & Design               27                    27/83 = .325
Procedures & Practices             24                    24/83 = .289
Management & Oversight             22                    22/83 = .265
Training & Communication           10                    10/83 = .120
TOTAL                              83                    1.000

b. The Pareto diagram is:

[Pareto diagram: percent of incidents by Management System Cause Category, in decreasing order: Eng & Des, Proc & Pract, Mgmt & Over, Trn & Comm]

    c. The category with the highest relative frequency of incidents is Engineering and Design. The category with the lowest relative frequency of incidents is Training and Communication.

2.8 a. The data collection method was a survey.

b. Since the data were numbers (percentage of US labor and materials), the variable is quantitative. Once the data were collected, they were grouped into 4 categories.


c. Using MINITAB, a pie chart of the data is: [pie chart omitted from this transcript]

2.12 Using MINITAB, the side-by-side bar charts are:

[Side-by-side relative frequency bar charts of unauthorized use of computer systems (Yes / No / Don't know), 1999 vs. 2006]

    The relative frequency of unauthorized use of computer systems has decreased from 1999 to 2006.

2.14 a. Using MINITAB, the side-by-side graphs are:

[Side-by-side frequency bar charts of the number of stars (2, 3, 4, 5) for each of the four criteria: Exposure, Opportunity, Content, Faculty]

From these graphs, one can see that very few of the top 30 MBA programs got 5 stars in any criteria. In addition, about the same number of programs got 4 stars in each of the 4 criteria. The biggest difference in ratings among the 4 criteria was in the number of programs receiving 3 stars. More programs received 3 stars in Course Content than in any of the other criteria. Consequently, fewer programs received 2 stars in Course Content than in any of the other criteria.

b. Since this chart lists the rankings of only the top 30 MBA programs in the world, it is reasonable that none of these best programs would be rated as 1 star on any criteria.


2.16 [Solution figure omitted from this transcript.]

2.18 a. The original data set has 1 + 3 + 5 + 7 + 4 + 3 = 23 observations.

b. For the bottom row of the stem-and-leaf display: the stem is 0 and the leaves are 0, 1, 2. The numbers in the original data set are 0, 1, and 2.

c. The dot plot corresponding to all the data points is: [dot plot omitted]

2.20 a. The measurement class that contains the highest proportion of respondents is none. Sixty-one percent of the respondents said that their companies did not outsource any computer security functions.

b. From the graph, 6% of the respondents indicated that they outsourced between 20% and 40% of their computer security functions.

c. The proportion of the 609 respondents who outsourced at least 40% of computer security functions is .04 + .01 + .01 = .06.

d. The number of the 609 respondents who outsourced less than 20% of computer security functions is (.27 + .61)(609) = .88(609) = 536.


2.22 a. Using MINITAB, the stem-and-leaf display of the data is:

Stem-and-Leaf Display: SCORE
Stem-and-leaf of SCORE   N = 169
Leaf Unit = 1.0

    1    6  2
    1    6
    2    7  2
    3    7  8
    4    8  4
   15    8  66677888899
   56    9  00001111111222222222233333333344444444444
 (100)   9  55555555555555555555556666666666666666666777777777777777777888888+
   13   10  0000000000000

b. From the stem-and-leaf display, we see that there are only 4 observations with sanitation scores less than the acceptable score of 86. The proportion of ships that have an accepted sanitation standard would be (169 - 4)/169 = .976.

c. The sanitation score of 84 is in bold in the stem-and-leaf display in part a.

2.24 a. Using MINITAB, the frequency histogram is:

[Frequency histogram of Length]


b. Using MINITAB, the frequency histogram is:

[Frequency histogram of Weight]

c. Using MINITAB, the frequency histogram is:

[Frequency histogram of DDT]

2.26 Using MINITAB, the two dot plots are:

[Dot plots of Arrive and Depart]

Yes. Most of the numbers of items arriving at the work center per hour are in the 135 to 165 area. Most of the numbers of items departing the work center per hour are in the 110 to 140 area. Because the number of items arriving is larger than the number of items departing, there will probably be some sort of bottleneck.


2.28 a. Using MINITAB, the three frequency histograms are as follows (the same starting point and class interval were used for each):

Histogram of C1   N = 25   Tenth Performance
Midpoint   Count
 4.00       0
 8.00       0
12.00       1   *
16.00       5   *****
20.00      10   **********
24.00       6   ******
28.00       0
32.00       2   **
36.00       0
40.00       1   *

Histogram of C2   N = 25   Thirtieth Performance
Midpoint   Count
 4.00       1   *
 8.00       9   *********
12.00      12   ************
16.00       2   **
20.00       1   *

Histogram of C3   N = 25   Fiftieth Performance
Midpoint   Count
 4.00       3   ***
 8.00      15   ***************
12.00       4   ****
16.00       2   **
20.00       1   *

b. The histogram for the tenth performance shows a much greater spread of the observations than the other two histograms. The thirtieth performance histogram shows a shift to the left, implying shorter completion times than for the tenth performance. In addition, the fiftieth performance histogram shows an additional shift to the left compared to that for the thirtieth performance. However, the last shift is not as great as the first shift. This agrees with statements made in the problem.


2.30 a. A stem-and-leaf display is as follows, where the stems are the units place and the leaves are the decimal places:

Stem   Leaves
 1     0 0 0 0 1 1 2 2 2 2 2 3 4 4 4 4 4 4 4 5 5 5 5 6 7 9
 2     1 1 4 4 6 7 9 9
 3     0 0 2 8 9 9
 4     1 1 1 2 5
 5     2 4
 6     7
 7     8
 8
 9
10     1

b. A little more than half (26/49 = .53) of all companies spent less than 2 months in bankruptcy. Only two of the 49 companies spent more than 6 months in bankruptcy. It appears then, in general, the length of time in bankruptcy for firms using "prepacks" is less than that of firms not using "prepacks."

c. A dot diagram will be used to compare the time in bankruptcy for the three types of "prepack" firms: [dot diagram omitted]

d. The circled times in part a correspond to companies that were reorganized through a leverage buyout. There does not appear to be any pattern to these points. They appear to be scattered about evenly throughout the distribution of all times.

2.32 Using MINITAB, the stem-and-leaf display for the data is:

Stem-and-leaf of Time   N = 25
Leaf Unit = 1.0

   3    3  239
   7    4  3499
  (7)   5  0011469
  11    6  34458
   6    7  13
   4    8  26
   2    9  5
   1   10  2

The numbers in bold represent delivery times associated with customers who subsequently did not place additional orders with the firm. Since there were only 2 customers with delivery times of 68 days or longer that placed additional orders, I would say the maximum tolerable delivery time is about 65 to 67 days. Everyone with delivery times less than 67 days placed additional orders.


2.34 a. $\sum x = 3 + 8 + 4 + 5 + 3 + 4 + 6 = 33$

b. $\sum x^2 = 3^2 + 8^2 + 4^2 + 5^2 + 3^2 + 4^2 + 6^2 = 175$

c. $\sum (x - 5)^2 = (3-5)^2 + (8-5)^2 + (4-5)^2 + (5-5)^2 + (3-5)^2 + (4-5)^2 + (6-5)^2 = 20$

d. $\sum (x - 2)^2 = (3-2)^2 + (8-2)^2 + (4-2)^2 + (5-2)^2 + (3-2)^2 + (4-2)^2 + (6-2)^2 = 71$

e. $\left(\sum x\right)^2 = (3 + 8 + 4 + 5 + 3 + 4 + 6)^2 = 33^2 = 1089$

2.36 a. $\sum x = 6 + 0 + (-2) + (-1) + 3 = 6$

b. $\sum x^2 = 6^2 + 0^2 + (-2)^2 + (-1)^2 + 3^2 = 50$

c. $\sum x^2 - \frac{\left(\sum x\right)^2}{5} = 50 - \frac{6^2}{5} = 50 - 7.2 = 42.8$

2.38 a. $\bar{x} = \frac{\sum x}{n} = \frac{85}{10} = 8.5$

b. $\bar{x} = \frac{400}{16} = 25$

c. $\bar{x} = \frac{35}{45} = .78$

d. $\bar{x} = \frac{242}{18} = 13.44$

2.40 The median is the middle number once the data have been arranged in order. If n is even, there is not a single middle number. Thus, to compute the median, we take the average of the middle two numbers. If n is odd, there is a single middle number. The median is this middle number.

A data set with five measurements arranged in order is 1, 3, 5, 6, 8. The median is the middle number, which is 5.

A data set with six measurements arranged in order is 1, 3, 5, 5, 6, 8. The median is the average of the middle two numbers, which is $\frac{5 + 5}{2} = \frac{10}{2} = 5$.


2.42 a. $\bar{x} = \frac{\sum x}{n} = \frac{7 + \cdots + 4}{6} = \frac{15}{6} = 2.5$

Median $= \frac{3 + 3}{2} = 3$ (mean of 3rd and 4th numbers, after ordering)

Mode = 3

b. $\bar{x} = \frac{\sum x}{n} = \frac{2 + \cdots + 4}{13} = \frac{40}{13} = 3.08$

Median = 3 (7th number, after ordering)

Mode = 3

c. $\bar{x} = \frac{\sum x}{n} = \frac{51 + \cdots + 37}{10} = \frac{496}{10} = 49.6$

Median $= \frac{48 + 50}{2} = 49$ (mean of 5th and 6th numbers, after ordering)

Mode = 50

2.44 a. The sample mean is:

$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{529 + 355 + 301 + \cdots + 63}{26} = \frac{3757}{26} = 144.5$

The sample median is found by finding the average of the 13th and 14th observations once the data are arranged in order. The 13th and 14th observations are 100 and 105. The average of these two numbers (median) is:

$\text{median} = \frac{100 + 105}{2} = \frac{205}{2} = 102.5$

The mode is the observation appearing the most. For this data set, the mode is 70, which appears 3 times.

Since the mean is larger than the median, the data are skewed to the right.

b. The sample mean is:

$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{11 + 9 + 6 + \cdots + 4}{26} = \frac{136}{26} = 5.23$

The sample median is found by finding the average of the 13th and 14th observations once the data are arranged in order. The 13th and 14th observations are 5 and 5. The average of these two numbers (median) is:

$\text{median} = \frac{5 + 5}{2} = \frac{10}{2} = 5$


The mode is the observation appearing the most. For this data set, the mode is 6, which appears 6 times.

Since the mean and median are about the same, the data are somewhat symmetric.
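The mean/median/mode recipe used in Exercises 2.42 and 2.44 can be sketched in Python with the standard statistics module. The 13 values below are hypothetical, chosen only to be consistent with the summaries reported in Exercise 2.42b (n = 13, sum = 40, median = 3, mode = 3):

import statistics

data = [2, 3, 3, 3, 3, 1, 4, 5, 3, 2, 3, 4, 4]  # hypothetical values

print(round(sum(data) / len(data), 2))  # mean: 40/13 = 3.08
print(statistics.median(data))          # middle value after ordering: 3
print(statistics.mode(data))            # most frequent value: 3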

2.46 a. The sample mean is:

$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{1.72 + 2.50 + 2.16 + \cdots + 1.95}{20} = \frac{37.62}{20} = 1.881$

The sample average surface roughness of the 20 observations is 1.881.

b. The median is found as the average of the 10th and 11th observations, once the data have been ordered. The ordered data are:

1.06 1.09 1.19 1.26 1.27 1.40 1.51 1.72 1.95 2.03 2.05 2.13 2.13 2.16 2.24 2.31 2.41 2.50 2.57 2.64

The 10th and 11th observations are 2.03 and 2.05. The median is:

$\frac{2.03 + 2.05}{2} = \frac{4.08}{2} = 2.04$

The middle surface roughness measurement is 2.04. Half of the sample measurements were less than 2.04 and half were greater than 2.04.

c. The data are somewhat skewed to the left. Thus, the median might be a better measure of central tendency than the mean. The few small values in the data tend to make the mean smaller than the median.

2.48 a. Using MINITAB, the stem-and-leaf display is:

Stem-and-leaf of PAF   N = 17
Leaf Unit = 1.0

  6   0  000009
  8   1  25
 (2)  2  45
  7   3  13
  5   4  0
  4   5
  4   6  2
  3   7  057

b. The median is the middle number once the data are arranged in order. The data arranged in order are:

0, 0, 0, 0, 0, 9, 12, 15, 24, 25, 31, 33, 40, 62, 70, 75, 77

The middle number, or the median, is 24.

c. The mean of the data is $\bar{x} = \frac{\sum x}{n} = \frac{77 + 33 + 75 + 31 + \cdots}{17} = \frac{473}{17} = 27.82$


d. The number occurring most frequently is 0. The mode is 0.

e. The mode corresponds to the smallest number. It does not seem to locate the center of the distribution. Both the mean and the median are in the middle of the stem-and-leaf display. Thus, it appears that both of them locate the center of the data.

2.50 a. The sample mean length is:

$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{42.5 + 44.0 + 41.5 + \cdots + 36.0}{144} = \frac{6165}{144} = 42.81$

The average length of the 144 fish is 42.81 cm.

The median is the average of the middle two observations once they have been ordered. The 72nd and 73rd observations are 45 and 45. The average of these two observations is 45. Half of the fish lengths are less than 45 cm and half are longer.

The mode is 46 cm. This observation occurred 12 times.

b. The sample mean weight is:

$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{732 + 795 + 547 + \cdots + 1433}{144} = \frac{151159}{144} = 1049.72$

The average weight of the 144 fish is 1049.72 grams.

The median is the average of the middle two observations once they have been ordered. The 72nd and 73rd observations are 989 and 1011. The average of these two observations is

$\text{median} = \frac{989 + 1011}{2} = 1000$

Half of the fish weights are less than 1000 grams and half are heavier.

There are 2 modes, 886 and 1186. Each of these observations occurred 3 times.

c. The sample mean DDT level is:

$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{10 + 16 + 23 + \cdots + 1.9}{144} = \frac{3507.1}{144} = 24.35$

The average DDT level of the 144 fish is 24.35 parts per million.


The median is the average of the middle two observations once they have been ordered. The 72nd and 73rd observations are 7.1 and 7.2. The average of these two observations is

$\text{median} = \frac{7.1 + 7.2}{2} = 7.15$

Half of the fish DDT levels are less than 7.15 parts per million and half are greater.

The mode is 12. This observation occurred 8 times.

d. From the graph in Exercise 2.24a, the data are skewed to the left. This corresponds to the relationship between the mean and the median. For data skewed to the left, the mean is less than the median. For the fish lengths, the mean is 42.81 and the median is 45.

e. From the graph in Exercise 2.24b, the data are slightly skewed to the right. This corresponds to the relationship between the mean and the median. For data skewed to the right, the mean is more than the median. For the fish weights, the mean is 1049.72 and the median is 1000.

f. From the graph in Exercise 2.24c, the data are skewed to the right. This corresponds to the relationship between the mean and the median. For data skewed to the right, the mean is more than the median. For the fish DDT levels, the mean is 24.35 and the median is 7.15.

2.52 a. Due to the "elite" superstars, the salary distribution is skewed to the right. Since this implies that the median is less than the mean, the players' association would want to use the median.

b. The owners, by the logic of part a, would want to use the mean.

2.54 a. The sample mean is:

$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{80}{20} = 4$

The sample median is found by finding the average of the 10th and 11th observations once the data are arranged in order. The data arranged in order are:

1 1 1 1 1 2 2 3 3 3 4 4 4 5 5 5 6 7 9 13

The 10th and 11th observations are 3 and 4. The average of these two numbers (median) is:

$\text{median} = \frac{3 + 4}{2} = \frac{7}{2} = 3.5$

The mode is the observation appearing the most. For this data set, the mode is 1, which appears 5 times.


b. Eliminating the largest number, which is 13, results in the following:

The sample mean is:

$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{67}{19} = 3.53$

The sample median is found by finding the middle observation once the data are arranged in order. The data arranged in order are:

1 1 1 1 1 2 2 3 3 3 4 4 4 5 5 5 6 7 9

The 10th observation is 3. The median is 3.

The mode is the observation appearing the most. For this data set, the mode is 1, which appears 5 times.

By dropping the largest number, the mean is reduced from 4 to 3.53. The median is reduced from 3.5 to 3. There is no effect on the mode.

c. The data arranged in order are:

1 1 1 1 1 2 2 3 3 3 4 4 4 5 5 5 6 7 9 13

If we drop the lowest 2 and largest 2 observations, we are left with:

1 1 1 2 2 3 3 3 4 4 4 5 5 5 6 7

The sample 10% trimmed mean is:

$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{1 + 1 + 1 + \cdots + 7}{16} = \frac{56}{16} = 3.5$

The advantage of the trimmed mean over the regular mean is that very large and very small numbers that could greatly affect the mean have been eliminated.
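The trimmed mean in part c is easy to express as a small function. A minimal Python sketch using the 20 values listed in part a:

data = [1, 1, 1, 1, 1, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 7, 9, 13]

def trimmed_mean(xs, k):
    """Mean after dropping the k smallest and k largest values."""
    xs = sorted(xs)
    kept = xs[k:len(xs) - k]
    return sum(kept) / len(kept)

print(trimmed_mean(data, 2))  # 3.5, matching the 10% trimmed mean above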


2.56 a. $s^2 = \frac{\sum x^2 - \frac{\left(\sum x\right)^2}{n}}{n - 1} = \frac{84 - \frac{20^2}{10}}{10 - 1} = 4.8889$    $s = \sqrt{4.8889} = 2.211$

b. $s^2 = \frac{\sum x^2 - \frac{\left(\sum x\right)^2}{n}}{n - 1} = \frac{380 - \frac{100^2}{40}}{40 - 1} = 3.3333$    $s = \sqrt{3.3333} = 1.826$

c. $s^2 = \frac{\sum x^2 - \frac{\left(\sum x\right)^2}{n}}{n - 1} = \frac{18 - \frac{17^2}{20}}{20 - 1} = .1868$    $s = \sqrt{.1868} = .432$

2.58 a. Range = 42 - 37 = 5

$s^2 = \frac{\sum x^2 - \frac{\left(\sum x\right)^2}{n}}{n - 1} = \frac{7935 - \frac{199^2}{5}}{5 - 1} = 3.7$    $s = \sqrt{3.7} = 1.92$

b. Range = 100 - 1 = 99

$s^2 = \frac{\sum x^2 - \frac{\left(\sum x\right)^2}{n}}{n - 1} = \frac{25{,}795 - \frac{303^2}{9}}{9 - 1} = 1{,}949.25$    $s = \sqrt{1{,}949.25} = 44.15$

c. Range = 100 - 2 = 98

$s^2 = \frac{\sum x^2 - \frac{\left(\sum x\right)^2}{n}}{n - 1} = \frac{20{,}033 - \frac{295^2}{8}}{8 - 1} = 1{,}307.84$    $s = \sqrt{1{,}307.84} = 36.16$
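The shortcut formula used in Exercises 2.56 and 2.58 translates directly into code. A minimal Python sketch; the five values are hypothetical, chosen only to reproduce the summaries in Exercise 2.58a (n = 5, sum = 199, sum of squares = 7935):

import math

x = [42, 37, 41, 40, 39]  # hypothetical values consistent with 2.58a

n = len(x)
s2 = (sum(v * v for v in x) - sum(x) ** 2 / n) / (n - 1)  # shortcut formula
print(max(x) - min(x))          # range: 5
print(round(s2, 1))             # s^2: 3.7
print(round(math.sqrt(s2), 2))  # s: 1.92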

2.60 This is one possibility for the two data sets.

Data Set 1: 1, 1, 2, 2, 3, 3, 4, 4, 5, 5
Data Set 2: 1, 1, 1, 1, 1, 5, 5, 5, 5, 5

$\bar{x}_1 = \frac{\sum x}{n} = \frac{1 + 1 + 2 + 2 + 3 + 3 + 4 + 4 + 5 + 5}{10} = \frac{30}{10} = 3$

$\bar{x}_2 = \frac{\sum x}{n} = \frac{1 + 1 + 1 + 1 + 1 + 5 + 5 + 5 + 5 + 5}{10} = \frac{30}{10} = 3$

Therefore, the two data sets have the same mean. The variances for the two data sets are:

$s_1^2 = \frac{\sum x^2 - \frac{\left(\sum x\right)^2}{n}}{n - 1} = \frac{110 - \frac{30^2}{10}}{9} = \frac{20}{9} = 2.2222$

$s_2^2 = \frac{\sum x^2 - \frac{\left(\sum x\right)^2}{n}}{n - 1} = \frac{130 - \frac{30^2}{10}}{9} = \frac{40}{9} = 4.4444$


The dot diagrams for the two data sets are shown below. [Dot diagrams omitted from this transcript.]

2.62 a. Range = 3 - 0 = 3

$s^2 = \frac{\sum x^2 - \frac{\left(\sum x\right)^2}{n}}{n - 1} = \frac{15 - \frac{7^2}{5}}{5 - 1} = 1.3$    $s = \sqrt{1.3} = 1.1402$

b. After adding 3 to each of the data points, Range = 6 - 3 = 3

$s^2 = \frac{\sum x^2 - \frac{\left(\sum x\right)^2}{n}}{n - 1} = \frac{102 - \frac{22^2}{5}}{5 - 1} = 1.3$    $s = \sqrt{1.3} = 1.1402$

c. After subtracting 4 from each of the data points, Range = -1 - (-4) = 3

$s^2 = \frac{\sum x^2 - \frac{\left(\sum x\right)^2}{n}}{n - 1} = \frac{39 - \frac{(-13)^2}{5}}{5 - 1} = 1.3$    $s = \sqrt{1.3} = 1.1402$

d. The range, variance, and standard deviation remain the same when any number is added to or subtracted from each measurement in the data set.
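The invariance claimed in part d can be verified numerically. A Python sketch, using five hypothetical values consistent with the summaries in part a (range = 3, sum = 7, sum of squares = 15):

import statistics

base = [0, 1, 1, 2, 3]  # hypothetical values

for shift in (0, 3, -4):  # parts a, b, and c
    d = [v + shift for v in base]
    print(f"shift {shift:+d}: range = {max(d) - min(d)}, "
          f"s^2 = {round(statistics.variance(d), 2)}")
# Every line prints range = 3 and s^2 = 1.3.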

2.64 a. The maximum age is 64. The minimum age is 39. The range is 64 - 39 = 25.

b. The variance is:

$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1} = \frac{125{,}764 - \frac{2494^2}{50}}{50 - 1} = 27.822$

c. The standard deviation is:

$s = \sqrt{s^2} = \sqrt{27.822} = 5.275$

    d. Since the standard deviation of the ages of the 50 most powerful women in Europe is 10 years and is greater than that in the U.S. (5.275 years), the age data for Europe is more variable.


2.66 a. The maximum weight is 1.1 carats. The minimum weight is .18 carats. The range is 1.1 - .18 = .92 carats.

b. The variance is:

$s^2 = \frac{\sum_i x_i^2 - \frac{\left(\sum_i x_i\right)^2}{n}}{n - 1} = \frac{146.19 - \frac{194.32^2}{308}}{308 - 1} = .0768$ square carats

c. The standard deviation is:

$s = \sqrt{s^2} = \sqrt{.0768} = .2772$ carats

d. The standard deviation. This gives us an idea about how spread out the data are in the same units as the original data.

2.68 a. A worker's overall time to complete the operation under study is determined by adding the subtask-time averages.

Worker A:

The average for subtask 1 is: $\bar{x} = \frac{\sum x}{n} = \frac{211}{7} = 30.14$

The average for subtask 2 is: $\bar{x} = \frac{\sum x}{n} = \frac{21}{7} = 3$

Worker A's overall time is 30.14 + 3 = 33.14.

Worker B:

The average for subtask 1 is: $\bar{x} = \frac{\sum x}{n} = \frac{213}{7} = 30.43$

The average for subtask 2 is: $\bar{x} = \frac{\sum x}{n} = \frac{29}{7} = 4.14$

Worker B's overall time is 30.43 + 4.14 = 34.57.

b. Worker A: $s = \sqrt{\frac{\sum x^2 - \frac{\left(\sum x\right)^2}{n}}{n - 1}} = \sqrt{\frac{6455 - \frac{211^2}{7}}{7 - 1}} = \sqrt{15.8095} = 3.98$

Worker B: $s = \sqrt{\frac{\sum x^2 - \frac{\left(\sum x\right)^2}{n}}{n - 1}} = \sqrt{\frac{6487 - \frac{213^2}{7}}{7 - 1}} = \sqrt{.9524} = .98$

c. The standard deviations represent the amount of variability in the time it takes the worker to complete subtask 1.


d. Worker A: $s = \sqrt{\frac{\sum x^2 - \frac{\left(\sum x\right)^2}{n}}{n - 1}} = \sqrt{\frac{67 - \frac{21^2}{7}}{7 - 1}} = \sqrt{.6667} = .82$

Worker B: $s = \sqrt{\frac{\sum x^2 - \frac{\left(\sum x\right)^2}{n}}{n - 1}} = \sqrt{\frac{147 - \frac{29^2}{7}}{7 - 1}} = \sqrt{4.4762} = 2.12$

e. I would choose workers similar to Worker B to perform subtask 1. Worker B has a slightly higher average time on subtask 1 (A: $\bar{x}$ = 30.14, B: $\bar{x}$ = 30.43). But Worker B has a smaller variability in the time it takes to complete subtask 1 (part b). He or she is more consistent in the time needed to complete the task.

I would choose workers similar to Worker A to perform subtask 2. Worker A has a smaller average time on subtask 2 (A: $\bar{x}$ = 3, B: $\bar{x}$ = 4.14). Worker A also has a smaller variability in the time needed to complete subtask 2 (part d).

2.70 Since no information is given about the data set, we can only use Chebyshev's Rule.

a. Nothing can be said about the percentage of measurements which will fall between $\bar{x} - s$ and $\bar{x} + s$.

b. At least 3/4, or 75%, of the measurements will fall between $\bar{x} - 2s$ and $\bar{x} + 2s$.

c. At least 8/9, or 89%, of the measurements will fall between $\bar{x} - 3s$ and $\bar{x} + 3s$.

2.72 a. $\bar{x} = \frac{\sum x}{n} = \frac{206}{25} = 8.24$

$s^2 = \frac{\sum x^2 - \frac{\left(\sum x\right)^2}{n}}{n - 1} = \frac{1778 - \frac{206^2}{25}}{25 - 1} = 3.357$    $s = \sqrt{s^2} = 1.83$

b.

Interval                              Number of Measurements in Interval   Percentage
$\bar{x} \pm s$, or (6.41, 10.07)     18                                   18/25 = .72, or 72%
$\bar{x} \pm 2s$, or (4.58, 11.90)    24                                   24/25 = .96, or 96%
$\bar{x} \pm 3s$, or (2.75, 13.73)    25                                   25/25 = 1, or 100%

c. The percentages in part b are in agreement with Chebyshev's Rule and agree fairly well with the percentages given by the Empirical Rule.
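The tally behind the table in part b is a simple count of observations inside each interval. A Python sketch; the data passed in below are illustrative only, since the 25 sample measurements are not reproduced in this manual:

import statistics

def coverage(data):
    xbar = statistics.mean(data)
    s = statistics.stdev(data)
    for k in (1, 2, 3):
        inside = sum(1 for v in data if xbar - k * s <= v <= xbar + k * s)
        print(f"xbar +/- {k}s: {inside}/{len(data)} = {inside / len(data):.0%}")

coverage([8, 9, 7, 10, 8, 6, 9, 8, 7, 11])  # illustrative values only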


d. Range = 12 - 5 = 7, so $s \approx \text{Range}/4 = 7/4 = 1.75$. The range approximation provides a satisfactory estimate of s = 1.83 from part a.

2.74 From Chebyshev's Theorem, we know that at least 3/4, or 75%, of all observations will fall within 2 standard deviations of the mean. From Exercise 2.47, $\bar{x} = .631$. From Exercise 2.66, $s = .2772$. This interval is:

$\bar{x} \pm 2s \Rightarrow .631 \pm 2(.2772) \Rightarrow .631 \pm .5544 \Rightarrow (.0766, 1.1854)$

2.76 a. From the information given, we have $\bar{x} = 375$ and $s = 25$. From Chebyshev's Rule, we know that at least three-fourths of the measurements are within the interval:

$\bar{x} \pm 2s$, or (325, 425)

Thus, at most one-fourth of the measurements exceed 425. In other words, more than 425 vehicles used the intersection on at most 25% of the days.

b. According to the Empirical Rule, approximately 95% of the measurements are within the interval:

$\bar{x} \pm 2s$, or (325, 425)

This leaves approximately 5% of the measurements to lie outside the interval. Because of the symmetry of a mound-shaped distribution, approximately 2.5% of these will lie below 325, and the remaining 2.5% will lie above 425. Thus, on approximately 2.5% of the days, more than 425 vehicles used the intersection.

2.78 a. Since the sample mean (18.2) is larger than the sample median (15), it indicates that the distribution of years is skewed to the right. In addition, the maximum number of years is 50 and the minimum is 2. If the distribution were symmetric, the mean and median should be about halfway between these two numbers. Halfway between the maximum and minimum values is 26, which is much larger than either the mean or the median.

b. The standard deviation can be estimated by the range divided by either 4 or 6. For this distribution, the range is: Range = largest - smallest = 50 - 2 = 48. Dividing the range by 4, we get an estimate of the standard deviation of 48/4 = 12. Dividing the range by 6, we get an estimate of the standard deviation of 48/6 = 8. Thus, the standard deviation should be somewhere between 8 and 12. For this problem, the standard deviation is s = 10.64. This value falls in the estimated range of 8 to 12.


c. First, we calculate the number of standard deviations from the mean the value of 40 years is. To do this, we first subtract the mean and then divide by the standard deviation.

The number of standard deviations is $z = \frac{x - \bar{x}}{s} = \frac{40 - 18.2}{10.64} = 2.05 \approx 2$

Using Chebyshev's Rule, we know that at most $1/k^2 = 1/2^2 = 1/4$ of the data will be more than 2 standard deviations from the mean. Thus, this would indicate that at most 25% of the Generation Xers responded with 40 years or more.

Next, we calculate the number of standard deviations from the mean the value of 8 years is.

The number of standard deviations is $z = \frac{x - \bar{x}}{s} = \frac{8 - 18.2}{10.64} = -.96 \approx -1$

Using Chebyshev's Rule, we get no information about the data within 1 standard deviation of the mean. However, we know the median (15) is more than 8. By definition, 50% of the data are larger than the median. Thus, at least 50% of the Generation Xers responded with 8 years or more. No additional information can be obtained with the information given.

2.80 a. Using MINITAB, the frequency histogram for the time in bankruptcy is:

[Frequency histogram of Time in Bankruptcy]

The Empirical Rule is not applicable because the data are not mound shaped.


b. Using MINITAB, the descriptive measures are:

Descriptive Statistics: Time in Bankrupt

Variable   N    Mean    Median   TrMean   StDev   SE Mean
Time in    49   2.549   1.700    2.333    1.828   0.261

Variable   Minimum   Maximum   Q1      Q3
Time in    1.000     10.100    1.350   3.500

From Chebyshev's Theorem, we know that at least 75% of the observations will fall within 2 standard deviations of the mean. This interval is:

$\bar{x} \pm 2s \Rightarrow 2.549 \pm 2(1.828) \Rightarrow 2.549 \pm 3.656 \Rightarrow (-1.107, 6.205)$

c. There are 47 of the 49 observations within this interval. The percentage would be (47/49)(100%) = 95.9%. This agrees with Chebyshev's Theorem (at least 75%). It also agrees with the Empirical Rule (approximately 95%).

d. From the above interval we know that about 95% of all firms filing for prepackaged bankruptcy will be in bankruptcy between 0 and 6.2 months. Thus, we would estimate that a firm considering filing for bankruptcy will be in bankruptcy up to 6.2 months.

2.82 a. Since it is given that the distribution is mound-shaped, we can use the Empirical Rule. We know that 1.84% is 2 standard deviations below the mean. The Empirical Rule states that approximately 95% of the observations will fall within 2 standard deviations of the mean and, consequently, approximately 5% will lie outside that interval. Since a mound-shaped distribution is symmetric, approximately 2.5% of the day's production of batches will fall below 1.84%.

b. If the data are actually mound-shaped, it would be extremely unusual (less than 2.5%) to observe a batch with 1.80% zinc phosphide if the true mean is 2.0%. Thus, if we did observe 1.8%, we would conclude that the mean percent of zinc phosphide in today's production is probably less than 2.0%.

2.84 a. Since we do not have any idea of the shape of the distribution of SAT-Math score changes, we must use Chebyshev's Theorem. We know that at least 8/9 of the observations will fall within 3 standard deviations of the mean. This interval would be:

$\bar{x} \pm 3s \Rightarrow 19 \pm 3(65) \Rightarrow 19 \pm 195 \Rightarrow (-176, 214)$

Thus, for a randomly selected student, we could be pretty sure that this student's score would be anywhere from 176 points below his/her previous SAT-Math score to 214 points above his/her previous SAT-Math score.

b. Since we do not have any idea of the shape of the distribution of SAT-Verbal score changes, we must use Chebyshev's Theorem. We know that at least 8/9 of the observations will fall within 3 standard deviations of the mean. This interval would be:

$\bar{x} \pm 3s \Rightarrow 7 \pm 3(49) \Rightarrow 7 \pm 147 \Rightarrow (-140, 154)$


Thus, for a randomly selected student, we could be pretty sure that this student's score would be anywhere from 140 points below his/her previous SAT-Verbal score to 154 points above his/her previous SAT-Verbal score.

c. A change of 140 points on the SAT-Math would be a little less than 2 standard deviations from the mean. A change of 140 points on the SAT-Verbal would be a little less than 3 standard deviations from the mean. Since the 140-point change for the SAT-Math is not as big a change as the 140-point change on the SAT-Verbal, it would be most likely that the score was a SAT-Math score.

2.86 a. $z = \frac{x - \bar{x}}{s} = \frac{40 - 30}{5} = 2$ (sample). The observation lies 2 standard deviations above the mean.

b. $z = \frac{x - \mu}{\sigma} = \frac{90 - 89}{2} = .5$ (population). The observation lies .5 standard deviations above the mean.

c. $z = \frac{x - \mu}{\sigma} = \frac{50 - 50}{5} = 0$ (population). The observation lies 0 standard deviations above the mean.

d. $z = \frac{x - \bar{x}}{s} = \frac{20 - 30}{4} = -2.5$ (sample). The observation lies 2.5 standard deviations below the mean.
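The same z-score arithmetic as a one-line Python helper; only the summary statistics are needed, not the raw data:

def z_score(x, center, spread):
    """Number of standard deviations by which x differs from the mean."""
    return (x - center) / spread

print(z_score(40, 30, 5))  #  2.0  (part a)
print(z_score(90, 89, 2))  #  0.5  (part b)
print(z_score(50, 50, 5))  #  0.0  (part c)
print(z_score(20, 30, 4))  # -2.5  (part d)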

2.88 The 50th percentile of a data set is the observation that has half of the observations less than it. Another name for the 50th percentile is the median.

2.90 Since the element 40 has a z-score of -2 and 90 has a z-score of 3,

$-2 = \frac{40 - \mu}{\sigma} \Rightarrow -2\sigma = 40 - \mu \Rightarrow \mu = 40 + 2\sigma$

$3 = \frac{90 - \mu}{\sigma} \Rightarrow 3\sigma = 90 - \mu \Rightarrow \mu = 90 - 3\sigma$

By substitution, $40 + 2\sigma = 90 - 3\sigma \Rightarrow 5\sigma = 50 \Rightarrow \sigma = 10$

By substitution, $\mu = 40 + 2(10) = 60$

Therefore, the population mean is 60 and the standard deviation is 10.

2.92 The percentile ranking of the age of 25 years would be 100% - 73.5% = 26.5%.


2.94 a. From Exercise 2.77, $\bar{x} = 94.91$ and $s = 4.83$. The z-score for an observation of 78 is:

$z = \frac{x - \bar{x}}{s} = \frac{78 - 94.91}{4.83} = -3.50$

This z-score indicates that an observation of 78 is 3.5 standard deviations below the mean. Very few observations will be lower than this one.

b. The z-score for an observation of 98 is:

$z = \frac{x - \bar{x}}{s} = \frac{98 - 94.91}{4.83} = 0.63$

This z-score indicates that an observation of 98 is .63 standard deviations above the mean. This score is not an unusual observation in the data set.

2.96 a. From the problem, $\mu = 2.7$ and $\sigma = .5$.

$z = \frac{x - \mu}{\sigma} \Rightarrow z\sigma = x - \mu \Rightarrow x = \mu + z\sigma$

For z = 2.0, x = 2.7 + 2.0(.5) = 3.7
For z = -1.0, x = 2.7 - 1.0(.5) = 2.2
For z = .5, x = 2.7 + .5(.5) = 2.95
For z = -2.5, x = 2.7 - 2.5(.5) = 1.45

b. For z = -1.6, x = 2.7 - 1.6(.5) = 1.9

c. If we assume the distribution of GPAs is approximately mound-shaped, we can use the Empirical Rule.

From the Empirical Rule, we know that .025, or 2.5%, of the students will have GPAs above 3.7 (with z = 2). Thus, the GPA corresponding to summa cum laude (top 2.5%) will be greater than 3.7 (z > 2).

We know that .16, or 16%, of the students will have GPAs above 3.2 (z = 1). Thus, the limit on GPAs for cum laude (top 16%) will be greater than 3.2 (z > 1).

We must assume the distribution is mound-shaped.
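The reverse transformation used above, x = mu + z * sigma, as a short Python sketch with the values from this exercise:

mu, sigma = 2.7, 0.5  # given in the problem

for z in (2.0, -1.0, 0.5, -2.5, -1.6):
    print(f"z = {z:+.1f}  ->  GPA = {mu + z * sigma:.2f}")
# 3.70, 2.20, 2.95, 1.45, and 1.90, matching parts a and b.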


2.98 a. Since the data are approximately mound-shaped, we can use the Empirical Rule. On the blue exam, the mean is 53% and the standard deviation is 15%. We know that approximately 68% of all students will score within 1 standard deviation of the mean. This interval is:

$\bar{x} \pm s \Rightarrow 53 \pm 15 \Rightarrow (38, 68)$

About 95% of all students will score within 2 standard deviations of the mean. This interval is:

$\bar{x} \pm 2s \Rightarrow 53 \pm 2(15) \Rightarrow 53 \pm 30 \Rightarrow (23, 83)$

About 99.7% of all students will score within 3 standard deviations of the mean. This interval is:

$\bar{x} \pm 3s \Rightarrow 53 \pm 3(15) \Rightarrow 53 \pm 45 \Rightarrow (8, 98)$

b. Since the data are approximately mound-shaped, we can use the Empirical Rule. On the red exam, the mean is 39% and the standard deviation is 12%. We know that approximately 68% of all students will score within 1 standard deviation of the mean. This interval is:

$\bar{x} \pm s \Rightarrow 39 \pm 12 \Rightarrow (27, 51)$

About 95% of all students will score within 2 standard deviations of the mean. This interval is:

$\bar{x} \pm 2s \Rightarrow 39 \pm 2(12) \Rightarrow 39 \pm 24 \Rightarrow (15, 63)$

About 99.7% of all students will score within 3 standard deviations of the mean. This interval is:

$\bar{x} \pm 3s \Rightarrow 39 \pm 3(12) \Rightarrow 39 \pm 36 \Rightarrow (3, 75)$

c. The student would have been more likely to have taken the red exam. For the blue exam, we know that approximately 95% of all scores will be from 23% to 83%. The observed 20% score does not fall in this range. For the red exam, we know that approximately 95% of all scores will be from 15% to 63%. The observed 20% score does fall in this range. Thus, it is more likely that the student would have taken the red exam.

2.100 The 25th percentile, or lower quartile, is the measurement that has 25% of the measurements below it and 75% of the measurements above it. The 50th percentile, or median, is the measurement that has 50% of the measurements below it and 50% of the measurements above it. The 75th percentile, or upper quartile, is the measurement that has 75% of the measurements below it and 25% of the measurements above it.


2.102 a. The median is approximately 4.

b. $Q_L$ (lower quartile) is approximately 3. $Q_U$ (upper quartile) is approximately 6.

c. $IQR = Q_U - Q_L = 6 - 3 = 3$

d. The data set is skewed to the right since the right whisker is longer than the left, there is one outlier, and there are two potential outliers.

e. 50% of the measurements are to the right of the median and 75% are to the left of the upper quartile.

f. There are two potential outliers, 12 and 13. There is one outlier, 16.

2.104 a. From the problem, $\bar{x} = 52.33$ and $s = 9.22$. The highest salary is 75 (thousand).

The z-score is $z = \frac{x - \bar{x}}{s} = \frac{75 - 52.33}{9.22} = 2.46$

Therefore, the highest salary is 2.46 standard deviations above the mean.

The lowest salary is 35.0 (thousand).

The z-score is $z = \frac{x - \bar{x}}{s} = \frac{35.0 - 52.33}{9.22} = -1.88$

Therefore, the lowest salary is 1.88 standard deviations below the mean.

The mean salary offer is 52.33 (thousand).

The z-score is $z = \frac{x - \bar{x}}{s} = \frac{52.33 - 52.33}{9.22} = 0$

The z-score for the mean salary offer is 0 standard deviations from the mean.

No, the highest salary offer is not unusually high. For any distribution, at least 8/9 of the salaries should have z-scores between -3 and 3. A z-score of 2.46 would not be that unusual.


b. Using MINITAB, the box plot is:

[Box plot of salary offers omitted from this transcript]

Since no salaries are outside the inner fences, none of them are potentially faulty observations.

2.106 Using MINITAB, the side-by-side box plots are:

[Side-by-side box plots of AGE by GROUP (groups 1 through 3)]

From the box plots, there appears to be one outlier in the third group.

2.108 a. First, we will compute the mean and standard deviation.

The sample mean is:

$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{393}{75} = 5.24$

The sample variance is:

$s^2 = \frac{\sum_i x_i^2 - \frac{\left(\sum_i x_i\right)^2}{n}}{n - 1} = \frac{5943 - \frac{393^2}{75}}{75 - 1} = 52.482$


The standard deviation is: $s = \sqrt{s^2} = \sqrt{52.482} = 7.244$

Since this data set is highly skewed, we will use 2 standard deviations from the mean as the cutoff for outliers. Z-scores with values greater than 2 in absolute value are considered outliers. An observation with a z-score of 2 would have the value:

$z = \frac{x - \bar{x}}{s} \Rightarrow 2 = \frac{x - 5.24}{7.244} \Rightarrow x = 5.24 + 2(7.244) = 19.728$

An observation with a z-score of -2 would have the value:

$-2 = \frac{x - 5.24}{7.244} \Rightarrow x = 5.24 - 2(7.244) = -9.248$

Thus any observation that is greater than 19.728 or less than -9.248 would be considered an outlier. In this data set there would be 4 outliers: 21, 21, 25, and 48.

b. Deleting these 4 outliers, we will recalculate the mean, median, variance, and standard deviation. The median for the original data set is the middle number once the data have been arranged in order; it is the 38th observation, which is 3.

The new mean is:

$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{278}{71} = 3.92$

The new sample variance is:

$s^2 = \frac{\sum_i x_i^2 - \frac{\left(\sum_i x_i\right)^2}{n}}{n - 1} = \frac{2132 - \frac{278^2}{71}}{71 - 1} = 14.907$

The new standard deviation is:

$s = \sqrt{s^2} = \sqrt{14.907} = 3.861$

The new median is the 36th observation once the data have been arranged in order; it is 3.

In the original data set, the mean is 5.24, the standard deviation is 7.244, and the median is 3. In the revised data set, the mean is 3.92, the standard deviation is 3.861, and the median is 3. The mean has decreased, the standard deviation has been almost halved, but the median stays the same.


2.110 For perturbed intrinsics but no perturbed projections:

$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{1.0 + 1.3 + 3.0 + 1.5 + 1.3}{5} = \frac{8.1}{5} = 1.62$

$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1} = \frac{15.63 - \frac{8.1^2}{5}}{4} = \frac{2.508}{4} = .627$

$s = \sqrt{s^2} = \sqrt{.627} = .792$

The z-score corresponding to a value of 4.5 is:

$z = \frac{x - \bar{x}}{s} = \frac{4.5 - 1.62}{.792} = 3.64$

Since this z-score is greater than 3, we would consider this an outlier for perturbed intrinsics but no perturbed projections.

For perturbed projections but no perturbed intrinsics:

$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{22.9 + 21.0 + 34.4 + 29.8 + 17.7}{5} = \frac{125.8}{5} = 25.16$

$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1} = \frac{3350.1 - \frac{125.8^2}{5}}{4} = \frac{184.972}{4} = 46.243$

$s = \sqrt{s^2} = \sqrt{46.243} = 6.800$

The z-score corresponding to a value of 4.5 is:

$z = \frac{x - \bar{x}}{s} = \frac{4.5 - 25.16}{6.800} = -3.038$

Since this z-score is less than -3, we would consider this an outlier for perturbed projections but no perturbed intrinsics.

Since the z-score corresponding to 4.5 for perturbed projections but no perturbed intrinsics is smaller in absolute value than that for perturbed intrinsics but no perturbed projections, it is more likely that the type of camera perturbation is perturbed projections but no perturbed intrinsics.


2.112 Using MINITAB, a scatterplot of the data is:

[Scatterplot of Var2 vs Var1]

2.114 Using MINITAB, the scatterplot of the data is:

[Scatterplot of Lawyers vs Offices]

As the number of offices increases, the number of lawyers also tends to increase.

2.116 a. Using MINITAB, the scatterplot is:

[Scatterplot of 30th vs 10th]

    It appears that as the completion time for the 10th trial increases, the completion time for the 30th trial decreases.


b. Using MINITAB, the scatterplot is:

[Scatterplot of 50th vs 10th]

    It appears that as the completion time for the 10th trial increases, the completion time for the 50th trial increases.

c. Using MINITAB, the scatterplot is:

[Scatterplot of 50th vs 30th]

    It appears that as the completion time for the 30th trial increases, the completion time for the 50th trial increases.


2.118 Using MINITAB, the scatterplot of the data is:

[Scatterplot of Mass vs Time]

There is evidence to indicate that the mass of the spill tends to diminish as time increases; as time gets larger, the mass decreases.
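All plots in this manual were produced with MINITAB. For readers without MINITAB, an equivalent scatterplot can be sketched in Python with matplotlib; the (time, mass) values below are hypothetical placeholders, since the exercise's data file is not reproduced here:

    import matplotlib.pyplot as plt

    # Hypothetical stand-in values; substitute the actual spill data
    time = [0, 5, 10, 20, 30, 40, 50, 60]
    mass = [6.7, 6.2, 5.8, 4.9, 4.0, 3.1, 2.2, 1.5]

    plt.scatter(time, mass)
    plt.xlabel("Time")
    plt.ylabel("Mass")
    plt.title("Scatterplot of Mass vs Time")
    plt.show()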

    2.120 The mean is sensitive to extreme values in a data set. Therefore, the median is preferred to the

mean when a data set is skewed in one direction or the other.

2.122 a. If we assume that the data are approximately mound-shaped, then any observation with a z-score greater than 3 in absolute value would be considered an outlier. From Exercise 2.121, the z-score corresponding to 50 is -1, the z-score corresponding to 70 is 1, and the z-score corresponding to 80 is 2. Since none of these z-scores is greater than 3 in absolute value, none would be considered outliers.

b. From Exercise 2.121, the z-score corresponding to 50 is -2, the z-score corresponding to 70 is 2, and the z-score corresponding to 80 is 4. Since the z-score corresponding to 80 is greater than 3, 80 would be considered an outlier.

c. From Exercise 2.121, the z-score corresponding to 50 is 1, the z-score corresponding to 70 is 3, and the z-score corresponding to 80 is 4. Since the z-scores corresponding to 70 and 80 are greater than or equal to 3, 70 and 80 would be considered outliers.

d. From Exercise 2.121, the z-score corresponding to 50 is .1, the z-score corresponding to 70 is .3, and the z-score corresponding to 80 is .4. Since none of these z-scores is greater than 3 in absolute value, none would be considered outliers.


2.124 a. $\sum x = 4 + 6 + 6 + 5 + 6 + 7 = 34$

$\sum x^2 = 4^2 + 6^2 + 6^2 + 5^2 + 6^2 + 7^2 = 198$

$\bar{x} = \frac{\sum x}{n} = \frac{34}{6} = 5.67$

$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{198 - \frac{34^2}{6}}{6 - 1} = \frac{5.3333}{5} = 1.0667$

$s = \sqrt{1.0667} = 1.03$

b. $\sum x = -1 + 4 + (-3) + 0 + (-3) + (-6) = -9$

$\sum x^2 = (-1)^2 + 4^2 + (-3)^2 + 0^2 + (-3)^2 + (-6)^2 = 71$

$\bar{x} = \frac{\sum x}{n} = \frac{-9}{6} = -\$1.5$

$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{71 - \frac{(-9)^2}{6}}{6 - 1} = \frac{57.5}{5} = 11.5$ dollars squared

$s = \sqrt{11.5} = \$3.39$

c. $\sum x = \frac{3}{5} + \frac{4}{5} + \frac{2}{5} + \frac{1}{5} + \frac{1}{16} = 2.0625$

$\sum x^2 = \left(\frac{3}{5}\right)^2 + \left(\frac{4}{5}\right)^2 + \left(\frac{2}{5}\right)^2 + \left(\frac{1}{5}\right)^2 + \left(\frac{1}{16}\right)^2 = 1.2039$

$\bar{x} = \frac{\sum x}{n} = \frac{2.0625}{5} = .4125\%$

$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{1.2039 - \frac{2.0625^2}{5}}{5 - 1} = \frac{.3531}{4} = .0883$ % squared

$s = \sqrt{.0883} = .30\%$

d. (a) Range = 7 - 4 = 3

(b) Range = $4 - (-$6) = $10

(c) Range = $\frac{4}{5}\% - \frac{1}{16}\% = \frac{64}{80}\% - \frac{5}{80}\% = \frac{59}{80}\% = .7375\%$

2.126 A conservative estimate of the standard deviation is $s \approx \text{range}/4 = 20/4 = 5$.
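As a quick arithmetic check of Exercise 2.124(b) and the range approximation of 2.126, the same quantities can be computed in Python (a verification sketch, using the dollar returns listed in part b):

    import math

    x = [-1, 4, -3, 0, -3, -6]   # part b returns, in dollars
    n = len(x)

    mean = sum(x) / n                                            # -1.5
    var = (sum(v ** 2 for v in x) - sum(x) ** 2 / n) / (n - 1)   # 11.5
    sd = math.sqrt(var)                                          # 3.39
    rng = max(x) - min(x)                                        # 10

    print(mean, var, round(sd, 2), rng, rng / 4)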


2.128 Using MINITAB, a pie chart of the data is:

[Pie chart of defect: false 90.2%, true 9.8%]

    A response of true means the software contained defective code. Thus, only 9.8% of the

modules contained defective software code.

2.130 The z-score would be:

$z = \frac{x - \bar{x}}{s} = \frac{408 - 603.7}{185.4} = -1.06$

Since the absolute value of this z-score is small (about one standard deviation below the mean), 408 is not an unusual value to observe.

2.132 a. The variable of interest is opinion of book reviews. The values could be would not recommend, cautious or very little recommendation, little or no preference, favorable/recommended, and outstanding/significant contribution. Since these responses are not numerical, the variable is qualitative.

    b. Most of the books (63%) received a "favorable/recommended" review. About the same

    percentage of books received the following reviews: "cautious or very little recommendation" (10%), "little or no preference" (9%), and "outstanding/significant contribution" (12%). Only 5% of the books received "would not recommend" reviews.

    c. If the top two categories are added together, the percent recommended is 75% (actually

    slightly higher than 75%). This agrees with the study. 2.134 a. To display the status, we use a pie chart. From the pie chart,

    we see that 58% of the Beanie babies are retired and 42% are current.


b. Using MINITAB, a histogram of the values is:

    Most (40 of 50) Beanie babies have values less than $100. Of the remaining 10, 5 have

    values between $100 and $300, 1 has a value between $300 and $500, 1 has a value between $500 and $700, 2 have values between $700 and $900, and 1 has a value between $1900 and $2100.

    c. A plot of the value versus the age of the Beanie Baby is as follows:

From the plot, it appears that as the age increases, the value tends to increase.

2.136 a. Using MINITAB, the stem-and-leaf display is:

Stem-and-leaf of C1   Leaf Unit = 0.10   N = 46

      4    0  3444
    (25)   0  55555555566666667777788889
     16    1  000011222334
      4    1  77
      2    2  2
      2    3  9
      1    4  1
      1    4  7


b. The leaves that represent those brands that carry the American Dental Association seal are circled above.

c. It appears that the brands approved by the ADA tend to have lower costs. Thirteen of the twenty brands approved by the ADA, or (13/20) × 100% = 65%, have costs below the median cost.

    2.138 a. Using MINITAB, the summary statistics are:

    Descriptive Statistics: Marketing, Engineering, Accounting, Total

Variable    N     Mean   Median   TrMean   StDev  SE Mean
Marketin   50    4.766    5.400    4.732   2.584    0.365
Engineer   50    5.044    4.500    4.798   3.835    0.542
Accounti   50    3.652    0.800    2.548   6.256    0.885
Total      50   13.462   13.750   13.043   6.820    0.965

Variable   Minimum  Maximum       Q1       Q3
Marketin     0.100   11.000    2.825    6.250
Engineer     0.400   14.400    1.775    7.225
Accounti     0.100   30.000    0.200    3.725
Total        1.800   36.200    8.075   16.600

b. The z-scores corresponding to the maximum time guidelines developed for each department and the total are as follows:

Marketing: $z = \frac{x - \bar{x}}{s} = \frac{6.5 - 4.77}{2.58} = .67$

Engineering: $z = \frac{x - \bar{x}}{s} = \frac{7.0 - 5.04}{3.84} = .51$

Accounting: $z = \frac{x - \bar{x}}{s} = \frac{8.5 - 3.65}{6.26} = .77$

Total: $z = \frac{x - \bar{x}}{s} = \frac{17 - 13.46}{6.82} = .52$
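These four z-scores can be reproduced from the summary statistics above with a short Python sketch (a checking aid, not part of the original solution):

    # (guideline, mean, standard deviation) from the MINITAB output
    stats = {
        "Marketing":   (6.5, 4.77, 2.58),
        "Engineering": (7.0, 5.04, 3.84),
        "Accounting":  (8.5, 3.65, 6.26),
        "Total":       (17.0, 13.46, 6.82),
    }
    for dept, (guideline, mean, sd) in stats.items():
        print(dept, round((guideline - mean) / sd, 2))
    # Marketing 0.67, Engineering 0.51, Accounting 0.77, Total 0.52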

c. To find the maximum processing time corresponding to a z-score of 3, we substitute the values of z, $\bar{x}$, and s into the z formula and solve for x.

$z = \frac{x - \bar{x}}{s} \Rightarrow x - \bar{x} = zs \Rightarrow x = \bar{x} + zs$

Marketing: x = 4.77 + 3(2.58) = 4.77 + 7.74 = 12.51. None of the orders exceed this time.

Engineering: x = 5.04 + 3(3.84) = 5.04 + 11.52 = 16.56. None of the orders exceed this time.

These both agree with both the Empirical Rule and Chebyshev's Rule.


Accounting: x = 3.65 + 3(6.26) = 3.65 + 18.78 = 22.43. One of the orders exceeds this time, or 1/50 = .02.

Total: x = 13.46 + 3(6.82) = 13.46 + 20.46 = 33.92. One of the orders exceeds this time, or 1/50 = .02.

These both agree with Chebyshev's Rule but not the Empirical Rule. Both of these last two distributions are skewed to the right.

d. Marketing: x = 4.77 + 2(2.58) = 4.77 + 5.16 = 9.93. Two of the orders exceed this time, or 2/50 = .04.

Engineering: x = 5.04 + 2(3.84) = 5.04 + 7.68 = 12.72. Two of the orders exceed this time, or 2/50 = .04.

Accounting: x = 3.65 + 2(6.26) = 3.65 + 12.52 = 16.17. Three of the orders exceed this time, or 3/50 = .06.

Total: x = 13.46 + 2(6.82) = 13.46 + 13.64 = 27.10. Two of the orders exceed this time, or 2/50 = .04.

All of these agree with Chebyshev's Rule but not the Empirical Rule.

e. No observations exceed the guideline of 3 standard deviations for both Marketing and

Engineering. One observation exceeds the guideline of 3 standard deviations for both Accounting (#23, time = 30.0 days) and Total (#23, time = 36.2 days). Therefore, only (1/10) × 100% = 10% of the "lost" quotes have times exceeding at least one of the 3 standard deviation guidelines.

    Two observations exceed the guideline of 2 standard deviations for both Marketing (#31,

time = 11.0 days and #48, time = 10.0 days) and Engineering (#4, time = 13.0 days and #49, time = 14.4 days). Three observations exceed the guideline of 2 standard deviations for Accounting (#20, time = 22.0 days; #23, time = 30.0 days; and #36, time = 18.2 days). Two observations exceed the guideline of 2 standard deviations for Total (#20, time = 30.2 days and #23, time = 36.2 days). Therefore, (7/10) × 100% = 70% of the "lost" quotes have times exceeding at least one of the 2 standard deviation guidelines.

    We would recommend the 2 standard deviation guideline since it covers 70% of the lost

    quotes, while having very few other quotes exceed the guidelines. 2.140 a. First, construct a relative frequency distribution for the departments.

Class  Department       Frequency  Relative Frequency
1      Production           13           .241
2      Maintenance          31           .574
3      Sales                 3           .056
4      R & D                 2           .037
5      Administration        5           .093
       TOTAL                54          1.001


The Pareto diagram is:

[Pareto diagram of accidents by department]

From the diagram, it is evident that the departments with the worst safety record are Maintenance and Production.

b. First, construct a relative frequency distribution for the type of injury in the maintenance department.

Class  Injury        Frequency  Relative Frequency
1      Burn               6           .194
2      Back strain        5           .161
3      Eye damage         2           .065
4      Cuts              10           .323
5      Broken arm         2           .065
6      Broken leg         1           .032
7      Concussion         3           .097
8      Hearing loss       2           .065
       TOTAL             31          1.002

The Pareto diagram is:

[Pareto diagram of maintenance injuries by type]

From the Pareto diagram, it is evident that cuts is the most prevalent type of injury. Burns and back strain are the next most prevalent types of injuries.
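The ordering behind both Pareto diagrams can be reproduced with a short Python sketch (not part of the original solution; it uses the injury counts tabulated above):

    injuries = {"Burn": 6, "Back strain": 5, "Eye damage": 2, "Cuts": 10,
                "Broken arm": 2, "Broken leg": 1, "Concussion": 3,
                "Hearing loss": 2}
    total = sum(injuries.values())   # 31

    # Sort most frequent first, as in a Pareto diagram
    for injury, count in sorted(injuries.items(), key=lambda kv: kv[1],
                                reverse=True):
        print(f"{injury:12s} {count:2d}  {count / total:.3f}")
    # Cuts (.323) leads, followed by burns (.194) and back strain (.161)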

    2.142 a. Using MINITAB, the descriptive statistics are:

    Descriptive Statistics: MPG

Variable    N     Mean   Median   TrMean   StDev  SE Mean
MPG        36   40.056   40.000   40.063   2.177    0.363

Variable   Minimum  Maximum       Q1       Q3
MPG         35.000   45.000   39.000   41.000


The mean is 40.056 and the standard deviation is 2.177. Both of these measures are expressed in the same units as the original data, which are miles per gallon.

b. Since the sample mean is a good estimate of the population mean, the manufacturer should be satisfied. The sample mean is 40.056, which is greater than 40.

c. The range of the data set is 45 - 35 = 10. Using Chebyshev's Rule, the range should cover approximately 6 standard deviations. Thus, a good estimate of the standard deviation would be 10/6 = 1.67. Using the Empirical Rule, the range should cover approximately 4 standard deviations. Thus, a good estimate of the standard deviation would be 10/4 = 2.5. The given standard deviation is 2.177, which is between these two estimates. Thus, it is a reasonable value.

d. Using MINITAB, the frequency histogram is (the relative frequency histogram would have the same shape):

[Frequency histogram of MPG, 35 to 45]

Yes, the data appear to be mound-shaped.

e. Because the data are mound-shaped, we can use the Empirical Rule. We would expect approximately 68% of the data within the interval $\bar{x} \pm s$, approximately 95% of the data within the interval $\bar{x} \pm 2s$, and approximately all of the data within the interval $\bar{x} \pm 3s$.

f. The interval $\bar{x} \pm s$ is 40.056 ± 2.177, or (37.879, 42.233). Twenty-seven of the observations fall in this interval, or 27/36 = .75 or 75%. This number is a little larger than 68%.

The interval $\bar{x} \pm 2s$ is 40.056 ± 2(2.177), or (35.702, 44.410). Thirty-four of the observations fall in this interval, or 34/36 = .94 or 94%. This number is very close to 95%.

The interval $\bar{x} \pm 3s$ is 40.056 ± 3(2.177), or (33.525, 46.587). Thirty-six of the observations fall in this interval, or 36/36 = 1.00 or 100%. This number is the same as all of the observations.
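The three intervals can be generated programmatically as well (a sketch for checking; mpg stands for the data column, which is not reproduced in this manual):

    mean, sd = 40.056, 2.177   # from the MINITAB output above

    for k in (1, 2, 3):
        lo, hi = mean - k * sd, mean + k * sd
        print(f"interval for k = {k}: ({lo:.3f}, {hi:.3f})")
    # (37.879, 42.233), (35.702, 44.410), (33.525, 46.587)

    # With the raw data in a list named mpg, the coverage proportion of
    # part f would be: sum(lo <= x <= hi for x in mpg) / len(mpg)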


2.144 a. Both the height and width of the bars (peanuts) change. Thus, some readers may tend to equate the area of the peanuts with the frequency for each year.

b. The frequency bar chart is:

[Frequency bar chart of peanut data by year]

The Kentucky Milk Case (To accompany Chapters 1-2)

There are many things that could be included in a report about the possibility of collusion. I have concentrated on the incumbency rates, bid levels and dispersion, and average winning bids. With the data available, no comparison of market share can be made since there was so much missing data. In fact, the exact analysis cannot be made with the data available, since only the winning bid information is provided; we have no idea what the losing bids were. I will present what I think is a reasonable solution. This is by no means the only solution to the case. Many other presentations could also be used.

Incumbency Rates

The incumbency rate is the percent of the school districts that are won by the same vendor who won the previous year. A table containing the incumbency rates is included as well as a plot. Notice in the plot that the incumbency rates in the Tri-county market are higher than those in the Surrounding market. From 1985 through 1988, the incumbency rate for the Tri-county market was never lower than .923, while in the same period in the Surrounding market, the incumbency rate was never higher than .730. This implies the possibility of collusion in the Tri-county market.

                  Surrounding Market                 Tri-county Market
Year    Number of   Same      Incumbency    Number of   Same      Incumbency
        Districts   Vendors   Rate          Districts   Vendors   Rate
1984        26        16        .615            10         8        .800
1985        27        19        .704            12        12       1.000
1986        32        19        .594            13        13       1.000
1987        37        27        .730            13        12        .923
1988        37        25        .676            13        13       1.000
1989        37        23        .622            13         9        .692
1990        34        24        .706            13        10        .769
1991         5         3        .600            13        11        .846


The plot of the incumbency rates is:

[Plot of incumbency rates by year, Surrounding versus Tri-county market]

Bid Levels and Dispersion

Since we only have access to the winning bids in each of the school districts, we cannot make a true analysis of the bid levels and dispersions. As a compromise, I have used the winning bids of the two dairies in question, Trauth and Meyer. I have looked at only the winning bids of these two dairies in both the Tri-county market and in the Surrounding market. If there was no collusion, then the winning bids and the dispersions of the winning bids should be similar in the two markets for the two dairies. I looked at the box plots of the winning bids of the two dairies in each market for each type of milk: whole white, lowfat white, and lowfat chocolate. I have included only a few of the box plots as illustrations. Those included are for 1985 and 1986.

1985 Winning Bids:

OBS  MARKET  WINNER   WHOLE WHITE  LOWFAT WHITE  LOWFAT CHOCOLATE
  1  SUR     MEYER       0.1280       0.1250         0.1315
  2  SUR     TRAUTH      0.1200       0.1110         0.1090
  3  SUR     TRAUTH           .       0.1079         0.1079
  4  SUR     TRAUTH           .       0.1190         0.1210
  5  SUR     MEYER       0.1225       0.1130         0.1099
  6  SUR     TRAUTH      0.1230       0.1130         0.1120
  7  SUR     MEYER       0.1250       0.1145         0.1140
  8  TRI     TRAUTH      0.1440       0.1440              .
  9  TRI     TRAUTH      0.1450       0.1350              .
 10  TRI     MEYER       0.1410       0.1410         0.1410
 11  TRI     TRAUTH      0.1393       0.1393              .
 12  TRI     MEYER       0.1340       0.1340         0.1340
 13  TRI     MEYER       0.1445       0.1345         0.1395
 14  TRI     MEYER            .       0.1345              .
 15  TRI     TRAUTH      0.1449       0.1349         0.1399
 16  TRI     TRAUTH           .       0.1299         0.1299
 17  TRI     MEYER       0.1480       0.1480         0.1480
 18  TRI     TRAUTH      0.1310       0.1290              .
 19  TRI     MEYER            .       0.1380              .
 20  TRI     TRAUTH      0.1435       0.1335              .

[Boxplots for Whole White Milk - 1985, by market (Surrounding vs. Tri-county)]

[Boxplots for Lowfat White Milk - 1985, by market]

[Boxplots for Lowfat Chocolate Milk - 1985, by market]

For each type of milk, the mean and median winning bids for the Tri-county market were higher than the corresponding winning bids in the Surrounding market. Also, the dispersion, indicated by the width of the boxes and the length of the whiskers, for the Surrounding market is larger than for the Tri-county market in most cases. This is indicative of collusion in the Tri-county market. This same pattern also existed in 1986.
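The location difference can also be checked numerically from the 1985 table above (a supplementary sketch in Python; the whole white bids are grouped by market exactly as listed):

    import statistics

    bids = {
        "SUR": [0.1280, 0.1200, 0.1225, 0.1230, 0.1250],
        "TRI": [0.1440, 0.1450, 0.1410, 0.1393, 0.1340,
                0.1445, 0.1449, 0.1480, 0.1310, 0.1435],
    }

    for market, b in bids.items():
        print(market, round(statistics.mean(b), 4),
              round(statistics.stdev(b), 4))
    # SUR mean is about .1237 while TRI mean is about .1415; the Tri-county
    # whole white bids run noticeably higher, consistent with the box plots.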

1986 Winning Bids:

OBS  MARKET  WINNER   WHOLE WHITE  LOWFAT WHITE  LOWFAT CHOCOLATE
  1  SUR     TRAUTH      0.1195       0.1100         0.1085
  2  SUR     TRAUTH      0.1330       0.1240         0.1290
  3  SUR     TRAUTH      0.1140       0.1070         0.1050
  4  SUR     MEYER       0.1350       0.1250         0.1315
  5  SUR     TRAUTH      0.1224       0.1124         0.1110
  6  SUR     TRAUTH           .       0.1110         0.1110
  7  SUR     TRAUTH           .       0.1180         0.1200
  8  SUR     TRAUTH      0.1250       0.1125         0.1115
  9  TRI     TRAUTH      0.1475       0.1475              .
 10  TRI     TRAUTH      0.1469       0.1369              .
 11  TRI     MEYER       0.1440       0.1340         0.1395
 12  TRI     TRAUTH      0.1420       0.1420              .
 13  TRI     MEYER       0.1390       0.1390         0.1390
 14  TRI     MEYER       0.1470       0.1370         0.1420
 15  TRI     MEYER            .       0.1380              .
 16  TRI     TRAUTH      0.1474       0.1374         0.1424
 17  TRI     TRAUTH           .       0.1349         0.1349
 18  TRI     MEYER       0.1505       0.1505         0.1505
 19  TRI     TRAUTH      0.1360       0.1320              .
 20  TRI     MEYER            .       0.1430              .
 21  TRI     TRAUTH      0.1460       0.1360              .

[Boxplots for Whole White Milk - 1986, by market]

[Boxplots for Lowfat White Milk - 1986, by market]

[Boxplots for Lowfat Chocolate Milk - 1986, by market]

The same pattern that existed for 1985 and 1986 also existed in 1984, 1987, and 1988. From 1989 on, the pattern no longer existed. Thus, from the plots, it appears that the two dairies were working together from 1984 through 1988 in the Tri-county market.

I also plotted the mean winning bids for the two dairies in each of the two markets from 1984 through 1991 for each type of milk. In all three plots, the mean winning bid in 1983 was almost the same in the two markets. Then, in 1984, the mean winning bid in the Tri-county market was higher than in the Surrounding market for all three types of milk. This trend holds basically through 1988 (the lowfat white milk mean winning bid for the Surrounding market was greater than the mean winning bid in the Tri-county market in 1988). After 1988, the mean winning bids in the two markets are almost the same. This points to collusion in the Tri-county market from 1984 through 1988.

The dispersion, measured using the standard deviation, of the winning bids for each of the three types of milk was generally smaller in the Tri-county market than in the Surrounding market for the years 1985 through 1988. After 1988, this pattern no longer existed. This, too, points to collusion between the two dairies in the Tri-county market during the years 1984 through 1988.

Probability Chapter 3

3.2 a. This is a Venn Diagram.

b. If the sample points are equally likely, then

$P(1) = P(2) = P(3) = \cdots = P(10) = \frac{1}{10}$

Therefore,

$P(A) = P(4) + P(5) + P(6) = \frac{1}{10} + \frac{1}{10} + \frac{1}{10} = \frac{3}{10} = .3$

$P(B) = P(6) + P(7) = \frac{1}{10} + \frac{1}{10} = \frac{2}{10} = .2$

c. $P(A) = P(4) + P(5) + P(6) = \frac{1}{20} + \frac{1}{20} + \frac{3}{20} = \frac{5}{20} = .25$

$P(B) = P(6) + P(7) = \frac{3}{20} + \frac{3}{20} = \frac{6}{20} = .3$

3.4 a. $\binom{9}{4} = \frac{9!}{4!(9-4)!} = \frac{9 \cdot 8 \cdot 7 \cdot 6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{(4 \cdot 3 \cdot 2 \cdot 1)(5 \cdot 4 \cdot 3 \cdot 2 \cdot 1)} = 126$

b. $\binom{7}{2} = \frac{7!}{2!(7-2)!} = \frac{7 \cdot 6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{(2 \cdot 1)(5 \cdot 4 \cdot 3 \cdot 2 \cdot 1)} = 21$

c. $\binom{4}{4} = \frac{4!}{4!(4-4)!} = \frac{4 \cdot 3 \cdot 2 \cdot 1}{(4 \cdot 3 \cdot 2 \cdot 1)(1)} = 1$

d. $\binom{5}{0} = \frac{5!}{0!(5-0)!} = \frac{5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{(1)(5 \cdot 4 \cdot 3 \cdot 2 \cdot 1)} = 1$

e. $\binom{6}{5} = \frac{6!}{5!(6-5)!} = \frac{6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{(5 \cdot 4 \cdot 3 \cdot 2 \cdot 1)(1)} = 6$
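Each of these combinations can be verified directly in Python (a quick check; math.comb requires Python 3.8 or later):

    from math import comb

    print(comb(9, 4))  # 126
    print(comb(7, 2))  # 21
    print(comb(4, 4))  # 1
    print(comb(5, 0))  # 1
    print(comb(6, 5))  # 6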


3.6 a. The 36 sample points are:

1,1  1,2  1,3  1,4  1,5  1,6
2,1  2,2  2,3  2,4  2,5  2,6
3,1  3,2  3,3  3,4  3,5  3,6
4,1  4,2  4,3  4,4  4,5  4,6
5,1  5,2  5,3  5,4  5,5  5,6
6,1  6,2  6,3  6,4  6,5  6,6

    b. If the dice are fair, then each of the sample points is equally likely. Each would have a

    probability of 1/36 of occurring.

c. There is one sample point in A: 3,3. Thus, $P(A) = \frac{1}{36}$.

There are 6 sample points in B: 1,6  2,5  3,4  4,3  5,2 and 6,1. Thus, $P(B) = \frac{6}{36} = \frac{1}{6}$.

There are 18 sample points in C: 1,1  1,3  1,5  2,2  2,4  2,6  3,1  3,3  3,5  4,2  4,4  4,6  5,1  5,3  5,5  6,2  6,4 and 6,6. Thus, $P(C) = \frac{18}{36} = \frac{1}{2}$.
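These probabilities can be confirmed by enumerating the sample space (a supplementary sketch; the event definitions for B and C are inferred from the sample points listed above):

    from fractions import Fraction

    # All 36 equally likely outcomes of tossing two fair dice
    outcomes = [(i, j) for i in range(1, 7) for j in range(1, 7)]
    n = len(outcomes)

    # A = {(3, 3)}, B = {sum equals 7}, C = {sum is even}
    p_a = Fraction(sum(1 for o in outcomes if o == (3, 3)), n)
    p_b = Fraction(sum(1 for i, j in outcomes if i + j == 7), n)
    p_c = Fraction(sum(1 for i, j in outcomes if (i + j) % 2 == 0), n)

    print(p_a, p_b, p_c)   # 1/36, 1/6, 1/2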

3.8 Each student will obtain slightly different proportions. However, the proportions should be close to P(A) = 1/10, P(B) = 6/10, and P(C) = 3/10.

3.10 Define the following event:

    B: {Postal worker was assaulted on the job in the past year}

$P(B) = \frac{600}{12,000} = .05$

3.12 a. The 5 sample points are: Total population, Agricultural change, Presence of industry, Growth, and Population

    concentration.

    b. The probabilities are best estimated with the sample proportions. Thus,

P(Total population) = .18
P(Agricultural change) = .05
P(Presence of industry) = .27
P(Growth) = .05
P(Population concentration) = .45

    c. Define the following event:

    A: {Factor specified is population-related} P(A) = P(Total population) + P(Growth) + P(Population concentration) = .18 + .05 + .45 = .68.


3.14 a. The sample points of this experiment correspond to each of the 8 possible types of commodities. Suppose we introduce notation to make the listing of the sample points easier.

A: {carload contains agricultural products}
CH: {carload contains chemicals}
CO: {carload contains coal}
F: {carload contains forest products}
MO: {carload contains metallic ores and minerals}
MV: {carload contains motor vehicles and equipment}
N: {carload contains nonmetallic minerals and products}
O: {carload contains other}

The eight sample points are: A  CH  CO  F  MO  MV  N  O

b. The probability of each sample point is found by dividing the number of carloads for each sample point by the total number of carloads. The probabilities are:

P(A) = 41,690 / 335,770 = .124
P(CH) = 38,331 / 335,770 = .114
P(CO) = 124,595 / 335,770 = .371
P(F) = 21,929 / 335,770 = .065
P(MO) = 34,521 / 335,770 = .103
P(MV) = 22,906 / 335,770 = .068
P(N) = 37,416 / 335,770 = .111
P(O) = 14,382 / 335,770 = .043

c. P(MV) = .068

P(nonagricultural products) = P(CH) + P(CO) + P(F) + P(MO) + P(MV) + P(N) + P(O) = .114 + .371 + .065 + .103 + .068 + .111 + .043 = .875

d. P(CH) + P(CO) = .114 + .371 = .485

e. Since there were 335,770 carloads that week, the probability of selecting any one in particular would be 1 / 335,770 = .00000298. Thus, the probability of selecting the carload with the serial number 1003642 is .00000298.


3.16 a. Since order does not matter, the number of different bets would be a combination of 8 things taken 2 at a time. The number of ways would be

$\binom{8}{2} = \frac{8!}{2!(8-2)!} = \frac{8 \cdot 7 \cdot 6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{(2 \cdot 1)(6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1)} = \frac{40,320}{1,440} = 28$

b. If all players are of equal ability, then each of the 28 sample points would be equally likely.