
Upload: lindsey-thomasina-andrews

Post on 13-Dec-2015


4A-1

Descriptive Statistics (Part 1)

Numerical Description

Central Tendency

Dispersion

Chapter 4A

McGraw-Hill/Irwin © 2008 The McGraw-Hill Companies, Inc. All rights reserved.

4A-3

Numerical Description

• Statistics are descriptive measures derived from a sample (n items).

• Parameters are descriptive measures derived from a population (N items).

4A-4

Numerical Description

• Three key characteristics of numerical data:

Characteristic: Central Tendency
Interpretation: Where are the data values concentrated? What seem to be typical or middle data values?

Characteristic: Dispersion
Interpretation: How much variation is there in the data? How spread out are the data values? Are there unusual values?

Characteristic: Shape
Interpretation: Are the data values distributed symmetrically? Skewed? Sharply peaked? Flat? Bimodal?

4A-5

Numerical Description

Example: Vehicle Quality

• Consider the data set of vehicle defect rates from J. D. Power and Associates.

• Defect rate:

$$\text{Defect rate} = \frac{\text{total no. defects}}{\text{no. inspected}} \times 100$$

• Numerical statistics can be used to summarize this random sample of brands.

• Must allow for sampling error since the analysis is based on sampling.

4A-6

Numerical Description

• Number of defects per 100 vehicles, 1004 models.

4A-7

To begin, sort the data in Excel.

4A-8

Numerical Description

• Sorted data provides insight into central tendency and dispersion.

4A-9

Numerical Description

Visual Displays

• The dot plot offers a visual impression of the data.

4A-10

Numerical Description

Visual Displays

• Histograms with 5 bins (suggested by Sturges’ Rule) and 10 bins are shown below.

• Both are symmetric with no extreme values and show a modal class toward the low end.

4A-11

Descriptive Statistics in Excel

Go to Tools | Data Analysis and select Descriptive Statistics.

4A-12

Highlight the data range, specify a cell for the upper-left corner of the output range, check Summary Statistics and click OK.

4A-13

Here is the resulting analysis.

4A-14

Descriptive Statistics in MegaStat

4A-15

Here is the resulting MegaStat analysis:

4A-16

Central Tendency

• The central tendency is the middle or typical values of a distribution.

• Central tendency can be assessed using a dot plot, histogram or, more precisely, with numerical statistics.

4A-17

Central Tendency

Six Measures of Central Tendency

Statistic: Mean
Formula: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$
Excel formula: =AVERAGE(Data)
Pro: Familiar and uses all the sample information.
Con: Influenced by extreme values.

Statistic: Median
Formula: Middle value in sorted array
Excel formula: =MEDIAN(Data)
Pro: Robust when extreme data values exist.
Con: Ignores extremes and can be affected by gaps in data values.

4A-18

Central Tendency

Six Measures of Central Tendency

Statistic: Mode
Formula: Most frequently occurring data value
Excel formula: =MODE(Data)
Pro: Useful for attribute data or discrete data with a small range.
Con: May not be unique, and is not helpful for continuous data.

Statistic: Midrange
Formula: $\frac{x_{\min} + x_{\max}}{2}$
Excel formula: =0.5*(MIN(Data)+MAX(Data))
Pro: Easy to understand and calculate.
Con: Influenced by extreme values and ignores most data values.

4A-19

Central Tendency

Six Measures of Central Tendency

Statistic: Geometric mean (G)
Formula: $G = \sqrt[n]{x_1 x_2 \cdots x_n}$
Excel formula: =GEOMEAN(Data)
Pro: Useful for growth rates and mitigates high extremes.
Con: Less familiar and requires positive data.

Statistic: Trimmed mean
Formula: Same as the mean except omit highest and lowest k% of data values (e.g., 5%)
Excel formula: =TRIMMEAN(Data, %)
Pro: Mitigates effects of extreme values.
Con: Excludes some data values that could be relevant.
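For readers working outside Excel, the measures in the table can be sketched in Python. This is only an illustration: the data list is hypothetical (it is not from the slides), the helper names `midrange` and `trimmed_mean` are made up here, and the trimmed mean assumes that floor(n × k/100) values are dropped from each end, which is one common convention and may differ from Excel's rounding.

```python
import statistics

# Hypothetical illustrative data (not from the slides); one high extreme value.
data = [10, 12, 12, 15, 18, 21, 24, 30, 45, 113]


def midrange(values):
    """Point halfway between the smallest and largest values."""
    return 0.5 * (min(values) + max(values))


def trimmed_mean(values, k_percent):
    """Mean after dropping the lowest and highest k% of the sorted values.

    Assumption: floor(n * k / 100) values are removed from each end.
    """
    ordered = sorted(values)
    cut = int(len(ordered) * k_percent / 100)
    kept = ordered[cut:len(ordered) - cut]
    return statistics.mean(kept)


mean_value = statistics.mean(data)            # uses every value
median_value = statistics.median(data)        # middle of the sorted array
mode_value = statistics.mode(data)            # most frequent value
midrange_value = midrange(data)               # pulled up by the extreme 113
geo_mean = statistics.geometric_mean(data)    # requires positive data
trimmed = trimmed_mean(data, 10)              # drops 10 and 113

print(mean_value, median_value, mode_value, midrange_value, geo_mean, trimmed)
```

With one large value in the data, the midrange and mean sit well above the median and trimmed mean, which matches the pros and cons listed in the table.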

4A-20

Central Tendency

Mean

• A familiar measure of central tendency.

• In Excel, use function =AVERAGE(Data) where Data is an array of data values.

Population formula:

$$\mu = \frac{\sum_{i=1}^{N} x_i}{N}$$

Sample formula:

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

4A-21

Central Tendency

Mean

• For the sample of n = 37 car brands:

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{87 + 93 + 98 + \dots + 159 + 164 + 173}{37} = \frac{4639}{37} = 125.38$$

4A-22

Central Tendency

Characteristics of the Mean

• Arithmetic mean is the most familiar average.

• Affected by every sample item.

• The balancing point or fulcrum for the data.

4A-23

Central Tendency

Characteristics of the Mean

• Regardless of the shape of the distribution, the signed deviations of the data points from the mean always sum to zero:

$$\sum_{i=1}^{n} (x_i - \bar{x}) = 0$$

• Consider the following asymmetric distribution of quiz scores (42, 60, 70, 75, 78) whose mean = 65:

$$\sum_{i=1}^{n} (x_i - \bar{x}) = (42 - 65) + (60 - 65) + (70 - 65) + (75 - 65) + (78 - 65) = (-23) + (-5) + (5) + (10) + (13) = -28 + 28 = 0$$
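A minimal Python check of this property, using the five quiz scores from the slide (42, 60, 70, 75, 78). It also computes the sum of the absolute deviations to show that only the signed deviations cancel; the variable names are illustrative.

```python
# Quiz scores from the slide; their mean is 65.
scores = [42, 60, 70, 75, 78]

mean = sum(scores) / len(scores)

# Signed deviations from the mean: negatives and positives offset each other.
deviations = [x - mean for x in scores]
signed_sum = sum(deviations)

# Absolute deviations do not cancel; their sum is positive unless all values are equal.
absolute_sum = sum(abs(d) for d in deviations)

print(mean)          # 65.0
print(deviations)    # [-23.0, -5.0, 5.0, 10.0, 13.0]
print(signed_sum)    # 0.0
print(absolute_sum)  # 56.0
```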

4A-24

Central Tendency

Median

• The median (M) is the 50th percentile or midpoint of the sorted sample data.

• M separates the upper and lower half of the sorted observations.

• If n is odd, the median is the middle observation in the data array.

• If n is even, the median is the average of the middle two observations in the data array.

4A-25

Central Tendency

Median

• For n = 8, the median is between the fourth and fifth observations in the data array.

4A-26

Central Tendency

Median

• For n = 9, the median is the fifth observation in the data array.

4A-27

Central Tendency

Median

• Consider the following n = 6 data values: 11 12 15 17 21 32

• What is the median?

For even n,

$$\text{Median} = \frac{x_{n/2} + x_{(n/2)+1}}{2}$$

n/2 = 6/2 = 3 and n/2 + 1 = 6/2 + 1 = 4

M = (x_3 + x_4)/2 = (15 + 17)/2 = 16

4A-28

Central Tendency

Median

• Consider the following n = 7 data values: 12 23 23 25 27 34 41

• What is the median?

For odd n,

$$\text{Median} = x_{(n+1)/2}$$

(n+1)/2 = (7+1)/2 = 8/2 = 4

M = x_4 = 25
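The odd and even rules can be sketched as a small Python function and checked against the two data sets used in these slides. The function name is illustrative; Python's standard `statistics.median` applies the same rule.

```python
def median(values):
    """Median of a data set using the odd/even position rules."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median is undefined for empty data")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the observation at 1-based position (n + 1) / 2.
        return ordered[mid]
    # Even n: average of the observations at 1-based positions n/2 and n/2 + 1.
    return (ordered[mid - 1] + ordered[mid]) / 2


even_case = median([11, 12, 15, 17, 21, 32])      # n = 6
odd_case = median([12, 23, 23, 25, 27, 34, 41])   # n = 7

print(even_case)  # 16.0
print(odd_case)   # 25
```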

4A-29

Central Tendency

Median

• Use Excel’s function =MEDIAN(Data) where Data is an array of data values.

• For the 37 vehicle quality ratings (odd n) the position of the median is (n+1)/2 = (37+1)/2 = 19.

• So, the median is x_19 = 121.

• When there are several duplicate data values, the median does not provide a clean “50-50” split in the data.

4A-30

Central Tendency

Characteristics of the Median

• The median is insensitive to extreme data values.

• For example, consider the following quiz scores for 3 students:

Tom’s scores: 20, 40, 70, 75, 80 (Mean = 57, Median = 70, Total = 285)
Jake’s scores: 60, 65, 70, 90, 95 (Mean = 76, Median = 70, Total = 380)
Mary’s scores: 50, 65, 70, 75, 90 (Mean = 70, Median = 70, Total = 350)

• What does the median for each student tell you?

4A-31

Central Tendency

Mode

• The most frequently occurring data value.

• Similar to mean and median if data values occur often near the center of sorted data.

• May have multiple modes or no mode.

4A-32

Central Tendency

Mode

• For example, consider the following quiz scores for 4 students:

Lee’s scores: 60, 70, 70, 70, 80 (Mean = 70, Median = 70, Mode = 70)
Pat’s scores: 45, 45, 70, 90, 100 (Mean = 70, Median = 70, Mode = 45)
Sam’s scores: 50, 60, 70, 80, 90 (Mean = 70, Median = 70, Mode = none)
Xiao’s scores: 50, 50, 70, 90, 90 (Mean = 70, Median = 70, Modes = 50, 90)

• What does the mode for each student tell you?
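As a sketch of how "no mode" and "multiple modes" can be handled in code, the following Python function returns every most-frequent value and an empty list when no value repeats. The function name `modes` and the "empty list means no mode" convention are choices made for this illustration.

```python
from collections import Counter


def modes(values):
    """Return all most-frequent values; an empty list means no value repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # every value occurs once, so there is no mode
    return sorted(v for v, c in counts.items() if c == highest)


# Quiz scores from the slide.
students = {
    "Lee": [60, 70, 70, 70, 80],
    "Pat": [45, 45, 70, 90, 100],
    "Sam": [50, 60, 70, 80, 90],
    "Xiao": [50, 50, 70, 90, 90],
}

results = {name: modes(scores) for name, scores in students.items()}
for name, found in results.items():
    print(name, found)
```

All four students share the same mean and median (70), so the mode is the only one of the three measures that separates them.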

4A-33

Central Tendency

Mode

• Easy to define, not easy to calculate in large samples.

• Use Excel’s function =MODE(Array)
- will return #N/A if there is no mode.
- will return first mode found if multimodal.

• May be far from the middle of the distribution and not at all typical.

4A-34

Central Tendency

Mode

• Generally isn’t useful for continuous data since data values rarely repeat.

• Best for attribute data or a discrete variable with a small range (e.g., Likert scale).

4A-35

Central Tendency

Example: Price/Earnings Ratios and Mode

• Consider the following P/E ratios for a random sample of 68 Standard & Poor’s 500 stocks.

• What is the mode?

7 8 8 10 10 10 10 12 13 13 13 13 13 13 13 14 14

14 15 15 15 15 15 16 16 16 17 18 18 18 18 19 19 19

19 19 20 20 20 21 21 21 22 22 23 23 23 24 25 26 26

26 26 27 29 29 30 31 34 36 37 40 41 45 48 55 68 91

4A-36

Central Tendency

Example: Price/Earnings Ratios and Mode

• Excel’s descriptive statistics results are:

Mean 22.7206
Median 19
Mode 13
Range 84
Minimum 7
Maximum 91
Sum 1545
Count 68

• The mode 13 occurs 7 times, but what does the dot plot show?
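The summary figures can be reproduced from the 68 P/E ratios listed on the previous slide. The sketch below uses Python's standard `statistics` module rather than Excel; the variable names are illustrative.

```python
import statistics

# The 68 P/E ratios from the slide, already sorted.
pe_ratios = [
    7, 8, 8, 10, 10, 10, 10, 12, 13, 13, 13, 13, 13, 13, 13, 14, 14,
    14, 15, 15, 15, 15, 15, 16, 16, 16, 17, 18, 18, 18, 18, 19, 19, 19,
    19, 19, 20, 20, 20, 21, 21, 21, 22, 22, 23, 23, 23, 24, 25, 26, 26,
    26, 26, 27, 29, 29, 30, 31, 34, 36, 37, 40, 41, 45, 48, 55, 68, 91,
]

summary = {
    "Mean": round(statistics.mean(pe_ratios), 4),
    "Median": statistics.median(pe_ratios),
    "Mode": statistics.mode(pe_ratios),
    "Range": max(pe_ratios) - min(pe_ratios),
    "Minimum": min(pe_ratios),
    "Maximum": max(pe_ratios),
    "Sum": sum(pe_ratios),
    "Count": len(pe_ratios),
}

# How often the modal value occurs.
mode_count = pe_ratios.count(summary["Mode"])

for label, value in summary.items():
    print(label, value)
```

The mode (13) lies well below both the median (19) and the mean (22.72), so here it is not a typical value.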

4A-37

Central Tendency

Example: Rose Bowl Winners’ Points

• Points scored by the winning NCAA football team tend to have modes in multiples of 7 because each touchdown yields 7 points.

• Consider the dot plot of the points scored by the winning team in the first 87 Rose Bowl games.

• What is the mode?

4A-38

Central Tendency

Skewness

• Compare mean and median or look at histogram to determine degree of skewness.

4A-39

Central Tendency

Symptoms of Skewness

Distribution’s shape: Skewed left (negative skewness)
Histogram appearance: Long tail of histogram points left (a few low values but most data on right)
Statistics: Mean < Median

Distribution’s shape: Symmetric
Histogram appearance: Tails of histogram are balanced (low/high values offset)
Statistics: Mean ≈ Median

Distribution’s shape: Skewed right (positive skewness)
Histogram appearance: Long tail of histogram points right (most data on left but a few high values)
Statistics: Mean > Median
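The mean-versus-median comparison in the table can be written as a rough rule of thumb. In the Python sketch below, the function name and the `tolerance` parameter (how close the two must be to count as "about equal") are assumptions added for illustration; the slides give no numeric threshold, and a histogram should still be checked.

```python
def skewness_symptom(mean, median, tolerance=0.0):
    """Classify the shape suggested by comparing mean and median.

    `tolerance` is a hypothetical cutoff for treating the two as equal.
    """
    if mean - median > tolerance:
        return "skewed right"
    if median - mean > tolerance:
        return "skewed left"
    return "symmetric"


# Summary values taken from earlier slides.
pe_shape = skewness_symptom(22.72, 19)   # P/E ratios: mean > median
tom_shape = skewness_symptom(57, 70)     # Tom's quiz scores: mean < median
mary_shape = skewness_symptom(70, 70)    # Mary's quiz scores: mean = median

print(pe_shape, tom_shape, mary_shape)
```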

4A-40

Central Tendency

Midrange

• The midrange is the point halfway between the lowest and highest values of X.

• Easy to use but sensitive to extreme data values.

$$\text{Midrange} = \frac{x_{\min} + x_{\max}}{2}$$

• For the J. D. Power quality data (n = 37):

$$\text{Midrange} = \frac{x_{\min} + x_{\max}}{2} = \frac{x_1 + x_{37}}{2} = \frac{87 + 173}{2} = 130$$

• Here, the midrange (130) is higher than the mean (125.38) or median (121).

4A-41

Dispersion

Measures of Variation

• Variation is the “spread” of data points about the center of the distribution in a sample. Consider the following measures of dispersion:

Statistic: Range
Formula: $x_{\max} - x_{\min}$
Excel: =MAX(Data)-MIN(Data)
Pro: Easy to calculate.
Con: Sensitive to extreme data values.

Statistic: Variance (s²)
Formula: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
Excel: =VAR(Data)
Pro: Plays a key role in mathematical statistics.
Con: Non-intuitive meaning.

4A-42

Dispersion

Measures of Variation

Statistic: Standard deviation (s)
Formula: $s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$
Excel: =STDEV(Data)
Pro: Most common measure. Uses same units as the raw data ($, £, ¥, etc.).
Con: Non-intuitive meaning.

Statistic: Coefficient of variation (CV)
Formula: $CV = 100 \times \frac{s}{\bar{x}}$
Excel: None
Pro: Measures relative variation in percent so can compare data sets.
Con: Requires non-negative data.

4A-43

Dispersion

Measures of Variation

Statistic: Mean absolute deviation (MAD)
Formula: $MAD = \frac{\sum_{i=1}^{n} |x_i - \bar{x}|}{n}$
Excel: =AVEDEV(Data)
Pro: Easy to understand.
Con: Lacks “nice” theoretical properties.

4A-44

Dispersion

Range

• The difference between the largest and smallest observation.

Range = x_max – x_min

• For example, for the n = 68 P/E ratios,

Range = 91 – 7 = 84

4A-45

Dispersion

Variance

• The population variance (σ²) is defined as the sum of squared deviations around the mean divided by the population size.

$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$$

• For the sample variance (s²), we divide by n – 1 instead of n, otherwise s² would tend to underestimate the unknown population variance σ².

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

4A-46

Dispersion

Standard Deviation

• The square root of the variance.

• Units of measure are the same as X.

Population standard deviation:

$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$$

Sample standard deviation:

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$

• Explains how individual values in a data set vary from the mean.

4A-47

Dispersion

Standard Deviation

• Excel’s built-in functions are:

Statistic: Variance
Excel population formula: =VARP(Array)
Excel sample formula: =VAR(Array)

Statistic: Standard deviation
Excel population formula: =STDEVP(Array)
Excel sample formula: =STDEV(Array)
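The same population/sample distinction exists in Python's standard `statistics` module: `pvariance` and `pstdev` divide by N, while `variance` and `stdev` divide by n – 1. The sketch below applies all four to the quiz scores used earlier for the mean (42, 60, 70, 75, 78); treating those scores as the data here is a choice made for illustration.

```python
import statistics

# Quiz scores from the "Characteristics of the Mean" slide (mean = 65).
scores = [42, 60, 70, 75, 78]

# Sum of squared deviations is 848 for these scores.
pop_var = statistics.pvariance(scores)   # 848 / 5 = 169.6   (like =VARP)
samp_var = statistics.variance(scores)   # 848 / 4 = 212     (like =VAR)
pop_sd = statistics.pstdev(scores)       # sqrt(169.6)       (like =STDEVP)
samp_sd = statistics.stdev(scores)       # sqrt(212)         (like =STDEV)

print(pop_var, samp_var)
print(round(pop_sd, 2), round(samp_sd, 2))
```

Dividing by n – 1 always gives the larger of the two results, which is the correction for underestimation described on the variance slide.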

4A-48

Dispersion

Calculating a Standard Deviation

• Consider the following five quiz scores for Stephanie.

4A-49

Dispersion

Calculating a Standard Deviation

• Now, calculate the sample standard deviation:

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} = \sqrt{\frac{2380}{5 - 1}} = \sqrt{595} = 24.39$$

• Somewhat easier, the two-sum formula can also be used:

$$s = \sqrt{\frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1}} = \sqrt{\frac{28300 - \frac{(360)^2}{5}}{5 - 1}} = \sqrt{\frac{28300 - 25920}{5 - 1}} = \sqrt{595} = 24.39$$
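Both formulas can be sketched in Python. Stephanie's individual scores are not reproduced in this transcript, so the two-sum version is applied directly to the totals shown in the calculation (n = 5, Σx = 360, Σx² = 28300). The agreement of the two formulas is then checked on the quiz scores from the earlier mean example. Function names are illustrative.

```python
import math


def stdev_definitional(data):
    """Sample standard deviation from squared deviations around the mean."""
    n = len(data)
    mean = sum(data) / n
    return math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))


def stdev_two_sum(n, total, total_of_squares):
    """Sample standard deviation from n, the sum of x, and the sum of x squared."""
    return math.sqrt((total_of_squares - total ** 2 / n) / (n - 1))


# Totals taken from the slide's calculation.
s_slide = stdev_two_sum(5, 360, 28300)

# Cross-check on raw data from an earlier slide (42, 60, 70, 75, 78).
scores = [42, 60, 70, 75, 78]
s_def = stdev_definitional(scores)
s_two = stdev_two_sum(len(scores), sum(scores), sum(x * x for x in scores))

print(round(s_slide, 2))              # 24.39
print(round(s_def, 2), round(s_two, 2))
```

The two-sum form needs only running totals, but it subtracts two large numbers, so with floating-point data of very large magnitude the definitional form is usually the safer choice.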

4A-50

Dispersion

Calculating a Standard Deviation

• The standard deviation is nonnegative because deviations around the mean are squared.

• When every observation is exactly equal to the mean, the standard deviation is zero.

• Standard deviations can be large or small, depending on the units of measure.

• Compare standard deviations only for data sets measured in the same units and only if the means do not differ substantially.

4A-51

Dispersion

Coefficient of Variation

• Useful for comparing variables measured in different units or with different means.

• A unit-free measure of dispersion.

• Expressed as a percent of the mean.

$$CV = 100 \times \frac{s}{\bar{x}}$$

• Only appropriate for nonnegative data. It is undefined if the mean is zero or negative.

4A-52

Dispersion

Coefficient of Variation

$$CV = 100 \times \frac{s}{\bar{x}}$$

• For example:

Defect rates (n = 37): s = 22.89, $\bar{x}$ = 125.38 gives CV = 100 × (22.89)/(125.38) = 18%

ATM deposits (n = 100): s = 280.80, $\bar{x}$ = 233.89 gives CV = 100 × (280.80)/(233.89) = 120%

P/E ratios (n = 68): s = 14.08, $\bar{x}$ = 22.72 gives CV = 100 × (14.08)/(22.72) = 62%
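A small Python sketch of the CV calculation, applied to the three (s, mean) pairs above. For the P/E ratios it uses s = 14.08, the value that appears in the slide's calculation and that matches the sample standard deviation of the 68 listed ratios. The function name and the choice to raise an error for a non-positive mean are illustrative.

```python
def coefficient_of_variation(s, mean):
    """Standard deviation expressed as a percent of the mean."""
    if mean <= 0:
        raise ValueError("CV is undefined when the mean is zero or negative")
    return 100 * s / mean


# (standard deviation, mean) pairs from the slide.
examples = {
    "Defect rates": (22.89, 125.38),
    "ATM deposits": (280.80, 233.89),
    "P/E ratios": (14.08, 22.72),
}

cv_percent = {
    name: round(coefficient_of_variation(s, mean))
    for name, (s, mean) in examples.items()
}

for name, cv in cv_percent.items():
    print(f"{name}: CV = {cv}%")
```

Because CV is unit-free, the three data sets can be ranked by relative variation even though they are measured in defects, dollars, and ratios.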

4A-53

Dispersion

Mean Absolute Deviation

• The Mean Absolute Deviation (MAD) reveals the average distance from an individual data point to the mean (center of the distribution).

• Uses absolute values of the deviations around the mean.

$$MAD = \frac{\sum_{i=1}^{n} |x_i - \bar{x}|}{n}$$

• Excel’s function is =AVEDEV(Array)
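A minimal Python sketch of the MAD formula, using the quiz scores from the earlier mean example (42, 60, 70, 75, 78) as illustrative data. The function name is an assumption; it is meant to mirror what =AVEDEV does rather than reproduce Excel exactly.

```python
def mean_absolute_deviation(data):
    """Average absolute distance of the data points from their mean."""
    n = len(data)
    mean = sum(data) / n
    return sum(abs(x - mean) for x in data) / n


scores = [42, 60, 70, 75, 78]   # mean = 65
mad = mean_absolute_deviation(scores)

# Absolute deviations are 23, 5, 5, 10, 13, which sum to 56.
print(mad)  # 11.2
```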

4A-54

Dispersion

Central Tendency vs. Dispersion: Manufacturing

• Consider the histograms of hole diameters drilled in a steel plate during manufacturing.

• The desired distribution is outlined in red.

Histograms: Machine A, Machine B

4A-55

Dispersion

Central Tendency vs. Dispersion: Manufacturing

Histograms: Machine A, Machine B

Desired mean (5 mm) but too much variation.

Acceptable variation but mean is less than 5 mm.

• Take frequent samples to monitor quality.

4A-56

Dispersion

Central Tendency vs. Dispersion: Job Performance

• Consider student ratings of four professors on eight teaching attributes (10-point scale).

4A-57

Dispersion

Central Tendency vs. Dispersion: Job Performance

• Jones and Wu have identical means but different standard deviations.

4A-58

Dispersion

Central Tendency vs. Dispersion: Job Performance

• Smith and Gopal have different means but identical standard deviations.

4A-59

Dispersion

Central Tendency vs. Dispersion: Job Performance

• A high mean (better rating) and low standard deviation (more consistency) is preferred. Which professor do you think is best?