sbe10_03b (ca) descriptive stats-numerical measures

Upload: krisbdb

Post on 08-Apr-2018


TRANSCRIPT

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    1/48

    2005 Thomson/South-Western

    Chapter 3
    Descriptive Statistics: Numerical Measures
    Part B

    - Measures of Distribution Shape, Relative Location, and Detecting Outliers
    - Exploratory Data Analysis
    - Measures of Association Between Two Variables
    - The Weighted Mean and Working with Grouped Data

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    2/48

    Measures of Distribution Shape, Relative Location, and Detecting Outliers

    - Distribution Shape
    - z-Scores
    - Chebyshev's Theorem
    - Empirical Rule
    - Detecting Outliers

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    3/48

    Distribution Shape: Skewness

    - An important measure of the shape of a distribution is called skewness.
    - The formula for computing skewness for a data set is somewhat complex.
    - Skewness can be easily computed using statistical software.
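
    The slide does not say which skewness formula it has in mind. As a rough sketch only, the code below assumes the adjusted sample skewness, n / ((n - 1)(n - 2)) times the sum of cubed standardized deviations; the function name `sample_skewness` is illustrative.

```python
import math

def sample_skewness(values):
    """Adjusted sample skewness: n / ((n-1)(n-2)) * sum(((x - mean) / s) ** 3)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three observations")
    mean = sum(values) / n
    # Sample standard deviation (divisor n - 1).
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    if s == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in values)

print(sample_skewness([1, 2, 3, 4, 5]))   # symmetric data: zero skewness
print(sample_skewness([1, 1, 2, 2, 10]))  # long right tail: positive skewness
```

    Under this definition, symmetric data give zero, a long right tail gives a positive value, and a long left tail gives a negative value.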

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    4/48

    Distribution Shape: Skewness

    - Symmetric (not skewed)
      - Skewness is zero.
      - Mean and median are equal.

    [Figure: relative frequency histogram, vertical axis from 0 to .35; Skewness = 0]

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    5/48

    Distribution Shape: Skewness

    - Moderately Skewed Left
      - Skewness is negative.
      - Mean will usually be less than the median.

    [Figure: relative frequency histogram, vertical axis from 0 to .35; Skewness = -.31]

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    6/48

    Distribution Shape: Skewness

    - Moderately Skewed Right
      - Skewness is positive.
      - Mean will usually be more than the median.

    [Figure: relative frequency histogram, vertical axis from 0 to .35; Skewness = .31]

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    7/48

    Distribution Shape: Skewness

    - Highly Skewed Right
      - Skewness is positive (often above 1.0).
      - Mean will usually be more than the median.

    [Figure: relative frequency histogram, vertical axis from 0 to .35; Skewness = 1.25]

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    8/48

    Distribution Shape: Skewness

    - Example: Apartment Rents

    Seventy efficiency apartments were randomly sampled in a small college town. The monthly rent prices for these apartments are listed in ascending order on the next slide.

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    9/48

    Distribution Shape: Skewness

    Monthly rents ($) for the 70 sampled apartments, in ascending order:

    425 430 430 435 435 435 435 435 440 440
    440 440 440 445 445 445 445 445 450 450
    450 450 450 450 450 460 460 460 465 465
    465 470 470 472 475 475 475 480 480 480
    480 485 490 490 490 500 500 500 500 510
    510 515 525 525 525 535 549 550 570 570
    575 575 580 590 600 600 600 600 615 615

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    10/48

    Distribution Shape: Skewness

    [Figure: relative frequency histogram of the apartment rents, vertical axis from 0 to .35; Skewness = .92]

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    11/48

    z-Scores

    The z-score is often called the standardized value.

    It denotes the number of standard deviations a data value $x_i$ is from the mean.

    $$z_i = \frac{x_i - \bar{x}}{s}$$
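
    The definition above translates directly into a one-line computation. This is a minimal sketch; the function name `z_score` is illustrative, and the inputs are the smallest rent, sample mean, and sample standard deviation quoted for the apartment rent example.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / std_dev

# Smallest rent in the apartment example: 425, with mean 490.80 and s = 54.74.
print(round(z_score(425, 490.80, 54.74), 2))  # -1.2
```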

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    12/48

    z-Scores

    - A data value less than the sample mean will have a z-score less than zero.
    - A data value greater than the sample mean will have a z-score greater than zero.
    - A data value equal to the sample mean will have a z-score of zero.
    - An observation's z-score is a measure of the relative location of the observation in a data set.

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    13/48

    z-Scores

    - z-Score of Smallest Value (425)

    $$z = \frac{x_i - \bar{x}}{s} = \frac{425 - 490.80}{54.74} = -1.20$$

    Standardized Values for Apartment Rents

    -1.20 -1.11 -1.11 -1.02 -1.02 -1.02 -1.02 -1.02 -0.93 -0.93
    -0.93 -0.93 -0.93 -0.84 -0.84 -0.84 -0.84 -0.84 -0.75 -0.75
    -0.75 -0.75 -0.75 -0.75 -0.75 -0.56 -0.56 -0.56 -0.47 -0.47
    -0.47 -0.38 -0.38 -0.34 -0.29 -0.29 -0.29 -0.20 -0.20 -0.20
    -0.20 -0.11 -0.01 -0.01 -0.01 0.17 0.17 0.17 0.17 0.35
    0.35 0.44 0.62 0.62 0.62 0.81 1.06 1.08 1.45 1.45
    1.54 1.54 1.63 1.81 1.99 1.99 1.99 1.99 2.27 2.27

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    14/48

    Chebyshev's Theorem (or Chebyshev's inequality)

    At least $(1 - 1/z^2)$ of the items in any data set will be within $z$ standard deviations of the mean, where $z$ is any value greater than 1 ($z > 1$; note that the value 1 itself is not included).

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    15/48

    [Portrait: Pafnuty Chebyshev (1821-1894)]

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    16/48

    Chebyshev's Theorem

    - At least 75% of the data values must be within z = 2 standard deviations of the mean.
    - At least 89% of the data values must be within z = 3 standard deviations of the mean.
    - At least 94% of the data values must be within z = 4 standard deviations of the mean.
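
    The three percentages follow from evaluating the bound 1 - 1/z^2 at z = 2, 3, and 4. A minimal sketch, with the illustrative function name `chebyshev_lower_bound`:

```python
def chebyshev_lower_bound(z):
    """Minimum share of any data set lying within z standard deviations of the mean."""
    if z <= 1:
        raise ValueError("Chebyshev's theorem requires z > 1")
    return 1 - 1 / z ** 2

for z in (2, 3, 4):
    # Prints 75.00%, 88.89%, and 93.75%; the slide rounds the last two to 89% and 94%.
    print(z, f"{chebyshev_lower_bound(z):.2%}")
```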

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    17/48

    Chebyshev's Theorem

    For example, let $z = 1.5$ with $\bar{x} = 490.80$ and $s = 54.74$.

    At least $(1 - 1/(1.5)^2) = 1 - 0.44 = 0.56$, or 56%, of the rent values must be between

    $$\bar{x} - z(s) = 490.80 - 1.5(54.74) = 409$$

    and

    $$\bar{x} + z(s) = 490.80 + 1.5(54.74) = 573$$

    (Actually, 86% of the rent values are between 409 and 573.)
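
    The 86% figure can be checked by counting the rents that fall inside the interval. This sketch reuses the 70 rents listed earlier; the variable names are illustrative.

```python
RENTS = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

mean, s, z = 490.80, 54.74, 1.5
lower = mean - z * s  # about 409
upper = mean + z * s  # about 573

# Count the rents that actually fall inside the interval.
inside = sum(1 for r in RENTS if lower <= r <= upper)
share = inside / len(RENTS)
print(round(lower), round(upper), inside, f"{share:.0%}")  # 409 573 60 86%
```

    Sixty of the 70 rents lie in the interval, well above the 56% that the theorem guarantees.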

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    18/48

    Empirical Rule

    For data having a bell-shaped distribution:

    - 68.26% of the values of a normal random variable are within +/- 1 standard deviation of its mean.
    - 95.44% of the values of a normal random variable are within +/- 2 standard deviations of its mean.
    - 99.72% of the values of a normal random variable are within +/- 3 standard deviations of its mean.
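
    For a normal random variable, the share within k standard deviations of the mean equals erf(k / sqrt(2)). The sketch below evaluates this with Python's `math.erf`; the function name `normal_share_within` is illustrative.

```python
import math

def normal_share_within(k):
    """P(|Z| < k) for a standard normal random variable Z."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, f"{normal_share_within(k):.2%}")
```

    Computed this way, the shares round to 68.27%, 95.45%, and 99.73%. The slide's figures are slightly lower, presumably because they were taken from a rounded table.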

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    19/48

    Empirical Rule

    [Figure: bell-shaped curve; the horizontal axis marks the mean and the points 1, 2, and 3 standard deviations on either side of it, with the corresponding intervals labeled 68.26%, 95.44%, and 99.72%]

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    20/48

    Detecting Outliers

    An outlier is an unusually small or unusually large value in a data set.

    A data value with a z-score less than -3 or greater than +3 might be considered an outlier.

    It might be:
    - an incorrectly recorded data value
    - a data value that was incorrectly included in the data set
    - a correctly recorded data value that belongs in the data set

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    21/48

    Detecting Outliers

    Standardized Values for Apartment Rents

    -1.20 -1.11 -1.11 -1.02 -1.02 -1.02 -1.02 -1.02 -0.93 -0.93
    -0.93 -0.93 -0.93 -0.84 -0.84 -0.84 -0.84 -0.84 -0.75 -0.75
    -0.75 -0.75 -0.75 -0.75 -0.75 -0.56 -0.56 -0.56 -0.47 -0.47
    -0.47 -0.38 -0.38 -0.34 -0.29 -0.29 -0.29 -0.20 -0.20 -0.20
    -0.20 -0.11 -0.01 -0.01 -0.01 0.17 0.17 0.17 0.17 0.35
    0.35 0.44 0.62 0.62 0.62 0.81 1.06 1.08 1.45 1.45
    1.54 1.54 1.63 1.81 1.99 1.99 1.99 1.99 2.27 2.27

    The most extreme z-scores are -1.20 and 2.27.

    Using $|z| > 3$ as the criterion for an outlier, there are no outliers in this data set.
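
    The same check can be run on the raw rents. This sketch standardizes each value using the sample mean and sample standard deviation from Python's `statistics` module, then flags any value with |z| > 3; the variable names are illustrative.

```python
import statistics

RENTS = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

mean = statistics.mean(RENTS)
s = statistics.stdev(RENTS)  # sample standard deviation (divisor n - 1)

z_scores = [(r - mean) / s for r in RENTS]
# Flag values more than 3 standard deviations from the mean.
outliers = [r for r, z in zip(RENTS, z_scores) if abs(z) > 3]

print(round(min(z_scores), 2), round(max(z_scores), 2), outliers)  # -1.2 2.27 []
```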

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    22/48

    Exploratory Data Analysis

    - Five-Number Summary
    - Box Plot

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    23/48

    Five-Number Summary

    1. Smallest Value
    2. First Quartile
    3. Median
    4. Third Quartile
    5. Largest Value

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    24/48

    Five-Number Summary

    425 430 430 435 435 435 435 435 440 440
    440 440 440 445 445 445 445 445 450 450
    450 450 450 450 450 460 460 460 465 465
    465 470 470 472 475 475 475 480 480 480
    480 485 490 490 490 500 500 500 500 510
    510 515 525 525 525 535 549 550 570 570
    575 575 580 590 600 600 600 600 615 615

    Lowest Value = 425
    First Quartile = 445
    Median = 475
    Third Quartile = 525
    Largest Value = 615
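
    The slide does not state which quartile convention it uses. As an assumption, the sketch below uses one common textbook rule: compute the position i = (p/100)n, round up when i is not an integer, and average positions i and i + 1 when it is. The helper name `percentile` is illustrative; other conventions can give slightly different quartiles.

```python
import math

RENTS = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

def percentile(sorted_values, p):
    """p-th percentile (0 < p < 100) of data already sorted in ascending order."""
    n = len(sorted_values)
    i = p / 100 * n
    if i == int(i):
        # Integer position: average the values at positions i and i + 1 (1-based).
        k = int(i)
        return (sorted_values[k - 1] + sorted_values[k]) / 2
    # Otherwise round up to the next position (1-based).
    return sorted_values[math.ceil(i) - 1]

data = sorted(RENTS)
summary = (data[0], percentile(data, 25), percentile(data, 50),
           percentile(data, 75), data[-1])
print(summary)  # (425, 445, 475.0, 525, 615)
```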

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    25/48

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    26/48

    Box Plot

    - Limits are located (not drawn) using the interquartile range (IQR).
    - Data outside these limits are considered outliers.
    - The location of each outlier is shown with the symbol *.

    continued

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    27/48

    Box Plot

    The lower limit is located 1.5(IQR) below Q1.

    Lower Limit: Q1 - 1.5(IQR) = 445 - 1.5(80) = 325

    The upper limit is located 1.5(IQR) above Q3.

    Upper Limit: Q3 + 1.5(IQR) = 525 + 1.5(80) = 645

    There are no outliers (values less than 325 or greater than 645) in the apartment rent data.
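
    Both limits use the same interquartile range, IQR = Q3 - Q1 = 525 - 445 = 80. A minimal sketch of the calculation, with the illustrative function name `box_plot_limits`:

```python
def box_plot_limits(q1, q3, k=1.5):
    """Lower and upper limits located k * IQR beyond the quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

lower, upper = box_plot_limits(445, 525)
print(lower, upper)  # 325.0 645.0
```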

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    28/48

    Box Plot

    - Whiskers (dashed lines) are drawn from the ends of the box to the smallest and largest data values inside the limits.

    [Figure: box plot of the apartment rents on an axis from 375 to 625; smallest value inside limits = 425, largest value inside limits = 615]

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    29/48

    Measures of Association Between Two Variables

    - Covariance
    - Correlation Coefficient

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    30/48

    Covariance

    The covariance is a measure of the linear association between two variables.

    Positive values indicate a positive relationship.

    Negative values indicate a negative relationship.

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    31/48

    Covariance

    The covariance is computed as follows:

    for samples:

    $$s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$

    for populations:

    $$\sigma_{xy} = \frac{\sum (x_i - \mu_x)(y_i - \mu_y)}{N}$$

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    32/48

    Correlation Coefficient

    The correlation coefficient is computed as follows:

    for samples:

    $$r_{xy} = \frac{s_{xy}}{s_x s_y}$$

    for populations:

    $$\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$$

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    33/48

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    34/48

    Correlation Coefficient

    Correlation is a measure of linear association and not necessarily causation.

    Just because two variables are highly correlated, it does not mean that one variable is the cause of the other.

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    35/48

    Covariance and Correlation Coefficient

    A golfer is interested in investigating the relationship, if any, between driving distance and 18-hole score.

    Average Driving Distance (yds.)    Average 18-Hole Score
    277.6                              69
    259.5                              71
    269.1                              70
    267.0                              70
    255.6                              71
    272.9                              69

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    36/48

    Covariance and Correlation Coefficient

    x          y       (x_i - x̄)   (y_i - ȳ)   (x_i - x̄)(y_i - ȳ)
    277.6      69       10.65       -1.0        -10.65
    259.5      71       -7.45        1.0         -7.45
    269.1      70        2.15        0             0
    267.0      70        0.05        0             0
    255.6      71      -11.35        1.0        -11.35
    272.9      69        5.95       -1.0         -5.95

    Average    267.0    70.0                    Total: -35.40
    Std. Dev.  8.2192   .8944

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    37/48

    Covariance and Correlation Coefficient

    - Sample Covariance

    $$s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1} = \frac{-35.40}{6 - 1} = -7.08$$

    - Sample Correlation Coefficient

    $$r_{xy} = \frac{s_{xy}}{s_x s_y} = \frac{-7.08}{(8.2192)(.8944)} = -.9631$$
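
    The golf example can be reproduced from the six data pairs. This is a minimal sketch using the sample formulas (divisor n - 1); the function and variable names are illustrative.

```python
import math

distance = [277.6, 259.5, 269.1, 267.0, 255.6, 272.9]  # average driving distance (yds.)
score = [69, 71, 70, 70, 71, 69]                        # average 18-hole score

def sample_covariance(xs, ys):
    """Sum of cross-products of deviations, divided by n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

def sample_std(xs):
    """Sample standard deviation (divisor n - 1)."""
    n = len(xs)
    x_bar = sum(xs) / n
    return math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))

s_xy = sample_covariance(distance, score)
r_xy = s_xy / (sample_std(distance) * sample_std(score))
print(round(s_xy, 2), round(r_xy, 4))  # -7.08 -0.9631
```

    The strongly negative correlation says that longer average drives go with lower (better) scores in this sample; it does not by itself show that one causes the other.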

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    38/48

    The Weighted Mean and Working with Grouped Data

    - Weighted Mean
    - Mean for Grouped Data
    - Variance for Grouped Data
    - Standard Deviation for Grouped Data

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    39/48

    Weighted Mean

    When the mean is computed by giving each data value a weight that reflects its importance, it is referred to as a weighted mean.

    In the computation of a grade point average (GPA), the weights are the number of credit hours earned for each grade.

    When data values vary in importance, the analyst must choose the weight that best reflects the importance of each value.

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    40/48

    Weighted Mean

    $$\bar{x} = \frac{\sum w_i x_i}{\sum w_i}$$

    where:
    $x_i$ = value of observation $i$
    $w_i$ = weight for observation $i$
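
    As an illustration of the formula, the sketch below computes a GPA-style weighted mean. The grade points and credit hours are hypothetical values chosen for the example, not data from the slides, and the function name `weighted_mean` is illustrative.

```python
def weighted_mean(values, weights):
    """Sum of w_i * x_i divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight

# Hypothetical transcript: grade points earned and credit hours per course.
grade_points = [4.0, 3.0, 2.0]
credit_hours = [3, 4, 3]
print(weighted_mean(grade_points, credit_hours))  # 3.0
```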

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    41/48

    Grouped Data

    The weighted mean computation can be used to obtain approximations of the mean, variance, and standard deviation for the grouped data.

    To compute the weighted mean, we treat the midpoint of each class as though it were the mean of all items in the class.

    We compute a weighted mean of the class midpoints using the class frequencies as weights.

    Similarly, in computing the variance and standard deviation, the class frequencies are used as weights.

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    42/48

    Mean for Grouped Data

    - Sample Data

    $$\bar{x} = \frac{\sum f_i M_i}{n}$$

    - Population Data

    $$\mu = \frac{\sum f_i M_i}{N}$$

    where:
    $f_i$ = frequency of class $i$
    $M_i$ = midpoint of class $i$

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    43/48

    Sample Mean for Grouped Data

    Given below is the previous sample of monthly rents for 70 efficiency apartments, presented here as grouped data in the form of a frequency distribution.

    Rent ($)    Frequency
    420-439      8
    440-459     17
    460-479     12
    480-499      8
    500-519      7
    520-539      4
    540-559      2
    560-579      4
    580-599      2
    600-619      6

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    44/48

    Sample Mean for Grouped Data

    Rent ($)    f_i    M_i      f_i M_i
    420-439      8     429.5     3436.0
    440-459     17     449.5     7641.5
    460-479     12     469.5     5634.0
    480-499      8     489.5     3916.0
    500-519      7     509.5     3566.5
    520-539      4     529.5     2118.0
    540-559      2     549.5     1099.0
    560-579      4     569.5     2278.0
    580-599      2     589.5     1179.0
    600-619      6     609.5     3657.0
    Total       70              34525.0

    $$\bar{x} = \frac{34{,}525}{70} = 493.21$$

    This approximation differs by $2.41 from the actual sample mean of $490.80.
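
    The grouped-mean calculation can be sketched as follows, using the class limits and frequencies from the rent frequency distribution. The variable names are illustrative.

```python
# (lower class limit, upper class limit, frequency) for each rent class
CLASSES = [
    (420, 439, 8), (440, 459, 17), (460, 479, 12), (480, 499, 8), (500, 519, 7),
    (520, 539, 4), (540, 559, 2), (560, 579, 4), (580, 599, 2), (600, 619, 6),
]

midpoints = [(low + high) / 2 for low, high, _ in CLASSES]
freqs = [f for _, _, f in CLASSES]

n = sum(freqs)
# Weighted sum of class midpoints, with class frequencies as weights.
total = sum(f * m for f, m in zip(freqs, midpoints))
grouped_mean = total / n

print(n, total, round(grouped_mean, 2))  # 70 34525.0 493.21
```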

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    45/48

    Variance for Grouped Data

    - For sample data

    $$s^2 = \frac{\sum f_i (M_i - \bar{x})^2}{n - 1}$$

    - For population data

    $$\sigma^2 = \frac{\sum f_i (M_i - \mu)^2}{N}$$

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    46/48

    Sample Variance for Grouped Data

    Rent ($)    f_i    M_i      M_i - x̄    (M_i - x̄)^2    f_i (M_i - x̄)^2
    420-439      8     429.5     -63.7       4058.96         32471.71
    440-459     17     449.5     -43.7       1910.56         32479.59
    460-479     12     469.5     -23.7        562.16          6745.97
    480-499      8     489.5      -3.7         13.76           110.11
    500-519      7     509.5      16.3        265.36          1857.55
    520-539      4     529.5      36.3       1316.96          5267.86
    540-559      2     549.5      56.3       3168.56          6337.13
    560-579      4     569.5      76.3       5820.16         23280.66
    580-599      2     589.5      96.3       9271.76         18543.53
    600-619      6     609.5     116.3      13523.36         81140.18
    Total       70                                          208234.29

    continued

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    47/48

    Sample Variance for Grouped Data

    - Sample Variance

    $$s^2 = \frac{208{,}234.29}{70 - 1} = 3{,}017.89$$

    - Sample Standard Deviation

    $$s = \sqrt{3{,}017.89} = 54.94$$

    This approximation differs by only $.20 from the actual standard deviation of $54.74.
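
    The grouped variance and standard deviation follow the same pattern as the grouped mean, with class frequencies weighting the squared deviations of the midpoints. A minimal sketch under the sample formula (divisor n - 1); the variable names are illustrative.

```python
import math

# (lower class limit, upper class limit, frequency) for each rent class
CLASSES = [
    (420, 439, 8), (440, 459, 17), (460, 479, 12), (480, 499, 8), (500, 519, 7),
    (520, 539, 4), (540, 559, 2), (560, 579, 4), (580, 599, 2), (600, 619, 6),
]

midpoints = [(low + high) / 2 for low, high, _ in CLASSES]
freqs = [f for _, _, f in CLASSES]

n = sum(freqs)
mean = sum(f * m for f, m in zip(freqs, midpoints)) / n

# Frequency-weighted sum of squared deviations of the midpoints from the mean.
ss = sum(f * (m - mean) ** 2 for f, m in zip(freqs, midpoints))
variance = ss / (n - 1)
std_dev = math.sqrt(variance)

print(round(ss, 2), round(variance, 2), round(std_dev, 2))  # 208234.29 3017.89 54.94
```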

  • 8/7/2019 SBE10_03b (ca) Descriptive Stats-Numerical measures

    48/48

    End of Chapter 3, Part B