central tendency measures lectures 3 4 5

Upload: loredana-lupescu

Post on 06-Apr-2018

TRANSCRIPT

  • 8/3/2019 Central Tendency Measures Lectures 3 4 5

    1/35

    Summarizing and Describing Numerical Data

    2/35

    Lectures 3+4+5 Topics

    Measures of Central Tendency: Mean, Median, Mode

    Measures of Variation: The Range, Variance and Standard Deviation

    Shape: Symmetric, Skewed, Skewness, Kurtosis

    3/35

    Summary Measures

    Central Tendency: Mean, Median, Mode

    Variation: Range, Variance, Standard Deviation, Coefficient of Variation

    4/35

    Measures of Central Tendency

    Central Tendency: Mean, Median, Mode

    \[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \]

    6/35

    The Arithmetic Mean

    This is the most popular and useful measure of central location.

    Mean = (Sum of the observations) / (Number of observations)

    7/35

    The Arithmetic Mean

    Sample mean (n is the sample size):

    \[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \]

    Population mean (N is the population size):

    \[ \mu = \frac{\sum_{i=1}^{N} x_i}{N} \]

    8/35

    The Arithmetic Mean

    Example 1
    The reported times spent on the Internet by 10 adults are 0, 7, 12, 5, 33, 14, 8, 0, 9, 22 hours. Find the mean time spent on the Internet.

    \[ \bar{x} = \frac{\sum_{i=1}^{10} x_i}{10} = \frac{x_1 + x_2 + \dots + x_{10}}{10} = \frac{0 + 7 + \dots + 22}{10} = 11.0 \text{ hours} \]

    Example 2
    Suppose the telephone bills represent the population of measurements (N = 200). The population mean is

    \[ \mu = \frac{\sum_{i=1}^{200} x_i}{200} = \frac{x_1 + x_2 + \dots + x_{200}}{200} = \frac{42.19 + 38.45 + \dots + 45.77}{200} = 43.59 \]
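The calculation in Example 1 can be checked with a short Python sketch. The names `internet_hours` and `arithmetic_mean` are illustrative and are not part of the lecture.

```python
from statistics import mean

# Reported hours spent on the Internet by the 10 adults of Example 1.
internet_hours = [0, 7, 12, 5, 33, 14, 8, 0, 9, 22]


def arithmetic_mean(values):
    """Sum of the observations divided by the number of observations."""
    return sum(values) / len(values)


sample_mean = arithmetic_mean(internet_hours)
print(sample_mean)  # 11.0

# The standard library gives the same value.
assert sample_mean == mean(internet_hours)
```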

    9/35

    Weighted mean for data grouped by categories or variants

    \[ \bar{x} = \frac{\sum_{i=1}^{k} x_i f_i}{\sum_{i=1}^{k} f_i} \]

    10/35

    When many of the measurements have the same value, the measurements can be summarized in a frequency table. Suppose the numbers of children in a sample of 16 families were recorded as follows:

    NUMBER OF CHILDREN   0   1   2   3
    NUMBER OF FAMILIES   3   4   7   2

    Total: 16 families

    \[ \bar{x} = \frac{\sum_{i=1}^{4} x_i f_i}{\sum_{i=1}^{4} f_i} = \frac{x_1 f_1 + x_2 f_2 + x_3 f_3 + x_4 f_4}{16} = \frac{3(0) + 4(1) + 7(2) + 2(3)}{16} = \frac{24}{16} = 1.5 \]
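As a minimal sketch, the weighted mean for the families example can be computed as follows. The function name `weighted_mean` is an illustrative choice.

```python
def weighted_mean(values, frequencies):
    """Weighted mean: sum of x_i * f_i divided by the sum of f_i."""
    total_frequency = sum(frequencies)
    weighted_sum = sum(x * f for x, f in zip(values, frequencies))
    return weighted_sum / total_frequency


children = [0, 1, 2, 3]   # number of children (the variants x_i)
families = [3, 4, 7, 2]   # number of families with that many children (f_i)

mean_children = weighted_mean(children, families)
print(mean_children)  # 1.5
```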

    11/35

    The Median

    [Figure: two dot plots, one on a 0 to 10 axis and one on an axis extended to 14; Median = 5 in both.]

    Important Measure of Central Tendency

    In an ordered array, the median is the middle number.

    If n is odd, the median is the middle number.

    If n is even, the median is the average of the 2 middle numbers.

    Not Affected by Extreme Values

    12/35

    The Median

    The Median of a set of observations is the value that falls in the middle when the observations are arranged in order of magnitude (ranked increasingly).

    Example 4.3
    Find the median of the time spent on the Internet for the adults of Example 1.

    Even number of observations
    0, 0, 5, 7, 8, 9, 12, 14, 22, 33
    The median is the average of the two middle values: (8 + 9) / 2 = 8.5

    Comment
    Suppose only 9 adults were sampled (exclude, say, the longest time (33)).

    Odd number of observations
    0, 0, 5, 7, 8, 9, 12, 14, 22
    The median is the middle value: 8
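The even and odd cases of the median example can be reproduced with Python's standard library. This is a small sketch, and the variable names are illustrative.

```python
from statistics import median

# Hours spent on the Internet by the 10 adults of Example 1.
internet_hours = [0, 7, 12, 5, 33, 14, 8, 0, 9, 22]

# Even number of observations: average of the two middle values.
median_ten = median(internet_hours)
print(median_ten)  # 8.5

# Odd number of observations: drop the longest time (33), take the middle value.
nine_adults = sorted(internet_hours)[:-1]
median_nine = median(nine_adults)
print(median_nine)  # 8
```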

    13/35

    The Mode

    [Figure: dot plot on a 0 to 14 axis; Mode = 9.]

    A Measure of Central Tendency

    Value that Occurs Most Often

    Not Affected by Extreme Values

    There May Not be a Mode

    There May be Several Modes

    Used for Either Numerical or Categorical Data

    [Figure: dot plot on a 0 to 6 axis; No Mode.]

    14/35

    The Mode

    The Mode of a set of observations is the variable value that occurs most frequently.

    A set of data may have one mode (or modal class), or two or more modes.

    The modal class
    For large data sets the modal class is much more relevant than a single-value mode.
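A short sketch of how the mode could be found in Python, covering the cases of one mode, several modes, and no mode. The helper `modes` and the convention that a data set with no repeated value has no mode are assumptions made for this illustration.

```python
from collections import Counter


def modes(values):
    """Return the most frequent value(s); an empty list means 'no mode'.

    Assumed convention: if no value occurs more than once, there is no mode.
    """
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([0, 7, 12, 5, 33, 14, 8, 0, 9, 22]))  # [0]     one mode
print(modes([1, 1, 2, 2, 3]))                      # [1, 2]  several modes
print(modes([0, 1, 2, 3, 4, 5, 6]))                # []      no mode
```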

    15/35

    Approximating Descriptive Measures for Data Grouped by CLASSES

    Approximating descriptive measures for grouped data may be needed in two cases:

    when approximate values suffice,

    when only secondary grouped data are available.

    \[ \bar{x} = \frac{\sum_{i=1}^{k} x_i f_i}{\sum_{i=1}^{k} f_i} \]

    x_i: class midpoint

    f_i: class frequency

    16/35

    Example 3
    Approximate the mean of the telephone call durations problem as represented by the frequency distribution.

    Class i   Class limits   Frequency f_i   Midpoint x_i   x_i f_i
    1         2-5            3               3.5            10.5
    2         5-8            6               6.5            39.0
    3         8-11           8               9.5            76.0
    .         .              .               .              .
    6         17-20          2               18.5           37.0
    n = sample size = 30 = f_1 + ... + f_k                  312.0

    \[ \bar{x} \approx \frac{312.0}{30} = 10.4 \]

    Real value: \( \bar{x} = 10.26 \)

    [Figure: histogram of call durations with class boundaries 5, 8, 11, 14, 17, 20, More.]
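The grouped-data approximation of the mean can be sketched in Python as below. The slide elides the rows for classes 4 and 5, so the frequencies 7 and 4 used here are an assumption, chosen only because they are consistent with the stated totals (n = 30 and a weighted sum of 312.0). They are not values read from the lecture.

```python
# Class limits (lower, upper) for the six classes of the frequency table.
class_limits = [(2, 5), (5, 8), (8, 11), (11, 14), (14, 17), (17, 20)]

# Frequencies: 3, 6, 8 and 2 are given on the slide;
# 7 and 4 (classes 4 and 5) are ASSUMED so that the totals match.
frequencies = [3, 6, 8, 7, 4, 2]

# Each class is represented by its midpoint x_i.
midpoints = [(low + high) / 2 for low, high in class_limits]

n = sum(frequencies)
weighted_sum = sum(x * f for x, f in zip(midpoints, frequencies))
approx_mean = weighted_sum / n

print(n, weighted_sum)  # 30 312.0
print(approx_mean)      # approximately 10.4
```

Because every observation in a class is replaced by the class midpoint, the result is only an approximation of the mean computed from the raw data.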

    17/35

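    Median and Mode

    Median

    \[ Me = x_0 + K \cdot \frac{\frac{1}{2}\left(\sum_{i=1}^{k} n_i + 1\right) - \sum_{i=1}^{Me-1} n_i}{n_{Me}} \]

    Here x_0 is the lower limit of the median class, K is the class width, n_i are the class frequencies, and n_{Me} is the frequency of the median class.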

    18/35

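    Median and Mode

    Mode

    \[ Mo = x_0 + K \cdot \frac{\Delta_1}{\Delta_1 + \Delta_2} \]

    Here x_0 is the lower limit of the modal class, K is the class width, \Delta_1 is the difference between the frequency of the modal class and that of the preceding class, and \Delta_2 is the difference between the frequency of the modal class and that of the following class.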
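The interpolation formulas for the median and mode of grouped data can be sketched as below. This assumes the common textbook form in which the median position is (n + 1) / 2 and all classes have the same width K. The function names and the frequency table are illustrative assumptions, not data taken from the lecture.

```python
def grouped_median(lower_limits, width, frequencies):
    """Me = x0 + K * ((n + 1) / 2 - cumulative frequency before) / f_Me."""
    n = sum(frequencies)
    position = (n + 1) / 2
    cumulative = 0
    for lower, f in zip(lower_limits, frequencies):
        if cumulative + f >= position:  # first class reaching the median position
            return lower + width * (position - cumulative) / f
        cumulative += f
    raise ValueError("frequency table is empty")


def grouped_mode(lower_limits, width, frequencies):
    """Mo = x0 + K * d1 / (d1 + d2), using the class with the highest frequency."""
    m = max(range(len(frequencies)), key=frequencies.__getitem__)
    before = frequencies[m - 1] if m > 0 else 0
    after = frequencies[m + 1] if m < len(frequencies) - 1 else 0
    d1 = frequencies[m] - before
    d2 = frequencies[m] - after
    return lower_limits[m] + width * d1 / (d1 + d2)


# Hypothetical table: six classes of width 3 starting at 2, 5, 8, 11, 14, 17.
lower_limits = [2, 5, 8, 11, 14, 17]
frequencies = [3, 6, 8, 7, 4, 2]

print(grouped_median(lower_limits, 3, frequencies))  # 10.4375
print(grouped_mode(lower_limits, 3, frequencies))    # 10.0
```

In this table both the median class and the modal class are the class starting at 8, so both estimates fall between 8 and 11.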

    19/35

    Relationship among Mean, Median, and Mode

    If a distribution is symmetrical, the mean, median and mode coincide.

    If a distribution is non-symmetrical, and skewed to the left or to the right, the three measures differ.

    A positively skewed distribution (skewed to the right): Mode < Median < Mean

    A negatively skewed distribution (skewed to the left): Mean < Median < Mode
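As an illustration of this relationship, the three measures can be compared on the Internet-hours data of Example 1, which has a long right tail (the value 33). The helper `describe_skew` is a hypothetical name, and classifying skew purely from the ordering of the three measures is a simplification used only for this sketch.

```python
from statistics import mean, median, mode

internet_hours = [0, 7, 12, 5, 33, 14, 8, 0, 9, 22]


def describe_skew(values):
    """Rough classification from the ordering of mode, median and mean."""
    mo, me, mu = mode(values), median(values), mean(values)
    if mo < me < mu:
        return "positively skewed (to the right)"
    if mu < me < mo:
        return "negatively skewed (to the left)"
    if mo == me == mu:
        return "symmetric"
    return "no clear ordering"


print(mode(internet_hours), median(internet_hours), mean(internet_hours))
print(describe_skew(internet_hours))
```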

    20/35

    Summary Measures

    Central Tendency: Mean, Median, Mode

    \[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \]

    Variation: Range, Variance, Standard Deviation, Coefficient of Variation

    \[ s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} \]

    21/35

    Measures of Variation

    Variation: Range, Variance (Population Variance, Sample Variance), Standard Deviation (Population Standard Deviation, Sample Standard Deviation), Coefficient of Variation

    \[ CV = \left( \frac{S}{\bar{X}} \right) \cdot 100\% \]

    22/35

    The Range

    Measure of Variation

    Difference Between Largest & Smallest Observations:

    Absolute Range = \( x_{Largest} - x_{Smallest} \)

    Relative Range = \( (x_{Largest} - x_{Smallest}) / \text{mean} \)

    Ignores How Data Are Distributed:

    [Figure: two dot plots on a 7 to 12 axis with differently distributed data; Range = 12 - 7 = 5 in both.]

    23/35

    Deviation

    Individual deviation from the mean = \( x_i - \text{mean} \)

    Overall deviation = 0, because \( \sum (X_i - \bar{X}) = 0 \)

    Summing squared deviations: \( \sum (X_i - \bar{X})^2 \)

    or absolute values of the deviations: \( \sum |x_i - \bar{x}| \)

    24/35

    Variance

    Important Measure of Variation

    Shows Variation About the Mean

    Computed as an arithmetic mean of squared deviations or as a square mean of individual deviations

    For the Population (use N in the denominator):

    \[ \sigma^2 = \frac{\sum (X_i - \mu)^2}{N} \]

    For the Sample (use n - 1 in the denominator):

    \[ s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1} \]

    25/35

    Standard Deviation

    Most Important Measure of Variation

    Shows Variation About the Mean:

    For the Population (use N in the denominator):

    \[ \sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}} \]

    For the Sample (use n - 1 in the denominator):

    \[ s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}} \]

    26/35

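    Sample Standard Deviation

    \[ s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}} \]

    Data X_i: 10 12 14 15 17 18 18 24

    n = 8 Mean = 16

    \[ s = \sqrt{\frac{(10-16)^2 + (12-16)^2 + (14-16)^2 + (15-16)^2 + (17-16)^2 + (18-16)^2 + (18-16)^2 + (24-16)^2}{8 - 1}} = \sqrt{\frac{130}{7}} = 4.3095 \]

    27/35

    Comparing Standard Deviations

    Data X_i: 10 12 14 15 17 18 18 24

    N = 8 Mean = 16

    \[ s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}} = \sqrt{\frac{130}{7}} = 4.3095 \]

    \[ \sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}} = \sqrt{\frac{130}{8}} = 4.0311 \]

    Value for the Standard Deviation is larger for data considered as a Sample.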
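The comparison between the sample and population standard deviations can be checked with a short Python sketch that applies both formulas to the eight data values. The helper names are illustrative, and the results are cross-checked against `statistics.stdev` and `statistics.pstdev`.

```python
from math import sqrt
from statistics import pstdev, stdev

data = [10, 12, 14, 15, 17, 18, 18, 24]


def squared_deviations(values):
    """Sum of squared deviations from the mean."""
    m = sum(values) / len(values)
    return sum((x - m) ** 2 for x in values)


def sample_sd(values):
    """Sample standard deviation: n - 1 in the denominator."""
    return sqrt(squared_deviations(values) / (len(values) - 1))


def population_sd(values):
    """Population standard deviation: N in the denominator."""
    return sqrt(squared_deviations(values) / len(values))


print(squared_deviations(data))        # 130.0
print(round(sample_sd(data), 4))       # 4.3095 when all eight values are included
print(round(population_sd(data), 4))   # 4.0311

# Cross-check against the standard library.
assert abs(sample_sd(data) - stdev(data)) < 1e-12
assert abs(population_sd(data) - pstdev(data)) < 1e-12
```

Dividing by n - 1 rather than N always gives the larger value, which is why the sample figure exceeds the population figure for the same data.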

    28/35

    Comparing Standard Deviations

    [Figure: three dot plots of AGE on an 11 to 21 axis, each with Mean = 15.5.]

    Data A - AGE: Mean = 15.5, s = 3.338

    Data B - AGE: Mean = 15.5, s = .9258

    Data C - AGE: Mean = 15.5, s = 4.57

    29/35

    Coefficient of Variation

    Measure of Relative Variation

    Always a % or coefficient

    Shows Variation Relative to Mean

    Used to Compare 2 or More Groups

    Formula (for Sample):

    \[ CV = \left( \frac{S}{\bar{X}} \right) \cdot 100\% \]

    30/35

    Comparing Coefficient of Variation

    Stock A: Average Price last year = $50, Standard Deviation (sd) = $5

    Stock B: Average Price last year = $100, Standard Deviation (sd) = $5

    \[ CV = \left( \frac{S}{\bar{X}} \right) \cdot 100\% \]

    Coefficient of Variation:

    Stock A: CV = 10%

    Stock B: CV = 5%

    Both average prices are representative.
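The stock comparison can be reproduced with a few lines of Python. The function name `coefficient_of_variation` is an illustrative choice.

```python
def coefficient_of_variation(standard_deviation, mean):
    """CV = (S / mean) * 100, expressed as a percentage."""
    return 100 * standard_deviation / mean


cv_stock_a = coefficient_of_variation(5, 50)    # average price $50, sd $5
cv_stock_b = coefficient_of_variation(5, 100)   # average price $100, sd $5

print(cv_stock_a)  # 10.0
print(cv_stock_b)  # 5.0
```

Both stocks have the same standard deviation, but relative to its average price Stock A varies twice as much as Stock B.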

    31/35

    Shape

    Describes How Data Are Distributed between smallest and largest values

    Measures of Shape: Symmetric or skewed

    Left-Skewed or Negatively Skewed: Mean < Median < Mode

    Symmetric: Mean = Median = Mode

    Right-Skewed or Positively Skewed: Mode < Median < Mean

    32/35

    Box plot: graphical presentation of CTM

    35/35

    Central tendency measures summary

    Discussed Measures of Central Tendency: Mean, Median, Mode

    Addressed Measures of Variation: The Range, Variance, Standard Deviation, Coefficient of Variation

    Determined Shape of Distributions: Symmetric or Skewed, Coefficient of skewness

    Symmetric: Mean = Median = Mode

    Left-skewed: Mean < Median < Mode

    Right-skewed: Mode < Median < Mean