Session VI
“Measures of Central Tendency”
Measures of Central Tendency
The three measures in common use are:
Average
Median
Mode
Average
Three techniques are available for calculating the average:
Ungrouped data
Grouped data
Weighted average
Average-Ungrouped Data
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$
$X_i$ = readings, $n$ = number of observations
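As a minimal sketch of the ungrouped average, the Python below sums a small set of readings and divides by the number of observations. The readings are hypothetical values chosen only for illustration; they are not data from this session.

```python
# Hypothetical readings (illustrative values only, not from the session).
readings = [50, 45, 47, 48, 49]

n = len(readings)              # number of observations
average = sum(readings) / n    # X-bar = (sum of X_i) / n

print(average)  # 47.8
```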
Average-Grouped Data
$$\bar{X} = \frac{\sum_{i=1}^{h} f_i X_i}{n} = \frac{f_1 X_1 + f_2 X_2 + \dots + f_h X_h}{f_1 + f_2 + \dots + f_h}$$
$h$ = number of cells, $f_i$ = frequency, $X_i$ = midpoint, $n$ = number of observations
Average-Weighted Average
$$\bar{X}_w = \frac{\sum_{i=1}^{n} w_i X_i}{\sum_{i=1}^{n} w_i}$$
Used when a number of averages are combined with different frequencies
$X_i$ = readings, $n$ = number of observations, $w_i$ = weight of average
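The weighted average can be sketched in a few lines of Python. The function name `weighted_average` and the three averages with their weights are assumptions made for illustration; here each weight is taken to be the number of observations behind the corresponding average.

```python
def weighted_average(values, weights):
    """Return sum(w_i * X_i) / sum(w_i) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight


# Hypothetical example: three averages based on 10, 20 and 30 observations.
averages = [50, 60, 70]
weights = [10, 20, 30]

x_w = weighted_average(averages, weights)
print(round(x_w, 2))  # 63.33
```

The plain average of the three values would be 60; the weighted result is pulled toward 70 because that average carries the largest weight.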
Median-Grouped Data
$$M_d = L_m + \left( \frac{\frac{n}{2} - cf_m}{f_m} \right) i$$
$L_m$ = lower boundary of the cell with the median
$n$ = total number of observations
$cf_m$ = cumulative frequency of all cells below $L_m$
$f_m$ = frequency of median cell
$i$ = cell interval
Example
Boundaries   Midpoint (Xi)   Frequency (fi)   Computation (fi Xi)
23.6-26.5    25.0              4                100
26.6-29.5    28.0             36               1008
29.6-32.5    31.0             51               1581
32.6-35.5    34.0             63               2142
35.6-38.5    37.0             58               2146
38.6-41.5    40.0             52               2080
41.6-44.5    43.0             34               1462
44.6-47.5    46.0             16                736
47.6-50.5    49.0              6                294
Total                        320              11549
Frequency Distribution of the Life of 320 tires in 1000 km
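The grouped-data average can be checked against the tire table with a short Python sketch. The midpoints and frequencies are copied from the table; the variable names are illustrative.

```python
# Midpoints (X_i) and frequencies (f_i) from the tire-life table.
midpoints = [25.0, 28.0, 31.0, 34.0, 37.0, 40.0, 43.0, 46.0, 49.0]
frequencies = [4, 36, 51, 63, 58, 52, 34, 16, 6]

n = sum(frequencies)                                       # total observations
total = sum(f * x for f, x in zip(frequencies, midpoints)) # sum of f_i * X_i
grouped_mean = total / n                                   # X-bar for grouped data

print(n, total, round(grouped_mean, 2))  # 320 11549.0 36.09
```

The totals match the last row of the table, giving an average tire life of about 36.1 thousand km.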
Median-Grouped Data
$$M_d = L_m + \left( \frac{\frac{n}{2} - cf_m}{f_m} \right) i = 35.6 + \left( \frac{\frac{320}{2} - 154}{58} \right) 3 = 35.9$$
Using data from example table
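The same calculation can be sketched in Python. The function `grouped_median` is a hypothetical helper written for this illustration; it walks the cells in order until the cumulative frequency reaches n/2, then applies the grouped-median formula. Lower boundaries, frequencies, and the cell interval of 3 are taken from the example table.

```python
def grouped_median(lower_bounds, frequencies, interval):
    """Median of grouped data: L_m + ((n/2 - cf_m) / f_m) * i."""
    n = sum(frequencies)
    if n == 0:
        raise ValueError("frequency table is empty")
    half = n / 2
    cf = 0  # cumulative frequency of all cells below the current one
    for lower, f in zip(lower_bounds, frequencies):
        if cf + f >= half:          # the median falls in this cell
            return lower + (half - cf) / f * interval
        cf += f
    raise ValueError("median cell not found")


# Lower cell boundaries and frequencies from the tire-life table.
lower_bounds = [23.6, 26.6, 29.6, 32.6, 35.6, 38.6, 41.6, 44.6, 47.6]
frequencies = [4, 36, 51, 63, 58, 52, 34, 16, 6]

median = grouped_median(lower_bounds, frequencies, interval=3)
print(round(median, 1))  # 35.9
```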
Mode
The mode is the value that occurs with the
greatest frequency.
It is possible to have no mode in a series of
numbers, or to have more than one mode.
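A minimal Python sketch of finding the mode follows. The helper `modes` and the three data sets are assumptions for illustration. The sketch adopts one common convention: when every value occurs exactly once, the series is treated as having no mode.

```python
from collections import Counter


def modes(values):
    """Return the list of modes (possibly empty) of a series of numbers."""
    if not values:
        return []
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []  # every value occurs once: treated here as "no mode"
    return sorted(v for v, c in counts.items() if c == top)


# Hypothetical series (illustrative values only).
print(modes([3, 3, 4, 5, 5, 5, 7]))                # [5]         one mode
print(modes([1, 2, 3, 4]))                         # []          no mode
print(modes([105, 105, 105, 107, 109, 109, 109]))  # [105, 109]  two modes
```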
Relationship Among the Measures of Central Tendency
Measures of Dispersion
Range
Standard Deviation
Variance
Range
The range is the simplest and easiest to
calculate of the measures of dispersion.
Range = R = Xh - Xl
where Xh is the largest value and Xl is the smallest value in the data set
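For illustration, the range takes one line of Python. The readings are hypothetical.

```python
# Hypothetical readings (illustrative values only).
readings = [35, 41, 38, 44, 36]

data_range = max(readings) - min(readings)  # R = X_h - X_l
print(data_range)  # 9
```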
Standard Deviation
Sample Standard Deviation:
$$s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$
Standard Deviation
Ungrouped Technique
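$$s = \sqrt{\frac{n \sum_{i=1}^{n} X_i^2 - \left( \sum_{i=1}^{n} X_i \right)^2}{n(n - 1)}}$$
$$s = \sqrt{\frac{\sum_{i=1}^{n} X_i^2 - \left( \sum_{i=1}^{n} X_i \right)^2 / n}{n - 1}}$$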
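The ungrouped forms of the sample standard deviation are algebraically equivalent, which a short Python sketch can confirm. The readings are hypothetical; `statistics.stdev` from the standard library is used only as an independent check.

```python
import math
import statistics

# Hypothetical readings (illustrative values only).
readings = [46, 37, 40, 33, 42, 36, 40, 47, 34, 45]

n = len(readings)
mean = sum(readings) / n
sum_x = sum(readings)                   # sum of X_i
sum_x2 = sum(x * x for x in readings)   # sum of X_i squared

# Definition: deviations from the average.
s_definition = math.sqrt(sum((x - mean) ** 2 for x in readings) / (n - 1))

# Computational forms that avoid computing each deviation.
s_form_a = math.sqrt((n * sum_x2 - sum_x ** 2) / (n * (n - 1)))
s_form_b = math.sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))

print(round(s_definition, 2), round(s_form_a, 2), round(s_form_b, 2))
print(round(statistics.stdev(readings), 2))
```

The computational forms need only the running totals of X and X squared, which is why they suit hand or calculator work.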
Standard Deviation
Grouped Technique
$$s = \sqrt{\frac{n \sum_{i=1}^{h} f_i X_i^2 - \left( \sum_{i=1}^{h} f_i X_i \right)^2}{n(n - 1)}}$$
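As a sketch, the grouped technique can be applied to the tire-life frequency distribution from the earlier example. Midpoints and frequencies are copied from that table; the variable names are illustrative.

```python
import math

# Midpoints (X_i) and frequencies (f_i) from the tire-life table.
midpoints = [25.0, 28.0, 31.0, 34.0, 37.0, 40.0, 43.0, 46.0, 49.0]
frequencies = [4, 36, 51, 63, 58, 52, 34, 16, 6]

n = sum(frequencies)
sum_fx = sum(f * x for f, x in zip(frequencies, midpoints))       # sum f_i X_i
sum_fx2 = sum(f * x * x for f, x in zip(frequencies, midpoints))  # sum f_i X_i^2

# s = sqrt( (n * sum(f X^2) - (sum(f X))^2) / (n (n - 1)) )
s_grouped = math.sqrt((n * sum_fx2 - sum_fx ** 2) / (n * (n - 1)))

print(round(s_grouped, 2))  # 5.45
```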
Variance
$$s^2 = \frac{\sum_{i=1}^{n} X_i^2 - \left( \sum_{i=1}^{n} X_i \right)^2 / n}{n - 1}$$
Comments on the Measures of Dispersion
As n increases, accuracy of R decreases
Use R when there is only a small amount of data or the data are
too scattered
If n > 10 use standard deviation
A smaller standard deviation means better quality
Comparison of two distributions with equal average and range
Other Measures
There are three other measures that are
frequently used to analyze a collection of data:
Skewness
Kurtosis
Coefficient of Variation
Skewness is the lack of symmetry of the data.
For grouped data:
Skewness
$$a_3 = \frac{\sum_{i=1}^{h} f_i (X_i - \bar{X})^3 / n}{s^3}$$
Skewness
Left (negative) and right (positive) skewness distributions
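A hedged Python sketch of the grouped skewness calculation, applied to the tire-life table from the earlier example. It assumes that s is the sample standard deviation (divisor n - 1); other conventions exist and would change the value slightly.

```python
import math

# Midpoints (X_i) and frequencies (f_i) from the tire-life table.
midpoints = [25.0, 28.0, 31.0, 34.0, 37.0, 40.0, 43.0, 46.0, 49.0]
frequencies = [4, 36, 51, 63, 58, 52, 34, 16, 6]

n = sum(frequencies)
mean = sum(f * x for f, x in zip(frequencies, midpoints)) / n

# Sample standard deviation for grouped data (assumption: divisor n - 1).
s = math.sqrt(
    sum(f * (x - mean) ** 2 for f, x in zip(frequencies, midpoints)) / (n - 1)
)

# a3 = (sum f_i (X_i - X-bar)^3 / n) / s^3
third_moment = sum(f * (x - mean) ** 3 for f, x in zip(frequencies, midpoints)) / n
a3 = third_moment / s ** 3

direction = "right (positive)" if a3 > 0 else "left (negative)" if a3 < 0 else "none"
print(f"a3 = {a3:.3f}, skew: {direction}")
```

For this table a3 comes out positive, so the distribution is skewed slightly to the right.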
Kurtosis provides information regarding the shape of the
population distribution (the peakedness or heaviness of
the tails of a distribution).
For grouped data:
Kurtosis
$$a_4 = \frac{\sum_{i=1}^{h} f_i (X_i - \bar{X})^4 / n}{s^4}$$
Kurtosis
Leptokurtic and Platykurtic distributions
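The grouped kurtosis formula can be sketched the same way, again using the tire-life table from the earlier example and assuming that s is the sample standard deviation (divisor n - 1). The comparison with 3, the value of a4 for a normal distribution, is a commonly used benchmark and is added here as an assumption about how the result is read.

```python
import math

# Midpoints (X_i) and frequencies (f_i) from the tire-life table.
midpoints = [25.0, 28.0, 31.0, 34.0, 37.0, 40.0, 43.0, 46.0, 49.0]
frequencies = [4, 36, 51, 63, 58, 52, 34, 16, 6]

n = sum(frequencies)
mean = sum(f * x for f, x in zip(frequencies, midpoints)) / n

# Sample standard deviation for grouped data (assumption: divisor n - 1).
s = math.sqrt(
    sum(f * (x - mean) ** 2 for f, x in zip(frequencies, midpoints)) / (n - 1)
)

# a4 = (sum f_i (X_i - X-bar)^4 / n) / s^4
fourth_moment = sum(f * (x - mean) ** 4 for f, x in zip(frequencies, midpoints)) / n
a4 = fourth_moment / s ** 4

# Benchmark: a4 = 3 for a normal distribution.
shape = "leptokurtic" if a4 > 3 else "platykurtic" if a4 < 3 else "mesokurtic"
print(f"a4 = {a4:.3f} ({shape})")
```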
Coefficient of variation (CV) is a measure of how much
variation exists in relation to the mean.
$$CV = \frac{s \, (100\%)}{\bar{X}}$$
Coefficient of Variation
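As an illustration, the coefficient of variation for the tire-life table from the earlier example can be computed as follows. The sketch assumes the grouped sample standard deviation with divisor n - 1.

```python
import math

# Midpoints (X_i) and frequencies (f_i) from the tire-life table.
midpoints = [25.0, 28.0, 31.0, 34.0, 37.0, 40.0, 43.0, 46.0, 49.0]
frequencies = [4, 36, 51, 63, 58, 52, 34, 16, 6]

n = sum(frequencies)
mean = sum(f * x for f, x in zip(frequencies, midpoints)) / n
s = math.sqrt(
    sum(f * (x - mean) ** 2 for f, x in zip(frequencies, midpoints)) / (n - 1)
)

cv_percent = s / mean * 100  # CV = s (100%) / X-bar
print(round(cv_percent, 1))  # 15.1
```

Because CV is a ratio, it has no units, which makes it convenient for comparing the variation of data sets measured on different scales.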
Population
Set of all items that possess a characteristic of
interest
Sample
Subset of a population
Population and Sample
A parameter is a characteristic of a population; it describes the
population.
Example: the average weight of the population, e.g. 50,000 cans
made in a month.
A statistic is a characteristic of a sample. It is used as an
estimator to make inferences about the population parameters,
which are typically unknown.
Example: the average weight of a sample of 500 cans from that
month’s output, which is an estimate of the average weight of
the 50,000 cans.
Parameter and Statistic
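The distinction can be illustrated with a small simulation. Everything numeric here is an assumption: the can weights are randomly generated around a hypothetical 350 g mean with a 5 g standard deviation, and only the population size (50,000) and sample size (500) come from the example.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: weights in grams of 50,000 cans (simulated data).
population = [random.gauss(350.0, 5.0) for _ in range(50_000)]

# A sample of 500 cans drawn without replacement from that population.
sample = random.sample(population, 500)

mu = statistics.fmean(population)       # parameter: population mean
sigma = statistics.pstdev(population)   # parameter: population standard deviation
x_bar = statistics.fmean(sample)        # statistic: sample average
s = statistics.stdev(sample)            # statistic: sample standard deviation

print(f"population: mean = {mu:.2f}, standard deviation = {sigma:.2f}")
print(f"sample:     mean = {x_bar:.2f}, standard deviation = {s:.2f}")
```

In practice the population values are rarely available, so the sample statistics serve as the estimates; the simulation only makes both sides visible for comparison.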
Comparison of Sample and Population
Statistic                               Parameter
$\bar{X}$  average                      $\mu$ ($\bar{X}_0$)  mean
$s$  sample standard deviation          $\sigma$ ($s_0$)  standard deviation