descriptive statistics, numerical description
![Page 1: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/1.jpg)
DESCRIPTIVE STATISTICS Part I: Numerical Description
In this chapter, we will learn how to describe a set of data using numerical methods. This is the first of two chapters that together provide the methods of descriptive statistics. The second chapter covers the other part of descriptive statistics, which is the use of graphical methods to display data and explore key statistics.
![Page 2: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/2.jpg)
What are the basic features of a data set?
A data set is a collection of data representing a particular variable. Examples of data sets are given below.
Data Sets:
• Students’ grades in a calculus test: 65, 85, 70, 75, 85, 80, 82, 85, 90, 78, 81, 82, 67, 80
• Property tax of a sample of houses: $5000, $4500, $4000, $7200, $5000, $3800, $4100, $5000
• Driving distance to work of a group of employees (miles): 1.2, 2.0, 2.2, 15.0, 11.0, 5.0, 3.7, 4.9, 15.2, 16.0
• Ages of all students in a college: 18, 19, 21, …, 22, 18, 19, 21
![Page 3: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/3.jpg)
In general, establishing a data set requires consideration of a number of key questions:
Data Set Key Questions:
• Are the data qualitative or quantitative?
• What levels of measurement do the data exhibit? (nominal, ordinal, interval, or ratio)
• What is the source of the data? (the population)
• What is the appropriate sampling technique for collecting the samples? (random or stratified)
• What is the appropriate minimum sample size?
![Page 4: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/4.jpg)
Data Set Types: (1) Univariate, (2) Bivariate, (3) Multivariate
| Data Set | Variables | Typical Tasks |
|---|---|---|
| Univariate | One | Histograms, descriptive statistics, frequency tallies |
| Bivariate | Two | Scatter plots, correlations, simple regression |
| Multivariate | More than two | Multiple regression, data mining, modeling |
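Univariate Data Set

| Person # | Weight (lb) |
|---|---|
| 1 | 150 |
| 2 | 120 |
| 3 | 130 |
| 4 | 125 |
| 5 | 155 |
| 6 | 134 |
| 7 | 150 |
| 8 | 140 |
| 9 | 160 |
| 10 | 200 |
| 11 | 180 |
| 12 | 140 |

Bivariate Data Set

| Person # | Years at work | Annual Salary ($) |
|---|---|---|
| 1 | 5 | 50,000 |
| 2 | 20 | 73,000 |
| 3 | 10 | 65,000 |
| 4 | 5 | 55,000 |
| 5 | 8 | 60,000 |
| 6 | 10 | 60,000 |
| 7 | 15 | 68,000 |
| 8 | 15 | 69,000 |
| 9 | 20 | 68,000 |
| 10 | 20 | 69,000 |
| 11 | 18 | 68,000 |
| 12 | 10 | 62,000 |
| 13 | 3 | 48,000 |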
Multivariate Data Set

| Case | Name | Age | Income ($) | Position | Gender |
|---|---|---|---|---|---|
| 1 | Frieda | 45 | 67,100 | Consumer Analyst | F |
| 2 | Stefan | 32 | 56,500 | Operations analyst | M |
| 3 | John | 55 | 88,200 | Marketing VP | F |
| 4 | Donna | 27 | 59,000 | Statistician | F |
| 5 | Larry | 46 | 26,000 | Security guard | M |
| 6 | Alicia | 52 | 68,500 | QC Director | F |
| 7 | Alec | 65 | 95,200 | Chief executive | M |
| 8 | Jaime | 50 | 71,200 | Human Resources | M |
![Page 5: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/5.jpg)
Data Sets in the Context of Sampling:
• Cross-sectional data set
• Time-series data set
[Figures: a cross-sectional sample and a time-series data set]
![Page 6: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/6.jpg)
![Page 7: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/7.jpg)
Working Problem 2.2: Explain what inheritance tax is. What is the difference between inheritance tax and estate tax?
What is the level of measurement for each of the following variables: state, income tax, sales tax, and inheritance tax? Why do some states have a wide income tax range?
http://portal.kiplinger.com/tools/slideshows/slideshow_pop.html?nm=TaxUnfriendlyStatesRetirees
U.S. States

| State | Income Tax (%) | Sales Tax (%) | Inheritance Tax (Yes/No) |
|---|---|---|---|
| Alaska | 0.0 | 0.0 | No |
| Wyoming | 0.0 | 4.0 | No |
| Michigan | 4.4 | 6.0 | No |
| Pennsylvania | 3.1 | 6.0 | Yes |
| Colorado | 4.6 | 2.9 | No |
| Delaware | 4.6 | 0.0 | No |
| Hawaii | 1.4 to 11 | 4.0 | No |
| Georgia | 1.0 to 6.0 | 4.0 | No |
| South Carolina | 3.0 to 7.0 | 6.0 | No |
| Alabama | 2.0 to 5.0 | 4.0 | No |
| California | 1.25 to 10.55 | 8.3 | No |
| Rhode Island | 3.75 to 9.9 | 7.0 | No |
| New Jersey | 1.4 to 8.97 | 7.0 | Yes |
| Vermont | 3.55 to 8.95 | 6.0 | No |
| Iowa | 0.36 to 8.98 | 6.0 | Yes |
| Nebraska | 2.56 to 6.84 | 5.5 | Yes |
| Wisconsin | 4.6 to 7.75 | 5.0 | No |
| Oregon | 5.0 to 11.0 | 0.0 | Yes |
| Indiana | 3.4 | 7.0 | Yes |
| North Dakota | 1.84 to 4.86 | 5.0 | No |
![Page 8: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/8.jpg)
Working Problem 2.3:
Identify the following data sets as ‘Cross-Sectional Data’ or ‘Time-Series Data’:
(a) Two weeks before the 56th quadrennial United States presidential election, which was held on November 4, 2008, a sample of people taken at random from undecided states indicated that Democrat Barack Obama was expected to earn 54% of the popular vote and John McCain was expected to earn 46%.
Cross Sectional ( ) Time-Series ( )
(b) A survey of 1000 students from a university of 10,000 students revealed that 65% of the students do not prefer weekend classes.
Cross Sectional ( ) Time-Series ( )
(c) The U.S. city average price per gallon of unleaded regular gasoline from 2000 to 2009 was as follows:

| Year | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2000 | 1.301 | 1.369 | 1.541 | 1.506 | 1.498 | 1.617 | 1.593 | 1.51 | 1.582 | 1.559 | 1.555 | 1.489 |
| 2001 | 1.472 | 1.484 | 1.447 | 1.564 | 1.729 | 1.64 | 1.482 | 1.427 | 1.531 | 1.362 | 1.263 | 1.131 |
| 2002 | 1.139 | 1.13 | 1.241 | 1.407 | 1.421 | 1.404 | 1.412 | 1.423 | 1.422 | 1.449 | 1.448 | 1.394 |
| 2003 | 1.473 | 1.641 | 1.748 | 1.659 | 1.542 | 1.514 | 1.524 | 1.628 | 1.728 | 1.603 | 1.535 | 1.494 |
| 2004 | 1.592 | 1.672 | 1.766 | 1.833 | 2.009 | 2.041 | 1.939 | 1.898 | 1.891 | 2.029 | 2.01 | 1.882 |
| 2005 | 1.823 | 1.918 | 2.065 | 2.283 | 2.216 | 2.176 | 2.316 | 2.506 | 2.927 | 2.785 | 2.343 | 2.186 |
| 2006 | 2.315 | 2.31 | 2.401 | 2.757 | 2.947 | 2.917 | 2.999 | 2.985 | 2.589 | 2.272 | 2.241 | 2.334 |
| 2007 | 2.274 | 2.285 | 2.592 | 2.86 | 3.13 | 3.052 | 2.961 | 2.782 | 2.789 | 2.793 | 3.069 | 3.02 |
| 2008 | 3.047 | 3.033 | 3.258 | 3.441 | 3.764 | 4.065 | 4.09 | 3.786 | 3.698 | 3.173 | 2.151 | 1.689 |
| 2009 | 1.787 | 1.928 | 1.949 | 2.056 | 2.265 | 2.631 | 2.543 | 2.627 | 2.574 | 2.561 | 2.66 | 2.621 |
http://data.bls.gov/cgi-bin/surveymost
Cross Sectional ( ) Time-Series ( )
![Page 9: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/9.jpg)
What are the different elements of descriptive statistics?
Two types of descriptive statistics:
(1) Numerical measures of data (the focus of this chapter), and
(2) Graphical displays of data.
![Page 10: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/10.jpg)
Numerical measures of descriptive statistics consist of three types of measures:
• Measures of central tendency (mean, median, and mode)
• Measures of dispersion (range, standard deviation, and variance)
• Combined measures (coefficient of variation, signal-to-noise ratio, and standardized variable)
[Figure: Measures of central tendency (mean, median, mode) and measures of dispersion (range, standard deviation, variance), illustrated on the ordered data 10.0, 12.0, 13.0, 14.0, 14.0, 14.0, 24.0, 24.0, 24.0, 26.8, 27.0, 27.0, 29.0, 30]
![Page 11: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/11.jpg)
Key points to perform good statistical analysis
1. Identify your objectives:
· What questions do you really need to answer?
· What variable do you need to examine?
· What population are you about to evaluate?
2. Collect the appropriate samples and data to address your questions:
· Do you have access to the entire population?
· Would a selection of a sample from the population be easier to access, less costly, and less destructive than an evaluation of the whole population?
· Remember ‘GIGO’, or garbage-in, garbage-out. If the samples are not representative of the population, and the data collected are not accurate and precise, the conclusions drawn from the analysis will be meaningless.
3. Describe the data using descriptive statistics:
· Do you detect data abnormality or outliers?
· Can you explore the data in such a way that will provide a clear description of data center and data variability?
· Use descriptive statistics as a guideline for other methods of analysis
4. Perform inference:
· Can the sample statistics be used to estimate population parameters?
· Is your estimation of population parameters reliable?
· Do you have confidence in the population estimates?
![Page 12: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/12.jpg)
What are the measures of central tendency?
[Figure: Center values. The mean, median, and mode marked on the ordered data 10.0, 12.0, 13.0, 14.0, 14.0, 14.0, 24.0, 24.0, 24.0, 26.8, 27.0, 27.0, 29.0, 30]
![Page 13: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/13.jpg)
Measures of Data Center (Central Tendency)
(1) Arithmetic Mean
Arithmetic Mean of Sample Observations:
$$\bar{X} = \frac{\sum_{i=1}^{n} x_i}{n}$$
Arithmetic Mean of Population Observations:
$$\mu = \frac{\sum_{i=1}^{N} x_i}{N}$$
![Page 14: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/14.jpg)
Example: The table below compares gas prices in some states in September 2009 and September 2008. Determine the mean of gas prices ($ per gallon) for each year.

| State | Sep-2009 | Sep-2008 |
|---|---|---|
| California | 3.099 | 3.75 |
| Colorado | 2.48 | 3.732 |
| Florida | 2.527 | 3.893 |
| Massachusetts | 2.597 | 3.582 |
| Minnesota | 2.452 | 3.765 |
| New York | 2.811 | 3.805 |
| Ohio | 2.411 | 3.933 |
| Texas | 2.404 | 3.729 |
| Washington | 2.947 | 3.785 |

Gas prices of a number of states in September 2008 and September 2009
http://www.eia.doe.gov/oil_gas/petroleum/data_publications/wrgp/mogas_home_page.html

For September 2009:
$$\text{Mean} = \bar{X} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{3.099 + 2.48 + 2.527 + 2.597 + \cdots + 2.947}{9} = \$2.636/\text{gallon}$$

For September 2008:
$$\text{Mean} = \bar{X} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{3.75 + 3.732 + 3.893 + 3.582 + \cdots + 3.785}{9} = \$3.775/\text{gallon}$$

Comment on the Results
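As a quick check of the arithmetic in this example, the following Python sketch recomputes the two means from the table. The variable names (`prices_2009`, `prices_2008`) are illustrative and are not part of the original slides.

```python
from statistics import mean

# Gas prices ($ per gallon) from the table, in the same state order.
prices_2009 = [3.099, 2.48, 2.527, 2.597, 2.452, 2.811, 2.411, 2.404, 2.947]
prices_2008 = [3.75, 3.732, 3.893, 3.582, 3.765, 3.805, 3.933, 3.729, 3.785]

# Arithmetic mean: sum of the observations divided by their count.
mean_2009 = mean(prices_2009)
mean_2008 = mean(prices_2008)

print(f"Mean, September 2009: ${mean_2009:.3f} per gallon")
print(f"Mean, September 2008: ${mean_2008:.3f} per gallon")
```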
![Page 15: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/15.jpg)
Properties of arithmetic mean:
1. The mean of a set of data is unique and can be used as an identity measure of the data center
2. We can determine the mean of any data set that contains ratio or interval level data
3. We need all observation values to be able to calculate the mean
4. You know it is the correct mean value when the sum of the deviations of each value from it is zero:
$$\sum_{i=1}^{n} (x_i - \bar{X}) = 0$$
![Page 16: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/16.jpg)
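Example: Determine the arithmetic mean of the three student grades 80, 40, and 30. Using the mean value, prove that
$$\sum_{i=1}^{n} (x_i - \bar{X}) = 0$$
Solution:
The arithmetic mean:
$$\bar{X} = \frac{80 + 40 + 30}{3} = 50$$
The sum of the deviations from the mean:
$$(80 - 50) + (40 - 50) + (30 - 50) = 30 - 10 - 20 = 0$$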
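The zero-sum property of deviations can also be checked numerically. The short Python sketch below uses the three grades from this example; the variable names are illustrative only.

```python
grades = [80, 40, 30]

# Arithmetic mean of the three grades.
mean_grade = sum(grades) / len(grades)

# Deviation of each grade from the mean, and the sum of those deviations.
deviations = [x - mean_grade for x in grades]
total_deviation = sum(deviations)

print("Mean:", mean_grade)
print("Deviations:", deviations)
print("Sum of deviations:", total_deviation)
```

With data whose mean is not exactly representable in floating point, the sum may come out as a tiny nonzero number rather than exactly zero.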
![Page 17: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/17.jpg)
Working Problem 2.4: Calculate the mean for the following data set of minimum wage ($): 7, 8, 6, 6, 8, 5, 6, 5, 8, 8
![Page 18: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/18.jpg)
(2) Median
The median of a set of numbers arranged in order of magnitude is the middle value or the arithmetic mean of the two middle values.
Example: Calculate the median of the following data set: 14, 12, 14, 16, 15, 19, 17, 17, 17
Solution: To determine the median, we first arrange the data in order of magnitude:
12, 14, 14, 15, 16, 17, 17, 17, 19
Thus, the median is 16
Example: Calculate the median of the following data set: 8, 9, 10, 9, 8, 6, 11, 7, 12, 8
Solution: To determine the median, we first arrange the data in order of magnitude:
6, 7, 8, 8, 8, 9, 9, 10, 11, 12
Since this data set consists of an even number of observations, the middle values that split the data into an equal number of observations on both sides are 8 and 9. Thus, the median of this set of data is (8+9)/2 = 8.5
![Page 19: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/19.jpg)
(3) Mode
The mode is the value that occurs with the greatest frequency. Interestingly, mode is a French word that means fashion; the mode can be thought of as the most "fashionable", that is, the most common, value.
Example: Calculate the mode of the following observations:
80, 87, 90, 82, 78, 74, 80, 77, 80, 91, 81, 80
Solution: The mode of this set is 80
Example: Calculate the mode of the following observations:
5, 7, 8, 9, 9, 9, 10, 11, 12, 14, 14, 14, 15
Solution: This set exhibits two modes, 9 and 14, and is called bimodal.
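The median and mode examples can be reproduced with Python's standard `statistics` module. This is a minimal sketch, assuming Python 3.8 or later (for `multimode`); the variable names are illustrative.

```python
from statistics import median, multimode

# Median examples: the function sorts the data internally.
odd_set = [14, 12, 14, 16, 15, 19, 17, 17, 17]
even_set = [8, 9, 10, 9, 8, 6, 11, 7, 12, 8]
median_odd = median(odd_set)    # middle value of nine sorted observations
median_even = median(even_set)  # mean of the two middle values of ten

# Mode examples: multimode returns every value tied for the highest frequency.
unimodal = [80, 87, 90, 82, 78, 74, 80, 77, 80, 91, 81, 80]
bimodal = [5, 7, 8, 9, 9, 9, 10, 11, 12, 14, 14, 14, 15]
modes_unimodal = multimode(unimodal)
modes_bimodal = multimode(bimodal)

print("Medians:", median_odd, median_even)
print("Modes:", modes_unimodal, modes_bimodal)
```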
![Page 20: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/20.jpg)
Working Problem:
Calculate the mean and the mode and the median for the following data set of minimum wage ($)
7, 8, 6, 6, 8, 5, 6, 5, 8, 8
Answer:
Mean = $6.7
Median =$6.5
Mode = $8
![Page 21: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/21.jpg)
Working Problem 2.6: Calculate the median and the mode for the following data set of minimum wage ($): 7, 8, 6, 6, 8, 5, 6, 5, 8, 8
![Page 22: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/22.jpg)
Geometric Mean, G
$$G = \sqrt[n]{x_1 \, x_2 \, x_3 \cdots x_n}$$
Example: The return on investment earned by a manufacturer of a sports car for four successive years was 20 percent, 15 percent, -40 percent, and 100 percent. What is the geometric mean rate of return on investment?
$$G = \sqrt[n]{x_1 \, x_2 \, x_3 \cdots x_n} = \sqrt[4]{(1.2)(1.15)(0.6)(2)} = \sqrt[4]{1.656} = 1.1344$$
Accordingly, the average rate of return, which is essentially a compound annual growth rate, is 13.44%.
![Page 23: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/23.jpg)
Geometric Mean, G
Example: Suppose the inflation rates for the last 5 years in a certain country are 5%, 4%, 2%, 8%, and 6%, respectively. What is the mean rate of inflation over this five-year period?
Solution:
At the end of the first year, the price index will be 1.05 times the price index at the beginning of the year; at the end of the second year, the price index will be (1.04)(1.05) times that starting value; at the end of the third year, it will be (1.02)(1.04)(1.05) times the starting value, and so on. Thus, the geometric mean of 1.05, 1.04, 1.02, 1.08, and 1.06 is:
$$G = \sqrt[5]{(1.05)(1.04)(1.02)(1.08)(1.06)} = \sqrt[5]{1.2751} \approx 1.0498$$
Accordingly, the average rate of inflation over the five-year period is about 4.98%.
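Both geometric-mean examples (return on investment and inflation) follow the same recipe: convert each percent change to a growth factor, take the geometric mean of the factors, and convert back to a percent. The sketch below assumes Python 3.8 or later for `statistics.geometric_mean`; the helper name `mean_growth_rate` is hypothetical and not from the slides.

```python
from statistics import geometric_mean

def mean_growth_rate(percent_changes):
    """Geometric-mean rate (%) for a list of period-by-period percent changes."""
    # A change of p percent corresponds to a growth factor of 1 + p/100.
    factors = [1 + p / 100 for p in percent_changes]
    return (geometric_mean(factors) - 1) * 100

# Return on investment: 20%, 15%, -40%, 100%.
roi_rate = mean_growth_rate([20, 15, -40, 100])

# Inflation rates: 5%, 4%, 2%, 8%, 6%.
inflation_rate = mean_growth_rate([5, 4, 2, 8, 6])

print(f"Mean rate of return: {roi_rate:.2f}%")
print(f"Mean rate of inflation: {inflation_rate:.2f}%")
```

The growth-factor form is what keeps the calculation valid for a negative return such as -40%, since the factor 0.6 is still positive.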
![Page 24: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/24.jpg)
Working Problem 2.8: The percent increases in sales for the last 4 years at X-L Company were: 9.91, 10.75, 13.12, 26.6
(a) Find the geometric mean percent increase.
(b) Find the arithmetic mean percent increase.
(c) Is the arithmetic mean equal to or greater than the geometric mean?
![Page 25: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/25.jpg)
What are the ‘dispersion’ or variability measures?
Range, mean deviation, standard deviation, and variance
[Figure: Measures of dispersion (range, standard deviation, variance), illustrated on the ordered data 10.0, 12.0, 13.0, 14.0, 14.0, 14.0, 24.0, 24.0, 24.0, 26.8, 27.0, 27.0, 29.0, 30]
![Page 26: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/26.jpg)
What are ‘Dispersion’ or Variability measures?
Range
$$R = X_{\max} - X_{\min}$$
Example: Calculate the range of the following set of data:
200, 205, 204, 202, 207, 208
The Range = R = 208 - 200 = 8
![Page 27: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/27.jpg)
Properties of range:
• The range represents the most commonly used statistic after the arithmetic mean
• It is simple as it relies on two values, the maximum value and the minimum value
• It is easy to understand: the higher the range, the higher the variability
• Since the range relies on two values (maximum and minimum), a mistake in any one of these two values or a presence of an outlier can result in a misleading value of range
$$R = X_{\max} - X_{\min}$$
![Page 28: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/28.jpg)
What are ‘Dispersion’ or Variability measures?
Mean deviation
$$MD = \frac{1}{n} \sum_{i=1}^{n} \lvert x_i - \bar{X} \rvert$$
Example: Calculate the mean deviation of the following ten observations of metal sheet thickness (mm):
83, 90, 70, 90, 90, 60, 70, 70, 90, 100
Solution:
Step 1: Calculate the mean
$$\bar{X} = (83 + 90 + 70 + 90 + 90 + 60 + 70 + 70 + 90 + 100) / 10 = 81.3 \text{ mm}$$
Step 2: Subtract the mean from each observation value, and add up the absolute differences

| Thickness (mm) | $(x - \bar{X})$ | $\lvert x - \bar{X} \rvert$ |
|---|---|---|
| 83 | (83-81.3) = 1.7 | 1.7 |
| 90 | (90-81.3) = 8.7 | 8.7 |
| 70 | (70-81.3) = -11.3 | 11.3 |
| 90 | (90-81.3) = 8.7 | 8.7 |
| 90 | (90-81.3) = 8.7 | 8.7 |
| 60 | (60-81.3) = -21.3 | 21.3 |
| 70 | (70-81.3) = -11.3 | 11.3 |
| 70 | (70-81.3) = -11.3 | 11.3 |
| 90 | (90-81.3) = 8.7 | 8.7 |
| 100 | (100-81.3) = 18.7 | 18.7 |
| Mean = 81.3 | | Sum = 110.4 |

Mean Deviation = 110.4/10 = 11.04 mm
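The two steps of the mean-deviation calculation translate directly into a few lines of Python. This is a sketch for checking the worked example; the variable names are illustrative.

```python
# Metal sheet thickness observations (mm) from the example.
thickness = [83, 90, 70, 90, 90, 60, 70, 70, 90, 100]

# Step 1: the arithmetic mean.
mean_thickness = sum(thickness) / len(thickness)

# Step 2: absolute deviations from the mean, then their average.
abs_deviations = [abs(x - mean_thickness) for x in thickness]
mean_deviation = sum(abs_deviations) / len(thickness)

print(f"Mean = {mean_thickness:.1f} mm")
print(f"Sum of absolute deviations = {sum(abs_deviations):.1f}")
print(f"Mean deviation = {mean_deviation:.2f} mm")
```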
![Page 29: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/29.jpg)
What are ‘Dispersion’ or Variability measures?
Standard deviation
For a Population:
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$$
For a Sample:
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n}}$$
For n < 30, we use (n-1) in the denominator
![Page 30: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/30.jpg)
Standard deviation
Example: Calculate the standard deviation of the following ten observations of metal sheet thickness
Thickness (mm): 83, 90, 70, 90, 90, 60, 70, 70, 90, 100
$$\bar{X} = (83 + 90 + 70 + 90 + 90 + 60 + 70 + 70 + 90 + 100) / 10 = 81.3 \text{ mm}$$
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n}}$$
For n < 30, we use (n-1) in the denominator:
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n - 1}}$$

| Thickness (mm) | $(x - \bar{X})$ | $(x - \bar{X})^2$ |
|---|---|---|
| 83 | (83-81.3) = 1.7 | 2.89 |
| 90 | (90-81.3) = 8.7 | 75.69 |
| 70 | (70-81.3) = -11.3 | 127.69 |
| 90 | (90-81.3) = 8.7 | 75.69 |
| 90 | (90-81.3) = 8.7 | 75.69 |
| 60 | (60-81.3) = -21.3 | 453.69 |
| 70 | (70-81.3) = -11.3 | 127.69 |
| 70 | (70-81.3) = -11.3 | 127.69 |
| 90 | (90-81.3) = 8.7 | 75.69 |
| 100 | (100-81.3) = 18.7 | 349.69 |
| Mean = 81.3 | | Sum = 1492.1 |

$$s = \sqrt{\frac{1492.1}{9}} = 12.88 \text{ mm}$$
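To see how much the choice of denominator matters for this example, the sketch below computes the standard deviation both ways with Python's `statistics` module: `pstdev` divides by n and `stdev` divides by n-1. The variable names are illustrative.

```python
from statistics import pstdev, stdev

# Metal sheet thickness observations (mm) from the example.
thickness = [83, 90, 70, 90, 90, 60, 70, 70, 90, 100]

# pstdev divides the sum of squared deviations by n.
sd_divide_by_n = pstdev(thickness)

# stdev divides by (n - 1), which is the form used in the worked example.
sd_divide_by_n_minus_1 = stdev(thickness)

print(f"Denominator n:     s = {sd_divide_by_n:.2f} mm")
print(f"Denominator n - 1: s = {sd_divide_by_n_minus_1:.2f} mm")
```

For ten observations the two results differ noticeably; the gap shrinks as the sample size grows.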
![Page 31: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/31.jpg)
What are ‘Dispersion’ or Variability measures?
Variance
For a Population:
$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$$
For a Sample:
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n}$$
For n < 30, we use (n-1) in the denominator
![Page 32: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/32.jpg)
Properties of variance:
• The variance represents the most commonly used statistic to indicate variability
• It is easy to understand: the higher the variance, the higher the variability
• Unlike the range, the variance takes into account all of the observation values. It is therefore less dependent than the range on any single extreme value, although squaring the deviations still gives outliers considerable weight
• Variances cannot be subtracted to determine the variability of a combination; they can only be added. If X and Y are independent and U = X ± Y, then Var (U) = Var (X) + Var (Y). This is the principle of analysis of variance (Chapter 11)
![Page 33: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/33.jpg)
What are ‘Dispersion’ or Variability measures?
Variance
Example: Calculate the variance of the following ten observations of metal sheet thickness
Thickness (mm): 83, 90, 70, 90, 90, 60, 70, 70, 90, 100
$$\bar{X} = (83 + 90 + 70 + 90 + 90 + 60 + 70 + 70 + 90 + 100) / 10 = 81.3 \text{ mm}$$
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n}$$
For n < 30, we use (n-1) in the denominator
The squared deviations $(x - \bar{X})^2$ of the ten observations are 2.89, 75.69, 127.69, 75.69, 75.69, 453.69, 127.69, 127.69, 75.69, and 349.69 (Mean = 81.3, Sum = 1492.1), so
$$s^2 = \frac{1492.1}{9} = 165.79 \text{ mm}^2$$
![Page 34: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/34.jpg)
Working Problem:
Calculate the minimum, maximum, range, standard deviation, and variance for the following data set of minimum wage ($)
7, 8, 6, 6, 8, 5, 6, 5, 8, 8
Answer:
Minimum = $5
Maximum =$8
Range = $3
Standard deviation = $1.252
Variance = 1.567
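The answers to this working problem can be verified with a short Python sketch. It uses the sample (n-1) forms `stdev` and `variance` from the `statistics` module, matching the rule for n < 30; the variable names are illustrative.

```python
from statistics import stdev, variance

# Minimum wage data ($) from the working problem.
wages = [7, 8, 6, 6, 8, 5, 6, 5, 8, 8]

summary = {
    "minimum": min(wages),
    "maximum": max(wages),
    "range": max(wages) - min(wages),
    "standard deviation": stdev(wages),  # sample form, divides by n - 1
    "variance": variance(wages),         # sample form, divides by n - 1
}

for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```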
![Page 35: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/35.jpg)
Example: Suppose X and Y are independent random variables. The variance of X is equal to 16, and the variance of Y is equal to 9. Let U = X - Y. What is the standard deviation of U?
• 2.65
• 5.00
• 7.00
• 25.0
• None of the above
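One way to work through this question is to apply the addition rule for variances of independent variables and then take the square root. The Python sketch below does exactly that; the variable names are illustrative.

```python
import math

var_x = 16  # Var(X), given
var_y = 9   # Var(Y), given

# For independent X and Y, Var(X - Y) = Var(X) + Var(Y):
# the variances add even though the variables are subtracted.
var_u = var_x + var_y

# The standard deviation is the square root of the variance.
sd_u = math.sqrt(var_u)

print(f"Var(U) = {var_u}, standard deviation of U = {sd_u:.2f}")
```

Note that the standard deviations themselves (4 and 3) do not add; only the variances do.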
![Page 36: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/36.jpg)
Working Problem 2.9: Question (1): Calculate the minimum, the maximum, the range, the mean deviation, the standard deviation, and the variance for the following data set of minimum wage ($) 7, 8, 6, 6, 8, 5, 6, 5, 8, 8
Question (2): In two consecutive exams, the mean grade of the first test was 80 and the mean grade of the second test was 90. The standard deviation of the grades on the first test was 6, and the standard deviation of the grades on the second test was 8. Calculate the mean of the two tests and the variance of the two tests.
![Page 37: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/37.jpg)
![Page 38: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/38.jpg)
What are Combined Descriptive Measures?
Coefficient of Variation (C.V%)
$$C.V\% = \frac{s}{\bar{X}} \times 100$$
Example: Calculate the Coefficient of Variation of the following ten observations of metal sheet thickness
Thickness (mm): 83, 90, 70, 90, 90, 60, 70, 70, 90, 100
$$\bar{X} = (83 + 90 + 70 + 90 + 90 + 60 + 70 + 70 + 90 + 100) / 10 = 81.3 \text{ mm}$$
The squared deviations $(x - \bar{X})^2$ of the ten observations sum to 1492.1, so
$$s = \sqrt{\frac{1492.1}{9}} = 12.88 \text{ mm}$$
$$C.V\% = \frac{s}{\bar{X}} \times 100 = \frac{12.88}{81.3} \times 100 = 15.84\%$$
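The coefficient of variation for this example can be checked in a few lines of Python. This sketch uses the sample standard deviation (n-1 denominator), as in the worked solution; the variable names are illustrative.

```python
from statistics import mean, stdev

# Metal sheet thickness observations (mm) from the example.
thickness = [83, 90, 70, 90, 90, 60, 70, 70, 90, 100]

# C.V% = (standard deviation / mean) * 100, a unit-free measure of spread.
cv_percent = stdev(thickness) / mean(thickness) * 100

print(f"C.V% = {cv_percent:.2f}%")
```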
![Page 39: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/39.jpg)
Working Problem 2.11: Calculate the Coefficient of Variation (CV%) for the following data set of minimum wage ($): 7, 8, 6, 6, 8, 5, 6, 5, 8, 8
![Page 40: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/40.jpg)
What are Combined Descriptive Measures?
Standardized Variable (the z Score)
A standardized variable is a measure of the deviation from the mean by an individual value in units of the standard deviation:
$$z = \frac{x - \mu}{\sigma}$$
Example: An instructor who has been teaching statistics for twenty years has observed that the average grade of students is 88% and the standard deviation is 3%. After teaching the course for two classes, one in the fall semester and one in the spring semester of 2008, the instructor found that the average grades were as follows:

| Term | Mean Grade |
|---|---|
| Fall 2008 | 82% |
| Spring 2008 | 91% |

How do these two semesters compare to the instructor’s average over the last twenty years?
![Page 41: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/41.jpg)
The standardized variable (z- score) is calculated for each semester as follows:
| Term | Mean Grade | z-Score |
|---|---|---|
| Fall 2008 | 82% | $z_{82} = (82-88)/3 = -2$ |
| Spring 2008 | 91% | $z_{91} = (91-88)/3 = 1$ |

From the above scores, you can conclude that the class’s mean grade of 82% in Fall 2008 was 2 standard deviations below the instructor’s long-term mean grade, while the class’s mean grade of 91% in Spring 2008 was 1 standard deviation above it.
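The z-score calculation for the two semesters can be written as a small helper function. The function name `z_score` and its parameter names are hypothetical, chosen for this sketch only.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations by which value lies above (+) or below (-) the mean."""
    return (value - mean) / std_dev

# Instructor's long-term mean grade and standard deviation (percent).
long_term_mean = 88
long_term_sd = 3

z_fall = z_score(82, long_term_mean, long_term_sd)
z_spring = z_score(91, long_term_mean, long_term_sd)

print(f"Fall 2008:   z = {z_fall:+.1f}")
print(f"Spring 2008: z = {z_spring:+.1f}")
```

The same helper applies unchanged to any of the z-score problems in this section, since only the value, mean, and standard deviation change.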
![Page 42: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/42.jpg)
Example: The mean driving time of people living in Union City, near Atlanta, Georgia, to CNN Center in downtown Atlanta is 40 minutes, with a standard deviation of 10 minutes. You asked four CNN employees who live in Union City about their driving time to CNN Center, and you got the following answers: 38 minutes, 52 minutes, 58 minutes, and 40 minutes. Find the z-score that corresponds to each driving time, and interpret the difference in z-scores.
$$z = \frac{t - \mu_t}{\sigma_t}$$
where $t$ is the actual driving time, $\mu_t$ is the mean driving time, and $\sigma_t$ is the standard deviation of driving time.
At t = 38 minutes, $z = (38 - 40)/10 = -0.2$
At t = 52 minutes, $z = (52 - 40)/10 = 1.2$
At t = 58 minutes, $z = (58 - 40)/10 = 1.8$
At t = 40 minutes, $z = (40 - 40)/10 = 0$
![Page 43: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/43.jpg)
Working Problem 2.12: The average scoring points per game (PTG) up to week 10 in the 2010 NFL football season was 22 points and the standard deviation was 4 points. Using the z-score, compare the following 3 teams and determine which team had a relatively better scoring season:
San Francisco 16 PTG, New England 29 PTG, Pittsburgh 24 PTG
![Page 44: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/44.jpg)
Working Problem 2.13: The annual salaries of engineers in the U.S. automobile industry are normally distributed with a mean of $100,000 and a standard deviation of $10,000. What is the z-score for the income x of an auto-engineer who earns $85,000 annually? And what is the z-score for an auto-engineer who earns $105,000 annually?
![Page 45: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/45.jpg)
Working Problem 2.14: The annual salaries of U.S. state governors are normally distributed with a mean of $135,450 and a standard deviation of $36,530. In 2007, the Arkansas governor made an annual salary of $85,000, and the California governor made $206,000. Compare the annual salaries of these two governors using the z-score.
[Photos: Arnold Schwarzenegger (California) and Mike Beebe (Arkansas)]
![Page 46: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/46.jpg)
The Use of Computer for Performing Descriptive Statistics
Powerful tools are available to perform statistical analyses; the focus should therefore be on:
• Planning for sample and data selection in view of the study or application objectives
• Gathering and organizing data in such a way that serves the purpose of the application
• Selecting the appropriate type of analysis
• Organizing the analysis output
• Interpreting the analysis outcome
• Making a report addressing the case or application in question
![Page 47: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/47.jpg)
The Use of Computer for Performing Descriptive Statistics
Example:
Data on Annual Tuition and Financial Aid by Different U.S. State Colleges (http://www.ordoludus.com/costs.php, 2006)

| School | In-State Tuition ($) | Out-of-State Tuition ($) | Total Cost ($) | Fin. Aid ($) |
|---|---|---|---|---|
| Georgia Institute of Technology | 4,648 | 18,990 | 25,792 | 8,222 |
| University of Tennessee | 5,290 | 16,060 | 21,270 | 6,954 |
| University of Mississippi | 4,320 | 9,744 | 14,442 | 7,532 |
| University of Kentucky | 5,812 | 12,798 | 18,027 | 7,861 |
| Louisiana State University | 4,515 | 12,815 | 19,145 | 8,006 |
| University of Florida | 3,094 | 16,579 | 22,839 | 10,566 |
| University of Virginia | 7,133 | 23,877 | 30,266 | 13,449 |
| University of South Carolina | 7,314 | 18,956 | 25,039 | 9,501 |
| University of North Carolina | 4,515 | 18,313 | 24,903 | 9,687 |
| University of Georgia | 4,628 | 16,848 | 23,224 | 7,320 |
| University of Alabama | 4,864 | 13,516 | 18,540 | 7,980 |
| University of California (UCLA) | 6,504 | 24,324 | 36,252 | 13,462 |
| North Dakota State University | 5,264 | 12,545 | 17,675 | 5,487 |
| Florida State University | 3,208 | 16,340 | 23,118 | 8,269 |
![Page 48: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/48.jpg)
Analysis of Descriptive Statistics: Steps 1 and 2 [screenshot with callouts 1 and 2]
![Page 49: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/49.jpg)
Analysis of Descriptive Statistics: Steps 3 and 4 [screenshot with callouts 3 and 4]
![Page 50: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/50.jpg)
Analysis of Descriptive Statistics: Steps 5 and 6 [screenshot with callouts 5 and 6]
Annotations: Largest(4) is the minimum of the largest 4 observations; Smallest(4) is the maximum of the smallest 4 observations.
![Page 51: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/51.jpg)
Analysis of Descriptive Statistics: Output
![Page 52: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/52.jpg)
Interpretation
The most critical aspect of statistics is learning how to interpret the results. This is not your typical math course where all you have to do is find answers. The true answer is not the outputs; it is the interpretation of the outputs.

| Statistic | In-State Tuition ($) | Out-State Tuition ($) | Total Cost ($) | Financial Aid ($) |
|---|---|---|---|---|
| Mean | 5079 | 16550 | 22895 | 8878 |
| Median | 4756 | 16460 | 22979 | 8114 |
| Mode | 4515 | None | None | None |
| Standard Deviation | 1269.44 | 4196.51 | 5602.14 | 2297.84 |
| Sample Variance | 1611486.64 | 17610692.25 | 31383983.67 | 5280083.14 |
| Range | 4220 | 14580 | 21810 | 7975 |
| Minimum | 3094 | 9744 | 14442 | 5487 |
| Maximum | 7314 | 24324 | 36252 | 13462 |
| Count | 14 | 14 | 14 | 14 |
| Largest(4) | 5812 | 18956 | 25039 | 9687 |
| Smallest(4) | 4515 | 12815 | 18540 | 7532 |

Outputs of descriptive statistics for tuition, cost, and financial aid
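The slides produce this output with a spreadsheet tool. As an alternative sketch, the same statistics can be computed in Python with the standard library. The function name `describe` and the dictionary keys are hypothetical; only two of the four columns are shown to keep the example short.

```python
from collections import Counter
from statistics import mean, median, stdev, variance

def describe(values):
    """Descriptive statistics mirroring the rows of the output table."""
    ordered = sorted(values)
    top_value, top_count = Counter(values).most_common(1)[0]
    return {
        "mean": mean(values),
        "median": median(values),
        # Report no mode when every value occurs only once.
        "mode": top_value if top_count > 1 else None,
        "standard deviation": stdev(values),  # sample form (n - 1)
        "sample variance": variance(values),
        "range": ordered[-1] - ordered[0],
        "minimum": ordered[0],
        "maximum": ordered[-1],
        "count": len(values),
        "largest(4)": ordered[-4],   # 4th largest observation
        "smallest(4)": ordered[3],   # 4th smallest observation
    }

# Columns copied from the tuition table, in the same school order.
in_state = [4648, 5290, 4320, 5812, 4515, 3094, 7133,
            7314, 4515, 4628, 4864, 6504, 5264, 3208]
financial_aid = [8222, 6954, 7532, 7861, 8006, 10566, 13449,
                 9501, 9687, 7320, 7980, 13462, 5487, 8269]

in_state_stats = describe(in_state)
aid_stats = describe(financial_aid)

for name, value in in_state_stats.items():
    print(f"In-State Tuition {name}: {value}")
```

Computing the numbers is the easy part; as the slide stresses, the work lies in interpreting them, for example noticing that the in-state mean exceeds the median because a few schools have much higher tuition.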
![Page 53: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/53.jpg)
[Figure: Total Cost and Financial Aid by School. Chart of Total Cost ($) and Financial Aid ($) for each of the 14 schools, ordered from North Dakota State University to University of California (UCLA), on an axis from $0 to $40,000, with one school annotated "Optimum Choice"]
![Page 54: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/54.jpg)
![Page 55: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/55.jpg)
APPENDIX 2.A Steps to Add Data Analysis to Excel 2007
![Page 56: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/56.jpg)
Data Analysis Add-In: Steps 1 and 2 [screenshot with callouts 1 and 2]
![Page 57: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/57.jpg)
Data Analysis Add-In: Steps 3 through 5 [screenshot with callouts 3, 4, and 5]
![Page 58: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/58.jpg)
Data Analysis Add-In: Steps 6 and 7 [screenshot with callouts 6 and 7]
![Page 59: Descriptive Statistics, Numerical Description](https://reader035.vdocuments.site/reader035/viewer/2022062522/58829f5c1a28ab92618b5bd3/html5/thumbnails/59.jpg)
Data Analysis Add-In: Steps 8 and 9 [screenshot with callouts 8 and 9]