Excursions in Modern Mathematics, 7e. Copyright © 2010 Pearson Education, Inc.
14 Descriptive Statistics
14.1 Graphical Descriptions of Data
14.2 Variables
14.3 Numerical Summaries
14.4 Measures of Spread
Data Set
• data set: a collection of data values.
• data points: the individual data values in a data set.
• N still represents the size of the data set.
• variable: any characteristic that varies with the members of a population.
Variables
Numerical (or quantitative) variable: a variable that represents a measurable quantity.
– Continuous variable: the difference between the values of a numerical variable can be arbitrarily small.
– Discrete variable: the values of the numerical variable change by minimum increments.
Categorical (or qualitative) variable: a variable that cannot be measured numerically.
© Copyright McGraw-Hill 2000
Frequency Table
• organize the data in a meaningful, intelligible way.
• enable the reader to determine the nature or shape of the distribution.
• facilitate computational procedures for measures of average and spread.
Chem 103 Test Scores (pg 545 #1)
What type of data is this? Construct a frequency table for the raw data.
Student ID Score Student ID Score
1362 50 4315 70
1486 70 4719 70
1721 80 4951 60
1932 60 5321 60
2489 70 5872 100
2766 10 6433 50
2877 80 6921 50
2964 60 8317 70
3217 70 8854 100
3588 80 8964 80
3780 80 9158 60
3921 60 9347 60
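As a check on a hand-built table, the tally can be sketched in Python. This is only an illustration: the list name `scores` and the use of `collections.Counter` are choices made here, not part of the slides. The scores are copied from the table above, left column pair first.

```python
from collections import Counter

# Chem 103 scores from the table above (left column pair, then right).
scores = [50, 70, 80, 60, 70, 10, 80, 60, 70, 80, 80, 60,
          70, 70, 60, 60, 100, 50, 50, 70, 100, 80, 60, 60]

# Count how often each score occurs and list the scores in increasing order.
frequency = dict(sorted(Counter(scores).items()))

print("Score  Frequency")
for score, count in frequency.items():
    print(f"{score:>5}  {count:>9}")
```

The frequencies should add up to N = 24, the number of students in the table.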
Bar Graphs and Pictograms
• A bar graph lists the data values in increasing order on a horizontal axis and shows the frequency of each data value by the height of the column above that value.
• Pictograms use icons or pictures instead of bars to show the frequencies (see pg 528).
• The point of a pictogram is that a graph is often used not only to inform but also to impress and persuade; in such cases, a well-chosen icon or picture can be a more effective tool than just a bar.
• Draw a bar graph for the Chem 103 data. Note any outliers.
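A rough text version of the Chem 103 bar graph can be sketched as follows. It is a stand-in for a hand-drawn graph, not the slide's figure: the bars run horizontally rather than vertically, the helper name `bar_rows` is hypothetical, and the frequencies are the ones tallied from the raw scores.

```python
# Frequency table for the Chem 103 scores (score -> number of students),
# tallied from the raw data.
frequency = {10: 1, 50: 3, 60: 7, 70: 6, 80: 5, 100: 2}

def bar_rows(freq):
    """Return one text row per score, in increasing order of score."""
    return [f"{score:>3} | {'#' * count} ({count})"
            for score, count in sorted(freq.items())]

for row in bar_rows(frequency):
    print(row)
```

In the printed rows the single score of 10 sits well apart from the rest of the scores, which is the kind of value to look at when checking for outliers.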
Pie Charts
• Used when the number of categories is small.
• Uses relative frequencies of the categories.
• The “pie” represents the entire population (100%).
• The “slices” represent the categories (or classes), with the size (angle) of each slice being proportional to the relative frequency of the corresponding category.
Relative Frequency
Relative frequencies: the frequencies given in terms of percentages of the total population.
For the Chem 103 data (round to the nearest tenth):
Score                10     50     60     70     80     100
Relative Frequency   4.2%   12.5%  29.2%  25%    20.8%  8.3%
Construct a pie chart.
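The percentages above, and the slice angles needed for the pie chart, can be reproduced with a short sketch. The variable names are illustrative, and the angle rule (angle = 360° × frequency / N) is just the statement that each slice is proportional to its relative frequency.

```python
# Frequency table for the Chem 103 scores (score -> number of students).
frequency = {10: 1, 50: 3, 60: 7, 70: 6, 80: 5, 100: 2}
total = sum(frequency.values())  # N = 24

# Relative frequency as a percentage, rounded to the nearest tenth.
relative = {s: round(100 * f / total, 1) for s, f in frequency.items()}

# Central angle of each pie slice, in degrees.
angles = {s: 360 * f / total for s, f in frequency.items()}

for s in frequency:
    print(f"{s:>3}: {relative[s]:>5}%  {angles[s]:>6.1f} degrees")
```

The angles come out to 15°, 45°, 105°, 90°, 75° and 30°, which together make the full 360°.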
How Many Categories
When it comes to deciding how best to display graphically the frequencies of a population, a critical issue is the number of categories into which the data can fall. When the number of categories is too big (say, in the dozens), a bar graph or pictogram can become muddled and ineffective. This happens more often than not with numerical data: numerical variables can take on infinitely many values.
• In situations with large data sets it is customary to present a more compact picture of the data by grouping together sets of scores into categories called class intervals.
• The number of class intervals should be somewhere between 5 and 20.
• Class interval and endpoint conventions (#20).
• Histograms (see pg. 533).
• Do #20.
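To make the grouping step concrete, here is a minimal sketch that sorts the Chem 103 scores into class intervals. Everything specific in it is an assumption made for illustration: the interval width of 20, the endpoint convention (a score on a boundary belongs to the interval it closes, so 60 falls in 40 < score ≤ 60), and the helper name `class_interval`. The Chem 103 data set is small and is used only because it is the one at hand; exercise #20 may use a different convention.

```python
# Chem 103 scores from the earlier table.
scores = [50, 70, 80, 60, 70, 10, 80, 60, 70, 80, 80, 60,
          70, 70, 60, 60, 100, 50, 50, 70, 100, 80, 60, 60]

WIDTH = 20  # assumed class-interval width

def class_interval(score, width=WIDTH):
    """Return (low, high) with low < score <= high (assumed convention)."""
    high = -(-score // width) * width  # ceiling division, then scale back up
    return (high - width, high)

# Tally how many scores fall in each interval.
counts = {}
for s in scores:
    interval = class_interval(s)
    counts[interval] = counts.get(interval, 0) + 1

# Print every interval from 0 to 100, including empty ones.
for low in range(0, 100, WIDTH):
    print(f"{low:>3} < score <= {low + WIDTH:<3}: {counts.get((low, low + WIDTH), 0)}")
```

Whatever convention is chosen, it has to be applied consistently so that each score lands in exactly one interval. The interval counts are the column heights of the histogram.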
Numerical Summaries of a Data Set
Measures of Location
Measures of location, such as the mean (or average), the median, and the quartiles, are numbers that provide information about the values of the data.
Mean
$$\text{mean} = \frac{\text{sum of the data values}}{\text{total number of data values}} = \frac{d_1 + d_2 + \cdots + d_N}{N}$$
Pg. 548 #24a
To Find the Average From a Table
To find the average A of a data set given by a frequency table, do the following:
Step 1. S = d1•f1 + d2•f2 + … + dk•fk
Step 2. N = f1 + f2 + … + fk
Step 3. A = S/N
Pg. 548 #29
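The three steps can be traced on the Chem 103 frequency table with a short sketch. The variable names S, N and A follow the steps above; the dictionary layout is an illustrative choice.

```python
# Frequency table for the Chem 103 scores: data value d -> frequency f.
frequency = {10: 1, 50: 3, 60: 7, 70: 6, 80: 5, 100: 2}

# Step 1: S = d1*f1 + d2*f2 + ... + dk*fk
S = sum(d * f for d, f in frequency.items())

# Step 2: N = f1 + f2 + ... + fk
N = sum(frequency.values())

# Step 3: A = S / N
A = S / N

print(f"S = {S}, N = {N}, A = {A:.2f}")
```

For these data S = 1600 and N = 24, so the average score is about 66.67, the same value obtained by adding all 24 raw scores and dividing by 24.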
Median
• Halfway point in the data set.
• Physical middle.
• Data MUST be in order.
FINDING THE MEDIAN OF A DATA SET
■ Sort the data set from smallest to largest. Let $d_1, d_2, d_3, \ldots, d_N$ represent the sorted data.
■ If N is odd, the median is $d_{(N+1)/2}$ (the middle value).
■ If N is even, the median is the average of $d_{N/2}$ and $d_{(N/2)+1}$.
Pg. 548 #24b
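The odd and even cases above translate directly into code. In this sketch the function name `median` is illustrative, and the only subtlety is that Python lists are indexed from 0, so the k-th sorted value d_k is stored at position k - 1.

```python
def median(data):
    """Median using the odd/even rule: sort first, then pick the middle."""
    d = sorted(data)          # data MUST be in order
    n = len(d)
    if n % 2 == 1:
        return d[(n + 1) // 2 - 1]            # d_{(N+1)/2}
    return (d[n // 2 - 1] + d[n // 2]) / 2    # average of d_{N/2} and d_{N/2+1}

# Chem 103 scores from the earlier table (N = 24, so the even case applies).
scores = [50, 70, 80, 60, 70, 10, 80, 60, 70, 80, 80, 60,
          70, 70, 60, 60, 100, 50, 50, 70, 100, 80, 60, 60]

print(median(scores))
```

With N = 24 the median is the average of the 12th and 13th sorted scores, which are both 70.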
Quartiles
After the median, the next most commonly used values are the first and third quartiles. The first quartile (denoted by Q1) is the 25th percentile, and the third quartile (denoted by Q3) is the 75th percentile.
Pg. 549 #34
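The slide does not spell out how a percentile is computed, so the following sketch assumes one common textbook convention, the "locator" rule: compute L = (p/100)·N; if L is a whole number, average the L-th and (L+1)-th sorted values; otherwise round L up and take that value. The function name `percentile` is hypothetical, and other conventions can give slightly different quartiles.

```python
import math

def percentile(data, p):
    """p-th percentile (0 < p < 100) under the assumed locator rule."""
    d = sorted(data)
    n = len(d)
    locator = p * n / 100
    if locator == int(locator):
        k = int(locator)
        return (d[k - 1] + d[k]) / 2     # average of d_L and d_{L+1}
    return d[math.ceil(locator) - 1]     # round L up and take that value

# Chem 103 scores from the earlier table.
scores = [50, 70, 80, 60, 70, 10, 80, 60, 70, 80, 80, 60,
          70, 70, 60, 60, 100, 50, 50, 70, 100, 80, 60, 60]

q1, m, q3 = (percentile(scores, p) for p in (25, 50, 75))
print(min(scores), q1, m, q3, max(scores))
```

For the Chem 103 scores this gives Q1 = 60, M = 70 and Q3 = 80, so together with Min = 10 and Max = 100 the five-number summary is 10, 60, 70, 80, 100.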
Box Plots
Invented in 1977 by statistician John Tukey, a box plot (also known as a box-and-whisker plot) is a picture of the five-number summary of a data set. The box plot consists of a rectangular box that sits above a scale and extends from the first quartile Q1 to the third quartile Q3 on that scale. A vertical line crosses the box, indicating the position of the median M. On both sides of the box are “whiskers” extending to the smallest value, Min, and largest value, Max, of the data.
Box Plots
[Figure: a generic box plot for a data set.]
Pg. 549 #42
The Range
Range: the difference between the highest and lowest data values, usually denoted by R.
R = Max – Min
The range of a data set is a useful piece of information when there are no outliers in the data. In the presence of outliers the range tells a distorted story.
The Interquartile Range
• Eliminates the possible distortion caused by outliers.
• Denoted by the acronym IQR.
• The difference between the third quartile and the first quartile.
• IQR = Q3 – Q1
• Tells us how spread out the middle 50% of the data values are.
• Find R and IQR for #34.
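The contrast between R and IQR can be seen on the Chem 103 scores. This is a sketch under two assumptions: the quartiles come from Python's standard `statistics.quantiles`, whose interpolation convention may differ from the textbook's (for these data both give Q1 = 60 and Q3 = 80), and the single score of 10 is treated as the outlier for the sake of the comparison. The helper name `spread` is hypothetical.

```python
import statistics

# Chem 103 scores from the earlier table.
scores = [50, 70, 80, 60, 70, 10, 80, 60, 70, 80, 80, 60,
          70, 70, 60, 60, 100, 50, 50, 70, 100, 80, 60, 60]

def spread(data):
    """Return (R, IQR) where R = Max - Min and IQR = Q3 - Q1."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    return max(data) - min(data), q3 - q1

print(spread(scores))                          # (90, 20.0)

# Drop the lowest score (assumed here to be the outlier) and compare.
without_low = [s for s in scores if s != 10]
print(spread(without_low))                     # (50, 20.0)
```

Removing one score cuts the range from 90 to 50, while the IQR stays at 20, which is the sense in which the IQR is insulated from outliers.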
Standard Deviation
• The most important and most commonly used measure of spread for a data set.
• The key concept for understanding the standard deviation is the concept of deviation from the mean.
• If A is the average of the data set and x is an arbitrary data value, the difference x – A is x’s deviation from the mean.
• The deviations from the mean tell us how “far” the data values are from the average value of the data.
Standard Deviation
The deviations from the mean are themselves a data set, which we would like to summarize. One way would be to average them, but if we do that, the negative deviations and the positive deviations will always cancel each other out so that we end up with an average of 0. This, of course, makes the average useless in this case. The cancellation of positive and negative deviations can be avoided by squaring each of the deviations.
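The claim that the deviations always average to 0 takes one line to verify. Writing the data values as d_1, …, d_N and their average as A (so that the sum of the data values equals N·A), the sum of the deviations is:

```latex
\sum_{i=1}^{N} (d_i - A)
  \;=\; \sum_{i=1}^{N} d_i \;-\; N A
  \;=\; N A - N A
  \;=\; 0,
\qquad \text{where } A = \frac{1}{N} \sum_{i=1}^{N} d_i .
```

Since the sum is 0, so is the average. Each squared deviation, by contrast, is at least 0, so the squared deviations cannot cancel, and their sum is 0 only when every data value equals A.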
Standard Deviation
The squared deviations are never negative, and if we average them out, we get an important measure of spread called the variance, denoted by V.
Finally, we take the square root of the variance and get the standard deviation, denoted by the Greek letter σ (and sometimes by the acronym SD).
The following is an outline of the definition of the standard deviation of a data set.
THE STANDARD DEVIATION OF A DATA SET
■ Let A denote the mean of the data set. For each number x in the data set, compute its deviation from the mean (x – A) and square each of these numbers. These numbers are called the squared deviations.
■ Find the average of the squared deviations. This number is called the variance V.
■ The standard deviation is the square root of the variance: $\sigma = \sqrt{V}$.
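The outline can be followed step by step on the Chem 103 scores. This sketch uses the definition exactly as stated, averaging the squared deviations over all N values (dividing by N, not N - 1); the function name `standard_deviation` is illustrative.

```python
import math

# Chem 103 scores from the earlier table.
scores = [50, 70, 80, 60, 70, 10, 80, 60, 70, 80, 80, 60,
          70, 70, 60, 60, 100, 50, 50, 70, 100, 80, 60, 60]

def standard_deviation(data):
    """Return (A, V, sigma) following the three steps of the outline."""
    A = sum(data) / len(data)                          # mean of the data set
    squared_deviations = [(x - A) ** 2 for x in data]  # (x - A)^2 for each x
    V = sum(squared_deviations) / len(data)            # variance: their average
    return A, V, math.sqrt(V)                          # sigma = sqrt(V)

A, V, sigma = standard_deviation(scores)
print(f"A = {A:.2f}, V = {V:.2f}, sigma = {sigma:.2f}")
```

For these scores the mean is about 66.67, the variance about 313.89, and the standard deviation about 17.72.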
• Page 551 #56a,c, 62, 63
• Groups Pg. 545–551: #20, 34c, 56b, 64