mm150 unit 9 seminar statistics ii
![Page 1: MM150 Unit 9 Seminar Statistics II](https://reader035.vdocuments.site/reader035/viewer/2022062721/56813696550346895d9e239d/html5/thumbnails/1.jpg)
MM150 Unit 9 Seminar
Statistics II
9.1
Measures of Central Tendency
Averages
• Several different types of averages can be calculated for a given set of data.
• All averages, in general, are called measures of central tendency.
• The four most common measures of central tendency are the mean, median, mode, and midrange.
Mean – To find the arithmetic mean, or mean, sum the data scores and then divide by the number of data scores.
Example: Find the mean of the data scores 5, 6, 2, 9, 8
(5 + 6 + 2 + 9 + 8)/5 = 30/5 = 6
Median – To find the median, put the data scores in ascending or descending order and then find the middle data score. If there is an even number of data scores, after ranking the scores, find the mean of the middle two.
Example: Find the median of the data scores 5, 7, 2, 9, 8
Put the scores in ascending order: 2, 5, 7, 8, 9
The middle score, 7, is the median.
Example: Find the median of the data scores 4, 7, 2, 9
Put the scores in ascending order: 2, 4, 7, 9
Find the mean of 4 and 7: (4 + 7)/2 = 11/2 = 5.5
Mode – The mode is the data score that occurs most frequently.
Example: Find the mode of the data scores 6, 4, 9, 8, 6, 5
It may help to put the scores in ascending order: 4, 5, 6, 6, 8, 9
You can see that the data score 6 occurs most often.
• You can have data sets that don’t have a mode (each score occurs once), and you can have data sets that are bimodal, which means they have 2 modes.
Midrange – The midrange is the value halfway between the greatest and least data score. To find it, take the mean of the greatest and least data score.
Example: Find the midrange of the data scores 6, 4, 9, 8, 6, 5
It may help to put the scores in ascending order: 4, 5, 6, 6, 8, 9
The midrange is (4 + 9)/2 = 13/2 = 6.5
*Please read on page 362 of your text when each is the ‘better’ average.
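The four averages above can be checked with a short Python sketch. It assumes Python 3.8 or later for `statistics.multimode`; the `midrange` helper and the variable names are made up for this illustration and do not come from the slides.

```python
from statistics import mean, median, multimode


def midrange(scores):
    """Mean of the least and greatest data scores (hypothetical helper)."""
    return (min(scores) + max(scores)) / 2


# Data scores taken from the slide examples above.
mean_scores = [5, 6, 2, 9, 8]
median_odd = [5, 7, 2, 9, 8]
median_even = [4, 7, 2, 9]
mode_scores = [6, 4, 9, 8, 6, 5]

print(mean(mean_scores))       # mean of the five scores, 30/5
print(median(median_odd))      # middle score after ranking
print(median(median_even))     # mean of the middle two scores
print(multimode(mode_scores))  # list of the most frequent score(s)
print(midrange(mode_scores))   # (4 + 9)/2
```

Note that `multimode` returns a list: a bimodal data set gives two entries, and a data set where every score occurs once gives back every score, which corresponds to "no mode" in the sense used above.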
EVERYONE: page 366 #21
1, 2, 2, 3, 3, 3, 4, 5, 5, 7
Mean: (1 + 2 + 2 + 3 + 3 + 3 + 4 + 5 + 5 + 7)/10 = 35/10 = 3.5 weeks
Median: the middle two scores are 3 and 3, so (3 + 3)/2 = 3 weeks
Mode: 3 occurs most often, so the mode is 3 weeks
Midrange: (1 + 7)/2 = 4 weeks
Mean Example
Todd is taking a math class where his end-of-term grade is based on 4 exams, each having the same number of points and weighted the same. He scored 98, 82, and 87 on the first three exams in his class. What does he need to score on the 4th exam to get at least a 90% for the final grade? The instructor uses mean as the average.
(98 + 82 + 87 + x)/4 = 90
(267 + x)/4 = 90
267 + x = 360
x = 93
Todd must score a 93 or higher on the test.
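The same algebra can be written as a small Python sketch. The function name `required_score` and its parameters are hypothetical, and the sketch assumes exactly one equally weighted exam remains.

```python
def required_score(previous_scores, target_mean, total_exams):
    """Score needed on the one remaining exam for the mean to reach target_mean.

    Rearranges (sum of previous scores + x) / total_exams = target_mean for x.
    Assumes all exams carry the same weight and only one exam is left.
    """
    return target_mean * total_exams - sum(previous_scores)


needed = required_score([98, 82, 87], target_mean=90, total_exams=4)
print(needed)  # 360 - 267
```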
Measures of Position
• There are 2 measures of position, percentiles and quartiles.
• They are used to make comparisons for a large amount of data.
• Percentile – There are 99 percentiles that divide the data up into 100 equal parts.
• When you are reported to be in the 98th percentile, this does not mean your score is a 98%. This means that you outperformed about 98% of the population.
Measures of Position (cont’d)
Quartile – Quartiles divide ranked data into 4 equal parts, each holding 25% of the scores. The first quartile, Q1, is at 25%; the second, Q2, is at 50% and is the median; the third, Q3, is at 75%.
(minimum) | 25% | Q1 | 25% | Q2 (median) | 25% | Q3 | 25% | (maximum)
Example
Determine Q1, Q2, and Q3 of the data below:
15, 10, 19, 18, 11, 15, 13, 18, 19, 17, 19, 15, 16, 13, 15, 16, 13, 12, 14
1. First put the data in ascending order:
10, 11, 12, 13, 13, 13, 14, 15, 15, 15, 15, 16, 16, 17, 18, 18, 19, 19, 19
2. Find the median. There are 19 scores, so the median is the 10th score: Q2 = 15.
3. Find the median of the lower half (the 9 scores before the median) for Q1: Q1 = 13.
4. Find the median of the upper half (the 9 scores after the median) for Q3: Q3 = 18.
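The median-of-halves procedure above can be sketched in Python as follows. The `quartiles` function is a hypothetical helper written for this example; it assumes at least two data scores and leaves the middle score out of both halves when the count is odd, as in the steps above. Library routines such as `statistics.quantiles` use interpolation rules that may give different values for other data sets.

```python
from statistics import median


def quartiles(scores):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ranked = sorted(scores)
    half = len(ranked) // 2
    lower = ranked[:half]    # scores before the median position
    upper = ranked[-half:]   # scores after the median position
    return median(lower), median(ranked), median(upper)


data = [15, 10, 19, 18, 11, 15, 13, 18, 19, 17,
        19, 15, 16, 13, 15, 16, 13, 12, 14]
q1, q2, q3 = quartiles(data)
print(q1, q2, q3)
```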
9.2
Measures of Dispersion
2 Measures of Dispersion
Range – The range is the difference between the greatest and least data score.
Example: Find the range of the data scores 55, 59, 51, 64, 60
Put the data scores in ascending order: 51, 55, 59, 60, 64
The range is 64 – 51 = 13
Standard deviation – The standard deviation tells us how much the data differ from the mean.
See the next PowerPoint slide for an example of standard deviation.
Range = (maximum value) – (minimum value)
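Standard Deviation, cont’d
$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$
Here $\bar{x}$ is the mean of the data scores and n is the number of data scores.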
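Standard Deviation Example
Find the standard deviation of 11, 15, 18, 9, 12
1.) Find the mean of the data scores. (11 + 15 + 18 + 9 + 12)/5 = 65/5 = 13
2.) Make a chart with 3 columns.
3.) List the data scores in the first column.
4.) Subtract the mean from each data score in the second column.
5.) Square each difference in the third column.

| Data | Data – Mean | (Data – Mean)² |
| --- | --- | --- |
| 11 | 11 – 13 = -2 | 4 |
| 15 | 15 – 13 = 2 | 4 |
| 18 | 18 – 13 = 5 | 25 |
| 9 | 9 – 13 = -4 | 16 |
| 12 | 12 – 13 = -1 | 1 |

6.) Add the values in the third column: 4 + 4 + 25 + 16 + 1 = 50
7.) Divide 50 by n – 1, where n is the number of data scores. So divide 50 by 4, which is 12.5
8.) Find the square root of the number found in step 7. √12.5 ≈ 3.5355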
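The steps above translate into a short Python sketch. The function `sample_std_dev` is a hypothetical name for this illustration; the standard library's `statistics.stdev` also divides by n – 1, so it is used here only as a cross-check.

```python
from math import sqrt
from statistics import stdev


def sample_std_dev(scores):
    """Standard deviation with n - 1 in the denominator, following the steps above."""
    n = len(scores)
    mean = sum(scores) / n                              # step 1
    squared_diffs = [(x - mean) ** 2 for x in scores]   # steps 3 to 5
    total = sum(squared_diffs)                          # step 6
    return sqrt(total / (n - 1))                        # steps 7 and 8


scores = [11, 15, 18, 9, 12]
s = sample_std_dev(scores)
print(round(s, 4))          # square root of 12.5, rounded
print(stdev(scores))        # library cross-check
```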
Everyone Example
Calculate the standard deviation for the data set:
15, 16, 20, 13
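EVERYONE solution
1. Find the mean. (15 + 16 + 20 + 13)/4 = 64/4 = 16
2. Make a 3-column table.
3. List the data scores in the first column.
4. Subtract the mean from each data score in the second column.
5. Square each difference in the third column.

| Data | Data – Mean | (Data – Mean)² |
| --- | --- | --- |
| 15 | -1 | 1 |
| 16 | 0 | 0 |
| 20 | 4 | 16 |
| 13 | -3 | 9 |

6. Add the values in the third column: 1 + 0 + 16 + 9 = 26
7. 26/3 ≈ 8.667
8. √8.667 ≈ 2.944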
9.3
The Normal Curve
Types of Distributions
• Rectangular distribution
– All observed values occur with the same frequency.
• J-shaped distribution
– The frequency is either constantly increasing or constantly decreasing.
Types of Distributions (cont’d)
• Bimodal
– Two nonadjacent values occur more frequently than any other values in the set of data.
• Skewed to left or right
– Has more of a “tail” on one side than the other.
– The greatest frequency appears on the left or the right of the curve.
Skewed Distributions
• In figure (a) the greatest frequency appears on the left so the mode would be on the left side of the curve.
• All the data would be used to determine the mean. The values on the right side of the curve in (a) would increase the value of the mean. So the value of the mean would be farther to the right of the mode.
• The median would be between the mean and the mode.
Normal Distributions
PROPERTIES
1. The graph of a normal distribution is called a normal curve.
2. The normal curve is bell-shaped and symmetric about the mean.
3. The mean, median, and mode of a normal distribution all have the same value and all occur at the center of the distribution.
The Empirical Rule
In any normal distribution:
1. Approximately 68% of all the data lies within one standard deviation of the mean (in both directions).
2. Approximately 95% of all the data lies within two standard deviations of the mean (in both directions).
3. Approximately 99.7% of all the data lies within three standard deviations of the mean (in both directions).
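As a rough check on these percentages, the sketch below uses `statistics.NormalDist` (available in Python 3.8 and later) to compute the share of a normal distribution that falls within k standard deviations of the mean. The helper name `share_within` is an assumption made for this example.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Fraction of the data within k standard deviations of the mean (both directions)."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(100 * share_within(k), 1))
```

The printed percentages come out close to 68%, 95%, and 99.7%, which is why the rule is stated with "approximately".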
Empirical Rule Example
Suppose that the weights of newborn infants are normally distributed. If approximately 2000 infants are born at Sarasota Memorial Hospital each year, determine the approximate number of infants who are expected to weigh:
a) within one standard deviation of the mean.
b) within two standard deviations of the mean.
Solution:
a) By the empirical rule, 68% of the infants weigh within one standard deviation of the mean.
(0.68)(2000) = 1360 infants are expected to weigh within one standard deviation of the mean.
b) By the empirical rule, 95% of the infants weigh within two standard deviations of the mean.
(0.95)(2000) = 1900 infants are expected to weigh within two standard deviations of the mean.
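The arithmetic in this example is a single multiplication, sketched below in Python. The lookup table `EMPIRICAL_SHARES` and the function `expected_count` are hypothetical names; the shares are the rounded empirical-rule percentages, not exact normal-curve areas.

```python
# Rounded empirical-rule shares, keyed by number of standard deviations.
EMPIRICAL_SHARES = {1: 0.68, 2: 0.95, 3: 0.997}


def expected_count(population, std_devs):
    """Approximate number of data values within std_devs standard deviations of the mean."""
    return round(population * EMPIRICAL_SHARES[std_devs])


print(expected_count(2000, 1))  # part a)
print(expected_count(2000, 2))  # part b)
```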
z-Scores
• z-scores determine how far, in terms of standard deviations, a given score is from the mean of the distribution.
$$z = \frac{\text{value of the piece of data} - \text{mean}}{\text{standard deviation}} = \frac{x - \text{mean}}{s}$$
Example: z-scores
• A normal distribution has a mean of 50 and a standard deviation of 5. Find z-scores for the following values.
• a) 55 b) 60 c) 43
• a) $z = \frac{55 - 50}{5} = \frac{5}{5} = 1$
A score of 55 is one standard deviation above the mean.
Example: z-scores continued
• b) $z = \frac{60 - 50}{5} = \frac{10}{5} = 2$
A score of 60 is 2 standard deviations above the mean.
• c) $z = \frac{43 - 50}{5} = \frac{-7}{5} = -1.4$
A score of 43 is 1.4 standard deviations below the mean.
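The three z-scores can be reproduced with a one-line function. This is a minimal sketch; the name `z_score` and its parameter names are chosen for this illustration.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / std_dev


for value in (55, 60, 43):
    print(value, z_score(value, mean=50, std_dev=5))
```

A positive result means the score is above the mean and a negative result means it is below, matching parts a) through c).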
9.4
Linear Correlation and Regression
Linear Correlation
• Linear correlation is used to determine whether there is a relationship between two quantities and, if so, how strong the relationship is.
• The linear correlation coefficient, r, is a unitless measure that describes the strength of the linear relationship between two variables.
• If the value is positive, as one variable increases, the other increases.
• If the value is negative, as one variable increases, the other decreases.
• The variable, r, will always be a value between –1 and 1 inclusive.
Scatter Diagrams
• A visual aid used with correlation is the scatter diagram, a plot of points (bivariate data).
– The independent variable, x, generally is a quantity that can be controlled.
– The dependent variable, y, is the other variable.
• The value of r is a measure of how far a set of points varies from a straight line.
– The greater the spread, the weaker the correlation and the closer the r value is to 0.
– The smaller the spread, the stronger the correlation and the closer the r value is to 1 or –1.
Linear Correlation Coefficient
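• The formula to calculate the correlation coefficient (r) is as follows:
$$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}}$$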
Example: Words Per Minute versus Mistakes
There are five applicants applying for a job as a medical transcriptionist. The following shows the results of the applicants when asked to type a chart. Determine the correlation coefficient between the words per minute typed and the number of mistakes.

| Applicant | Words per Minute | Mistakes |
| --- | --- | --- |
| Ellen | 24 | 8 |
| George | 67 | 11 |
| Phillip | 53 | 12 |
| Kendra | 41 | 10 |
| Nancy | 34 | 9 |
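Solution
• We will call the words typed per minute, x, and the mistakes, y.
• List the values of x and y and calculate the necessary sums.

| WPM x | Mistakes y | x² | y² | xy |
| --- | --- | --- | --- | --- |
| 24 | 8 | 576 | 64 | 192 |
| 67 | 11 | 4489 | 121 | 737 |
| 53 | 12 | 2809 | 144 | 636 |
| 41 | 10 | 1681 | 100 | 410 |
| 34 | 9 | 1156 | 81 | 306 |
| Σx = 219 | Σy = 50 | Σx² = 10,711 | Σy² = 510 | Σxy = 2,281 |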
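Solution continued
• The n in the formula represents the number of pieces of data. Here n = 5.
$$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}}$$
$$r = \frac{5(2281) - (219)(50)}{\sqrt{5(10{,}711) - (219)^2}\,\sqrt{5(510) - (50)^2}}$$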
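Solution continued
$$r = \frac{11{,}405 - 10{,}950}{\sqrt{5(10{,}711) - 47{,}961}\,\sqrt{5(510) - 2500}}$$
$$r = \frac{455}{\sqrt{53{,}555 - 47{,}961}\,\sqrt{2550 - 2500}}$$
$$r = \frac{455}{\sqrt{5594}\,\sqrt{50}} \approx 0.86$$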
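The calculation of r can be sketched in Python using the same sums as in the solution. The function name `correlation_coefficient` and the variable names are assumptions made for this example; the data are the five applicants from the table.

```python
from math import sqrt


def correlation_coefficient(xs, ys):
    """Linear correlation coefficient r computed from the sums used in the solution."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator


words_per_minute = [24, 67, 53, 41, 34]
mistakes = [8, 11, 12, 10, 9]
r = correlation_coefficient(words_per_minute, mistakes)
print(round(r, 2))
```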
Solution continued
• Since 0.86 is fairly close to 1, there is a fairly strong positive correlation.
• This result implies that the more words typed per minute, the more mistakes made.
Linear Regression
• Linear regression is the process of determining the linear relationship between two variables.
• The line of best fit (regression line or the least squares line) is the line such that the sum of the squares of the vertical distances from the line to the data points (on a scatter diagram) is a minimum.
The Line of Best Fit
• Equation:
$$y = mx + b, \text{ where } m = \frac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2} \text{ and } b = \frac{\sum y - m\sum x}{n}$$
Example
• Use the data in the previous example to find the equation of the line that relates the number of words per minute and the number of mistakes made while typing a chart.
• Graph the equation of the line of best fit on a scatter diagram that illustrates the set of bivariate points.
Solution
• From the previous results, we know that
$$m = \frac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2}$$
$$m = \frac{5(2{,}281) - (219)(50)}{5(10{,}711) - 219^2}$$
$$m = \frac{455}{5594} \approx 0.081$$
Solution
• Now we find the y-intercept, b.
$$b = \frac{\sum y - m\sum x}{n}$$
$$b = \frac{50 - 0.081(219)}{5}$$
$$b = \frac{32.261}{5} \approx 6.452$$
Therefore the line of best fit is y = 0.081x + 6.452
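The slope and y-intercept can be sketched in Python with the same formulas. The function name `line_of_best_fit` and the variable names are hypothetical; the data are the five applicants from the earlier example.

```python
def line_of_best_fit(xs, ys):
    """Return (m, b) for the least squares line y = mx + b."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b = (sum_y - m * sum_x) / n
    return m, b


words_per_minute = [24, 67, 53, 41, 34]
mistakes = [8, 11, 12, 10, 9]
m, b = line_of_best_fit(words_per_minute, mistakes)

# The worked solution rounds m to 0.081 before computing b.
b_from_rounded_m = (sum(mistakes) - round(m, 3) * sum(words_per_minute)) / len(mistakes)

print(round(m, 3), round(b, 3), round(b_from_rounded_m, 3))
```

Carrying the unrounded slope through gives a y-intercept near 6.437, while rounding m to 0.081 first gives the 6.452 shown in the solution. The difference comes only from when the rounding happens.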
Solution continued
• To graph y = 0.081x + 6.452, plot at least two points and draw the graph.

| x | y |
| --- | --- |
| 10 | 7.262 |
| 20 | 8.072 |
| 30 | 8.882 |
Example: page 407 #24
a) Draw a scatter diagram
b) Determine the value of r, rounded to the nearest thousandth
c) Determine whether a correlation exists at … = 0.05
d) Determine whether a correlation exists at …. = 0.01
Example: continued
a) The first thing to do is plot the points. Here we have (6,13), (8,11), (11,9), (14,10) and (17,7).
Example: page 407 #32
Determine the equation of the line of best fit from the data in the exercise indicated. Round both the slope and y-intercept to the nearest hundredth.