chapter 1 the mean, the number of observations, the variance and the standard deviation
Post on 21-Dec-2015
Chapter 1
The mean, the number of observations, the variance and the standard deviation
Some definitions
Data - observations, measurements, scores
Statistics - a series of rules and methods that can be used to organize and interpret data.
Descriptive Statistics - methods to summarize large amounts of data with just a few numbers.
Inferential Statistics - mathematical procedures to make statements about a population based on a sample.
More Definitions
Parameter - a number that summarizes or describes some aspect of a population.
Sample statistic - An estimate of a population parameter based on a random sample taken from the population.
Sampling Error - the difference between a sample statistic that estimates a population parameter and the actual parameter.
Non-parametric Statistics - statistics for observations that do not allow the estimation of the population mean and variance.
Sampling Error - the difference between a sample statistic that estimates a population parameter and the actual parameter.
Differences between sample statistics and population parameters are largely a function of stable, random individual differences and measurement problems.
Where we are going
Descriptive Statistics
Number of Observations
Measures of Central Tendency
Measures of Variability
Observations
Each score is represented by the letter X.
The total number of observations is represented by N.
Measures of Central Tendency
Finding the most typical score
median - the middle score
mode - the most frequent score
mean - the average score
In this course, the mean will be our most important measure of central tendency.
Calculating the Mean
Greek letters are used to represent population parameters.
$\mu$ (mu) is the mathematical symbol for the mean.
$\sum$ (capital sigma) is the mathematical symbol for summation.
Formula - $\mu = (\sum X) / N$
English: To calculate the mean, first add up all the scores, then divide by the number of scores you added up.
The mode, the median and the mean
Ages of people retiring from Rutgers this year.
Scores as listed: 60, 63, 45, 64, 65, 70, 55, 60, 66
Scores in order: 45, 55, 60, 60, 63, 64, 65, 66, 70
$\sum X = 548$, $N = 9$
Mean = 548 / 9 = 60.89
Mode is 60. Median is 63.
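As a quick check of the slide above, here is a minimal Python sketch that recomputes the three measures of central tendency from the retirement ages. The variable names are illustrative and not part of the original slides.

```python
from statistics import median, mode

# Retirement ages listed on the slide
ages = [60, 63, 45, 64, 65, 70, 55, 60, 66]

n = len(ages)        # N, the number of observations
total = sum(ages)    # sum of X
mu = total / n       # mean = (sum of X) / N

print(n, total)        # 9 548
print(round(mu, 2))    # 60.89
print(median(ages))    # 63 (the middle score once the ages are sorted)
print(mode(ages))      # 60 (the only age that occurs twice)
```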
Measures of Variability – less important
Range - the distance from the highest to the lowest score.
Inter-quartile Range - the distance from the score that cuts off the top 25% of scores to the score that cuts off the bottom 25%.
Sum of Squares (SS) – the total squared distance of all scores from the mean. You calculate it by finding the distance of each score from the mean, squaring that distance and then summing over all the scores.
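The three measures above can be sketched in Python using the retirement ages from the earlier slide. This is an illustration only: textbooks and software place quartiles in slightly different ways, so the inter-quartile range printed here reflects the default convention of Python's `statistics.quantiles`, which is an assumption and not necessarily the convention used in this course.

```python
from statistics import quantiles

ages = [60, 63, 45, 64, 65, 70, 55, 60, 66]

# Range: distance from the highest to the lowest score
score_range = max(ages) - min(ages)

# Inter-quartile range: distance between the third and first quartile cut points
# (the exact value depends on the quartile convention; this is the library default)
q1, q2, q3 = quantiles(ages, n=4)
iqr = q3 - q1

# Sum of squares: squared distance of each score from the mean, summed over all scores
mu = sum(ages) / len(ages)
ss = sum((x - mu) ** 2 for x in ages)

print(score_range)    # 25
print(iqr)
print(round(ss, 2))   # 428.89
```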
Measures of Variability – more important
Variance ($\sigma^2$) - also called sigma squared. The variance is the average squared distance of scores from mu. It is found by dividing the total squared distance of all the scores from the mean by the number of scores ($\sigma^2 = SS/N$).
Standard Deviation ($\sigma$) - also called sigma. The standard deviation is the square root of the variance. It is the average unsquared distance of scores in the population from their mean. (That is almost, but not exactly, like saying that the standard deviation is the average distance of scores from the population mean.)
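Computing the variance and the standard deviation
Scores on a 10 question Psychology quiz

| Student | $X$ | $X - \mu$ | $(X - \mu)^2$ |
|---|---|---|---|
| John | 7 | +1.00 | 1.00 |
| Jennifer | 8 | +2.00 | 4.00 |
| Arthur | 3 | -3.00 | 9.00 |
| Patrick | 5 | -1.00 | 1.00 |
| Marie | 7 | +1.00 | 1.00 |

$\sum X = 30$, $N = 5$, $\mu = 6.00$
$\sum (X - \mu) = 0.00$
$\sum (X - \mu)^2 = SS = 16.00$
$\sigma^2 = SS/N = 3.20$, $\sigma = \sqrt{3.20} = 1.79$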
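The quiz computation above can be reproduced step by step. The following is a minimal Python sketch using the five scores from the slide; the variable names are illustrative.

```python
from math import sqrt

scores = {"John": 7, "Jennifer": 8, "Arthur": 3, "Patrick": 5, "Marie": 7}

n = len(scores)                                   # N
mu = sum(scores.values()) / n                     # mean = 30 / 5
deviations = [x - mu for x in scores.values()]    # X - mu for each student
ss = sum(d ** 2 for d in deviations)              # sum of squares
variance = ss / n                                 # sigma squared = SS / N
sd = sqrt(variance)                               # sigma = square root of the variance

print(mu)                 # 6.0
print(sum(deviations))    # 0.0
print(ss, variance)       # 16.0 3.2
print(round(sd, 2))       # 1.79
```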
The variance is our most basic and important measure of variability
The variance ($\sigma^2$ = sigma squared) is the average squared distance of individual scores from the population mean.
Other indices of variation are derived from the variance.
For example, as noted above, the standard deviation (sigma) is the average unsquared distance of scores from mu. To find it, you compute the square root of the variance.
Other measures of variability derived from the variance
We can randomly choose scores from a population to form a random sample and then find the mean of such samples.
Each score you add to a sample tends to correct the sample mean back toward the population mean, mu.
The average squared distance of sample means from the population mean is the variance divided by n, the size of the sample.
To find the average unsquared distance of sample means from mu, divide the variance by n, then take the square root. The result is called the standard error of the sample mean or, more briefly, the standard error of the mean. We’ll see more of this in Ch. 4.
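The recipe for the standard error of the mean can be sketched in a few lines of Python. The variance of 3.20 comes from the quiz example earlier in the chapter; the function name and the sample sizes are hypothetical choices made only for illustration.

```python
from math import sqrt

def standard_error(variance: float, n: int) -> float:
    """Standard error of the sample mean: divide the variance by n, then take the square root."""
    return sqrt(variance / n)

sigma_squared = 3.20    # population variance from the quiz example

# Hypothetical sample sizes: larger samples give sample means that sit closer to mu
for n in (1, 4, 16):
    print(n, round(standard_error(sigma_squared, n), 2))
```

With a sample of size 1 the standard error equals the standard deviation, and it shrinks as n grows.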
Making predictions (1)
Without any other information, the population mean (mu) is the best prediction of each and every person’s score.
So you should predict that everyone will score precisely at the population mean.
Why? Because the mean is an unbiased predictor or estimate. The mean is as close to the high as to the low scores in the population.
This is mathematically proven by the fact that deviations around the mean sum to zero.
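The claim that deviations around the mean sum to zero follows directly from the formula for the mean. A short worked derivation, using the same symbols as the rest of the chapter:

```latex
\sum (X - \mu) = \sum X - N\mu
               = \sum X - N \cdot \frac{\sum X}{N}
               = \sum X - \sum X
               = 0
```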
You should also predict that everyone will score right at the mean because:
The mean is the number that is the smallest average squared distance from all the scores in the distribution.
Thus, the mean is your best prediction, because it is a least squares, unbiased predictor.
What happens if we make a prediction other than mu?
Scores on a Psychology quiz (mu = 6.00)
What if we predict everyone will score 5.50? Deviations don’t sum to zero and the average squared distance of scores from the prediction increases.

| Student | $X$ | $X - 5.50$ | $(X - 5.50)^2$ |
|---|---|---|---|
| John | 7 | +1.50 | 2.25 |
| Jennifer | 8 | +2.50 | 6.25 |
| Arthur | 3 | -2.50 | 6.25 |
| Patrick | 5 | -0.50 | 0.25 |
| Marie | 7 | +1.50 | 2.25 |

$\sum X = 30$, $N = 5$, $\mu = 6.00$
$\sum (X - 5.50) = 2.50$
$\sum (X - 5.50)^2 = 17.25$ (total squared distance from the prediction)
Average squared distance from 5.50 = 17.25 / 5 = 3.45; its square root = 1.86
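Compare that to predicting that everyone will score right at the mean (mu).
Scores on a 10 question Psychology quiz

| Student | $X$ | $X - \mu$ | $(X - \mu)^2$ |
|---|---|---|---|
| John | 7 | +1.00 | 1.00 |
| Jennifer | 8 | +2.00 | 4.00 |
| Arthur | 3 | -3.00 | 9.00 |
| Patrick | 5 | -1.00 | 1.00 |
| Marie | 7 | +1.00 | 1.00 |

$\sum X = 30$, $N = 5$, $\mu = 6.00$
$\sum (X - \mu) = 0.00$
$\sum (X - \mu)^2 = SS = 16.00$
$\sigma^2 = SS/N = 3.20$, $\sigma = \sqrt{3.20} = 1.79$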
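The comparison between predicting 5.50 and predicting the mean can be sketched in Python. The two helper functions are hypothetical names introduced here for illustration; they are not from the slides.

```python
scores = [7, 8, 3, 5, 7]

def sum_of_deviations(prediction, data):
    """Total signed distance of the scores from the prediction."""
    return sum(x - prediction for x in data)

def average_squared_error(prediction, data):
    """Average squared distance of the scores from the prediction."""
    return sum((x - prediction) ** 2 for x in data) / len(data)

for guess in (5.5, 6.0):
    print(guess, sum_of_deviations(guess, scores), average_squared_error(guess, scores))
# 5.5 2.5 3.45
# 6.0 0.0 3.2
```

Only the prediction at the mean gives deviations that sum to zero, and it also gives the smaller average squared error.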
But when you predict that everyone will score at the mean, you will be wrong. In fact, it is often the case that no one will score precisely at the mean.
In statistics, we don’t expect our predictions to be precisely right.
We want to make predictions that are wrong in a particular way.
We want our predictions to be as close to the high scores as to the low scores in the population.
The mean is the only number that is an unbiased predictor, because it is the only number around which deviations sum to zero.
We want to be wrong by the least amount possible
In statistics, we consider error to be the squared distance between a prediction and the actual score.
The mean is the least average squared distance from all the scores in the population.
The number that is the least average squared distance from the scores in the population is the prediction that is least wrong, the least in error.
Thus, saying that everyone will score at the mean (even if no one does!) is the prediction that gives you the smallest amount of error.
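Why no other prediction can beat the mean can be shown with a short derivation. Here $c$ is any candidate prediction (a symbol introduced only for this sketch); the step from the second to the third line uses the fact that deviations around the mean sum to zero.

```latex
\sum (X - c)^2 = \sum \bigl[(X - \mu) + (\mu - c)\bigr]^2
               = \sum (X - \mu)^2 + 2(\mu - c)\sum (X - \mu) + N(\mu - c)^2
               = SS + N(\mu - c)^2
```

The extra term $N(\mu - c)^2$ is zero only when $c = \mu$ and positive otherwise. For the quiz scores with $c = 5.50$, this gives $16.00 + 5(0.50)^2 = 17.25$, matching the total squared distance found for that prediction.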
Why doesn’t everyone score right at the mean?
Sources of Error
Individual differences – people have stable differences from one another. They differ in an infinite number of ways and combinations of ways.
PROOF OF THAT: AREN’T YOU MORE LIKE WHO YOU WILL BE IN 5 MINUTES THAN YOU ARE LIKE THE PERSON NEXT TO YOU??!
AND – THERE ARE ALWAYS MEASUREMENT PROBLEMS! Instruments are imperfect, scores get mistranscribed, participants may be uninterested or have a stomach ache, etc. etc. etc. …
Remember: THERE ARE ALWAYS MEASUREMENT PROBLEMS
NO MEASUREMENT DEVICE IS EVER PERFECTLY ACCURATE, WHETHER IT IS A HIGHLY ACCURATE SCALE OR A 12 QUESTION QUESTIONNAIRE
Additionally, transient situational factors make measurement inaccurate
This is especially true when we measure people. Let’s say we are measuring something relatively easy to measure, such as verbal ability. When we are measuring people, lots of transient factors (such as mood, events, time, motivation etc.) all change an individual’s responses and combine to make our measurement of verbal ability imperfect.
The mean square for error
We call the average squared error of prediction when we use the mean as our prediction the “mean square for error”. It tells us how much (squared) error we make, on the average, when we predict that everyone will score precisely at the mean.
Mean square for error = the variance (sigma squared, $\sigma^2$)
If we predict that everyone will score right at the mean, how much error do we make on the average? To find out, find the distance of each score from the mean, square each distance, sum the squared distances and divide by the number of scores to find the average squared error.
WHOOPS: THAT’S SIGMA SQUARED.
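A small Python sketch can confirm that the mean square for error is the same number as the population variance. It uses the quiz scores from earlier in the chapter and the standard library's `statistics.pvariance`, which divides by N. The related `statistics.variance` divides by N - 1 and would give a different number, so it is not the one to compare against here.

```python
from statistics import pvariance

scores = [7, 8, 3, 5, 7]
mu = sum(scores) / len(scores)

# Average squared error when every score is predicted to be the mean
mean_square_error = sum((x - mu) ** 2 for x in scores) / len(scores)

print(mean_square_error)    # 3.2
print(pvariance(scores))    # the population variance, computed by the library
```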
Questions and answers – the mean.
WHAT QUALITIES OF THE MEAN (MU) MAKE IT THE BEST PREDICTION YOU CAN MAKE OF WHERE EVERYONE WILL SCORE?
The mean is an unbiased predictor or estimate, because the deviations around the mean always sum to zero.
The mean is a least squares predictor because it is the smallest squared distance on the average from all the scores in the population.
So the variance has a third name.
The variance is called the mean square for error as well as being called sigma squared ($\sigma^2$).
As the mean square for error, the variance is our numerical index of the effects of individual differences and measurement problems.
Q & A: the mean
WHY WOULD YOU PREDICT THAT EVERYONE WILL SCORE AT THE MEAN WHEN, IN FACT, OFTEN NO ONE CAN POSSIBLY SCORE PRECISELY AT THE MEAN?
In statistics, we don’t expect our predictions to be precisely right.
We want to make predictions that are close and wrong in a particular way.
We want least squares, unbiased predictors.
Q & A: The variance
WHAT ARE THE OTHER NAMES FOR THE VARIANCE?
Sigma squared ($\sigma^2$) and the mean square for error.
WHAT OTHER MEASURES OF VARIABILITY CAN BE EASILY COMPUTED ONCE YOU KNOW THE VARIANCE?
The standard deviation and the standard error of the sample mean.
How do you compute
THE VARIANCE? Find the distance of each score from the mean, square it, sum the squared distances and divide by the number of scores in the population.
THE STANDARD DEVIATION? Compute the square root of the variance.
THE STANDARD ERROR OF THE SAMPLE MEAN? Divide the variance by n, the size of the sample, and then take a square root.
END CHAPTER 1 SLIDES