
6.1 What is Statistics?

• Definition: Statistics – science of collecting, analyzing, and interpreting data in such a way that the conclusions can be objectively evaluated.

3 Phases:
1. Collecting data
2. Analyzing data
3. Interpreting data

• Descriptive Statistics – summarize and describe a characteristic of a group. Example: batting average.

• Inferential Statistics – used to estimate, infer, or conclude something about a larger group. Example: polls.

• Sample – subset of the group of data available for analysis

• Population – the entire set

• Bias – favoring of certain outcomes over others

• Census – collects data from all members of the population

• Parameter – characteristic value of a population

• Statistic – characteristic value of a sample

6.2 Organizing Data

• Stem and Leaf Diagram:
data – 35, 52, 37, 44, 51, 48, 45, 12

Stem | Leaves
1 | 2
2 |
3 | 5 7
4 | 4 5 8
5 | 1 2
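A minimal sketch of how the stem and leaf diagram for this data can be built, assuming two-digit values where the tens digit is the stem and the ones digit is the leaf. The function name `stem_and_leaf` is illustrative, not from the text.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group values by tens digit (stem) and collect ones digits (leaves)."""
    groups = defaultdict(list)
    for v in sorted(values):
        groups[v // 10].append(v % 10)
    # Keep empty stems between the smallest and largest so gaps stay visible.
    lo, hi = min(groups), max(groups)
    return {stem: groups.get(stem, []) for stem in range(lo, hi + 1)}

data = [35, 52, 37, 44, 51, 48, 45, 12]
diagram = stem_and_leaf(data)
for stem, leaves in diagram.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

Sorting first means the leaves in each row come out in increasing order.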


• Frequency Table:
data – 35, 52, 37, 44, 51, 48, 45, 12

Range    Frequency
50-59    2
40-49    3
30-39    2
20-29    0
10-19    1
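The frequency table above can be reproduced with a short sketch, assuming equal-width classes of 10 that start at a multiple of 10. The function name `frequency_table` and the `width` parameter are illustrative.

```python
def frequency_table(values, width=10):
    """Count how many values fall in each class of the given width."""
    lo = (min(values) // width) * width
    hi = (max(values) // width) * width
    table = {}
    for start in range(lo, hi + width, width):
        count = sum(1 for v in values if start <= v < start + width)
        table[(start, start + width - 1)] = count
    return table

data = [35, 52, 37, 44, 51, 48, 45, 12]
table = frequency_table(data)
# Print the highest class first, matching the layout of the table above.
for (low, high), count in sorted(table.items(), reverse=True):
    relative = count / len(data)
    print(f"{low}-{high}  frequency={count}  relative frequency={relative:.3f}")
```

Dividing each count by the number of observations gives the relative frequencies used in a relative frequency histogram.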

6.3 Displaying Data

• Ways to display data:
– Frequency histogram
– Relative frequency histogram
– Multiple bar graph
– Stacked bar graph
– Line graph
– Pie chart

[Figure: Frequency histogram]

[Figure: Relative frequency histogram]

[Figure: Multiple bar graph]

[Figure: Stacked bar graph]

[Figure: Line graph]

[Figure: Pie chart, with slices including Natsci and Socsci]

6.4 Measures of Central Tendency

• Central Tendency – the propensity of data to be located or clustered about some point.

• Arithmetic Mean – sum of the values of all the observations divided by the total number of observations

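• For sample data, the mean is $\bar{x} = \frac{\sum x}{n}$, where n is the number of observations in the sample.

• For population data, the mean is $\mu = \frac{\sum x}{N}$, where N is the number of observations in the population.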
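A minimal sketch of the arithmetic mean as defined above: the sum of the observations divided by their count. The calculation is the same for a sample and a population; only the interpretation of the data differs. The data reuses the values from 6.2.

```python
def arithmetic_mean(values):
    """Sum of all observations divided by the number of observations."""
    return sum(values) / len(values)

data = [35, 52, 37, 44, 51, 48, 45, 12]
print(arithmetic_mean(data))
```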

• Median – the middle value of a set of data when the data are arranged in ascending order

• Finding the median:
1. Arrange the data in increasing or decreasing order.
2. Determine if n is even or odd.
   a. If n is odd, pick the middle value.
   b. If n is even, take the average of the two middle values.
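The steps for finding the median can be sketched as follows. The function name `median` is illustrative, and the data reuses the values from 6.2.

```python
def median(values):
    """Middle value of the ordered data, or the average of the two middle values."""
    ordered = sorted(values)          # step 1: arrange in increasing order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # step 2a: n is odd
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2   # step 2b: n is even

data = [35, 52, 37, 44, 51, 48, 45, 12]
print(median(data))
```

Here n = 8 is even, so the result is the average of the two middle values, 44 and 45.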

• Mode – the value or values that occur most frequently.
Note: If all values occur with the same frequency, then there is no mode.
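A small sketch of the mode, including the "no mode" case from the note. Returning an empty list for "no mode" is a choice made for this example, and the function assumes non-empty data.

```python
from collections import Counter

def modes(values):
    """Return the most frequent value(s), or [] if every value is equally frequent."""
    counts = Counter(values)
    top = max(counts.values())
    if all(c == top for c in counts.values()):
        return []                     # all values tie, so there is no mode
    return sorted(v for v, c in counts.items() if c == top)

print(modes([1, 2, 2, 3]))
print(modes([1, 2, 2, 3, 3]))
print(modes([35, 52, 37, 44, 51, 48, 45, 12]))
```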

• Symmetric distribution: the mean, median, and mode coincide.

• Distribution skewed to the left: typically mean < median < mode.

• Distribution skewed to the right: typically mode < median < mean.

6.5 Measures of Variability

• Definition: The range of a set of n measurements $x_1, x_2, x_3, \ldots, x_n$ is the difference between the largest and the smallest values.

• Variance – the population variance is $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$.

Problem with the variance: the units are the original units squared.

• Standard deviation – population standard deviation is the square root of the population variance.

• Sample variance – $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$

• s = square root of the sample variance
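A hedged sketch of the variance and standard deviation calculations. It assumes the usual definitional formulas: squared deviations from the mean are summed and divided by N for a population, or by n − 1 for a sample. The function names are illustrative, and the shortcut formulas from the textbook are not reproduced here.

```python
from math import sqrt

def population_variance(values):
    """Average squared deviation from the mean (divide by N)."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)

def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    x_bar = sum(values) / len(values)
    return sum((x - x_bar) ** 2 for x in values) / (len(values) - 1)

data = [35, 52, 37, 44, 51, 48, 45, 12]
sigma = sqrt(population_variance(data))   # population standard deviation
s = sqrt(sample_variance(data))           # sample standard deviation
print(population_variance(data), sample_variance(data), sigma, s)
```

Taking the square root returns the measure to the original units, which addresses the problem with the variance noted above.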

• Shortcut formulas for $s^2$ and $\sigma^2$ are given on page 495 (provided with test).

• Shortcut formula for frequency data is given on page 499 (provided with test).

• Shortcut formulas are easier to calculate.

• Approximating the standard deviation: $s \approx R/4$, where R is the range.
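A quick sketch comparing the range-based approximation with the sample standard deviation computed by Python's `statistics.stdev`. The data reuses the values from 6.2, and the function name `range_rule_estimate` is illustrative.

```python
import statistics

def range_rule_estimate(values):
    """Approximate the standard deviation as (largest - smallest) / 4."""
    return (max(values) - min(values)) / 4

data = [35, 52, 37, 44, 51, 48, 45, 12]
estimate = range_rule_estimate(data)
exact = statistics.stdev(data)
print(f"R/4 estimate: {estimate:.2f}, sample standard deviation: {exact:.2f}")
```

The approximation is rough: on a small data set with an extreme value such as 12, the two numbers can differ noticeably.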

6.6 Measures of Relative Position

• pth percentile – for data arranged in increasing order, the value such that p% of the data are less than that value and (100 – p)% of the data are greater than that value.
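A minimal sketch of the idea behind percentiles: counting what share of the data lies below a given value. Textbooks differ on how to locate a percentile exactly, so this only illustrates the "p% of the data are less than" part of the definition. The function name `percent_below` is illustrative.

```python
def percent_below(values, x):
    """Percentage of observations strictly less than x."""
    return 100 * sum(1 for v in values if v < x) / len(values)

data = [35, 52, 37, 44, 51, 48, 45, 12]
print(percent_below(data, 48))
```

Five of the eight values are less than 48, so 62.5% of the data lies below it.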


• Z-scores – The sample z-score for a measure x is: $z = \frac{x - \bar{x}}{s}$

The population z-score for a measure x is: $z = \frac{x - \mu}{\sigma}$

The z-score represents the number of standard deviations a value lies away from the mean.
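A short sketch of a z-score calculation, assuming the usual definition z = (value − mean) / standard deviation. The same function serves the sample and population cases depending on which mean and standard deviation are passed in. The data reuses the values from 6.2.

```python
import statistics

def z_score(x, center, spread):
    """Number of standard deviations that x lies from the mean."""
    return (x - center) / spread

data = [35, 52, 37, 44, 51, 48, 45, 12]
x_bar = statistics.mean(data)     # sample mean
s = statistics.stdev(data)        # sample standard deviation
print(z_score(52, x_bar, s))      # positive: 52 is above the mean
print(z_score(12, x_bar, s))      # negative: 12 is below the mean
```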

6.7 Normal Distribution

• Definition: Standardizing – converting data to z-scores.

• Some empirical rules:
1. About 68% of data is within one $\sigma$ of the mean.
2. About 95% of data is within two $\sigma$ of the mean.
3. About 99.7% of data is within three $\sigma$ of the mean.
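The empirical rules can be checked against the normal curve itself. This sketch uses the identity P(|Z| < k) = erf(k / √2) for a normally distributed variable, which is a standard result rather than something stated in these notes. The function name `share_within` is illustrative.

```python
from math import erf, sqrt

def share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {100 * share_within(k):.2f}%")
```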

• The normal distribution looks like:
1. Bell-shaped
2. Symmetric
3. Mean = median = mode

• Definition: Standard normal distribution – normal distribution with $\sigma = 1$ and $\mu = 0$.

The standard normal distribution table (page 511 or in appendix page 647) can be used to determine probabilities for a range of z-values.
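As an alternative to the printed table, the same probabilities can be computed in code. This sketch uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `prob_between` is illustrative.

```python
from statistics import NormalDist

Z = NormalDist(mu=0, sigma=1)     # standard normal distribution

def prob_between(a, b):
    """Probability that a standard normal value falls between a and b."""
    return Z.cdf(b) - Z.cdf(a)

print(round(prob_between(0, 1), 4))
print(round(prob_between(-1.96, 1.96), 4))
```

Printed tables may tabulate the area from 0 to z or the area to the left of z, so check which convention the table uses before comparing values.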

6.8 Confidence Intervals

• Central Limit Theorem: For a large sample size, the random variable $\bar{x}$ is approximately normally distributed with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$, where $\mu$ is the population mean of the x’s and $\sigma$ is the population standard deviation of the x’s.

• $\sigma$ may be replaced by s

• Common levels of confidence ($n \ge 30$):

Level of Confidence (%)    $z_{\alpha/2}$
80                         1.28
90                         1.645
95                         1.96
99                         2.575
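A hedged sketch of how these critical values are typically used. It assumes the standard large-sample confidence interval for a mean, sample mean ± z · s/√n, which follows from the Central Limit Theorem but is not written out in these notes. The function name and the example numbers (mean 50, s = 10, n = 100) are hypothetical.

```python
from math import sqrt

# Critical values from the table above, keyed by confidence level in percent.
Z_CRITICAL = {80: 1.28, 90: 1.645, 95: 1.96, 99: 2.575}

def confidence_interval(sample_mean, s, n, level=95):
    """Large-sample interval: sample_mean +/- z * s / sqrt(n)."""
    z = Z_CRITICAL[level]
    margin = z * s / sqrt(n)
    return sample_mean - margin, sample_mean + margin

low, high = confidence_interval(sample_mean=50, s=10, n=100, level=95)
print(f"95% confidence interval: ({low:.2f}, {high:.2f})")
```

A higher level of confidence uses a larger critical value and so gives a wider interval.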

• Margin of Error: margin of error of an estimate of a sample proportion is given by:

6.9 Regression and Correlation

• Scatter Plot – a plot of data consisting of 2 variables

• Linear Regression – modeling the data with the line that “best fits” – usually a “least squares” line or regression line

• Least Squares Line – is the line that minimizes the sum of the squared errors for a set of data points (formulas given on page 531 and shortcut formulas are on page 532 – formulas to be provided on test)
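A minimal sketch of fitting a least squares line y = mx + b. It uses the standard definitional formulas, slope = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and intercept = ȳ − slope · x̄, which may be arranged differently from the textbook formulas referenced above. The data points are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the line minimizing the sum of squared errors."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

xs = [1, 2, 3, 4, 5]              # hypothetical x values
ys = [2, 4, 5, 4, 5]              # hypothetical y values
slope, intercept = least_squares_line(xs, ys)
print(f"y = {slope:.2f}x + {intercept:.2f}")
```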


• Correlation Coefficient r – is a measure of the strength of the linear relationship between the 2 random variables x and y.

Note: The closer the correlation is to 1 or –1, the stronger the relationship between the x and y variables. A correlation of zero means there is no evidence of a linear pattern.
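A short sketch of the correlation coefficient, assuming the standard Pearson formula r = Σ(x − x̄)(y − ȳ) / √(Σ(x − x̄)² · Σ(y − ȳ)²). The data points are hypothetical and the function name is illustrative.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation coefficient r between paired values xs and ys."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

print(correlation([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]))   # moderately strong, positive
print(correlation([1, 2, 3, 4], [9, 7, 5, 3]))         # points on a falling line
```

Points that lie exactly on a rising or falling line give r = 1 or r = –1 (up to rounding).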
