investigation data collection data presentation tabulation diagrams graphs descriptive statistics...

INVESTIGATION Data Collection Data Presentation Tabulation Diagrams Graphs Descriptive Statistics Measures of Location Measures of Dispersion Measures of Skewness & Kurtosis Inferential Statistics Estimation Hypothesis Testing Point estimate Interval estimate Univariate analysis Multivariate analysis

Upload: dwain-barrett

Post on 05-Jan-2016


TRANSCRIPT


INVESTIGATION

• Data Collection

• Data Presentation
– Tabulation
– Diagrams
– Graphs

• Descriptive Statistics
– Measures of Location
– Measures of Dispersion
– Measures of Skewness & Kurtosis

• Inferential Statistics
– Estimation: Point estimate, Interval estimate
– Hypothesis Testing

Univariate analysis

Multivariate analysis


EXAMPLE:

(1) 7, 8, 9, 10, 11: n = 5, $\sum x = 45$, $\bar{x} = 45/5 = 9$

(2) 3, 6, 9, 12, 15: n = 5, $\sum x = 45$, $\bar{x} = 45/5 = 9$

(3) 1, 5, 9, 13, 17: n = 5, $\sum x = 45$, $\bar{x} = 45/5 = 9$

S.D.: (1) 1.58 (2) 4.74 (3) 6.32
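The three data sets share the same mean but differ in spread. A minimal Python sketch of that comparison follows. It assumes the quoted S.D. values use the sample formula (dividing by n - 1), and it takes the second set as 3, 6, 9, 12, 15 so that it matches the stated sum of 45 and S.D. of 4.74; the names `data_sets` and `summarize` are illustrative only.

```python
import statistics

# Data sets from the example. The second set is taken as 3, 6, 9, 12, 15
# (an assumption that agrees with the stated sum of 45 and S.D. of 4.74).
data_sets = {
    1: [7, 8, 9, 10, 11],
    2: [3, 6, 9, 12, 15],
    3: [1, 5, 9, 13, 17],
}

def summarize(values):
    """Return (n, sum, mean, sample standard deviation)."""
    n = len(values)
    total = sum(values)
    mean = total / n
    sd = statistics.stdev(values)  # divides the sum of squares by n - 1
    return n, total, mean, sd

summaries = {label: summarize(values) for label, values in data_sets.items()}

for label, (n, total, mean, sd) in summaries.items():
    print(f"({label}) n={n}, sum={total}, mean={mean:.0f}, S.D.={sd:.2f}")
```

The mean alone cannot tell the three sets apart; the standard deviation can.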


Measures of Dispersion

Or

Measures of variability


Measures of Dispersion

Measures of dispersion summarize differences in the data: how the numbers differ from one another.


Series I: 70 70 70 70 70 70 70 70 70 70

Series II: 66 67 68 69 70 70 71 72 73 74

Series III: 1 19 50 60 70 80 90 100 110 120


Measures of Variability

• A single summary figure that describes the spread of observations within a distribution.


MEASURES OF DISPERSION

• RANGE

• INTERQUARTILE RANGE

• VARIANCE

• STANDARD DEVIATION


Measures of Variability

• Range: Difference between the smallest and largest observations.

• Interquartile Range: Range of the middle half of scores.

• Variance: Mean of all squared deviations from the mean.

• Standard Deviation: Rough measure of the average amount by which observations deviate from the mean. The square root of the variance.


Range

• The difference between the lowest and highest values in the data set.

• The range can be misleading with outliers

Data: 2, 4, 5, 2, 5, 6, 1, 6, 8, 25, 2

Sorted data: 1, 2, 2, 2, 4, 5, 5, 6, 6, 8, 25

Minimum = 1, Maximum = 25, Range = 25 - 1 = 24
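To illustrate how a single outlier drives the range, here is a small Python sketch using the data above. Treating 25 as the outlier is an assumption made for illustration, and `value_range` is a hypothetical helper name.

```python
def value_range(values):
    """Range = maximum - minimum."""
    return max(values) - min(values)

data = [2, 4, 5, 2, 5, 6, 1, 6, 8, 25, 2]

# Treat the largest value (25) as the outlier and drop it for comparison.
without_outlier = sorted(data)[:-1]

range_all = value_range(data)                 # 25 - 1
range_trimmed = value_range(without_outlier)  # 8 - 1

print("Range with outlier:   ", range_all)
print("Range without outlier:", range_trimmed)
```

Removing one observation shrinks the range from 24 to 7, which is why the range on its own can mislead.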


Variability Example: Range

• Hotel Rates

52, 76, 100, 136, 186, 196, 205, 150, 257, 264, 264, 280, 282, 283, 303, 313, 317, 317, 325, 373, 384, 384, 400, 402, 417, 422, 472, 480, 643, 693, 732, 749, 750, 791, 891

• Range: 891-52 = 839


Measures of Position

Quartiles, Deciles, Percentiles


Quartiles

Q1, Q2, Q3 divide ranked scores into four equal parts.

(minimum) | 25% | Q1 | 25% | Q2 (median) | 25% | Q3 | 25% | (maximum)


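Quartiles:

$Q_1 = \dfrac{n+1}{4}\text{th value}$

$Q_2 = \dfrac{2(n+1)}{4}\text{th} = \dfrac{n+1}{2}\text{th value}$

$Q_3 = \dfrac{3(n+1)}{4}\text{th value}$

Interquartile range: $\mathrm{IQR} = Q_3 - Q_1$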


Interquartile Range

• The interquartile range is Q3 - Q1.

• 50% of the observations in the distribution lie within the interquartile range.

• The following figure shows the relationship between the quartiles, the median and the interquartile range.


Interquartile Range

[Figure: the quartiles, the median and the interquartile range]


Sample Number   Unsorted Values   Ranked Values
 1              25                14   Minimum
 2              27                16
 3              20                16
 4              23                18
 5              26                19
 6              24                20   LQ or Q1
 7              19                20
 8              16                21
 9              25                23
10              18                24
11              30                24   Md or Q2
12              29                25
13              32                25
14              26                26
15              24                26
16              21                27   UQ or Q3
17              28                27
18              27                28
19              20                29
20              16                30
21              14                32   Maximum
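The position rule k(n+1)/4 can be sketched in Python for these 21 values. This is an illustrative implementation that interpolates linearly when the position falls between two ranks; the interpolation is an assumption, and `quartile` is a hypothetical helper rather than the deck's own method. With n = 21 the positions are 5.5, 11 and 16.5, so the interpolated Q1 is 19.5, whereas the table flags the neighbouring ranked value 20.

```python
def quartile(ranked, k):
    """k-th quartile at 1-based position k(n+1)/4, interpolating linearly."""
    n = len(ranked)
    position = k * (n + 1) / 4
    lower = int(position)        # rank at or just below the position
    fraction = position - lower  # distance toward the next rank
    if lower < 1:
        return ranked[0]
    if lower >= n:
        return ranked[-1]
    return ranked[lower - 1] + fraction * (ranked[lower] - ranked[lower - 1])

unsorted_values = [25, 27, 20, 23, 26, 24, 19, 16, 25, 18, 30,
                   29, 32, 26, 24, 21, 28, 27, 20, 16, 14]
ranked = sorted(unsorted_values)

q1, q2, q3 = (quartile(ranked, k) for k in (1, 2, 3))
iqr = q3 - q1

print("Q1 =", q1, " Q2 =", q2, " Q3 =", q3, " IQR =", iqr)
```

Other quartile conventions exist and can give slightly different values for Q1 and Q3 on the same data.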


Deciles

D1, D2, D3, D4, D5, D6, D7, D8, D9 divide ranked data into ten equal parts.

10% | D1 | 10% | D2 | 10% | D3 | 10% | D4 | 10% | D5 | 10% | D6 | 10% | D7 | 10% | D8 | 10% | D9 | 10%


Quartiles

Q1 = P25

Q2 = P50

Q3 = P75

Deciles

D1 = P10

D2 = P20

D3 = P30

• • •

D9 = P90


Fractiles (Quantiles)

Quartiles, Deciles, Percentiles: partition data into approximately equal parts


Percentiles and Quartiles

• Maximum is 100th percentile: 100% of values lie at or below the maximum

• Median is 50th percentile: 50% of values lie at or below the median

• Any percentile can be calculated, but the most common are the 25th (1st Quartile) and the 75th (3rd Quartile)


Locating Percentiles in a Frequency Distribution

• A percentile is a score below which a specific percentage of the distribution falls.

• The 75th percentile is a score below which 75% of the cases fall.

• The median is the 50th percentile: 50% of the cases fall below it.

• Quartiles are another type of percentile: the lower quartile is the 25th percentile and the upper quartile is the 75th percentile.


NUMBER OF CHILDREN

                          Frequency   Percent   Valid Percent   Cumulative Percent
Valid     0                     260      26.6            26.6                 26.6   25th percentile (25% included here)
          1                     161      16.4            16.5                 43.1
          2                     260      26.6            26.6                 69.7   50th percentile (50% included here)
          3                     155      15.8            15.9                 85.6   80th percentile (80% included here)
          4                      70       7.2             7.2                 92.7
          5                      31       3.2             3.2                 95.9
          6                      21       2.1             2.1                 98.1
          7                      11       1.1             1.1                 99.2
          EIGHT OR MORE           8        .8              .8                100.0
          Total                 977      99.8           100.0
Missing   NA                      2        .2
Total                           979     100.0
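Locating a percentile in a frequency distribution amounts to finding the first category whose cumulative percent reaches the target. The Python sketch below applies that rule to the valid frequencies of the NUMBER OF CHILDREN table; the function names are hypothetical and the rule "first cumulative percent at or above p" is stated here as an assumption.

```python
# Valid frequencies from the NUMBER OF CHILDREN table (977 valid cases).
categories = ["0", "1", "2", "3", "4", "5", "6", "7", "EIGHT OR MORE"]
frequencies = [260, 161, 260, 155, 70, 31, 21, 11, 8]

def cumulative_percents(freqs):
    """Running total of frequencies expressed as a percent of all cases."""
    total = sum(freqs)
    running = 0
    result = []
    for f in freqs:
        running += f
        result.append(100 * running / total)
    return result

def percentile_category(p, labels, freqs):
    """First category whose cumulative percent reaches p."""
    for label, cum in zip(labels, cumulative_percents(freqs)):
        if cum >= p:
            return label
    return labels[-1]

for p in (25, 50, 80):
    print(f"{p}th percentile falls in category:",
          percentile_category(p, categories, frequencies))
```

The computed cumulative percents reproduce the table's last column to one decimal place.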


Five Number Summary

• Minimum Value

• 1st Quartile

• Median

• 3rd Quartile

• Maximum Value
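As a sketch, the five-number summary can be assembled with Python's standard library. The example reuses the 21 sample values from the quartile table; `five_number_summary` is a hypothetical helper, and `method="exclusive"` is chosen on the assumption that quartiles are located with the (n+1) position rule.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="exclusive")
    return min(values), q1, median, q3, max(values)

values = [25, 27, 20, 23, 26, 24, 19, 16, 25, 18, 30,
          29, 32, 26, 24, 21, 28, 27, 20, 16, 14]

summary = five_number_summary(values)
for name, value in zip(
        ("Minimum", "1st Quartile", "Median", "3rd Quartile", "Maximum"),
        summary):
    print(f"{name}: {value}")
```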


VARIANCE:

Take the deviation of each observation from the mean, square these deviations, and average the sum of squares.

STANDARD DEVIATION:

“ROOT-MEAN-SQUARE DEVIATION”


Variance

• Based on the amount by which each score deviates from the typical score (the mean).

– Score - Mean = Difference Score

– Average of Difference Scores = 0

– To keep this number from being 0, square the difference scores (no negatives to cancel out the positives) before averaging.


Variance: Computational Formula

• Population

$\sigma^2 = \dfrac{N\sum X^2 - (\sum X)^2}{N^2}$

• Sample

$S^2 = \dfrac{n\sum X^2 - (\sum X)^2}{n^2}$
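For readers who want the missing step, the computational formula can be obtained from the definition of the variance as the mean of squared deviations. The derivation below is a sketch for the population case and assumes $\mu = \sum X / N$; the sample version follows by replacing $N$ with $n$ and $\mu$ with $\bar{X}$.

```latex
\begin{aligned}
\sigma^2 &= \frac{1}{N}\sum (X - \mu)^2 \\
         &= \frac{1}{N}\sum X^2 - 2\mu \cdot \frac{1}{N}\sum X + \mu^2 \\
         &= \frac{\sum X^2}{N} - \mu^2 \\
         &= \frac{\sum X^2}{N} - \frac{(\sum X)^2}{N^2} \\
         &= \frac{N\sum X^2 - (\sum X)^2}{N^2}
\end{aligned}
```

The computational form needs only $\sum X$ and $\sum X^2$, so individual deviations never have to be tabulated.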


Standard Deviation

• To “undo” the squaring of difference scores, take the square root of the variance.

• Return to original units rather than squared units.


Quantifying Uncertainty

• Standard deviation: measures the variation of a variable in the sample.

– Technically,

$s = \sqrt{\dfrac{1}{N-1}\sum_{i=1}^{N}(x_i - \bar{x})^2}$


Standard Deviation

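Rough measure of the average amount by which observations deviate on either side of the mean. The square root of the variance.

• Population

$\sigma = \sqrt{\sigma^2}$

$\sigma = \sqrt{\dfrac{\sum (X - \mu)^2}{N}}$

$\sigma = \sqrt{\dfrac{N\sum X^2 - (\sum X)^2}{N^2}}$

• Sample

$S = \sqrt{S^2}$

$S = \sqrt{\dfrac{\sum (X - \bar{X})^2}{n}}$

$S = \sqrt{\dfrac{n\sum X^2 - (\sum X)^2}{n^2}}$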


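Example:

Data: X = {6, 10, 5, 4, 9, 8}; N = 6

X           $X - \bar{X}$   $(X - \bar{X})^2$
6           -1              1
10           3              9
5           -2              4
4           -3              9
9            2              4
8            1              1
Total: 42                   Total: 28

Mean: $\bar{X} = \dfrac{\sum X}{N} = \dfrac{42}{6} = 7$

Variance: $s^2 = \dfrac{\sum (X - \bar{X})^2}{N} = \dfrac{28}{6} = 4.67$

Standard Deviation: $s = \sqrt{s^2} = \sqrt{4.67} = 2.16$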


Calculating a Mean and a Standard Deviation

Data    Deviation   Absolute Deviation   Squared Deviation
x       x - Mean    |x - Mean|           (x - Mean)²
10      -20         20                   400
20      -10         10                   100
30        0          0                     0
40       10         10                   100
50       20         20                   400

Sums    150     0   60   1000
Means    30     0   12    200

Variance = 200

Standard deviation = √Variance = 14.1421356


Example of SD with discrete data

• Marks achieved by 7 students: 3, 4, 6, 2, 8, 8, 5

• Mean of these marks = 36/7 = 5.14

• Deviations from mean…

x    $x - \bar{x}$
3    3 - 5.14 = -2.14
4    4 - 5.14 = -1.14
6    6 - 5.14 = 0.86
2    2 - 5.14 = -3.14
8    8 - 5.14 = 2.86
8    8 - 5.14 = 2.86
5    5 - 5.14 = -0.14
     Total = 0

Problem! The sum of the deviations is always going to be 0!

Solution! Square them to get rid of the negatives: $(x - \bar{x})^2$


Example of SD with discrete data

• Marks achieved by 7 students: 3, 4, 6, 2, 8, 8, 5

• Mean of these marks = 36/7 = 5.14

• Deviations from mean…

x    $x - \bar{x}$          $(x - \bar{x})^2$
3    3 - 5.14 = -2.14       4.59
4    4 - 5.14 = -1.14       1.31
6    6 - 5.14 = 0.86        0.73
2    2 - 5.14 = -3.14       9.88
8    8 - 5.14 = 2.86        8.16
8    8 - 5.14 = 2.86        8.16
5    5 - 5.14 = -0.14       0.02
     Total = 0              Total = 32.85

Variance = 32.85 / 7 = 4.69

SD = √4.69 = 2.17
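The steps of this example can be checked with a short Python sketch. It divides by the number of marks (7), as the slide does, and works with unrounded deviations, so the total of squares comes out near 32.86 rather than the rounded 32.85; the variable names are illustrative.

```python
import math

marks = [3, 4, 6, 2, 8, 8, 5]
n = len(marks)
mean = sum(marks) / n                        # 36 / 7

deviations = [x - mean for x in marks]
sum_of_deviations = sum(deviations)          # zero, up to floating-point rounding

squared = [d ** 2 for d in deviations]       # squaring removes the signs
sum_of_squares = sum(squared)

variance = sum_of_squares / n                # divide by n, as on the slide
sd = math.sqrt(variance)

print(f"mean = {mean:.2f}")
print(f"sum of deviations = {sum_of_deviations:.10f}")
print(f"variance = {variance:.2f}, SD = {sd:.2f}")
```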


Variability Example: Standard Deviation

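X        X²
3        9
4        16
4        16
4        16
6        36
7        49
7        49
8        64
8        64
9        81
Sum: 60  Sum: 400

Mean: 6

Definitional formula:

$S^2 = \dfrac{\sum (X - \bar{X})^2}{n} = \dfrac{(3-6)^2 + (4-6)^2 + (4-6)^2 + (4-6)^2 + (6-6)^2 + (7-6)^2 + (7-6)^2 + (8-6)^2 + (8-6)^2 + (9-6)^2}{10} = \dfrac{40}{10} = 4.0$

$S = \sqrt{4.0} = 2.0$

Computational formula:

$S^2 = \dfrac{n\sum X^2 - (\sum X)^2}{n^2} = \dfrac{10(400) - (60)^2}{10^2} = \dfrac{4000 - 3600}{100} = 4.0$

$S = \sqrt{4.0} = 2.0$

Standard Deviation: 2.0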
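A brief Python sketch can confirm that the definitional and computational formulas agree on this data set. Both versions divide by n, following the sample formulas used in this deck; the variable names are illustrative.

```python
import math

x = [3, 4, 4, 4, 6, 7, 7, 8, 8, 9]
n = len(x)

sum_x = sum(x)                          # 60
sum_x2 = sum(v * v for v in x)          # 400
mean = sum_x / n                        # 6.0

# Definitional formula: mean of squared deviations from the mean.
var_definitional = sum((v - mean) ** 2 for v in x) / n

# Computational formula: needs only the two sums.
var_computational = (n * sum_x2 - sum_x ** 2) / n ** 2

sd = math.sqrt(var_computational)

print(var_definitional, var_computational, sd)
```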


Mean and Standard Deviation

• Using the mean and standard deviation together:

– Is an efficient way to describe a distribution with just two numbers.

– Allows a direct comparison between distributions that are on different scales.


WHICH MEASURE TO USE?

DISTRIBUTION OF DATA IS SYMMETRIC: USE MEAN & S.D.

DISTRIBUTION OF DATA IS SKEWED: USE MEDIAN & QUARTILES
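One way to see the reasoning behind this rule is to compare how the mean and the median react to a single extreme value. The sketch below reuses the outlier data set from the Range slide and drops its largest value; treating 25 as the extreme value is an assumption made for illustration, and `centre` is a hypothetical helper.

```python
import statistics

skewed = [2, 4, 5, 2, 5, 6, 1, 6, 8, 25, 2]
trimmed = sorted(skewed)[:-1]   # same data without the extreme value 25

def centre(values):
    """Return (mean, median) of the values."""
    return sum(values) / len(values), statistics.median(values)

mean_all, median_all = centre(skewed)
mean_trimmed, median_trimmed = centre(trimmed)

print(f"with 25:    mean = {mean_all:.1f}, median = {median_all}")
print(f"without 25: mean = {mean_trimmed:.1f}, median = {median_trimmed}")
```

Dropping one value moves the mean from 6.0 to 4.1 but the median only from 5 to 4.5, which suggests why the median and quartiles are preferred for skewed data.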


Mean, Median and Mode


Distributions

• Bell-Shaped (also known as “symmetric” or “normal”)

• Skewed:

– positively (skewed to the right): it tails off toward larger values

– negatively (skewed to the left): it tails off toward smaller values


Descriptive Statistics

Variable: Age

Anderson-Darling Normality Test
A-Squared:      0.962
P-Value:        0.014

Mean            36.4500
StDev           15.7356
Variance        247.608
Skewness        0.679626
Kurtosis        8.51E-02
N               60

Minimum         11.0000
1st Quartile    25.0000
Median          31.5000
3rd Quartile    46.7500
Maximum         79.0000

95% Confidence Interval for Mu:      32.3851 to 40.5149
95% Confidence Interval for Sigma:   13.3380 to 19.1921
95% Confidence Interval for Median:  28.0000 to 42.0000


ANY QUESTIONS