Descriptive Statistics. Roger L. Brown, Ph.D., Medical Research Consulting, Middleton, WI. Online Course #1


Post on 12-Jan-2016


TRANSCRIPT

Page 1: Descriptive Statistics Roger L. Brown, Ph.D. Medical Research Consulting Middleton, WI Online Course #1

Descriptive Statistics

Roger L. Brown, Ph.D.

Medical Research Consulting

Middleton, WI

Online Course #1

Page 2:

This online course is a FREE service to all MRC clients

Page 3:

Purpose of this series

To assist researchers in the interpretation and application of statistical analyses

Page 4:

Statistics?

The science of collecting, organizing, analyzing, interpreting and presenting data

Page 5:

Topics we will review

• Descriptive Statistics

• Frequency Distributions and Histograms

Relative / Cumulative Frequency

• Measures of Central Tendency

Mean, Median, Mode, Midrange

Page 6:

Topics (continued)

• Measures of Dispersion (Variation)

Range, Standard Deviation, Variance and Coefficient of Variation

• Shape

Symmetric, Skewed, using Box-and-Whisker Plots

• Quartile

• Statistical Relationships

Correlation, Covariance

Page 7:

Descriptive Statistics

A collection of quantitative measures and ways of describing data. This includes frequency distributions and histograms, measures of central tendency, and measures of dispersion.

Page 8:

Descriptive Statistics

•Collect Data, e.g. Survey

•Present Data, e.g. Tables and Graphs

•Characterize Data, e.g. Mean: $\bar{x} = \frac{\sum x_i}{n}$

A characteristic of a population is a parameter; a characteristic of a sample is a statistic.

Page 9:

Collection of Data

• Survey/questionnaires/interviews

• Direct observation

• Secondary data source (e.g., medical charts)

Page 10:

Presenting Data: Graphics

The visual representation of data may be used not only to present results/findings in the data, but may also be used to learn about the data.

Page 11:

Summary Measures in Descriptive Statistics

• Central Tendency: Mean, Median, Mode, Midrange

• Quartile

• Variation: Range, Variance, Standard Deviation, Coefficient of Variation

Page 12:

Measures of Central Tendency

Central Tendency: Mean, Median, Mode, Midrange

Page 13:

The Mean (Arithmetic Average)

•It is the Arithmetic Average of data values:

$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}$ (Sample Mean)

•The Most Common Measure of Central Tendency

•Affected by Extreme Values (Outliers)

[Figure: two dot plots. Left, axis from 0 to 10: Mean = 5. Right, axis from 0 to 14: Mean = 6.]
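
The definition of the mean can be sketched in a few lines of Python. The two data sets below are hypothetical: the slide's actual data points are not recoverable from the transcript, so these values were chosen only because they give the same means (5 and 6) and show how one extreme value pulls the mean upward.

```python
def mean(values):
    """Arithmetic mean: the sum of the observations divided by their count n."""
    if not values:
        raise ValueError("mean requires at least one observation")
    return sum(values) / len(values)


# Hypothetical data (not from the slides), chosen to reproduce Mean = 5 and Mean = 6.
base = [1, 3, 5, 7, 9]
with_outlier = [1, 3, 5, 7, 14]  # largest value replaced by an extreme one

print(mean(base))          # 5.0
print(mean(with_outlier))  # 6.0
```

Changing a single observation from 9 to 14 moves the mean by a full unit, which is the sensitivity to outliers that the slide describes.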

Page 14:

The Median

•Important Measure of Central Tendency

•In an ordered array, the median is the “middle” number.

•If n is odd, the median is the middle number.

•If n is even, the median is the average of the 2 middle numbers.

•Not Affected by Extreme Values

[Figure: two dot plots. Left, axis from 0 to 10: Median = 5. Right, axis from 0 to 14: Median = 5.]
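
The odd/even rule for the median can be sketched as follows. The function name and the sample data are illustrative assumptions, not taken from the slides; the data were picked so that the median stays at 5 when the largest value becomes extreme.

```python
def median(values):
    """Middle value of the ordered data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median requires at least one observation")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # odd n: the single middle number
    return (ordered[mid - 1] + ordered[mid]) / 2   # even n: average of the 2 middle numbers


# Hypothetical data (not from the slides).
print(median([1, 3, 5, 7, 9]))    # 5
print(median([1, 3, 5, 7, 14]))   # 5, unchanged by the extreme value
print(median([1, 3, 5, 7]))       # 4.0
```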

Page 15:

The Mode

•A Measure of Central Tendency

•Value that Occurs Most Often

•Not Affected by Extreme Values

•There May Not be a Mode

•There May be Several Modes

•Used for Either Numerical or Categorical Data

[Figure: two dot plots. First, axis from 0 to 14: Mode = 9. Second, axis from 0 to 6: No Mode.]
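
A small sketch of the mode, covering the "no mode" and "several modes" cases from the bullets above. It assumes one common convention, that there is no mode when no value repeats; other texts define this case differently. The data sets are hypothetical.

```python
from collections import Counter


def modes(values):
    """Return every most-frequent value; an empty list means there is no mode."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []  # assumed convention: no value repeats, so no mode
    return [value for value, count in counts.items() if count == highest]


# Hypothetical data (not from the slides).
print(modes([0, 1, 2, 3, 4, 5, 6]))      # []
print(modes([2, 5, 9, 9, 9, 11, 14]))    # [9]
print(modes(["A", "B", "B", "A", "O"]))  # ['A', 'B'], works for categorical data too
```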

Page 16:

Midrange

•A Measure of Central Tendency

•Average of Smallest and Largest Observation:

$\text{Midrange} = \frac{x_{\text{largest}} + x_{\text{smallest}}}{2}$

•Affected by Extreme Value

[Figure: two dot plots, each on an axis from 0 to 10: Midrange = 5 in both.]
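
The midrange formula translates directly into code. The data below are hypothetical and are meant only to show that the result depends on nothing but the two extreme observations.

```python
def midrange(values):
    """Average of the smallest and largest observations."""
    if not values:
        raise ValueError("midrange requires at least one observation")
    return (min(values) + max(values)) / 2


# Hypothetical data (not from the slides).
print(midrange([0, 2, 5, 8, 10]))  # 5.0
print(midrange([0, 4, 4, 4, 10]))  # 5.0, interior values do not matter
print(midrange([0, 2, 5, 8, 14]))  # 7.0, one extreme value shifts it
```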

Page 17:

Summary Measures in Descriptive Statistics

• Central Tendency: Mean, Median, Mode, Midrange

• Quartile

• Variation: Range, Variance, Standard Deviation, Coefficient of Variation

Page 18:

Quartiles

• Not a Measure of Central Tendency

• Split Ordered Data into 4 Quarters (25% each) at Q1, Q2 and Q3

• Position of i-th Quartile: position of $Q_i = \frac{i(n+1)}{4}$

Data in Ordered Array: 11 12 13 16 16 17 18 21 22

Position of $Q_1 = \frac{1 \cdot (9 + 1)}{4} = 2.50$, midway between the 2nd and 3rd values (12 and 13), so $Q_1 = 12.5$

Page 19:

Quartiles

• Not a Measure of Central Tendency

• Split Ordered Data into 4 Quarters (25% each) at Q1, Q2 and Q3

• Position of i-th Quartile: position of $Q_i = \frac{i(n+1)}{4}$

Data in Ordered Array: 11 12 13 16 16 17 18 21 22

Position of $Q_3 = \frac{3 \cdot (9 + 1)}{4} = 7.50$, midway between the 7th and 8th values (18 and 21), so $Q_3 = 19.5$
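
The position rule i(n+1)/4 can be sketched as below, using the nine values from the quartile slides. Linear interpolation between neighbouring values is an assumption here: the slides only show positions that fall exactly halfway, and other textbooks round fractional positions differently.

```python
def quartile(values, i):
    """i-th quartile (i = 1, 2, 3) via the 1-based position i*(n+1)/4."""
    data = sorted(values)
    n = len(data)
    if n == 0:
        raise ValueError("quartile requires at least one observation")
    position = i * (n + 1) / 4   # 1-based position in the ordered array
    lower = int(position)        # index of the value just below the position
    fraction = position - lower
    if lower < 1:
        return data[0]
    if lower >= n:
        return data[-1]
    # Assumed rule: interpolate linearly between the two neighbouring values.
    return data[lower - 1] + fraction * (data[lower] - data[lower - 1])


data = [11, 12, 13, 16, 16, 17, 18, 21, 22]
print(quartile(data, 1))  # 12.5
print(quartile(data, 2))  # 16.0
print(quartile(data, 3))  # 19.5
```

In Python 3.8 and later, `statistics.quantiles(data, n=4, method="exclusive")` follows the same (n+1) position rule and can serve as a cross-check.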

Page 20:

Summary Measures

• Central Tendency: Mean, Median, Mode, Midrange

• Quartile

• Variation: Range, Variance, Standard Deviation, Coefficient of Variation

Page 21:

Measures of Dispersion (Variation)

• Range

• Variance: Population Variance, Sample Variance

• Standard Deviation: Population Standard Deviation, Sample Standard Deviation

• Coefficient of Variation

Page 22:

Understanding Variation

• The more spread out or dispersed the data, the larger the measures of variation

• The more concentrated or homogeneous the data, the smaller the measures of variation

• If all observations are equal, the measures of variation are zero

• All measures of variation are nonnegative

Page 23:

The Range

• Measure of Variation

• Difference Between Largest & Smallest Observations:

Range = $x_{\text{largest}} - x_{\text{smallest}}$

• Ignores How Data Are Distributed:

[Figure: two dot plots, each on an axis from 7 to 12: Range = 12 - 7 = 5 in both.]
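
A minimal sketch of the range. The two data sets are hypothetical; they share the same smallest and largest values (7 and 12) but are spread differently in between, which the range cannot detect.

```python
def value_range(values):
    """Difference between the largest and smallest observations."""
    if not values:
        raise ValueError("range requires at least one observation")
    return max(values) - min(values)


# Hypothetical data (not from the slides).
evenly_spread = [7, 8, 9, 10, 11, 12]
clustered = [7, 10, 10, 10, 10, 12]

print(value_range(evenly_spread))  # 5
print(value_range(clustered))      # 5
```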

Page 24:

Variance

•Important Measure of Variation

•Shows Variation About the Mean:

•For the Population: $\sigma^2 = \frac{\sum (X_i - \mu)^2}{N}$

•For the Sample: $s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$

For the Population: use N in the denominator.

For the Sample: use n - 1 in the denominator.
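
The two denominators can be compared side by side in a short sketch. The function names are illustrative, and the data are the eight values the course uses later in its standard deviation example.

```python
def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    n = len(values)
    if n == 0:
        raise ValueError("population variance requires at least one observation")
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance requires at least two observations")
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


data = [10, 12, 14, 15, 17, 18, 18, 24]  # mean = 16, squared deviations sum to 130
print(population_variance(data))         # 16.25
print(round(sample_variance(data), 4))   # 18.5714
```

Dividing by n - 1 instead of N always gives the larger value, and the gap shrinks as the number of observations grows.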

Page 25:

Standard Deviation

•Most Important Measure of Variation

•Shows Variation About the Mean:

•For the Population: $\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}}$

•For the Sample: $s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}}$

For the Population: use N in the denominator.

For the Sample: use n - 1 in the denominator.

Page 26:

Sample Standard Deviation

$s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}}$

For the Sample: use n - 1 in the denominator.

Data $X_i$: 10 12 14 15 17 18 18 24

n = 8, Mean = 16

$s = \sqrt{\frac{(10-16)^2 + (12-16)^2 + (14-16)^2 + (15-16)^2 + (17-16)^2 + (18-16)^2 + (18-16)^2 + (24-16)^2}{8 - 1}} = \sqrt{\frac{130}{7}} = 4.3095$

Page 27:

Comparing Standard Deviations

Data $X_i$: 10 12 14 15 17 18 18 24

N = 8, Mean = 16

Sample: $s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}} = \sqrt{\frac{130}{7}} = 4.3095$

Population: $\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}} = \sqrt{\frac{130}{8}} = 4.0311$

Value for the Standard Deviation is larger for data considered as a Sample.
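
The comparison can be checked with Python's standard `statistics` module, where `stdev` divides by n - 1 and `pstdev` divides by N. Summing the squared deviations of all eight listed values gives 130, so the results come out as roughly 4.3095 for the sample and 4.0311 for the population.

```python
import statistics

data = [10, 12, 14, 15, 17, 18, 18, 24]

sample_sd = statistics.stdev(data)       # denominator n - 1
population_sd = statistics.pstdev(data)  # denominator N

print(statistics.mean(data))    # 16
print(round(sample_sd, 4))      # 4.3095
print(round(population_sd, 4))  # 4.0311
```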

Page 28:

Comparing Standard Deviations

[Figure: three dot plots on a common axis from 11 to 21.]

Data A: Mean = 15.5, s = 3.338

Data B: Mean = 15.5, s = .9258

Data C: Mean = 15.5, s = 4.57

Page 29:

Coefficient of Variation

Measure of Relative Variation

Always a %

Shows Variation Relative to Mean

Used to Compare 2 or More Groups

Formula (for Sample): $CV = \left(\frac{S}{\bar{X}}\right) \cdot 100\%$

Page 30:

Comparing Coefficient of Variation

Group A: Average Health Measure = 50, Standard Deviation = 5

Group B: Average Health Measure = 100, Standard Deviation = 5

Coefficient of Variation: $CV = \left(\frac{S}{\bar{X}}\right) \cdot 100\%$

Group A: CV = 10%

Group B: CV = 5%
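
The group comparison can be reproduced with a one-line formula. The function name is an illustrative choice; the inputs are the means and standard deviations given for Groups A and B.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation expressed as a percentage of the mean."""
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a mean of zero")
    return 100 * std_dev / mean


print(coefficient_of_variation(5, 50))   # 10.0 (Group A)
print(coefficient_of_variation(5, 100))  # 5.0  (Group B)
```

Both groups have the same standard deviation, yet Group A varies twice as much relative to its mean.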

Page 31:

Shape

• Describes How Data Are Distributed

• Measures of Shape: Symmetric or skewed

Page 32:

Shape

• Describes How Data Are Distributed

• Measures of Shape: Symmetric or skewed

Symmetric: Mean = Median = Mode; skewness between -0.5 and 0.5

Page 33:

Shape

• Describes How Data Are Distributed

• Measures of Shape: Symmetric or skewed

Left-Skewed: Mean < Median < Mode; skewness < -1

Symmetric: Mean = Median = Mode; skewness between -0.5 and 0.5

Page 34:

Shape

• Describes How Data Are Distributed

• Measures of Shape: Symmetric or skewed

Left-Skewed (Negatively Skewed): Mean < Median < Mode; skewness < -1

Symmetric: Mean = Median = Mode; skewness between -0.5 and 0.5

Right-Skewed (Positively Skewed): Mode < Median < Mean; skewness > 1
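
The cut-offs on this slide (below -1, between -0.5 and 0.5, above 1) read like thresholds on a skewness coefficient, although the transcript does not say which formula the course uses. The sketch below assumes the moment-based coefficient, the third central moment divided by the 1.5 power of the second, and uses hypothetical data.

```python
def skewness(values):
    """Moment-based skewness: m3 / m2**1.5 (assumed formula, population form)."""
    n = len(values)
    if n == 0:
        raise ValueError("skewness requires at least one observation")
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    if m2 == 0:
        raise ValueError("skewness is undefined when all observations are equal")
    return m3 / m2 ** 1.5


# Hypothetical data (not from the slides).
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]          # long tail to the right
left_skewed = [-x for x in right_skewed]    # mirror image, long tail to the left

for name, data in [("symmetric", symmetric),
                   ("right-skewed", right_skewed),
                   ("left-skewed", left_skewed)]:
    print(name, round(skewness(data), 2))
```

The symmetric set gives a coefficient of zero, while the two skewed sets fall above 1 and below -1 respectively.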

Page 35:

Box-and-Whisker Plot

Graphical Display of Data Using 5-Number Summary: $X_{\text{smallest}}$, $Q_1$, Median, $Q_3$, $X_{\text{largest}}$

[Figure: box-and-whisker plot on an axis from 4 to 12.]
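
The five numbers behind a box-and-whisker plot can be computed as sketched below. This assumes Python 3.8 or later and uses `statistics.quantiles` with `method="exclusive"`, which applies the (n+1) position rule from the quartile slides. The data are the nine values from those slides, not the data behind the plot shown here.

```python
import statistics


def five_number_summary(values):
    """Return (smallest, Q1, median, Q3, largest) for the given observations."""
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two observations")
    # method="exclusive" places the i-th quartile at position i*(n+1)/4.
    q1, med, q3 = statistics.quantiles(data, n=4, method="exclusive")
    return data[0], q1, med, q3, data[-1]


data = [11, 12, 13, 16, 16, 17, 18, 21, 22]
print(five_number_summary(data))  # (11, 12.5, 16.0, 19.5, 22)
```

The box spans Q1 to Q3 with a line at the median, and the whiskers extend to the smallest and largest values.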

Page 36:

Distribution Shape & Box-and-Whisker Plots

[Figure: three box-and-whisker plots, labeled Left-Skewed, Symmetric and Right-Skewed, each marking Q1, Median and Q3.]

Page 37:

Summary

• Discussed Measures of Central Tendency: Mean, Median, Mode, Midrange

• Quartiles

• Addressed Measures of Variation: The Range, Interquartile Range, Variance, Standard Deviation, Coefficient of Variation

• Determined Shape of Distributions: Symmetric, Skewed, Box-and-Whisker Plot

Left-Skewed: Mean < Median < Mode. Symmetric: Mean = Median = Mode. Right-Skewed: Mode < Median < Mean.