Section 1 Topic 3: Summarising metric data: Median, IQR, and boxplots


Upload: magdalen-short

Post on 01-Jan-2016


Page 1

Section 1 Topic 3

Summarising metric data: Median, IQR, and boxplots

Page 2

Summarising metric data: Median, IQR & Box Plots

Can we describe a distribution with just one or two numbers?

What is the median, how is it calculated and what does it tell us?

What is the interquartile range, how is it calculated and what does it tell us?

What is a five number summary?

What is a box plot and why is it useful?

Page 3

Will less than the whole picture do?

Summary Statistics

Measures of centre: Median, Mean

Measures of spread: Range, Interquartile Range, Standard Deviation

Page 4

Median

3 5 1 4 8

Firstly numerically order the data set

1 3 4 5 8

50% higher than or equal to median

50% lower than or equal to median

Location of Median = (n+1)/2

= (5+1)/2

= 3rd observation

Notes p.97

Page 5

For an odd number of data values the median will be one of the data values

1 3 4 5 8

Median = 4

For an even number of data values the median may not coincide with an actual data value

3 4 5 8

Median = 4.5

Location of Median = (4+1)/2

= (5)/2

= 2.5, i.e. halfway between the 2nd and 3rd observations, so Median = (4 + 5)/2 = 4.5
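The location rule above can be sketched in Python. This is only an illustration: the function name median_by_location is made up for this example, and it assumes a fractional location such as 2.5 means averaging the two neighbouring ordered values, as in the 3 4 5 8 example.

```python
def median_by_location(values):
    """Median via the location rule: position (n + 1) / 2 in the ordered data."""
    ordered = sorted(values)            # first, numerically order the data set
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    location = (n + 1) / 2              # 1-based position of the median
    lower = int(location)               # whole part of the location
    if location == lower:
        # odd n: the median is an actual data value
        return ordered[lower - 1]
    # even n: average the two observations either side of the location
    return (ordered[lower - 1] + ordered[lower]) / 2


print(median_by_location([3, 5, 1, 4, 8]))  # 4
print(median_by_location([3, 4, 5, 8]))     # 4.5
```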

Page 6

Limitations of the range: it depends on only the two extreme values.

Data set 1: 5 6 7 8 9 10 11 12

Data set 2: 5 12 12 12 12 12 12 12

Range = 12 - 5 = 7 for both data sets, even though the values in between are spread very differently.
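The two data sets on this slide can be checked with a short Python sketch. The helper name value_range is an assumption made for the example. It shows that the range cannot tell the two data sets apart.

```python
data_set_1 = [5, 6, 7, 8, 9, 10, 11, 12]
data_set_2 = [5, 12, 12, 12, 12, 12, 12, 12]


def value_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)


print(value_range(data_set_1))  # 7
print(value_range(data_set_2))  # 7
```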

Page 7

Interquartile range

Quartiles are the points that divide a distribution into quarters

Q1 = 25% point, Q2 = 50% point (the median), Q3 = 75% point

The interquartile range (IQR) is defined to be the spread of the middle 50% of data values, so that

IQR = Q3 - Q1

Notes p.99
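As a rough illustration of IQR = Q3 - Q1, the sketch below uses Python's standard statistics.quantiles with method="inclusive". This is an assumption: several quartile conventions exist, and the hand method in the course notes may give slightly different Q1 and Q3 for the same data. The data are the eight values 5 to 12 from the range slide.

```python
from statistics import quantiles


def iqr(values):
    """Interquartile range: spread of the middle 50% of the data."""
    # quantiles(..., n=4) returns the three cut points [Q1, Q2, Q3]
    q1, _q2, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


data = [5, 6, 7, 8, 9, 10, 11, 12]
print(iqr(data))  # 3.5 under this quartile convention
```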

Page 8

Why is the IQR more useful than the range?

IQR describes the middle 50% of observations.

Upper 25% and lower 25% of observations are discarded.

IQR generally not affected by outliers.
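A small Python sketch of this point, under stated assumptions: the second data set is hypothetical (the largest value of 5 to 12 replaced by 100), and quartiles come from statistics.quantiles with method="inclusive", which may differ from the convention in the course notes.

```python
from statistics import quantiles


def spread_summary(values):
    """Return the range and the IQR of a data set."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    return {"range": max(values) - min(values), "iqr": q3 - q1}


typical = [5, 6, 7, 8, 9, 10, 11, 12]
with_outlier = [5, 6, 7, 8, 9, 10, 11, 100]  # hypothetical: 12 replaced by 100

print(spread_summary(typical))       # {'range': 7, 'iqr': 3.5}
print(spread_summary(with_outlier))  # {'range': 95, 'iqr': 3.5}
```

The single extreme value changes the range from 7 to 95, while the IQR is unchanged because it ignores the top and bottom quarters of the data.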

Page 9

Picturing quartiles with histogram

[Figure: histogram with a Frequency axis from 0 to 14; Q1, Q2 and Q3 split the data into the bottom 25%, the middle 50% and the top 25%]

Notes p.97

Page 10

Five number summary

Minimum value, Q1, Median, Q3, Maximum value
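A minimal Python sketch of the five number summary, applied to the data 1 3 4 5 8 from the median slides. The function name is made up for the example, and the quartile convention (statistics.quantiles with method="inclusive") is an assumption that may not match the course notes.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return (min(values), q1, median, q3, max(values))


print(five_number_summary([1, 3, 4, 5, 8]))  # (1, 3.0, 4.0, 5.0, 8)
```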

Page 11

The Boxplot

Graphical representation of the five number summary

Notes p.98

Page 12

Constructing a Boxplot

Notes p.99

Page 13

*Exercise 4

Notes p.103

Page 14

Relating a boxplot to the shape of the distribution: Symmetric

[Figure: boxplot of a symmetric distribution, with Q1, M and Q3 marked]

For a symmetric distribution, the box plot is also symmetric. The median is in the middle of the box and the whiskers are approximately equal in length.

Notes p.104

Page 15

Positively skewed distributions

[Figure: boxplot of a positively skewed distribution, with Q1, M and Q3 marked]

The box plot of a positively skewed distribution has the median off-centre and to the left. The left hand whisker will be short, while the right hand whisker will be long, reflecting the gradual tailing off of data values to the right.

Page 16

Negatively skewed distributions

[Figure: boxplot of a negatively skewed distribution, with Q1, M and Q3 marked]

The box plot of a negatively skewed distribution has the median off-centre and to the right. The right hand whisker will be short, while the left hand whisker will be long, reflecting the gradual tailing off of data values to the left.
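The shape descriptions on the symmetric, positively skewed and negatively skewed slides can be turned into a rough check on a five number summary. This is a hypothetical heuristic written for illustration, not a method from the course notes: it reports skew only when the whiskers and the two halves of the box point the same way, and the first and third summaries below are invented examples.

```python
def describe_shape(minimum, q1, median, q3, maximum):
    """Rough shape guess from a five number summary (illustrative only)."""
    left_whisker, right_whisker = q1 - minimum, maximum - q3
    left_box, right_box = median - q1, q3 - median
    # Positive skew: median sits left in the box and the right whisker is longer
    if right_whisker > left_whisker and right_box > left_box:
        return "positive skew"
    # Negative skew: median sits right in the box and the left whisker is longer
    if left_whisker > right_whisker and left_box > right_box:
        return "negative skew"
    return "roughly symmetric or unclear"


print(describe_shape(1, 2, 3, 6, 12))      # positive skew
print(describe_shape(38, 63, 70, 75, 76))  # negative skew
print(describe_shape(0, 2, 4, 6, 8))       # roughly symmetric or unclear
```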

Page 17

Boxplot with outliers

Possible outliers are defined as any values outside of the interval

(Q1 - 1.5 × IQR, Q3 + 1.5 × IQR)

We say possible, since the point may just be part of the tail of the distribution, but we may not have enough data to be sure.

Notes p.101
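A short Python sketch of the 1.5 × IQR rule applied to raw data. The sample values are hypothetical, the function name is made up for the example, and the quartiles use statistics.quantiles with method="inclusive", which is an assumption about the convention.

```python
from statistics import quantiles


def possible_outliers(values):
    """Values outside (Q1 - 1.5 * IQR, Q3 + 1.5 * IQR)."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]


sample = [2, 10, 11, 12, 13, 14, 15, 16, 30]  # hypothetical data
print(possible_outliers(sample))  # [2, 30]
```

Here Q1 = 11 and Q3 = 15, so IQR = 4 and the interval is (5, 21); the values 2 and 30 fall outside it and are flagged as possible outliers.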

Page 18

Boxplot with outliers

Min Q1 M Q3 Max

38 63 70 75 76
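The five number summary on this slide can be checked against the interval (Q1 - 1.5 × IQR, Q3 + 1.5 × IQR) directly. The variable names below are chosen for this sketch only.

```python
minimum, q1, median, q3, maximum = 38, 63, 70, 75, 76

iqr = q3 - q1                  # 75 - 63 = 12
lower_fence = q1 - 1.5 * iqr   # 63 - 18 = 45.0
upper_fence = q3 + 1.5 * iqr   # 75 + 18 = 93.0

print(minimum < lower_fence)   # True: 38 is a possible outlier
print(maximum > upper_fence)   # False: 76 lies inside the interval
```

The minimum of 38 is below 45, so it would be drawn as a possible outlier rather than as the end of the left hand whisker.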

Page 19

*Exercise 5

Notes p.107