statisticsforbiologists colstons


Upload: andymartin

Post on 20-Jun-2015


Category:

Technology



TRANSCRIPT

Page 1: Statisticsforbiologists colstons

BIOLOGY


Page 2: Statisticsforbiologists colstons

Introduction

• Biological studies deal with organisms which show variety

• We cannot rely on a single measurement and so we must take a sample

• This sample of data must be summarised and analysed to find out if it is reliable


Page 3: Statisticsforbiologists colstons

Summarising data

• MEAN Sum of samples ÷ sample size

$\bar{x} = \sum x \div n$

• MEDIAN Middle number in a list when arranged in rank order: 2, 5, 7, 7, 8, 23, 31

• MODE The measurement which occurs most frequently: 2, 5, 7, 7, 8, 23, 31
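As a quick check, the three summaries can be computed for the example list on this slide. This is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative only.

```python
import statistics

# The example list used on the slide, already in rank order
sample = [2, 5, 7, 7, 8, 23, 31]

mean = statistics.mean(sample)      # sum of the values divided by the sample size
median = statistics.median(sample)  # middle value when arranged in rank order
mode = statistics.mode(sample)      # value that occurs most frequently

print("mean:", mean)
print("median:", median)
print("mode:", mode)
```

Note that the two large values (23 and 31) pull the mean well above the median and mode, which both equal 7.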


Page 4: Statisticsforbiologists colstons

Distribution Curves

• A visual summary of data

• They can be produced as follows:

1. Collect data

2. Split results into equal size classes

3. Make a tally chart

4. Plot a histogram of frequency against size class

• Data can show normal distribution or skewed distribution
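The four steps above can be sketched in a few lines. The measurements, the class width of 10 and the helper name `size_class` are all assumptions made for illustration; they do not come from the slides.

```python
from collections import Counter

# Hypothetical leaf lengths in mm (illustrative data only)
lengths = [12, 15, 17, 18, 21, 22, 22, 23, 24, 26, 27, 31, 33, 38]
class_width = 10  # assumed size of each class


def size_class(value, width):
    """Return the lower bound of the equal-size class that contains value."""
    return (value // width) * width


# Tally chart: count how many measurements fall in each size class
tally = Counter(size_class(v, class_width) for v in lengths)

# Text histogram of frequency against size class
for lower in sorted(tally):
    upper = lower + class_width - 1
    print(f"{lower:>3}-{upper:<3} {'|' * tally[lower]} ({tally[lower]})")
```

A plotting library could replace the text histogram; the tallying step stays the same.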


Page 5: Statisticsforbiologists colstons

Distribution curves

• Normal distribution

• Symmetrical bell shaped curve around the mean

• Use parametric tests to analyse data

[Figure: distribution curve for normally distributed data]

Page 6: Statisticsforbiologists colstons

Distribution curves

• Skewed data

• Asymmetrical curve around the mode

• Use non-parametric tests to analyse data

[Figure: distribution curve for skewed data]

Page 7: Statisticsforbiologists colstons

Standard Deviation

• Standard deviation (SD) is a measure of the spread of the data

[Figure: two distribution curves, one labelled "Large SD" and one labelled "Small SD"]

Page 8: Statisticsforbiologists colstons

Standard deviation

• A high SD indicates data which shows great variation from the mean

• A low SD indicates data which shows little variation from the mean value

• For normally distributed data, about 68% of all values lie within the range mean ± 1 SD

• About 95% of all values lie within the range mean ± 2 SD
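The 68% and 95% figures are properties of the normal distribution and can be checked numerically. This is a small sketch using `statistics.NormalDist` from the Python standard library; the function name `fraction_within` is illustrative.

```python
from statistics import NormalDist


def fraction_within(k_sd):
    """Fraction of a normal distribution lying within mean ± k_sd standard deviations."""
    standard = NormalDist(mu=0, sigma=1)
    return standard.cdf(k_sd) - standard.cdf(-k_sd)


print(f"within 1 SD: {fraction_within(1):.1%}")
print(f"within 2 SD: {fraction_within(2):.1%}")
```

The results are close to, but not exactly, 68% and 95%, and they only apply when the data really are normally distributed.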


Page 9: Statisticsforbiologists colstons

SD and confidence limits

[Figure: distribution curve with the central 68% and 95% of values marked]

Page 10: Statisticsforbiologists colstons

Calculating SD

• Can only be used for normally distributed data

• Calculate as follows:

– Sum the values for $x^2$, i.e. $\sum x^2$

– Sum the values for x, then square it, i.e. $(\sum x)^2$

– Divide $(\sum x)^2$ by n

– Take one from the other and divide by n

– Take the square root of this (see hand-out)


Page 11: Statisticsforbiologists colstons

Calculating SD

$$s = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n}}$$
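The calculation can be followed step by step in code. This sketch reuses the example list from the "Summarising data" slide and divides by n, as the slides do; many textbooks and calculators divide by n − 1 for a sample, which gives a slightly larger value. The function name is illustrative.

```python
import math
import statistics


def standard_deviation(values):
    """SD as described on the slides: sqrt((sum of x^2 - (sum of x)^2 / n) / n)."""
    n = len(values)
    sum_of_squares = sum(x * x for x in values)  # sum the values for x squared
    square_of_sum = sum(values) ** 2             # sum the values for x, then square it
    variance = (sum_of_squares - square_of_sum / n) / n
    return math.sqrt(variance)


sample = [2, 5, 7, 7, 8, 23, 31]
sd = standard_deviation(sample)
print(round(sd, 2))

# Cross-check against the standard library's divide-by-n standard deviation
print(round(statistics.pstdev(sample), 2))
```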

Page 12: Statisticsforbiologists colstons

Confidence limits

• About 95% of all values lie within the range mean ± 2 SD

• Any value which lies outside this range is said to be significantly different from the others

• We say that we are working to 95% confidence limits or to a 5% significance level.
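As an illustration of working to these limits, the sketch below finds the range mean ± 2 SD for a set of measurements and lists any values outside it. The data are hypothetical and the function name is an assumption; the SD is calculated by dividing by n, as elsewhere in these slides.

```python
import statistics


def values_outside_limits(values, k_sd=2):
    """Return (lower limit, upper limit, values lying outside mean ± k_sd * SD)."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # divide-by-n standard deviation
    lower, upper = mean - k_sd * sd, mean + k_sd * sd
    outside = [v for v in values if v < lower or v > upper]
    return lower, upper, outside


# Hypothetical measurements with one unusually large value
measurements = [10, 11, 9, 10, 12, 10, 11, 9, 10, 30]
lower, upper, outside = values_outside_limits(measurements)

print(f"limits: {lower:.2f} to {upper:.2f}")
print("outside the limits:", outside)
```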


Page 13: Statisticsforbiologists colstons

Comparison tests

• To compare two samples of data we look at the overlap between the two distribution curves.

• This depends on:

– The distance between the two mean values

– The spread of each sample (standard deviation)

• The greater the overlap, the more similar the two samples are.


Page 14: Statisticsforbiologists colstons

Comparison tests

[Figure: two overlapping distribution curves, Sample 1 and Sample 2, with the mean of each and the region of overlap marked]

Page 15: Statisticsforbiologists colstons

Comparison tests

When the SD is small, the overlap is less:

[Figure: two distribution curves, Sample 1 and Sample 2, with the region of overlap marked]

Page 16: Statisticsforbiologists colstons

The null hypothesis

• In order to compare two sets of data we must first assume that there is no difference between them.

• This is called the null hypothesis

• We must also produce an alternative hypothesis which states that there is a difference.


Page 17: Statisticsforbiologists colstons

The t-test

• Used to compare the overlap of two sets of data

• Samples must show normal distribution

• Sample size (n) should be greater than 30

• This tests for differences between two sets of data


Page 18: Statisticsforbiologists colstons

The t-test

• To calculate t:

– Check data is normally distributed by drawing a tally chart

– Work out the difference in means $|\bar{x}_1 - \bar{x}_2|$

– For each set of data, calculate the variance $s^2$ and divide it by the sample size ($s^2 \div n$)

– Put these into the equation for t:


Page 19: Statisticsforbiologists colstons

The t-test

$$t = \frac{|\bar{x}_1 - \bar{x}_2|}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
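A minimal sketch of the calculation, assuming $s^2$ is the variance worked out by dividing by n as on the "Calculating SD" slides (statistical software usually divides by n − 1 instead). The two samples are hypothetical and far smaller than the recommended n > 30; they are only there to keep the arithmetic easy to follow.

```python
import math
import statistics


def t_statistic(sample1, sample2):
    """t = |mean1 - mean2| / sqrt(s1^2 / n1 + s2^2 / n2)."""
    mean_difference = abs(statistics.mean(sample1) - statistics.mean(sample2))
    term1 = statistics.pvariance(sample1) / len(sample1)  # s1^2 / n1
    term2 = statistics.pvariance(sample2) / len(sample2)  # s2^2 / n2
    return mean_difference / math.sqrt(term1 + term2)


# Hypothetical samples (illustrative only)
group_a = [1, 2, 3, 4, 5]
group_b = [3, 4, 5, 6, 7]

t = t_statistic(group_a, group_b)
print(round(t, 3))
```

Here the means differ by 2 and each variance is 2, so t = 2 ÷ √0.8, or about 2.236.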

Page 20: Statisticsforbiologists colstons

The t-test

• Compare the value of t with the critical value at n1 + n2 – 2 degrees of freedom

• Use a probability value of 5%

• If t is greater than the critical value we can reject the null hypothesis…

• … there is a significant difference between the two sets of data

• … there is less than a 5% chance that a difference this large arose by chance alone

Page 21: Statisticsforbiologists colstons

Mann-Whitney U-test

• Compares two sets of data

• Data can be skewed

• Sample size can be small; 5<n<30

• For details refer to stats book


Page 22: Statisticsforbiologists colstons

Chi squared

• Some data is categoric

• This means that it belongs to one or more categories

• Examples include

– eye colour

– presence or absence data

– texture of seeds

• For these we use a chi squared test ($\chi^2$)

• This tests for an association between two or more variables

Page 23: Statisticsforbiologists colstons

Chi squared

• Draw a contingency table

• These are the observed values

               Blue eyes   Green eyes   Row totals
Fair hair      a           b            a+b
Ginger hair    c           d            c+d
Column totals  a+c         b+d          a+b+c+d

Page 24: Statisticsforbiologists colstons

Chi squared

• Now work out the expected values:

• Where,

$$E = \frac{(\text{Row total}) \times (\text{Column total})}{(\text{Grand total})}$$

Page 25: Statisticsforbiologists colstons

Chi squared

               Blue eyes              Green eyes             Row totals
Fair hair      (a+b)(a+c)/(a+b+c+d)   (a+b)(b+d)/(a+b+c+d)   a+b
Ginger hair    (c+d)(a+c)/(a+b+c+d)   (c+d)(b+d)/(a+b+c+d)   c+d
Column totals  a+c                    b+d                    a+b+c+d

Page 26: Statisticsforbiologists colstons

Chi squared

• For each box work out $(O - E)^2 \div E$

• Find the sum of these to get $\chi^2$

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
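The expected values and the sum can be worked through for a small contingency table. The observed counts below are hypothetical, and the function name is illustrative; the layout follows the hair colour by eye colour table used on the earlier slides.

```python
# Hypothetical observed counts: rows are hair colour, columns are eye colour
observed = [[20, 10],   # fair hair:   blue eyes, green eyes
            [10, 20]]   # ginger hair: blue eyes, green eyes


def chi_squared(table):
    """Sum of (O - E)^2 / E, with E = row total x column total / grand total."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)

    total = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * column_totals[j] / grand_total  # expected value
            total += (o - e) ** 2 / e
    return total


chi2 = chi_squared(observed)
print(round(chi2, 2))
```

Every expected value here is 30 × 30 ÷ 60 = 15, so each box contributes 25 ÷ 15 and the total is about 6.67.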

Page 27: Statisticsforbiologists colstons

Chi squared

• Compare $\chi^2$ with the critical value at the 5% significance level

• There will be (no. rows – 1) x (no. columns – 1) degrees of freedom

• If $\chi^2$ is greater than the critical value we can say that the variables are associated with one another in some way

• We reject the null hypothesis

Page 28: Statisticsforbiologists colstons

Spearman Rank

• Two sets of data may show a correlation

• The data can be plotted on a scatter graph:

[Figure: three scatter graphs showing positive correlation, no correlation and negative correlation]

Page 29: Statisticsforbiologists colstons

Spearman Rank

• We calculate the correlation by assigning a rank to the values:

Data 1   Rank     Data 2   Rank
12                24
14                29
18                29
18                38

Page 30: Statisticsforbiologists colstons

Spearman Rank

• We calculate the correlation by assigning a rank to the values:

Data 1   Rank     Data 2   Rank
12       1        24
14                29
18                29
18                38

In Data 1, 12 is the lowest value, so we call it rank 1

Page 31: Statisticsforbiologists colstons

Spearman Rank

• We calculate the correlation by assigning a rank to the values:

Data 1   Rank     Data 2   Rank
12       1        24
14       2        29
18                29
18                38

In Data 1, 14 is the 2nd lowest value, so we call it rank 2

Page 32: Statisticsforbiologists colstons

Spearman Rank

• We calculate the correlation by assigning a rank to the values:

Data 1   Rank     Data 2   Rank
12       1        24
14       2        29
18       ?        29
18       ?        38

The two values of 18 should be ranks 3 and 4, but they are the same. We find the average of 3 and 4 and give both of them this rank

Page 33: Statisticsforbiologists colstons

Spearman Rank

• We calculate the correlation by assigning a rank to the values:

Data 1   Rank     Data 2   Rank
12       1        24
14       2        29
18       3.5      29
18       3.5      38

(3+4)/2 = 3.5

Page 34: Statisticsforbiologists colstons

Spearman Rank

• We calculate the correlation by assigning a rank to the values:

Data 1   Rank     Data 2   Rank
12       1        24
14       2        29
18       3.5      29
18       3.5      38

Similarly on this side (Data 2)

Page 35: Statisticsforbiologists colstons

Spearman Rank

• We calculate the correlation by assigning a rank to the values:

Data 1   Rank     Data 2   Rank
12       1        24       1
14       2        29
18       3.5      29
18       3.5      38

Page 36: Statisticsforbiologists colstons

Spearman Rank

• We calculate the correlation by assigning a rank to the values:

Data 1   Rank     Data 2   Rank
12       1        24       1
14       2        29       2.5
18       3.5      29       2.5
18       3.5      38

The two values of 29 are given the average of ranks 2 and 3

Page 37: Statisticsforbiologists colstons

Spearman Rank

• We calculate the correlation by assigning a rank to the values:

Data 1   Rank     Data 2   Rank
12       1        24       1
14       2        29       2.5
18       3.5      29       2.5
18       3.5      38       4
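The ranking rule built up over the last few slides (lowest value is rank 1, tied values share the average of the ranks they would have occupied) can be sketched as a short function. The name `average_ranks` is illustrative; the two lists are the Data 1 and Data 2 columns from the slides.

```python
def average_ranks(values):
    """Rank values from lowest (rank 1) upwards, giving tied values their average rank."""
    ordered = sorted(values)
    ranks = []
    for value in values:
        first_rank = ordered.index(value) + 1  # rank of the first occurrence (1-based)
        ties = ordered.count(value)            # how many times this value occurs
        # Tied values occupy ranks first_rank .. first_rank + ties - 1; take their mean
        ranks.append(first_rank + (ties - 1) / 2)
    return ranks


data_1 = [12, 14, 18, 18]
data_2 = [24, 29, 29, 38]

print(average_ranks(data_1))
print(average_ranks(data_2))
```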

Page 38: Statisticsforbiologists colstons

Spearman Rank

• Find the difference D between each rank

• Square this difference

• Sum the $D^2$ values

• Calculate the Spearman Rank Correlation Coefficient $r_s$

$$r_s = 1 - \frac{6 \sum D^2}{n(n^2 - 1)}$$
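Applying the formula to the ranks from the worked example gives a concrete value. This sketch assumes the Data 1 and Data 2 values are paired row by row, which the slides imply but do not state; the function name is illustrative. With tied ranks the formula is only approximate, so treat the result as a demonstration of the arithmetic.

```python
def spearman_rs(ranks_1, ranks_2):
    """r_s = 1 - 6 * sum(D^2) / (n * (n^2 - 1)), where D is the difference in ranks."""
    n = len(ranks_1)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks_1, ranks_2))
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))


# Ranks taken from the worked example on the previous slides
ranks_data_1 = [1, 2, 3.5, 3.5]
ranks_data_2 = [1, 2.5, 2.5, 4]

rs = spearman_rs(ranks_data_1, ranks_data_2)
print(round(rs, 2))
```

The differences D are 0, −0.5, 1 and −0.5, so the sum of $D^2$ is 1.5 and $r_s$ = 1 − 9 ÷ 60 = 0.85.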

Page 39: Statisticsforbiologists colstons

Spearman Rank

• Compare rs with the critical value at the 5% level

• If it is greater than the critical value (ignoring the sign) then we reject the null hypothesis

• … there is a significant correlation between the two sets of data

• If the value is positive there is a positive correlation

• If it is negative then there is a negative correlation

Page 40: Statisticsforbiologists colstons

Quick guide

Is your data interval data or is it categoric data (it can only be placed in a number of categories)?

Interval / Categoric

Page 41: Statisticsforbiologists colstons

Quick guide

Are you looking for a correlation between two sets of data, e.g. the rate of photosynthesis and light intensity?

Yes / No

Page 42: Statisticsforbiologists colstons

Quick guide

Use the Chi squared test


Page 43: Statisticsforbiologists colstons

Quick guide

Use the Spearman Rank test


Page 44: Statisticsforbiologists colstons

Quick guide

Are you comparing data from two populations?

Yes / No

Page 45: Statisticsforbiologists colstons

Quick guide

Is your data normally distributed?

Yes / No

[Figure: distribution curve]

Page 46: Statisticsforbiologists colstons

Quick guide

Use a t-test


Page 47: Statisticsforbiologists colstons

Quick guide

Use a Mann-Whitney U test

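The quick guide can be summarised as a small decision function. The branch order is an assumption inferred from the order of the slides and the earlier descriptions of each test, since the transcript does not record which button leads where; the function and parameter names are illustrative.

```python
def choose_test(categoric, looking_for_correlation=False,
                comparing_two_populations=False, normally_distributed=False):
    """Suggest a test by following the quick guide questions in order (assumed branching)."""
    if categoric:
        return "Chi squared test"
    if looking_for_correlation:
        return "Spearman Rank test"
    if comparing_two_populations:
        return "t-test" if normally_distributed else "Mann-Whitney U test"
    return None  # the quick guide does not cover this case


# Interval data, comparing two populations, data skewed
print(choose_test(categoric=False, comparing_two_populations=True,
                  normally_distributed=False))
```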