2. basic statistics

30
Steve Saffhill Research Methods in Sport & Exercise Basic Statistics

Upload: steve-saffhill

Post on 19-Jan-2017


Category:

Education



TRANSCRIPT

Page 1: 2. basic statistics

Steve Saffhill

Research Methods in Sport & Exercise

Basic Statistics

Page 2: 2. basic statistics

Analysing Quantitative Data

• Data collected on its own does not answer your research question(s)

• The data needs to be interpreted – to do this we need to organise it and analyse it!

• Most students panic at this stage – statistics work!!!!

• In your reading, have you come across terms such as “multiple regression”, “t-test”, “ANOVA” etc.?

Page 3: 2. basic statistics

Statistics

• The more exercise classes attended by an individual, the fitter they get.

• There is a positive and statistically significant relationship between student attendance and academic achievement

• There is a positive and statistically significant relationship between 5km and 10km PB’s

• There is a statistically significant difference in mortality rates between owners of red cars and owners of cars of other colours

Page 4: 2. basic statistics

• Statistics can be split into 2 forms:

1. Descriptives – organise the data...describe it!

2. Inferentials – allow you to make inferences...what does it mean?

• You need to ask yourself:

1. What exactly do I need to find out from this data to answer my research question?

2. What statistical test will give me this info?

3. What do the results from this test mean?

• Statistics themselves have no meaning! The importance lies in how you interpret them!!!

Page 5: 2. basic statistics

Computer Software

• Most common to use software to conduct the tests

• Most common is SPSS (statistical package for the social sciences).

• Globally used in health and social sciences!

• Knowledge of SPSS is a very valuable transferable skill to have….C.V!

Page 6: 2. basic statistics

Statistical Inference

• Work of statistician = making predictions about a group based on collected data from a small sample of that population

• (exploring differences, relationships and making statements about the meaningfulness)

• Stats allow us to make a statement and then cite the odds that it is correct!

• Computers make stats easier: organise, analyse, and display data much faster than we can.

• BUT, the PC is only an extension of you. It will only perform if you enter the data correctly and understand its output.

• Before a PC is useful to you, you must know what you want it to do and what is expected. That is where this module comes in!

Page 7: 2. basic statistics

Descriptive statistics

• Important to know the full range of information for different variables

• PB, season's best, distance, age, BMI etc

• We need to know the mean, median & mode to describe the data!

• 1st type: central tendency (M, M, M)

• If these are known we can interpret the value of a single score (e.g., 2nd place) by comparing it to the mean, median and mode!

Page 8: 2. basic statistics

• Suppose we collect data on the monthly wage of employees from two specific companies.

Company one    Company two
1000           1890
1200           2135
1215           786
1300           980
990            1200
875            768
1345           1000

Company one    Company two
1000           1890
1200           2142
1215           1390
850            9800
970            1200
1875           3256
1345           1000

Such RAW DATA on its own is not particularly informative

We usually need to SUMMARISE/DESCRIBE our data.

Central Tendency

Page 9: 2. basic statistics

• Mean (M)

– Average of a particular set of scores

• Median

– Central value (mid-point)

– E.g., if weekly hrs spent training for a sport were 2, 2, 4, 5, 6, 10, 10, 11, 15 the median = 6

– If you have two groups (e.g., males & females, high v low fear of failure) you can calculate a median split – to make a comparison!

• Mode

– Most frequent number (e.g., most common age for people who drop out from sport)
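As a rough illustration, the three measures of central tendency can be computed for the weekly training hours in the example above. This is a minimal sketch using Python's standard `statistics` module rather than SPSS; the variable names are only illustrative.

```python
import statistics

# Weekly training hours from the slide example
hours = [2, 2, 4, 5, 6, 10, 10, 11, 15]

mean_hours = statistics.mean(hours)        # sum of the scores / number of scores
median_hours = statistics.median(hours)    # middle value of the ordered scores
modes_hours = statistics.multimode(hours)  # every value tied for most frequent

print(f"Mean    = {mean_hours:.2f}")
print(f"Median  = {median_hours}")
print(f"Mode(s) = {modes_hours}")
```

Note that this small data set happens to have two equally frequent values (2 and 10), which is why `multimode` is used instead of assuming a single mode.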

Page 10: 2. basic statistics

Name                Club                 Annual Salary
Robin van Persie    Arsenal              £4.5 million
Darren Bent         Aston Villa          £3.5 million
Nicolas Anelka      Chelsea              £3 million
Didier Drogba       Chelsea              £3.5 million
Mario Balotelli     Manchester City      £4 million
Michael Owen        Manchester United    £2.5 million
Carlos Tevez        Manchester City      £13 million
Tuncay Sanli        Stoke City           £3 million
Darren Bent         Sunderland           £2.25 million
Jermain Defoe       Tottenham            £3 million

Total = £42.25m; Mean = £4.225m

However, averages can be distorted by outliers!

• What has happened is that an outlier (an extreme value, here the £13 million salary) has distorted the information carried by the mean.
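The effect of the outlier can be checked directly. The sketch below, again using Python's `statistics` module as a stand-in for SPSS, takes the salary figures from the table above (in £ millions) and compares the mean with and without the largest value, alongside the median.

```python
import statistics

# (player, club, annual salary in £ millions), as listed on the slide
salaries = [
    ("Robin van Persie", "Arsenal", 4.5),
    ("Darren Bent", "Aston Villa", 3.5),
    ("Nicolas Anelka", "Chelsea", 3.0),
    ("Didier Drogba", "Chelsea", 3.5),
    ("Mario Balotelli", "Manchester City", 4.0),
    ("Michael Owen", "Manchester United", 2.5),
    ("Carlos Tevez", "Manchester City", 13.0),
    ("Tuncay Sanli", "Stoke City", 3.0),
    ("Darren Bent", "Sunderland", 2.25),
    ("Jermain Defoe", "Tottenham", 3.0),
]

values = [salary for _, _, salary in salaries]

mean_all = statistics.mean(values)
median_all = statistics.median(values)

# Drop the single largest salary (the outlier) and recompute the mean
without_outlier = sorted(values)[:-1]
mean_without = statistics.mean(without_outlier)

print(f"Mean (all players)      = £{mean_all:.3f}m")
print(f"Median (all players)    = £{median_all:.3f}m")
print(f"Mean (outlier removed)  = £{mean_without:.3f}m")
```

The median barely notices the extreme value, whereas the mean is pulled upwards by it.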

Page 11: 2. basic statistics

The Median

• In some instances the MEDIAN may be a better measure of central tendency

• The median is basically the central value of a data set when that set is numerically ordered

• Sometimes the median is very simple to find...

£100,000 £125,000 £200,000 £225,000 £300,000

Page 12: 2. basic statistics

The Median

• At times, the median can be slightly more difficult to calculate because there may not be a single middle value

• In such instances we take the MEAN of the TWO MIDDLE VALUES to be the MEDIAN

£100,000 £150,000 £200,000 £250,000 £300,000 £500,000

Median = (£200,000 + £250,000) / 2 = £225,000
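The two cases (odd and even number of values) can be written out as a short function. This is a hand-rolled sketch for illustration only; the function name `median` and the list names are assumptions, and in practice a library routine would normally be used.

```python
def median(values):
    """Return the middle value of the ordered data.

    With an even number of values there is no single middle value,
    so the mean of the two middle values is returned instead.
    """
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Five values: one clear middle value
odd_set = [100_000, 125_000, 200_000, 225_000, 300_000]
# Six values: take the mean of the two middle values
even_set = [100_000, 150_000, 200_000, 250_000, 300_000, 500_000]

print(median(odd_set))
print(median(even_set))
```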

Page 13: 2. basic statistics

The Mode

• The mode is the most frequently occurring value in a given data set:

• 7, 4, 8, 8, 9, 2, 4, 5, 7, 8, 4, 8, 8, 6

Here the mode would be 8

• But what about this data set?

• 3, 4, 5, 6, 4, 5, 4, 5, 6, 9, 1, 2

This data set has two modes (4 & 5). Such data sets are usually referred to as bimodal.
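Counting how often each value occurs makes the single-mode and bimodal cases easy to see. The following is a small sketch using `collections.Counter`; the helper name `modes` is hypothetical.

```python
from collections import Counter


def modes(values):
    """Return every value that shares the highest frequency, in ascending order."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)


first_set = [7, 4, 8, 8, 9, 2, 4, 5, 7, 8, 4, 8, 8, 6]
second_set = [3, 4, 5, 6, 4, 5, 4, 5, 6, 9, 1, 2]

print(modes(first_set))   # one mode
print(modes(second_set))  # two modes, i.e. bimodal
```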

Page 14: 2. basic statistics

Information can be presented as...

Numerical

• They convey info about the degree of your measure….

Graphical = e.g., box plot

• Contains detailed info about the distribution of scores

• Usual to use both in your study!

Page 15: 2. basic statistics

Measures of Dispersion/Variance

• Central tendency alone does not always provide an adequate summary of our data

– the dispersion or variability of scores within a data set gives us supplementary information about the data

• We often need an idea of how our data values vary around the central measure

• For example, we might know the mean of a data set, but people might vary quite dramatically around that central value

• Suppose the manager asked for a comparison of the wages of his four most featured strikers and his four most featured midfielders...

Page 16: 2. basic statistics

Strikers £/week Midfielders £/week

A £130,000 1 £175,000

B £250,000 2 £160,000

C £125,000 3 £150,000

D £125,000 4 £150,000

MEAN = £157,500 MEAN = £158,750

If we present the manager with the means alone, we do not give him the full story…

• Clearly the variability in the first data set far outweighs that of the second.

Page 17: 2. basic statistics

Variability of Data

• A statistic that allows the spread (dispersion) of the data to be appreciated is the range.

• The range is simply the difference between the smallest and largest values in the data set.

Page 18: 2. basic statistics

Range

RANGE = £125,000 RANGE = £25,000

Strikers £/week Midfielders £/week

A £130,000 1 £175,000

B £250,000 2 £160,000

C £125,000 3 £150,000

D £125,000 4 £150,000

MEAN = £157,500 MEAN = £158,750
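The means and ranges in the table can be reproduced in a few lines. This is a minimal Python sketch; `value_range` is an illustrative helper name, not a library function.

```python
import statistics

# Weekly wages (£) from the table
strikers = [130_000, 250_000, 125_000, 125_000]
midfielders = [175_000, 160_000, 150_000, 150_000]


def value_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)


for label, wages in (("Strikers", strikers), ("Midfielders", midfielders)):
    print(f"{label}: mean = £{statistics.mean(wages):,.0f}, "
          f"range = £{value_range(wages):,}")
```

The two groups have almost the same mean, but the range shows that the strikers' wages are far more spread out.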

Page 19: 2. basic statistics

BUT…..The range alone does not tell us the full story of how much variability there is on average around the mean

Standard deviation does.

• It is a measure of the extent to which scores deviate from the mean

• You will very frequently see it mentioned in research papers:

Descriptive statistics suggested that males (M = 4.4, SD = 0.8) had higher levels of confidence than females (M = 3.6, SD = 0.5)

• If the SD is large, then the mean may not be a good representation of the data

Page 20: 2. basic statistics

• Say two samples have identical means:

• BUT....They can have different standard deviations (spread of scores around the mean)

• This tells the researcher that the scores from the sample with the larger standard deviation are likely to deviate further from the mean

• i.e., the scores are more spread out.
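To make this concrete, the standard deviation can be computed for the strikers' and midfielders' wages used earlier, where the means are nearly identical. The sketch below uses `statistics.stdev`, which is the sample standard deviation (dividing by n − 1); using the sample rather than the population formula is an assumption made here for illustration.

```python
import statistics

# Weekly wages (£) from the earlier strikers/midfielders table
strikers = [130_000, 250_000, 125_000, 125_000]
midfielders = [175_000, 160_000, 150_000, 150_000]

mean_strikers = statistics.mean(strikers)
mean_midfielders = statistics.mean(midfielders)

# Sample standard deviation: sqrt( sum((x - mean)^2) / (n - 1) )
sd_strikers = statistics.stdev(strikers)
sd_midfielders = statistics.stdev(midfielders)

print(f"Strikers:    M = £{mean_strikers:,.0f}, SD = £{sd_strikers:,.0f}")
print(f"Midfielders: M = £{mean_midfielders:,.0f}, SD = £{sd_midfielders:,.0f}")
```

Similar means with very different standard deviations is exactly the situation described above: the strikers' wages are much more spread out around their mean.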

Page 21: 2. basic statistics

Presenting Descriptive Statistics

• Generally presented in tables and graphs

• Tables should be included where the information is appropriate to the research question

• Include notation to show significance

Page 22: 2. basic statistics

Using SPSS to find out your descriptive data!

Page 23: 2. basic statistics

Coding Data: to find descriptives of groups

• SPSS only deals with numbers and NOT words!!!

• Sometimes (quite often in sport) we need to CODE our data

• Coding = translating responses into common categories, each with an assigned numerical value, to allow you to run some statistics

It is very easy!

• If you get non-numerical data (e.g., gender, level of participation, sport played etc) you need to give each group a code number (e.g., 1 for male 2 for female).

Page 24: 2. basic statistics

For example...

• All males are coded 1 and females 0

• All football players are coded 0, rugby players 1 and hockey players 2 etc.

• The computer then knows what is 0 and what is 1

• Then when you run your descriptives, SPSS will be able to give you them for each group and not just the sample as a whole. Therefore it allows you to compare...

• Then when you run the inferential statistics you can actually really compare the results!
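The same idea can be sketched outside SPSS. The example below codes the sport played using the scheme above (football 0, rugby 1, hockey 2) and then produces a mean for each coded group. The participants and their training hours are made up purely for illustration, and all names (`SPORT_CODES`, `participants`, `group_means`) are hypothetical.

```python
import statistics

# Coding scheme from the slide: each category gets a numerical value
SPORT_CODES = {"football": 0, "rugby": 1, "hockey": 2}

# Hypothetical responses: (sport played, weekly training hours)
participants = [
    ("football", 6), ("rugby", 8), ("hockey", 5),
    ("football", 4), ("rugby", 10), ("hockey", 7),
]

# Step 1: translate the words into their numeric codes
coded = [(SPORT_CODES[sport], hours) for sport, hours in participants]

# Step 2: collect the scores for each coded group
groups = {}
for code, hours in coded:
    groups.setdefault(code, []).append(hours)

# Step 3: descriptives per group rather than for the sample as a whole
group_means = {code: statistics.mean(scores) for code, scores in sorted(groups.items())}

for code, mean_hours in group_means.items():
    print(f"Group {code}: mean weekly hours = {mean_hours}")
```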

Page 25: 2. basic statistics
Page 26: 2. basic statistics

Descriptives: Social Physique Anxiety

Grouping variable: exercise dependent but good body image/no MD

                                      Statistic    Std. Error
Mean                                  26.5625      1.32906
95% Confidence Interval for Mean
  Lower Bound                         23.7297
  Upper Bound                         29.3953
5% Trimmed Mean                       26.5139
Median                                24.0000
Variance                              28.263
Std. Deviation                        5.31625
Minimum                               20.00
Maximum                               34.00
Range                                 14.00
Interquartile Range                   11.25
Skewness                              .544         .564
Kurtosis                              -1.370       1.091

Grouping variable: Ex dependent and MD

                                      Statistic    Std. Error
Mean                                  56.5238      .86084
95% Confidence Interval for Mean
  Lower Bound                         54.7281
  Upper Bound                         58.3195
5% Trimmed Mean                       56.8598
Median                                58.0000
Variance                              15.562
Std. Deviation                        3.94486
Minimum                               47.00
Maximum                               60.00
Range                                 13.00
Interquartile Range                   6.00
Skewness                              -1.416       .501
Kurtosis                              1.445        .972
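Several figures in this output are linked by simple formulas: the standard error of the mean is SD / √n, and the 95% confidence interval is the mean ± a critical t value × the standard error. The sketch below checks this for both groups. The sample sizes are not shown in the output; n = 16 and n = 21 are assumptions inferred from the reported SD and standard error, and the critical t values (2.131 for 15 degrees of freedom, 2.086 for 20) are taken from standard t tables.

```python
import math


def ci95(mean, sd, n, t_crit):
    """Return (lower bound, upper bound, standard error) for a 95% CI of the mean."""
    se = sd / math.sqrt(n)       # standard error of the mean
    margin = t_crit * se         # half-width of the interval
    return mean - margin, mean + margin, se


# Group 1: exercise dependent but good body image/no MD (n assumed to be 16)
low1, high1, se1 = ci95(mean=26.5625, sd=5.31625, n=16, t_crit=2.131)

# Group 2: exercise dependent and MD (n assumed to be 21)
low2, high2, se2 = ci95(mean=56.5238, sd=3.94486, n=21, t_crit=2.086)

print(f"Group 1: SE = {se1:.5f}, 95% CI = [{low1:.4f}, {high1:.4f}]")
print(f"Group 2: SE = {se2:.5f}, 95% CI = [{low2:.4f}, {high2:.4f}]")
```

The results agree with the SPSS figures to within rounding of the critical t values, which is a useful sanity check when reading this kind of output.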

Page 27: 2. basic statistics

Inferential Statistics

• Used to draw inferences (logical conclusions) about a population from a sample

• E.g., we want to explore the effects of sleep deprivation on performance.

• 10 subjects who performed a task after 24 hrs of sleep deprivation scored 12 pts lower than 10 subjects who performed the task after ‘normal’ sleep.

• Is the difference real or due to chance?

• Tests of significant difference: t-test, ANOVA etc

• Tests of association: correlation
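As a rough sketch of how a t-test addresses the "real or due to chance?" question, the example below computes an independent-samples t statistic by hand. The individual scores are invented for illustration (the slides give only the 12-point difference between two groups of 10), equal variances are assumed, and the critical value 2.101 is the two-tailed 5% value for 18 degrees of freedom from standard t tables.

```python
import math
import statistics

# Hypothetical task scores, chosen so the group means differ by 12 points
normal_sleep = [52, 48, 55, 60, 47, 53, 58, 50, 49, 58]
sleep_deprived = [40, 38, 45, 47, 35, 42, 44, 39, 41, 39]

n1, n2 = len(normal_sleep), len(sleep_deprived)
mean_diff = statistics.mean(normal_sleep) - statistics.mean(sleep_deprived)

# Pooled variance (assumes both groups share the same population variance)
var1 = statistics.variance(normal_sleep)
var2 = statistics.variance(sleep_deprived)
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)

# t = difference between means / standard error of that difference
se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t_stat = mean_diff / se_diff
df = n1 + n2 - 2

T_CRITICAL = 2.101  # two-tailed, alpha = .05, df = 18 (from t tables)

print(f"Mean difference = {mean_diff}, t({df}) = {t_stat:.2f}")
if abs(t_stat) > T_CRITICAL:
    print("Difference is unlikely to be due to chance alone (p < .05)")
else:
    print("Difference could plausibly be due to chance (p > .05)")
```

With these made-up scores the t statistic is well beyond the critical value; with noisier data the same 12-point difference might not be significant, which is why the test is needed.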

Page 28: 2. basic statistics

= most common inferential tests for you!

Page 29: 2. basic statistics

2 Types of Inferential Tests

• Inferential tests test a null hypothesis (i.e., there will be no relationship or difference between two variables).

1. Parametric tests – used on data that meet strict criteria

2. Non-parametric tests – used on data that do not meet the strict criteria

• We will be exploring these criteria next week!

Page 30: 2. basic statistics

Summary

• Statistics used to describe data (descriptive stats)

• Also used to discern what data mean (inferential)

• The type of test used determined by experimental design

• First step in data analysis is exploring the data

• What is the effect of one variable on another