This is a teacher caught dozing whilst planning a lesson on reliability and validity

It’s my favourite part of the syllabus… honest. Would a teacher lie to you?

I can reliably predict that you’re all yawning already

Right, own up. Who wrote that? Detention for a week! Yes, we know it seems boring...

Reliability and validity suck!

Anything... I’ll do anything but reliability and validity. Detention for a year would be lovely, or clearing out your cat’s litter tray

…but reliability and validity underpin everything that we do as scientists – without them our research would be worthless!

Sorry, not convinced yet, and I had a very late night, so I have a valid reason for falling asleep in this lesson...

Did someone say they wanted to talk about salad?

Valid, not salad!!!!!! We need to find out if our research is sound. Do our tests measure what they claim to measure?

If a measurement is not reliable, then our research cannot be valid or ‘true’

Are techniques used to collect data in tests, questionnaires, interviews and observations measuring what is claimed?

Can we trust any effect that has been found to be the result of manipulating our independent variable and not from another unwanted variable?

We could just measure something using a piece of string, but it wouldn’t be very accurate

We need to be able to measure or observe something time after time and produce the same or similar results

I want to measure intelligence. If the same person sits the test on several occasions and the results change each time, then that test lacks reliability

The test also lacks validity because the scores are meaningless

If I test my participants again several months later and their scores remain consistent, I can say the test is reliable, but it might still lack validity. The second score might just be measuring what a person has learned since taking the first test.

Does a driving test measure your competence to drive on the road, or is it a measure of your ability to pass the driving test? Would you be able to pass it again in six months’ time? Would you do better? Is it a reliable and valid test?

External reliability measures consistency from one occasion to another – the same result should be found on different days, in different labs, observations or interviews, and by different researchers

I exposed these teenage brain cells to 1000 PowerPoint slides last Monday and they’re all dead

I thought that was a fluke but they seem to be shrivelling after only five minutes!

Participants take the same test on different occasions – a high correlation between test scores indicates the test has good external reliability.
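To make the idea of a "high correlation between test scores" concrete, here is a minimal sketch of a test-retest check. The participant scores and the names `pearson_r`, `january` and `june` are made up for illustration; they are not data from any real study.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores: the same five participants sit the same test twice
january = [98, 105, 110, 120, 92]
june = [100, 103, 112, 118, 95]

r = pearson_r(january, june)
print(f"Test-retest correlation: r = {r:.2f}")
```

A value of r close to +1 would suggest that participants keep roughly the same rank order across both sittings, which is what good external reliability looks like in the numbers.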

Timing is crucial. Why?

January June

I hope that’s the right answer this time

Internal reliability measures the extent to which a test or procedure is consistent within itself, i.e. questionnaire items or interview questions should all be measuring the same thing

Do you like to keep to deadlines?
Do you get impatient driving?
Do you like cheese?
Do you like doing several tasks at once?
Do you like chocolate?
Do you get easily irritated?
Are you competitive?

This interviewer seems a little confused about Type A personality traits

The split-half method compares a participant’s performance on two halves of a test or questionnaire – there should be a close correlation between scores on both halves of the test.

Questions in both halves should be of equal quality for good internal reliability.

Odds/Evens Top/Bottom
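The odds/evens split can be sketched in a few lines. This is only an illustration under stated assumptions: the answers are invented, each item is assumed to be scored 1 to 5, and `split_odd_even` and `pearson_r` are hypothetical helper names.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def split_odd_even(responses):
    """Return (total of odd-numbered items, total of even-numbered items)."""
    odd_total = sum(responses[0::2])   # items 1, 3, 5, ...
    even_total = sum(responses[1::2])  # items 2, 4, 6, ...
    return odd_total, even_total

# Hypothetical data: one row per participant, six items scored 1 to 5
answers = [
    [5, 4, 5, 5, 4, 5],
    [2, 1, 2, 2, 1, 1],
    [3, 3, 4, 3, 3, 4],
    [4, 5, 4, 4, 5, 4],
    [1, 2, 1, 1, 2, 2],
]

halves = [split_odd_even(row) for row in answers]
odd_totals = [odd for odd, _ in halves]
even_totals = [even for _, even in halves]

r = pearson_r(odd_totals, even_totals)
print(f"Split-half correlation: r = {r:.2f}")
```

A top/bottom split would work the same way, with the first half of each row compared against the second half instead of alternating items.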

This refers to the consistency of a researcher’s behaviour. A researcher should produce similar test results, make similar observations, or carry out interviews in the same way on more than one occasion.

Thanks for taking part today. Any problems and I’ll be right over. Take your time.

Right. Let’s get on. Fast as you can.

How much longer before I can get in the pub and relax my facial muscles?

In observational studies this is known as inter-observer reliability – observers have to agree on what they see and carry out the same procedure

Consistency between different researchers working on the same study is very important for reliability

There should be a high positive correlation between the scores of different observers
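As a rough sketch of how agreement between observers could be checked, the example below correlates two observers' tallies. The counts are invented (imagine each number is how many times one child showed a target behaviour during a session), and `observer_a`, `observer_b` and `pearson_r` are hypothetical names.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical tallies: two observers watch the same six children
# and count the same behavioural category for each child
observer_a = [3, 0, 5, 2, 7, 1]
observer_b = [4, 0, 5, 1, 6, 1]

r = pearson_r(observer_a, observer_b)
print(f"Inter-observer correlation: r = {r:.2f}")
```

If the two sets of tallies produced a low or negative r, that would be a sign the observers are not applying the behavioural categories in the same way.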

Today’s training session for researchers:
1. Increase reliability by standardising instructions
2. Carry out a pilot study to improve procedures and materials

You will be thoroughly trained in the use of materials and procedures prior to our study taking place

Would you see this as bullying or horseplay in the playground?

You would see this from your own subjective viewpoint – we’re biased by experience and expectation

Observers must agree about what they are observing – they need to use standardised behavioural categories

Internal validity = The tool is measuring what it is intended to measure

External validity = The findings can be generalised beyond the context of the research situation

Assessing and Improving Internal Validity

Face Validity

Does our measuring tool appear to be doing what it should?

One or more judges assess whether the test seems appropriate and suggest changes if necessary

Content Validity

You expect me to read through this lot by Friday?

Does the content of a test cover everything in the area of interest?

More rigorous – experts in the field systematically examine the tool’s components and compare them with set standards

They have to agree the content is appropriate

Concurrent Validity

Scores on the new measure are correlated with those from an established, valid test

As you can see, we have a high positive correlation between scores on the new and old tests. I declare this test valid!

What would we do if the correlation were low?

Predictive Validity

Can an intelligence test at age 3 predict academic performance at 21?

Can a diagnosis of a certain mental illness predict recovery?

Assessing and Improving External Validity

Temporal Validity

Oh, Mr Asch, we’d be delighted to say those lines are the same size!

‘Ere mate, are you having a laugh? Do I look that stupid?

Do our findings endure over time or are they era-dependent?

Population Validity

Can we generalise findings from our research participants to other population groups?

Context Validity

Can we apply our findings to other contexts and situations outside of the research setting?

Better known as: Ecological Validity

You want me to pretend to do sums as well as talk? I’ll do it for two bananas

You want me to falsify your accounts? I’ll do it for £200,000

Now that really wasn’t so bad. Some of you are still awake!

OK, we get it now. If our studies are unreliable and invalid we may as well not bother doing research at all

I can reliably predict that if you learn this you will score higher in your valid A2 exam

I can reliably predict I’m going to ask her out and that I have a perfectly valid reason for it