hypothesis testing

Hypothesis Testing with One Sample

James Lind’s experiment

Hypothesis testing

Draw inferences about a population based on a sample

Testing a claim about a property of a population

Statistical Inference

Inferences about a population are made on the basis of results obtained from a sample drawn from that population

We want to talk about the larger population from which the subjects are drawn, not the particular subjects!

What Do We Test?

Effect or difference we are interested in:

Difference in Means or Proportions

Odds Ratio (OR)

Relative Risk (RR)

Correlation Coefficient

Clinically important difference: the smallest difference considered biologically or clinically relevant

Example: Gender Selection

Hypothesis Testing

Goal: Make statement(s) regarding unknown population parameter values based on sample data

Elements of a hypothesis test:

Null hypothesis - Statement regarding the value(s) of unknown parameter(s). Typically will imply no association between explanatory and response variables in our applications (will always contain an equality)

Alternative hypothesis - Statement contradictory to the null hypothesis (will always contain an inequality)

Null Hypothesis

Usually that there is no effect

Mean = 0

OR = 1

RR = 1

Correlation Coefficient = 0

Alternative Hypothesis

Contradicts the null

There is an effect

Usually what you want to show

Null Hypothesis expresses no difference

Example:

H0: μ = 0 (H0 is often said “H naught”); any other number can take the place of 0

Later: H0: μ1 = μ2

Alternative Hypothesis

H0: μ = 0; Null Hypothesis

HA: μ ≠ 0; Alternative Hypothesis

Researcher’s predictions should be a priori, i.e. before looking at the data

Estimation: From the Sample

Point estimation

Mean

Median

Change in mean/median

Interval estimation

95% Confidence interval

Variation
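The estimates listed above can be sketched in a few lines of Python. This is only an illustration: the data values are hypothetical, and the interval uses the normal approximation (z ≈ 1.96), which is an assumption; for a sample this small a t critical value would give a somewhat wider interval.

```python
from math import sqrt
from statistics import NormalDist, mean, median, stdev

# Hypothetical sample of a continuous outcome (illustrative values only).
sample = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.6, 4.9, 5.2, 4.4]

n = len(sample)
x_bar = mean(sample)    # point estimate of the population mean
med = median(sample)    # point estimate of the population median
s = stdev(sample)       # sample standard deviation (variation)

# Approximate 95% interval estimate for the mean (normal approximation).
z = NormalDist().inv_cdf(0.975)
half_width = z * s / sqrt(n)
ci = (x_bar - half_width, x_bar + half_width)

print(f"mean = {x_bar:.2f}, median = {med:.2f}, s = {s:.2f}")
print(f"approximate 95% CI for the mean: ({ci[0]:.2f}, {ci[1]:.2f})")
```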

Parameters and Reference Distributions

Continuous outcome data

Normal distribution: N(μ, σ²)

t distribution: t(ν) (ν = degrees of freedom)

Mean = X̄ (sample mean)

Variance = s² (sample variance)

Binary outcome data

Binomial distribution: B(n, p)

Normal Distribution

t – Distribution

Binomial Distribution

Hypothesis Testing

Goal: Make statement(s) regarding unknown population parameter values based on sample data

Elements of a hypothesis test:

Test statistic - Quantity based on sample data and the null hypothesis, used to decide between the null and alternative hypotheses.

The test statistic is found by converting the sample statistic (proportion, mean or standard deviation) to a score (z, t or χ²)

Critical region (Rejection region): Values of the test statistic for which we reject the null in favor of the alternative hypothesis
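As a rough sketch of how a sample mean is converted to a z score and compared with a critical region, consider the following. The helper name `z_statistic`, the numbers, and the assumption that the population standard deviation σ is known are all illustrative choices, not part of the slides.

```python
from math import sqrt
from statistics import NormalDist

def z_statistic(x_bar, mu0, sigma, n):
    """Convert a sample mean to a z score under H0: mu = mu0 (sigma assumed known)."""
    return (x_bar - mu0) / (sigma / sqrt(n))

# Hypothetical numbers: H0: mu = 0, sample mean 20, sigma 50, n = 36.
z = z_statistic(x_bar=20, mu0=0, sigma=50, n=36)

# Two-tailed critical region at alpha = 0.05: reject H0 when |z| > z_crit.
alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
in_critical_region = abs(z) > z_crit

print(f"z = {z:.2f}, critical value = ±{z_crit:.2f}, reject H0: {in_critical_region}")
```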

Critical Region, Significance Level, Critical Value and p-value

Significance level (α): the probability that the test statistic will fall in the critical region when the null hypothesis is actually true.



Critical value: any value that separates the critical region from the values of the test statistic that do not lead to rejection of the null hypothesis.



Two-Tailed, Left-Tailed, Right-Tailed

Two-tailed: the critical region is in the two extreme regions (tails) under the curve

Left-tailed: the critical region is in the extreme left region (tail) under the curve

Right-tailed: the critical region is in the extreme right region (tail) under the curve
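The three kinds of critical region can be illustrated for a z test as follows. The function name `critical_region` and the choice of the standard normal as reference distribution are assumptions made for this sketch.

```python
from statistics import NormalDist

def critical_region(alpha, tail):
    """Return a function that tells whether a z score falls in the critical region."""
    std_normal = NormalDist()
    if tail == "two":
        c = std_normal.inv_cdf(1 - alpha / 2)   # alpha is split between both tails
        return lambda z: abs(z) > c
    if tail == "left":
        c = std_normal.inv_cdf(alpha)           # all of alpha in the left tail
        return lambda z: z < c
    if tail == "right":
        c = std_normal.inv_cdf(1 - alpha)       # all of alpha in the right tail
        return lambda z: z > c
    raise ValueError("tail must be 'two', 'left', or 'right'")

# The same z score can lead to different decisions depending on the tail.
z = 1.8
for tail in ("two", "left", "right"):
    print(tail, critical_region(0.05, tail)(z))
```

With α = 0.05, a z score of 1.8 lies beyond the right-tailed critical value (about 1.645) but inside the two-tailed critical values (about ±1.96), which is why the direction of the alternative hypothesis should be chosen before looking at the data.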


P-value (or probability value): the probability of getting a value of the test statistic that is at least as extreme as the one representing the sample data, assuming the null hypothesis is true.

The null hypothesis is rejected if the p-value is very small, such as 0.05 or less.
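A p-value for a z score can be computed from the standard normal distribution, as in this minimal sketch. The function name `p_value` is hypothetical, and a z test is assumed; "at least as extreme" is interpreted according to the tail of the test.

```python
from statistics import NormalDist

def p_value(z, tail="two"):
    """Probability, under H0, of a z score at least as extreme as the observed one."""
    cdf = NormalDist().cdf
    if tail == "two":
        return 2 * (1 - cdf(abs(z)))   # both tails count as extreme
    if tail == "left":
        return cdf(z)                  # only small values count as extreme
    if tail == "right":
        return 1 - cdf(z)              # only large values count as extreme
    raise ValueError("tail must be 'two', 'left', or 'right'")

# Hypothetical observed z score.
print(f"two-tailed p = {p_value(2.4):.4f}")
print(f"right-tailed p = {p_value(2.4, 'right'):.4f}")
```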



Reject the null hypothesis (or other)

Fail to reject the null hypothesis

Prove the null hypothesis to be true

Accept the null hypothesis

Support the null hypothesis

Statistically correct

Ok but misleading

Traditional method: reject the null hypothesis if the test statistic falls within the critical region; fail to reject the null hypothesis if the test statistic does not fall within the critical region

P-value method: reject H0 if p-value < α (where α is the significance level, such as 0.05)

Decision Criterion

Another option: instead of using a significance level such as α = 0.05, simply report the P-value and leave the decision to the reader

Confidence intervals: because a confidence interval estimate of the population parameter contains the likely values of that parameter, reject a claim that the population parameter has a value that is not included in the confidence interval
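For a two-tailed z test of a mean, the traditional, p-value and confidence interval criteria lead to the same decision. The sketch below illustrates this under stated assumptions: a known σ, a two-sided alternative, and a hypothetical helper named `decide`; the agreement is not guaranteed for every kind of test.

```python
from math import sqrt
from statistics import NormalDist

def decide(x_bar, mu0, sigma, n, alpha=0.05):
    """Apply three decision criteria to H0: mu = mu0; True means 'reject H0'."""
    std_normal = NormalDist()
    se = sigma / sqrt(n)                            # standard error of the mean
    z = (x_bar - mu0) / se                          # test statistic
    z_crit = std_normal.inv_cdf(1 - alpha / 2)      # two-tailed critical value
    p = 2 * (1 - std_normal.cdf(abs(z)))            # two-tailed p-value
    ci = (x_bar - z_crit * se, x_bar + z_crit * se) # (1 - alpha) confidence interval
    return {
        "traditional": abs(z) > z_crit,
        "p_value": p < alpha,
        "confidence_interval": not (ci[0] <= mu0 <= ci[1]),
    }

# Hypothetical numbers: sample mean 20 vs. 10, with sigma = 50 and n = 36.
print(decide(x_bar=20, mu0=0, sigma=50, n=36))
print(decide(x_bar=10, mu0=0, sigma=50, n=36))
```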

Decision Criterion

Statistical Error

Sometimes H0 will be rejected (based on large test statistic & small p-value) even though H0 is really true

i.e., if you had been able to measure the entire population, not a sample, you would have found no difference between μ and some value, but based on X̄ you see a difference.

The mistake of rejecting a true H0 will happen with frequency α

So, if H0 is true, it will be rejected about 5% of the time, since α is frequently set to 0.05
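The claim that a true H0 is rejected with frequency α can be checked by simulation. This is a sketch under assumptions: samples come from a normal population with mean 0 and known standard deviation 1, a two-tailed z test is used, and the function name, sample size, trial count and seed are arbitrary.

```python
import random
from math import sqrt
from statistics import NormalDist

def type_i_error_rate(trials=5000, n=30, alpha=0.05, seed=0):
    """Fraction of samples drawn under a true H0 (mu = 0) that reject H0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]   # H0 is true here
        z = (sum(sample) / n) / (1.0 / sqrt(n))            # z score with sigma = 1
        if abs(z) > z_crit:
            rejections += 1                                # a false rejection
    return rejections / trials

rate = type_i_error_rate()
print(f"observed Type I error rate: {rate:.3f} (alpha = 0.05)")
```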

Type I Error

H0: mean = 0

Population (“true”): mean = 0

Sample (what you see): mean = 20

Conclude, based on the sample mean, that the population mean ≠ 0 when it really does equal 0 (H0 true); therefore you have falsely rejected H0

Statistical Error

Sometimes H0 will not be rejected (based on a small test statistic and a large p-value) even though H0 is really false

i.e., if you had been able to measure the entire population, not a sample, you would have found a difference between μ and some value, but based on X̄ you do not see a difference.

The mistake of failing to reject a false H0 will happen with frequency β
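The frequency β depends on how far the true parameter is from the value in H0. A simulation sketch, assuming a normal population with known standard deviation 1, a two-tailed z test of H0: μ = 0, and hypothetical true means of 0.4 and 1.0:

```python
import random
from math import sqrt
from statistics import NormalDist

def type_ii_error_rate(true_mean, trials=5000, n=30, alpha=0.05, seed=0):
    """Fraction of samples that fail to reject H0: mu = 0 when mu = true_mean."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    misses = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]  # H0 is false here
        z = (sum(sample) / n) / (1.0 / sqrt(n))                 # z score with sigma = 1
        if abs(z) <= z_crit:
            misses += 1                                         # failed to reject
    return misses / trials

beta = type_ii_error_rate(true_mean=0.4)
print(f"estimated beta for true mean 0.4: {beta:.3f}")
print(f"estimated beta for true mean 1.0: {type_ii_error_rate(true_mean=1.0):.3f}")
```

A larger true difference is easier to detect, so the estimated β shrinks as the true mean moves away from 0.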

Type II Error

H0: mean = 0

Population (“true”): mean = 20

Sample (what you see): mean = 0

Conclude, based on the sample mean, that the population mean = 0 when it really does not (H0 really false); therefore you have falsely failed to reject H0

1. The treatments do not differ, and we correctly conclude that they do not differ.

2. The treatments do not differ, but we conclude that they do differ.

3. The treatments differ, but we conclude that they do not differ.

4. The treatments do differ, and we correctly conclude that they do differ.

Four Possibilities in Testing Whether the Treatments Differ

Type I error: concluded that there is a difference while in reality there is no difference (probability α)

Type II error: concluded that there is no difference while in reality there is a difference (probability β)

Controlling Type I & Type II Errors

α

β

Power (1 – β)

Sample size

The power of a hypothesis test is the probability (1 - β) of rejecting a false null hypothesis, which is computed by using:

A particular significance level α

A particular sample size n

A particular assumed value of the population parameter in the null hypothesis

A particular assumed value of the population parameter that is an alternative to the value in the null hypothesis

Power of the test
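The four ingredients listed for power (α, n, the null value and an assumed alternative value) can be combined in closed form for a two-tailed z test. This is a sketch only: the function name `power` is hypothetical, and a known σ is an additional assumption not stated on the slide.

```python
from math import sqrt
from statistics import NormalDist

def power(mu0, mu_alt, sigma, n, alpha=0.05):
    """Power (1 - beta) of a two-tailed z test of H0: mu = mu0 when mu = mu_alt."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = (mu_alt - mu0) / (sigma / sqrt(n))   # mean of the z score under mu_alt
    upper = 1 - std_normal.cdf(z_crit - shift)   # P(z > z_crit) under mu_alt
    lower = std_normal.cdf(-z_crit - shift)      # P(z < -z_crit) under mu_alt
    return upper + lower

# Hypothetical values: null mean 0, alternative mean 0.4, sigma 1.
for n in (30, 60, 120):
    print(f"n = {n:3d}: power = {power(0, 0.4, 1, n):.3f}")
```

Holding α and the assumed alternative fixed, power rises with the sample size, which is the usual basis for sample size planning.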

Term Definitions

α = Probability of making a type I error = Probability of concluding the treatments differ when in reality they do not differ

β = Probability of making a type II error = Probability of concluding that the treatments do not differ when in reality they do differ

Power = 1 - Probability of making a type II error = 1 - β = Probability of correctly concluding that the treatments differ = Probability of detecting a difference between the treatments if the treatments do in fact differ
