hypothesis testing
DESCRIPTION
Hypothesis testing by Dr. Badr Aljaser as part of the 5th Research Summer School - Jeddah at KAIMRC - WR
TRANSCRIPT
Hypothesis Testing with One Sample
James Lind’s experiment
Hypothesis testing
Draw inferences about a population based on a sample
Testing a claim about a property of a population
Statistical Inference
Inferences about a population are made on
the basis of results obtained from a sample
drawn from that population
Want to talk about the larger population from
which the subjects are drawn, not the
particular subjects!
What Do We Test?
Effect or difference we are interested in:
Difference in Means or Proportions
Odds Ratio (OR)
Relative Risk (RR)
Correlation Coefficient
Clinically important difference: smallest difference considered biologically or clinically relevant
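As a minimal sketch of two of the effect measures listed above, the relative risk and odds ratio can be computed from a 2x2 table. The counts and the function names `relative_risk` and `odds_ratio` are hypothetical, chosen only for illustration; they are not data from the lecture.

```python
# Hypothetical 2x2 table (made-up counts, for illustration only):
#               outcome   no outcome
# exposed        a = 20     b = 80
# unexposed      c = 10     d = 90

def relative_risk(a, b, c, d):
    """Risk of the outcome in the exposed group divided by risk in the unexposed group."""
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(a, b, c, d):
    """Odds of the outcome in the exposed group divided by odds in the unexposed group."""
    return (a * d) / (b * c)

rr = relative_risk(20, 80, 10, 90)
or_ = odds_ratio(20, 80, 10, 90)
print(f"RR = {rr:.2f}, OR = {or_:.2f}")
```

A value of 1 for either measure corresponds to "no effect", which is why the null hypothesis for these measures is usually stated as RR = 1 or OR = 1.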
Example: Gender Selection
Hypothesis Testing
Goal: Make statement(s) regarding unknown population parameter values based on sample data
Elements of a hypothesis test:
Null hypothesis - Statement regarding the value(s) of
unknown parameter(s). Typically will imply no
association between explanatory and response
variables in our applications (will always contain an
equality)
Alternative hypothesis - Statement contradictory to
the null hypothesis (will always contain an inequality)
Null Hypothesis
Usually that there is no effect
Mean = 0
OR = 1
RR = 1
Correlation Coefficient = 0
Alternative Hypothesis
Contradicts the null
There is an effect
What you want to prove?
Null Hypothesis expresses no difference
Example:
H0: μ = 0 (often said “H naught”), or any number
Later: H0: μ1 = μ2
Alternative Hypothesis
H0: μ = 0; Null Hypothesis
HA: μ ≠ 0; Alternative Hypothesis
Researcher’s predictions should be a priori, i.e. before looking at the data
Estimation: From the Sample
Point estimation
Mean
Median
Change in mean/median
Interval estimation
95% Confidence interval
Variation
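The point and interval estimates listed above could be computed along these lines. This is a rough sketch: the data are made up, and the interval uses a normal quantile only to keep the example free of third-party dependencies. For a sample this small, a t quantile would be the more appropriate choice.

```python
import math
import statistics
from statistics import NormalDist

# Made-up sample values, for illustration only.
sample = [4.1, 5.3, 6.0, 5.5, 4.8, 5.9, 6.2, 5.1, 4.7, 5.4]

# Point estimates
mean = statistics.mean(sample)
median = statistics.median(sample)

# Variation: sample standard deviation and standard error of the mean
s = statistics.stdev(sample)
se = s / math.sqrt(len(sample))

# Interval estimate: approximate 95% confidence interval for the mean
# (normal approximation; an assumption made for this sketch)
z = NormalDist().inv_cdf(0.975)
ci = (mean - z * se, mean + z * se)

print(f"mean = {mean:.2f}, median = {median:.2f}")
print(f"95% CI for the mean: ({ci[0]:.2f}, {ci[1]:.2f})")
```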
Parameters and Reference Distributions
Continuous outcome data
Normal distribution: N(μ, σ²)
t distribution: t(ν) (ν = degrees of freedom)
Mean = X̄ (sample mean)
Variance = s² (sample variance)
Binary outcome data
Binomial distribution: B(n, p)
Normal Distribution
t – Distribution
Binomial Distribution
Elements of a hypothesis test:
Test statistic - Quantity based on sample data
and null hypothesis used to test between null and
alternative hypotheses.
The test statistic is found by converting the sample
statistic (proportion, mean or standard deviation) to
a score (z, t or χ²)
Critical region (Rejection region): Values of the test statistic for which we reject the null in favor of the alternative hypothesis
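Converting a sample statistic to a score can be illustrated with a one-sample z statistic for a mean. This is only a sketch under stated assumptions: the population standard deviation is treated as known, and the values of `sigma` and `n` are hypothetical. The sample mean of 20 and the null value of 0 echo the numbers used later in these slides.

```python
import math

def z_statistic(sample_mean, mu0, sigma, n):
    """z score of the sample mean under H0: mu = mu0, with known sigma."""
    standard_error = sigma / math.sqrt(n)
    return (sample_mean - mu0) / standard_error

# sigma = 50 and n = 25 are assumed values for illustration.
z = z_statistic(sample_mean=20.0, mu0=0.0, sigma=50.0, n=25)
print(f"z = {z:.2f}")
```

When sigma is unknown and estimated by the sample standard deviation s, the same formula gives a t score instead, referred to a t distribution with n - 1 degrees of freedom.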
Critical Region, Significance Level, Critical Value and p-value
Significance level (α): the probability that the test statistic will fall in the critical region when the null hypothesis is actually true.
Critical value: any value that separates the critical region from the values of the test statistic that do not lead to rejection of the null hypothesis.
Two-Tailed, Left-Tailed, Right-Tailed
Two-tailed: the critical region is in the two extreme regions (tails) under the curve
Left-tailed: the critical region is in the extreme left region (tail) under the curve
Right-tailed: the critical region is in the extreme right region (tail) under the curve
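The critical values that bound each kind of critical region can be sketched for a z test as follows. The function name `critical_values` is hypothetical, and the standard normal reference distribution is an assumption. A t or chi-square test would use the quantiles of its own distribution.

```python
from statistics import NormalDist

def critical_values(alpha, tail):
    """Critical value(s) of a z test at significance level alpha."""
    std_normal = NormalDist()
    if tail == "two":
        # alpha is split equally between the two tails
        c = std_normal.inv_cdf(1 - alpha / 2)
        return (-c, c)
    if tail == "left":
        return (std_normal.inv_cdf(alpha),)
    if tail == "right":
        return (std_normal.inv_cdf(1 - alpha),)
    raise ValueError("tail must be 'two', 'left' or 'right'")

for tail in ("two", "left", "right"):
    values = ", ".join(f"{v:.3f}" for v in critical_values(0.05, tail))
    print(f"{tail}-tailed, alpha = 0.05: {values}")
```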
P-value (p-value or probability value): the
probability of getting a value of the test
statistic that is at least as extreme as the
one representing the sample data, assuming
the null hypothesis is true.
The null hypothesis is rejected if the p-value
is very small, such as 0.05 or less.
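What "at least as extreme" means depends on the direction of the alternative hypothesis. The sketch below assumes a z test statistic with a standard normal reference distribution. The function name `p_value` and the observed score of 2.0 are illustrative only.

```python
from statistics import NormalDist

def p_value(z, tail):
    """p-value of an observed z score for a left-, right- or two-tailed test."""
    cdf = NormalDist().cdf
    if tail == "left":
        return cdf(z)                    # P(Z <= z)
    if tail == "right":
        return 1 - cdf(z)                # P(Z >= z)
    if tail == "two":
        return 2 * (1 - cdf(abs(z)))     # both tails beyond |z|
    raise ValueError("tail must be 'two', 'left' or 'right'")

z_observed = 2.0  # hypothetical test statistic
for tail in ("two", "right", "left"):
    print(f"{tail}-tailed p-value: {p_value(z_observed, tail):.4f}")
```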
Reject the null hypothesis (or other)
Fail to reject the null hypothesis
Prove the null hypothesis to be true
Accept the null hypothesis
Support the null hypothesis
Statistically correct
Ok but misleading
Traditional method: reject the null hypothesis if the test statistic falls within the critical region; fail to reject the null hypothesis if the test statistic does not fall within the critical region
P-value method: reject H0 if p-value < α (where α is the significance level, such as 0.05)
Decision Criterion
Another option: Instead of using a significance level such
as α = 0.05, simply identify the P value and leave the
decision to the reader
Confidence intervals: Because a confidence interval
estimate of the population parameter contains the likely
values of that parameter, reject a claim that the
population parameter has a value that is not included in
the confidence interval
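The confidence-interval criterion can be sketched as a simple containment check. This is an illustration under assumptions: a normal-approximation interval for a mean, a hypothetical helper `ci_rejects`, and an assumed standard error of 10.

```python
from statistics import NormalDist

def ci_rejects(sample_mean, std_error, claimed_value, confidence=0.95):
    """Return (reject, interval): reject is True when the claimed value
    lies outside the confidence interval for the mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    lower = sample_mean - z * std_error
    upper = sample_mean + z * std_error
    reject = not (lower <= claimed_value <= upper)
    return reject, (lower, upper)

# Sample mean of 20 with an assumed standard error of 10; claim: mean = 0.
reject, (lower, upper) = ci_rejects(20.0, 10.0, claimed_value=0.0)
print(f"95% CI: ({lower:.2f}, {upper:.2f}); reject claim: {reject}")
```

For a two-tailed test of a mean, this check and the p-value method with α = 1 - confidence lead to the same decision, since both compare the same standardized distance against the same quantile.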
Statistical Error
Sometimes H0 will be rejected (based on large test statistic & small p-value) even though H0 is really true
i.e., if you had been able to measure the entire population, not a sample, you would have found no difference between μ and some value, but based on X̄ you see a difference.
The mistake of rejecting a true H0 will happen with frequency α
So, if H0 is true, it will be rejected ~5% of the time, as frequently α = 0.05
Population mean = 0
Sample mean = 20
Conclude based on sample mean that population mean ≠ 0, but it really does equal 0 (H0 true); therefore you have falsely rejected H0
Type I Error
population=“True”
Sample=What you see
H0 : mean = 0
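The claim that a true H0 is rejected with frequency α can be checked with a small simulation. This is a sketch under assumptions not stated in the slides: normally distributed data with known standard deviation 1, samples of size 30, a two-tailed z test, and a fixed random seed. The function name `type_i_error_rate` is hypothetical.

```python
import math
import random
from statistics import NormalDist

def type_i_error_rate(n_trials=10000, n=30, alpha=0.05, seed=0):
    """Fraction of samples drawn under a TRUE H0 (mean = 0) that a
    two-tailed z test nevertheless rejects."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_trials):
        # Population mean really is 0, so every rejection is a Type I error.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = sample_mean / (1.0 / math.sqrt(n))  # known sigma = 1
        if abs(z) > z_crit:
            rejections += 1
    return rejections / n_trials

rate = type_i_error_rate()
print(f"Observed Type I error rate: {rate:.3f}")
```

The observed rate should land close to the chosen α, with some run-to-run variation if the seed is changed.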
Statistical Error
Sometimes H0 will not be rejected (based on small test statistic & large p-value) even though H0 is really false
i.e., if you had been able to measure the entire
population, not a sample, you would have found
a difference between μ and some value, but
based on X̄ you do not see a difference.
The mistake of failing to reject a false H0 will happen with frequency β
Population mean = 20
Sample mean = 0
Conclude based on sample mean that population mean = 0, but it really does not (H0 really false); therefore you have falsely failed to reject H0
Type II Error
Population= “True”
Sample= what you see
H0 : mean = 0
1. The treatments do not differ, and we correctly conclude that they do not differ.
2. The treatments do not differ, but we conclude that they do differ.
3. The treatments differ, but we conclude that they do not differ.
4. The treatments do differ, and we correctly conclude that they do differ.
Four Possibilities in Testing Whether the Treatments Differ
Type I error: concluded that there is a difference while in reality there is no difference (probability α)
Type II error: concluded that there is no difference while in reality there is a difference (probability β)
Controlling Type I & Type II Errors
α, β
Power (1 – β)
Sample Size
The power of the hypothesis test is the probability (1 – β) of rejecting a false null hypothesis, which is computed by using:
A particular significance level α
Sample size n
A particular assumed value of the population parameter in the null hypothesis
A particular assumed value of the population parameter that is an alternative to the value in the null hypothesis
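The four inputs listed above are enough to compute power in the simplest case. The sketch below assumes a two-tailed one-sample z test with known standard deviation. The function name `power_z_test` and the numbers passed to it are hypothetical.

```python
import math
from statistics import NormalDist

def power_z_test(mu0, mu1, sigma, n, alpha=0.05):
    """Power (1 - beta) of a two-tailed one-sample z test of H0: mu = mu0
    when the true mean is mu1 and sigma is known."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # How far the true mean sits from the null value, in standard errors
    shift = (mu1 - mu0) / (sigma / math.sqrt(n))
    # Probability that the test statistic falls in either tail of the critical region
    upper_tail = 1 - std_normal.cdf(z_crit - shift)
    lower_tail = std_normal.cdf(-z_crit - shift)
    return upper_tail + lower_tail

for n in (25, 50, 100):
    power = power_z_test(mu0=0.0, mu1=20.0, sigma=50.0, n=n)
    print(f"n = {n:3d}: power = {power:.3f}")
```

With α and the assumed effect held fixed, power rises as the sample size grows, which is why these quantities are planned together.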
Power of the test
Term Definitions
α = Probability of making a type I error
= Probability of concluding the treatments differ when in reality they do not differ
β = Probability of making a type II error
= Probability of concluding that the treatments do not differ when in reality they do differ
Power = 1 - Probability of making a type II error
= 1 - β
= Probability of correctly concluding that the treatments differ
= Probability of detecting a difference between the treatments if the treatments do in fact differ