hypothesis testing i. - universiteit leidenpub.math.leidenuniv.nl/~szabobt/stan/stan10_1.pdf ·...
TRANSCRIPT
-
Introduction Wald test p-values t-test Quiz
Hypothesis testing I.
Botond Szabo
Leiden University
Leiden, 26 March 2018
-
Introduction Wald test p-values t-test Quiz
Outline
1 Introduction
2 Wald test
3 p-values
4 t-test
5 Quiz
-
Introduction Wald test p-values t-test Quiz
Example
Suppose we want to know if exposure to asbestos isassociated with lung disease.
We take some rats and randomly divide them into two groups.
We expose one group to asbestos and then compare thedisease rate in two groups.
We consider the following two hypotheses:1 The null hypothesis: the disease rate is the same in both
groups.2 Alternative hypothesis: The disease rate is not the same in two
groups.
If the exposed group has a much higher disease rate, we willreject the null hypothesis and conclude that the evidencefavours the alternative hypothesis.
-
Introduction Wald test p-values t-test Quiz
Null and alternative hypotheses
We partition the parameter space Θ into sets Θ0 and Θ1 andwish to test
H0 : θ ∈ Θ0 versus H1 : θ ∈ Θ1.
We call H0 the null hypothesis and H1 the alternativehypothesis.
In our previous example θ is the difference between thedisease rates in the exposed and not exposed groups,Θ0 = {0}, and Θ1 = (0,∞).
-
Introduction Wald test p-values t-test Quiz
Rejection region
Let T be a random variable and let T be the range of X . Wetest the hypothesis by finding a subset of outcomes R ⊂ X ,called the rejection region. If T ∈ R, we reject the nullhypothesis, otherwise we retain it:
T ∈ R ⇒ reject H0T /∈ R ⇒ retain H0.
In our previous example T is an estimate of the differencebetween the disease rates in the exposed and not exposedgroups. We reject the null hypothesis if T is large.
-
Introduction Wald test p-values t-test Quiz
Test statistic and critical value
Usually, the rejection region R is of the form
R = {x : T (x) > c},
where T is a test statistic and c is critical value.
The problem in hypothesis testing is to find an appropriatetest statistic T and an appropriate critical value c .
-
Introduction Wald test p-values t-test Quiz
Warning
There is a tendency to use hypothesis testing even when theyare not appropriate.
Point estimates and confidence intervals are at times a betteroption.
Use testing only with well-defined hypotheses and in generaltake care of formulating them: things can go wrong whentesting ill-defined or overly complex hypotheses.
-
Introduction Wald test p-values t-test Quiz
Type I and type II errors
By rejecting H0 when H0 is true we commit type I error.
By retaining H0 when H1 is true we commit type II error.
Retain Null Reject Null
H0 true X type I errorH1 true type II error X
-
Introduction Wald test p-values t-test Quiz
Power function
Definition
The power function of a test T with rejection region R is definedby
β(θ) = Pθ(T ∈ R).
The size of a test is defined to be
α = supθ∈Θ0
β(θ).
A test is said to be of level α, if its size is less or equal to α.
-
Introduction Wald test p-values t-test Quiz
Comments
Let Θ0 = {θ0} and Θ1 = {θ1}.In this case β(θ0) is the probability of type I error.
β(θ1) gives 1 minus the probability of type II error.
In the ideal case we would like the probabilities of both typesof errors be equal to zero. Except trivial cases, this is notpossible. So we concentrate on tests of size or level α, whereα is taken to be small, e.g. α = 0.05.
-
Introduction Wald test p-values t-test Quiz
Hypotheses and tests
A hypothesis of the form θ = θ0 is called a simple hypothesis.
A hypothesis of the form θ > θ0 or θ < θ0 (or both) is calleda composite hypothesis.
A test of the form
H0 : θ = θ0 versus H1 : θ 6= θ0
is called a two-sided test.
A test of the form
H0 : θ ≤ θ0 versus H1 : θ > θ0
orH0 : θ ≥ θ0 versus H1 : θ < θ0
is called a one-sided test.
-
Introduction Wald test p-values t-test Quiz
Example
Example
Let X1, . . . ,Xn ∼ N(µ, σ), with σ known. We want to testH0 : µ ≤ 0 versus µ > 0. Hence Θ0 = (−∞, 0] and Θ1 = (0,∞).Let T = X n and conisder the test
reject H0 if T > c.
The rejection region is
R = {(x1, . . . , xn) : T (x1, . . . , xn) > c} .
-
Introduction Wald test p-values t-test Quiz
Example (continued)
Example
The power function is
β(µ) = Pµ(X n > c)
= Pµ(√
n(X n − µ)σ
>
√n(c − µ)σ
)= P
(Z >
√n(c − µ)σ
)= 1− Φ
(√n(c − µ)σ
).
The power function is increasing in µ.
-
Introduction Wald test p-values t-test Quiz
Example (continued)
Example
Hence
size = supµ≤0
β(µ) = β(0) = 1− Φ(√
nc
σ
).
For size α test we set this equal to α and solve for c to get
c =σΦ−1(1− α)√
n.
We reject the null hypothesis when X n > c . Equivalently, when
√nX nσ
> zα = Φ−1(1− α).
-
Introduction Wald test p-values t-test Quiz
Example 2
Example
Let X be an observation from a Uniform(0, θ) distribution.Suppose one wants to test H0 : θ = 2 versus H1 : θ 6= 2. Supposeone rejects H0 if X ≥ 1.9. Compute the probability of type 1 error.
-
Introduction Wald test p-values t-test Quiz
Most powerful test
It would be desirable to find the test with highest power underH1 among all size α tests.
Such a test, if it exists, is called the most powerful test.
Finding such tests is not easy, and often they don’t even exist.
We shall instead consider several widely used tests.
-
Introduction Wald test p-values t-test Quiz
Wald test
Definition
Let θ be a scalar parameter and θ̂ an estimate of θ with ŝe itsestimated standard error. Consider the test
H0 : θ = θ0 versus H1 : θ 6= θ0.
Assume that θ̂ is asymptotically normal:
θ̂ − θ0ŝe
N(0, 1).
The size α Wald test is: reject H0 when |W | > zα/2, where
W =θ̂ − θ0
ŝe.
-
Introduction Wald test p-values t-test Quiz
Size of the Wald test
Theorem
Asymptotically, the Wald test has size α, i.e.
Pθ0(|W | > zα/2)→ α
as n→∞.
Proof.
We have
Pθ0(|W | > zα/2) = Pθ0
(|θ̂ − θ0|
ŝe> zα/2
)→ P(|Z | > zα/2)= α.
-
Introduction Wald test p-values t-test Quiz
Remark
An alternative version of the Wald test statistic is
W =θ̂ − θ0
se0,
where se0 is the standard error of θ̂ computed at θ0.
Both versions of the test are valid.
-
Introduction Wald test p-values t-test Quiz
Power of the Wald test
Theorem
Suppose the true value of θ is θ∗ 6= θ0. The power β(θ∗) (theprobability of correctly rejecting the null hypothesis) isapproximately given by
1− Φ
(θ̂ − θ∗
ŝe+ zα/2
)+ Φ
(θ̂ − θ∗
ŝe− zα/2
).
-
Introduction Wald test p-values t-test Quiz
Example
Example
Let X1, . . . ,Xm and Y1, . . . ,Yn be two independent samples fromdistributions with means µ1 and µ2, respectively. Let us test thenull hypothesis that µ1 = µ2. Write this as
H0 : δ = 0 versus H1 : δ 6= 0,
where δ = µ1 − µ2. The plug-in estimate of δ is δ̂ = Xm −Y n withestimated standard error ŝe =
√s21m +
s22n , where s
21 , s
22 are sample
variances. The size α Wald test rejects H0 when |W | > zα/2, where
W =δ̂ − 0
ŝe=
Xm − Y n√s21m +
s22n
.
-
Introduction Wald test p-values t-test Quiz
Example 2
Example
Let us assume that we flip a coin 100 times and 25 out of the 100we got head. Do we have strong enough evidence to say that thecoin is not fair?
-
Introduction Wald test p-values t-test Quiz
Relation to confidence intervals
Theorem
The size α Wald test rejects H0 : θ = θ0 versus H1 : θ 6= θ0 if andonly if θ0 /∈ C , where
C = (θ̂ − ŝezα/2, θ̂ + ŝezα/2).
In words, testing the hypothesis with Wald test isequivalent tochecking whether the null value θ0 is in the confidence interval.
-
Introduction Wald test p-values t-test Quiz
Warning
When we reject H0, we say the result is statistically significant.
The result may be statistically significant without the effectsize being large.
In such a case we have a result that is statistically significant,but perhaps not scientifically or practically significant.
Any confidence interval that excludes θ0 corresponds torejecting θ0. But θ0 can be close to an endpoint of theconfidence interval (not scientifically significant), or far away(scientifically significant).
-
Introduction Wald test p-values t-test Quiz
Heuristics
Consider a hypothesis test with test statistic T and therejection region Rα at level α.
Fix α. Upon observing the data X we evaluate the statisticT (X ) and ask whether the test rejects the null hypothesis atlevel α.
If yes, we gradually decrease α, while asking the samequestion.
Eventually we arrive at α at which the test does not reject thenull hypothesis.
This value of α is called the p-value. It measures the strengthof evidence against H0: when the p-value is small, the dataoffers little support in favour of H0.
-
Introduction Wald test p-values t-test Quiz
Evidence scale
The p-value offers a ‘continuous’ scale of evidence against thenull hypothesis and can be more informative than a simple‘accept/reject’ of the hypothesis testing.
The following evidence scale is typically used:
p-value evidence
< 0.01 very strong evidence against H00.01− 0.05 strong evidence against H00.05− 0.10 weak evidence against H0> 0.1 no or little evidence against H0
Warning: a large p-value is not strong evidence in favour ofH0. The large p-value can occur for two reasons:
(i) H0 is true, or(ii) H0 is false, but the test has low power.
-
Introduction Wald test p-values t-test Quiz
p-value
Definition
Suppose for every α ∈ (0, 1) we have a size α test with rejectionregion Rα. Then
p-value = inf{α : T (X ) ∈ Rα}.
That is, the p-value is the smallest level at which we can reject H0.
-
Introduction Wald test p-values t-test Quiz
Computation
Theorem
Suppose that the size α test is of the form
reject H0 if and only if T (X ) ≥ cα.
Then, with x the observed value of X ,
p-value = supθ∈Θ0
Pθ(T (X ) ≥ T (x)).
Thus if Θ0 = {θ0},
p-value = Pθ0(T (X ) ≥ T (x)).
In other words, p-value is the probability under H0 of observing avalue of the test statistic the same or more extreme than what weactually observed.
-
Introduction Wald test p-values t-test Quiz
Wald statistic
Theorem
Let w = (θ̂ − θ0)/ŝe denote the observed value of the Waldstatistic. The p-value is given by
p-value = Pθ0(|W | > |w |) ≈ Pθ0(|Z | > |w |) = 2Φ(−|w |).
-
Introduction Wald test p-values t-test Quiz
Distribution of the p-value
Theorem
If the test statistic has a continuous distribution, then underH0 : θ = θ0 the p-value has a Uniform(0, 1) distribution.Therefore, if we reject H0 when the p-value is less than α, theprobability of type I error is α.
-
Introduction Wald test p-values t-test Quiz
t-test
To test H0 : µ = µ0 versus µ 6= µ0, where µ = E[Xi ], we canuse the Wald test.
When X1, . . . ,Xn ∼ N(µ, σ2) (with θ = (µ, σ2) unknown) andn is small, it is more common to use the t-test.
A random variable has a t-distribution with k degrees offreedom, if its PDF is given by
f (t) =Γ(k+1
2
)√kπΓ
(k2
) (1 + t
2
k
)(k+1)/2 .When k →∞, this tends to the N(0, 1) distribution.
-
Introduction Wald test p-values t-test Quiz
t-distribution
-
Introduction Wald test p-values t-test Quiz
t-test (continued)
Let
T =
√n(X n − µ0)
Sn,
where S2n is the sample variance.
Under H0, T ∼ tn−1.Let tn−1,α/2 denote the upper α/2-quantile of thet-distribution.
If we reject the null hypothesis when |T | > tn−1,α/2, we get asize α test.
However, already for moderately large n the test is essentiallyidentical to the Wald test.
-
Introduction Wald test p-values t-test Quiz
Example
Example
The bload pressure of 6 patients were measured before taking amedicine (124, 134, 117, 150, 160, 150) and after taking themedicine (135, 136, 120, 145, 155, 126), respectively. Do we havesufficient statistical evidence that the medicine was useful inlowering the bload pressure?
-
Introduction Wald test p-values t-test Quiz
Question 1
What is our aim in hypothesis testing?
Answers:
1 We check whether we can accept our theory (null hypothesis).
2 We test whether the data provides sufficient evidence to rejectour theory (null hypothesis).
3 We check whether the null hypothesis or the alternativehypothesis is more likely.
4 We check whether the data is normally distributed?
-
Introduction Wald test p-values t-test Quiz
Question 1
What is our aim in hypothesis testing?
Answers:
1 We check whether we can accept our theory (null hypothesis).
2 We test whether the data provides sufficient evidence to rejectour theory (null hypothesis).
3 We check whether the null hypothesis or the alternativehypothesis is more likely.
4 We check whether the data is normally distributed?
-
Introduction Wald test p-values t-test Quiz
Question 2
What is not true for the rejection region R?
Answers:
1 It is a random variable.
2 We reject the null hypothesis if the test statistics T is insideof R.
3 Typically is of the form R = {x : T (x) > c}.4 We make decision on the null hypothesis based on the
rejection region and the test statistics, which form a “pair”.
-
Introduction Wald test p-values t-test Quiz
Question 2
What is not true for the rejection region R?
Answers:
1 It is a random variable.
2 We reject the null hypothesis if the test statistics T is insideof R.
3 Typically is of the form R = {x : T (x) > c}.4 We make decision on the null hypothesis based on the
rejection region and the test statistics, which form a “pair”.
-
Introduction Wald test p-values t-test Quiz
Question 3
Which of the following statements is not true for the error types inhypothesis?
Answers:
1 The type I error is when we reject the null hypothesis altoughit is true.
2 Type II error is when we retain the null hypothesis although itwas not true.
3 We make a type II error when we do not reject the nullhypothesis when the alternate is true.
4 We make a type II error if we accept the null hypothesis and itwas true.
-
Introduction Wald test p-values t-test Quiz
Question 3
Which of the following statements is not true for the error types inhypothesis?
Answers:
1 The type I error is when we reject the null hypothesis altoughit is true.
2 Type II error is when we retain the null hypothesis although itwas not true.
3 We make a type II error when we do not reject the nullhypothesis when the alternate is true.
4 We make a type II error if we accept the null hypothesis and itwas true.
-
Introduction Wald test p-values t-test Quiz
Question 4
What is not true for the Wald test?
Answers:
1 The Wald test statistic is W = (θ̂ − θ0)/ŝe and the criticalregion |W | > zα/2.
2 Pθ0(|W | > zα/2)→ α.3 The data has to be normally distributed to apply it.
4 An alternative version of the Wald test is W = (θ̂ − θ0)/se0.
-
Introduction Wald test p-values t-test Quiz
Question 4
What is not true for the Wald test?
Answers:
1 The Wald test statistic is W = (θ̂ − θ0)/ŝe and the criticalregion |W | > zα/2.
2 Pθ0(|W | > zα/2)→ α.3 The data has to be normally distributed to apply it.
4 An alternative version of the Wald test is W = (θ̂ − θ0)/se0.
-
Introduction Wald test p-values t-test Quiz
Question 5
What is a p value?
Answers:
1 The probability of the null hypothesis.
2 It measures the strength of evidence against H0.
3 It is equivalent to the first type of error.
4 It provides a strong evidence in favour of H0.
-
Introduction Wald test p-values t-test Quiz
Question 5
What is a p-value?
Answers:
1 The probability of the null hypothesis.
2 It measures the strength of evidence against H0.
3 It is equivalent to the first type of error.
4 It provides a strong evidence in favour of H0.
-
Introduction Wald test p-values t-test Quiz
Question 6
When does one use a t-test instead of a Wald test?
Answers:
1 For modorate sample size.
2 For small sample size
3 If the data is not normally distributed.
4 If the variance of the normal distribution is known.
-
Introduction Wald test p-values t-test Quiz
Question 6
When does one use a t-test instead of a Wald test?
Answers:
1 For modorate sample size.
2 For small sample size.
3 If the data is not normally distributed.
4 If the variance of the normal distribution is known.
IntroductionWald testp-valuest-testQuiz