sociology 5811: lecture 10: hypothesis tests copyright © 2005 by evan schofer do not copy or...

Sociology 5811: Lecture 10: Hypothesis Tests

Copyright © 2005 by Evan Schofer

Do not copy or distribute without permission

Announcements

• Problem Set #3 due next week

• Problem set posted on course website

• We are a bit ahead of reading assignments in Knoke book

• Try to keep up; read ahead if necessary

Z-scores for Sampling Distributions

• New application of Z-Scores:

$$Z_i = \frac{Y_i - \bar{Y}}{s_Y}$$

$$Z = \frac{\bar{Y} - \mu}{\sigma_{\bar{Y}}}$$

• “Old” formula is for variable distributions:

• It calculates the number of standard deviations a case falls from Y-bar

• New formula is for sampling distributions

• It tells you the number of standard errors Y-bar falls from the population mean

• We can also compute distance from Y-bar to a hypothetical value of μ (as we did last class).
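The two uses of the Z-score can be sketched in a few lines of Python. This is only an illustration: the scores are made-up data, the hypothetical population mean of 60 is an assumption, and the standard error is estimated from the sample standard deviation because the true value is rarely known.

```python
import math
import statistics

def z_for_case(y_i, y_bar, s_y):
    """'Old' formula: standard deviations a single case falls from Y-bar."""
    return (y_i - y_bar) / s_y

def z_for_sample_mean(y_bar, mu, s_y, n):
    """'New' formula: standard errors Y-bar falls from a population mean mu.

    The standard error is estimated as s_y / sqrt(n).
    """
    standard_error = s_y / math.sqrt(n)
    return (y_bar - mu) / standard_error

# Hypothetical test scores for ten students (invented for illustration).
scores = [52, 61, 58, 70, 64, 55, 67, 59, 62, 66]
y_bar = statistics.mean(scores)
s_y = statistics.stdev(scores)  # sample standard deviation (N - 1 denominator)

z_case = z_for_case(70, y_bar, s_y)                        # one case vs. Y-bar
z_mean = z_for_sample_mean(y_bar, 60, s_y, len(scores))    # Y-bar vs. mu = 60

print(f"Y-bar = {y_bar:.1f}, z for case 70 = {z_case:.2f}, z for Y-bar = {z_mean:.2f}")
```

Note that the only difference between the two functions is the denominator: a standard deviation for a case, a standard error for a sample mean.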

Hypothesis Testing

• Hypothesis Testing:

• A formal language and method for examining claims using inferential statistics

• Designed for use with probabilistic empirical assessments

• Because of the probabilistic nature of inferential statistics, we cannot draw conclusions with absolute certainty

• We cannot “prove” our claims are “true”

• While it is improbable, we will occasionally draw a sample that is highly unusual, leading to incorrect conclusions.

Hypothesis Testing

• The logic of hypothesis testing:

• We cannot “prove” anything

• Instead, we will cast doubt on other claims, thus indirectly supporting our own

• Strategy:

– 1. We first state an “opposing” claim

• The opposite of what we want to claim

– 2. If we can cast sufficient doubt on it, we are forced (grudgingly) to accept our own claim.

Hypothesis Testing

• Example: Suppose we wish to argue that our school is above the national standard

• First we state the opposite:

• “Our school is not above the national standard”

• Next we state our alternative:

• “Our school is above the national standard”

• If our statistical analysis shows that the first claim is highly improbable, we can “reject” it, in favor of the second claim

• …“accepting” the claim that our school is doing well.

Hypothesis Testing: Jargon

• Hypotheses: Claims we wish to test

• Typically, these are stated in a manner specific enough to test directly with statistical tools

– We typically do not test hypotheses such as “Marx was right” / “Marx was wrong”

– Rather: The mean years of education for Americans is/is not above 18 years.

Hypothesis Testing: Jargon

• The hypothesis we hope to find support for is referred to as the alternate hypothesis

• The hypothesis counter to our argument is referred to as the null hypothesis

• Null and alternative hypotheses are denoted as:

• H0: School does not exceed the national standard

• H-zero indicates null hypothesis

• H1: School does exceed national standard

• H-1 indicates alternate hypothesis

• Sometimes called: “Ha”

Hypothesis Testing: More Jargon

• If evidence suggests that the null hypothesis is highly improbable, we “reject” it

• And, we “accept” the alternative hypothesis

• So, typically we:

• Reject H0, accept H1

• Or:

• Fail to reject H0, do not find support for H1

• That was what happened when we “tested” whether our school exceeded the national standard (μ = 60).

Hypothesis Testing

• In order to conduct a test to evaluate hypotheses, we need two things:

• 1. A statistical test which reflects on how probable our observed result would be if H0 (rather than H1) were true

• Here, we used a z-score/t-score to determine that probability

• 2. A pre-determined level of probability below which we feel safe in rejecting H0 (α)

• In the example, we wanted to be 95% confident… α = .05

• But, the probability was .105, so we couldn’t conclude that the school exceeded the national standard!
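The decision rule itself is just a comparison between the observed probability and the pre-determined α. A minimal sketch, where the function name `decide` is invented for illustration:

```python
def decide(p_value, alpha=0.05):
    """Compare an observed probability to a pre-determined alpha level."""
    if p_value < alpha:
        return "reject H0"
    return "fail to reject H0"

# The school example: observed probability .105, alpha .05.
school_result = decide(0.105, alpha=0.05)
print(school_result)
```

Because .105 is not below .05, the rule returns "fail to reject H0", which matches the conclusion drawn for the school example.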

Hypothesis Test for the Mean

• Example: Corporate Salaries

– Imagine I’m a human resources director of “Evan.com”

• Our engineers are paid 50,000/year

• I suspect that our salaries are not competitive

– So, I survey employees of our main competitor…

• I sample 20 people and observe a mean salary of 55K

– Y-bar is 55K, but we don’t know μ…

• Issue: Are our salaries below the industry?

– Hypotheses:

• H0: Competitor’s salaries are no better (μ <= 50K)

• H1: Competitor’s salaries are better (μ > 50K).

Hypothesis Test: Example

• It looks like the other company pays more:

• Average salary is 55K, compared to our baseline of 50K

• Question: Can we reject the null hypothesis and accept the alternate hypothesis?

• Answer: No! It is possible that we just drew an atypical sample.

• The true population mean for the competitor may not actually be higher.


• We need to use our statistical knowledge to determine:

• What is the probability of drawing a sample (N=20) with mean of 55K from a population of mean 50K?

• If that is a probable event, we can’t draw very strong conclusions. It is likely that competitor salaries are the same.

• But, if the event is very improbable, we can conclude that the competitor salaries exceed 50K.


• How would we determine the probability that the competitor mean salary is really only 50K?

• Answer: We apply the Central Limit Theorem to determine the shape of the sampling distribution

• And then calculate a Z-value or T-value based on it

• Suppose we chose an alpha (α) of .05

• If we observe a t-value with probability of only .0023, then we can reject the null hypothesis.

• If we observe a t-value with probability of .361, we cannot reject the null hypothesis.
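To see how the salary example might play out, here is a rough sketch. The slides do not report the sample standard deviation, so the two values of `s` below are pure assumptions chosen for illustration. The critical value 1.729 is the standard t-table entry for a one-tailed test with α = .05 and 19 degrees of freedom (N − 1).

```python
import math

N = 20            # competitor employees sampled
Y_BAR = 55_000    # observed sample mean
MU_0 = 50_000     # mean under the null hypothesis
T_CRITICAL = 1.729  # one-tailed, alpha = .05, df = 19 (from a t-table)

def t_statistic(y_bar, mu_0, s, n):
    """Distance from y_bar to mu_0 in units of estimated standard errors."""
    return (y_bar - mu_0) / (s / math.sqrt(n))

# ASSUMPTION: the sample standard deviation is not given in the lecture.
t_small_spread = t_statistic(Y_BAR, MU_0, 10_000, N)  # s = 10K (hypothetical)
t_large_spread = t_statistic(Y_BAR, MU_0, 15_000, N)  # s = 15K (hypothetical)

for label, t_obs in [("s = 10K", t_small_spread), ("s = 15K", t_large_spread)]:
    verdict = "reject H0" if t_obs > T_CRITICAL else "fail to reject H0"
    print(f"{label}: t = {t_obs:.2f} -> {verdict}")
```

The same 5K gap leads to different conclusions depending on how spread out salaries are, which is why the observed mean alone cannot settle the question.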

Hypothesis Test: Steps

• 1. State the research hypothesis (“alternate hypothesis”), H1

• 2. State the null hypothesis, H0

• 3. Choose an α-level (alpha-level)

• Typically .05, sometimes .10 or .01

• 4. Look up value of test statistic corresponding to the α-level (called the “critical value”)

• Example: find the “critical” t-value associated with α = .05


• 5. Use statistics to calculate a relevant test statistic.

• T-value or Z-value

• Soon we will learn additional ones

• 6. Compare test statistic to “critical value”

• If test statistic is greater (in absolute value, for a two-tailed test), we reject H0

• If it is smaller, we cannot reject H0


• Alternate steps:

• 3. Choose an alpha-level

• 4. Get software to conduct relevant statistical test

• Software will compute test statistic and provide a probability… the probability of observing a test statistic at least that large if H0 is true.

• If this is lower than alpha, reject H0
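Steps 3 through 6 can be sketched as a small function. This version is a simplification under stated assumptions: it uses the normal (Z) distribution, which is reasonable only when N is large, and it runs a two-tailed test. The function name and the example numbers are invented for illustration.

```python
import math
from statistics import NormalDist

def critical_value_test(y_bar, mu_0, s, n, alpha=0.05):
    """Two-tailed, large-sample test of H0: mu = mu_0.

    Returns (test statistic, critical value, whether H0 is rejected).
    """
    # Step 4: look up the critical value for the chosen alpha-level.
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    # Step 5: calculate the test statistic.
    standard_error = s / math.sqrt(n)
    statistic = (y_bar - mu_0) / standard_error
    # Step 6: compare the test statistic to the critical value.
    return statistic, critical, abs(statistic) > critical

# Hypothetical numbers: Y-bar = 61.5, H0 mean = 60, s = 10, N = 400.
statistic, critical, reject = critical_value_test(61.5, 60, 10, 400)
print(f"z = {statistic:.2f}, critical = {critical:.2f}, reject H0: {reject}")
```

With small samples a t-table (or statistical software) should supply the critical value instead of the normal distribution used here.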

Hypothesis Test: Errors

• Due to the probabilistic nature of such tests, there will be periodic errors.

– Sometimes the null hypothesis will be true, but we will reject it

• When we falsely reject H0, it is called a Type I error

• Our alpha-level determines the probability of this

– Sometimes we do not reject the null hypothesis, even though it is false

• When we falsely fail to reject H0, it is called a Type II error

– In general, we are most concerned about Type I errors… we try to be conservative.
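A small simulation can illustrate why the alpha-level determines the probability of a Type I error. In this sketch the null hypothesis is true by construction, so every rejection is a false one. The population values (mean 50, standard deviation 10), the sample size, and the seed are all arbitrary assumptions.

```python
import math
import random
from statistics import NormalDist

def type_i_error_rate(trials=2000, n=30, mu=50.0, sigma=10.0,
                      alpha=0.05, seed=5811):
    """Share of samples that wrongly reject a TRUE null hypothesis.

    Each sample is drawn from a normal population whose mean really is mu,
    then tested (two-tailed Z-test, sigma treated as known) against mu.
    """
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    standard_error = sigma / math.sqrt(n)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        y_bar = sum(sample) / n
        z = (y_bar - mu) / standard_error
        if abs(z) > critical:
            rejections += 1  # H0 is true, so this is a Type I error
    return rejections / trials

rate = type_i_error_rate()
print(f"False rejections: {rate:.3f} of samples (alpha was .05)")
```

The share of false rejections should land close to α, and lowering α makes Type I errors rarer at the cost of making it harder to reject a false null.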

Hypothesis Tests About a Mean

• Possible hypothesis tests for a single mean:

• 1. Population mean is not equal to a certain value

• Null hypothesis is that the mean is equal to that value

• 2. Population mean is higher than a value

• Null hypothesis: mean is equal or less than a value

• 3. Population mean is lower than a value

• Null hypothesis: mean is equal or greater than a value

• We will learn more interesting kinds of tests:

• Tests comparing means of two groups

• Tests about correlations, regressions, etc.
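The three single-mean tests listed above differ only in which tail(s) of the sampling distribution count as evidence against H0. A sketch of that idea, using the normal distribution as a large-sample approximation; the function name and the labels for the three alternatives are invented here.

```python
from statistics import NormalDist

def p_value(z, alternative):
    """Probability of a test statistic at least this extreme if H0 is true.

    alternative: "not_equal" (test 1), "greater" (test 2), or "less" (test 3).
    """
    standard_normal = NormalDist()
    if alternative == "not_equal":
        return 2 * (1 - standard_normal.cdf(abs(z)))  # both tails
    if alternative == "greater":
        return 1 - standard_normal.cdf(z)             # upper tail only
    if alternative == "less":
        return standard_normal.cdf(z)                 # lower tail only
    raise ValueError(f"unknown alternative: {alternative!r}")

for alt in ("not_equal", "greater", "less"):
    print(f"z = 1.96, H1 {alt}: p = {p_value(1.96, alt):.3f}")
```

The same test statistic gives a different probability under each alternative, so the hypotheses need to be stated before looking at the result.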

Hypothesis Tests About Means

• Example: Bohrnstedt & Knoke, section 3.93, pp. 108-110. N = 1015, Y-bar = 2.91, s = 1.45

• H0: Population mean = 4

• H1: Population mean ≠ 4

• Strategy:

• 1. Choose Alpha (let’s use .001)

• 2. Determine the Standard Error

• 3. Use S.E. to determine the probability of the observed mean (Y-bar), IF the population mean is really 4.

• 4. If the probability is below .001, reject H0

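Example: Is μ = 4?

• Let’s determine how far Y-bar is from hypothetical μ = 4

• In units of standard errors

$$\hat{\sigma}_{\bar{Y}} = \frac{s_Y}{\sqrt{N}} = \frac{1.45}{\sqrt{1015}} = .046$$

$$t = \frac{\bar{Y} - \mu}{\hat{\sigma}_{\bar{Y}}} = \frac{2.91 - 4}{\hat{\sigma}_{\bar{Y}}} = \frac{-1.09}{.046} \approx -24.0$$

• Y-bar is 24 standard errors below 4.0!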


• A Z-table (if N is large) or a T-table will tell us probabilities of Y-bar falling Z (or T) standard errors from μ

• In this example, the desired α = .001

• Which corresponds to t = 3.3 (taken from t-table)

– That is: .001 (i.e., .1%) of samples (of size 1015) fall beyond 3.3 standard errors of the population mean

• 99.9% fall within 3.3 S.E.’s.


• There are two ways to finish the “test”

• 1. Compare “critical t” to “observed t”

• Critical t is 3.3, observed t = -24

• We reject H0: t of -24 is HUGE, very improbable

• It is highly unlikely that μ = 4

• 2. Actually calculate the probability of observing a t-value of -24, compare to pre-determined α

• If observed probability is below α, reject H0

– In this case, probability of t = -24 is .0000000000000…

• Very improbable. Reject H0!
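Both ways of finishing the test can be checked with a short script using the numbers from the example (N = 1015, Y-bar = 2.91, s = 1.45, hypothetical mean 4, α = .001). One assumption to flag: with N this large the script uses the normal distribution in place of the t distribution, so the critical value it computes is a close approximation to the t-table entry.

```python
import math
from statistics import NormalDist

N, Y_BAR, S = 1015, 2.91, 1.45
MU_0 = 4.0
ALPHA = 0.001

# Estimated standard error of the mean.
standard_error = S / math.sqrt(N)

# Observed test statistic: standard errors between Y-bar and MU_0.
t_observed = (Y_BAR - MU_0) / standard_error

# Way 1: compare to the two-tailed critical value (normal approximation).
critical = NormalDist().inv_cdf(1 - ALPHA / 2)
reject_by_critical_value = abs(t_observed) > critical

# Way 2: compute the two-tailed probability directly.
# erfc keeps precision for tail areas far too small for 1 - cdf(...).
p_two_tailed = math.erfc(abs(t_observed) / math.sqrt(2))
reject_by_probability = p_two_tailed < ALPHA

print(f"S.E. = {standard_error:.3f}, t = {t_observed:.1f}, critical = {critical:.2f}")
print(f"p = {p_two_tailed:.3g}, reject H0: {reject_by_probability}")
```

Both routes agree, as they always will: the critical value and the probability are two views of the same sampling distribution.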

Two-Tail Tests

• Visually: Most Y-bars should fall near μ

• 99.9% CI: –3.3 < t < 3.3, or 3.85 to 4.15

[Figure: Sampling Distribution of the Mean, centered at 4, with cutoffs at 3.85 (Z = -3.3) and 4.15 (Z = +3.3). Mean of 2.91 (t = -24) is far into the red area (beyond edge of graph).]
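The interval endpoints of 3.85 and 4.15 come from converting ±3.3 standard errors back into the units of Y. A brief sketch of that conversion, reusing the example's values (s = 1.45, N = 1015); the variable names are illustrative.

```python
import math

MU_0 = 4.0          # hypothesized population mean
CRITICAL_T = 3.3    # critical value for alpha = .001, two-tailed
standard_error = 1.45 / math.sqrt(1015)

# Range of sample means we would expect 99.9% of the time if mu = 4.
lower = MU_0 - CRITICAL_T * standard_error
upper = MU_0 + CRITICAL_T * standard_error

observed_mean = 2.91
inside = lower < observed_mean < upper

print(f"Expected range: {lower:.2f} to {upper:.2f}; 2.91 inside? {inside}")
```

Since 2.91 falls well outside this range, the observed mean lands in the rejection region.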


• Note: This test was set up as a “two-tailed test”

• Hypothesis was μ ≠ 4… It didn’t specify a direction

• Meaning that we reject H0 if observed Y-bar falls in either tail of the sampling distribution

• Not all tests are done that way… Sometimes you only want to reject H0 if Y-bar falls in one particular tail.

Hypothesis Testing

• Definition: Two-tailed test: A hypothesis test in which the α-area of interest falls in both tails of a Z or T distribution.

• Example: H0: μ = 4; H1: μ ≠ 4

• Definition: One-tailed test: A hypothesis test in which the α-area of interest falls in just one tail of a Z or T distribution.

• Example: H0: μ ≥ 4; H1: μ < 4

• Example: H0: μ ≤ 4; H1: μ > 4

• This is called a “directional” hypothesis test.

Hypothesis Tests About Means

• A one-tailed test: H1: μ < 4

• Entire α-area is on left, as opposed to half (α/2) on each side. SO: the critical t-value changes.

[Figure: sampling distribution centered at 4, with the α-area in the left tail]

• T-value changes because the alpha area (e.g., 5%) is all concentrated in one side of the distribution, rather than split half and half.

• One tail vs. Two-tail:

Looking Up T-Tables

How much does the 95% t-value change when you switch from a 2-tailed to 1-tailed test?

Two-tailed test (20 df): t = 2.086

One-tailed test (20 df): t = 1.725


• Use one-tailed tests when you have a directional hypothesis

• e.g., μ > 5

• Otherwise, use 2-tailed tests

• Note: Switching to a one-tailed test lowers the critical t-value needed to reject H0.
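The two t-table values (2.086 and 1.725 at 20 df) can be checked numerically: each should cut off a total α-area of about .05, split across two tails in one case and placed in a single tail in the other. The sketch below integrates the t density with Simpson's rule so that it needs only the standard library. The function names are invented, and this is a rough check rather than a replacement for a t-table or statistical software.

```python
import math

def t_density(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coefficient = math.gamma((df + 1) / 2) / (
        math.sqrt(df * math.pi) * math.gamma(df / 2)
    )
    return coefficient * (1 + t * t / df) ** (-(df + 1) / 2)

def upper_tail_area(t, df, steps=2000):
    """P(T > t) for t >= 0, using Simpson's rule on the interval [0, t].

    The distribution is symmetric, so the area above 0 is exactly 0.5;
    subtracting the area between 0 and t leaves the upper tail.
    steps must be even.
    """
    h = t / steps
    total = t_density(0.0, df) + t_density(t, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_density(i * h, df)
    return 0.5 - total * h / 3

two_tailed_alpha = 2 * upper_tail_area(2.086, 20)  # area in both tails
one_tailed_alpha = upper_tail_area(1.725, 20)      # area in one tail

print(f"Two-tailed area beyond t = 2.086: {two_tailed_alpha:.4f}")
print(f"One-tailed area beyond t = 1.725: {one_tailed_alpha:.4f}")
```

Both areas come out at roughly .05, which shows that the lower one-tailed critical value is not a looser standard: it simply places the whole α-area on one side.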

Tests for Differences in Means

• A more useful and interesting application of these same ideas…

• Hypothesis tests about the means of two different groups

• Up until now, we’ve focused on a single mean for a homogeneous group

• It is more interesting to begin to compare groups

• Are they the same? Different?

• We’ll do that next class!
