
Lecture 10 Outline: Tue, Oct 7

• Resistance of two sample t-tools (Chapter 3.3)

• Practical strategies for two-sample problem (Chapter 3.4)

• Review

• Office hours:

– Today after class

– Tomorrow morning (9 a.m. – 11:30 a.m.)

– Note: I will be out of town Thursday. I will try to check e-mail Wednesday night around 10 but will not be able to check after that.

Matched Pairs Studies

• Studies in which units in the two groups can be blocked into pairs that are more “similar” to each other than to the other units are called matched pairs studies (e.g., schizophrenia twins study)

• In a matched pairs study, the samples are not independent – knowledge of the outcome of one in a pair helps to predict the outcome of the other.

• The proper tool for analyzing a matched pairs study is the paired (one-sample) t-test.

• Motivation for doing matched pairs studies: Blocking. By controlling the influence of outside variables, we reduce variability in the responses ($\sigma$), decreasing the margin of error of a CI. More on this later in the course.
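A minimal sketch of the blocking idea, using made-up numbers (the data and the function names `paired_se`, `pooled_se` and `paired_t` are illustrative assumptions, not course material). When the two values in each pair move together, the standard error of the mean difference is much smaller than the standard error obtained by treating the two groups as independent samples:

```python
import math
import statistics

# Hypothetical matched pairs (not course data): position i in each list is one pair.
group1 = [10.1, 12.3, 9.8, 14.2, 11.0, 13.5]
group2 = [10.9, 13.0, 10.1, 15.1, 11.6, 14.4]

def paired_se(x, y):
    """SE of the mean of the within-pair differences y[i] - x[i]."""
    diffs = [b - a for a, b in zip(x, y)]
    return statistics.stdev(diffs) / math.sqrt(len(diffs))

def pooled_se(x, y):
    """SE of the difference in means if the samples were independent (pooled SD)."""
    n1, n2 = len(x), len(y)
    sp2 = ((n1 - 1) * statistics.variance(x)
           + (n2 - 1) * statistics.variance(y)) / (n1 + n2 - 2)
    return math.sqrt(sp2 * (1 / n1 + 1 / n2))

def paired_t(x, y):
    """Paired t-ratio: a one-sample t-ratio computed on the differences."""
    diffs = [b - a for a, b in zip(x, y)]
    return statistics.mean(diffs) / paired_se(x, y)

print("paired SE:", paired_se(group1, group2))
print("unpaired SE:", pooled_se(group1, group2))
print("paired t-ratio:", paired_t(group1, group2))
```

In this made-up example the pairs differ a lot from one another but the within-pair differences are stable, which is the situation in which pairing pays off.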

Recognizing Matched Pairs Studies

• Does there exist some natural relationship between the first pair of observations that makes it more appropriate to compare the first pair than the first observation in group 1 and the second observation in group 2?

• Before and after designs

• Example: A researcher for OSHA wants to see whether cutbacks in enforcement of safety regulations coincided with an increase in work related accidents. For 20 industrial plants, she has number of accidents in 1980 and 1995.

Conceptual Question #16

• A researcher has taken tissue cultures from 25 subjects. Each culture is divided in half, and a treatment is applied to one of the halves chosen at random. The other half is used as a control. After determining the percent change in the sizes of all culture sections, the researcher calculates the standard error for the treatment-minus-control differences using both the paired t-analysis and the two independent sample t-analysis. Finding that the paired t-analysis gives a slightly larger standard error (and gives only half the degrees of freedom), the researcher decides to use the results from the unpaired analysis. Is this legitimate?

Outliers and resistance

• Outliers are observations relatively far from their estimated means.

• Outliers may arise either

– (a) because the population distribution is long-tailed, or

– (b) because they don’t belong to the population of interest (they come from a contaminating population).

• A statistical procedure is resistant if one or a few outliers cannot have an undue influence on result.

Resistance

• Illustration for understanding resistance: the sample mean is not resistant; the sample median is.

– Sample: 9, 3, 5, 8, 100

– Mean with outlier: 25, without: 6.25

– Median with outlier: 8, without: 6.5

• t-tools are not resistant to outliers because they are based on sample means.
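The illustration above can be checked with a short Python sketch (the variable names are illustrative; only the five sample values come from the slide):

```python
import statistics

sample = [9, 3, 5, 8, 100]
without_outlier = [x for x in sample if x != 100]  # drop the outlying value

mean_with = statistics.mean(sample)
mean_without = statistics.mean(without_outlier)
median_with = statistics.median(sample)
median_without = statistics.median(without_outlier)

print("mean:  ", mean_with, "->", mean_without)      # 25 -> 6.25
print("median:", median_with, "->", median_without)  # 8 -> 6.5
```

Removing one observation moves the mean by almost 19 units but the median by only 1.5, which is what "resistant" means here.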

Practical two-sample strategy

• Think about independence – use tools from later in course (or matched pairs) if there’s a potential problem.

• Use graphical displays to assess: normality (particularly skewness, multimodality and heavy tails), equal spread, outliers

• If there are outliers, investigate them and see whether they (i) change conclusions; (ii) warrant removal. Follow the outlier examination strategy in Display 3.6.

Excluding Observations from Analysis in JMP for Investigating Outliers

• Click on the row you want to exclude.

• Click on rows menu and then click exclude/unexclude. A red circle with a line through it will appear next to the excluded observation.

• Multiple observations can be excluded.

• To include an observation that was excluded back into the analysis, click on excluded row, click on rows menu and then click exclude/unexclude. The red circle next to observation should disappear.

Notes on Outliers

• In the examination strategy of Display 3.6, in order to warrant the removal of an outlier, an explanation for why it is different must be established.

• It is not surprising that the outliers in the Agent Orange example have little effect, since the sample sizes are so large.

• The apparent differences in the box plots may be due to differences in sample sizes. If the population distributions are identical, more observations will appear in the extreme tails from a sample of size 646 than a sample of size 97.
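The sample-size point can be illustrated with a small simulation. This is only a sketch under an assumption not made in the slides: both samples are drawn from the same standard normal population, and the function name `average_max` is hypothetical. It compares the typical largest observation in samples of size 97 and 646:

```python
import random
import statistics

def average_max(n, reps=500, seed=0):
    """Average, over `reps` simulated samples of size n, of the largest observation.

    Every sample is drawn from the same N(0, 1) population (an assumption
    made purely for illustration).
    """
    rng = random.Random(seed)
    return statistics.mean(
        max(rng.gauss(0, 1) for _ in range(n)) for _ in range(reps)
    )

avg_max_small = average_max(97)
avg_max_large = average_max(646)
print("typical maximum, n = 97: ", avg_max_small)
print("typical maximum, n = 646:", avg_max_large)
```

Even though the populations are identical, the larger sample tends to reach further into the tail, so its box plot shows more extreme points.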

Conceptual Question #6

• (a) What course of action would you propose for the statistical analysis if it was learned that Vietnam veteran #646 (the largest observation in Display 3.6) worked for several years, after Vietnam, handling herbicides with dioxin?

• (b) What would you propose if this was learned instead for Vietnam veteran #645 (second largest observation)?

Review

• Material: 1.1-3.4, 4.5.1-4.5.3. Class notes.

• Review class notes, homework, textbook.

• Themes:

– Study design (randomized experiments vs. observational studies, random sampling vs. non-random sampling) and what inferences they permit

– Hypothesis tests and confidence intervals for two group problems.

Inference

• A statistical inference is an inference justified by a probability model linking the data to a broader context. Statistical inferences involve measures of uncertainty about the conclusions (e.g., p-values and confidence intervals).

• Population inference: an inference about population characteristics, like the difference between two population means

• Causal inference: an inference that a subject would have received a different numerical outcome had the subject belonged to a different group

Statistical inferences permitted by study designs

• Display 1.5

Confounding Variables

• A confounding variable is a variable that is related to both group membership and the outcome. Its presence makes it hard to establish the outcome as being a direct consequence of group membership. Example: experience in sex discrimination study.

• Observational studies: Always have to worry about confounding variables even in very large studies.

• Randomized experiments: Because group membership is randomly assigned, there are no confounding variables. Differences between groups are due to play of chance and can be made almost surely small in large studies.

Measuring Uncertainty

• Probability model for two treatment randomized experiment: Randomly shuffle and deal red and black cards to assign group membership.

• Additive treatment effect model: $\delta$ = the effect of being assigned to group II rather than group I.

Randomization Test

• Two-sided test: $H_0: \delta = 0$ vs. $H_1: \delta \neq 0$

• Test statistic: $T = |\bar{Y}_2 - \bar{Y}_1|$

• Distribution of test statistic under $H_0$: Distribution of T under all possible regroupings.

• p-value: Probability that T will be at least as large as the observed T, $T_o$, under $H_0$.

• p-value: Measure of evidence against $H_0$. See Display 2.12 for interpreting.
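The randomization test can be sketched in a few lines of Python. This is a minimal illustration, not the course's JMP procedure: the function name `randomization_p_value` and the tiny data set are hypothetical, and the code enumerates every regrouping, which is only feasible for small samples.

```python
from itertools import combinations
from statistics import mean

def randomization_p_value(group1, group2):
    """Two-sided randomization p-value for T = |mean(group 2) - mean(group 1)|.

    Enumerates every way of dealing the pooled observations into groups of
    the original sizes and counts regroupings with T >= the observed T.
    """
    pooled = list(group1) + list(group2)
    n1 = len(group1)
    observed = abs(mean(group2) - mean(group1))
    indices = range(len(pooled))
    count = 0
    total = 0
    for chosen in combinations(indices, n1):
        in_group1 = set(chosen)
        g1 = [pooled[i] for i in indices if i in in_group1]
        g2 = [pooled[i] for i in indices if i not in in_group1]
        t = abs(mean(g2) - mean(g1))
        if t >= observed - 1e-12:  # small tolerance so ties are not lost to rounding
            count += 1
        total += 1
    return count / total

# Hypothetical outcomes for a six-unit experiment (not course data).
p_value = randomization_p_value([1, 2, 3], [4, 5, 6])
print(p_value)  # 2 of the 20 regroupings are as extreme as the observed one: 0.1
```

With larger groups the number of regroupings grows quickly, so in practice one would sample regroupings at random or use the t-distribution approximation.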

Graphical Methods

• Box plots

• Histograms

• Stem-and-leaf diagrams

• Note on box plots: To produce two plots with same scale using Analyze, Distribution, stack them and click uniform scaling under both groups.

Sampling

• Simple random sample (of size n): each subset of population of size n has same probability of being chosen.

• Need a frame: a numbered list of all subjects

• Sampling units: In conducting a random sample, it is important that we are randomly sampling the units of interest. Otherwise, we may create a selection bias, e.g., sampling families instead of individuals, homework 2 Problem #3.

Inferences Under Random Sampling Model

• Two types of samples

– One sample/matched pairs – random sample from one population (paired t tools)

– Two independent samples (two sample t tools)

• Tools can also be used to analyze randomized experiments if group sizes are reasonably large.

Testing a hypothesis about $\mu$ (one sample/matched pairs)

• $H_0: \mu = 0$, $H_1: \mu \neq 0$

• Could the difference of $\bar{Y}$ from $\mu^*$ (the hypothesized value for $\mu$, $= 0$ here) be due to chance (in random sampling)?

• Test statistic: $|t| = \dfrac{|\bar{Y} - \mu^*|}{SE(\bar{Y})}$

• If $H_0$ is true, then t equals the t-ratio and has the Student’s t-distribution with $n-1$ degrees of freedom.

P-value

• The (2-sided) p-value is the proportion of random samples with absolute value of t ratios >= observed test statistic (|t|)

• Schizophrenia example: t = 3.23

[Figure: JMP plot for the schizophrenia example, with axes labeled X (-0.4 to 0.4) and Y (0 to 8). Reported values: Estim Mean 0.1986666667, Hypoth Mean 0, T Ratio 3.2289280811, P Value 0.0060615436, Sample Size = 15.]
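As a rough check on the reported p-value, the sketch below computes a two-sided p-value from the t-ratio without any statistics library, by numerically integrating the Student's t density. The function names are illustrative, and the degrees of freedom are taken to be n - 1 = 14 for the sample of 15:

```python
import math

def t_pdf(x, df):
    """Density of the Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p_value(t, df, steps=10_000):
    """P(|T| >= |t|), using Simpson's rule for the area between 0 and |t|.

    `steps` must be even for Simpson's rule.
    """
    b = abs(t)
    h = b / steps
    total = t_pdf(0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    # The density is symmetric, so the two tails together hold 1 - 2 * area.
    return 1 - 2 * area_0_to_t

p = two_sided_p_value(3.2289280811, df=14)
print(round(p, 4))
```

The result should agree closely with the P Value in the JMP output for the schizophrenia example.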

Confidence Intervals

• A confidence interval is a range of “plausible values” for a statistical parameter (e.g., the population mean) based on the data.

• If the population distribution of Y is normal, 95% CI for mean of single population:

$$\bar{Y} \pm t_{n-1}(.975) \times SE(\bar{Y}) = \bar{Y} \pm t_{n-1}(.975) \times \frac{s}{\sqrt{n}}$$

• A 95% confidence interval will contain the true parameter (e.g., the population mean) 95% of the time if repeated random samples are taken.
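A small worked sketch of the interval, using the schizophrenia numbers reported in the JMP output earlier in these notes (estimated mean 0.1986666667, t-ratio 3.2289280811, n = 15). Two assumptions are made here: the standard error is recovered as estimate divided by t-ratio (valid because the hypothesized mean is 0), and the multiplier 2.145 is the standard table value of $t_{14}(.975)$. The function name is illustrative.

```python
def t_confidence_interval(estimate, se, t_multiplier):
    """Return (low, high) for estimate +/- t_multiplier * se."""
    half_width = t_multiplier * se
    return estimate - half_width, estimate + half_width

mean_diff = 0.1986666667   # Estim Mean from the JMP output
t_ratio = 3.2289280811     # T Ratio from the JMP output
se = mean_diff / t_ratio   # SE of the mean, since t = (mean - 0) / SE
t_975_df14 = 2.145         # assumed table value of t_{14}(.975)

low, high = t_confidence_interval(mean_diff, se, t_975_df14)
print(round(low, 3), round(high, 3))
```

The interval excludes 0, which is consistent with the small two-sided p-value for the same data.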

Two sample t-test

• $H_0: \mu_2 - \mu_1 = \delta^*$, $H_1: \mu_2 - \mu_1 \neq \delta^*$

• Test statistic: $T = |t| = \dfrac{|(\bar{Y}_2 - \bar{Y}_1) - \delta^*|}{SE(\bar{Y}_2 - \bar{Y}_1)}$

• If population distributions are normal with equal $\sigma$, then if $H_0$ is true, the test statistic t has a Student’s t distribution with $n_1 + n_2 - 2$ degrees of freedom.

• p-value equals probability that T would be greater than observed |t| under random sampling model if $H_0$ is true; calculated from Student’s t distribution.
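The two-sample t-ratio can be computed by hand as in the sketch below. It assumes the equal-spread (pooled standard deviation) form of the standard error; the function name `pooled_two_sample_t` and the data are hypothetical and chosen so the arithmetic is easy to follow.

```python
import math
from statistics import mean, variance

def pooled_two_sample_t(y1, y2, delta_star=0.0):
    """Return (t, se, df) for testing mu2 - mu1 = delta_star with a pooled SD."""
    n1, n2 = len(y1), len(y2)
    df = n1 + n2 - 2
    # Pooled SD: weighted average of the two sample variances, then square root.
    sp = math.sqrt(((n1 - 1) * variance(y1) + (n2 - 1) * variance(y2)) / df)
    se = sp * math.sqrt(1 / n1 + 1 / n2)
    t = ((mean(y2) - mean(y1)) - delta_star) / se
    return t, se, df

# Hypothetical data (not from the course): group 2 is group 1 shifted up by 2.
t, se, df = pooled_two_sample_t([1, 2, 3, 4, 5], [3, 4, 5, 6, 7])
print(t, se, df)
```

Here both sample variances are 2.5, so the pooled SD is the square root of 2.5, the standard error is 1, and the t-ratio is 2 on 8 degrees of freedom.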

Practical vs. Statistical Significance

• The p-value of a test depends on the sample size. With a large sample, even a small difference can be “statistically significant,” that is, hard to explain by the luck of the draw. This doesn’t necessarily make it important. Conversely, an important difference may not be statistically significant if the sample is too small.

• Always accompany p-values for tests of hypotheses with confidence intervals. Confidence intervals provide information about the likely magnitude of the difference and thus provide information about its practical importance.

Designing a Study

• Types of confidence interval for key parameter in a study – Display 23.1

• Role of research design is to avoid outcome D.

• Margin of error of 95% confidence interval: Approximately $2 \times SE(\text{estimate})$; one sample problem: $2s/\sqrt{n}$

• For one sample study: choose sample size to be greater than $4s^2/(PSD)^2$, where PSD denotes least practically significant difference.
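The sample size rule follows from requiring the approximate margin of error $2s/\sqrt{n}$ to be smaller than PSD. A minimal sketch, with hypothetical function names and a made-up planning guess of s = 10 and PSD = 2:

```python
import math

def min_sample_size(s, psd):
    """Smallest integer n with n > 4 * s**2 / psd**2."""
    return math.floor(4 * s**2 / psd**2) + 1

def margin_of_error(s, n):
    """Approximate margin of error of a 95% CI in the one sample problem."""
    return 2 * s / math.sqrt(n)

n = min_sample_size(s=10, psd=2)   # hypothetical values, for illustration only
print(n, margin_of_error(10, n))   # n = 101; margin of error just under 2
```

The value of s has to come from a pilot study or prior knowledge, so the resulting n is a planning figure rather than a guarantee.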

Robustness of t-tools

• A statistical procedure is robust to departures from a particular assumption if it is valid even when the assumption is not met exactly

• Valid means that the uncertainty measures – the confidence levels and p-values – are nearly equal to the stated rates.

• If the sample sizes are large, the t-tests will be valid no matter how nonnormal the populations are.

• If the two populations have same S.D. and approximately the same shape and if $n_1 \approx n_2$, validity of t-tools is affected moderately by long-tailedness and very little by skewness.
