statistical inference with small...

30
Statistical Inference with Small Samples

Upload: others

Post on 04-Jun-2020

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Statistical Inference with Small Samplesketelsen/files/courses/csci3022/slides/lesson19.pdfStatistical inference for population mean when data is normal and n is small and ... The

Statistical Inference with Small Samples

Page 2: Statistical Inference with Small Samplesketelsen/files/courses/csci3022/slides/lesson19.pdfStatistical inference for population mean when data is normal and n is small and ... The

Administriviao Homework 5 due Friday Nov 10

2

Page 3: Statistical Inference with Small Samplesketelsen/files/courses/csci3022/slides/lesson19.pdfStatistical inference for population mean when data is normal and n is small and ... The

Previously on CSCI 3022Statistical inference for population mean when data is normal and n is large and …

is known:

is unknown:

3

Z - TEST

( Ife ) - Mai ) I±z'a. Eu

( *- µ

Thru) - Non )

Page 4: Statistical Inference with Small Samplesketelsen/files/courses/csci3022/slides/lesson19.pdfStatistical inference for population mean when data is normal and n is small and ... The

Previously on CSCI 3022Statistical inference for population mean when data is NOT normal and n is large and …

is known:

is unknown:

4

c III ) Enron)I - µ

CLT

( ¥) - Mon )

Page 5: Statistical Inference with Small Samplesketelsen/files/courses/csci3022/slides/lesson19.pdfStatistical inference for population mean when data is normal and n is small and ... The

Previously on CSCI 3022Statistical inference for population mean when data is normal and n is small and …

is known:

is unknown:

5

*(a) ~ Nloil )

? ? ? ? ? ? ?

Page 6: Statistical Inference with Small Samplesketelsen/files/courses/csci3022/slides/lesson19.pdfStatistical inference for population mean when data is normal and n is small and ... The

The Story so Far for MeansThus far, we’ve talked about Hypothesis Testing / Confidence Intervals for the mean of a population in the following cases:

n � 30 n < 30

Normal Data / Known �

Normal Data / Unknown �

Non-Normal Data / Known �

Non-Normal Data / Unknown �

6

" " " " " "

1%1/1111/1114114411111'

4/111111/1/1

#t Z - TEST 1¥ BOOTSTRAP Ex t - TEST

Page 7: Statistical Inference with Small Samplesketelsen/files/courses/csci3022/slides/lesson19.pdfStatistical inference for population mean when data is normal and n is small and ... The

Small-Sample Tests for o When n is small we cannot invoke the Central Limit Theorem

o When n is small and the variance is unknown we need to do something else …

When is the sample mean of a random sample of size n from a normal distribution with mean , the random variable

follows a probability distribution called a t-Distribution with parameter degrees of freedom.

µ

µX̄

⌫ = n� 1

7

⇐skin

Page 8: Statistical Inference with Small Samplesketelsen/files/courses/csci3022/slides/lesson19.pdfStatistical inference for population mean when data is normal and n is small and ... The

The t-DistributionThe following figure shows the pdf of some members of the family of t-Distributions

8

Page 9: Statistical Inference with Small Samplesketelsen/files/courses/csci3022/slides/lesson19.pdfStatistical inference for population mean when data is normal and n is small and ... The

Properties of t-DistributionsLet denote the t-Distribution with parameter degrees of freedom

o Each –curve is bell-shaped and centered at 0

o Each -curve is more spread out than the standard normal distribution

o As increases, the spread of the corresponding -curve decreases

o As the sequence of -curves approaches the standard normal curve

t⌫

t⌫

t⌫

t⌫

t⌫

⌫ ! 1

9

= n -1

Page 10: Statistical Inference with Small Samplesketelsen/files/courses/csci3022/slides/lesson19.pdfStatistical inference for population mean when data is normal and n is small and ... The

The t-Critical ValueWe can extend all of our inferential mechanics to the small-sample case by introducing the so-called t-critical value, which we denote

Def: The t-critical value, , is the point such that the area under the -curve to the right of is equal to

Example: is the t-critical value that captures the upper-tail area of 0.05 under the t curve with 6 degrees of freedom.

t↵,⌫

t↵,⌫t↵,⌫

t⌫

t0.05,6

10

*a

taw

Page 11: Statistical Inference with Small Samplesketelsen/files/courses/csci3022/slides/lesson19.pdfStatistical inference for population mean when data is normal and n is small and ... The

The t-Confidence Interval for the MeanLet and s be the sample mean and sample standard deviation computed from the results of a random sample with of size n from a normal population with mean .

Then a t-confidence interval for the mean is given by:

Or, more compactly:

µ

x̄µ

100(1� ↵)%

11

[I- th, n . I Fu

,

I + them ¥ ]

I ± txn,ni In

Page 12: Statistical Inference with Small Samplesketelsen/files/courses/csci3022/slides/lesson19.pdfStatistical inference for population mean when data is normal and n is small and ... The

t-Confidence Interval ExampleExample: Suppose the GPAs for 23 students have a histogram that looks as follows:

The sample mean of the data is 3.146 and the sample standard deviation is 0.308. Find a 90% confidence interval for the mean GPA.

12

I -- 3.146t • 05,22 5=0.308

n=23

Shot ,o5qq.es#0G=.0642t.os,2z=stats.t.ppf/

. 95,227=1.717 a

3.146+-1.717 *• 0642 ⇒ [ 3.033 ,

3.259 ]

Page 13: Statistical Inference with Small Samplesketelsen/files/courses/csci3022/slides/lesson19.pdfStatistical inference for population mean when data is normal and n is small and ... The

The t-Test, Critical Regions and P-Values

13

Alternative Hypothesis Critical Region Level Test

H1 : ✓ > ✓0

H1 : ✓ < ✓0

H1 : ✓ 6= ✓0

t � t↵,⌫

t �t↵,⌫

�t �t↵/2,⌫

�or

�t � t↵/2,⌫

STUDENT IZED

TEST STATISTIC

t.x.us/In

Page 14: Statistical Inference with Small Samplesketelsen/files/courses/csci3022/slides/lesson19.pdfStatistical inference for population mean when data is normal and n is small and ... The

The t-Test, Critical Regions and P-Values

14

Alternative Hypothesis P-Value Level Test

H1 : ✓ > ✓0

H1 : ✓ < ✓0

H1 : ✓ 6= ✓0

P (T � t | H0) ↵

P (T t | H0) ↵

2min(P (T t | H0), P (T � t | H0)) ↵

G-

Page 15: Statistical Inference with Small Samplesketelsen/files/courses/csci3022/slides/lesson19.pdfStatistical inference for population mean when data is normal and n is small and ... The

t-Test ExampleExample: Suppose the GPAs for 23 students have a histogram that looks as follows:

The sample mean of the data is 3.146 and the sample standard deviation is 0.308. Determine if there is sufficient evidence to conclude at the 0.10 significance level that the mean GPA is not equal to 3.30.

15

I -- 3.146 Ho :µ= 3.30

g- 0.308 He : M¥3.30n= 23

+ =3.146 - 3.30

0.3087= - 2.398

Page 16: Statistical Inference with Small Samplesketelsen/files/courses/csci3022/slides/lesson19.pdfStatistical inference for population mean when data is normal and n is small and ... The

t-Test Example

16

( w/p - values )

than-2.398 0 2.398

PNAIUEzx stats .t .

Cd f ( -2.398,227=0.025420.10

Page 17: Statistical Inference with Small Samplesketelsen/files/courses/csci3022/slides/lesson19.pdfStatistical inference for population mean when data is normal and n is small and ... The

t-Test ExampleExample: Suppose the GPAs for 23 students have a histogram that looks as follows:

The sample mean of the data is 3.146 and the sample standard deviation is 0.308. Determine if there is sufficient evidence to conclude at the 0.10 significance level that the mean GPA is not equal to 3.30.

17

Page 18: Statistical Inference with Small Samplesketelsen/files/courses/csci3022/slides/lesson19.pdfStatistical inference for population mean when data is normal and n is small and ... The

t-Test Example

18

£42,22 = t.os.az = stats .t . ppf ( 0,95 ,

22 ) = 1,71

RE ) Ecton Region⇒.tk#p%EeMztsion-t= -£3980

S ) NLE t= - 2.3 98 < - t. 05,22=-1.717

WE Reject Ho

Page 19: Statistical Inference with Small Samplesketelsen/files/courses/csci3022/slides/lesson19.pdfStatistical inference for population mean when data is normal and n is small and ... The

Inference for VariancesWe’ve talked about estimating confidence intervals for the variance of a population using the Bootstrap

But if your population is normally distributed, we have some theory which gives us a better confidence interval and works for both large and small sample sizes

Question: What does the sampling distribution of the variance look like when the population is normally distributed?

19

Page 20: Statistical Inference with Small Samplesketelsen/files/courses/csci3022/slides/lesson19.pdfStatistical inference for population mean when data is normal and n is small and ... The

The Chi-Squared DistributionThe chi-squared ( ) distribution is also parameterized by degrees of freedom

The pdfs of the family of distributions are gross, so lets just draw them.

20

⌫ = n� 1�2⌫

�2⌫

Page 21: Statistical Inference with Small Samplesketelsen/files/courses/csci3022/slides/lesson19.pdfStatistical inference for population mean when data is normal and n is small and ... The

A Confidence Interval for the VarianceLet be a random sample from a normal distribution with mean and standard deviation . Define the sample variance in the usual way as

Then the random variable follows the distribution .

21

�2n�1(n� 1)S2/�2

µX1, X2, . . . , Xn�

Then it follows that

s'

- ± , I§, lxi .- est

P ( Xiank¥2 < Xin, n

. , ) = I - a

Page 22: Statistical Inference with Small Samplesketelsen/files/courses/csci3022/slides/lesson19.pdfStatistical inference for population mean when data is normal and n is small and ... The

The Chi-Squared Dist is Non-Symmetric

22

Because the distribution is non-symmetric, we need to use two different critical values.

yngSTATS . CHIZ ,ppF( ( - Liz )

2/2× ah

1 tt

Page 23: Statistical Inference with Small Samplesketelsen/files/courses/csci3022/slides/lesson19.pdfStatistical inference for population mean when data is normal and n is small and ... The

A Confidence Interval for the VarianceFor a confidence interval we choose the two critical values and

which attributes probability to each tail. Then, with confidence we can say that

23

↵/2�21�↵/2,n�1

�2↵/2,n�1

100(1� ↵)%100(1� ↵)%

P ( Xian,.fm#< XI . ,n .,|=t×

⇒ Inches . 'x¥nm( n - 1) s

2

⇒ In . ,

< r'

< heX. a. nt

Page 24: Statistical Inference with Small Samplesketelsen/files/courses/csci3022/slides/lesson19.pdfStatistical inference for population mean when data is normal and n is small and ... The

A Confidence Interval for the VarianceFor a confidence interval we choose the two critical values and

which attributes probability to each tail. Then, with confidence we can say that

Question: How can we use this to get a confidence interval for the standard deviation?

24

↵/2�21�↵/2,n�1

�2↵/2,n�1

100(1� ↵)%100(1� ↵)%

(n� 1)S2

�2↵/2,n�1

< �2 <(n� 1)S2

�21�↵/2,n�1

100(1� ↵)%

iii.'÷En⇐I÷t

Page 25: Statistical Inference with Small Samplesketelsen/files/courses/csci3022/slides/lesson19.pdfStatistical inference for population mean when data is normal and n is small and ... The

Variance CI Example

25

Example: A large candy manufacturer produces packages of candy targeted to weight 52g. The weight of the packages of candy is known to be normally distributed, but a QC engineer is concerned that the variation in the produced packages is larger than acceptable. In an attempt to estimate the variance she selects n=10 bags at random and weighs them. The sample yields a sample variance of 4.2g. Find a 95% confidence interval for the variance and a 95% confidence interval for the standard deviation.

,

X=,05 Xm = • 025 n=10 SE 4.2

XIE,n - inX2.gs ,g= Stats .< Hlz . ppf( . 05,9 ) = 2.70

X3h. ,n -1 = X ?os,g= stats . CHLZ . ppfl . 95,9 ) = 19.02

( 10 - 1) 4.2 ( 10 - ' 742 14,0- = 1,9g -

=

19,02' 2.70

95% CI For 02 :[ 1.99 ,14.0 ]

,95% CI Fort :[ 1.41 ,

3.74 ]

Page 26: Statistical Inference with Small Samplesketelsen/files/courses/csci3022/slides/lesson19.pdfStatistical inference for population mean when data is normal and n is small and ... The

OK! Let’s Go to Work! Get in groups, get out laptop, and open the Lecture 19 In-Class Notebook

Let’s: o Do some stuff!

26

Page 27: Statistical Inference with Small Samplesketelsen/files/courses/csci3022/slides/lesson19.pdfStatistical inference for population mean when data is normal and n is small and ... The

27

Page 28: Statistical Inference with Small Samplesketelsen/files/courses/csci3022/slides/lesson19.pdfStatistical inference for population mean when data is normal and n is small and ... The

28

Page 29: Statistical Inference with Small Samplesketelsen/files/courses/csci3022/slides/lesson19.pdfStatistical inference for population mean when data is normal and n is small and ... The

29

Page 30: Statistical Inference with Small Samplesketelsen/files/courses/csci3022/slides/lesson19.pdfStatistical inference for population mean when data is normal and n is small and ... The

30