top schools in noida

By: School.edhole.com

Upload: edholecom

Post on 17-Nov-2014


Category:

Education



DESCRIPTION

schools information and every detail for students

TRANSCRIPT

Page 1: Top schools in noida

By:School.edhole.com

Page 2: Top schools in noida

Sociology 601 Class 8: September 24, 2009

6.6: Small-sample inference for a proportion

7.1: Large sample comparisons for two independent sample means.

7.2: Difference between two large sample proportions.


Page 3: Top schools in noida

7.1 Large sample comparisons for two independent means

So far, we have been making estimates and inferences about a single sample statistic.

Now, we will begin making estimates and inferences for two sample statistics at once.

Many real-life problems involve such comparisons.

Two-group problems often serve as a starting point for more involved statistics, as we shall see in this class.


Page 4: Top schools in noida

Independent and dependent samples

Two independent random samples:

• Two subsamples, each with a mean score for some other variable

• example: Comparisons of work hours by race or sex

• example: Comparison of earnings by marital status

Two dependent random samples:

• Two observations are being compared for each “unit” in the sample

• example: before-and-after measurements of the same person at two time points

• example: earnings before and after marriage

• example: husband-wife differences


Page 5: Top schools in noida

Comparison of two large-sample means for independent groups

Hypothesis testing as we have done it so far:

Test statistic: z = (Ybar - μo) / (s / SQRT(n))

What can we do when we make inferences about a difference between population means (μ2 - μ1)?

Treat one sample mean as if it were μo?

(NO: too much type I error)

Calculate a confidence interval for each sample mean and see if they overlap?

(NO: too much type II error)
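To see why checking two separate confidence intervals for overlap costs power (too much type II error), here is a minimal Python sketch. The means and standard errors are hypothetical values chosen only for illustration; they do not come from the lecture.

```python
import math

# Hypothetical values for illustration only (not from the lecture).
mean1, se1 = 10.0, 1.0
mean2, se2 = 13.0, 1.0
z_crit = 1.96  # two-sided 95% critical value

# Separate 95% confidence intervals for each mean.
ci1 = (mean1 - z_crit * se1, mean1 + z_crit * se1)
ci2 = (mean2 - z_crit * se2, mean2 + z_crit * se2)
intervals_overlap = ci1[1] >= ci2[0]

# Two-sample z statistic, using the standard error of the difference
# for independent groups: sqrt(se1^2 + se2^2).
se_diff = math.sqrt(se1 ** 2 + se2 ** 2)
z = (mean2 - mean1) / se_diff
significant = abs(z) > z_crit

print("intervals overlap:", intervals_overlap)
print("z =", round(z, 2), "| significant at .05:", significant)
```

In this made-up case the two intervals overlap, yet the test on the difference rejects at the .05 level, so the overlap rule would have missed a real difference.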

Page 6: Top schools in noida

Figuring out a test statistic for a comparison of two means

Is Ybar2 – Ybar1 an appropriate way to evaluate μ2 - μ1?

• Answer: Yes. We can appropriately define (μ2 - μ1) as a parameter of interest and estimate it in an unbiased way with (Ybar2 – Ybar1), just as we would estimate μ with Ybar.

• This line of argument may seem trivial, but it becomes important when we work with variance and standard deviations.


Page 7: Top schools in noida

Figuring out a standard error for a comparison of two means

Comparing standard errors:

A&F 213: formula without derivation

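Is $s^2_{\bar{Y}_2} - s^2_{\bar{Y}_1}$ an appropriate way to estimate $\sigma^2_{\bar{Y}_2-\bar{Y}_1}$?

No!

$$\sigma^2_{\bar{Y}_2-\bar{Y}_1} = \sigma^2_{\bar{Y}_2} - 2\,\sigma_{\bar{Y}_2,\bar{Y}_1} + \sigma^2_{\bar{Y}_1}$$

Where $\sigma_{\bar{Y}_2,\bar{Y}_1}$, the covariance of the two sample means, reflects how much the observations for the two groups are dependent.

For independent groups, $\sigma_{\bar{Y}_2,\bar{Y}_1} = 0$, so

$$\sigma^2_{\bar{Y}_2-\bar{Y}_1} = \sigma^2_{\bar{Y}_2} + \sigma^2_{\bar{Y}_1}$$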
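The rule that variances add for a difference of independent quantities can be checked exactly on a small case. The Python sketch below uses two hypothetical discrete distributions (chosen only for illustration, not from the lecture) and exact fractions. It checks the identity for two independent random variables; the same identity is what applies to two independent sample means.

```python
from fractions import Fraction
from itertools import product

def variance(dist):
    """Exact variance of a discrete distribution given as (value, probability) pairs."""
    mean = sum(value * prob for value, prob in dist)
    return sum(prob * (value - mean) ** 2 for value, prob in dist)

# Hypothetical distributions, chosen only for illustration.
x_dist = [(0, Fraction(1, 2)), (2, Fraction(1, 2))]
y_dist = [(0, Fraction(1, 3)), (3, Fraction(2, 3))]

# Independence: each joint probability is the product of the marginals.
diff_dist = [(x - y, px * py) for (x, px), (y, py) in product(x_dist, y_dist)]

var_x = variance(x_dist)
var_y = variance(y_dist)
var_diff = variance(diff_dist)

# The variance of the difference equals the SUM of the variances.
print(var_x, var_y, var_diff)
```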

Page 8: Top schools in noida

Step 1: Significance test for μ2 - μ1

The parameter of interest is μ2 - μ1

Assumptions: the data come from a random sample of some sort,

the variable of interest is measured on an interval scale,

the sample size is large enough that the sampling distribution of Ybar2 – Ybar1 is approximately normal,

and the two samples are drawn independently.


Page 9: Top schools in noida

Step 2: Significance test for μ2 - μ1

The null hypothesis will be that there is no difference between the population means. This means that any difference we observe is due to random chance.

Ho: μ2 - μ1 = 0

(We can specify an alpha level now if we want)

Q: Would it matter if we used

Ho: μ1 - μ2 = 0 ?

Ho: μ1 = μ2 ?


Page 10: Top schools in noida

Step 3: Significance test for μ2 - μ1

The test statistic has a standard form:

z = (estimate of parameter – Ho value of parameter) / (standard error of parameter)

$$z = \frac{(\bar{Y}_2 - \bar{Y}_1) - 0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

Q: If the null hypothesis is that the means are the same, why do we estimate two different standard deviations?

Page 11: Top schools in noida

Step 4: Significance test for μ2 - μ1

P-value of calculated z:

• Table A

• Stata: display 2 * (1 - normal(abs(z)))

• Stata: ttesti (no data, just parameters)

• Stata: ttest (if data file in memory)
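For readers without Stata at hand, the two-sided p-value from the `display` line can be sketched in Python with only the standard library. The function name `two_sided_p` is a hypothetical helper, not something from the lecture.

```python
import math

def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic z."""
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)), where Phi is the
    # standard normal cumulative distribution function.
    return math.erfc(abs(z) / math.sqrt(2))

print(round(two_sided_p(1.96), 4))
```

Taking the absolute value first gives the same answer for a negative z as for a positive one.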


Page 12: Top schools in noida

Step 5: Significance test for μ2 - μ1

Conclusion:

Compare the p-value from step 4 to the alpha level in step 2.

If p < α, reject H0

If p ≥ α, do not reject H0

State a conclusion about the statistical significance of the test.

Briefly discuss the substantive importance of your findings.


Page 13: Top schools in noida

Significance test for μ2 - μ1: Example

Do women spend more time on housework than men?

Data from the 1988 National Survey of Families and Households:

sex      sample size   mean hours   s.d.
men      4252          18.1         12.9
women    6764          32.6         18.2

The parameter of interest is μ2 - μ1

Page 14: Top schools in noida

Significance test for μ2 - μ1: Example

1. Assumptions: random sample, interval-scale variable, sample size large enough that the sampling distribution of Ybar2 – Ybar1 is approximately normal, independent groups

2. Hypothesis: Ho: μ2 - μ1 = 0

3. Test statistic: z = ((32.6 – 18.1) – 0) / SQRT((12.9)^2/4252 + (18.2)^2/6764) = 48.8

4. p-value: p<.001

5. conclusion:

a. reject H0: these sample differences are very unlikely to occur if men and women do the same number of hours of housework.

b. furthermore, the observed difference of 14.5 hours per week is a substantively important difference in the amount of housework.
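The arithmetic in step 3 can be reproduced with a short Python sketch. It uses only the summary statistics quoted in the housework table; the variable names are my own.

```python
import math

# Summary statistics from the housework example.
n1, mean1, sd1 = 4252, 18.1, 12.9   # men
n2, mean2, sd2 = 6764, 32.6, 18.2   # women

# Standard error of the difference for independent groups.
se_diff = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)

# z = (estimate - null value) / standard error, with a null value of 0.
z = ((mean2 - mean1) - 0) / se_diff

# Two-sided p-value from the standard normal distribution.
p_value = math.erfc(abs(z) / math.sqrt(2))

print("se =", round(se_diff, 4), "z =", round(z, 2), "p =", p_value)
```

With a z this large the p-value is far below .001, in line with step 4.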


Page 15: Top schools in noida

Confidence interval for μ2 - μ1:

$$\text{c.i.} = (\bar{Y}_2 - \bar{Y}_1) \pm z\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$$

housework example with 99% interval:

c.i. = (32.6 – 18.1) +/- 2.58*(SQRT((12.9)^2/4252 + (18.2)^2/6764))

= 14.5 +/- 2.58*.30

= 14.5 +/- .8, or (13.7,15.3)

By this analysis, the 99% confidence interval for the difference in housework is 13.7 to 15.3 hours.
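The 99% interval for the housework example can be checked with a minimal Python sketch, again using only the summary statistics from the slides. The critical value 2.58 is the one the slide uses.

```python
import math

# Summary statistics from the housework example.
n1, mean1, sd1 = 4252, 18.1, 12.9   # men
n2, mean2, sd2 = 6764, 32.6, 18.2   # women
z_crit = 2.58                       # critical value for a 99% interval

diff = mean2 - mean1
se_diff = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
margin = z_crit * se_diff

lower, upper = diff - margin, diff + margin
print("99% c.i.:", round(lower, 1), "to", round(upper, 1))
```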

Page 16: Top schools in noida

Stata: Large sample significance test for μ2 - μ1

Immediate (no data, just parameters):

ttesti 4252 18.1 12.9 6764 32.6 18.2, unequal

• Q: why ttesti with large samples?

For the immediate command, you need the following:

sample size for group 1 (n = 4252)

mean for group 1

standard deviation for group 1

sample size for group 2

mean for group 2

standard deviation for group 2

instructions to not assume equal variance (, unequal)

Page 17: Top schools in noida

Stata: Large sample significance test for μ2 - μ1, an example

. ttesti 4252 18.1 12.9 6764 32.6 18.2, unequal

Two-sample t test with unequal variances

------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |    4252        18.1    .1978304        12.9    17.71215    18.48785
       y |    6764        32.6     .221294        18.2    32.16619    33.03381
---------+--------------------------------------------------------------------
combined |   11016    27.00323    .1697512     17.8166    26.67049    27.33597
---------+--------------------------------------------------------------------
    diff |               -14.5    .2968297               -15.08184   -13.91816
------------------------------------------------------------------------------
Satterthwaite's degrees of freedom:  10858.6

                Ho: mean(x) - mean(y) = diff = 0

     Ha: diff < 0               Ha: diff != 0              Ha: diff > 0
       t = -48.8496               t = -48.8496               t = -48.8496
   P < t =   0.0000          P > |t| =   0.0000          P > t =   1.0000
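The "Satterthwaite's degrees of freedom" line in the output can be reproduced from the summary statistics. The sketch below assumes the usual Satterthwaite approximation for unequal variances; the variable names are my own.

```python
# Summary statistics from the housework example.
n1, sd1 = 4252, 12.9   # men
n2, sd2 = 6764, 18.2   # women

# Estimated variance of each sample mean.
v1 = sd1 ** 2 / n1
v2 = sd2 ** 2 / n2

# Satterthwaite approximation to the degrees of freedom.
df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
print("Satterthwaite df:", round(df, 1))
```

With more than ten thousand degrees of freedom, the t distribution is practically indistinguishable from the standard normal, which is one way to see why a t command gives essentially the z-test answer for large samples.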

Page 18: Top schools in noida

Large sample significance test for μ2 - μ1: command for a data set (#1)

. ttest YEARSJOB, by(nonstandard) unequal

Two-sample t test with unequal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       0 |     980    9.430612    .2788544    8.729523    8.883391    9.977833
       1 |     379    7.907652    .3880947    7.555398    7.144557    8.670747
---------+--------------------------------------------------------------------
combined |    1359    9.005887    .2290413    8.443521    8.556573      9.4552
---------+--------------------------------------------------------------------
    diff |            1.522961    .4778884                .5848756    2.461045
------------------------------------------------------------------------------
    diff = mean(0) - mean(1)                                      t =   3.1869
Ho: diff = 0                     Satterthwaite's degrees of freedom =  787.963

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.9993         Pr(|T| > |t|) = 0.0015          Pr(T > t) = 0.0007

Page 19: Top schools in noida

Large sample significance test for μ2 - μ1: command for a data set (#2)

. ttest conrinc if wrkstat==1, by(wrkslf) unequal

Two-sample t test with unequal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
self-emp |     190    48514.62    2406.263    33168.05    43768.03     53261.2
 someone |    1263    34417.11    636.9954       22638    33167.43     35666.8
---------+--------------------------------------------------------------------
combined |    1453    36260.56    648.5844     24722.9     34988.3    37532.82
---------+--------------------------------------------------------------------
    diff |             14097.5     2489.15                9191.402     19003.6
------------------------------------------------------------------------------
    diff = mean(self-emp) - mean(someone)                         t =   5.6636
Ho: diff = 0                     Satterthwaite's degrees of freedom =  216.259

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 1.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 0.0000

Page 20: Top schools in noida

7.2: Comparisons of two independent population proportions

In 1982 and 1994, respondents in the General Social Survey were asked: “Do you agree or disagree with this statement? ‘Women should take care of running their homes and leave running the country up to men.’”

Year Agree Disagree Total

1982 122 223 345

1994 268 1632 1900

Total 390 1855 2245

Do a formal test to decide whether opinions differed in the two years.


Page 21: Top schools in noida

Step 1: Significance test for π2 - π1

The parameter of interest is π2 - π1

Assumptions: the data come from a random sample of some sort,

the variable of interest is categorical, so each group is summarized by a proportion,

the sample size is large enough that the sampling distribution of Pihat2 – Pihat1 is approximately normal.

The two samples are drawn independently


Page 22: Top schools in noida

Step 2: Significance test for π2 - π1

The null hypothesis will be that there is no difference between the population proportions. This means that any difference we observe is due to random chance.

Ho: π2 - π1 = 0

(State an alpha here if you want to.)


Page 23: Top schools in noida

Step 3: Significance test for π2 - π1

The test statistic has a standard form:

z = (estimate of parameter – Ho value of parameter) / (standard error of parameter)

$$z = \frac{\hat{\pi}_2 - \hat{\pi}_1}{\sqrt{\hat{\pi}(1-\hat{\pi})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

Where pihat ($\hat{\pi}$) is the overall weighted average, that is, the pooled proportion from both samples combined.

This means we are assuming equal variance in the two populations.

Q: why do we use an assumption of equal variance to estimate the standard error for the significance test?

Page 24: Top schools in noida

Step 4: Significance test for π2 - π1

P-value of calculated z:

• Table A, or

• Stata: display 2 * (1 - normal(abs(z))), or

• Stata: prtesti (no data, just parameters)

• Stata: prtest (if data file in memory)


Page 25: Top schools in noida

Step 5: Significance test for π2 - π1

Conclusion:

Compare the p-value from step 4 to the alpha level in step 2.

If p < α, reject H0

If p ≥ α, do not reject H0

State a conclusion about the statistical significance of the test.

Briefly discuss the substantive importance of your findings.


Page 26: Top schools in noida

Significance test for π2 - π1: Example

1. Assumptions: random sample, categorical variable, sample size large enough that the sampling distribution of Pihat2 – Pihat1 is approximately normal, independent groups

2. Hypothesis: Ho: π2 - π1= 0

3. Test statistic:

z = (122/345 – 268/1900) /

SQRT[(390/2245)*(1 - 390/2245)*(1/345 + 1/1900)]

= 9.59

4. p-value: p<<.001

5. conclusion:

a. reject H0: attitudes were clearly different in 1994 than in 1982.

b. furthermore, the observed difference of .21 is a substantively important change in attitudes.
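The test statistic in step 3 can be reproduced from the counts in the GSS table. This Python sketch follows the slide's arithmetic, taking the 1982 proportion minus the 1994 proportion and using the pooled proportion in the standard error; the variable names are my own.

```python
import math

# Counts from the GSS table.
agree_1982, n_1982 = 122, 345
agree_1994, n_1994 = 268, 1900

p_1982 = agree_1982 / n_1982
p_1994 = agree_1994 / n_1994

# Pooled (overall weighted average) proportion, used under the null hypothesis.
p_pooled = (agree_1982 + agree_1994) / (n_1982 + n_1994)
se_null = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n_1982 + 1 / n_1994))

z = (p_1982 - p_1994) / se_null
p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided

print("difference =", round(p_1982 - p_1994, 2), "z =", round(z, 2), "p =", p_value)
```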

Page 27: Top schools in noida

Comparisons of two independent population proportions: Confidence Interval

confidence interval:

$$\text{c.i.} = (\hat{\pi}_2 - \hat{\pi}_1) \pm z\sqrt{\dfrac{\hat{\pi}_1(1-\hat{\pi}_1)}{n_1} + \dfrac{\hat{\pi}_2(1-\hat{\pi}_2)}{n_2}}$$

Notice that there is no overall weighted average Pihat, as there is in a significance test for proportions.

Instead, we estimate two separate variances from the separate proportions. Why?
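As a rough check on the formula, here is a Python sketch of a 95% confidence interval for the GSS example. It uses the unpooled standard error (two separate variances); the 1.96 critical value and the variable names are my own choices, and the difference is taken as 1982 minus 1994.

```python
import math

# Counts from the GSS table.
agree_1982, n_1982 = 122, 345
agree_1994, n_1994 = 268, 1900
z_crit = 1.96  # critical value for a 95% interval

p_1982 = agree_1982 / n_1982
p_1994 = agree_1994 / n_1994

# Unpooled standard error: each group contributes its own variance.
se_diff = math.sqrt(p_1982 * (1 - p_1982) / n_1982
                    + p_1994 * (1 - p_1994) / n_1994)

diff = p_1982 - p_1994
lower, upper = diff - z_crit * se_diff, diff + z_crit * se_diff
print("95% c.i.:", round(lower, 3), "to", round(upper, 3))
```

This standard error (about .027) differs from the pooled one used in the significance test (about .022), because the interval does not assume the two population proportions are equal.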

Page 28: Top schools in noida

STATA: Significance test for π2 - π1: immediate command

. prtesti 345 .3536 1900 .1411

STATA needs the following information:

sample size for group 1 (n = 345)

proportion for group 1 (p = 122/345)

sample size for group 2 (n = 1900)

proportion for group 2 (p = 268/1900)

Page 29: Top schools in noida

STATA: Significance test for π2 - π1: immediate command

. prtesti 345 .3536 1900 .1411

Two-sample test of proportion                      x: Number of obs =      345
                                                   y: Number of obs =     1900
------------------------------------------------------------------------------
    Variable |       Mean   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |      .3536   .0257393                      .3031518    .4040482
           y |      .1411   .0079865                      .1254467    .1567533
-------------+----------------------------------------------------------------
        diff |      .2125   .0269499                      .1596791    .2653209
             |  under Ho:   .0221741     9.58   0.000
------------------------------------------------------------------------------

                Ho: proportion(x) - proportion(y) = diff = 0

     Ha: diff < 0               Ha: diff != 0              Ha: diff > 0
       z =  9.583                 z =  9.583                 z =  9.583
   P < z =  1.0000            P > |z| =  0.0000            P > z =  0.0000

Note the use of one standard error (unequal variance) for the confidence interval, and another (equal variance) for the significance test.

Page 30: Top schools in noida

STATA command for a data set (#1)

. prtest nonstandard if (RACECEN1==1 | RACECEN1==2), by(RACECEN1)

Two-sample test of proportion 1: Number of obs = 1389

2: Number of obs = 260

------------------------------------------------------------------------------

Variable | Mean Std. Err. z P>|z| [95% Conf. Interval]

-------------+----------------------------------------------------------------

1 | .2800576 .0120482 .2564436 .3036716

2 | .3538462 .0296544 .2957247 .4119676

-------------+----------------------------------------------------------------

diff | -.0737886 .0320084 -.1365239 -.0110532

| under Ho: .0307147 -2.40 0.016

------------------------------------------------------------------------------

diff = prop(1) - prop(2) z = -2.4024

Ho: diff = 0

Ha: diff < 0 Ha: diff != 0 Ha: diff > 0

Pr(Z < z) = 0.0081 Pr(|Z| < |z|) = 0.0163 Pr(Z > z) = 0.9919


Page 31: Top schools in noida

STATA command for a data set (#2)

. gen byte wrkslf0=wrkslf-1

(152 missing values generated)

. prtest wrkslf0 if wrkstat==1, by(sex)

Two-sample test of proportion male: Number of obs = 874

female: Number of obs = 743

------------------------------------------------------------------------------

Variable | Mean Std. Err. z P>|z| [95% Conf. Interval]

-------------+----------------------------------------------------------------

male | .8272311 .0127876 .8021678 .8522944

female | .9044415 .0107853 .8833027 .9255802

-------------+----------------------------------------------------------------

diff | -.0772103 .0167286 -.1099978 -.0444229

| under Ho: .0171735 -4.50 0.000

------------------------------------------------------------------------------

diff = prop(male) - prop(female) z = -4.4959

Ho: diff = 0

Ha: diff < 0 Ha: diff != 0 Ha: diff > 0

Pr(Z < z) = 0.0000 Pr(|Z| < |z|) = 0.0000 Pr(Z > z) = 1.0000
