Professor William Greene Stern School of Business IOMS Department Department of Economics Statistical Inference and Regression Analysis: Stat-GB.3302.30, Stat- UB.0015.01

Upload: nestor-baney

Post on 01-Apr-2015

TRANSCRIPT

  • Slide 1

Professor William Greene Stern School of Business IOMS Department Department of Economics Statistical Inference and Regression Analysis: Stat-GB.3302.30, Stat-UB.0015.01

  • Slide 2

Part 5 Hypothesis Testing

  • Slide 3

Objectives of Statistical Analysis

Estimation
    How long do hard drives last?
    What is the median income among the 99%ers?
Inference: hypothesis testing
    Did minorities pay higher mortgage rates during the housing boom?
    Is there a link between environmental factors and breast cancer on eastern Long Island?

  • Slide 4

General Frameworks

Parametric Tests: features of specific distributions, such as the mean of a Bernoulli or normal distribution.
Specification Tests (Semiparametric)
    Do the data arrive from a Poisson process?
    Are the data normally distributed?
Nonparametric Tests: Are two discrete processes independent?

  • Slide 5

Hypotheses

Hypotheses - labels
    State 0 of Nature: Null Hypothesis
    State 1: Alternative Hypothesis
Exclusive: Prob(H0 ∩ H1) = 0
Exhaustive: Prob(H0) + Prob(H1) = 1
Symmetric: Neither is intrinsically preferred; the objective of the study is only to support one or the other. (Rare?)

  • Slide 6

Testing Strategy

  • Slide 7

Posterior (to the Evidence) Odds

  • Slide 8

Does the New Drug Work?

Hypotheses: H0: θ = .50, H1: θ = .75
Priors: P0 = .40, P1 = .60
Clinical Trial: N = 50, 31 patients respond, p = .62
Likelihoods:
    L0(31 | θ = .50) = Binomial(50, 31, .50) = .0270059
    L1(31 | θ = .75) = Binomial(50, 31, .75) = .0148156
Posterior odds in favor of H0 = (.4/.6)(.0270059/.0148156) = 1.2152 > 1
Priors favored H1 1.5 to 1, but the posterior odds favor H0, 1.2152 to 1. The evidence discredits H1 even though the data seem more consistent with prior P1.
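
The posterior odds arithmetic on Slide 8 can be checked with a short Python sketch. The function names binomial_pmf and posterior_odds are illustrative, not from the slides; the priors, sample size, and hypothesized response rates are the ones stated on the slide.

```python
from math import comb


def binomial_pmf(n: int, k: int, theta: float) -> float:
    """Probability of exactly k successes in n Bernoulli(theta) trials."""
    return comb(n, k) * theta**k * (1 - theta) ** (n - k)


def posterior_odds(prior0: float, prior1: float, n: int, k: int,
                   theta0: float, theta1: float) -> float:
    """Posterior odds in favor of H0 = prior odds times likelihood ratio."""
    likelihood0 = binomial_pmf(n, k, theta0)
    likelihood1 = binomial_pmf(n, k, theta1)
    return (prior0 / prior1) * (likelihood0 / likelihood1)


# Values from Slide 8: priors .40 and .60, 31 responders out of 50,
# H0: theta = .50 versus H1: theta = .75.
odds = posterior_odds(0.40, 0.60, 50, 31, 0.50, 0.75)
print(f"L0 = {binomial_pmf(50, 31, 0.50):.7f}")
print(f"L1 = {binomial_pmf(50, 31, 0.75):.7f}")
print(f"Posterior odds in favor of H0 = {odds:.4f}")
```

The prior odds (.4/.6) favor H1, but the likelihood ratio is large enough to move the posterior odds above 1, which is the point the slide makes.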

  • Slide 9

Decision Strategy

Prefer the hypothesis with the higher posterior odds.
A gap in the theory: How does the investigator do the cost benefit test?
    Starting a new business venture or entering a new market: Priors and market research
    FDA approving a new drug or medical device: Priors and clinical trials
Statistical Decision Theory adds the costs and benefits of decisions and errors.

  • Slide 10

An Alternative Strategy

Recognize the asymmetry of null and alternative hypotheses.
Eliminate the prior odds (which are rarely formed or available).

  • Slide 11

http://query.nytimes.com/gst/fullpage.html?res=9C00E4DF113BF935A3575BC0A9649C8B63

  • Slide 12

Classical Hypothesis Testing

The scientific method applied to statistical hypothesis testing
Hypothesis: The world works according to my hypothesis
Testing or supporting the hypothesis
    Data gathering
    Rejection of the hypothesis if the data are inconsistent with it
    Retention and exposure to further investigation if the data are consistent with the hypothesis
Failure to reject is not equivalent to acceptance.

  • Slide 13

Asymmetric Hypotheses

Null Hypothesis: The proposed state of nature
Alternative hypothesis: The state of nature that is believed to prevail if the null is rejected.

  • Slide 14

Hypothesis Testing Strategy

Formulate the null hypothesis
Gather the evidence
Question: If my null hypothesis were true, how likely is it that I would have observed this evidence?
    Very unlikely: Reject the hypothesis
    Not unlikely: Do not reject. (Retain the hypothesis for continued scrutiny.)
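
The question on Slide 14 ("if my null hypothesis were true, how likely is it that I would have observed this evidence?") can be made concrete with a small sketch. This is an illustration only: it reuses the drug trial counts from Slide 8 (31 responders out of 50, null θ = .50), and it assumes a one-sided alternative and a 5% threshold for "very unlikely". Neither assumption is stated on these slides, and the function name is hypothetical.

```python
from math import comb


def upper_tail_probability(n: int, k: int, theta0: float) -> float:
    """P(X >= k) when X ~ Binomial(n, theta0): the chance of evidence at
    least this favorable to a larger theta if the null were true."""
    return sum(
        comb(n, j) * theta0**j * (1 - theta0) ** (n - j)
        for j in range(k, n + 1)
    )


# Evidence from Slide 8; null hypothesis theta = .50.
p_value = upper_tail_probability(50, 31, 0.50)

# ASSUMPTION: "very unlikely" means a probability below .05.
alpha = 0.05
decision = "reject" if p_value < alpha else "do not reject"
print(f"P(X >= 31 | theta = .50) = {p_value:.4f} -> {decision} the null")
```

Under these assumptions the null is retained for continued scrutiny rather than accepted, matching the wording on the slide.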

  • Slide 15

Some Terms of Art

Type I error: Incorrectly rejecting a true null
Type II error: Failure to reject a false null
Power of a test: Probability a test will correctly reject a false null
Alpha level: Probability that a test will incorrectly reject a true null. This is sometimes called the size of the test.
Significance Level: Probability that a test will retain a true null = 1 - alpha.
Rejection Region: Evidence that will lead to rejection of the null
Test statistic: Specific sample evidence used to test the hypothesis
Distribution of the test statistic under the null hypothesis: Probability model used to compute probability of rejecting the null. (Crucial to the testing strategy: how does the analyst assess the evidence?)

  • Slide 16

Possible Errors in Testing

                                  Hypothesis is True    Hypothesis is False
I Do Not Reject the Hypothesis    Correct Decision      Type II Error
I Reject the Hypothesis           Type I Error          Correct Decision

  • Slide 17

A Legal Analogy: The Null Hypothesis is INNOCENT

                               Null Hypothesis: Not Guilty                      Alternative Hypothesis: Guilty
Finding: Verdict Not Guilty    Correct Decision                                 Type II Error: Guilty defendant goes free
Finding: Verdict Guilty        Type I Error: Innocent defendant is convicted    Correct Decision

The errors are not symmetric. Most thinkers consider Type I errors to be more serious than Type II in this setting.

  • Slide 18

(Jerzy) Neyman - (Egon) Pearson Methodology

Statistical testing methodology:
    Formulate the null hypothesis
    Decide (in advance) what kinds of evidence (data) will lead to rejection of the null hypothesis, i.e., define the rejection region
    Gather the data
    Mechanically carry out the test.

  • Slide 19

Formulating the Null Hypothesis

Stating the hypothesis: A belief about the state of nature
    A parameter takes a particular value
    There is a relationship between variables
    And so on
The null vs. the alternative
    By induction: If we wish to find evidence of something, first assume it is not true.
    Look for evidence that leads to rejection of the assumed hypothesis.
    Evidence that rejects the null hypothesis is significant.

  • Slide 20

Example: Credit Scoring Rule

Investigation: I believe that Fair Isaacs relies on home ownership in deciding whether to accept an application.
Null hypothesis: There is no relationship.
Alternative hypothesis: They do use homeownership data.
What decision rule should I use?

  • Slide 21

Some Evidence

[Figure; legend: Homeowners. Values shown: 5469, 5030, 1845, 1100]

  • Slide 22

Hypothesis Test

Acceptance rate for homeowners = 5030/(5030+1100) = .82055
Acceptance rate for renters is .74774
H0: Acceptance rate for renters is not less than for owners.
H0: p(renters) ≥ .82055
H1: p(renters) < .82055

[...]

P(X > 4) = .0021

  • Slide 74

Interpreting The Process

λ = 0.852
Probabilities: P(X=0) = .4266, P(X=1) = .3634, P(X=2) = .1548, P(X=3) = .0437, P(X=4) = .0094, P(X>4) = .0021
There are 169 squares
There are 144 trials
Expect .4266*169 = 72.1 to have 0 hits/square
Expect .3634*169 = 61.4 to have 1 hit/square
Etc.
Expect the average number of hits/square to = .852.

  • Slide 75

Does the Theory Work?

            Theoretical Outcomes            Sample Outcomes
Outcome     Probability   Number of Cells   Sample Proportion   Number of Cells
0              .4266            72               .4733                80
1              .3634            61               .2899                49
2              .1548            26               .1539                26
3              .0437             7               .0769                13
4              .0094             2               .0059                 1
> 4            .0021             1               .0000                 0

Theoretical number of cells = 169*Prob(Outcome). Sample number of cells = observed frequencies.

  • Slide 76

Chi Squared for the Bombing Run

  • Slide 77

Difference in Means of Two Populations

Two Independent Normal Populations
Common known variance
Common unknown variance
Different Variances
One and two sided tests
Paired Samples
Means of paired observations
Treatments and Controls
Diff-in-Diff
SAT
Nonparametric Mann/Whitney
Two Bernoulli Populations

  • Slide 78

Comparing Two Normal Populations

  • Slide 79

Unknown Common Variance

  • Slide 80

Household Incomes, Equal Variances

------------------------------------------------------
t test of equal means    INCOME by MARRIED
------------------------------------------------------
MARRIED = 0    Nx = 817
MARRIED = 1    Ny = 3057
t [ 3872] = 3.7238
P value = .0002
------------------------------------------------------
               Mean      Std.Dev.  Std.Error
INCOME ----------------------------------------------
MARRIED = 0    .27982    .12939    .00453
MARRIED = 1    .30145    .15194    .00275
------------------------------------------------------

  • Slide 81

Unknown Different Variances

  • Slide 82

2 Proportions

Two Bernoulli Populations:
    X_i ~ Bernoulli with Prob(x_i = 1) = θ_x
    Y_i ~ Bernoulli with Prob(y_i = 1) = θ_y
H0: θ_x = θ_y
The sample proportions are p_x = (1/N_x) Σ_i x_i and p_y = (1/N_y) Σ_i y_i.
Sample variances are p_x(1 - p_x) and p_y(1 - p_y).
Use the Central Limit Theorem to form the test statistic.

  • Slide 83

z Test for Equality of Proportions

Application: Take up of public health insurance.

------------------------------------------------------
t test of equal means    PUBLIC by FEMALE
------------------------------------------------------
FEMALE = 0    Nx = 1812
FEMALE = 1    Ny = 1565
t [ 3375] = 5.8627
P value = .0000
------------------------------------------------------
              Mean      Std.Dev.  Std.Error
PUBLIC ----------------------------------------------
FEMALE = 0    .84713    .35996    .00846
FEMALE = 1    .91310    .28178    .00712

  • Slide 84

Paired Sample t and z Test

Observations are pairs (X_i, Y_i), i = 1,...,N
Hypothesis: μ_x = μ_y. Both normal distributions. May be correlated.
    Medical Trials: Smoking vs. Nonsmoking (separate individuals, probably independent)
    SAT repeat tests, before and after. (Definitely correlated)
Test is based on D_i = X_i - Y_i. Same as earlier with H0: μ_D = 0.

  • Slide 85

Treatment Effects

SAT Do Overs
    Experiment: X_1, X_2, ..., X_N = first SAT score, Y_1, Y_2, ..., Y_N = second
    Treatment: T_1,...,T_N = whether or not the student took a Kaplan (or similar) prep course
    Hypothesis: μ_y > μ_x.
Placebo: In medical trials, N1 subjects receive a drug (treatment), N2 receive a placebo.
    Hypothesis: Effect is greater in the treatment group than in the control (placebo) group.

  • Slide 86

Measuring Treatment Effects

  • Slide 87

Treatment Effects in Clinical Trials

Does Phenogyrabluthefentanoel (Zorgrab) work?
Investigate: Carry out a clinical trial.
N+0 = The placebo effect
N+T - N+0 = The treatment effect
The hypothesis is that the difference in differences has mean zero.

                   Placebo    Drug Treatment
No Effect          N00        N0T
Positive Effect    N+0        N+T

  • Slide 88

A Test of Independence

In the credit card example, are Own/Rent and Accept/Reject independent?

Hypothesis: Prob(Ownership) and Prob(Acceptance) are independent.
Formal hypothesis, based only on the laws of probability: Prob(Own,Accept) = Prob(Own)Prob(Accept) (and likewise for the other three possibilities).
Rejection region: Joint frequencies that do not look like the products of the marginal frequencies.

  • Slide 89

Contingency Table Analysis

The Data: Frequencies

          Reject    Accept     Total
Rent       1,845     5,469     7,314
Own        1,100     5,030     6,130
Total      2,945    10,499    13,444

Step 1: Convert to Actual Proportions

          Reject    Accept     Total
Rent     0.13724   0.40680   0.54404
Own      0.08182   0.37414   0.45596
Total    0.21906   0.78094   1.00000

  • Slide 90

Independence Test

Step 2: Expected proportions assuming independence: If the factors are independent, then the joint proportions should equal the product of the marginal proportions.

[Rent,Reject]   0.54404 x 0.21906 = 0.11918
[Rent,Accept]   0.54404 x 0.78094 = 0.42486
[Own,Reject]    0.45596 x 0.21906 = 0.09988
[Own,Accept]    0.45596 x 0.78094 = 0.35606

  • Slide 91

Comparing Actual to Expected

  • Slide 92

When is the Chi Squared Large?

Critical values from chi squared table
Degrees of freedom = (R-1)(C-1).

Critical chi squared
D.F.      .05      .01
  1      3.84     6.63
  2      5.99     9.21
  3      7.81    11.34
  4      9.49    13.28
  5     11.07    15.09
  6     12.59    16.81
  7     14.07    18.48
  8     15.51    20.09
  9     16.92    21.67
 10     18.31    23.21

  • Slide 93

Analyzing Default

Do renters default more often (at a different rate) than owners? To investigate, we study the cardholders (only).

             DEFAULT
OWNRENT        0        1      All
0           4854      615     5469
           46.23     5.86    52.09
1           4649      381     5030
           44.28     3.63    47.91
All         9503      996    10499
           90.51     9.49   100.00

  • Slide 94

Hypothesis Test

  • Slide 95

Multiple Choices: Travel Mode

210 Travelers between Sydney and Melbourne
4 available modes: air, train, bus, car
Among the observed variables is income.

Does income help to explain mode choice?
Hypothesis: Mode choice and income are independent.

  • Slide 96

Travel Mode Choices

  • Slide 97

Travel Mode Choices and Income

+-------------------------------------------------------+
|                   Travel MODE Data                    |
+--------+------------------------------------+---------+
| INCOME |     AIR    TRAIN      BUS      CAR |   Total |
+--------+------------------------------------+---------+
| LOW    |      10       36        9        8 |      63 |
|        | 0.04761  0.17143  0.04286  0.03810 | 0.30000 |
+--------+------------------------------------+---------+
| MEDIUM |      19       20       13       24 |      76 |
|        | 0.09048  0.09524  0.06190  0.11429 | 0.36190 |
+--------+------------------------------------+---------+
| HIGH   |      29        7        8       27 |      71 |
|        | 0.13810  0.03333  0.03810  0.12857 | 0.33810 |
+========+====================================+=========+
| Total  |      58       63       30       59 |     210 |
|        | 0.27619  0.30000  0.14286  0.28095 | 1.00000 |
+--------+------------------------------------+---------+

  • Slide 98

Contingency Table

Each cell shows the count, the actual proportion, and the expected proportion under independence.

+-------------------------------------------------------+
|                   Travel MODE Data                    |
+--------+------------------------------------+---------+
| INCOME |     AIR    TRAIN      BUS      CAR |   Total |
+--------+------------------------------------+---------+
| LOW    |      10       36        9        8 |      63 |
|        | 0.04761  0.17143  0.04286  0.03810 | 0.30000 |
|        | 0.08286  0.09000  0.04286  0.08429 |         |
+--------+------------------------------------+---------+
| MEDIUM |      19       20       13       24 |      76 |
|        | 0.09048  0.09524  0.06190  0.11429 | 0.36190 |
|        | 0.09995  0.10857  0.05170  0.10168 |         |
+--------+------------------------------------+---------+
| HIGH   |      29        7        8       27 |      71 |
|        | 0.13810  0.03333  0.03810  0.12857 | 0.33810 |
|        | 0.09338  0.10143  0.04830  0.09499 |         |
+========+====================================+=========+
| Total  |      58       63       30       59 |     210 |
|        | 0.27619  0.30000  0.14286  0.28095 | 1.00000 |
+--------+------------------------------------+---------+

Assuming independence, P(Income,Mode) = P(Income) x P(Mode).

  • Slide 99

Computing Chi Squared

For our transport mode problem, R = 3, C = 4, so DF = (R-1)(C-1) = 2 x 3 = 6. The critical value at the .05 level is 12.59. The hypothesis of independence is rejected.
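
The rejection on Slide 99 can be reproduced from the counts in the travel mode table. The sketch below assumes the usual Pearson statistic, the sum over cells of (observed - expected)^2 / expected with expected counts formed from the row and column totals; the formula slides themselves did not survive in this transcript, and the function name chi_squared_independence is illustrative.

```python
# Observed counts from the Travel MODE Data table.
# Rows: LOW, MEDIUM, HIGH income. Columns: AIR, TRAIN, BUS, CAR.
observed = [
    [10, 36, 9, 8],
    [19, 20, 13, 24],
    [29, 7, 8, 27],
]


def chi_squared_independence(table):
    """Return (Pearson chi-squared statistic, degrees of freedom)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            # Expected count under independence: N * P(row) * P(column).
            expected = row_totals[i] * col_totals[j] / n
            statistic += (count - expected) ** 2 / expected
    degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, degrees_of_freedom


chi_squared, df = chi_squared_independence(observed)
critical_value = 12.59  # .05 critical value for 6 D.F., from Slide 92
print(f"chi-squared = {chi_squared:.2f} with {df} degrees of freedom")
print("reject independence" if chi_squared > critical_value
      else "do not reject independence")
```

The statistic comes out well above 12.59, consistent with the conclusion on the slide that independence of income and mode choice is rejected.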