Planning and Analysis of Observational Studies

WILLIAM G. COCHRAN

Edited by

LINCOLN E. MOSES and

FREDERICK MOSTELLER

John Wiley & Sons New York Chichester Brisbane Toronto Singapore

Copyright © 1983 by John Wiley & Sons, Inc.

All rights reserved. Published simultaneously in Canada.

Reproduction or translation of any part of this work beyond that permitted by Section 107 or 108 of the 1976 United States Copyright Act without the permission of the copyright owner is unlawful. Requests for permission or further information should be addressed to the Permissions Department, John Wiley & Sons, Inc.

Library of Congress Cataloging in Publication Data:

Cochran, William Gemmell, 1909-1980.
Planning and analysis of observational studies.

(Wiley series in probability and mathematical statistics, ISSN 0271-6356. Applied probability and statistics)

Includes bibliographies and index.
1. Experimental design. 2. Analysis of variance. I. Moses, Lincoln E., 1921- . II. Mosteller, Frederick, 1916- . III. Title. IV. Series: Wiley series in probability and mathematical statistics. Applied probability and statistics section.

QA279.C63 1983 001.4'2 83-6461 ISBN 0-471-88719-6

Printed in the United States of America

10 9 8 7 6 5 4 3 2 1

Preface

In his notes on a Monograph on Non-Experimental Studies, W. G. Cochran wrote, before he had proceeded very far with the outline, “My plan has been for a short book (e.g., 150 pages) addressed not to statisticians but to subject-matter people who do or may do these studies. Because my experience lies there, the selection of topics and examples will be a bit oriented toward field studies in health, but I’ll try to avoid too much of this. . . . I’d like to keep it as simple as seems feasible. It will be more of a reference-type book than a text, but might be the basis of a seminar in a subject-matter department.”

He spoke also of the difficulty of choosing an order for chapters in such a work. He thought the reader would have problems because, to prepare to read any chapter, some other chapters ought to be read. He felt this was symptomatic of the topic. The level of difficulty would vary a great deal from part to part, and the reader would find it necessary to adjust to this. He also found the problem that every author of a monograph has encountered: more research is needed in many spots “before I’ll have something worth saying. I think, however, that if I pay too much attention to this point, it will never be written.” Ultimately, he wrote six and a half of his intended seven chapters, and we present them here.

Cochran wrote his book on observational studies by assembling about a chapter a year, usually writing in the summer. He taught a course on the subject at Harvard using his notes. As the manuscript neared completion, his health suffered a sequence of blows, each of which required substantial time for recovery. His other extensive writings, both new research articles and revisions of Snedecor and Cochran’s Statistical Methods and of his own Sampling Techniques, as well as his teaching, continued at a good pace in spite of these medical setbacks. Nevertheless, he did not get back to the book before his death, although he sometimes spoke of the possibility of getting someone to help complete it.

After his death, Cochran’s wife, Betty, edited his collected papers. Mosteller consulted with her on the possibility that the manuscript on observational studies might be publishable. She searched through Cochran’s papers and identified the manuscript. Nevertheless, the revision of the papers did not progress until Moses agreed to help with the editing. Since then it has been a joyous enterprise for both of us.

Several considerations encouraged us to undertake the work of editing the manuscript for publication. First, the planning and analysis of observational studies is an important area of statistical methodology. Second, Cochran made many strong contributions to this topic over his career, and he wrote from broad experience and with theoretical insight; surely his book should be published. Third, the manuscript itself was attractive, especially for its characteristic of considering failure of assumptions. Again and again Cochran gives attention to the behavior of a statistical procedure when one or more of the assumptions underlying its mathematical justification is false in some degree. Thus, regressions may not be linear; they may not be parallel; matching may be inexact; variances may be inhomogeneous; and so forth. The manuscript contained, as do many others of his publications, tables indicating the quantitative results of assumptions that failed by various amounts. These analyses both increase understanding of the statistician’s tools and facilitate practical planning of studies.

The editors of a posthumous work have an obligation to explain the nature and extent of their interventions. We agreed that we would leave the material as originally written, except when there was clear need to make modifications. Once this decision was reached, we had a fairly straightforward path. Cochran did not write and repeatedly revise as some authors do, and so we felt justified in treating the manuscript as finished prose for the most part.

Cochran had told Mosteller repeatedly that the book was complete except for one chapter, but that the order of the chapters was still a puzzle. We had two candidates for the opening chapter: the slightly technical chapter that we have put first and the more chatty, unfinished Chapter 7. Some readers may wish to read Chapter 7 first. We chose to put it last because opening the book with a fragmentary chapter might give the reader the mistaken impression that much of the work was unfinished. Nevertheless, a completed Chapter 7 probably would have made a beautiful opening.

One topic was treated twice, both in Chapter 1 and Chapter 6. We removed most of this topic from Chapter 1 and placed some of it as the Appendix to Section 6.12. This move had the advantage of substantially lowering the technical level of Chapter 1.

We found occasional errors in formulas and corrected them. We also struggled to unify the notation, whose variety may have stemmed from chapters being written at disparate times and places. We are sure that Cochran would have corrected this in his own final review of the book.

We have preserved the economy of the technical writing; some brevity was achieved by relying on the reader to provide parallelism, by using implicit definitions, by saying in words what might require several subscripts, or by adopting occasional tricky notation that the reader will need to detect. Using these devices, Cochran avoids many ugly equations and mathematical expressions. We have decided that terms should usually be defined, and so we have added some definitions of expressions where the reader would otherwise have to guess at their meanings.

The references needed attention: their state was mixed from chapter to chapter. They were sometimes complete, or nearly complete, and sometimes were only indicated in the text by author or by author and a dubious date. We have not added new references, except possibly where we may not have understood the source intended. If we have not found the appropriate ones, we apologize and will appreciate having misjudgments brought to our attention. We have included with each reference to Cochran’s own work the number (in square brackets) which refers to the presentation of the referenced work in William G. Cochran, Contributions to Statistics, John Wiley & Sons, New York, New York, 1982. For some readers, this will be a more convenient source than the original publication.

Although we had the original outline for Chapter 7, Cochran did not keep to it and so we cannot conjecture what the rest of the chapter would have been like. Because Cochran expresses many personal views based on vast experience, it seemed wise to stop with his words.

LINCOLN E. MOSES, Stanford, California

FREDERICK MOSTELLER, Boston, Massachusetts

May 1983

Acknowledgments

In preparing the manuscript, we have been aided by Nina Leech, Marjorie Olson, and Beverly Weintraub, who have corrected and typed repeated versions with great care. Cleo Youtz helped us enormously with the references, as did Elaine Ung. John Emerson, Katherine Godfrey, Katherine Taylor Halvorsen, David Hoaglin, Marjorie Olson, and Cleo Youtz also advised us about other matters. Donald B. Rubin has kindly given permission for the reproduction of parts of tables from some of his work on matching and adjustment.

William Cochran’s original work was partly facilitated by a grant from the National Science Foundation, and the preparation of this manuscript has been partly facilitated by National Science Foundation Grant No. SES 8023644.

L. E. M. F. M.

Contents

1. VARIATION, CONTROL, AND BIAS

1.1. Introduction
1.2. Strategy in Controlled Experiments - Sampled and Target Populations
1.3. The Principal Sources of Variation in the Responses
1.4. Methods of Control
1.5. Effects of Bias
1.6. Summary
References

2. STATISTICAL INTRODUCTION

2.1. Drawing Conclusions from Data
2.2. Tests of Significance
2.3. Confidence Intervals
2.4. Systematic Differences Between the Populations
2.5. The Model When Bias is Present
2.6. Summary
References

3. PRELIMINARY ASPECTS OF PLANNING

3.1. Introduction
3.2. The Statement of Objectives
3.3. The Treatments
3.4. Measurements of Treatment Levels for Individual Persons and the Effects of Grouping
3.5. Other Points Related to Treatments
3.6. Control Treatments
3.7. The Responses
3.8. Timing of Measurements
3.9. Summary
References

4. FURTHER ASPECTS OF PLANNING

4.1. Sample Size in Relation to Tests of Significance
4.2. Sample Size for Estimation
4.3. The Effect of Bias
4.4. More Complex Comparisons
4.5. Samples of Clusters
4.6. Plans for Reducing Nonresponse
4.7. Relationship Between Sampled and Target Populations
4.8. Pilot Studies and Pretests
4.9. The Devil’s Advocate
4.10. Summary
References

5. MATCHING

5.1. Confounding Variables
5.2. Matching
5.3. The Construction of Matches
5.4. Effect of Within-Class Matching on x
5.5. Effect of Caliper Matching on x
5.6. Effect of “Nearest Available” Matching on x
5.7. Effect of Mean Matching on x
5.8. Effects on Bias of $\bar{y}_1 - \bar{y}_2$
5.9. Effect of Matching on the Variance of $\bar{y}_1 - \bar{y}_2$
5.10. Introduction to Statistical Analysis of Pair-Matched Samples
5.11. Analysis with Mean Matching: y Continuous
5.12. Summary
References

6. ADJUSTMENTS IN ANALYSIS

6.1. Introduction
6.2. y Continuous: x’s Classified
6.3. y Binomial: x’s Classified
6.4. Treatment Difference Varying From Cell to Cell
6.5. y and x’s Quantitative: Adjustments by Regression (Covariance)
6.6. Regression Adjustments with Some x’s Classified
6.7. Effect of Regression Adjustments on Bias in $\bar{y}_1 - \bar{y}_2$
6.8. Effect of Curvature on Linear-Regression Adjustments
6.9. Effectiveness of Regression Adjustments on Precision
6.10. Effect of Errors in the Measurement of x
6.11. Matching and Adjustment Compared: In Experiments
6.12. Matching and Adjustment Compared: In Observational Studies
Appendix to Section 6.12
6.13. A Preliminary Test of Comparability
6.14. Summary
References

7. SIMPLE STUDY STRUCTURES

7.1. Introduction
7.2. The Single Group: Measured After Treatment Only
7.3. The Single Group: Measured Before and After Treatment
7.4. The Single Group: Series of Measurements Before and After
References

INDEX

WILEY SERIES IN PROBABILITY AND MATHEMATICAL STATISTICS

ESTABLISHED BY WALTER A. SHEWHART AND SAMUEL S. WILKS

Editors: Ralph A. Bradley, J. Stuart Hunter, David G. Kendall, Rupert G. Miller, Jr., Geoffrey S. Watson

Probability and Mathematical Statistics

ADLER The Geometry of Random Fields
ANDERSON The Statistical Analysis of Time Series
ANDERSON An Introduction to Multivariate Statistical Analysis
ARAUJO and GINE The Central Limit Theorem for Real and Banach Valued Random Variables
ARNOLD The Theory of Linear Models and Multivariate Analysis
BARLOW, BARTHOLOMEW, BREMNER, and BRUNK Statistical Inference Under Order Restrictions
BARNETT Comparative Statistical Inference, Second Edition
BHATTACHARYYA and JOHNSON Statistical Concepts and Methods
BILLINGSLEY Probability and Measure
CASSEL, SARNDAL, and WRETMAN Foundations of Inference in Survey Sampling
COCHRAN Contributions to Statistics
COCHRAN Planning and Analysis of Observational Studies
DE FINETTI Theory of Probability, Volumes I and II
DOOB Stochastic Processes
EATON Multivariate Statistics: A Vector Space Approach
FELLER An Introduction to Probability Theory and Its Applications, Volume I, Third Edition, Revised; Volume II, Second Edition
FULLER Introduction to Statistical Time Series
GRENANDER Abstract Inference
GUTTMAN Linear Models: An Introduction
HANNAN Multiple Time Series
HANSEN, HURWITZ, and MADOW Sample Survey Methods and Theory, Volumes I and II
HARDING and KENDALL Stochastic Geometry
HOEL Introduction to Mathematical Statistics, Fourth Edition
HUBER Robust Statistics
IMAN and CONOVER A Modern Approach to Statistics
IOSIFESCU Finite Markov Processes and Applications
ISAACSON and MADSEN Markov Chains
KAGAN, LINNIK, and RAO Characterization Problems in Mathematical Statistics
LAHA and ROHATGI Probability Theory
LARSON Introduction to Probability Theory and Statistical Inference, Third Edition
LEHMANN Testing Statistical Hypotheses
LEHMANN Theory of Point Estimation
MATTHES, KERSTAN, and MECKE Infinitely Divisible Point Processes
MUIRHEAD Aspects of Multivariate Statistical Theory
PARZEN Modern Probability Theory and Its Applications
PURI and SEN Nonparametric Methods in Multivariate Analysis
RANDLES and WOLFE Introduction to the Theory of Nonparametric Statistics
RAO Linear Statistical Inference and Its Applications, Second Edition
ROHATGI An Introduction to Probability Theory and Mathematical Statistics
ROSS Stochastic Processes
RUBINSTEIN Simulation and The Monte Carlo Method
SCHEFFE The Analysis of Variance
SEBER Linear Regression Analysis
SEN Sequential Nonparametrics: Invariance Principles and Statistical Inference