Montgomery: Applied Statistics and Probability for Engineers, 5th Edition (textbook front matter)



WileyPLUS

This online teaching and learning environment integrates the entire digital textbook with the most effective instructor and student resources. From multiple study paths, to self-assessment, to a wealth of interactive visual and audio resources, WileyPLUS gives you everything you need to personalize the teaching and learning experience.

ALL THE HELP, RESOURCES, AND PERSONAL SUPPORT YOU AND YOUR STUDENTS NEED!
www.wileyplus.com/firstday | www.WhereFacultyConnect.com | www.wiley.com/college/quickstart | www.wileyplus.com/support | www.wileyplus.com/accountmanager

Applied Statistics and Probability for Engineers
Fifth Edition

Douglas C. Montgomery, Arizona State University
George C. Runger, Arizona State University

John Wiley & Sons, Inc.

To: Meredith, Neil, Colin, and Cheryl
Rebecca, Elisa, George, and Taylor

EXECUTIVE PUBLISHER: Don Fowley
ASSOCIATE PUBLISHER: Daniel Sayre
ACQUISITIONS EDITOR: Jennifer Welter
PRODUCTION EDITOR: Trish McFadden
MARKETING MANAGER: Christopher Ruel
SENIOR DESIGNER: Kevin Murphy
MEDIA EDITOR: Lauren Sapira
PHOTO ASSOCIATE: Sheena Goldstein
EDITORIAL ASSISTANT: Alexandra Spicehandler
PRODUCTION SERVICES MANAGEMENT: Aptara
COVER IMAGE: Norm Christiansen

This book was set in 10/12 pt. Times New Roman PS by Aptara and printed and bound by R.R. Donnelley/Willard Division. The cover was printed by Phoenix Color. This book is printed on acid-free paper.

Copyright © 2011 John Wiley & Sons, Inc. All rights reserved.
No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, website www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030-5774, (201) 748-6011, fax (201) 748-6008, website http://www.wiley.com/go/permissions. To order books or for customer service, please call 1-800-CALL WILEY (225-5945).

ISBN-13: 978-0-470-05304-1

Printed in the United States of America
10 9 8 7 6 5 4 3 2 1

WILEY BOOKS BY THESE AUTHORS
Website: www.wiley.com/college/montgomery

Engineering Statistics, Fourth Edition, by Montgomery, Runger, and Hubele. Introduction to engineering statistics, with topical coverage appropriate for a one-semester course. A modest mathematical level, and an applied approach.

Applied Statistics and Probability for Engineers, Fifth Edition, by Montgomery and Runger. Introduction to engineering statistics, with topical coverage appropriate for either a one- or two-semester course. An applied approach to solving real-world engineering problems.

Introduction to Statistical Quality Control, Sixth Edition, by Douglas C. Montgomery. For a first course in statistical quality control. A comprehensive treatment of statistical methodology for quality control and improvement. Some aspects of quality management are also included, such as the six-sigma approach.

Design and Analysis of Experiments, Seventh Edition, by Douglas C. Montgomery. An introduction to design and analysis of experiments, with the modest prerequisite of a first course in statistical methods. For senior and graduate students and for practitioners, to design and analyze experiments for improving the quality and efficiency of working systems.

Introduction to Linear Regression Analysis, Fourth Edition, by Montgomery, Peck, and Vining. A comprehensive and thoroughly up-to-date look at regression analysis, still the most widely used technique in statistics today.

Response Surface Methodology: Process and Product Optimization Using Designed Experiments, Third Edition, by Myers, Montgomery, and Anderson-Cook. Website: www.wiley.com/college/myers. The exploration and optimization of response surfaces, for graduate courses in experimental design, and for applied statisticians, engineers, and chemical and physical scientists.

Generalized Linear Models: With Applications in Engineering and the Sciences, by Myers, Montgomery, and Vining. Website: www.wiley.com/college/myers. An introductory text or reference on generalized linear models (GLMs). The range of theoretical topics and applications appeals both to students and practicing professionals.

Introduction to Time Series Analysis and Forecasting, by Montgomery, Jennings, and Kulahci. Methods for modeling and analyzing time series data, to draw inferences about the data and generate forecasts useful to the decision maker. Minitab and SAS are used to illustrate how the methods are implemented in practice. For advanced undergraduate/first-year graduate students, with a prerequisite of basic statistical methods. Portions of the book require calculus and matrix algebra.

PREFACE

INTENDED AUDIENCE

This is an introductory textbook for a first course in applied statistics and probability for undergraduate students in engineering and the physical or chemical sciences.
These individuals play a significant role in designing and developing new products and manufacturing systems and processes, and they also improve existing systems. Statistical methods are an important tool in these activities because they provide the engineer with both descriptive and analytical methods for dealing with the variability in observed data. Although many of the methods we present are fundamental to statistical analysis in other disciplines, such as business and management, the life sciences, and the social sciences, we have elected to focus on an engineering-oriented audience. We believe that this approach will best serve students in engineering and the chemical/physical sciences and will allow them to concentrate on the many applications of statistics in these disciplines. We have worked hard to ensure that our examples and exercises are engineering- and science-based, and in almost all cases we have used examples of real data, either taken from a published source or based on our consulting experiences.

We believe that engineers in all disciplines should take at least one course in statistics. Unfortunately, because of other requirements, most engineers will only take one statistics course. This book can be used for a single course, although we have provided enough material for two courses in the hope that more students will see the important applications of statistics in their everyday work and elect a second course. We believe that this book will also serve as a useful reference.

We have retained the relatively modest mathematical level of the first four editions. We have found that engineering students who have completed one or two semesters of calculus should have no difficulty reading almost all of the text. It is our intent to give the reader an understanding of the methodology and how to apply it, not the mathematical theory.
We have made many enhancements in this edition, including reorganizing and rewriting major portions of the book and adding a number of new exercises.

ORGANIZATION OF THE BOOK

Perhaps the most common criticism of engineering statistics texts is that they are too long. Both instructors and students complain that it is impossible to cover all of the topics in the book in one or even two terms. For authors, this is a serious issue because there is great variety in both the content and level of these courses, and the decisions about what material to delete without limiting the value of the text are not easy. Decisions about which topics to include in this edition were made based on a survey of instructors.

Chapter 1 is an introduction to the field of statistics and how engineers use statistical methodology as part of the engineering problem-solving process. This chapter also introduces the reader to some engineering applications of statistics, including building empirical models, designing engineering experiments, and monitoring manufacturing processes. These topics are discussed in more depth in subsequent chapters.

Chapters 2, 3, 4, and 5 cover the basic concepts of probability, discrete and continuous random variables, probability distributions, expected values, joint probability distributions, and independence. We have given a reasonably complete treatment of these topics but have avoided many of the mathematical or more theoretical details.

Chapter 6 begins the treatment of statistical methods with random sampling; data summary and description techniques, including stem-and-leaf plots, histograms, box plots, and probability plotting; and several types of time series plots. Chapter 7 discusses sampling distributions, the central limit theorem, and point estimation of parameters.
This chapter also introduces some of the important properties of estimators, the method of maximum likelihood, the method of moments, and Bayesian estimation.

Chapter 8 discusses interval estimation for a single sample. Topics included are confidence intervals for means, variances or standard deviations, proportions, prediction intervals, and tolerance intervals. Chapter 9 discusses hypothesis tests for a single sample. Chapter 10 presents tests and confidence intervals for two samples. This material has been extensively rewritten and reorganized. There is detailed information and examples of methods for determining appropriate sample sizes. We want the student to become familiar with how these techniques are used to solve real-world engineering problems and to get some understanding of the concepts behind them. We give a logical, heuristic development of the procedures rather than a formal, mathematical one. We have also included some material on nonparametric methods in these chapters.

Chapters 11 and 12 present simple and multiple linear regression, including model adequacy checking and regression model diagnostics, and an introduction to logistic regression. We use matrix algebra throughout the multiple regression material (Chapter 12) because it is the only easy way to understand the concepts presented. Scalar arithmetic presentations of multiple regression are awkward at best, and we have found that undergraduate engineers are exposed to enough matrix algebra to understand the presentation of this material.

Chapters 13 and 14 deal with single- and multifactor experiments, respectively. The notions of randomization, blocking, factorial designs, interactions, graphical data analysis, and fractional factorials are emphasized.
Chapter 15 introduces statistical quality control, emphasizing the control chart and the fundamentals of statistical process control.

WHAT'S NEW IN THIS EDITION?

We received much feedback from users of the fourth edition of the book, and in response we have made substantial changes in this new edition.

• The most obvious change is that the chapter on nonparametric methods is gone. We have integrated most of this material into Chapters 9 and 10 on statistical hypothesis testing, where we think it is a much better fit if instructors elect to cover these techniques.
• Another substantial change is the increased emphasis on the use of the P-value in hypothesis testing. Many sections of several chapters were rewritten to reflect this.
• We have also rewritten and modified many portions of the book to improve the explanations and try to make the concepts easier to understand.
• We have added brief comments at the end of examples to emphasize the practical interpretations of the results.
• We have also added approximately 200 new homework exercises.

FEATURED IN THIS BOOK

Definitions, Key Concepts, and Equations. Throughout the text, definitions and key concepts and equations are highlighted by a box to emphasize their importance. For example, the box in Section 4-4 (Mean and Variance of a Continuous Random Variable) reads:

The mean and variance can also be defined for a continuous random variable. Integration replaces summation in the discrete definitions. If a probability density function is viewed as a loading on a beam as in Fig. 4-1, the mean is the balance point.

Mean and Variance: Suppose X is a continuous random variable with probability density function f(x). The mean or expected value of X, denoted as μ or E(X), is

  μ = E(X) = ∫ x f(x) dx    (4-4)

The variance of X, denoted as V(X) or σ², is

  σ² = V(X) = ∫ (x − μ)² f(x) dx = ∫ x² f(x) dx − μ²

The standard deviation of X is σ = √σ². The equivalence of the two formulas for variance can be derived from the same approach used for discrete random variables.

Learning Objectives. Learning Objectives at the start of each chapter guide the students in what they are expected to take away from this chapter and serve as a study reference. For example, from Chapter 4: After careful study of this chapter you should be able to do the following:

1. Determine probabilities from probability density functions
2. Determine probabilities from cumulative distribution functions, and cumulative distribution functions from probability density functions, and the reverse
3. Calculate means and variances for continuous random variables
4. Understand the assumptions for some common continuous probability distributions
5. Select an appropriate continuous probability distribution to calculate probabilities in specific applications
6. Calculate probabilities, and determine means and variances for some common continuous probability distributions
7. Standardize normal random variables
8. Use the table for the cumulative distribution function of a standard normal distribution to calculate probabilities
9. Approximate probabilities for some binomial and Poisson distributions

EXAMPLE 4-6 Electric Current. For the copper current measurement in Example 4-1, the mean of X is

  E(X) = ∫₀²⁰ x f(x) dx = 0.05 x²/2 |₀²⁰ = 10

The variance of X is

  V(X) = ∫₀²⁰ (x − 10)² f(x) dx = 0.05 (x − 10)³/3 |₀²⁰ = 33.33

The expected value of a function h(X) of a continuous random variable is also defined in a straightforward manner: if X is a continuous random variable with probability density function f(x),

  E[h(X)] = ∫ h(x) f(x) dx    (4-5)

Seven-Step Procedure for Hypothesis Testing. The text introduces a sequence of seven steps in applying hypothesis-testing methodology and explicitly exhibits this procedure in examples.
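Example 4-6 can be checked numerically. A minimal sketch, not from the textbook: the density f(x) = 0.05 on [0, 20] comes from Example 4-1, and a simple midpoint-rule integration stands in for the closed-form integrals.

```python
# Numerical check of Example 4-6: X has density f(x) = 0.05 on [0, 20],
# so E(X) = 10 and V(X) = 33.33. Illustrative sketch using a midpoint rule.

def integrate(g, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

f = lambda x: 0.05  # uniform density of the current X on [0, 20]

mean = integrate(lambda x: x * f(x), 0, 20)                # E(X) = ∫ x f(x) dx
var = integrate(lambda x: (x - mean) ** 2 * f(x), 0, 20)   # V(X) = ∫ (x-μ)² f(x) dx

print(round(mean, 2), round(var, 2))  # 10.0 33.33
```

The same `integrate` helper evaluates E[h(X)] in Equation (4-5) for any function h by integrating `h(x) * f(x)`.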
9-1.6 General Procedure for Hypothesis Tests. This chapter develops hypothesis-testing procedures for many practical problems. Use of the following sequence of steps in applying hypothesis-testing methodology is recommended.

1. Parameter of interest: From the problem context, identify the parameter of interest.
2. Null hypothesis, H0: State the null hypothesis, H0.
3. Alternative hypothesis, H1: Specify an appropriate alternative hypothesis, H1.
4. Test statistic: Determine an appropriate test statistic.
5. Reject H0 if: State the rejection criteria for the null hypothesis.
6. Computations: Compute any necessary sample quantities, substitute these into the equation for the test statistic, and compute that value.
7. Draw conclusions: Decide whether or not H0 should be rejected and report that in the problem context.

Steps 1-4 should be completed prior to examination of the sample data. This sequence of steps will be illustrated in subsequent sections.

Figures. Numerous figures throughout the text illustrate statistical concepts in multiple formats. For example, Table 11-1 (Oxygen and Hydrocarbon Levels) lists hydrocarbon level x (%) and purity y (%) for 20 observations:

Observation   x (%)   y (%)
 1            0.99    90.01
 2            1.02    89.05
 3            1.15    91.43
 4            1.29    93.74
 5            1.46    96.73
 6            1.36    94.45
 7            0.87    87.59
 8            1.23    91.77
 9            1.55    99.42
10            1.40    93.65
11            1.19    93.54
12            1.15    92.52
13            0.98    90.56
14            1.01    89.54
15            1.11    89.85
16            1.20    90.39
17            1.26    93.25
18            1.32    93.41
19            1.43    94.98
20            0.95    87.33

[Figure 11-1: Scatter diagram of oxygen purity versus hydrocarbon level from Table 11-1. Figure 11-2: The distribution of Y for a given value of x for the oxygen purity-hydrocarbon data, showing the true regression line μ_{Y|x} = β0 + β1x (illustrated as 75 + 15x) evaluated at x = 1.00 and x = 1.25.]
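The data in Table 11-1 can be used to fit the simple linear regression of purity on hydrocarbon level by least squares. A minimal sketch, not from the textbook (the book itself computes this fit with Minitab in Chapter 11), using the standard closed-form formulas b1 = Sxy/Sxx and b0 = ȳ − b1·x̄:

```python
# Least-squares fit of purity (y) on hydrocarbon level (x) for the 20
# observations in Table 11-1. Illustrative sketch of the standard
# simple-linear-regression formulas.

x = [0.99, 1.02, 1.15, 1.29, 1.46, 1.36, 0.87, 1.23, 1.55, 1.40,
     1.19, 1.15, 0.98, 1.01, 1.11, 1.20, 1.26, 1.32, 1.43, 0.95]
y = [90.01, 89.05, 91.43, 93.74, 96.73, 94.45, 87.59, 91.77, 99.42, 93.65,
     93.54, 92.52, 90.56, 89.54, 89.85, 90.39, 93.25, 93.41, 94.98, 87.33]

n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)                        # Sxx
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))   # Sxy

b1 = sxy / sxx          # estimated slope
b0 = ybar - b1 * xbar   # estimated intercept
print(f"purity = {b0:.2f} + {b1:.2f} * hydrocarbon (fitted)")
```

With these data the fitted line comes out close to purity = 74.3 + 14.9x, consistent with the scatter diagram in Figure 11-1.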
Minitab Output. Throughout the book, we have presented output from Minitab as typical examples of what can be done with modern statistical software. [Figure 6-6: A stem-and-leaf diagram from Minitab — a character stem-and-leaf display of Strength, N = 80, Leaf Unit = 1.0.]

Example Problems. A set of example problems provides the student with detailed solutions and comments for interesting, real-world situations. Brief practical interpretations have been added in this edition.

EXAMPLE 10-1 Paint Drying Time. A product developer is interested in reducing the drying time of a primer paint. Two formulations of the paint are tested; formulation 1 is the standard chemistry, and formulation 2 has a new drying ingredient that should reduce the drying time. From experience, it is known that the standard deviation of drying time is 8 minutes, and this inherent variability should be unaffected by the addition of the new ingredient. Ten specimens are painted with formulation 1, and another 10 specimens are painted with formulation 2; the 20 specimens are painted in random order. The two sample average drying times are x̄1 = 121 minutes and x̄2 = 112 minutes, respectively. What conclusions can the product developer draw about the effectiveness of the new ingredient, using α = 0.05?

We apply the seven-step procedure to this problem as follows:

1. Parameter of interest: The quantity of interest is the difference in mean drying times, μ1 − μ2, and Δ0 = 0.
2. Null hypothesis: H0: μ1 − μ2 = 0, or H0: μ1 = μ2.
3. Alternative hypothesis: H1: μ1 > μ2. We want to reject H0 if the new ingredient reduces mean drying time.
4. Test statistic: The test statistic is

  z0 = (x̄1 − x̄2 − 0) / √(σ1²/n1 + σ2²/n2)

where σ1² = σ2² = (8)² = 64 and n1 = n2 = 10.
5. Reject H0 if: Reject H0: μ1 = μ2 if the P-value is less than 0.05.
6. Computations: Since x̄1 = 121 minutes and x̄2 = 112 minutes, the test statistic is

  z0 = (121 − 112) / √((8)²/10 + (8)²/10) = 2.52

7. Conclusion: Since z0 = 2.52, the P-value is P = 1 − Φ(2.52) = 0.0059, so we reject H0 at the 0.05 level.

Practical Interpretation: We conclude that adding the new ingredient to the paint significantly reduces the drying time. This is a strong conclusion.
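The computations in Example 10-1 can be reproduced in a few lines. A minimal sketch, not from the textbook, using the identity Φ(z) = (1 + erf(z/√2))/2 for the standard normal CDF:

```python
import math

# Two-sample z-test with known standard deviations, as in Example 10-1:
# H0: mu1 - mu2 = 0 versus H1: mu1 > mu2, with sigma1 = sigma2 = 8 and
# n1 = n2 = 10 specimens per formulation.
xbar1, xbar2 = 121.0, 112.0
sigma1 = sigma2 = 8.0
n1 = n2 = 10

z0 = (xbar1 - xbar2) / math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)

# Standard normal CDF via the error function.
phi = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
p_value = 1.0 - phi(z0)  # upper-tailed alternative H1: mu1 > mu2

print(round(z0, 2), round(p_value, 4))  # 2.52 0.0059
```

Since the P-value is below α = 0.05, the code reaches the same conclusion as step 7 of the example: reject H0.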
Exercises. Each chapter has an extensive collection of exercises, including end-of-section exercises that emphasize the material in that section, supplemental exercises at the end of the chapter that cover the scope of chapter topics and require the student to make a decision about the approach they will use to solve the problem, and mind-expanding exercises that often require the student to extend the text material somewhat or to apply it in a novel situation. Answers are provided to most odd-numbered exercises in Appendix C in the text, and the WileyPLUS online learning environment includes complete detailed solutions to selected exercises for students. [Sample page: Exercises for Section 5-5 on general functions of random variables — exercises 5-67 through 5-76 (transformations such as Y = 2X + 1, Y = X², Y = e^X, and Y = ln X of discrete and continuous random variables) and mind-expanding exercises 5-96 through 5-100 (independence results for jointly distributed random variables and a multivariate extension of the hypergeometric distribution).]

Important Terms and Concepts. At the end of each chapter is a list of important terms and concepts for an easy self-check and study tool. For Chapter 5 these are: bivariate distribution, bivariate normal distribution, conditional mean, conditional variance, contour plots, correlation, covariance, joint probability density function, joint probability mass function, multinomial distribution, and reproductive property of the normal distribution.

STUDENT RESOURCES

Data Sets: Data sets for all examples and exercises in the text. Visit the student section of the book Web site at www.wiley.com/college/montgomery to access these materials.

Student Solutions Manual: Detailed solutions for selected problems in the book. The Student Solutions Manual may be purchased from the Web site at www.wiley.com/college/montgomery.

INSTRUCTOR RESOURCES

The following resources are available only to instructors who adopt the text:

Solutions Manual: All solutions to the exercises in the text.
Data Sets: Data sets for all examples and exercises in the text.
Image Gallery of Text Figures
PowerPoint Lecture Slides
Section on Logistic Regression

These instructor-only resources are password-protected. Visit the instructor section of the book Web site at www.wiley.com/college/montgomery to register for a password to access these materials.

MINITAB

A student version of Minitab is available as an option to purchase in a set with this text. Student versions of software often do not have all the functionality that full versions do. Consequently, student versions may not support all the concepts presented in this text. If you would like to adopt for your course the set of this text with the student version of Minitab, please contact your local Wiley representative at www.wiley.com/college/rep. Alternatively, students may find information about how to purchase the professional version of the software for academic use at www.minitab.com.

WILEYPLUS

This online teaching and learning environment integrates the entire digital textbook with the most effective instructor and student resources to fit every learning style. With WileyPLUS:

• Students achieve concept mastery in a rich, structured environment that's available 24/7.
• Instructors personalize and manage their course more effectively with assessment, assignments, grade tracking, and more.

WileyPLUS can complement your current textbook or replace the printed text altogether.

For Students: Personalize the learning experience. Different learning styles, different levels of proficiency, different levels of preparation: each of your students is unique. WileyPLUS empowers them to take advantage of their individual strengths:

• Students receive timely access to resources that address their demonstrated needs, and get immediate feedback and remediation when needed.
• Integrated, multi-media resources, including audio and visual exhibits, demonstration problems, and much more, provide multiple study paths to fit each student's learning preferences and encourage more active learning.
• WileyPLUS includes many opportunities for self-assessment linked to the relevant portions of the text. Students can take control of their own learning and practice until they master the material.

For Instructors: Personalize the teaching experience. WileyPLUS empowers you with the tools and resources you need to make your teaching even more effective:

• You can customize your classroom presentation with a wealth of resources and functionality, from PowerPoint slides to a database of rich visuals. You can even add your own materials to your WileyPLUS course.
• With WileyPLUS you can identify those students who are falling behind and intervene accordingly, without having to wait for them to come to office hours.
• WileyPLUS simplifies and automates such tasks as student performance assessment, making assignments, scoring student work, keeping grades, and more.

COURSE SYLLABUS SUGGESTIONS

This is a very flexible textbook because instructors' ideas about what should be in a first course on statistics for engineers vary widely, as do the abilities of different groups of students. Therefore, we hesitate to give too much advice, but will explain how we use the book.

We believe that a first course in statistics for engineers should be primarily an applied statistics course, not a probability course. In our one-semester course we cover all of Chapter 1 (in one or two lectures); overview the material on probability, putting most of the emphasis on the normal distribution (six to eight lectures); discuss most of Chapters 6 through 10 on confidence intervals and tests (twelve to fourteen lectures); introduce regression models in Chapter 11 (four lectures); give an introduction to the design of experiments from Chapters 13 and 14 (six lectures); and present the basic concepts of statistical process control, including the Shewhart control chart from Chapter 15 (four lectures). This leaves about three to four periods for exams and review.
Let us emphasize that the purpose of this course is to introduce engineers to how statistics can be used to solve real-world engineering problems, not to weed out the less mathematically gifted students. This course is not the baby math-stat course that is all too often given to engineers.

If a second semester is available, it is possible to cover the entire book, including much of the supplemental material, if appropriate for the audience. It would also be possible to assign and work many of the homework problems in class to reinforce the understanding of the concepts. Obviously, multiple regression and more design of experiments would be major topics in a second course.

USING THE COMPUTER

In practice, engineers use computers to apply statistical methods to solve problems. Therefore, we strongly recommend that the computer be integrated into the class. Throughout the book we have presented output from Minitab as typical examples of what can be done with modern statistical software. In teaching, we have used other software packages, including Statgraphics, JMP, and Statistica. We did not clutter up the book with examples from many different packages because how the instructor integrates the software into the class is ultimately more important than which package is used. All text data are available in electronic form on the textbook Web site. In some chapters, there are problems that we feel should be worked using computer software. We have marked these problems with a special icon in the margin.

In our own classrooms, we use the computer in almost every lecture and demonstrate how the technique is implemented in software as soon as it is discussed in the lecture. Student versions of many statistical software packages are available at low cost, and students can either purchase their own copy or use the products available on the PC local area networks. We have found that this greatly improves the pace of the course and student understanding of the material.
Users should be aware that final answers may differ slightly due to different numerical precision and rounding protocols among software packages.

ACKNOWLEDGMENTS

We would like to express our grateful appreciation to the many organizations and individuals who have contributed to this book. Many instructors who used the previous editions provided excellent suggestions that we have tried to incorporate in this revision.

We would like to thank the following who assisted in contributing to and/or reviewing material for the WileyPLUS course:

Michael DeVasher, Rose-Hulman Institute of Technology
Craig Downing, Rose-Hulman Institute of Technology
Julie Fortune, University of Alabama in Huntsville
Rubin Wei, Texas A&M University

We would also like to thank the following for their assistance in checking the accuracy and completeness of the exercises and the solutions to exercises:

Abdelaziz Berrado, Arizona State University
Dr. Connie Borror, Arizona State University
Patrick Egbunonu, Queen's University
James C. Ford, Ford Consulting Associates
Dr. Alejandro Heredia-Langner
Jing Hu, Arizona State University
Busaba Laungrungrong, Arizona State University
Fang Li, Arizona State University
Nuttha Lurponglukana, Arizona State University
Sarah Streett
Yolande Tra, Rochester Institute of Technology
Dr. Lora Zimmer

We are also indebted to Dr. Smiley Cheng for permission to adapt many of the statistical tables from his excellent book (with Dr. James Fu), Statistical Tables for Classroom and Exam Room. John Wiley and Sons, Prentice Hall, the Institute of Mathematical Statistics, and the editors of Biometrics allowed us to use copyrighted material, for which we are grateful.

Douglas C. Montgomery
George C. Runger
Contents

INSIDE FRONT COVER  Index of Applications in Examples and Exercises

CHAPTER 1  The Role of Statistics in Engineering  1
1-1  The Engineering Method and Statistical Thinking  2
1-2  Collecting Engineering Data  5
     1-2.1  Basic Principles  5
     1-2.2  Retrospective Study  5
     1-2.3  Observational Study  6
     1-2.4  Designed Experiments  6
     1-2.5  Observing Processes Over Time  9
1-3  Mechanistic and Empirical Models  12
1-4  Probability and Probability Models  15

CHAPTER 2  Probability  17
2-1  Sample Spaces and Events  18
     2-1.1  Random Experiments  18
     2-1.2  Sample Spaces  19
     2-1.3  Events  22
     2-1.4  Counting Techniques  24
2-2  Interpretations and Axioms of Probability  31
2-3  Addition Rules  37
2-4  Conditional Probability  41
2-5  Multiplication and Total Probability Rules  47
2-6  Independence  50
2-7  Bayes' Theorem  55
2-8  Random Variables  57

CHAPTER 3  Discrete Random Variables and Probability Distributions  66
3-1  Discrete Random Variables  67
3-2  Probability Distributions and Probability Mass Functions  68
3-3  Cumulative Distribution Functions  71
3-4  Mean and Variance of a Discrete Random Variable  74
3-5  Discrete Uniform Distribution  77
3-6  Binomial Distribution  79
3-7  Geometric and Negative Binomial Distributions  86
3-8  Hypergeometric Distribution  92
3-9  Poisson Distribution  97

CHAPTER 4  Continuous Random Variables and Probability Distributions  107
4-1  Continuous Random Variables  108
4-2  Probability Distributions and Probability Density Functions  108
4-3  Cumulative Distribution Functions  111
4-4  Mean and Variance of a Continuous Random Variable  114
4-5  Continuous Uniform Distribution  116
4-6  Normal Distribution  118
4-7  Normal Approximation to the Binomial and Poisson Distributions  127
4-8  Exponential Distribution  132
4-9  Erlang and Gamma Distributions  138
4-10  Weibull Distribution  141
4-11  Lognormal Distribution  144
4-12  Beta Distribution  146

CHAPTER 5  Joint Probability Distributions  152
5-1  Two or More Random Variables  153
     5-1.1  Joint Probability Distributions  153
     5-1.2  Marginal Probability Distributions  156
     5-1.3  Conditional Probability Distributions  158
     5-1.4  Independence  161
     5-1.5  More Than Two Random Variables  163
5-2  Covariance and Correlation  170
5-3  Common Joint Distributions  176
     5-3.1  Multinomial Distribution  176
     5-3.2  Bivariate Normal Distribution  177
5-4  Linear Functions of Random Variables  181
5-5  General Functions of Random Variables  185

CHAPTER 6  Descriptive Statistics  191
6-1  Numerical Summaries of Data  192
6-2  Stem-and-Leaf Diagrams  197
6-3  Frequency Distributions and Histograms  203
6-4  Box Plots  208
6-5  Time Sequence Plots  210
6-6  Probability Plots  214

CHAPTER 7  Sampling Distributions and Point Estimation of Parameters  223
7-1  Point Estimation  224
7-2  Sampling Distributions and the Central Limit Theorem  225
7-3  General Concepts of Point Estimation  231
     7-3.1  Unbiased Estimators  231
     7-3.2  Variance of a Point Estimator  232
     7-3.3  Standard Error: Reporting a Point Estimate  233
     7-3.4  Mean Squared Error of an Estimator  234
7-4  Methods of Point Estimation  237
     7-4.1  Method of Moments  237
     7-4.2  Method of Maximum Likelihood  239
     7-4.3  Bayesian Estimation of Parameters  244

CHAPTER 8  Statistical Intervals for a Single Sample  251
8-1  Confidence Interval on the Mean of a Normal Distribution, Variance Known  253
     8-1.1  Development of the Confidence Interval and Its Basic Properties  253
     8-1.2  Choice of Sample Size  256
     8-1.3  One-Sided Confidence Bounds  257
     8-1.4  General Method to Derive a Confidence Interval  258
     8-1.5  Large-Sample Confidence Interval for μ  258
8-2  Confidence Interval on the Mean of a Normal Distribution, Variance Unknown  261
     8-2.1  t Distribution  262
     8-2.2  t Confidence Interval on μ  263
8-3  Confidence Interval on the Variance and Standard Deviation of a Normal Distribution  266
8-4  Large-Sample Confidence Interval for a Population Proportion  270
8-5  Guidelines for Constructing Confidence Intervals  273
8-6  Tolerance and Prediction Intervals  274
     8-6.1  Prediction Interval for a Future Observation  274
     8-6.2  Tolerance Interval for a Normal Distribution  276

CHAPTER 9  Tests of Hypotheses for a Single Sample  283
9-1  Hypothesis Testing  284
     9-1.1  Statistical Hypotheses  284
     9-1.2  Tests of Statistical Hypotheses  286
     9-1.3  One-Sided and Two-Sided Hypotheses  292
     9-1.4  P-Values in Hypothesis Tests  294
     9-1.5  Connection between Hypothesis Tests and Confidence Intervals  295
     9-1.6  General Procedure for Hypothesis Tests  296
9-2  Tests on the Mean of a Normal Distribution, Variance Known  299
     9-2.1  Hypothesis Tests on the Mean  299
     9-2.2  Type II Error and Choice of Sample Size  303
     9-2.3  Large-Sample Test  307
9-3  Tests on the Mean of a Normal Distribution, Variance Unknown  310
     9-3.1  Hypothesis Tests on the Mean  310
     9-3.2  Type II Error and Choice of Sample Size  314
9-4  Tests on the Variance and Standard Deviation of a Normal Distribution  319
     9-4.1  Hypothesis Tests on the Variance  319
     9-4.2  Type II Error and Choice of Sample Size  322
9-5  Tests on a Population Proportion  323
     9-5.1  Large-Sample Tests on a Proportion  324
     9-5.2  Type II Error and Choice of Sample Size  326
9-6  Summary Table of Inference Procedures for a Single Sample  329
9-7  Testing for Goodness of Fit  330
9-8  Contingency Table Tests  333
9-9  Nonparametric Procedures  337
     9-9.1  The Sign Test  337
     9-9.2  The Wilcoxon Signed-Rank Test  342
     9-9.3  Comparison to the t-Test  344

CHAPTER 10  Statistical Inference for Two Samples  351
10-1  Inference on the Difference in Means of Two Normal Distributions, Variances Known  352
     10-1.1  Hypothesis Tests on the Difference in Means, Variances Known  354
     10-1.2  Type II Error and Choice of Sample Size  356
     10-1.3  Confidence Interval on the Difference in Means, Variances Known  357
10-2  Inference on the Difference in Means of Two Normal Distributions, Variances Unknown  361
     10-2.1  Hypothesis Tests on the Difference in Means, Variances Unknown  361
     10-2.2  Type II Error and Choice of Sample Size  367
     10-2.3  Confidence Interval on the Difference in Means, Variances Unknown  368
10-3  A Nonparametric Test for the Difference in Two Means  373
     10-3.1  Description of the Wilcoxon Rank-Sum Test  373
     10-3.2  Large-Sample Approximation  374
     10-3.3  Comparison to the t-Test  375
10-4  Paired t-Test  376
10-5  Inference on the Variances of Two Normal Distributions  382
     10-5.1  F Distribution  383
     10-5.2  Hypothesis Tests on the Ratio of Two Variances  384
     10-5.3  Type II Error and Choice of Sample Size  387
     10-5.4  Confidence Interval on the Ratio of Two Variances  387
10-6  Inference on Two Population Proportions  389
     10-6.1  Large-Sample Tests on the Difference in Population Proportions  389
     10-6.2  Type II Error and Choice of Sample Size  391
     10-6.3  Confidence Interval on the Difference in Population Proportions  392
10-7  Summary Table and Roadmap for Inference Procedures for Two Samples  394

CHAPTER 11  Simple Linear Regression and Correlation  401
11-1  Empirical Models  402
11-2  Simple Linear Regression  405
11-3  Properties of the Least Squares Estimators  414
11-4  Hypothesis Tests in Simple Linear Regression  415
     11-4.1  Use of t-Tests  415
     11-4.2  Analysis of Variance Approach to Test Significance of Regression  417
11-5  Confidence Intervals  421
     11-5.1  Confidence Intervals on the Slope and Intercept  421
     11-5.2  Confidence Interval on the Mean Response  421
11-6  Prediction of New Observations  423
11-7  Adequacy of the Regression Model  426
     11-7.1  Residual Analysis  426
     11-7.2  Coefficient of Determination (R²)  428
11-8  Correlation  431
11-9  Regression on Transformed Variables  437
11-10  Logistic Regression  440

CHAPTER 12  Multiple Linear Regression  449
12-1  Multiple Linear Regression Model  450
     12-1.1  Introduction  450
     12-1.2  Least Squares Estimation of the Parameters  452
     12-1.3  Matrix Approach to Multiple Linear Regression  456
     12-1.4  Properties of the Least Squares Estimators  460
12-2  Hypothesis Tests in Multiple Linear Regression  470
     12-2.1  Test for Significance of Regression  470
     12-2.2  Tests on Individual Regression Coefficients and Subsets of Coefficients  472
12-3  Confidence Intervals in Multiple Linear Regression  479
     12-3.1  Confidence Intervals on Individual Regression Coefficients  479
     12-3.2  Confidence Interval on the Mean Response  480
12-4  Prediction of New Observations  481
12-5  Model Adequacy Checking  484
     12-5.1  Residual Analysis  484
     12-5.2  Influential Observations  487
12-6  Aspects of Multiple Regression Modeling  490
     12-6.1  Polynomial Regression Models  490
     12-6.2  Categorical Regressors and Indicator Variables  492
     12-6.3  Selection of Variables and Model Building  494
     12-6.4  Multicollinearity  502

CHAPTER 13  Design and Analysis of Single-Factor Experiments: The Analysis of Variance  513
13-1  Designing Engineering Experiments  514
13-2  Completely Randomized Single-Factor Experiment  515
     13-2.1  Example: Tensile Strength  515
     13-2.2  Analysis of Variance  517
     13-2.3  Multiple Comparisons Following the ANOVA  524
     13-2.4  Residual Analysis and Model Checking  526
     13-2.5  Determining Sample Size  527
13-3  The Random-Effects Model  533
     13-3.1  Fixed Versus Random Factors  533
     13-3.2  ANOVA and Variance Components  534
13-4  Randomized Complete Block Design  538
     13-4.1  Design and Statistical Analysis  538
     13-4.2  Multiple Comparisons  542
     13-4.3  Residual Analysis and Model Checking  544

CHAPTER 14  Design of Experiments with Several Factors  551
14-1  Introduction  552
14-2  Factorial Experiments  555
14-3  Two-Factor Factorial Experiments  558
     14-3.1  Statistical Analysis of the Fixed-Effects Model  559
     14-3.2  Model Adequacy Checking  564
     14-3.3  One Observation per Cell  566
14-4  General Factorial Experiments  568
14-5  2^k Factorial Designs  571
     14-5.1  2^2 Design  571
     14-5.2  2^k Design for k ≥ 3 Factors  577
     14-5.3  Single Replicate of the 2^k Design  584
     14-5.4  Addition of Center Points to a 2^k Design  588
14-6  Blocking and Confounding in the 2^k Design  595
14-7  Fractional Replication of the 2^k Design  602
     14-7.1  One-Half Fraction of the 2^k Design  602
     14-7.2  Smaller Fractions: The 2^(k-p) Fractional Factorial  608
14-8  Response Surface Methods and Designs  619

CHAPTER 15  Statistical Quality Control  637
15-1  Quality Improvement and Statistics  638
     15-1.1  Statistical Quality Control  639
     15-1.2  Statistical Process Control  640
15-2  Introduction to Control Charts  640
     15-2.1  Basic Principles  640
     15-2.2  Design of a Control Chart  644
     15-2.3  Rational Subgroups  646
     15-2.4  Analysis of Patterns on Control Charts  647
15-3  X̄ and R or S Control Charts  649
15-4  Control Charts for Individual Measurements  658
15-5  Process Capability  662
15-6  Attribute Control Charts  668
     15-6.1  P Chart (Control Chart for Proportions)  668
     15-6.2  U Chart (Control Chart for Defects per Unit)  670
15-7  Control Chart Performance  673
15-8  Time-Weighted Charts  676
     15-8.1  Cumulative Sum Control Chart  676
     15-8.2  Exponentially Weighted Moving Average Control Chart  682
15-9  Other SPC Problem-Solving Tools  688
15-10  Implementing SPC  690

APPENDICES  702

APPENDIX A: Statistical Tables and Charts  703
Table I  Summary of Common Probability Distributions  704
Table II  Cumulative Binomial Probabilities P(X ≤ x)  705
Table III  Cumulative Standard Normal Distribution  708
Table IV  Percentage Points χ²(α,ν) of the Chi-Squared Distribution  710
Table V  Percentage Points t(α,ν) of the t Distribution  711
Table VI  Percentage Points f(α,ν1,ν2) of the F Distribution  712
Chart VII  Operating Characteristic Curves  717
Table VIII  Critical Values for the Sign Test  726
Table IX  Critical Values for the Wilcoxon Signed-Rank Test  726
Table X  Critical Values for the Wilcoxon Rank-Sum Test  727
Table XI  Factors for Constructing Variables Control Charts  728
Table XII  Factors for Tolerance Intervals  729

APPENDIX B: Answers to Selected Exercises  731

APPENDIX C: Bibliography  749

GLOSSARY  751

INDEX  762

INDEX OF APPLICATIONS IN EXAMPLES AND EXERCISES, CONTINUED  766
CHAPTER 1

The Role of Statistics in Engineering

Statistics is a science that helps us make decisions and draw conclusions in the presence of variability. For example, civil engineers working in the transportation field are concerned about the capacity of regional highway systems. A typical problem would involve data on the number of nonwork, home-based trips, the number of persons per household, and the number of vehicles per household, and the objective would be to produce a trip-generation model relating trips to the number of persons per household and the number of vehicles per household. A statistical technique called regression analysis can be used to construct this model. The trip-generation model is an important tool for transportation systems planning. Regression methods are among the most widely used statistical techniques in engineering. They are presented in Chapters 11 and 12.

Hospital emergency departments (EDs) are an important part of the health-care delivery system. The process by which patients arrive at the ED is highly variable and can depend on the hour of the day and the day of the week, as well as on longer-term cyclical variations. The service process is also highly variable, depending on the types of services that the patients require, the number of patients in the ED, and how the ED is staffed and organized. The capacity of an ED is also limited, so consequently some patients experience long waiting times. How long do patients wait, on average? This is an important question for health-care providers. If waiting times become excessive, some patients will leave without receiving treatment (LWOT). Patients who LWOT are a serious problem, as they do not have their medical concerns addressed and are at risk for further problems and complications. Therefore, another important question is: What proportion of patients LWOT from the ED?
These questions can be solved by employing probability models to describe the ED, and from these models very precise estimates of waiting times and the number of patients who LWOT can be obtained. Probability models that can be used to solve these types of problems are discussed in Chapters 2 through 5.

The concepts of probability and statistics are powerful ones and contribute extensively to the solutions of many types of engineering problems. You will encounter many examples of these applications in this book.

CHAPTER OUTLINE

1-1  THE ENGINEERING METHOD AND STATISTICAL THINKING
1-2  COLLECTING ENGINEERING DATA
     1-2.1  Basic Principles
     1-2.2  Retrospective Study
     1-2.3  Observational Study
     1-2.4  Designed Experiments
     1-2.5  Observing Processes Over Time
1-3  MECHANISTIC AND EMPIRICAL MODELS
1-4  PROBABILITY AND PROBABILITY MODELS

LEARNING OBJECTIVES

After careful study of this chapter you should be able to do the following:
1. Identify the role that statistics can play in the engineering problem-solving process
2. Discuss how variability affects the data collected and used for making engineering decisions
3. Explain the difference between enumerative and analytical studies
4. Discuss the different methods that engineers use to collect data
5. Identify the advantages that designed experiments have in comparison to other methods of collecting engineering data
6. Explain the differences between mechanistic models and empirical models
7. Discuss how probability and probability models are used in engineering and science

1-1 THE ENGINEERING METHOD AND STATISTICAL THINKING

An engineer is someone who solves problems of interest to society by the efficient application of scientific principles. Engineers accomplish this by either refining an existing product or process or by designing a new product or process that meets customers' needs.
The engineering, or scientific, method is the approach to formulating and solving these problems. The steps in the engineering method are as follows:

1. Develop a clear and concise description of the problem.
2. Identify, at least tentatively, the important factors that affect this problem or that may play a role in its solution.
3. Propose a model for the problem, using scientific or engineering knowledge of the phenomenon being studied. State any limitations or assumptions of the model.
4. Conduct appropriate experiments and collect data to test or validate the tentative model or conclusions made in steps 2 and 3.
5. Refine the model on the basis of the observed data.
6. Manipulate the model to assist in developing a solution to the problem.
7. Conduct an appropriate experiment to confirm that the proposed solution to the problem is both effective and efficient.
8. Draw conclusions or make recommendations based on the problem solution.

Figure 1-1  The engineering method.

The steps in the engineering method are shown in Fig. 1-1. Many of the engineering sciences are employed in the engineering method: the mechanical sciences (statics, dynamics), fluid science, thermal science, electrical science, and the science of materials. Notice that the engineering method features a strong interplay between the problem, the factors that may influence its solution, a model of the phenomenon, and experimentation to verify the adequacy of the model and the proposed solution to the problem. Steps 2-4 in Fig. 1-1 are enclosed in a box, indicating that several cycles or iterations of these steps may be required to obtain the final solution.
Consequently, engineers must know how to efficiently plan experiments, collect data, analyze and interpret the data, and understand how the observed data are related to the model they have proposed for the problem under study.

The field of statistics deals with the collection, presentation, analysis, and use of data to make decisions, solve problems, and design products and processes. In simple terms, statistics is the science of data. Because many aspects of engineering practice involve working with data, obviously knowledge of statistics is just as important to an engineer as the other engineering sciences. Specifically, statistical techniques can be a powerful aid in designing new products and systems, improving existing designs, and designing, developing, and improving production processes.

Statistical methods are used to help us describe and understand variability. By variability, we mean that successive observations of a system or phenomenon do not produce exactly the same result. We all encounter variability in our everyday lives, and statistical thinking can give us a useful way to incorporate this variability into our decision-making processes. For example, consider the gasoline mileage performance of your car. Do you always get exactly the same mileage performance on every tank of fuel? Of course not; in fact, sometimes the mileage performance varies considerably. This observed variability in gasoline mileage depends on many factors, such as the type of driving that has occurred most recently (city versus highway), the changes in condition of the vehicle over time (which could include factors such as tire inflation, engine compression, or valve wear), the brand and/or octane number of the gasoline used, or possibly even the weather conditions that have been recently experienced. These factors represent potential sources of variability in the system.
Statistics provides a framework for describing this variability and for learning about which potential sources of variability are the most important or which have the greatest impact on the gasoline mileage performance.

We also encounter variability in dealing with engineering problems. For example, suppose that an engineer is designing a nylon connector to be used in an automotive engine application. The engineer is considering establishing the design specification on wall thickness at 3/32 inch but is somewhat uncertain about the effect of this decision on the connector pull-off force. If the pull-off force is too low, the connector may fail when it is installed in an engine. Eight prototype units are produced and their pull-off forces measured, resulting in the following data (in pounds): 12.6, 12.9, 13.4, 12.3, 13.6, 13.5, 12.6, 13.1. As we anticipated, not all of the prototypes have the same pull-off force. We say that there is variability in the pull-off force measurements. Because the pull-off force measurements exhibit variability, we consider the pull-off force to be a random variable. A convenient way to think of a random variable, say X, that represents a measurement is by using the model

    X = μ + ε     (1-1)

where μ is a constant and ε is a random disturbance. The constant μ remains the same with every measurement, but small changes in the environment, variance in test equipment, differences in the individual parts themselves, and so forth change the value of ε. If there were no disturbances, ε would always equal zero and X would always be equal to the constant μ. However, this never happens in the real world, so the actual measurements X exhibit variability. We often need to describe, quantify, and ultimately reduce variability.

Figure 1-2 presents a dot diagram of these data. The dot diagram is a very useful plot for displaying a small body of data, say, up to about 20 observations.
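Model (1-1) can be illustrated with a short simulation. The constant μ = 13.0 and the 0.4-pound spread of the disturbance below are illustrative assumptions chosen for the sketch, not values given in the text.

```python
import random

random.seed(1)

# Sketch of model (1-1), X = mu + epsilon: a constant plus a random disturbance.
# mu = 13.0 and the 0.4-pound disturbance spread are illustrative assumptions.
mu = 13.0

def measure():
    epsilon = random.gauss(0.0, 0.4)  # random disturbance with mean zero
    return mu + epsilon

# Eight simulated "measurements": each differs from mu only through epsilon.
observations = [round(measure(), 1) for _ in range(8)]
print(observations)
```

Rerunning with a different seed gives a different set of eight values, which is exactly the variability the model is meant to capture.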
This plot allows us to see easily two features of the data: the location, or the middle, and the scatter or variability. When the number of observations is small, it is usually difficult to identify any specific patterns in the variability, although the dot diagram is a convenient way to see any unusual data features.

The need for statistical thinking arises often in the solution of engineering problems. Consider the engineer designing the connector. From testing the prototypes, he knows that the average pull-off force is 13.0 pounds. However, he thinks that this may be too low for the intended application, so he decides to consider an alternative design with a greater wall thickness, 1/8 inch. Eight prototypes of this design are built, and the observed pull-off force measurements are 12.9, 13.7, 12.8, 13.9, 14.2, 13.2, 13.5, and 13.1. The average is 13.4. Results for both samples are plotted as dot diagrams in Fig. 1-3. This display gives the impression that increasing the wall thickness has led to an increase in pull-off force. However, there are some obvious questions to ask. For instance, how do we know that another sample of prototypes will not give different results? Is a sample of eight prototypes adequate to give reliable results? If we use the test results obtained so far to conclude that increasing the wall thickness increases the strength, what risks are associated with this decision? For example, is it possible that the apparent increase in pull-off force observed in the thicker prototypes is only due to the inherent variability in the system and that increasing the thickness of the part (and its cost) really has no effect on the pull-off force?

Often, physical laws (such as Ohm's law and the ideal gas law) are applied to help design products and processes. We are familiar with this reasoning from general laws to specific cases. But it is also important to reason from a specific set of measurements to more general cases to answer the previous questions.
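The two sample averages quoted above can be checked directly; this is just arithmetic on the pull-off force data given in the text.

```python
# Pull-off force data (in pounds) for the two wall thicknesses, from the text.
thin = [12.6, 12.9, 13.4, 12.3, 13.6, 13.5, 12.6, 13.1]   # 3/32-inch design
thick = [12.9, 13.7, 12.8, 13.9, 14.2, 13.2, 13.5, 13.1]  # 1/8-inch design

mean_thin = sum(thin) / len(thin)     # 13.0 pounds
mean_thick = sum(thick) / len(thick)  # about 13.41, quoted as 13.4 in the text
print(round(mean_thin, 2), round(mean_thick, 2))
```

The difference in averages (about 0.4 pound) is what the questions above ask us to judge against the sample-to-sample variability.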
This reasoning is from a sample (such as the eight connectors) to a population (such as the connectors that will be sold to customers). The reasoning is referred to as statistical inference. See Fig. 1-4. Historically, measurements were obtained from a sample of people and generalized to a population, and the terminology has remained. Clearly, reasoning based on measurements from some objects to measurements on all objects can result in errors (called sampling errors). However, if the sample is selected properly, these risks can be quantified and an appropriate sample size can be determined.

Figure 1-2  Dot diagram of the pull-off force data when wall thickness is 3/32 inch.
Figure 1-3  Dot diagram of pull-off force for two wall thicknesses.
Figure 1-4  Statistical inference is one type of reasoning.

1-2 COLLECTING ENGINEERING DATA

1-2.1 Basic Principles

In the previous section, we illustrated some simple methods for summarizing data. Sometimes the data are all of the observations in the populations. This results in a census. However, in the engineering environment, the data are almost always a sample that has been selected from the population. Three basic methods of collecting data are

- A retrospective study using historical data
- An observational study
- A designed experiment

An effective data-collection procedure can greatly simplify the analysis and lead to improved understanding of the population or process that is being studied. We now consider some examples of these data-collection methods.

1-2.2 Retrospective Study

Montgomery, Peck, and Vining (2006) describe an acetone-butyl alcohol distillation column for which concentration of acetone in the distillate or output product stream is an important variable.
Factors that may affect the distillate are the reboil temperature, the condensate temperature, and the reflux rate. Production personnel obtain and archive the following records:

- The concentration of acetone in an hourly test sample of output product
- The reboil temperature log, which is a plot of the reboil temperature over time
- The condenser temperature controller log
- The nominal reflux rate each hour

The reflux rate should be held constant for this process. Consequently, production personnel change this very infrequently.

A retrospective study would use either all or a sample of the historical process data archived over some period of time. The study objective might be to discover the relationships among the two temperatures and the reflux rate on the acetone concentration in the output product stream. However, this type of study presents some problems:

1. We may not be able to see the relationship between the reflux rate and acetone concentration, because the reflux rate didn't change much over the historical period.
2. The archived data on the two temperatures (which are recorded almost continuously) do not correspond perfectly to the acetone concentration measurements (which are made hourly). It may not be obvious how to construct an approximate correspondence.
3. Production maintains the two temperatures as closely as possible to desired targets or set points. Because the temperatures change so little, it may be difficult to assess their real impact on acetone concentration.
4. In the narrow ranges within which they do vary, the condensate temperature tends to increase with the reboil temperature. Consequently, the effects of these two process variables on acetone concentration may be difficult to separate.

As you can see, a retrospective study may involve a lot of data, but those data may contain relatively little useful information about the problem.
Furthermore, some of the relevant data may be missing, there may be transcription or recording errors resulting in outliers (or unusual values), or data on other important factors may not have been collected and archived. In the distillation column, for example, the specific concentrations of butyl alcohol and acetone in the input feed stream are a very important factor, but they are not archived because the concentrations are too hard to obtain on a routine basis. As a result of these types of issues, statistical analysis of historical data sometimes identifies interesting phenomena, but solid and reliable explanations of these phenomena are often difficult to obtain.

1-2.3 Observational Study

In an observational study, the engineer observes the process or population, disturbing it as little as possible, and records the quantities of interest. Because these studies are usually conducted for a relatively short time period, sometimes variables that are not routinely measured can be included. In the distillation column, the engineer would design a form to record the two temperatures and the reflux rate when acetone concentration measurements are made. It may even be possible to measure the input feed stream concentrations so that the impact of this factor could be studied. Generally, an observational study tends to solve problems 1 and 2 above and goes a long way toward obtaining accurate and reliable data. However, observational studies may not help resolve problems 3 and 4.

1-2.4 Designed Experiments

In a designed experiment the engineer makes deliberate or purposeful changes in the controllable variables of the system or process, observes the resulting system output data, and then makes an inference or decision about which variables are responsible for the observed changes in output performance.
The nylon connector example in Section 1-1 illustrates a designed experiment; that is, a deliberate change was made in the wall thickness of the connector with the objective of discovering whether or not a greater pull-off force could be obtained. Experiments designed with basic principles such as randomization are needed to establish cause-and-effect relationships.

Much of what we know in the engineering and physical-chemical sciences is developed through testing or experimentation. Often engineers work in problem areas in which no scientific or engineering theory is directly or completely applicable, so experimentation and observation of the resulting data constitute the only way that the problem can be solved. Even when there is a good underlying scientific theory that we may rely on to explain the phenomena of interest, it is almost always necessary to conduct tests or experiments to confirm that the theory is indeed operative in the situation or environment in which it is being applied. Statistical thinking and statistical methods play an important role in planning, conducting, and analyzing the data from engineering experiments. Designed experiments play a very important role in engineering design and development and in the improvement of manufacturing processes.

For example, consider the problem involving the choice of wall thickness for the nylon connector. This is a simple illustration of a designed experiment. The engineer chose two wall thicknesses for the connector and performed a series of tests to obtain pull-off force measurements at each wall thickness. In this simple comparative experiment, the engineer is interested in determining if there is any difference between the 3/32- and 1/8-inch designs.
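One way to make the comparison concrete, previewing the two-sample methods of Chapter 10, is a pooled two-sample t statistic computed from the pull-off force data of Section 1-1. The pooled approach below is a standard textbook choice, not a procedure prescribed at this point in the text.

```python
from math import sqrt
from statistics import mean, stdev

# Pull-off force data (pounds) for the two connector designs, from Section 1-1.
thin = [12.6, 12.9, 13.4, 12.3, 13.6, 13.5, 12.6, 13.1]   # 3/32 inch
thick = [12.9, 13.7, 12.8, 13.9, 14.2, 13.2, 13.5, 13.1]  # 1/8 inch

n1, n2 = len(thin), len(thick)
# Pooled estimate of the common standard deviation of the two samples.
sp = sqrt(((n1 - 1) * stdev(thin) ** 2 + (n2 - 1) * stdev(thick) ** 2)
          / (n1 + n2 - 2))
# Two-sample t statistic for the observed difference in mean pull-off force.
t = (mean(thick) - mean(thin)) / (sp * sqrt(1 / n1 + 1 / n2))
print(round(t, 2))  # about 1.69
```

Whether a t value of this size is convincing evidence of a real difference is exactly the kind of question Chapters 9 and 10 answer.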
An approach that could be used in analyzing the data from this experiment is to compare the mean pull-off force for the 3/32-inch design to the mean pull-off force for the 1/8-inch design using statistical hypothesis testing, which is discussed in detail in Chapters 9 and 10. Generally, a hypothesis is a statement about some aspect of the system in which we are interested. For example, the engineer might want to know if the mean pull-off force of a 3/32-inch design exceeds the typical maximum load expected to be encountered in this application, say, 12.75 pounds. Thus, we would be interested in testing the hypothesis that the mean strength exceeds 12.75 pounds. This is called a single-sample hypothesis-testing problem. Chapter 9 presents techniques for this type of problem. Alternatively, the engineer might be interested in testing the hypothesis that increasing the wall thickness from 3/32 to 1/8 inch results in an increase in mean pull-off force. It is an example of a two-sample hypothesis-testing problem. Two-sample hypothesis-testing problems are discussed in Chapter 10.

Designed experiments are a very powerful approach to studying complex systems, such as the distillation column. This process has three factors (the two temperatures and the reflux rate), and we want to investigate the effect of these three factors on output acetone concentration. A good experimental design for this problem must ensure that we can separate the effects of all three factors on the acetone concentration. The specified values of the three factors used in the experiment are called factor levels. Typically, we use a small number of levels for each factor, such as two or three. For the distillation column problem, suppose we use two levels, high and low (denoted +1 and -1, respectively), for each of the three factors. A very reasonable experiment design strategy uses every possible combination of the factor levels to form a basic experiment with eight different settings for the process. This type of experiment is called a factorial experiment. Table 1-1 presents this experimental design.

Table 1-1  The Designed Experiment (Factorial Design) for the Distillation Column

    Reboil Temp.    Condensate Temp.    Reflux Rate
        -1                -1                -1
        +1                -1                -1
        -1                +1                -1
        +1                +1                -1
        -1                -1                +1
        +1                -1                +1
        -1                +1                +1
        +1                +1                +1
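A design like Table 1-1 can be generated mechanically as the set of all low/high combinations of the factors; a minimal sketch:

```python
from itertools import product

# Generate the 2^3 factorial design of Table 1-1: every combination of the
# -1 (low) and +1 (high) levels of the three factors.
factors = ("reboil temp.", "condensate temp.", "reflux rate")
design = list(product((-1, +1), repeat=len(factors)))

for run in design:
    print(dict(zip(factors, run)))
print(len(design))  # 2^3 = 8 settings
```

The same two lines generate the design for any number of two-level factors by changing the `factors` tuple.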
This type of experiment is called a factorial experiment. Table 1-1 presents this experimental design.

Table 1-1  The Designed Experiment (Factorial Design) for the Distillation Column

Reboil Temp.    Condensate Temp.    Reflux Rate
    -1               -1                 -1
    +1               -1                 -1
    -1               +1                 -1
    +1               +1                 -1
    -1               -1                 +1
    +1               -1                 +1
    -1               +1                 +1
    +1               +1                 +1

[Figure 1-5: The 2^3 factorial design for the distillation column.]

Figure 1-5 illustrates that this design forms a cube in terms of these high and low levels. With each setting of the process conditions, we allow the column to reach equilibrium, take a sample of the product stream, and determine the acetone concentration. We then can draw specific inferences about the effect of these factors. Such an approach allows us to proactively study a population or process.

An important advantage of factorial experiments is that they allow one to detect an interaction between factors. Consider only the two temperature factors in the distillation experiment. Suppose that the response concentration is poor when the reboil temperature is low, regardless of the condensate temperature. That is, the condensate temperature has no effect when the reboil temperature is low. However, when the reboil temperature is high, a high condensate temperature generates a good response, while a low condensate temperature generates a poor response. That is, the condensate temperature changes the response when the reboil temperature is high. The effect of condensate temperature depends on the setting of the reboil temperature, and these two factors would interact in this case. If the four combinations of high and low reboil and condensate temperatures were not tested, such an interaction would not be detected.

We can easily extend the factorial strategy to more factors. Suppose that the engineer wants to consider a fourth factor, type of distillation column.
There are two types: the standard one and a newer design. Figure 1-6 illustrates how all four factors (reboil temperature, condensate temperature, reflux rate, and column design) could be investigated in a factorial design. Since all four factors are still at two levels, the experimental design can still be represented geometrically as a cube (actually, it's a hypercube). Notice that as in any factorial design, all possible combinations of the four factors are tested. The experiment requires 16 trials.

[Figure 1-6: A four-factorial experiment for the distillation column.]

Generally, if there are k factors and they each have two levels, a factorial experimental design will require 2^k runs. For example, with k = 4, the 2^4 design in Fig. 1-6 requires 16 tests. Clearly, as the number of factors increases, the number of trials required in a factorial experiment increases rapidly; for instance, eight factors each at two levels would require 256 trials. This quickly becomes unfeasible from the viewpoint of time and other resources. Fortunately, when there are four to five or more factors, it is usually unnecessary to test all possible combinations of factor levels. A fractional factorial experiment is a variation of the basic factorial arrangement in which only a subset of the factor combinations is actually tested.

[Figure 1-7: A fractional factorial experiment for the distillation column.]

Figure 1-7 shows a fractional factorial experimental design for the four-factor version of the distillation experiment. The circled test combinations in this figure are the only test combinations that need to be run. This experimental design requires only 8 runs instead of the original 16; consequently it would be called a one-half fraction.
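One standard way to construct such a half fraction (not necessarily the particular subset circled in Fig. 1-7) is to keep only the runs satisfying a defining relation such as I = ABCD, that is, runs whose four factor signs multiply to +1. A sketch:

```python
from itertools import product

# Full 2^4 factorial in A = reboil temp, B = condensate temp,
# C = reflux rate, D = column design: 16 runs in all.
full = list(product([-1, +1], repeat=4))

# Keep only the runs whose signs multiply to +1 (defining relation I = ABCD).
half = [run for run in full if run[0] * run[1] * run[2] * run[3] == 1]

print(len(full), len(half))  # 16 runs cut down to 8
```

This sign rule is what makes the subset a carefully balanced one-half fraction rather than an arbitrary selection of 8 runs.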
This is an excellent experimental design in which to study all four factors. It will provide good information about the individual effects of the four factors and some information about how these factors interact.

Factorial and fractional factorial experiments are used extensively by engineers and scientists in industrial research and development, where new technology, products, and processes are designed and developed and where existing products and processes are improved. Since so much engineering work involves testing and experimentation, it is essential that all engineers understand the basic principles of planning efficient and effective experiments. We discuss these principles in Chapter 13. Chapter 14 concentrates on the factorial and fractional factorials that we have introduced here.

1-2.5 Observing Processes Over Time

Often data are collected over time. In this case, it is usually very helpful to plot the data versus time in a time series plot. Phenomena that might affect the system or process often become more visible in a time-oriented plot, and the concept of stability can be better judged.

Figure 1-8 is a dot diagram of acetone concentration readings taken hourly from the distillation column described in Section 1-2.2. The large variation displayed on the dot diagram indicates a lot of variability in the concentration, but the chart does not help explain the reason for the variation. The time series plot is shown in Figure 1-9. A shift in the process mean level is visible in the plot, and an estimate of the time of the shift can be obtained.

W. Edwards Deming, a very influential industrial statistician, stressed that it is important to understand the nature of variability in processes and systems over time. He conducted an experiment in which he attempted to drop marbles as close as possible to a target on a table. He used a funnel mounted on a ring stand, and the marbles were dropped into the funnel. See Fig. 1-10.
The funnel was aligned as closely as possible with the center of the target. He then used two different strategies to operate the process. (1) He never moved the funnel. He just dropped one marble after another and recorded the distance from the target. (2) He dropped the first marble and recorded its location relative to the target. He then moved the funnel an equal and opposite distance in an attempt to compensate for the error. He continued to make this type of adjustment after each marble was dropped.

[Figure 1-8: The dot diagram of acetone concentration illustrates variation but does not identify the problem.]
[Figure 1-9: A time series plot of concentration provides more information than the dot diagram.]
[Figure 1-10: Deming's funnel experiment.]

After both strategies were completed, he noticed that the variability of the distance from the target for strategy 2 was approximately two times larger than for strategy 1. The adjustments to the funnel increased the deviations from the target. The explanation is that the error (the deviation of the marble's position from the target) for one marble provides no information about the error that will occur for the next marble. Consequently, adjustments to the funnel do not decrease future errors. Instead, they tend to move the funnel farther from the target.

This interesting experiment points out that adjustments to a process based on random disturbances can actually increase the variation of the process. This is referred to as overcontrol or tampering. Adjustments should be applied only to compensate for a nonrandom shift in the process; then they can help. A computer simulation can be used to demonstrate the lessons of the funnel experiment.
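A one-dimensional sketch of such a simulation is shown below (the two-dimensional marble experiment behaves the same way in each coordinate; the disturbance distribution and sample size are illustrative assumptions):

```python
import random
import statistics

random.seed(1)
n = 10_000
errors = [random.gauss(0, 1) for _ in range(n)]  # independent random disturbances

# Strategy 1: never move the funnel; the deviation is just the disturbance.
y1 = errors[:]

# Strategy 2: after each drop, move the funnel an equal and opposite
# distance to "compensate" for the observed deviation.
y2, position = [], 0.0
for e in errors:
    y = position + e       # observed deviation from the target
    y2.append(y)
    position -= y          # the compensating adjustment

print(statistics.variance(y1))  # close to 1
print(statistics.variance(y2))  # close to 2: adjusting roughly doubles the variance
```

Because each deviation under strategy 2 is the difference of two independent disturbances, its variance is twice that of strategy 1, which is exactly the overcontrol penalty Deming demonstrated.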
Figure 1-11 displays a time plot of 100 measurements (denoted as y) from a process in which only random disturbances are present. The target value for the process is 10 units. The figure displays the data with and without adjustments that are applied to the process mean in an attempt to produce data closer to target. Each adjustment is equal and opposite to the deviation of the previous measurement from target. For example, when the measurement is 11 (one unit above target), the mean is reduced by one unit before the next measurement is generated. The overcontrol has increased the deviations from the target.

[Figure 1-11: Adjustments applied to random disturbances overcontrol the process and increase the deviations from the target.]
[Figure 1-12: Process mean shift is detected at observation number 57, and one adjustment (a decrease of two units) reduces the deviations from target.]

Figure 1-12 displays the data without adjustment from Fig. 1-11, except that the measurements after observation number 50 are increased by two units to simulate the effect of a shift in the mean of the process. When there is a true shift in the mean of a process, an adjustment can be useful. Figure 1-12 also displays the data obtained when one adjustment (a decrease of two units) is applied to the mean after the shift is detected (at observation number 57). Note that this adjustment decreases the deviations from target.

The question of when to apply adjustments (and by what amounts) begins with an understanding of the types of variation that affect a process. A control chart is an invaluable way to examine the variability in time-oriented data.
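The center line and three-sigma limits behind such a chart take only a few lines to compute. A sketch with illustrative data; note that the sample standard deviation is used here for simplicity, whereas control-chart practice usually estimates the process standard deviation from a moving range:

```python
import statistics

# Hourly concentration readings from a stable stretch of the process
# (illustrative numbers, not the Fig. 1-9 data).
x = [94.2, 90.1, 92.7, 89.8, 93.4, 91.0, 88.9, 92.3, 90.6, 91.8]

center = statistics.mean(x)          # center line
sigma = statistics.stdev(x)          # estimate of process standard deviation
ucl = center + 3 * sigma             # upper control limit
lcl = center - 3 * sigma             # lower control limit

signals = [xi for xi in x if xi > ucl or xi < lcl]
print(round(lcl, 2), round(center, 2), round(ucl, 2))
print(signals)  # [] -- no points outside the limits for this stable stretch
```

Points falling outside the limits (or long runs on one side of the center line) are the signals that some assignable cause has disturbed the process.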
Figure 1-13 presents a control chart for the concentration data from Fig. 1-9. The center line on the control chart is just the average of the concentration measurements for the first 20 samples (x̄ = 91.5 g/l) when the process is stable. The upper control limit and the lower control limit are a pair of statistically derived limits that reflect the inherent or natural variability in the process. These limits are located three standard deviations of the concentration values above and below the center line. If the process is operating as it should, without any external sources of variability present in the system, the concentration measurements should fluctuate randomly around the center line, and almost all of them should fall between the control limits.

[Figure 1-13: A control chart for the chemical process concentration data (center line x̄ = 91.50, upper control limit = 100.5, lower control limit = 82.54).]
[Figure 1-14: Enumerative versus analytic study.]

In the control chart of Fig. 1-13, the visual frame of reference provided by the center line and the control limits indicates that some upset or disturbance has affected the process around sample 20 because all of the following observations are below the center line, and two of them actually fall below the lower control limit. This is a very strong signal that corrective action is required in this process. If we can find and eliminate the underlying cause of this upset, we can improve process performance considerably.

Furthermore, Deming pointed out that data from a process are used for different types of conclusions. Sometimes we collect data from a process to evaluate current production.
For example, we might sample and measure resistivity on three semiconductor wafers selected from a lot and use this information to evaluate the lot. This is called an enumerative study. However, in many cases we use data from current production to evaluate future production. We apply conclusions to a conceptual, future population. Deming called this an analytic study. Clearly this requires an assumption of a stable process, and Deming emphasized that control charts were needed to justify this assumption. See Fig. 1-14 as an illustration.

Control charts are a very important application of statistics for monitoring, controlling, and improving a process. The branch of statistics that makes use of control charts is called statistical process control, or SPC. We will discuss SPC and control charts in Chapter 15.

1-3 MECHANISTIC AND EMPIRICAL MODELS

Models play an important role in the analysis of nearly all engineering problems. Much of the formal education of engineers involves learning about the models relevant to specific fields and the techniques for applying these models in problem formulation and solution. As a simple example, suppose we are measuring the flow of current in a thin copper wire. Our model for this phenomenon might be Ohm's law:

Current = voltage/resistance

or

I = E/R    (1-2)

We call this type of model a mechanistic model because it is built from our underlying knowledge of the basic physical mechanism that relates these variables. However, if we performed this measurement process more than once, perhaps at different times, or even on different days, the observed current could differ slightly because of small changes or variations in factors that are not completely controlled, such as changes in ambient temperature, fluctuations in performance of the gauge, small impurities present at different locations in the wire, and drifts in the voltage source.
Consequently, a more realistic model of the observed current might be

I = E/R + ε    (1-3)

where ε is a term added to the model to account for the fact that the observed values of current flow do not perfectly conform to the mechanistic model. We can think of ε as a term that includes the effects of all of the unmodeled sources of variability that affect this system.

Sometimes engineers work with problems for which there is no simple or well-understood mechanistic model that explains the phenomenon. For instance, suppose we are interested in the number average molecular weight (Mn) of a polymer. Now we know that Mn is related to the viscosity of the material (V), and it also depends on the amount of catalyst (C) and the temperature (T) in the polymerization reactor when the material is manufactured. The relationship between Mn and these variables is

Mn = f(V, C, T)    (1-4)

say, where the form of the function f is unknown. Perhaps a working model could be developed from a first-order Taylor series expansion, which would produce a model of the form

Mn = β0 + β1V + β2C + β3T    (1-5)

where the β's are unknown parameters. Now just as in Ohm's law, this model will not exactly describe the phenomenon, so we should account for the other sources of variability that may affect the molecular weight by adding another term to the model; therefore,

Mn = β0 + β1V + β2C + β3T + ε    (1-6)

is the model that we will use to relate molecular weight to the other three variables. This type of model is called an empirical model; that is, it uses our engineering and scientific knowledge of the phenomenon, but it is not directly developed from our theoretical or first-principles understanding of the underlying mechanism.

To illustrate these ideas with a specific example, consider the data in Table 1-2. This table contains data on three variables that were collected in an observational study in a semiconductor manufacturing plant. In this plant, the finished semiconductor is wire-bonded to a frame.
The variables reported are pull strength (a measure of the amount of force required to break the bond), the wire length, and the height of the die. We would like to find a model relating pull strength to wire length and die height. Unfortunately, there is no physical mechanism that we can easily apply here, so it doesn't seem likely that a mechanistic modeling approach will be successful.

Table 1-2  Wire Bond Pull Strength Data

Observation    Pull Strength    Wire Length    Die Height
  Number             y               x1             x2
     1              9.95             2              50
     2             24.45             8             110
     3             31.75            11             120
     4             35.00            10             550
     5             25.02             8             295
     6             16.86             4             200
     7             14.38             2             375
     8              9.60             2              52
     9             24.35             9             100
    10             27.50             8             300
    11             17.08             4             412
    12             37.00            11             400
    13             41.95            12             500
    14             11.66             2             360
    15             21.65             4             205
    16             17.89             4             400
    17             69.00            20             600
    18             10.30             1             585
    19             34.93            10             540
    20             46.59            15             250
    21             44.88            15             290
    22             54.12            16             510
    23             56.63            17             590
    24             22.13             6             100
    25             21.15             5             400

Figure 1-15 presents a three-dimensional plot of all 25 observations on pull strength, wire length, and die height. From examination of this plot, we see that pull strength increases as both wire length and die height increase. Furthermore, it seems reasonable to think that a model such as

Pull strength = β0 + β1(wire length) + β2(die height) + ε

would be appropriate as an empirical model for this relationship. In general, this type of empirical model is called a regression model. In Chapters 11 and 12 we show how to build these models and test their adequacy as approximating functions.

[Figure 1-15: Three-dimensional plot of the wire bond pull strength data.]

We will use a method for
estimating the parameters in regression models, called the method of least squares, that traces its origins to work by Karl Gauss. Essentially, this method chooses the parameters in the empirical model (the β's) to minimize the sum of the squared distances between each data point and the plane represented by the model equation. Applying this technique to the data in Table 1-2 results in

Pull strength^ = 2.26 + 2.74(wire length) + 0.0125(die height)    (1-7)

where the hat, or circumflex (^), over pull strength indicates that this is an estimated or predicted quantity.

[Figure 1-16: Plot of predicted values of pull strength from the empirical model.]

Figure 1-16 is a plot of the predicted values of pull strength versus wire length and die height obtained from Equation 1-7. Notice that the predicted values lie on a plane above the wire length-die height space. From the plot of the data in Fig. 1-15, this model does not appear unreasonable. The empirical model in Equation 1-7 could be used to predict values of pull strength for various combinations of wire length and die height that are of interest. Essentially, the empirical model could be used by an engineer in exactly the same way that a mechanistic model can be used.

1-4 PROBABILITY AND PROBABILITY MODELS

In Section 1-1, it was mentioned that decisions often need to be based on measurements from only a subset of objects selected in a sample. This process of reasoning from a sample of objects to conclusions for a population of objects was referred to as statistical inference. A sample of three wafers selected from a larger production lot of wafers in semiconductor manufacturing was an example mentioned. To make good decisions, an analysis of how well a sample represents a population is clearly necessary.
If the lot contains defective wafers, how well will the sample detect this? How can we quantify the criterion to "detect well"? Basically, how can we quantify the risks of decisions based on samples? Furthermore, how should samples be selected to provide good decisions, ones with acceptable risks? Probability models help quantify the risks involved in statistical inference, that is, the risks involved in decisions made every day.

More details are useful to describe the role of probability models. Suppose a production lot contains 25 wafers. If all the wafers are defective or all are good, clearly any sample will generate all defective or all good wafers, respectively. However, suppose only one wafer in the lot is defective. Then a sample might or might not detect (include) the wafer. A probability model, along with a method to select the sample, can be used to quantify the risks that the defective wafer is or is not detected. Based on this analysis, the size of the sample might be increased (or decreased). The risk here can be interpreted as follows. Suppose a series of lots, each with exactly one defective wafer, are sampled. The details of the method used to select the sample are postponed until randomness is discussed in the next chapter. Nevertheless, assume that the same size sample (such as three wafers) is selected in the same manner from each lot. The proportion of the lots in which the defective wafer is included in the sample or, more specifically, the limit of this proportion as the number of lots in the series tends to infinity, is interpreted as the probability that the defective wafer is detected.

A probability model is used to calculate this proportion under reasonable assumptions for the manner in which the sample is selected. This is fortunate because we do not want to attempt to sample from an infinite series of lots.
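For the 25-wafer lot just described, this probability can be computed directly, assuming the three wafers are selected at random without replacement:

```python
from math import comb

lot_size, n_defective, sample_size = 25, 1, 3

# P(the defective wafer is missed) = probability that all three sampled
# wafers come from the 24 good ones.
p_miss = comb(lot_size - n_defective, sample_size) / comb(lot_size, sample_size)
p_detect = 1 - p_miss

print(round(p_detect, 4))  # 0.12, i.e., 3/25
```

With only a 12% chance of catching the single defective wafer, this calculation makes concrete why the sample size might need to be increased.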
Problems of this type are worked in Chapters 2 and 3. More importantly, this probability provides valuable, quantitative information regarding any decision about lot quality based on the sample.

Recall from Section 1-1 that a population might be conceptual, as in an analytic study that applies statistical inference to future production based on the data from current production. When populations are extended in this manner, the role of statistical inference and the associated probability models becomes even more important.

In the previous example, each wafer in the sample was only classified as defective or not. Instead, a continuous measurement might be obtained from each wafer. In Section 1-2.5, concentration measurements were taken at periodic intervals from a production process. Figure 1-8 shows that variability is present in the measurements, and there might be concern that the process has moved from the target setting for concentration. Similar to the defective wafer, one might want to quantify our ability to detect a process change based on the sample data. Control limits were mentioned in Section 1-2.5 as decision rules for whether or not to adjust a process. The probability that a particular process change is detected can be calculated with a probability model for concentration measurements. Models for continuous measurements are developed based on plausible assumptions for the data and a result known as the central limit theorem, and the associated normal distribution is a particularly valuable probability model for statistical inference. Of course, a check of assumptions is important. These types of probability models are discussed in Chapter 4. The objective is still to quantify the risks inherent in the inference made from the sample data.

Throughout Chapters 6 through 15, decisions are based on statistical inference from sample data.
Continuous probability models, specifically the normal distribution, are used extensively to quantify the risks in these decisions and to evaluate ways to collect the data and how large a sample should be selected.

IMPORTANT TERMS AND CONCEPTS

Analytic study
Cause and effect
Designed experiment
Empirical model
Engineering method
Enumerative study
Factorial experiment
Fractional factorial experiment
Hypothesis testing
Interaction
Mechanistic model
Observational study
Overcontrol
Population
Probability model
Problem-solving method
Randomization
Retrospective study
Sample
Statistical inference
Statistical process control
Statistical thinking
Tampering
Time series
Variability

2  Probability

[Chapter opener photo: Erik Isakson/Getty Images, Inc.]

An athletic woman in her twenties arrives at the emergency department complaining of dizziness after running in hot weather. An electrocardiogram is used to check for a heart attack, and the patient generates an abnormal result. The test has a false positive rate of 0.1 (the probability of an abnormal result when the patient is normal) and a false negative rate of 0.1 (the probability of a normal result when the patient is abnormal). Furthermore, it might be assumed that the prior probability of a heart attack for this patient is 0.001. Although the abnormal test is a concern, you might be surprised to learn that the probability of a heart attack given the abnormal electrocardiogram result is still less than 0.01. See "Why Clinicians are Natural Bayesians" (2005, bmj.com) for details of this example and others.

The key is to properly combine the given probabilities. Furthermore, the exact same analysis used for this medical example can be applied to tests of engineered products. Consequently, knowledge of how to manipulate probabilities in order to assess risks and make better decisions is important throughout scientific and engineering disciplines.
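The surprising posterior probability quoted above can be checked with Bayes' theorem, which this chapter develops:

```python
# Probabilities given in the heart-attack example above.
prior = 0.001        # P(heart attack) before the test
false_pos = 0.1      # P(abnormal result | no heart attack)
false_neg = 0.1      # P(normal result | heart attack)
sensitivity = 1 - false_neg

# Total probability of an abnormal result, then Bayes' theorem.
p_abnormal = sensitivity * prior + false_pos * (1 - prior)
posterior = sensitivity * prior / p_abnormal

print(round(posterior, 4))  # 0.0089 -- still below 0.01 despite the abnormal test
```

The posterior stays small because abnormal results from the 99.9% of patients without a heart attack vastly outnumber the true positives.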
In this chapter the laws of probability are presented and used to assess risks in cases such as this one and numerous others.

CHAPTER OUTLINE

2-1 SAMPLE SPACES AND EVENTS
2-3 ADDITION RULES