Supplemental Material to Introduction to Statistical Quality Control, 6th Edition



    About the Supplemental Text Material

I have prepared supplemental text material to accompany the 6th edition of Introduction to Statistical Quality Control. This material consists of (1) additional background reading on some aspects of statistics and statistical quality control and improvement, (2) extensions of and elaboration on some textbook topics, and (3) some new topics that I could not easily find a home for in the text without making the book much too long. Much of this material has been prepared in at least partial response to the many excellent and very helpful suggestions that have been made over the years by textbook users. However, sometimes there just was no way to easily accommodate their suggestions directly in the book. Some of the supplemental material is also in response to FAQs, or frequently asked questions, from students. I have also provided a list of references for this supplemental material that are not cited in the textbook.

Feedback from my colleagues indicates that this book is used in a variety of ways. Most often, it is used as the textbook in an upper-division undergraduate course on statistical quality control and improvement. However, there are a significant number of instructors that use the book as the basis of a graduate-level course, or offer a course taken by a mixture of advanced undergraduates and graduate students. Obviously the topical content and depth of coverage varies widely in these courses. Consequently, I have included some supplemental material on topics that might be of interest in a more advanced undergraduate or graduate-level course.

There is considerable personal bias in my selection of topics for the supplemental material. The coverage is far from comprehensive.

I have not felt as constrained about the mathematical level or statistical background of the readers in the supplemental material as I have tried to be in writing the textbook. There are sections of the supplemental material that will require more background in statistics than is required to read the text material. However, I think that many instructors will be able to use selected portions of this supplemental material in their courses quite effectively, depending on the maturity and background of the students.

    Supplemental Text Material Contents

    Chapter 3

    S3-1. Independent Random Variables

    S3-2. Development of the Poisson Distribution

    S3-3. The Mean and Variance of the Normal Distribution

    S3-4. More about the Lognormal Distribution

    S3-5. More about the Gamma Distribution

    S3-6. The Failure Rate for the Exponential Distribution

    S3-7. The Failure Rate for the Weibull Distribution

    Chapter 4

    S4-1. Random Samples

    S4-2. Expected Value and Variance Operators


S4-3. Proof That $E(\bar{x}) = \mu$ and $E(s^2) = \sigma^2$

    S4-4. More about Parameter Estimation

S4-5. Proof That $E(s) \neq \sigma$

S4-6. More about Checking Assumptions in the t-Test

S4-7. Expected Mean Squares in the Single-Factor Analysis of Variance

    Chapter 5

S5-1. A Simple Alternative to Runs Rules on the $\bar{x}$ Chart

    Chapter 6

S6-1. $s^2$ Is Not Always an Unbiased Estimator of $\sigma^2$

S6-2. Should We Use $d_2$ or $d_2^*$ in Estimating $\sigma$ via the Range Method?

    S6-3. Determining When the Process has Shifted

    S6-4. More about Monitoring Variability with Individual Observations

    S6-5. Detecting Drifts versus Shifts in the Process Mean

S6-6. The Mean Square Successive Difference as an Estimator of $\sigma^2$

    Chapter 7

    S7-1. Probability Limits on Control Charts

    Chapter 8

    S8-1. Fixed Versus Random Factors in the Analysis of Variance

    S8-2. More about Analysis of Variance Methods for Measurement Systems Capability Studies

Chapter 9

S9-1. The Markov Chain Approach for Finding the ARLs for Cusum and EWMA Control Charts

    S9-2. Integral Equations versus Markov Chains for Finding the ARL

    Chapter 10

    S10-1. Difference Control Charts

    S10-2. Control Charts for Contrasts

    S10-3. Run Sum and Zone Control Charts

    S10-4. More about Adaptive Control Charts


    Chapter 11

S11-1. Multivariate Cusum Control Charts

    Chapter 13

    S13-1. Guidelines for Planning Experiments

S13-2. Using a t-Test for Detecting Curvature

    S13-3. Blocking in Designed Experiments

    S13-4. More about Expected Mean Squares in the Analysis of Variance

    Chapter 14

    S14-1. Response Surface Designs

    S14-2. Fitting Regression Models by Least Squares

    S14-3. More about Robust Design and Process Robustness Studies

    Chapter 15

    S15-1. A Lot Sensitive Compliance (LTPD) Sampling Plan

    S15-2. Consideration of Inspection Errors


    Supplemental Material for Chapter 3

    S3.1. Independent Random Variables

    Preliminary Remarks

Readers encounter random variables throughout the textbook. An informal definition of and notation for random variables is used. A random variable may be thought of informally as any variable for which the measured or observed value depends on a random or chance mechanism. That is, the value of a random variable cannot be known in advance of actual observation of the phenomena. Formally, of course, a random variable is a function that assigns a real number to each outcome in the sample space of the observed phenomena. Furthermore, it is customary to distinguish between the random variable and its observed value or realization by using an upper-case letter to denote the random variable (say X) and the lower-case letter x to denote the actual numerical value that results from an observation or a measurement. This formal notation is not used in the book because (1) it is not widely employed in the statistical quality control field and (2) it is usually quite clear from the context whether we are discussing the random variable or its realization.

Independent Random Variables

In the textbook, we make frequent use of the concept of independent random variables. Most readers have been exposed to this in a basic statistics course, but here a brief review of the concept is given. For convenience, we consider only the case of continuous random variables. For the case of discrete random variables, refer to Montgomery and Runger (2007).

Often there will be two or more random variables that jointly define some physical phenomenon of interest. For example, suppose we consider injection-molded components used to assemble a connector for an automotive application. To adequately describe the connector, we might need to study both the hole interior diameter and the wall thickness of the component. Let $x_1$ represent the hole interior diameter and $x_2$ represent the wall thickness. The joint probability distribution (or density function) of these two continuous random variables can be specified by providing a method for calculating the probability that $x_1$ and $x_2$ assume a value in any region R of two-dimensional space, where the region R is often called the range space of the random variable. This is analogous to the probability density function for a single random variable. Let this joint probability density function be denoted by $f(x_1, x_2)$. Now the double integral of this joint probability density function over a specified region R provides the probability that $x_1$ and $x_2$ assume values in the range space R.

    A joint probability density function has the following properties:

a. $f(x_1, x_2) \ge 0$ for all $x_1, x_2$

b. $\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x_1, x_2)\,dx_1\,dx_2 = 1$

c. For any region R of two-dimensional space, $P\{(x_1, x_2) \in R\} = \iint_R f(x_1, x_2)\,dx_1\,dx_2$

The two random variables $x_1$ and $x_2$ are independent if $f(x_1, x_2) = f_1(x_1)\,f_2(x_2)$, where $f_1(x_1)$ and $f_2(x_2)$ are the marginal probability distributions of $x_1$ and $x_2$, respectively, defined as

$$f_1(x_1) = \int_{-\infty}^{\infty} f(x_1, x_2)\,dx_2 \quad\text{and}\quad f_2(x_2) = \int_{-\infty}^{\infty} f(x_1, x_2)\,dx_1$$

In general, if there are p random variables $x_1, x_2, \ldots, x_p$, then the joint probability density function is $f(x_1, x_2, \ldots, x_p)$, with the properties:

a. $f(x_1, x_2, \ldots, x_p) \ge 0$ for all $x_1, x_2, \ldots, x_p$


b. $\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} f(x_1, x_2, \ldots, x_p)\,dx_1\,dx_2\cdots dx_p = 1$

c. For any region R of p-dimensional space, $P\{(x_1, x_2, \ldots, x_p) \in R\} = \int\cdots\int_R f(x_1, x_2, \ldots, x_p)\,dx_1\,dx_2\cdots dx_p$

The random variables $x_1, x_2, \ldots, x_p$ are independent if

$$f(x_1, x_2, \ldots, x_p) = f_1(x_1)\,f_2(x_2)\cdots f_p(x_p)$$

where $f_i(x_i)$ are the marginal probability distributions of $x_1, x_2, \ldots, x_p$, respectively, defined as

$$f_i(x_i) = \int\cdots\int f(x_1, x_2, \ldots, x_p)\,dx_1\cdots dx_{i-1}\,dx_{i+1}\cdots dx_p$$
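As a quick numerical illustration of this definition, the following sketch (an addition, not part of the original text; it assumes Python with numpy and scipy available) recovers the marginals of a product-form joint density by integration and confirms that their product reproduces the joint density. The bivariate exponential density used here is a hypothetical example chosen only for illustration.

```python
# A minimal sketch (not from the text; assumes scipy/numpy): recover the
# marginals of a product-form joint density by integration and confirm that
# f(x1, x2) = f1(x1) * f2(x2). The density is a hypothetical example.
import numpy as np
from scipy import integrate

def f_joint(x1, x2):
    # independent exponential(1) and exponential(2) components, x1, x2 >= 0
    return np.exp(-x1) * 2.0 * np.exp(-2.0 * x2)

def f1(x1):
    # marginal of x1: integrate the joint density over x2
    return integrate.quad(lambda x2: f_joint(x1, x2), 0, np.inf)[0]

def f2(x2):
    # marginal of x2: integrate the joint density over x1
    return integrate.quad(lambda x1: f_joint(x1, x2), 0, np.inf)[0]

x1, x2 = 0.7, 1.3
print(f_joint(x1, x2), f1(x1) * f2(x2))   # the two values agree
```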

    S3.2. Development of the Poisson Distribution

The Poisson distribution is widely used in statistical quality control and improvement, frequently as the underlying probability model for count data. As noted in Section 3.2.3 of the text, the Poisson distribution can be derived as a limiting form of the binomial distribution, and it can also be developed from a probability argument based on the birth and death process. We now give a summary of both developments.

    The Poisson Distribution as a Limiting Form of the Binomial Distribution

Consider the binomial distribution

$$p(x) = \binom{n}{x} p^x (1-p)^{n-x} = \frac{n!}{x!(n-x)!}\, p^x (1-p)^{n-x}, \quad x = 0, 1, 2, \ldots, n$$

Let $\lambda = np$, so that $p = \lambda/n$. We may now write the binomial distribution as

$$p(x) = \frac{n(n-1)(n-2)\cdots(n-x+1)}{x!}\left(\frac{\lambda}{n}\right)^{x}\left(1-\frac{\lambda}{n}\right)^{n-x}$$

$$= \frac{\lambda^x}{x!}\,(1)\left(1-\frac{1}{n}\right)\left(1-\frac{2}{n}\right)\cdots\left(1-\frac{x-1}{n}\right)\left(1-\frac{\lambda}{n}\right)^{n}\left(1-\frac{\lambda}{n}\right)^{-x}$$

Let $n \to \infty$ and $p \to 0$ so that $\lambda = np$ remains constant. The terms

$$\left(1-\frac{1}{n}\right), \left(1-\frac{2}{n}\right), \ldots, \left(1-\frac{x-1}{n}\right) \quad\text{and}\quad \left(1-\frac{\lambda}{n}\right)^{-x}$$

all approach unity. Furthermore,

$$\left(1-\frac{\lambda}{n}\right)^{n} \to e^{-\lambda} \quad\text{as}\quad n \to \infty$$

Thus, upon substitution we see that the limiting form of the binomial distribution is

$$p(x) = \frac{e^{-\lambda}\lambda^{x}}{x!}$$


    which is the Poisson distribution.
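The limiting argument is easy to check numerically. The following sketch (an addition; assumes scipy) holds $\lambda = np$ fixed while n grows and compares the binomial and Poisson probabilities of the same count.

```python
# Sketch (an addition; assumes scipy): hold lambda = n*p fixed and let n grow.
# The binomial probability of any fixed count approaches the Poisson value.
from scipy.stats import binom, poisson

lam = 2.0
for n in (10, 100, 10_000):
    p = lam / n
    print(n, binom.pmf(3, n, p), poisson.pmf(3, lam))  # converges to Poisson
```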

    Development of the Poisson Distribution from the Poisson Process

Consider a collection of time-oriented events, arbitrarily called arrivals or births. Let $x_t$ be the number of these arrivals or births that occur in the interval $[0, t)$. Note that the range space of $x_t$ is $R = \{0, 1, \ldots\}$. Assume that the number of births during non-overlapping time intervals are independent random variables, and that there is a positive constant $\lambda$ such that for any small time interval $\Delta t$, the following statements are true:

1. The probability that exactly one birth will occur in an interval of length $\Delta t$ is $\lambda \Delta t$.

2. The probability that zero births will occur in the interval is $1 - \lambda \Delta t$.

3. The probability that more than one birth will occur in the interval is zero.

The parameter $\lambda$ is often called the mean arrival rate or the mean birth rate. This type of process, in which the probability of observing exactly one event in a small interval of time is constant (or the probability of occurrence of an event is directly proportional to the length of the time interval), and the occurrence of events in non-overlapping time intervals is independent, is called a Poisson process.

In the following, let

$$P\{x_t = x\} = p_x(t), \quad x = 0, 1, 2, \ldots$$

Suppose that there have been no births up to time t. The probability that there are no births at the end of time $t + \Delta t$ is

$$p_0(t + \Delta t) = (1 - \lambda \Delta t)\, p_0(t)$$

Note that

$$\frac{p_0(t + \Delta t) - p_0(t)}{\Delta t} = -\lambda\, p_0(t)$$

so consequently

$$\lim_{\Delta t \to 0} \frac{p_0(t + \Delta t) - p_0(t)}{\Delta t} = p_0'(t) = -\lambda\, p_0(t)$$

For $x > 0$ births at the end of time $t + \Delta t$ we have

$$p_x(t + \Delta t) = \lambda \Delta t\, p_{x-1}(t) + (1 - \lambda \Delta t)\, p_x(t)$$

and

$$\lim_{\Delta t \to 0} \frac{p_x(t + \Delta t) - p_x(t)}{\Delta t} = p_x'(t) = \lambda\, p_{x-1}(t) - \lambda\, p_x(t)$$

Thus we have a system of differential equations that describe the arrivals or births:

$$p_0'(t) = -\lambda\, p_0(t) \quad\text{for } x = 0$$

$$p_x'(t) = \lambda\, p_{x-1}(t) - \lambda\, p_x(t) \quad\text{for } x = 1, 2, \ldots$$


The solution to this set of equations is

$$p_x(t) = \frac{(\lambda t)^x e^{-\lambda t}}{x!}, \quad x = 0, 1, 2, \ldots$$

    Obviously for a fixed value of t this is the Poisson distribution.
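A minimal simulation sketch of this result (an addition, not from the original text; it assumes numpy): a Poisson process with rate $\lambda$ can be generated from independent exponential($\lambda$) interarrival times, and the counts in $[0, t)$ then behave like Poisson($\lambda t$) observations, with mean and variance both near $\lambda t$.

```python
# Simulation sketch (an addition; assumes numpy): generate a Poisson process
# from independent exponential(lambda) interarrival times and count the
# arrivals in [0, t). The counts should have mean and variance lambda * t.
import numpy as np

rng = np.random.default_rng(1)
lam, t, reps = 3.0, 2.0, 100_000

inter = rng.exponential(1.0 / lam, size=(reps, 50))  # 50 interarrivals suffice
arrivals = np.cumsum(inter, axis=1)                  # arrival epochs
counts = (arrivals < t).sum(axis=1)                  # number of births in [0, t)

print(counts.mean(), lam * t)   # sample mean of the counts vs. lambda*t = 6
print(counts.var(), lam * t)    # for a Poisson count, variance = mean
```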

    S3.3. The Mean and Variance of the Normal Distribution

In Section 3.3.1 we introduce the normal distribution, with probability density function

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}, \quad -\infty < x < \infty$$

and we stated that $\mu$ and $\sigma^2$ are the mean and variance, respectively, of the distribution. We now show that this claim is correct.

Note that $f(x) \ge 0$. We first evaluate the integral $I = \int_{-\infty}^{\infty} f(x)\,dx$, showing that it is equal to 1. In the integral, change the variable of integration to $z = (x - \mu)/\sigma$. Then

$$I = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-z^2/2}\,dz$$

Since $I > 0$, if $I^2 = 1$, then $I = 1$. Now we may write

$$I^2 = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-x^2/2}\,dx \int_{-\infty}^{\infty} e^{-y^2/2}\,dy = \frac{1}{2\pi} \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-(x^2 + y^2)/2}\,dx\,dy$$

If we switch to polar coordinates, then $x = r\cos\theta$, $y = r\sin\theta$ and

$$I^2 = \frac{1}{2\pi} \int_0^{2\pi}\!\!\int_0^{\infty} e^{-r^2/2}\, r\,dr\,d\theta = \frac{1}{2\pi} \int_0^{2\pi} d\theta = \frac{1}{2\pi}(2\pi) = 1$$

So we have shown that $f(x)$ has the properties of a probability density function.

The integrand obtained by the substitution $z = (x - \mu)/\sigma$ is, of course, the standard normal distribution, an important special case of the more general normal distribution. The standard normal probability density function has a special notation, namely

$$\phi(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}, \quad -\infty < z < \infty$$

and the cumulative standard normal distribution is

$$\Phi(z) = \int_{-\infty}^{z} \phi(t)\,dt$$

Several useful properties of the standard normal distribution can be found by basic calculus:

1. $\phi(-z) = \phi(z)$ for all real z, so $\phi(z)$ is an even function (symmetric about 0) of z

2. $\phi'(z) = -z\,\phi(z)$


3. $\phi''(z) = (z^2 - 1)\,\phi(z)$

Consequently, $\phi(z)$ has a unique maximum at $z = 0$, inflection points at $z = \pm 1$, and both $\phi(z) \to 0$ and $\phi'(z) \to 0$ as $z \to \pm\infty$.

The mean and variance of the standard normal distribution are found as follows:

$$E(z) = \int_{-\infty}^{\infty} z\,\phi(z)\,dz = -\int_{-\infty}^{\infty} \phi'(z)\,dz = -\phi(z)\Big|_{-\infty}^{\infty} = 0$$

and, using $z^2\phi(z) = \phi(z) + \phi''(z)$ from property 3,

$$E(z^2) = \int_{-\infty}^{\infty} z^2\,\phi(z)\,dz = \int_{-\infty}^{\infty} \left[\phi(z) + \phi''(z)\right]dz = 1 + \phi'(z)\Big|_{-\infty}^{\infty} = 1 + 0 = 1$$

Because the variance of a random variable can be expressed in terms of expectation as $\sigma^2 = E(z^2) - [E(z)]^2$, we have shown that the mean and variance of the standard normal distribution are 0 and 1, respectively.

Now consider the case where x follows the more general normal distribution. Based on the substitution $z = (x - \mu)/\sigma$, we have

$$E(x) = \int_{-\infty}^{\infty} x\,\frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx = \int_{-\infty}^{\infty} (\mu + \sigma z)\,\phi(z)\,dz = \mu\int_{-\infty}^{\infty} \phi(z)\,dz + \sigma\int_{-\infty}^{\infty} z\,\phi(z)\,dz = \mu(1) + \sigma(0) = \mu$$

and

$$E(x^2) = \int_{-\infty}^{\infty} x^2\,\frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx = \int_{-\infty}^{\infty} (\mu + \sigma z)^2\,\phi(z)\,dz = \mu^2\int_{-\infty}^{\infty} \phi(z)\,dz + 2\mu\sigma\int_{-\infty}^{\infty} z\,\phi(z)\,dz + \sigma^2\int_{-\infty}^{\infty} z^2\,\phi(z)\,dz = \mu^2 + \sigma^2$$

Therefore, it follows that $V(x) = E(x^2) - [E(x)]^2 = \mu^2 + \sigma^2 - \mu^2 = \sigma^2$.
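The claim is also easy to confirm numerically. The sketch below (an addition; assumes scipy, and the values $\mu = 10$, $\sigma = 2$ are hypothetical) integrates the normal density and its first two moments.

```python
# Numerical sketch (an addition; assumes scipy): integrate the normal density
# and its first two moments for hypothetical values mu = 10, sigma = 2.
import numpy as np
from scipy.integrate import quad

mu, sigma = 10.0, 2.0
f = lambda x: np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

total = quad(f, -np.inf, np.inf)[0]                       # integrates to 1
mean  = quad(lambda x: x * f(x), -np.inf, np.inf)[0]      # E(x) = mu
ex2   = quad(lambda x: x ** 2 * f(x), -np.inf, np.inf)[0] # E(x^2)
print(total, mean, ex2 - mean ** 2)                       # 1.0, 10.0, 4.0
```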


    S3.4. More about the Lognormal Distribution

The lognormal distribution is a general distribution of wide applicability. The lognormal distribution is defined only for positive values of the random variable x, and the probability density function is

$$f(x) = \frac{1}{x\sigma\sqrt{2\pi}}\, e^{-\frac{(\ln x - \mu)^2}{2\sigma^2}}, \quad x > 0$$

The parameters of the lognormal distribution are $\mu$ ($-\infty < \mu < \infty$) and $\sigma^2 > 0$. The lognormal random variable is related to the normal random variable in that $y = \ln x$ is normally distributed with mean $\mu$ and variance $\sigma^2$.

The mean and variance of the lognormal distribution are

$$E(x) = e^{\mu + \sigma^2/2}$$

$$V(x) = e^{2\mu + \sigma^2}\left(e^{\sigma^2} - 1\right)$$

The median and mode of the lognormal distribution are

$$\tilde{x} = e^{\mu} \quad\text{and}\quad \mathrm{mode} = e^{\mu - \sigma^2}$$

In general, the k-th origin moment of the lognormal random variable is

$$E(x^k) = e^{k\mu + k^2\sigma^2/2}$$

Like the gamma and Weibull distributions, the lognormal finds application in reliability engineering, often as a model for survival time of components or systems. Some important properties of the lognormal distribution are:

1. If $x_1$ and $x_2$ are independent lognormal random variables with parameters $(\mu_1, \sigma_1^2)$ and $(\mu_2, \sigma_2^2)$, respectively, then $y = x_1 x_2$ is a lognormal random variable with parameters $\mu_1 + \mu_2$ and $\sigma_1^2 + \sigma_2^2$.

2. If $x_1, x_2, \ldots, x_k$ are independently and identically distributed lognormal random variables with parameters $\mu$ and $\sigma^2$, then the geometric mean of the $x_i$, or $\left(\prod_{i=1}^{k} x_i\right)^{1/k}$, has a lognormal distribution with parameters $\mu$ and $\sigma^2/k$.

3. If x is a lognormal random variable with parameters $\mu$ and $\sigma^2$, and if a, b, and c are constants such that $b = e^c$, then the random variable $y = b x^a$ has a lognormal distribution with parameters $c + a\mu$ and $a^2\sigma^2$.
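The moment formulas above are easy to spot-check by simulation. The following sketch (an addition; plain numpy, with hypothetical parameter values) exponentiates normal variates and compares the sample mean and variance to $E(x)$ and $V(x)$.

```python
# Monte Carlo sketch (an addition; assumes numpy; parameter values are
# hypothetical): x = exp(y) with y ~ N(mu, sigma^2) should have
# E(x) = exp(mu + sigma^2/2) and V(x) = exp(2*mu + sigma^2)*(exp(sigma^2) - 1).
import numpy as np

rng = np.random.default_rng(7)
mu, sigma = 1.0, 0.5
x = np.exp(rng.normal(mu, sigma, size=1_000_000))

print(x.mean(), np.exp(mu + sigma**2 / 2))                           # ~3.08
print(x.var(), np.exp(2 * mu + sigma**2) * (np.exp(sigma**2) - 1))   # ~2.69
```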

    S3.5. More about the Gamma Distribution

The gamma distribution is introduced in Section 3.3.4. The gamma probability density function is

$$f(x) = \frac{\lambda}{\Gamma(r)}\,(\lambda x)^{r-1} e^{-\lambda x}, \quad x \ge 0$$

where $r > 0$ is a shape parameter and $\lambda > 0$ is a scale parameter. The parameter r is called a shape parameter because it determines the basic shape of the graph of the density function. For example, if r = 1, the gamma


distribution reduces to an exponential distribution. There are actually three basic shapes: $r < 1$, or hyperexponential; $r = 1$, or exponential; and $r > 1$, or unimodal with right skew.

The cumulative distribution function of the gamma is

$$F(x; r, \lambda) = \int_0^x \frac{\lambda}{\Gamma(r)}\,(\lambda t)^{r-1} e^{-\lambda t}\,dt$$

The substitution $u = \lambda t$ in this integral results in $F(x; r, \lambda) = F(\lambda x; r, 1)$, which depends on $\lambda$ only through the product $\lambda x$. We typically call such a parameter a scale parameter. It can be important to have a scale parameter in a probability distribution so that the results do not depend on the scale of measurement actually used. For example, suppose that we are measuring time in months, and $\lambda = 1/6$. The probability that x is less than or equal to 12 months is $F(12/6; r, 1) = F(2; r, 1)$. If we wish to consider measuring time in weeks, then $\lambda = 1/24$, and the probability that x is less than or equal to 48 weeks is just $F(48/24; r, 1) = F(2; r, 1)$. Therefore, different scales of measurement can be accommodated by changing the scale parameter without having to change to a more general form of the distribution.

When r is an integer, the gamma distribution is sometimes called the Erlang distribution. Another special case of the gamma distribution arises when we let $r = 1/2, 1, 3/2, 2, \ldots$ and $\lambda = 1/2$; this is the chi-square distribution with $\nu = 2r = 1, 2, 3, \ldots$ degrees of freedom. The chi-square distribution is very important in statistical inference.
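The months-versus-weeks example can be reproduced directly. The sketch below (an addition; note that scipy.stats.gamma is parameterized by the shape r and a scale equal to $1/\lambda$, and the shape value used is hypothetical) shows the three probabilities agreeing.

```python
# Sketch (an addition): scipy.stats.gamma takes the shape r and a scale equal
# to 1/lambda, so the months/weeks example can be reproduced directly.
from scipy.stats import gamma

r = 2.5                                   # any shape value works here
print(gamma.cdf(12, r, scale=6.0))        # time in months, lambda = 1/6
print(gamma.cdf(48, r, scale=24.0))       # time in weeks, lambda = 1/24
print(gamma.cdf(2, r, scale=1.0))         # both equal F(2; r, 1)
```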

    S3.6. The Failure Rate for the Exponential Distribution

The exponential distribution

$$f(x) = \lambda e^{-\lambda x}, \quad x \ge 0$$

was introduced in Section 3.3.3 of the text. The exponential distribution is frequently used in reliability engineering as a model for the lifetime or time to failure of a component or system. Generally, we define the reliability function of the unit as

$$R(t) = P\{x > t\} = 1 - \int_0^t f(x)\,dx = 1 - F(t)$$

where, of course, $F(t)$ is the cumulative distribution function. In biomedical applications, the reliability function is usually called the survival function. For the exponential distribution, the reliability function is

$$R(t) = e^{-\lambda t}$$

    The Hazard Function

The mean and variance of a distribution are quite important in reliability applications, but an additional property called the hazard function or the instantaneous failure rate is also useful. The hazard function is the conditional density function of failure at time t, given that the unit has survived until time t. Therefore, letting X denote the random variable and x denote the realization,


$$h(x) = \lim_{\Delta x \to 0} \frac{P\{x < X \le x + \Delta x \mid X > x\}}{\Delta x} = \lim_{\Delta x \to 0} \frac{P\{x < X \le x + \Delta x\}}{\Delta x\, P\{X > x\}} = \lim_{\Delta x \to 0} \frac{F(x + \Delta x) - F(x)}{\Delta x\,[1 - F(x)]} = \frac{f(x)}{1 - F(x)}$$

It turns out that specifying a hazard function completely determines the cumulative distribution function (and vice versa).

The Hazard Function for the Exponential Distribution

For the exponential distribution, the hazard function is

$$h(x) = \frac{f(x)}{1 - F(x)} = \frac{\lambda e^{-\lambda x}}{e^{-\lambda x}} = \lambda$$

That is, the hazard function for the exponential distribution is constant, or the failure rate is just the reciprocal of the mean time to failure.

A constant failure rate implies that the reliability of the unit at time t does not depend on its age. This may be a reasonable assumption for some types of units, such as electrical components, but it is probably unreasonable for mechanical components. It is probably not a good assumption for many types of system-level products that are made up of many components (such as an automobile). Generally, an increasing hazard function indicates that the unit is more likely to fail in the next increment of time than it would have been in an earlier increment of time of the same length. This is likely due to aging or wear.

Despite the apparent simplicity of its hazard function, the exponential distribution has been an important distribution in reliability engineering. This is partly because the constant failure rate assumption is probably not unreasonable over some region of the unit's life.

    S3.7. The Failure Rate for the Weibull Distribution

The instantaneous failure rate or the hazard function was defined in Section S3.6 of the Supplemental Text Material. For the Weibull distribution, the hazard function is


$$h(x) = \frac{f(x)}{1 - F(x)} = \frac{\dfrac{\beta}{\theta}\left(\dfrac{x}{\theta}\right)^{\beta-1} e^{-(x/\theta)^{\beta}}}{e^{-(x/\theta)^{\beta}}} = \frac{\beta}{\theta}\left(\frac{x}{\theta}\right)^{\beta-1}$$

Note that if $\beta = 1$ the Weibull hazard function is constant. This should be no surprise, since for $\beta = 1$ the Weibull distribution reduces to the exponential. When $\beta > 1$, the Weibull hazard function increases, approaching $\infty$ as $x \to \infty$. Consequently, the Weibull is a fairly common choice as a model for components or systems that experience deterioration due to wear-out or fatigue. For the case where $\beta < 1$, the Weibull hazard function decreases, approaching 0 as $x \to \infty$.

For comparison purposes, note that the hazard function for the gamma distribution with parameters r and $\lambda$ is also constant for the case r = 1 (the gamma also reduces to the exponential when r = 1). Also, when r > 1 the hazard function increases, and when r < 1 the hazard function decreases. However, when r > 1 the hazard function approaches $\lambda$ from below, while if r < 1 the hazard function approaches $\lambda$ from above. Therefore, even though the graphs of the gamma and Weibull distributions can look very similar, and they can both produce reasonable fits to the same sample of data, they clearly have very different characteristics in terms of describing survival or reliability data.
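The contrast is easy to see numerically. The following sketch (an addition; assumes scipy, and the parameter values $\beta = 2$, r = 2, $\lambda = 1$ are hypothetical) evaluates $h(x) = f(x)/[1 - F(x)]$ for both distributions at a few points.

```python
# Sketch (an addition; assumes scipy; beta = 2 and r = 2 are hypothetical):
# evaluate h(x) = f(x) / [1 - F(x)] for the Weibull and the gamma. Both
# increase, but the Weibull hazard is unbounded while the gamma hazard
# approaches lambda = 1 from below.
import numpy as np
from scipy.stats import weibull_min, gamma

x = np.array([0.5, 1.0, 2.0, 5.0, 20.0])
beta, r = 2.0, 2.0

print(weibull_min.pdf(x, beta) / weibull_min.sf(x, beta))  # h(x) = 2x here
print(gamma.pdf(x, r) / gamma.sf(x, r))                    # x/(1+x) -> 1
```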


    Supplemental Material for Chapter 4

    S4.1. Random Samples

To properly apply many statistical techniques, the sample drawn from the population of interest must be a random sample. To properly define a random sample, let x be a random variable that represents the result of selecting one observation from the population of interest. Let $f(x)$ be the probability distribution of x.

Now suppose that n observations (a sample) are obtained independently from the population under unchanging conditions. That is, we do not let the outcome from one observation influence the outcome from another observation. Let $x_i$ be the random variable that represents the observation obtained on the i-th trial. Then the observations $x_1, x_2, \ldots, x_n$ are a random sample.

In a random sample the marginal probability distributions $f(x_1), f(x_2), \ldots, f(x_n)$ are all identical, the observations in the sample are independent, and by definition, the joint probability distribution of the random sample is $f(x_1, x_2, \ldots, x_n) = f(x_1)\,f(x_2)\cdots f(x_n)$.

    S4.2. Expected Value and Variance Operators

Readers should have prior exposure to mathematical expectation from a basic statistics course. Here some of the basic properties of expectation are reviewed.

The expected value of a random variable x is denoted by $E(x)$ and is given by

$$E(x) = \begin{cases} \displaystyle\sum_{\text{all } x_i} x_i\, p(x_i), & x \text{ is a discrete random variable} \\[6pt] \displaystyle\int_{-\infty}^{\infty} x f(x)\,dx, & x \text{ is a continuous random variable} \end{cases}$$

The expectation of a random variable is very useful in that it provides a straightforward characterization of the distribution, and it has a simple practical interpretation as the center of mass, centroid, or mean of the distribution.

Now suppose that y is a function of the random variable x, say $y = h(x)$. Note that y is also a random variable. The expectation of $h(x)$ is defined as

$$E[h(x)] = \begin{cases} \displaystyle\sum_{\text{all } x_i} h(x_i)\, p(x_i), & x \text{ is a discrete random variable} \\[6pt] \displaystyle\int_{-\infty}^{\infty} h(x) f(x)\,dx, & x \text{ is a continuous random variable} \end{cases}$$

An interesting result, sometimes called the theorem of the unconscious statistician, states that if x is a continuous random variable with probability density function $f(x)$ and $y = h(x)$ is a function of x having probability density function $g(y)$, then the expectation of y can be found either by using the definition of expectation with $g(y)$ or in terms of its definition as the expectation of a function of x with respect to the probability density function of x. That is, we may write either

$$E(y) = \int_{-\infty}^{\infty} y\, g(y)\,dy$$

or

$$E(y) = E[h(x)] = \int_{-\infty}^{\infty} h(x) f(x)\,dx$$


The name for this theorem comes from the fact that we often apply it without consciously thinking about whether the theorem is true in our particular case.
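The theorem can be illustrated numerically. In the sketch below (an addition; assumes scipy), $y = h(x) = x^2$ with x standard normal, so y has a chi-square distribution with 1 degree of freedom; both routes to $E(y)$ give the same value.

```python
# Numerical sketch (an addition; assumes scipy): with y = h(x) = x^2 and x
# standard normal, y is chi-square with 1 df, and both routes to E(y) agree.
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm, chi2

e1 = quad(lambda x: x**2 * norm.pdf(x), -np.inf, np.inf)[0]  # E[h(x)] via f(x)
e2 = quad(lambda y: y * chi2.pdf(y, df=1), 0, np.inf)[0]     # E(y) via g(y)
print(e1, e2)   # both equal 1
```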

    Useful Properties of Expectation I:

Let x be a random variable with mean $\mu$, and let c be a constant. Then

1. $E(c) = c$

2. $E(x) = \mu$

3. $E(cx) = cE(x) = c\mu$

4. $E[c\,h(x)] = cE[h(x)]$

5. If $c_1$ and $c_2$ are constants and $h_1$ and $h_2$ are functions, then $E[c_1 h_1(x) + c_2 h_2(x)] = c_1 E[h_1(x)] + c_2 E[h_2(x)]$

Because of property 5, expectation is called a linear (or distributive) operator.

Now consider the function $h(x) = (x - c)^2$ where c is a constant, and suppose that $E[(x - c)^2]$ exists. To find the value of c for which $E[(x - c)^2]$ is a minimum, write

$$E[(x - c)^2] = E[x^2 - 2xc + c^2] = E(x^2) - 2cE(x) + c^2$$

Now the derivative of $E[(x - c)^2]$ with respect to c is $-2E(x) + 2c$, and this derivative is zero when $c = E(x)$. Therefore, $E[(x - c)^2]$ is a minimum when $c = E(x)$.

The variance of the random variable x is defined as

$$V(x) = E[(x - \mu)^2] = \sigma^2$$

and we usually call $V(\cdot) = E[(\cdot - \mu)^2]$ the variance operator. It is straightforward to show that if c is a constant, then $V(cx) = c^2\sigma^2$. The variance is analogous to the moment of inertia in mechanics.

    Useful Properties of Expectation II:

Let $x_1$ and $x_2$ be random variables with means $\mu_1$ and $\mu_2$ and variances $\sigma_1^2$ and $\sigma_2^2$, respectively, and let $c_1$ and $c_2$ be constants. Then

1. $E(x_1 + x_2) = \mu_1 + \mu_2$

2. It is possible to show that $V(x_1 + x_2) = \sigma_1^2 + \sigma_2^2 + 2\,\mathrm{Cov}(x_1, x_2)$, where

$$\mathrm{Cov}(x_1, x_2) = E[(x_1 - \mu_1)(x_2 - \mu_2)]$$


is the covariance of the random variables $x_1$ and $x_2$. The covariance is a measure of the linear association between $x_1$ and $x_2$. More specifically, we may show that if $x_1$ and $x_2$ are independent, then $\mathrm{Cov}(x_1, x_2) = 0$.

3. $V(x_1 - x_2) = \sigma_1^2 + \sigma_2^2 - 2\,\mathrm{Cov}(x_1, x_2)$

4. If the random variables $x_1$ and $x_2$ are independent, $V(x_1 \pm x_2) = \sigma_1^2 + \sigma_2^2$

5. If the random variables $x_1$ and $x_2$ are independent, $E(x_1 x_2) = E(x_1)E(x_2) = \mu_1\mu_2$

6. Regardless of whether $x_1$ and $x_2$ are independent, in general

$$E\left(\frac{x_1}{x_2}\right) \ne \frac{E(x_1)}{E(x_2)}$$

7. For the single random variable x, $V(x + x) = V(2x) = 4\sigma^2$, because $\mathrm{Cov}(x, x) = \sigma^2$.

    Moments

Although we do not make much use of the notion of the moments of a random variable in the book, for completeness we give the definition. Let the function of the random variable x be

$$h(x) = x^k$$

where k is a positive integer. Then the expectation of $h(x) = x^k$ is called the k-th moment about the origin of the random variable x and is given by

$$\mu_k' = E(x^k) = \begin{cases} \displaystyle\sum_{\text{all } x_i} x_i^k\, p(x_i), & x \text{ is a discrete random variable} \\[6pt] \displaystyle\int_{-\infty}^{\infty} x^k f(x)\,dx, & x \text{ is a continuous random variable} \end{cases}$$

Note that the first origin moment is just the mean $\mu$ of the random variable x. The second origin moment is

$$\mu_2' = E(x^2) = \sigma^2 + \mu^2$$

Moments about the mean are defined as

$$\mu_k = E[(x - \mu)^k] = \begin{cases} \displaystyle\sum_{\text{all } x_i} (x_i - \mu)^k\, p(x_i), & x \text{ is a discrete random variable} \\[6pt] \displaystyle\int_{-\infty}^{\infty} (x - \mu)^k f(x)\,dx, & x \text{ is a continuous random variable} \end{cases}$$

The second moment about the mean is the variance $\sigma^2$ of the random variable x.

S4.3. Proof That $E(\bar{x}) = \mu$ and $E(s^2) = \sigma^2$

It is easy to show that the sample average $\bar{x}$ and the sample variance $s^2$ are unbiased estimators of the corresponding population parameters $\mu$ and $\sigma^2$, respectively. Suppose that the random variable x


has mean $\mu$ and variance $\sigma^2$, and that $x_1, x_2, \ldots, x_n$ is a random sample of size n from the population. Then

$$E(\bar{x}) = E\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right) = \frac{1}{n}\sum_{i=1}^{n} E(x_i) = \frac{1}{n}(n\mu) = \mu$$

because the expected value of each observation in the sample is $E(x_i) = \mu$. Now consider

$$E(s^2) = E\left[\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n - 1}\right] = \frac{1}{n-1}\, E\left[\sum_{i=1}^{n}(x_i - \bar{x})^2\right]$$

It is convenient to write $\sum_{i=1}^{n}(x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - n\bar{x}^2$, and so

$$E\left[\sum_{i=1}^{n}(x_i - \bar{x})^2\right] = \sum_{i=1}^{n} E(x_i^2) - n\,E(\bar{x}^2)$$

Now $E(x_i^2) = \mu^2 + \sigma^2$ and $E(\bar{x}^2) = \mu^2 + \sigma^2/n$. Therefore

$$E(s^2) = \frac{1}{n-1}\left[n(\mu^2 + \sigma^2) - n\left(\mu^2 + \frac{\sigma^2}{n}\right)\right] = \frac{1}{n-1}\left(n\sigma^2 - \sigma^2\right) = \frac{(n-1)\sigma^2}{n-1} = \sigma^2$$

    Note that:

a. These results do not depend on the form of the distribution for the random variable x. Many people think that an assumption of normality is required, but this is unnecessary.

b. Even though $E(s^2) = \sigma^2$, the sample standard deviation is not an unbiased estimator of the population standard deviation. This is discussed more fully in Section S4.5.
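A short simulation sketch (an addition; plain numpy) illustrates both notes at once: sampling from an exponential population, which is decidedly non-normal, the average of $s^2$ is close to $\sigma^2$ while the average of s falls below $\sigma$.

```python
# Simulation sketch (an addition; assumes numpy): samples of size n = 5 from
# an exponential population (decidedly non-normal, sigma = 1). The average of
# s^2 is close to sigma^2 = 1, while the average of s falls below sigma.
import numpy as np

rng = np.random.default_rng(42)
x = rng.exponential(1.0, size=(200_000, 5))

s2 = x.var(axis=1, ddof=1)      # sample variances with divisor n - 1
print(s2.mean())                # approximately 1: s^2 is unbiased
print(np.sqrt(s2).mean())       # noticeably below 1: s is biased low
```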


    S4.4. More About Parameter Estimation

Throughout the book, estimators of various population or process parameters are given without much discussion concerning how these estimators were generated. Often they are simply logical or intuitive estimators, such as using the sample average $\bar{x}$ as an estimator of the population mean $\mu$.

There are methods for developing point estimators of population parameters. These methods are typically discussed in detail in courses in mathematical statistics. We now give a brief overview of some of these methods.

    The Method of Maximum Likelihood

One of the best methods for obtaining a point estimator of a population parameter is the method of maximum likelihood. Suppose that x is a random variable with probability distribution $f(x; \theta)$, where $\theta$ is a single unknown parameter. Let $x_1, x_2, \ldots, x_n$ be the observations in a random sample of size n. Then the likelihood function of the sample is

$$L(\theta) = f(x_1; \theta)\, f(x_2; \theta) \cdots f(x_n; \theta)$$

The maximum likelihood estimator of $\theta$ is the value of $\theta$ that maximizes the likelihood function $L(\theta)$.

    Example 1 The Exponential Distribution

To illustrate the maximum likelihood estimation procedure, let x be exponentially distributed with parameter $\lambda$. The likelihood function of a random sample of size n, say $x_1, x_2, \ldots, x_n$, is

$$L(\lambda) = \prod_{i=1}^{n} \lambda e^{-\lambda x_i} = \lambda^n e^{-\lambda \sum_{i=1}^{n} x_i}$$

Now it turns out that, in general, if the maximum likelihood estimator maximizes $L(\lambda)$, it will also maximize the log likelihood, $\ln L(\lambda)$. For the exponential distribution, the log likelihood is

$$\ln L(\lambda) = n \ln \lambda - \lambda \sum_{i=1}^{n} x_i$$

Now

$$\frac{d \ln L(\lambda)}{d\lambda} = \frac{n}{\lambda} - \sum_{i=1}^{n} x_i$$

Equating the derivative to zero and solving for the estimator of $\lambda$, we obtain

$$\hat{\lambda} = \frac{n}{\sum_{i=1}^{n} x_i} = \frac{1}{\bar{x}}$$

Thus the maximum likelihood estimator (or the MLE) of $\lambda$ is the reciprocal of the sample average.
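The closed form can be cross-checked by maximizing the log likelihood numerically. The sketch below (an addition; assumes scipy, and the data are simulated purely for illustration) applies a bounded scalar minimizer to the negative log likelihood.

```python
# Sketch (an addition; assumes scipy; the data are simulated for illustration):
# maximize the exponential log likelihood numerically and compare with the
# closed-form MLE 1/xbar.
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(3)
x = rng.exponential(scale=1 / 2.5, size=200)   # true lambda = 2.5

neg_log_lik = lambda lam: -(len(x) * np.log(lam) - lam * x.sum())
res = minimize_scalar(neg_log_lik, bounds=(1e-6, 100), method="bounded")

print(res.x, 1 / x.mean())   # numerical and closed-form MLEs agree
```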

Maximum likelihood estimation can be used in situations where there are several unknown parameters, say $\theta_1, \theta_2, \ldots, \theta_p$, to be estimated. The maximum likelihood estimators would be found simply by equating the p first partial derivatives $\partial L(\theta_1, \theta_2, \ldots, \theta_p)/\partial \theta_i$, $i = 1, 2, \ldots, p$, of the likelihood (or the log likelihood) to zero and solving the resulting system of equations.

    Example 2 The Normal Distribution

Let x be normally distributed with the parameters $\mu$ and $\sigma^2$ unknown. The likelihood function of a random sample of size n is

$$L(\mu, \sigma^2) = \prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x_i - \mu)^2}{2\sigma^2}} = \frac{1}{(2\pi\sigma^2)^{n/2}}\, e^{-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2}$$

The log-likelihood function is

$$\ln L(\mu, \sigma^2) = -\frac{n}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2$$

Now

$$\frac{\partial \ln L(\mu, \sigma^2)}{\partial \mu} = \frac{1}{\sigma^2}\sum_{i=1}^{n}(x_i - \mu) = 0$$

$$\frac{\partial \ln L(\mu, \sigma^2)}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^{n}(x_i - \mu)^2 = 0$$

The solution to these equations yields the MLEs

$$\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} x_i = \bar{x} \qquad \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2$$

Generally, we like the method of maximum likelihood because when n is large, (1) it results in estimators that are approximately unbiased, (2) the variance of an MLE is as small as or nearly as small as the variance that could be obtained with any other estimation technique, and (3) MLEs are approximately normally distributed. Furthermore, the MLE has an invariance property; that is, if $\hat{\theta}$ is the MLE of $\theta$, then the MLE of a function of $\theta$, say $h(\theta)$, is the same function $h(\hat{\theta})$ of the MLE. There are also some other nice statistical properties that MLEs enjoy; see a book on mathematical statistics, such as Hogg and Craig (1978) or Bain and Engelhardt (1987).

The unbiased property of the MLE is a large-sample or asymptotic property. To illustrate, consider the MLE for $\sigma^2$ in the normal distribution of Example 2 above. We can easily show that

$$E(\hat{\sigma}^2) = \frac{n-1}{n}\,\sigma^2$$

Now the bias in estimation of $\sigma^2$ is

$$E(\hat{\sigma}^2) - \sigma^2 = \frac{n-1}{n}\,\sigma^2 - \sigma^2 = -\frac{\sigma^2}{n}$$

Notice that the bias in estimating $\sigma^2$ goes to zero as the sample size $n \to \infty$. Therefore, the MLE is an asymptotically unbiased estimator.


    The Method of Moments

Estimation by the method of moments involves equating the origin moments of the probability distribution (which are functions of the unknown parameters) to the sample moments, and solving for the unknown parameters. We can define the first p sample moments as

$$M_k = \frac{1}{n}\sum_{i=1}^{n} x_i^k, \quad k = 1, 2, \ldots, p$$

and the first p moments around the origin of the random variable x are just

$$\mu_k' = E(x^k), \quad k = 1, 2, \ldots, p$$

    Example 3 The Normal Distribution

For the normal distribution the first two origin moments are

$$\mu_1' = \mu \qquad \mu_2' = \mu^2 + \sigma^2$$

and the first two sample moments are

$$M_1 = \bar{x} \qquad M_2 = \frac{1}{n}\sum_{i=1}^{n} x_i^2$$

Equating the sample and origin moments results in

$$\mu = \bar{x} \qquad \mu^2 + \sigma^2 = \frac{1}{n}\sum_{i=1}^{n} x_i^2$$

The solution gives the moment estimators of $\mu$ and $\sigma^2$:

$$\hat{\mu} = \bar{x} \qquad \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2$$

The method of moments often yields estimators that are reasonably good. For example, in the above example the moment estimators are identical to the MLEs. However, generally moment estimators are not as good as MLEs because they don't have statistical properties that are as nice. For example, moment estimators usually have larger variances than MLEs.

Least Squares Estimation

The method of least squares is one of the oldest and most widely used methods of parameter estimation. Section 4.6 gives an introduction to least squares for fitting regression models. Unlike the method of maximum likelihood and the method of moments, least squares can be employed when the distribution of the random variable is unknown.

To illustrate, suppose that the simple location model can describe the random variable x:

$$x_i = \mu + \epsilon_i, \quad i = 1, 2, \ldots, n$$

where the parameter $\mu$ is unknown and the $\epsilon_i$ are random errors. We don't know the distribution of the errors, but we can assume that they have mean zero and constant variance. The least squares estimator of $\mu$ is chosen so that the sum of the squares of the model errors $\epsilon_i$ is minimized. The least squares function for a sample of n observations $x_1, x_2, \ldots, x_n$ is

$$L = \sum_{i=1}^{n} \epsilon_i^2 = \sum_{i=1}^{n}(x_i - \mu)^2$$

Differentiating L and equating the derivative to zero results in the least squares estimator of $\mu$:

$$\hat{\mu} = \bar{x}$$

In general, the least squares function will contain p unknown parameters and L will be minimized by solving the equations that result when the first partial derivatives of L with respect to the unknown parameters are equated to zero. These equations are called the least squares normal equations. See Section 4.6 in the textbook.
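A minimal sketch of this calculation (an addition; assumes scipy, and the five data values are hypothetical) minimizes the least squares function numerically and confirms that the answer is the sample average.

```python
# Minimal sketch (an addition; assumes scipy; the data values are
# hypothetical): minimize L(mu) numerically and confirm mu_hat = xbar.
import numpy as np
from scipy.optimize import minimize_scalar

x = np.array([9.8, 10.4, 10.1, 9.6, 10.3])
L = lambda mu: np.sum((x - mu) ** 2)     # least squares function

print(minimize_scalar(L).x, x.mean())    # identical values
```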

The method of least squares dates from work by Carl Friedrich Gauss in the early 1800s. It has a very well-developed and indeed quite elegant theory. For a discussion of the use of least squares in estimating the parameters in regression models and many illustrative examples, see Section 4.6 and Montgomery, Peck and Vining (2007), and for a very readable and concise presentation of the theory, see Myers and Milton (1991).

S4.5. Proof That $E(s) \neq \sigma$

In Section S4.3 of the Supplemental Text Material we showed that the sample variance is an unbiased estimator of the population variance; that is, $E(s^2) = \sigma^2$, and that this result does not depend on the form of the distribution. However, the sample standard deviation is not an unbiased estimator of the population standard deviation. This is easy to demonstrate for the case where the random variable x follows a normal distribution.

Let x have a normal distribution with mean $\mu$ and variance $\sigma^2$, and let $x_1, x_2, \ldots, x_n$ be a random sample of size n from the population. Now the distribution of

$$\frac{(n-1)s^2}{\sigma^2}$$

is chi-square with $n - 1$ degrees of freedom, denoted $\chi^2_{n-1}$. Therefore the distribution of $s^2$ is $\sigma^2/(n-1)$ times a $\chi^2_{n-1}$ random variable. So when sampling from a normal distribution, the expected value of $s^2$ is

$$E(s^2) = E\left(\frac{\sigma^2}{n-1}\,\chi^2_{n-1}\right) = \frac{\sigma^2}{n-1}\, E(\chi^2_{n-1}) = \frac{\sigma^2}{n-1}\,(n-1) = \sigma^2$$

because the mean of a chi-square random variable with $n - 1$ degrees of freedom is $n - 1$. Now it follows that the distribution of

$$\frac{\sqrt{n-1}\; s}{\sigma}$$

is a chi distribution with $n - 1$ degrees of freedom, denoted $\chi_{n-1}$. The expected value of s can be written as

$$E(s) = E\left(\frac{\sigma}{\sqrt{n-1}}\,\chi_{n-1}\right) = \frac{\sigma}{\sqrt{n-1}}\, E(\chi_{n-1})$$

The mean of the chi distribution with $n - 1$ degrees of freedom is

$$E(\chi_{n-1}) = \sqrt{2}\;\frac{\Gamma(n/2)}{\Gamma[(n-1)/2]}$$

where the gamma function is $\Gamma(r) = \int_0^{\infty} y^{r-1} e^{-y}\,dy$. Then

$$E(s) = \sqrt{\frac{2}{n-1}}\;\frac{\Gamma(n/2)}{\Gamma[(n-1)/2]}\;\sigma = c_4\,\sigma$$

The constant $c_4$ is given in Appendix Table VI.

While s is a biased estimator of $\sigma$, the bias gets small fairly quickly as the sample size n increases. From Appendix Table VI, note that $c_4 = 0.9400$ for a sample of n = 5, $c_4 = 0.9727$ for a sample of n = 10, and $c_4 = 0.9896$, or very nearly unity, for a sample of n = 25.
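Since $c_4$ has the closed form derived above, the tabled values are easy to reproduce. The sketch below (an addition; assumes scipy for the gamma function) evaluates the formula at the sample sizes just quoted.

```python
# Sketch (an addition; assumes scipy): evaluate
# c4 = sqrt(2/(n-1)) * Gamma(n/2) / Gamma((n-1)/2) at the sample sizes above.
import numpy as np
from scipy.special import gamma as G

def c4(n):
    return np.sqrt(2.0 / (n - 1)) * G(n / 2) / G((n - 1) / 2)

for n in (5, 10, 25):
    print(n, c4(n))   # 0.9400, 0.9727, 0.9896
```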

S4.6. More about Checking Assumptions in the t-Test

The two-sample t-test can be presented from the viewpoint of a simple linear regression model. This is a very instructive way to think about the t-test, as it fits in nicely with the general notion of a factorial experiment with factors at two levels. This type of experiment is very important in process development and improvement, and is discussed extensively in Chapter 13. This also leads to another way to check assumptions in the t-test. This method is equivalent to the normal probability plotting of the original data discussed in Chapter 4.

We will use the data on the two catalysts in Example 4.9 to illustrate. In the two-sample t-test scenario, we have a factor x with two levels, which we can arbitrarily call low and high. We will use x = -1 to denote the low level of this factor (Catalyst 1) and x = +1 to denote the high level of this factor (Catalyst 2). The figure below is a scatter plot (from Minitab) of the yield data resulting from using the two catalysts shown in Table 4.2 of the textbook.


[Figure: Scatterplot of Yield vs Catalyst. Yield (roughly 89 to 98) is plotted against the coded catalyst variable, x = -1.0 to +1.0.]

We will fit a simple linear regression model to these data, say

$$y_{ij} = \beta_0 + \beta_1 x_{ij} + \epsilon_{ij}$$

where $\beta_0$ and $\beta_1$ are the intercept and slope, respectively, of the regression line and the regressor or predictor variable is $x_{1j} = -1$ and $x_{2j} = +1$. The method of least squares can be used to estimate the slope and intercept in this model. Assuming that we have equal sample sizes n for each factor level, the least squares normal equations are:

$$2n\,\hat{\beta}_0 = \sum_{j=1}^{n} y_{1j} + \sum_{j=1}^{n} y_{2j}$$

$$2n\,\hat{\beta}_1 = \sum_{j=1}^{n} y_{2j} - \sum_{j=1}^{n} y_{1j}$$

The solution to these equations is

$$\hat{\beta}_0 = \bar{y}$$

$$\hat{\beta}_1 = \frac{1}{2}(\bar{y}_2 - \bar{y}_1)$$

Note that the least squares estimator of the intercept is the average of all the observations from both samples, while the estimator of the slope is one-half of the difference between the sample averages at the high and low levels of the factor x. Below is the output from the linear regression procedure in Minitab for the catalyst data.


    Regression Analysis: Yield versus Catalyst

    The regression equation is

    Yield = 92.5 + 0.239 Catalyst

    Predictor Coef SE Coef T P

    Constant 92.4938 0.6752 136.98 0.000

    Catalyst 0.2387 0.6752 0.35 0.729

    S = 2.70086 R-Sq = 0.9% R-Sq(adj) = 0.0%

    Analysis of Variance

    Source DF SS MS F P

    Regression 1 0.912 0.912 0.13 0.729

    Residual Error 14 102.125 7.295

    Total 15 103.037

Notice that the estimate of the slope (given in the column labeled Coef and the row labeled Catalyst) is $\hat{\beta}_1 = \frac{1}{2}(\bar{y}_2 - \bar{y}_1) = \frac{1}{2}(92.7325 - 92.2550) = 0.2387$, and the estimate of the intercept is $\hat{\beta}_0 = \frac{1}{2}(\bar{y}_2 + \bar{y}_1) = \frac{1}{2}(92.7325 + 92.2550) = 92.4938$. Furthermore, notice that the t-statistic associated with the slope is

equal to 0.35, exactly the same value (apart from sign, because we subtracted the averages in the reverse order) we gave in the text. Now in simple linear regression, the t-test on the slope is actually testing the hypotheses

$$H_0\colon \beta_1 = 0$$

$$H_1\colon \beta_1 \neq 0$$

and this is equivalent to testing $H_0\colon \mu_1 = \mu_2$.

It is easy to show that the t-test statistic used for testing that the slope equals zero in simple linear regression is identical to the usual two-sample t-test. Recall that to test the above hypotheses in simple linear regression the t-statistic is

$$t_0 = \frac{\hat{\beta}_1}{\sqrt{\dfrac{\hat{\sigma}^2}{S_{xx}}}}$$

where $S_{xx} = \sum_{i=1}^{2}\sum_{j=1}^{n}(x_{ij} - \bar{x})^2$ is the corrected sum of squares of the x's. Now in our specific problem, $\bar{x} = 0$, $x_{1j} = -1$ and $x_{2j} = +1$, so $S_{xx} = 2n$. Therefore, since we have already observed that the estimate of $\sigma$ is just $s_p$,


$$t_0 = \frac{\hat{\beta}_1}{\sqrt{\dfrac{s_p^2}{2n}}} = \frac{\frac{1}{2}(\bar{y}_2 - \bar{y}_1)}{\sqrt{\dfrac{s_p^2}{2n}}} = \frac{\bar{y}_2 - \bar{y}_1}{s_p\sqrt{\dfrac{2}{n}}}$$

This is the usual two-sample t-test statistic for the case of equal sample sizes.
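The equivalence can be verified numerically with the catalyst yields listed in the residual table below. The following sketch (an addition; assumes scipy and numpy) computes the pooled two-sample t statistic and the regression slope t statistic and shows that they agree.

```python
# Sketch (an addition; assumes scipy/numpy) using the catalyst yields from the
# residual table below: the pooled two-sample t statistic equals the
# regression t statistic for the slope.
import numpy as np
from scipy import stats

y1 = np.array([91.50, 94.18, 92.18, 95.39, 91.79, 89.07, 94.72, 89.21])  # catalyst 1
y2 = np.array([89.19, 90.95, 90.46, 93.21, 97.19, 97.04, 91.07, 92.75])  # catalyst 2
n = len(y1)

t_two_sample = stats.ttest_ind(y2, y1, equal_var=True).statistic

b1 = (y2.mean() - y1.mean()) / 2                # least squares slope
sp2 = (y1.var(ddof=1) + y2.var(ddof=1)) / 2     # pooled variance (equal n)
t_slope = b1 / np.sqrt(sp2 / (2 * n))           # Sxx = 2n

print(t_two_sample, t_slope)                    # both approximately 0.35
```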

Most regression software packages will also compute a table or listing of the residuals from the model. The residuals from the Minitab regression model fit obtained above are as follows:

    Obs Catalyst Yield Fit SE Fit Residual St Resid

    1 -1.00 91.500 92.255 0.955 -0.755 -0.30

    2 -1.00 94.180 92.255 0.955 1.925 0.76

    3 -1.00 92.180 92.255 0.955 -0.075 -0.03

    4 -1.00 95.390 92.255 0.955 3.135 1.24

    5 -1.00 91.790 92.255 0.955 -0.465 -0.18

    6 -1.00 89.070 92.255 0.955 -3.185 -1.26

    7 -1.00 94.720 92.255 0.955 2.465 0.98

    8 -1.00 89.210 92.255 0.955 -3.045 -1.21

    9 1.00 89.190 92.733 0.955 -3.543 -1.40

    10 1.00 90.950 92.733 0.955 -1.783 -0.71

    11 1.00 90.460 92.733 0.955 -2.273 -0.90

    12 1.00 93.210 92.733 0.955 0.477 0.19

    13 1.00 97.190 92.733 0.955 4.457 1.76

    14 1.00 97.040 92.733 0.955 4.307 1.70

    15 1.00 91.070 92.733 0.955 -1.663 -0.66

    16 1.00 92.750 92.733 0.955 0.017 0.01

The column labeled Fit contains the predicted values of yield from the regression model, which just happen to be the averages of the two samples. The residuals are in the sixth column of this table. They are just the differences between the observed values of yield and the corresponding predicted values. A normal probability plot of the residuals follows.


[Figure: Normal Probability Plot of the Residuals (response is Yield); percent plotted against residual.]

Notice that the residuals plot approximately along a straight line, indicating that there is no serious problem with the normality assumption in these data. This is equivalent to plotting the original yield data on separate probability plots as we did in Chapter 3.

    S4.7. Expected Mean Squares in the Single-Factor Analysis of Variance

In Section 4.5.2 we give the expected values of the mean squares for treatments and error in the single-factor analysis of variance (ANOVA). These quantities may be derived by straightforward application of the expectation operator.

Consider first the mean square for treatments:

$$E(MS_{\text{Treatments}}) = E\left(\frac{SS_{\text{Treatments}}}{a - 1}\right)$$

Now for a balanced design (equal number of observations in each treatment)

$$SS_{\text{Treatments}} = \frac{1}{n}\sum_{i=1}^{a} y_{i\cdot}^2 - \frac{1}{an}\, y_{\cdot\cdot}^2$$

where $y_{i\cdot}$ is the total of the observations in the i-th treatment and $y_{\cdot\cdot}$ is the grand total, and the single-factor ANOVA model is

$$y_{ij} = \mu + \tau_i + \epsilon_{ij}, \quad i = 1, 2, \ldots, a, \quad j = 1, 2, \ldots, n$$

In addition, we will find the following useful:

$$E(\epsilon_{ij}) = E(\epsilon_{i\cdot}) = E(\epsilon_{\cdot\cdot}) = 0, \quad E(\epsilon_{ij}^2) = \sigma^2, \quad E(\epsilon_{i\cdot}^2) = n\sigma^2, \quad E(\epsilon_{\cdot\cdot}^2) = an\sigma^2$$

    Now


$$E(SS_{\text{Treatments}}) = E\left(\frac{1}{n}\sum_{i=1}^{a} y_{i\cdot}^2\right) - E\left(\frac{1}{an}\, y_{\cdot\cdot}^2\right)$$

Consider the first term on the right hand side of the above expression:

$$E\left(\frac{1}{n}\sum_{i=1}^{a} y_{i\cdot}^2\right) = \frac{1}{n}\sum_{i=1}^{a} E(n\mu + n\tau_i + \epsilon_{i\cdot})^2$$

Squaring the expression in parentheses and taking expectation results in

$$E\left(\frac{1}{n}\sum_{i=1}^{a} y_{i\cdot}^2\right) = \frac{1}{n}\left[a(n\mu)^2 + n^2\sum_{i=1}^{a}\tau_i^2 + an\sigma^2\right] = an\mu^2 + n\sum_{i=1}^{a}\tau_i^2 + a\sigma^2$$

because the three cross-product terms are all zero. Now consider the second term on the right hand side of $E(SS_{\text{Treatments}})$:

$$E\left(\frac{1}{an}\, y_{\cdot\cdot}^2\right) = \frac{1}{an}\, E\left(an\mu + n\sum_{i=1}^{a}\tau_i + \epsilon_{\cdot\cdot}\right)^2 = \frac{1}{an}\, E(an\mu + \epsilon_{\cdot\cdot})^2$$

since $\sum_{i=1}^{a}\tau_i = 0$. Upon squaring the term in parentheses and taking expectation, we obtain

$$E\left(\frac{1}{an}\, y_{\cdot\cdot}^2\right) = \frac{1}{an}\left[(an\mu)^2 + an\sigma^2\right] = an\mu^2 + \sigma^2$$

since the expected value of the cross-product is zero. Therefore,

$$E(SS_{\text{Treatments}}) = an\mu^2 + n\sum_{i=1}^{a}\tau_i^2 + a\sigma^2 - an\mu^2 - \sigma^2 = n\sum_{i=1}^{a}\tau_i^2 + (a-1)\sigma^2$$

Consequently, the expected value of the mean square for treatments is


$$E(MS_{\text{Treatments}}) = E\left(\frac{SS_{\text{Treatments}}}{a - 1}\right) = \frac{1}{a-1}\left[n\sum_{i=1}^{a}\tau_i^2 + (a-1)\sigma^2\right] = \sigma^2 + \frac{n\sum_{i=1}^{a}\tau_i^2}{a - 1}$$

    This is the result given in the textbook.

For the error mean square, we obtain

$$E(MS_E) = E\left(\frac{SS_E}{N - a}\right) = \frac{1}{N-a}\, E\left[\sum_{i=1}^{a}\sum_{j=1}^{n}(y_{ij} - \bar{y}_{i\cdot})^2\right] = \frac{1}{N-a}\, E\left[\sum_{i=1}^{a}\sum_{j=1}^{n} y_{ij}^2 - \frac{1}{n}\sum_{i=1}^{a} y_{i\cdot}^2\right]$$

where $N = an$. Substituting the model into this last expression, we obtain

$$E(MS_E) = \frac{1}{N-a}\, E\left[\sum_{i=1}^{a}\sum_{j=1}^{n}(\mu + \tau_i + \epsilon_{ij})^2 - \frac{1}{n}\sum_{i=1}^{a}(n\mu + n\tau_i + \epsilon_{i\cdot})^2\right]$$

After squaring and taking expectation, this last equation becomes

$$E(MS_E) = \frac{1}{N-a}\left[N\mu^2 + n\sum_{i=1}^{a}\tau_i^2 + N\sigma^2 - N\mu^2 - n\sum_{i=1}^{a}\tau_i^2 - a\sigma^2\right] = \frac{(N-a)\sigma^2}{N-a} = \sigma^2$$
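A simulation sketch (an addition; plain numpy, with hypothetical values of a, n, $\sigma$, and the $\tau_i$) illustrates both results: averaging the treatment and error mean squares over many simulated balanced experiments reproduces $\sigma^2 + n\sum\tau_i^2/(a-1)$ and $\sigma^2$.

```python
# Simulation sketch (an addition; assumes numpy; a, n, sigma, and tau are
# hypothetical): average MS_Treatments and MS_E over many balanced one-factor
# experiments and compare with the expected mean squares derived above.
import numpy as np

rng = np.random.default_rng(11)
a, n, sigma = 3, 5, 1.0
tau = np.array([-1.0, 0.0, 1.0])          # treatment effects, sum to zero
reps = 20_000

ms_t, ms_e = np.empty(reps), np.empty(reps)
for k in range(reps):
    y = tau[:, None] + rng.normal(0.0, sigma, size=(a, n))   # mu = 0
    ybar_i = y.mean(axis=1)
    ms_t[k] = n * np.sum((ybar_i - y.mean()) ** 2) / (a - 1)
    ms_e[k] = np.sum((y - ybar_i[:, None]) ** 2) / (a * n - a)

print(ms_t.mean(), sigma**2 + n * np.sum(tau**2) / (a - 1))  # both ~6.0
print(ms_e.mean(), sigma**2)                                 # both ~1.0
```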


    Supplemental Material for Chapter 5

S5.1. A Simple Alternative to Runs Rules on the $\bar{x}$ Chart

It is well-known that while Shewhart control charts detect large shifts quickly, they are relatively insensitive to small or moderately-sized process shifts. Various sensitizing rules (sometimes called runs rules) have been proposed to enhance the effectiveness of the chart in detecting small shifts. Of these rules, the Western Electric rules are among the most popular. The Western Electric rules are of the "r out of m" form; that is, if r out of the last m consecutive points exceed some limit, an out-of-control signal is generated.

In a very fundamental paper, Champ and Woodall (1987) point out that the use of these sensitizing rules does indeed increase chart sensitivity, but at the expense of (sometimes greatly) increasing the rate of false alarms, hence decreasing the in-control ARL. Generally, I do not think that the sensitizing rules should be used routinely on a control chart, particularly once the process has been brought into a state of control. They do have some application in the establishment of control limits (Phase 1 of control chart usage) and in trying to bring an unruly process into control, but even then they need to be used carefully to avoid false alarms.

Obviously, Cusum and EWMA control charts provide an effective alternative to Shewhart control charts for the problem of small shifts. However, Klein (2000) has proposed another solution. His solution is simple but elegant: use an r out of m consecutive point rule, but apply the rule to a single control limit rather than to a set of interior warning-type limits. He analyzes the following two rules:

1. If two consecutive points exceed a control limit, the process is out of control. The width of the control limits should be $\pm 1.78\sigma$.

2. If two out of three consecutive points exceed a control limit, the process is out of control. The width of the control limits should be $\pm 1.93\sigma$.

These rules would be applied to one side of the chart at a time, just as we do with the Western Electric rules.

Klein (2000) presents the ARL performance of these rules for the $\bar{x}$ chart, using actual control limit widths of $\pm 1.7814\sigma$ and $\pm 1.9307\sigma$, as these choices make the in-control ARL exactly equal to 370, the value associated with the usual three-sigma limits on the Shewhart chart. The table shown below is adapted from his results. Notice that Klein's procedure greatly improves the ability of the Shewhart $\bar{x}$ chart to detect small shifts. The improvement is not as much as can be obtained with an EWMA or a Cusum, but it is substantial, and considering the simplicity of Klein's procedure, it should be more widely used in practice.

| Shift in process mean (standard deviation units) | ARL for the Shewhart $\bar{x}$ chart with three-sigma control limits | ARL for the Shewhart $\bar{x}$ chart with $\pm 1.7814\sigma$ control limits | ARL for the Shewhart $\bar{x}$ chart with $\pm 1.9307\sigma$ control limits |
|---|---|---|---|
| 0   | 370 | 350 | 370 |
| 0.2 | 308 | 277 | 271 |
| 0.4 | 200 | 150 | 142 |
| 0.6 | 120 | 79  | 73  |
| 0.8 | 72  | 44  | 40  |
| 1   | 44  | 26  | 23  |
| 2   | 6.3 | 4.6 | 4.3 |
| 3   | 2   | 2.4 | 2.4 |
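Klein's 2-of-2 rule is simple enough to evaluate by direct simulation. The sketch below (an addition; plain numpy, with run lengths averaged over a modest number of replicates) estimates the ARL of the rule at the $1.7814\sigma$ limits for an in-control process and for a one-sigma shift; the estimates should land near the corresponding table entries.

```python
# Simulation sketch (an addition; assumes numpy): estimate the ARL of Klein's
# 2-of-2 rule with 1.7814-sigma limits by averaging simulated run lengths.
import numpy as np

rng = np.random.default_rng(5)

def run_length(shift, limit=1.7814, max_n=100_000):
    above = below = 0
    for i in range(1, max_n + 1):
        z = rng.normal() + shift
        above = above + 1 if z > limit else 0     # consecutive points above UCL
        below = below + 1 if z < -limit else 0    # consecutive points below LCL
        if above == 2 or below == 2:
            return i
    return max_n

for shift in (0.0, 1.0):
    arls = [run_length(shift) for _ in range(2_000)]
    print(shift, np.mean(arls))   # near the tabled in-control and 1-sigma ARLs
```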


    Supplemental Material for Chapter 6

S6.1. $s^2$ Is Not Always an Unbiased Estimator of $\sigma^2$

An important property of the sample variance is that it is an unbiased estimator of the population variance, as demonstrated in Section S4.3 of the Supplemental Text Material. However, this unbiased property depends on the assumption that the sample data have been drawn from a stable process; that is, a process that is in statistical control. In statistical quality control work we sometimes make this assumption, but if it is incorrect, it can have serious consequences on the estimates of the process parameters we obtain.

To illustrate, suppose that in the sequence of individual observations

$$x_1, x_2, \ldots, x_t, x_{t+1}, \ldots, x_m$$

the process is in control with mean $\mu_0$ and standard deviation $\sigma$ for the first t observations, but between $x_t$ and $x_{t+1}$ an assignable cause occurs that results in a sustained shift in the process mean to a new level $\mu = \mu_0 + \delta\sigma$, and the mean remains at this new level for the remaining sample observations $x_{t+1}, \ldots, x_m$. Under these conditions, Woodall and Montgomery (2000-01) show that

$$E(s^2) = \sigma^2 + \frac{t(m-t)}{m(m-1)}\,(\delta\sigma)^2 \tag{S6.1}$$

In fact, this result holds for any case in which the mean of t of the observations is $\mu_0$ and the mean of the remaining observations is $\mu_0 + \delta\sigma$, since the order of the observations is not relevant in computing $s^2$. Note that $s^2$ is biased upwards; that is, $s^2$ tends to overestimate $\sigma^2$. Furthermore, the extent of the bias depends on the magnitude of the shift in the mean ($\delta\sigma$), the time period following which the shift occurs (t), and the number of available observations (m). For example, if there are m = 25 observations and the process mean shifts from $\mu_0$ to $\mu_0 + \sigma$ (that is, $\delta = 1$) between the 20th and the 21st observations (t = 20), then $s^2$ will overestimate $\sigma^2$ by 16.7% on average. If the shift in the mean occurs earlier, say between the 10th and 11th observations, then $s^2$ will overestimate $\sigma^2$ by 25% on average.
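Equation (S6.1) is easy to verify by simulation. The sketch below (an addition; plain numpy) uses the m = 25, t = 20, $\delta = 1$ case discussed above, for which the expected inflation factor is $1 + (20)(5)/[(25)(24)] \approx 1.167$.

```python
# Simulation sketch (an addition; assumes numpy) of Equation (S6.1) for the
# m = 25, t = 20, delta = 1 case: the expected inflation is
# 1 + (20)(5)/[(25)(24)] = 1.167, i.e., about 16.7 percent.
import numpy as np

rng = np.random.default_rng(9)
m, t, delta, sigma = 25, 20, 1.0, 1.0

x = rng.normal(0.0, sigma, size=(200_000, m))
x[:, t:] += delta * sigma                 # sustained shift after observation t
s2 = x.var(axis=1, ddof=1)

expected = sigma**2 * (1 + t * (m - t) * delta**2 / (m * (m - 1)))
print(s2.mean(), expected)                # both approximately 1.167
```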

The proof of Equation (S6.1) is straightforward. Since we can write

$$s^2 = \frac{1}{m-1}\left[\sum_{i=1}^{m} x_i^2 - m\bar{x}^2\right]$$

then

$$E(s^2) = \frac{1}{m-1}\, E\left[\sum_{i=1}^{m} x_i^2 - m\bar{x}^2\right] = \frac{1}{m-1}\left[\sum_{i=1}^{m} E(x_i^2) - m\,E(\bar{x}^2)\right]$$

Now

$$\sum_{i=1}^{m} E(x_i^2) = \sum_{i=1}^{t} E(x_i^2) + \sum_{i=t+1}^{m} E(x_i^2) = t(\mu_0^2 + \sigma^2) + (m-t)\left[(\mu_0 + \delta\sigma)^2 + \sigma^2\right] = m\sigma^2 + t\mu_0^2 + (m-t)(\mu_0 + \delta\sigma)^2$$

and, since $\bar{x}$ has mean $\mu_0 + \frac{m-t}{m}\,\delta\sigma$ and variance $\sigma^2/m$,

$$m\,E(\bar{x}^2) = m\left[\frac{\sigma^2}{m} + \left(\mu_0 + \frac{m-t}{m}\,\delta\sigma\right)^2\right] = \sigma^2 + m\left(\mu_0 + \frac{m-t}{m}\,\delta\sigma\right)^2$$

Therefore

$$E(s^2) = \frac{1}{m-1}\left[m\sigma^2 + t\mu_0^2 + (m-t)(\mu_0 + \delta\sigma)^2 - \sigma^2 - m\left(\mu_0 + \frac{m-t}{m}\,\delta\sigma\right)^2\right]$$

Expanding the squares, the terms involving $\mu_0^2$ and $\mu_0\delta\sigma$ cancel, leaving

$$E(s^2) = \frac{1}{m-1}\left[(m-1)\sigma^2 + (m-t)(\delta\sigma)^2 - \frac{(m-t)^2}{m}(\delta\sigma)^2\right] = \sigma^2 + \frac{t(m-t)}{m(m-1)}\,(\delta\sigma)^2$$

S6.2. Should We Use $d_2$ or $d_2^*$ in Estimating $\sigma$ via the Range Method?

    In the textbook, we make use of the range method for estimation of the process standard deviation, particularlyin constructing variables control charts (for example, see the and x R charts of Chapter 5). We use the

    estimator 2/ R d . Sometimes an alternative estimator, *2/ R d , is encountered. In this section we discuss thenature and potential uses of these two estimators. Much of this discussion is adapted from Woodall andMontgomery (2000-01). The original work on using ranges to estimate the standard deviation of a normaldistribution is due to Tippett (1925). See also the paper by Duncan (1955).

    Suppose one has m independent samples, each of size n, from one or more populations assumed to be normallydistributed with standard deviation . We denote the sample ranges of the m samples or subgroups as R R R

    m1 2, , , . Note that this type of data arises frequently in statistical process control applications and gaugerepeatability and reproducibility (R & R) studies (refer to Chapter 8). It is well-known that E ( Ri) = d2 andVar ( Ri)=d 32 2 for i m1 2, , , where d 2 and d 3 are constants that depend on the sample size n. Values ofthese constants are tabled in virtually all textbooks and training materials on statistical process control. See,for example Appendix table VI for values of d 2 and d 3 for n = 2 to 25.

There are two estimators of the process standard deviation based on the average sample range

R̄ = (Σ_{i=1}^{m} R_i)/m,   (S6.2)

that are commonly encountered in practice. The estimator

σ̂₁ = R̄/d₂   (S6.3)

is widely used after the application of control charts to estimate process variability and to assess process capability. In Chapter 4 we report the relative efficiency of the range estimator given in Equation (S6.3) relative to the sample standard deviation for various sample sizes. For example, if n = 5, the relative efficiency of the range estimator compared to the sample standard deviation is 0.955. Consequently, there is little practical difference between the two estimators. Equation (S6.3) is also frequently used to determine the usual 3-sigma limits on the Shewhart x̄ chart in statistical process control. The estimator

σ̂₂ = R̄/d₂*   (S6.4)

is more often used in gauge R & R studies and in variables acceptance sampling. Here d₂* represents a constant whose value depends on both m and n. See Chrysler, Ford, GM (1995), Military Standard 414 (1957), and Duncan (1986).

Patnaik (1950) showed that R̄/σ is distributed approximately as a multiple of a χ-distribution. In particular, R̄/σ is distributed approximately as d₂* χ_ν/√ν, where ν represents the fractional degrees of freedom for the χ-distribution. Patnaik (1950) used the approximation

d₂* ≈ d₂ [1 + 1/(4ν) + 1/(32ν²) − 5/(128ν³)].   (S6.5)

It has been pointed out by Duncan (1986), Wheeler (1995), and Luko (1996), among others, that σ̂₁ is an unbiased estimator of σ and that σ̂₂² is an unbiased estimator of σ². For σ̂₂² to be an unbiased estimator of σ², however, David (1951) showed that no approximation for d₂* is required. He showed that

d₂* = d₂ [1 + V_n/(m d₂²)]^{1/2},   (S6.6)

where V_n is the variance of the sample range for samples of size n from a normal population with unit variance. It is important to note that V_n = d₃², so Equation (S6.6) can easily be used to determine values of d₂* from the widely available tables of d₂ and d₃. Thus, a table of d₂* values, such as the ones given by Duncan (1986), Wheeler (1995), and many others, is not required so long as values of d₂ and d₃ are tabled, as they usually are (once again, see Appendix Table VI). Also, use of the approximation

d₂* ≈ d₂ [1 + 1/(4ν)]


    given by Duncan (1986) and Wheeler (1995) becomes unnecessary.

The table of d₂* values given by Duncan (1986) is the most frequently recommended. If a table is required, the ones by Nelson (1975) and Luko (1996) provide values of d₂* that are slightly more accurate, since their values are based on Equation (S6.6).

It has been noted that as m increases, d₂* approaches d₂. This has frequently been argued by noting that ν increases as m increases. The fact that d₂* approaches d₂ as m increases is more easily seen, however, from Equation (S6.6), as pointed out by Luko (1996).
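The following short Python sketch (an illustration, not part of the original text) computes d₂* exactly from Equation (S6.6), using the standard tabled constants d₂ and d₃ for a few subgroup sizes:

```python
# d2* computed from Equation (S6.6): d2* = sqrt(d2^2 + d3^2/m),
# using standard tabled values of d2 and d3 (see Appendix Table VI).
import math

d2 = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326}
d3 = {2: 0.853, 3: 0.888, 4: 0.880, 5: 0.864}

def d2_star(m, n):
    return math.sqrt(d2[n] ** 2 + d3[n] ** 2 / m)

for n in (2, 5):
    print(n, [round(d2_star(m, n), 4) for m in (1, 5, 20, 100)])
```

As the output makes clear, d₂*(m, n) decreases toward d₂ as the number of subgroups m grows, exactly as noted above.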

Sometimes use of Equation (S6.4) is recommended without any explanation. See, for example, the AIAG measurement systems capability guidelines [Chrysler, Ford, and GM (1995)]. The choice between σ̂₁ and σ̂₂ has often not been explained clearly in the literature. It is frequently stated that the use of Equation (S6.3) requires that R̄ be obtained from a fairly large number of individual ranges. See, for example, Bissell (1994, p. 289). Grant and Leavenworth (1996, p. 128) state that "Strictly speaking, the validity of the exact value of the d₂ factor assumes that the ranges have been averaged for a fair number of subgroups, say, 20 or more. When only a few subgroups are available, a better estimate of σ is obtained using a factor that writers on statistics have designated as d₂*." Nelson (1975) writes that if fewer than a large number of subgroups are used, Equation (S6.3) gives an estimate of σ that does not have the same expected value as the standard deviation estimator. In fact, Equation (S6.3) produces an unbiased estimator of σ regardless of the number of samples m, whereas the pooled standard deviation does not (refer to Section S4.5 of the Supplemental Text Material). The choice between σ̂₁ and σ̂₂ depends upon whether one is interested in obtaining an unbiased estimator of σ or of σ². As m increases, the two estimators (S6.3) and (S6.4) become equivalent, since each is a consistent estimator of σ.

It is interesting to note that among all estimators of the form cR̄ (c > 0), the one minimizing the mean squared error in estimating σ has

c = d₂/(d₂*)².

The derivation of this result is given in the proofs at the end of this section. If we let

σ̂₃ = [d₂/(d₂*)²] R̄

then it is shown in the proofs below that

MSE(σ̂₃) = σ² [1 − d₂²/(d₂*)²].

Luko (1996) compared the mean squared error of σ̂₂ in estimating σ to that of σ̂₁ and recommended σ̂₂ on the basis of uniformly lower MSE values. By definition, σ̂₃ leads to a further reduction in MSE.

It is shown in the proofs at the end of this section that the percentage reduction in MSE from using σ̂₃ instead of σ̂₂ is

50 [(d₂* − d₂)/d₂*].

Values of the percentage reduction are given in Table S6.1. Notice that when both the number of subgroups and the subgroup size are small, a moderate reduction in mean squared error can be obtained by using σ̂₃.

Table S6.1. Percentage Reduction in Mean Squared Error from using σ̂₃ instead of σ̂₂

Subgroup                            Number of Subgroups, m
Size, n        1        2        3        4        5        7       10       15       20
   2      10.1191   5.9077   4.1769   3.2314   2.6352   1.9251   1.3711   0.9267   0.6998
   3       5.7269   3.1238   2.1485   1.6374   1.3228   0.9556   0.6747   0.4528   0.3408
   4       4.0231   2.1379   1.4560   1.1040   0.8890   0.6399   0.4505   0.3017   0.2268
   5       3.1291   1.6403   1.1116   0.8407   0.6759   0.4856   0.3414   0.2284   0.1716
   6       2.5846   1.3437   0.9079   0.6856   0.5507   0.3952   0.2776   0.1856   0.1394
   7       2.2160   1.1457   0.7726   0.5828   0.4679   0.3355   0.2356   0.1574   0.1182
   8       1.9532   1.0058   0.6773   0.5106   0.4097   0.2937   0.2061   0.1377   0.1034
   9       1.7536   0.9003   0.6056   0.4563   0.3660   0.2623   0.1840   0.1229   0.0923
  10       1.5963   0.8176   0.5495   0.4138   0.3319   0.2377   0.1668   0.1114   0.0836
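The entries in Table S6.1 can be reproduced from the two formulas above. A minimal sketch (assuming the rounded three-decimal tabled values of d₂ and d₃, so the results agree with the table to about two decimal places):

```python
# Percentage reduction in MSE, 50*(d2* - d2)/d2*, with d2* from Equation (S6.6).
import math

d2 = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326}
d3 = {2: 0.853, 3: 0.888, 4: 0.880, 5: 0.864}

def pct_reduction(m, n):
    d2s = math.sqrt(d2[n] ** 2 + d3[n] ** 2 / m)
    return 50.0 * (d2s - d2[n]) / d2s

print(round(pct_reduction(1, 2), 4))   # approx 10.12 (Table S6.1 gives 10.1191)
print(round(pct_reduction(5, 5), 4))   # approx 0.68  (Table S6.1 gives 0.6759)
```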

Proofs

Result 1: Let σ̂ = cR̄. Then MSE(σ̂) = σ² [c²(d₂*)² − 2cd₂ + 1].

Proof:

MSE(σ̂) = E[(cR̄ − σ)²]
        = E[c²R̄² − 2cσR̄ + σ²]
        = c² E(R̄²) − 2cσ E(R̄) + σ²

Now E(R̄²) = Var(R̄) + [E(R̄)]² = d₃²σ²/m + d₂²σ² = (d₂*)²σ². Thus


MSE(σ̂) = c²(d₃²σ²/m + d₂²σ²) − 2cd₂σ² + σ²
        = σ² [c²(d₃²/m + d₂²) − 2cd₂ + 1]
        = σ² [c²(d₂*)² − 2cd₂ + 1]

Result 2: The value of c that minimizes the mean squared error of estimators of the form cR̄ in estimating σ is c = d₂/(d₂*)².

Proof:

MSE(σ̂) = σ² [c²(d₂*)² − 2cd₂ + 1]

dMSE(σ̂)/dc = σ² [2c(d₂*)² − 2d₂] = 0

so that c = d₂/(d₂*)².

Result 3: The mean square error of σ̂₃ = [d₂/(d₂*)²] R̄ is σ² [1 − d₂²/(d₂*)²].

Proof:

MSE(σ̂₃) = σ² {[d₂/(d₂*)²]² (d₂*)² − 2[d₂/(d₂*)²] d₂ + 1}   (from Result 1)
         = σ² [d₂²/(d₂*)² − 2d₂²/(d₂*)² + 1]
         = σ² [1 − d₂²/(d₂*)²]

Note that MSE(σ̂₃) → 0 as n → ∞ and as m → ∞.

Result 4: Let σ̂₂ = R̄/d₂* and σ̂₃ = [d₂/(d₂*)²] R̄. Then

100 [MSE(σ̂₂) − MSE(σ̂₃)]/MSE(σ̂₂),

the percent reduction in mean square error using the minimum mean square error estimator instead of R̄/d₂* [as recommended by Luko (1996)], is

50 [(d₂* − d₂)/d₂*].

    Proof:


Luko (1996) shows that MSE(σ̂₂) = 2σ²(d₂* − d₂)/d₂*. Therefore

MSE(σ̂₂) − MSE(σ̂₃) = 2σ²(d₂* − d₂)/d₂* − σ² [1 − d₂²/(d₂*)²]
                    = σ² {2(d₂* − d₂)/d₂* − [(d₂*)² − d₂²]/(d₂*)²}
                    = σ² [2d₂*(d₂* − d₂) − (d₂* − d₂)(d₂* + d₂)]/(d₂*)²
                    = σ² (d₂* − d₂)(2d₂* − d₂* − d₂)/(d₂*)²
                    = σ² (d₂* − d₂)²/(d₂*)²

Consequently

100 [MSE(σ̂₂) − MSE(σ̂₃)]/MSE(σ̂₂) = 100 [σ²(d₂* − d₂)²/(d₂*)²] / [2σ²(d₂* − d₂)/d₂*]
                                  = 100 (d₂* − d₂)/(2d₂*)
                                  = 50 [(d₂* − d₂)/d₂*].
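These results are also easy to check by simulation. The following sketch (an illustration only) estimates the MSEs of σ̂₂ and σ̂₃ for n = 2 and m = 1, where Table S6.1 predicts a reduction of about 10.1%:

```python
# Monte Carlo comparison of MSE(sigma2_hat) and MSE(sigma3_hat) for n = 2, m = 1.
import math
import numpy as np

rng = np.random.default_rng(7)
n, m, sigma, reps = 2, 1, 1.0, 400_000
d2, d3 = 1.128, 0.853
d2s = math.sqrt(d2 ** 2 + d3 ** 2 / m)                # d2* from Equation (S6.6)

x = rng.normal(0.0, sigma, size=(reps, m, n))
rbar = (x.max(axis=2) - x.min(axis=2)).mean(axis=1)   # average range, Rbar

mse2 = np.mean((rbar / d2s - sigma) ** 2)             # sigma2_hat = Rbar/d2*
mse3 = np.mean((d2 / d2s ** 2 * rbar - sigma) ** 2)   # sigma3_hat = (d2/d2*^2)Rbar
print(100 * (mse2 - mse3) / mse2)                     # approximately 10.1
```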

S6.3. Determining When the Process has Shifted

Control charts monitor a process to determine whether an assignable cause has occurred. Knowing when the assignable cause occurred would be very helpful in its identification and eventual removal. Unfortunately, the time of occurrence of the assignable cause does not always coincide with the control chart signal. In fact, given what is known about the average run length performance of control charts, it is actually very unlikely that the assignable cause occurs at the time of the signal. Therefore, when a signal occurs, the control chart analyst should look earlier in the process history to determine the assignable cause.

But where should we start? The Cusum control chart provides some guidance: simply search backwards on the Cusum status chart to find the point in time where the Cusum last crossed zero (refer to Chapter 9). However, the Shewhart x̄ control chart provides no such simple guidance. Samuel, Pignatiello, and Calvin (1998) use some theoretical results by Hinkley (1970) on change-point problems to suggest a procedure for determining the time of a shift in the process mean following a signal on the Shewhart x̄ control chart. They assume the standard x̄ control chart with in-control value of the process mean μ₀. Suppose that the chart signals at subgroup T, with subgroup average x̄_T. Now the in-control subgroup averages are x̄₁, x̄₂, …, x̄_t, and the out-of-control subgroup averages are x̄_{t+1}, x̄_{t+2}, …, x̄_T, where obviously t < T. Their procedure consists of finding the value of t in the range 0 ≤ t < T that maximizes

C_t = (T − t)(x̄_{T,t} − μ₀)²

where x̄_{T,t} denotes the average of the T − t most recent subgroup averages x̄_{t+1}, …, x̄_T.
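A minimal sketch of this change-point estimator (the function and variable names are hypothetical; only the maximization of C_t comes from Samuel, Pignatiello, and Calvin (1998)):

```python
# After a signal at subgroup T, estimate the last in-control subgroup t by
# maximizing C_t = (T - t)*(mean of the last T - t subgroup averages - mu0)^2.
import numpy as np

def estimate_change_point(xbars, mu0):
    """xbars: the subgroup averages x1bar, ..., xTbar up to the signal."""
    T = len(xbars)
    xbars = np.asarray(xbars, dtype=float)
    C = [(T - t) * (xbars[t:].mean() - mu0) ** 2 for t in range(T)]
    return int(np.argmax(C))     # estimated number of in-control subgroups, t

# Example: the mean shifts from 0 to 1 after subgroup 20; the chart signals at T = 28.
rng = np.random.default_rng(3)
xbars = np.r_[rng.normal(0.0, 0.45, 20), rng.normal(1.0, 0.45, 8)]
print(estimate_change_point(xbars, mu0=0.0))   # typically close to 20
```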


S6.6. The Mean Square Successive Difference as an Estimator of σ²

An alternative to the moving range estimator of the process standard deviation is the mean square successive difference as an estimator of σ². The mean square successive difference is defined as

MSSD = (1/(2(n − 1))) Σ_{i=1}^{n−1} (x_{i+1} − x_i)²

It is easy to show that the MSSD is an unbiased estimator of σ². Let x₁, x₂, …, x_n be a random sample of size n from a population with mean μ and variance σ². Without any loss of generality, we may take the mean to be zero. Then

E(MSSD) = (1/(2(n − 1))) E[Σ_{i=1}^{n−1} (x_{i+1} − x_i)²]
        = (1/(2(n − 1))) Σ_{i=1}^{n−1} E(x_{i+1}² − 2 x_{i+1} x_i + x_i²)
        = (1/(2(n − 1))) [(n − 1)σ² + (n − 1)σ²]
        = 2(n − 1)σ²/(2(n − 1))
        = σ²

since the cross-product terms have expectation zero (the observations are independent with mean zero).

    Therefore, the mean square successive difference is an unbiased estimator of the population variance.
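A brief numerical illustration (not part of the original text): for in-control data, the MSSD and the usual sample variance estimate the same quantity.

```python
# The MSSD estimator of sigma^2 next to the usual sample variance.
import numpy as np

def mssd(x):
    """Mean square successive difference: sum of (x[i+1]-x[i])^2 over 2(n-1)."""
    x = np.asarray(x, dtype=float)
    return np.sum(np.diff(x) ** 2) / (2.0 * (len(x) - 1))

rng = np.random.default_rng(5)
x = rng.normal(10.0, 2.0, size=100_000)
print(mssd(x), x.var(ddof=1))    # both approximately 4.0
```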


    Supplemental Material for Chapter 8

    S8.1. Fixed Versus Random Factors in the Analysis of Variance

In Chapter 4, we present the standard analysis of variance (ANOVA) for a single-factor experiment, assuming that the factor is a fixed factor. By a fixed factor, we mean that all levels of the factor of interest were studied in the experiment. Sometimes the levels of a factor are selected at random from a large (theoretically infinite) population of factor levels. This leads to a random effects ANOVA model.

In the single-factor case, there are only modest differences between the fixed and random models. The model for a random effects experiment is still written as

y_ij = μ + τ_i + ε_ij

but now the treatment effects τ_i are random variables, because the treatment levels actually used in the experiment have been chosen at random. The population of treatments is assumed to be normally and independently distributed with mean zero and variance σ_τ². Note that the variance of an observation is

V(y_ij) = V(μ + τ_i + ε_ij) = σ_τ² + σ²

We often call σ_τ² and σ² variance components, and the random model is sometimes called the components of variance model. All of the computations in the random model are the same as in the fixed effects model, but since we are studying an entire population of treatments, it doesn't make much sense to formulate hypotheses about the individual factor levels selected in the experiment. Instead, we test the following hypotheses about the variance of the treatment effects:

H₀: σ_τ² = 0
H₁: σ_τ² > 0

The test statistic for these hypotheses is the usual F-ratio, F = MS_Treatments/MS_E. If the null hypothesis is not rejected, there is no evidence of variability in the population of treatments, while if the null hypothesis is rejected, there is significant variability among the treatments in the entire population that was sampled. Notice that the conclusions of the ANOVA extend to the entire population of treatments.

The expected mean squares in the random model are different from their fixed effects model counterparts. It can be shown that

E(MS_Treatments) = σ² + n σ_τ²
E(MS_E) = σ²

Frequently, the objective of an experiment involving random factors is to estimate the variance components. A logical way to do this is to equate the expected values of the mean squares to their observed values and solve the resulting equations. This leads to

σ̂_τ² = (MS_Treatments − MS_E)/n
σ̂² = MS_E

A typical application of experiments where some of the factors are random is in a measurement systems capability study, as discussed in Chapter 8. The model used there is a factorial model, so the analysis and the expected mean squares are somewhat more complicated than in the single-factor model considered here.
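A minimal sketch of these calculations for a balanced single-factor random effects experiment (an illustration; the data and parameter values are hypothetical):

```python
# Variance component estimates from a balanced one-way random effects ANOVA:
# sigma_tau^2_hat = (MS_Treatments - MS_E)/n and sigma^2_hat = MS_E.
import numpy as np

def variance_components(y):
    """y: an a x n array; rows are the randomly selected factor levels."""
    a, n = y.shape
    ms_treat = n * y.mean(axis=1).var(ddof=1)     # MS_Treatments
    ms_error = y.var(axis=1, ddof=1).mean()       # MS_E (pooled within-level)
    return (ms_treat - ms_error) / n, ms_error

rng = np.random.default_rng(11)
tau = rng.normal(0.0, 2.0, size=30)                       # sigma_tau^2 = 4
y = tau[:, None] + rng.normal(0.0, 1.0, size=(30, 8))     # sigma^2 = 1
print(variance_components(y))                             # approx (4.0, 1.0)
```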


    S8.2. Analysis of Variance Methods for Measurement Systems Capability Studies

In Chapter 8 an analysis of variance model approach to measurement systems studies is presented. This method replaces the tabular approach that was presented along with the ANOVA method in earlier editions of the book. The tabular approach is a relatively simple method, but it is not the most general or efficient approach to conducting gauge studies. Gauge and measurement systems studies are designed experiments, and often we find that the gauge study must be conducted using an experimental design that does not fit nicely into the tabular analysis scheme. For example, suppose that the operators used with each instrument (or gauge) are different because the instruments are in different physical locations. Then operators are nested within instruments, and the experiment has been conducted as a nested design.

As another example, suppose that the operators are not selected at random, because the specific operators used in the study are the only ones that actually perform the measurements. This is a mixed model experiment, and the random effects approach that the tabular method is based on is inappropriate. The random effects model analysis of variance approach in the text is also inappropriate for this situation. Dolezal, Burdick, and Birch (1998), Montgomery (2001), and Burdick, Borror, and Montgomery (2003) discuss the mixed model analysis of variance for gauge R & R studies.

The tabular approach does not lend itself to constructing confidence intervals on the variance components or on functions of the variance components of interest. For that reason we do not recommend the tabular approach for general use. There are three general approaches to constructing these confidence intervals: (1) the Satterthwaite method, (2) the maximum likelihood large-sample method, and (3) the modified large sample method. Montgomery (2001) gives an overview of these different methods. Of the three approaches, there is good evidence that the modified large sample approach is the best in the sense that it produces confidence intervals that are closest to the stated level of confidence.

Hamada and Weerahandi (2000) show how generalized inference can be applied to the problem of determining confidence intervals in measurement systems capability studies. The technique is somewhat more involved than the three methods referenced above. Either numerical integration or simulation must be used to find the desired confidence intervals. Burdick, Borror, and Montgomery (2003) discuss this technique.

While the tabular method should be abandoned, the control charting aspect of measurement systems capability studies should be used more consistently. All too often a measurement study is conducted and analyzed via some computer program without adequate graphical analysis of the data. Furthermore, some of the advice in various quality standards and reference sources regarding these studies is just not very good and can produce results of questionable validity. The most reliable measure of gauge capability is the probability that parts are misclassified.


    Supplemental Material for Chapter 9

S9.1. The Markov Chain Approach to Finding the ARLs for Cusum and EWMA Control Charts

When the observations drawn from the process are independent, average run lengths or ARLs are easy to determine for Shewhart control charts because the points plotted on the chart are independent. The distribution of run length is geometric, so the ARL of the chart is just the mean of the geometric distribution, or 1/p, where p is the probability that a single point plots outside the control limits.

The sequence of plotted points on Cusum and EWMA charts is not independent, so another approach must be used to find the ARLs. The Markov chain approach developed by Brook and Evans (1972) is very widely used. We give a brief discussion of this procedure for a one-sided Cusum.

The Cusum control chart statistics C⁺ (or C⁻) form a Markov process with a continuous state space. By discretizing the continuous random variable C⁺ (or C⁻) with a finite set of values, approximate ARLs can be obtained from Markov chain theory. For the upper one-sided Cusum with upper decision interval H, the intervals are defined as follows:

(−∞, w/2], [w/2, 3w/2], …, [(j − 1/2)w, (j + 1/2)w], …, [(m − 3/2)w, H], [H, ∞)

where m + 1 is the number of states and w = 2H/(2m − 1). The elements of the transition probability matrix P = [p_ij] of the Markov chain are

p_i0 = ∫_{−∞}^{w/2} f(x − iw + k) dx,   i = 0, 1, …, m − 1

p_ij = ∫_{(j−1/2)w}^{(j+1/2)w} f(x − iw + k) dx,   i = 0, 1, …, m − 1;  j = 1, 2, …, m − 1

p_im = ∫_{H}^{∞} f(x − iw + k) dx,   i = 0, 1, …, m − 1

p_mj = 0,   j = 0, 1, …, m − 1

p_mm = 1

The absorbing state is m, and f denotes the probability density function of the variable that is being monitored with the Cusum.

From the theory of Markov chains, the expected first passage times from state i to the absorbing state satisfy

μ_i = 1 + Σ_{j=0}^{m−1} p_ij μ_j,   i = 0, 1, …, m − 1

Thus, μ_i is the ARL given that the process started in state i. Let Q be the matrix of transition probabilities obtained by deleting the last row and column of P. Then the vector of ARLs μ = (μ₀, μ₁, …, μ_{m−1})′ is found by computing

μ = (I − Q)⁻¹ 1

where 1 is an m × 1 vector of 1s and I is the m × m identity matrix.

When the process is out of control, this procedure gives a vector of initial-state (or zero-state) ARLs. That is, the process shifts out of control at the initial start-up of the control chart. It is also possible to calculate steady-state ARLs that describe performance assuming that the process shifts out of control after the control chart has been operating for a long period of time. There is typically very little difference between initial-state and steady-state ARLs.
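The following sketch implements the Brook and Evans procedure for an upper one-sided Cusum with normal observations (an illustration; the state count m = 100 and the test values of k and h are choices, not prescriptions):

```python
# Brook-Evans Markov chain approximation to the zero-state ARL of an upper
# one-sided Cusum with reference value k and decision interval h (sigma units).
import numpy as np
from scipy.stats import norm

def cusum_arl(k, h, mu=0.0, m=100):
    w = 2.0 * h / (2 * m - 1)              # interval width; states at 0, w, 2w, ...
    centers = w * np.arange(m)
    Q = np.zeros((m, m))                   # transient transition probabilities
    for i, c in enumerate(centers):
        # next value is max(0, c + x - k); the first interval (-inf, w/2]
        # collects the resets to zero
        upper = centers + w / 2 - c + k    # upper limits for x in each interval
        Q[i, :] = np.diff(np.r_[0.0, norm.cdf(upper, loc=mu)])
    return np.linalg.solve(np.eye(m) - Q, np.ones(m))[0]   # start in state 0

print(cusum_arl(k=0.5, h=4.0, mu=0.0))   # approximately 336 (in control)
print(cusum_arl(k=0.5, h=4.0, mu=1.0))   # approximately 8.4 (1-sigma shift)
```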


Let P(n, i) be the probability that the run length takes on the value n, given that the chart started in state i. Collect these quantities into a vector, say

p_n = [P(n, 0), P(n, 1), …, P(n, m − 1)]′

for n = 1, 2, …. These probabilities can be calculated by solving the following equations:

p₁ = (I − Q)1

p_n = Q p_{n−1},   n = 2, 3, …

This technique can be used to calculate the probability distribution of the run length, given that the control chart started in state i. Some authors believe that the distribution of run length or its percentiles is more useful than the ARL, since the distribution of run length is usually highly skewed, and so the ARL may not be a typical value in any meaningful sense.
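Continuing the previous sketch, the run-length distribution follows from the same transient matrix Q (an illustration; Q is assumed to have been built as above):

```python
# Run-length probabilities from p_1 = (I - Q)1 and p_n = Q p_{n-1}.
import numpy as np

def run_length_pmf(Q, n_max):
    m = Q.shape[0]
    p = (np.eye(m) - Q) @ np.ones(m)    # p_1: P(run length = 1 | start state i)
    pmf = [p]
    for _ in range(n_max - 1):
        p = Q @ p                       # p_n = Q p_{n-1}
        pmf.append(p)
    return np.array(pmf)                # pmf[n-1, i] = P(run length = n | state i)

# e.g. pmf = run_length_pmf(Q, 500); pmf[:, 0] is the zero-state run-length
# distribution, and pmf[:, 0] @ np.arange(1, 501) approximates the ARL.
```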

    S9.2. Integral Equations Versus Markov Chains for Finding the ARL

Two methods are commonly used to find the ARLs of control charts: the Markov chain method and an approach that uses integral equations. The Markov chain method is described in Section S9.1 of the Supplemental Text Material. This section gives an overview of the integral equation approach for the Cusum control chart. Some of the notation defined in Section S9.1 will be used here.

Let P(n, u) and R(u) be, respectively, the probability that the run length takes on the value n and the ARL for the Cusum when the procedure begins with initial value u. For the one-sided upper Cusum,

P(1, u) = 1 − ∫_{−∞}^{H} f(x − u + k) dx
        = 1 − ∫_{−∞}^{w/2} f(x − u + k) dx − Σ_{j=1}^{m−1} ∫_{(j−1/2)w}^{(j+1/2)w} f(x − u + k) dx

and

P(n, u) = P(n − 1, 0) ∫_{−∞}^{0} f(x − u + k) dx + ∫_{0}^{H} P(n − 1, y) f(y − u + k) dy
        = P(n − 1, 0) ∫_{−∞}^{0} f(x − u + k) dx + P(n − 1, ξ₀) ∫_{0}^{w/2} f(x − u + k) dx
          + Σ_{j=1}^{m−1} P(n − 1, ξ_j) ∫_{(j−1/2)w}^{(j+1/2)w} f(x − u + k) dx

for n = 2, 3, … and for some ξ₀ ∈ (0, w/2) and ξ_j ∈ [(j − 1/2)w, (j + 1/2)w), j = 1, 2, …, m − 1. If w is small, then ξ_j is approximately the midpoint jw of the jth interval for j = 1, 2, …, m − 1 and P(n − 1, ξ₀) ≈ P(n − 1, 0); considering only the values of P(n, u) for which u = iw results in


P(1, iw) = 1 − Σ_{j=0}^{m−1} p_ij

P(n, iw) = Σ_{j=0}^{m−1} P(n − 1, jw) p_ij,   n = 2, 3, …

But these last equations are just the equations used for calculating the probabilities of first-passage times in a Markov chain. Therefore, the solution to the integral equation approach involves solving equations identical to those used in the Markov chain procedure.

Champ and Rigdon (1991) give an excellent discussion of the Markov chain and integral equation techniques for finding ARLs for both the Cusum and the EWMA control charts. They observe that the Markov chain approach involves obtaining an exact solution to an approximate formulation of the ARL problem, while the integral equation approach involves finding an approximate solution to the exact formulation of the ARL problem. They point out that more accurate solutions can likely be found via the integral equation approach. However, there are problems for which only the Markov chain method will work, such as the case of a drifting mean.


    Supplemental Material for Chapter 10

    S10.1. Difference Control Charts

The difference control chart is briefly mentioned in Chapter 10, and a reference is given to a paper by Grubbs (1946). There are actually two types of difference control charts in the literature. Grubbs compared samples from a current production process to a reference sample. His application was in the context of testing ordnance. The plotted quantity was the difference between the current sample average and the reference sample average. This quantity would be plotted on a control chart with center line at zero and control limits at

±A₂ (R̄₁² + R̄₂²)^{1/2}

where R̄₁ and R̄₂ are the average ranges for the reference samples (1) and the current production samples (2) used to establish the control limits.

The second type of difference control chart was suggested by Ott (1947), who considered the situation where differences are observed between paired measurements within each subgroup (much as in a paired t-test), and the average difference for each subgroup is plotted on the chart. The center line for this chart is zero, and the control limits are at ±A₂R̄, where R̄ is the average of the ranges of the differences. This chart would be useful in instrument calibration, where one measurement on each unit is from a standard instrument (say, in a laboratory) and the other is from an instrument used in different conditions (such as in production).

S10.2. Control Charts for Contrasts

There are many manufacturing processes where process monitoring is important but traditional statistical control charts cannot be effectively used because of rational subgrouping considerations. Examples occur frequently in the chemical and processing industries, stamping, casting and molding operations, and electronics and