curso bosch diseño de experimentos
8/2/2019 Curso Bosch Diseo de experimentos
Edition 08.1993
1993 Robert Bosch GmbH
Table of Contents:
1. System-Analytical Approach
1.1 One-Factor-at-a-Time Method
1.2 Two-Factor Method
1.3 General Case (Numerous Influence Factors)
2. Industrial Experimentation Methodology and System Theory
2.1 Hints on System Analysis
2.2 Short Description of the System-Theoretical Procedure
2.2.1 Global System Matrix (i.e. without quoting any levels)
2.2.2 Local System Consideration
2.2.3 Local System Matrix
2.3 Summary
3. Probability Plot
3.1 Probability Plot of Small-Size Samples
3.2 Probability Paper
4. Comparison of Sample Means
4.1 t Test
4.2 Minimum Sample Size
5. F Test
6. Analysis of Variance (ANOVA)
6.1 Deriving the Test Statistic
6.2 Equality Test of Several Variances (According to Levene)
7. Design of Experiments with Orthogonal Arrays and Evaluating such Experiments
7.1 Representing the Results of Measurement
7.2 Calculating the Effects
7.3 Regression Analysis
7.4 Factorial Designs
7.4.1 Design Matrix
7.4.2 Evaluation Matrix
7.4.3 Confounding
7.4.4 Fractional Factorial Designs
7.5 Designs for Three-Level Factors
7.6 Central Composite Designs
7.7 Screening Designs According to Plackett and Burman
8. Statistical Evaluation Procedures for Factorial Designs
8.1 One-Way Analysis of Variance
8.2 Factorial Analysis of Variance
8.3 Factorial Analysis of Variance with Respect to Variation
8.4 Computer Support
8.4.1 Evaluation of an Experiment using the FKM Program
8.4.2 Evaluation with the Help of SAV Program
9. Hints on Practical Design of Experiments
9.1 Task and Target Formulation
9.2 System Analysis
9.3 Stipulating an Experimental Strategy
9.4 Executing and Documenting an Experiment
10. Shainin Method
11. List of References
12. Tables
Index
Within the framework of quality assurance and for the effective new and further development of Bosch products, careful design of experiments is not only indispensable but is also required by our customers.
In this connection, the commonly used term "Statistical Experimental Design" is not exactly defined, and labels such as "Design of Experiments (DOE)", "Industrial Experimentation Methodology", "Taguchi Method" and "Shainin Method(s)" are often used interchangeably.
This pamphlet is based on a seminar manuscript on Industrielle Versuchsmethodik 1 (Industrial Experimentation Methodology) and is intended to clarify the vital terms and processes of statistical experimental design for the interested user.
1. System-Analytical Approach
Investigation of a system must often begin with the description of a particular system state. A basic requirement that we impose on an experiment is reproducibility, i.e. under defined conditions the result of an experiment must always be the same. Since there cannot be absolute equality ("one cannot step into the same river twice"), the reproducibility of an experiment is a relative term. One can use statistical terms to define the term "reproducibility". The term "self-control" can be interpreted as a generalization of the term "reproducibility". It is also possible to limit oneself to the statement that a (quantitative) result of an experiment must always lie within a specific bandwidth.
Variation in the results of repeated experiments (e.g. process variation) can, in certain situations, be a vital parameter. Under certain circumstances, the standard deviation can serve as a measure of the variation. If one wishes to evaluate the variation quantitatively, one needs a sufficiently large sample size, i.e. sufficiently many repetitions of the experiment (see Chapter 4.2). Similar statements are valid for the mean.
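As a minimal sketch of how the mean and the variation of repeated experiments might be summarized, the following Python snippet computes the mean and the sample standard deviation. The function name and the measurement values are hypothetical and serve only as an illustration; whether the standard deviation is a suitable measure depends on the circumstances mentioned above.

```python
import statistics


def summarize_repetitions(results):
    """Return (mean, sample standard deviation) of repeated results."""
    if len(results) < 2:
        # With a single result the variation cannot be estimated at all.
        raise ValueError("at least two repetitions are needed")
    return statistics.mean(results), statistics.stdev(results)


# Hypothetical results of five repetitions under identical conditions
mean, s = summarize_repetitions([10.1, 9.8, 10.3, 10.0, 9.8])
print(f"mean = {mean:.2f}, standard deviation = {s:.3f}")
```

With only five repetitions the standard deviation is itself a rather uncertain estimate, which is why the question of a sufficient sample size matters.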
1.1 One-Factor-at-a-Time Method
If one wishes to investigate the influence of a factor within a system, one varies this factor but leaves the other factors in the system unchanged. In general, this ensures that other factors, which are not the subject of the investigation, neither falsify the results nor restrict the statements deduced from them. (That is obviously easier said than done.) This approach is convenient and logical, and should be regarded as a fundamental experimental strategy. The only restriction: the nature of the influence depends (possibly very strongly) upon the position of the other factors.
Strangely, numerous textbook authors consider the one-factor-at-a-time method to be inefficient. A practitioner can, nonetheless, confidently ignore these objections. It is evident that one-factor-at-a-time experiments must be carefully designed, executed and evaluated.
Systematic Approach
We differentiate between variable and discrete influence parameters. Before one determines experiments or experimental series, one must think about what type of influence the variable factor has. When preparing one-factor-at-a-time experiments we become acquainted with terms which are later of prime importance when investigating the general n-dimensional case.
a) The simplest type of influence is the linear influence.
Increasing the influence quantity by a fixed amount always brings about the same effect, independent of the chosen levels (see Chapter 7). Many known natural laws of physics or chemistry are linear (examples?).
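Written as a formula, this property reads as follows. This is only a sketch: the symbols $a$, $b$ and $\Delta$ are introduced here for illustration and are not part of the original text.

\[
f(x) = a\,x + b \quad\Longrightarrow\quad f(x + \Delta) - f(x) = a\,\Delta \quad \text{for every level } x .
\]

The effect $a\,\Delta$ depends only on the size of the step $\Delta$, not on the level $x$ at which the step is taken.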
Differential calculus linearizes (nearly) arbitrary functions. However, one should not assume, merely for the sake of simplicity, that a relationship to be investigated can be linearized.
The statement that every problem can be linearized when the difference between the steps of the influence parameters is small enough may be correct, but it is of little practical value, since what is considered "small enough" must then be clarified.
For instance, a temperature difference of 1 °C can be small in many problems, but in other problems this increment may be large.
Extrapolation beyond the investigated region is only permissible if the function is known. The same restriction applies to interpolation.
Because a system generally exhibits significant background noise, erroneous interpretations of experimental results easily occur even when linearity is ensured.
b) A generalization of the linear influence is the monotonic influence (synonyms: tendency, directional factor).
A monotonic influence is present when the input quantity can influence the output quantity in only one direction.
More precisely: a monotonic influence is present when an increase of the input quantity invariably causes either an increase or a decrease of the output quantity.
Monotonic influence parameter: The choice of the steps influences the size of the effect.
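The definition can be illustrated with a small Python sketch that classifies a series of outputs measured at increasing input levels. The function name is hypothetical, and the sketch assumes noise-free values; with real measurements the system noise would have to be taken into account before drawing such a conclusion.

```python
def monotonic_direction(outputs):
    """Classify outputs measured at strictly increasing input levels.

    Returns "increasing", "decreasing" or "non-monotonic".
    Equal neighbouring values count as non-monotonic here; this strict
    reading is an assumption of the sketch, not part of the source text.
    """
    if len(outputs) < 2:
        raise ValueError("at least two levels are needed")
    diffs = [b - a for a, b in zip(outputs, outputs[1:])]
    if all(d > 0 for d in diffs):
        return "increasing"
    if all(d < 0 for d in diffs):
        return "decreasing"
    return "non-monotonic"


# Hypothetical series: the effect between neighbouring steps differs in size,
# but the direction stays the same.
print(monotonic_direction([1.0, 1.5, 3.0, 3.2]))
print(monotonic_direction([1.0, 2.5, 2.0]))
```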
Task 1
When doing a one-factor-at-a-time experiment, one should make sure that factors which are not the object of the investigation neither falsify the results nor restrict the conclusions. Discuss, using concrete cases, how this basic principle can be realized (e.g. by randomization).
Task 2
A glass of water is put inside a freezer and the time required for the water to freeze is recorded.
The initial temperature of the water (between 10 °C and 100 °C) should be determined so that the time interval up to freezing is as long as possible (optimization problem).
How do you investigate the process empirically?
Task 3
a) Assume that a process is definitely linear.
How many supporting points does one need to represent the natural law explicitly?
How should the system noise be considered?
How does one select the supporting points?
b) How can one invalidate linearity empirically?
Task 4
Electron-impact experiment by Franck and Hertz: the anode current flowing to the exciting anode as a function of the anode voltage.
a) How can the process represented in the adjacent figure be investigated empirically?
b) What could a physicist conclude who only performed tests at 5 V, 10 V and 15 V?
Summary:
From the considerations discussed in this chapter, it is clear that even when investigating the influence of a single factor, the given situation determines how many measurements must be made, how many repetitions must be undertaken and where the measuring points have to be placed.
There is therefore no strict recipe for conducting empirical investigations, and it is not appropriate to teach recipes.
Scheme
Target quantity(ies):
Influence variable(s):
Other factors influencing the target quantity (which are nonetheless not the object of the investigation):
How are these quantities taken into account?
Prior knowledge:
Number of the steps:
Reason:
Number of repetitions:
Reason:
Additional points to be considered:
1.2 Two-Factor Method
If one wants to investigate the influence of two factors within a system, phenomena have to be observed that do not arise during one-factor-at-a-time investigations. Since these phenomena are symptomatic of the general n-dimensional case, a thorough investigation is beneficial.
Before one determines an experimental arrangement (including the size of the experiment), what is known and unknown regarding the two factors (and what is then to be investigated empirically) must be systematically established.
At first, a cognitive investigation takes place, i.e. a knowledge-based description of the two-factor system in principle. As in the one-factor-at-a-time method, it is helpful to differentiate between discrete and variable influence parameters.
There are three cases to be differentiated.
Case 1: Both influence factors are discrete
Influence factor A with k levels: $A_1, A_2, \ldots, A_k$
Influence factor B with l levels: $B_1, B_2, \ldots, B_l$
There are $k \cdot l$ system states.
Example:
Target quantity: Yield
Plants: $A_1, A_2$
Pesticides: $B_1, B_2, B_3$
Remark:
It is clear that, in general, one cannot derive other system states $A_i B_j$ from the knowledge of one empirical result.
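For the example above, the $k \cdot l$ system states can be listed with a few lines of Python. This is only an illustrative sketch; the level names are taken from the example, the variable names are chosen freely.

```python
from itertools import product

plants = ["A1", "A2"]            # discrete factor A with k = 2 levels
pesticides = ["B1", "B2", "B3"]  # discrete factor B with l = 3 levels

# Every combination of one plant with one pesticide is a separate system state.
system_states = list(product(plants, pesticides))

for state in system_states:
    print(state)
print("number of system states:", len(system_states))
```

Each of the six states has to be regarded as a system of its own; a result obtained for one state says nothing, in general, about the others.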
Case 2:
One discrete influence factor A: $A_1, \ldots, A_k$
One variable influence factor B.
Example:
System: Solution
Target quantity: Solubility
A: Chemical substance
B: Temperature
It is possible, in principle, to describe the system by k characteristic lines (a family of characteristics). In general, one deals with k different one-factor problems. Is it possible to draw general conclusions from one characteristic line about another?
When the characteristic curves are shifted upwards in parallel (depending on the discrete factor), one speaks of an interaction-free system.
Solubility of several inorganic substances as a function of temperature
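Whether a family of characteristic curves is shifted in parallel can be checked numerically. The following Python sketch assumes that all curves were measured at the same levels of the variable factor; the function name, the tolerance parameter and the numbers are hypothetical. With noisy data the tolerance would have to be chosen with the system noise in mind.

```python
def is_interaction_free(curves, tol=1e-9):
    """Check whether all curves are vertical shifts of the first curve.

    curves: one list of target values per level of the discrete factor,
            all measured at the same levels of the variable factor.
    tol:    largest allowed spread of the shift along a curve.
    """
    reference = curves[0]
    for curve in curves[1:]:
        if len(curve) != len(reference):
            raise ValueError("all curves need the same number of points")
        shifts = [c - r for c, r in zip(curve, reference)]
        if max(shifts) - min(shifts) > tol:
            return False  # the shift is not constant, so the curves are not parallel
    return True


# Hypothetical values: the second curve is the first one shifted by 2.
print(is_interaction_free([[1.0, 2.0, 4.0], [3.0, 4.0, 6.0]]))
# Here the distance between the curves changes along the variable factor.
print(is_interaction_free([[1.0, 2.0, 4.0], [3.0, 5.0, 6.0]]))
```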
Case 3: Both influence parameters are variable.
The information can be represented in three dimensions (see Figure).
Complex Motronic ignition map (ignition angle as a function of load and engine speed)
Here the values of both variable influence parameters (in this example, load and engine speed) constitute the coordinates of points in a plane. The function is then represented by a "mountain" above this plane (see Figure 7.1).
The experimenter must now specify the region in which empirical investigations are to be performed. How many experimental points should be planned depends entirely upon the physical question.
The idea that the experimental scope can be reduced by means of some combinatorial magic is simply erroneous. The scope can only be reduced through precise task formulation and use of knowledge that has already been verified.
Task:
System: Cake
Target quantity: Height of cake
Influence parameters: Yeast, water
How can one investigate the system empirically?
What does an array of characteristic curves look like in principle?
Summary:
With two influence factors, just as in the one-dimensional case, the combinatorial arrangement of experimental points, the number of repetitions per point etc. depend entirely upon how the question is formulated. Generally binding rules, in an algorithmic sense, cannot exist.
The case differentiation
discrete - discrete
discrete - variable
variable - variable
is helpful.
1.3 General Case (Numerous Influence Factors)
A complex system with numerous influence factors poses a challenge. It is clear that the time needed for an investigation increases with the number of factors to be considered. It would be good if one could reduce the time needed for an experiment through some combinatorial magic.
Unfortunately this is not so. The only way to reduce experimental expense is to apply the existing knowledge in a systematic way. This systematic approach must help the practitioner but not force him to employ terminology he does not understand (nor is expected to know). The practitioner must be able to present his knowledge or his presumptions in a simple and rational manner.
(Furthermore, the assumptions must be system-based and plausible!)
We differentiate between variable and discrete influence parameters.
The description of the type of influence of the individual quantities belongs to the description of the system in principle. Since we may have to differentiate among numerous input quantities, a careful description of the influence of the individual input quantities is especially important.
Naturally, the influence of an individual quantity depends upon the position of the other input quantities. Precisely because of this, it must be determined whether the physical-chemical character of the individual quantities permits statements in principle about the type of influence, independent of the other quantities.
With the systematic approach, it is preferable to begin by considering the discrete influence parameters. If, for instance, A is a discrete influence parameter with the levels A1, A2 (e.g. metal type), then the following should be asked:
Is one of the two levels, with respect to the target quantity, better in principle than the other, or not?
If this is not the case, then the answer to the question depends upon the position of the other factors.
Example:
[Figures: factor A definite; factor A ambiguous]
Remark:
It is usually preferable to begin by investigating the discrete influence parameters, basically because the different levels often represent system states that have to be distinguished. (Don't compare apples with oranges!)
Variable Influence
Before determining the experiments or experimental series, the overall influence should be described (i.e. without determining the levels). The relevant terminology is known to us (see 1.1 and 1.2).
Black Box
If, after a careful analysis, all of the influence parameters are ambiguous or if the character of the influence is unknown, then the matter cannot be investigated empirically. If one nevertheless wants to conduct the experiment, all strategies become nearly equivalent ("all cats are grey at night").
Trial Task:
System: A board fixed on one side.
Target quantity: Lowering of the free end.
Influence quantities: Types of wood $H_1, H_2, H_3$, length, breadth, height, force F
I. a) Perform a global system analysis with the help of the system matrix!
b) If appropriate, draw an array of characteristic lines!
Length Breadth Height Force
Linear
Monotonic
Non-monotonic
Unknown
II. Given are
Length: 1.5 m
Breadth: 20 cm
Height: 4 cm
Force: 20 N
All 4 quantities can be reduced by up to 10%. Target is a board which is lowered as little
as possible. Perform a local analysis!
Which experiments or experimental series would you perform?
Factors Length Breadth Height Force
Definite
Ambiguous
Unknown
Trial Task:
System: Greenhouse
Target quantity: Yield of useful plants
Influence quantities:
Types of plants $P_1, P_2, P_3$
Types of soil $B_1, B_2$
Chemicals $C_1, C_2, C_3$
Water quantity (irrigation)
Light
Temperature
1. Perform a detailed system analysis (with system matrix)!
2. Draw arrays of characteristic lines!
What can be said about interactions?
3. Which experimental strategy is recommendable?
Trial Task:
a) What does the optimization strategy of a monotonic system look like?
Global System Matrix:
Factor A B C D
Monotonic
Non-monotonic
Unknown
b) What does the optimization strategy of the following system look like?
Factors A B C D E F
Levels A1 A2 B1 B2 C1 C2 D1 D2 E1 E2 F1 F2
Definite X X
Ambiguous X
Unknown X X X
2. Industrial Experimentation Methodology and System Theory
The terminology or key words summarized under D.O.E., Statistical Experimental Design,
Taguchi, and Shainin methods, as mentioned earlier, are either required or initiated by
customers and also used in specialized literature.
With respect to the practical relevance of the methods mentioned above, reference is made
to the following:
Taguchi Method
The Taguchi method is characterized by, among other things, the use of so-called orthogonal arrays to reduce the required extent of the experiment. The use of the method depends upon the negligibility of interactions or, in exceptional cases, the predictability of interactions. These assumptions are controversial; nevertheless, successful examples are often quoted in the literature. These successes are not verifiable and usually not rationally comprehensible. What has been confirmed is that the use of orthogonal arrays can demonstrably lead to substantial misstatements.
D.O.E. (= Design of Experiments)
Anybody who has ever thought about performing an experiment has practiced experimental design. Thus, the question of whether one is for or against experimental design can never arise. With regard to the contents of textbooks on the subject of D.O.E., however, there are some reservations, for instance:
- All algorithmic approaches are based on models, i.e. a mathematical, quantitative model is proposed to represent the reality to be investigated. All subsequent procedures (experimental designs, evaluations etc.) are only reasonable if the model adequately describes the reality.
- The difficulty of selecting the right model is of a fundamental nature.
- From the structure of the results, it is not possible to recognize whether the model is adequate (i.e. verification is possible neither a priori nor a posteriori).
A way out of this difficulty is only possible via a system-theoretical approach.
Shainin Method
For Shainin method see Chapter 10 and [11].
2.1 Hints on System Analysis
The prerequisite for a reasonable experimental design is a system analysis. The purpose of a system analysis is, among other things, to present the existing knowledge, or lack of knowledge, about the system to be investigated with the help of elementary terms. Theoretical DOE terms are to be avoided at this stage for various reasons. After the system analysis has been carried out, a decision about the appropriate experimental strategy can be made and, to some extent, deduced. Automation in the sense of a strict recipe is not appropriate and therefore not to be pursued. The terminology of General Systems Theory is used.
Generally it may be assumed that the system to be investigated is not a "black box". (It is self-evident that a real black box cannot be investigated with formal procedures.) Hence the specialist will be able to make statements in principle about the input-output situation of the system. Explanations in principle, i.e. qualitatively correct explanations, are preferred to precise quantitative statements that are, for various reasons, often false ("better to be approximately right than exactly wrong").
2.2 Short Description of the System-Theoretical Procedure
System analysis begins with the system definition. This includes listing all relevant target quantities (output) as well as the relevant influence parameters (input).
Here, for instance, flow charts and cause-and-effect diagrams can be helpful. When dealing with input quantities, attention should be paid to, e.g., their independence, their susceptibility and the possibility of setting them definitely.
Once the system definition is complete, the system characteristics are to be described. System analysis is a recursive process. In the ideal case, all relevant system characteristics are known and investigating the system via experiments becomes unnecessary.
A statement about system noise belongs to the description of the system characteristics, i.e. the description of the behaviour of the output quantities when given input quantities are kept constant.
Knowledge of the system noise has vital consequences for the type and scope of the impending investigations. Describing the functional input-output relationship is an important part of the information about the system characteristics. Since several input quantities normally exist, describing the influence of the individual input quantity is especially important. Naturally, the influence of an individual quantity depends upon the position of the other input quantities, and for this reason it is especially important to establish whether the physical-chemical character of the individual quantity permits statements in principle about its type of influence, independent of the other quantities.
The following terms can help here:
Global description
Linear influence (as a special case of the monotonic influence):
A linear influence exists if the functions $f(A, \ldots)$ are always linear (linear influence factors are certainly exceptional cases).
Monotonic influence:
A monotonic influence exists if the input quantity can only influence the output quantity in one direction.
Non-monotonic influence factor:
A non-monotonic (dichotomous) influence exists if the input quantity can influence the output quantity in both directions (i.e. both upwards and downwards). Here, too, the characteristic of the influence factor depends upon the position of the other influence factors. It is generally assumed, however, that the type of the dichotomy is an invariant of the influence factor, i.e. the dichotomy is independent of the position of the other factors.
Characteristics of a linear influence factor
Characteristics of a monotonic influence factor
Characteristics of a dichotomous influence factor
2.2.1 Global System Matrix (i.e. without quoting any levels)
Taking into account the special role of discrete input quantities, every single quantity is classified according to how someone conversant with the system judges the character of its influence (without quantification). Reference is made here to the type classification above.
The results are summarized in the global system matrix:
Factors A B C ... Z
Linear
Monotonic
Dichotomous
Unknown
A completed global system matrix can already suggest a sensible experimental strategy.
Example:
If all influence factors are monotonic, then it is simple to optimize the system, and the only question that needs to be asked is which influence factors are decisive for the optimum. Here reference can be made to the Shainin method.
2.2.2 Local System Consideration
Often, an experimental strategy follows directly from the global system consideration.
Because the global family of characteristics, especially that of the dichotomous influence factors, is often very complex, the system consideration must be localized; i.e., the levels of the influence factors must be prescribed and the properties of the system considered relative to the prescribed levels.
For the special case of two levels, the following case differentiation is to be made:
1. Univalent Influence Factor (univalent = definite)
If the target quantity is only moved in one direction with a change from $A_1$ to $A_2$,
i.e. always $f(A_1) - f(A_2) > 0$
or always $f(A_1) - f(A_2) < 0$,
then a univalent factor exists.
Hint:
Because of localization, a dichotomous factor can be univalent. To some extent, however, there is a correspondence between univalent and monotonic factors.
2. Bivalent Influence Factors (bivalent = ambiguous)
Bivalent factors are, by definition, factors which are not univalent. This means that the factor, depending on the position of the other factors, moves the target quantity both upwards and downwards when the level of the influence factor is changed as prescribed. The behaviour of a bivalent factor is, as such, synergetic or antagonistic. It is of special importance to find out which of the other factors cause the changes.
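The distinction between univalent and bivalent factors can be sketched in Python as follows. The function name and the numbers are hypothetical; the input is assumed to be the difference f(A1) - f(A2) for each combination of levels of the other factors, and system noise is ignored. Treating zero differences as "undecided" is an assumption of this sketch, not a rule from the text.

```python
def classify_factor(differences):
    """Classify a two-level factor from the differences f(A1) - f(A2).

    differences: one value per combination of levels of the other factors.
    Returns "univalent" if all differences have the same sign,
    "bivalent" if both signs occur, and "undecided" otherwise.
    """
    if not differences:
        return "undecided"
    positive = [d > 0 for d in differences]
    negative = [d < 0 for d in differences]
    if all(positive) or all(negative):
        return "univalent"
    if any(positive) and any(negative):
        return "bivalent"
    return "undecided"  # some differences are exactly zero


# Hypothetical differences for four combinations of the other factors
print(classify_factor([0.8, 1.2, 0.3, 0.5]))
print(classify_factor([0.8, -0.4, 0.3, 0.5]))
```

For a factor classified as bivalent, the combinations at which the sign changes point to those other factors that cause the change.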
2.2.3 Local System Matrix
(depending upon the selected levels, i.e. there is more than one local system matrix).
The results of the local system consideration are summarized in the local system matrix.
Example:
Factors A B C ... Z
Levels A1 A2 B1 B2 C1 C2 ... Z1 Z2
Univalent
Bivalent
Unknown
A completed local system matrix gives an indication of the complexity of localized prob-
lems. The simplest case exists when all influence factors are univalent. Then the experi-
mental strategy is obvious. The most difficult case exists when all influence factors are
bivalent or when the character of the influence is unknown.
In this case, a simple experimental strategy is impossible without further information.
In particular, reasonable optimization with a small experimental series is not attainable.
2.3 Summary
The statement made in QS-Info 1/90, "there is no alternative to statistical design of experiments", is only correct if by statistical design of experiments one understands the systematic, i.e. system-theoretical, design of experiments that takes statistical points of view into account.
If, however, by statistical design of experiments one understands the contents of the textbooks on statistical design of experiments (from Fisher via Box to Taguchi), then it must be assumed that these contents are not, or only seldom, transferable to real life. Similar reservations apply to commercial software packages. In particular, every polemic against the so-called conventional methods is uncalled for. A consistent application of the system-theoretical attitude will often lead to conventional types of investigation being required; in other cases, however, it can lead to the formal approaches being seen as promising. Holding stubbornly to schools of thought is certainly detrimental in the long term.
3. Probability Plot
When one speaks about a normal distribution, one mostly associates this concept with the Gaussian bell-shaped curve. The Gaussian bell-shaped curve is a representation of the probability density function $f(x)$ of the normal distribution:

$$f(x) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^{2}}.$$

This function and its graphic representation are printed on the 10 DM bank note, beside the portrait, in honour of the mathematician Carl Friedrich Gauss.
The normal distribution assigns to every value $x$ the probability that a random variable $X$ takes a value between $-\infty$ and $x$. One obtains the distribution function $F(x)$ of the normal distribution by integrating over the density function given above:

$$F(x) = \frac{1}{\sigma \sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{1}{2} \left( \frac{v - \mu}{\sigma} \right)^{2}} \, dv$$

$F(x)$ corresponds to the area under the Gaussian bell-shaped curve up to the value $x$.
The graphical representation of this function is s-shaped. Thus, strictly speaking, one should always think of this curve whenever a normal distribution is concerned.
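The distribution function has no closed form in elementary functions, but it can be evaluated with the error function. The following Python sketch uses the standard identity F(x) = 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2)))); the function name and the evaluation points are chosen here for illustration only.

```python
import math


def normal_cdf(x, mu=0.0, sigma=1.0):
    """Distribution function F(x) of the normal distribution."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


# Values along the s-shaped curve of the standard normal distribution
for x in (-3, -2, -1, 0, 1, 2, 3):
    print(f"F({x:+d}) = {normal_cdf(x):.4f}")
```

The printed values rise slowly at first, steeply around the mean and then flatten again, which is the s-shape described above.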
If the y-axis in this representation is now distorted such that the s-shaped curve becomes a straight line, a new coordinate system, the probability paper, emerges. The x-axis remains unchanged.
Because of this construction, a normal distribution is always portrayed as a straight line on the probability paper.
One uses this fact to check graphically whether a given data set is normally distributed. Provided the number of measured values is large enough, one creates a histogram of these values, thus determining the relative frequencies of values within the classes of a grouping. If the cumulative relative frequencies found are plotted over the upper (right-hand) class limits on the probability paper and the points lie approximately on a straight line, then it can be inferred that the values of the data set are approximately normally distributed.
Remark:The recording of measurement values or groups of measurement values ordered accordingto the factor levels on probability paper is a component of the SAV-program (see Chapter8.4 Computer aid and [9]).
Hint: In German, two different denotions are used in this context. Wahrscheinlichkeits-netz stands for the coordinate system in which the data are plotted and Wahrscheinlich-keitspapier denotes the form (sheet) with the pre-printed coordinate system (see chapter
3.2), whereas in English textbooks the denotion probability paper is used for both.
3.1 Probability Plot of Small-Size Samples
The size of a sample for creating a histogram or calculating relative frequencies is often
not sufficient, so that representation on the probability paper according to the above-
described method is not possible. There is a way out of this dilemma, which is explained
below.
The processes can be understood easily by means of computer simulation.
One takes a sample of size $n$: $x_1, x_2, \ldots, x_n$ from a standard normally distributed
population ($\mu = 0$, $\sigma = 1$) and arranges the values in order of magnitude:
$x_{(1)} \le x_{(2)} \le \ldots \le x_{(n)}$.
The number assigned to each of the sample values in this increasing sequence is called its
rank. The smallest value $x_{(1)}$ therefore has the rank 1, the greatest value $x_{(n)}$ the rank $n$.
Then one determines the value $F_i = F(x_{(i)})$ from the table of the standard normal distribution
for every $x_{(i)}$ ($i = 1, 2, \ldots, n$).
If this process is repeated many times, then the cumulative frequencies $H_i(n)$ result for
every rank $i$ as the mean value of the $F_i$ (strictly speaking, the median is used).
For every sample size $6 \le n \le 50$, these cumulative frequencies $H_i(n)$ are given for each
rank $i$ in Table 1 (Section 12).
We now consider, for example, a sample of size 10 which is to be tested for normal
distribution:
2.1 2.9 2.4 2.5 2.5 2.8 1.9 2.7 2.7 2.3.
The values are sorted according to magnitude:
1.9 2.1 2.3 2.4 2.5 2.5 2.7 2.7 2.8 2.9.
The value 1.9 has rank 1, the value 2.9 rank 10. In the table in the appendix (sample size
n = 10) one finds the cumulative frequencies (in percentage) for every rank i :
6.2 15.9 25.5 35.2 45.2 54.8 64.8 74.5 84.1 93.8.
Finally, one chooses a suitable division (scaling) for the x-axis of the probability paper
corresponding to the values 1.9 up to 2.9 and enters the cumulative frequencies versus the
well-sorted accompanying sample values on the probability paper. One therefore marks the
following points in the example considered above:
(1.9; 6.2), (2.1; 15.9), (2.3; 25.5), ...
..., (2.7; 74.5), (2.8; 84.1), (2.9; 93.8).
Because these points are well approximated by an eye-fitted straight line, it can be as-
sumed that the sample values are approximately normally distributed.
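The judgement "well approximated by a straight line" can also be supported numerically. The following Python sketch is not part of the original text: it maps the tabulated cumulative frequencies onto the distorted probability axis by means of the inverse standard normal distribution function and then computes the correlation coefficient of the plotted points. The function names and the use of a correlation coefficient as a measure of straightness are assumptions made purely for illustration; a value close to 1 indicates points that lie nearly on a straight line.

```python
from statistics import NormalDist, mean

# Sorted sample values and tabulated cumulative frequencies H_i(10) in percent,
# both taken from the example in the text.
values = [1.9, 2.1, 2.3, 2.4, 2.5, 2.5, 2.7, 2.7, 2.8, 2.9]
cum_freq_percent = [6.2, 15.9, 25.5, 35.2, 45.2, 54.8, 64.8, 74.5, 84.1, 93.8]


def probability_plot_points(xs, percents):
    """Return (x, u) pairs; u is the position on the distorted probability axis."""
    std_normal = NormalDist()  # mu = 0, sigma = 1
    return [(x, std_normal.inv_cdf(p / 100.0)) for x, p in zip(xs, percents)]


def correlation(pairs):
    """Pearson correlation coefficient of the plotted points (illustrative measure)."""
    xs = [x for x, _ in pairs]
    us = [u for _, u in pairs]
    mx, mu = mean(xs), mean(us)
    sxu = sum((x - mx) * (u - mu) for x, u in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    suu = sum((u - mu) ** 2 for u in us)
    return sxu / (sxx * suu) ** 0.5


points = probability_plot_points(values, cum_freq_percent)
r = correlation(points)
print(f"correlation of the plotted points: {r:.3f}")
```

Such a number does not replace the graphical check; it merely summarises how closely the points follow a straight line.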
3.2 Probability Paper
The plot of the above described points will be simplified if the so-called probability paper
is used. This is a special form where horizontal lines are drawn at the positions of the cu-
mulative relative frequencies which correspond to ranks i .
The probability paper for the sample size $n = 10$ therefore exhibits horizontal lines for the values:
6.2% 15.9% 25.5% ... 74.5% 84.1% 93.8%.
Hint:
The cumulative frequency $H_i(n)$ for the rank $i$ can also be calculated with the following
approximation formulas:

$$H_i(n) = \frac{i - 0.5}{n} \qquad \text{and} \qquad H_i(n) = \frac{i - 0.3}{n + 0.4} .$$
The deviation from the exact value in the table is thereby insignificant.
Approximating values for n = 10:
5% 15% 25% 35% 45% 55% 65% 75% 85% 95%
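The two approximation formulas can be compared with the tabulated values for n = 10 in a few lines. The following Python sketch is an illustration added here, not part of the original text; the function names are hypothetical, and the table values are the ones quoted in Section 3.1.

```python
# Tabulated cumulative frequencies H_i(10) in percent, as quoted in the text.
table_percent = [6.2, 15.9, 25.5, 35.2, 45.2, 54.8, 64.8, 74.5, 84.1, 93.8]


def h_simple(i, n):
    """Approximation H_i(n) = (i - 0.5) / n, returned in percent."""
    return 100.0 * (i - 0.5) / n


def h_refined(i, n):
    """Approximation H_i(n) = (i - 0.3) / (n + 0.4), returned in percent."""
    return 100.0 * (i - 0.3) / (n + 0.4)


n = 10
simple = [h_simple(i, n) for i in range(1, n + 1)]
refined = [h_refined(i, n) for i in range(1, n + 1)]

for i, (t, s, r) in enumerate(zip(table_percent, simple, refined), start=1):
    print(f"rank {i:2d}: table {t:5.1f}   (i-0.5)/n {s:5.1f}   (i-0.3)/(n+0.4) {r:5.1f}")

# Largest deviation from the table, in percentage points.
max_dev_simple = max(abs(t - s) for t, s in zip(table_percent, simple))
max_dev_refined = max(abs(t - r) for t, r in zip(table_percent, refined))
print(f"max deviation: {max_dev_simple:.2f} and {max_dev_refined:.2f} percentage points")
```

The first formula reproduces the approximating values 5%, 15%, ..., 95% listed above.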
4. Comparison of Samples Means
4.1 t Test
The t test is a statistical method for deciding whether the mean values of two samples are significantly different. In order to clarify how the t test works, we perform the following thought experiment:
We draw from a normally distributed population $N(\mu, \sigma)$ two samples, each of size $n$,
calculate the mean values $\bar{y}_1$ and $\bar{y}_2$ as well as the standard deviations $s_1$ and $s_2$ (or the
variances $s_1^2$ and $s_2^2$) and finally compute the value

$$t = \sqrt{n} \cdot \frac{\left|\bar{y}_1 - \bar{y}_2\right|}{\sqrt{s_1^2 + s_2^2}} .$$

$t$ can take values between 0 and $+\infty$. If we repeat this process very often, we expect that mainly values near zero occur and that very large values are rarely found.
This thought experiment was performed by computer simulation. For $n = 10$ and 3,000 sample pairs ($t$ values), the result was the histogram represented in Fig. 4.1.
Fig. 4.1
If one simultaneously lets the number of samples approach infinity and the class width
approach zero, the histogram will more and more approach the solid curve that represents
the density function of the t distribution.
The upper limit of the 99% random variation range (percentage point) is, in this example,
$t_{18;\,0.99} = 2.88$, i.e. only in 1% of all cases can values greater than 2.88 occur by chance.
Percentage points of the t distribution are tabulated for different error probabilities depending
upon the number of degrees of freedom $f = 2\,(n - 1)$ (Table 2). The t test procedure is based on the relationship represented above.
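The thought experiment described above is easy to repeat. The following Python sketch is an illustration added here and not the simulation program used for Fig. 4.1: it draws 3,000 pairs of samples of size n = 10 from a standard normal population, computes the t value for each pair and counts how often the value 2.88 is exceeded. The function names, the fixed seed and the choice of a standard normal population are assumptions.

```python
import random
from statistics import mean, variance


def t_statistic(sample1, sample2):
    """t = sqrt(n) * |mean1 - mean2| / sqrt(s1^2 + s2^2) for two samples of equal size."""
    n = len(sample1)
    if n != len(sample2):
        raise ValueError("both samples must have the same size")
    diff = abs(mean(sample1) - mean(sample2))
    return n ** 0.5 * diff / (variance(sample1) + variance(sample2)) ** 0.5


def simulate_t(n=10, pairs=3000, seed=1):
    """Draw sample pairs from one normal population and return their t values."""
    rng = random.Random(seed)
    result = []
    for _ in range(pairs):
        a = [rng.gauss(0.0, 1.0) for _ in range(n)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n)]
        result.append(t_statistic(a, b))
    return result


t_values = simulate_t()
share_above = sum(t > 2.88 for t in t_values) / len(t_values)
print(f"share of t values above 2.88: {share_above:.2%}")
```

With a finite number of sample pairs the share fluctuates around 1%; it approaches this value only as the number of pairs grows.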
A decision shall be made as to whether the arithmetic mean values of two existing series of
measurements (each of size $n$) can belong to one and the same population or not. As the so-called null hypothesis, it is therefore assumed that the mean values of the respective
populations are equal.
The test statistic is then calculated from the two mean values $\bar{y}_1$ and $\bar{y}_2$ as well
as the variances $s_1^2$ and $s_2^2$:

$$t = \sqrt{n} \cdot \frac{\left|\bar{y}_1 - \bar{y}_2\right|}{\sqrt{s_1^2 + s_2^2}} \qquad \text{for } n_1 = n_2 = n .$$

If the result is $t > t_{2(n-1);\,0.99}$, i.e. $t$ lies outside the 99% random variation range, the null hypothesis is rejected.
Hint: The expression for the test statistic $t$ is, in this simple form, only applicable when both the variances of the populations and the sample sizes are assumed to be
equal ($\sigma_1^2 = \sigma_2^2$ and $n_1 = n_2 = n$). The prerequisite of equal variances can be tested with the help of an F test (see Section 5).
The t test, in the form represented here, tests the null hypothesis $\mu_1 = \mu_2$ against the alternative $\mu_1 \ne \mu_2$. This is a two-sided question. For this reason, the absolute value of the difference of the means is contained in the expression for $t$.
$t$ can hence only assume values $\ge 0$, so that the distribution depicted in Figure 4.1 results.
Table 2 in Section 12 gives the 95%, 99% and 99.9% percentage points of the t distribution
for the two-sided question. They correspond to the one-sided percentage
points 97.5%, 99.5% and 99.95%.
4.2 Minimum Sample Size
In the preceding Section 4.1 it was explained how one can decide, by means of a t test,
whether or not the mean values of two samples are significantly different.
This decision is frequently the goal of experiments in which the change of a target characteristic as a function of two system states, or of two settings of an influence factor, is to
be determined. The subsequent intention, with respect to the system optimization pursued, is to
choose the better of the two selected settings.
This especially applies to experiments which use orthogonal arrays, in which several influence factors are varied concurrently on two levels (see Chap. 7).
The factorial analysis of variance (see 8.2) carried out when evaluating such experiments is, in principle, nothing other than a comparison of the mean values of all experiment results attained for two settings (levels) of an influence factor, taking the experimental noise into account.
In the preparatory phase of such experimental investigations, the experimenter often asks: which minimum mean value difference is actually of interest in view of the target (system optimization, production simplification, cost reduction), and which minimum sample size $n$ must be chosen so that this minimum mean value difference, if it actually exists, is found to be significant in the experimental evaluation?
From the expression for the test statistic $t$ (see Section 4.1)

$$t = \sqrt{n} \cdot \frac{\left|\bar{y}_1 - \bar{y}_2\right|}{\sqrt{s_1^2 + s_2^2}}$$

it is apparent that, for a significant test result, $n$ must be the greater, the smaller the mean value difference $\left|\bar{y}_1 - \bar{y}_2\right|$ is and the greater the variances $s_1^2$ and $s_2^2$ of the two series to
be compared are. Note that the table value $t_{Table}$ becomes smaller as the number of degrees
of freedom $f = 2\,(n - 1)$ increases.
Graphically, a small difference of mean values combined with a large variance of the distributions means that the two groups of values are indistinguishable, or hardly distinguishable, in a graphical representation of both measurement series.
Based on the previous discussion, it is possible to estimate the minimum sample size $n$ roughly by specifying the mean value difference as a multiple of the mean standard deviation $\sqrt{\frac{s_1^2 + s_2^2}{2}}$ and comparing, for different $n$, the calculated test statistic $t$ with $t_{Table}$ (observe the degrees of freedom and the significance level!).
Besides this trial method, however, there is an exact method of deriving the minimum sample size from the statistical point of view, which we only sketch roughly at this point (derivation in [1] and [7]).
When the mean values of two series of measurements are compared and the corresponding test decision is made, two types of error are possible.
In the first case, both series of measurements originate from the same population, i.e. there is no significant difference. If one decides here, on the basis of a t test, that a difference between the two mean values exists, then an error of the first kind ($\alpha$) is made. It corresponds to the significance level of the t test (for example $\alpha = 1\%$).
If, in the second case, a difference of the mean values actually exists, i.e. the measured series originate from two different populations, then this will not be indicated with absolute certainty by the test. The test result can coincidentally indicate that this difference does not exist. One speaks in this case of an error of the second kind ($\beta$).
For the person performing the experiment, both of these error types are unpleasant: either because, on account of the supposedly significant effect of an influence factor, further expensive investigations or even changes in the production process are initiated (error of the first kind; type I error), or because the actually significant effect is not identified and the chance to make possible process improvements is missed (error of the second kind; type II error).
The minimum sample size $n$ which is required in order to identify a real mean value difference
depends upon both the distance $D = \frac{\mu_2 - \mu_1}{\sigma}$ of the mean values, given in
units of the standard deviation $\sigma$ in correspondence with the above plausibility consideration, and the error probabilities $\alpha$ and $\beta$:

$$n = 2 \cdot \left(\frac{u_\alpha + u_\beta}{D}\right)^{2}$$

In the concrete case of comparing two series of measurements, the mean values $\mu_1$ and $\mu_2$ as well as the standard deviation $\sigma$ of the population (and consequently also $D$) are not known. They are estimated by the empirical values $\bar{y}_1$, $\bar{y}_2$ and $s$. For this reason,
when calculating $n$ according to the given formula, the t distribution must be taken as
a basis.
Accordingly, $u_\alpha$ and $u_\beta$ are the abscissa values $u$ at which the t distribution assumes
the values $\alpha$ (two-sided) and $\beta$ (one-sided).
Smaller error probabilities, i.e. smaller type I ($\alpha$) and type II ($\beta$) errors, mean that the two distributions to be compared, and thus also the distributions of the mean values, may overlap only marginally. For this, with a given distance $D$ of the mean values, the sample size $n$ must be chosen adequately large.
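To get a feeling for the orders of magnitude, the relationship between D, the error probabilities and n can be evaluated numerically. The following Python sketch is an illustration added here and rests on two assumptions that should be stated loudly: the formula is taken in the form n = 2 ((u_alpha + u_beta) / D)^2, and the abscissa values are taken from the standard normal distribution instead of the t distribution that the text prescribes. The result is therefore only a first, somewhat optimistic estimate. The function name and the default error probabilities are hypothetical.

```python
import math
from statistics import NormalDist


def min_sample_size(d, alpha=0.01, beta=0.10):
    """Rough estimate of n = 2 * ((u_alpha + u_beta) / D)**2 using normal quantiles.

    d     -- distance of the mean values in units of the standard deviation
    alpha -- type I error probability (two-sided)
    beta  -- type II error probability (one-sided)

    Assumption: quantiles of the standard normal distribution stand in for
    the t distribution, so the returned n tends to be slightly too small.
    """
    std_normal = NormalDist()
    u_alpha = std_normal.inv_cdf(1.0 - alpha / 2.0)  # two-sided
    u_beta = std_normal.inv_cdf(1.0 - beta)          # one-sided
    return math.ceil(2.0 * ((u_alpha + u_beta) / d) ** 2)


for d in (0.5, 1.0, 2.0):
    print(f"D = {d}: n >= {min_sample_size(d)}")
```

The sketch makes the qualitative statement above tangible: halving the distance D roughly quadruples the required sample size.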
Figures (panel titles): Stronger effect, Medium effect, Weaker effect
5. F Test
The F test is a statistical method, with which it can be decided, whether the variances of
two samples are significantly different.
The functionality of the test can be explained, just as in the case of the t test, using the
result of a computer simulation.
We take two samples of sizes $n_1$ and $n_2$ from a normally distributed population $N(\mu, \sigma)$ and calculate the sample variances $s_1^2$ and $s_2^2$, and from these finally the quantity

$$F = \frac{s_1^2}{s_2^2} .$$

$F$ can take values between 0 and $+\infty$. It is plausible that, with frequent repetition of this
procedure, values near zero and very large values occur only very rarely.
The results of a computer simulation in which the F values for $N = 3{,}000$ sample pairs were determined with sample sizes $n_1 = n_2 = n = 9$ are represented as a histogram in the following figure.
Figure 5.1
If one lets the number of samples approach infinity and, at the same time, the class width
approach zero, the histogram will approximate the line in Fig. 5.1 (density function of
the F distribution).
The shape of the histogram depends upon the sample sizes $n_1$ and $n_2$ of the investigated
sample pairs; the shape of the density function of the F distribution correspondingly depends upon the degrees of freedom $f_1 = n_1 - 1$ and $f_2 = n_2 - 1$.
The upper limit of the 99% random variation range (percentage point) in the calculated
example is $F_{8;\,8;\,0.99} = 6.03$, i.e. only in 1% of all cases (error probability) does
$\frac{s_1^2}{s_2^2} \ge 6.03$ occur by chance.
The percentage points of the F distribution are tabulated in the appendix for different error
probabilities, dependent upon the degrees of freedom $f_1$ and $f_2$.
The relationship represented above makes the procedure of the F test understandable.
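The simulation described for Figure 5.1 can be reproduced in outline as follows. This Python sketch is an illustration added here, not the original simulation program; the function name, the fixed seed and the standard normal population are assumptions. It generates 3,000 sample pairs with n1 = n2 = 9, forms the ratio of the sample variances and counts how often the percentage point 6.03 is reached or exceeded.

```python
import random
from statistics import variance


def simulate_f(n1=9, n2=9, pairs=3000, seed=1):
    """Return the variance ratios s1^2 / s2^2 of sample pairs from one normal population."""
    rng = random.Random(seed)
    f_values = []
    for _ in range(pairs):
        s1_sq = variance([rng.gauss(0.0, 1.0) for _ in range(n1)])
        s2_sq = variance([rng.gauss(0.0, 1.0) for _ in range(n2)])
        f_values.append(s1_sq / s2_sq)
    return f_values


f_values = simulate_f()
share_above = sum(f >= 6.03 for f in f_values) / len(f_values)
print(f"share of F values >= 6.03: {share_above:.2%}")
```

As with the t values, the observed share scatters around 1% for a finite number of sample pairs.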
It is to be decided whether or not two series of measurements, with sizes $n_1$ and $n_2$,
originate from two normally distributed populations with the same variance (the mean
values do not need to be known).
As a null hypothesis, it is assumed that the variances of the respective populations are
equal: $\sigma_1^2 = \sigma_2^2$.
Finally, the test statistic $F = \frac{s_1^2}{s_2^2}$ is calculated from the variances $s_1^2$ and $s_2^2$ of both
measurement series and compared with the percentage point of the F distribution. If the
result is $F > F_{n_1-1;\,n_2-1;\,0.99}$, i.e. $F$ lies outside the 99% random variation range, then the null hypothesis is rejected.
Remark:
The alternative hypothesis is $\sigma_1^2 > \sigma_2^2$; the question is one-sided.
If, as a matter of principle, one writes the greater of the two variances $s_1^2$ and $s_2^2$ above the
fraction line, then $F$ can only assume values greater than 1; the question is then two-sided. If an error probability of $\alpha = 1\%$ is chosen, the 99.5% percentage point must be used.
6. Analysis of Variance (ANOVA)
With the help of the t test (Section 4.1), a determination is made as to whether the mean values
of two series of measurements are significantly different. The series of measurements to be
compared can be considered formally as experimental results for the two levels 1 (e.g. material A) and 2 (material B) of an individual influence factor (material).
If one expands the one-factor-at-a-time experiment to more than two levels (in general: $k$ levels), then it is no longer possible to compare the mean values using the t test. In this
case, the evaluation can be carried out by means of the analysis of variance.
If the factor $A$ has no influence upon the measurement results, then all individual results
$y_{ij}$ can be seen as originating from the same population. The $y_{ij}$, and thus also the mean
values $\bar{y}_i$, are then only subject to random deviations (experimental noise) from the
common mean value $\mu$.
In the other case, where the factor $A$ has a significant influence upon the result of measurement, the mean values $\mu_1, \ldots, \mu_k$ of the distributions belonging to the levels $A_1, \ldots, A_k$ of the factor $A$ will be different.
In the scope of the analysis of variance, one presupposes $k$ independent, normally distributed populations with the same variance and formulates the null hypothesis:
All measured values originate from populations with the same mean value $\mu_1 = \mu_2 = \ldots = \mu_k = \mu$ (Remark: Since identical variances were a prerequisite, the null hypothesis means that all measured values originate from one and the same population).
One therefore calculates the mean variance within the experimental rows (levels of $A$)

$$s_2^2 = \bar{s}_y^2 = \frac{1}{k} \sum_{i=1}^{k} s_i^2$$

as well as the variance between the experimental rows (levels of $A$), $s_1^2 = s_{\bar{y}}^2$.
$\bar{s}_y^2$ is a measure of the experimental noise. $s_{\bar{y}}^2$ is the variance of the mean values $\bar{y}_i$.
If the null hypothesis is correct, both quantities are estimates of the variance of the underlying
population:

$$\hat{\sigma}_1^2 = n \cdot s_{\bar{y}}^2 \qquad \hat{\sigma}_2^2 = \bar{s}_y^2 .$$

The factor $n$ has to be taken into account because of the relationship

$$\sigma_{\bar{y}} = \frac{\sigma_y}{\sqrt{n}} .$$
Finally, one conducts an F test with the test statistic

$$F = \frac{n \cdot s_{\bar{y}}^2}{\bar{s}_y^2}$$

(comparison of both estimates), and rejects the above formulated null hypothesis if
$F > F_{k-1;\,(n-1)k;\,0.99}$ (percentage points for F in the appendix).
Rejection of the null hypothesis means: a significant difference exists with regard to the
mean values $\bar{y}_i$ of the results of measurement for the levels of factor $A$, or: factor $A$ has a
significant influence upon the result of measurement.
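The calculation of the test statistic can be sketched in a few lines of Python. The example is an illustration added here; the function name and the measurement data are hypothetical, and equal numbers of values per level are assumed, as in the text. The returned degrees of freedom are the ones needed to look up the percentage point of the F distribution in the appendix.

```python
from statistics import mean, variance


def anova_f(rows):
    """Return (F, f1, f2) for k rows (levels) with n measured values each.

    F  = n * (variance of the row means) / (mean of the row variances)
    f1 = k - 1, f2 = k * (n - 1)
    """
    k = len(rows)
    n = len(rows[0])
    if any(len(row) != n for row in rows):
        raise ValueError("all rows must contain the same number of values")
    row_means = [mean(row) for row in rows]
    mean_within_variance = mean(variance(row) for row in rows)  # experimental noise
    variance_of_means = variance(row_means)                      # between the rows
    f_value = n * variance_of_means / mean_within_variance
    return f_value, k - 1, k * (n - 1)


# Hypothetical measurement results for three levels of a factor A.
rows = [
    [10.1, 9.8, 10.3, 10.0],
    [10.9, 11.2, 10.8, 11.1],
    [10.2, 10.4, 9.9, 10.3],
]
f_value, f1, f2 = anova_f(rows)
print(f"F = {f_value:.2f} with f1 = {f1} and f2 = {f2} degrees of freedom")
```

The decision itself still requires the comparison of F with the tabulated percentage point for the chosen error probability.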
Figure 6.1
Figure 6.2
Figures 6.1 and 6.2 should illustrate the importance of this fact. Along the diagonals, the
density functions of normal distributions with equal variance are represented respectively.
In the corners of the figures, the density functions of the mixture of distributions (top left)
and of the distribution of the mean values (bottom right) are represented.
The distributions in Figure 6.1 are subject to only small mean-value fluctuations; the
mixture of distributions is nearly normally distributed.
The variances of the distribution of mean values and of the original distributions hardly differ,
so that an F test does not reject the null hypothesis (identical mean values). In comparison,
the mean values of the seven distributions in Figure 6.2 show greater
fluctuations; the variance of the mean-value distribution is substantially (significantly)
greater than that of the single distributions.
Accordingly, the null hypothesis, that is the assumption of identical mean values, will in
this case be rejected within the scope of an analysis of variance.
6.1 Deriving the Test Statistic
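The term analysis of variance refers to the decomposition of the variation of all measured
values into two parts: random variation (experimental noise) and systematic deviation of the mean values.
This decomposition is described as follows. When $k$ represents the number of rows and $n$ the number of measured values (experiments) per row, then the overall variance of all $n \cdot k$ measured values is given by

$$s^2 = \frac{1}{n k - 1} \sum_{i=1}^{n} \sum_{j=1}^{k} \left(y_{ij} - \bar{\bar{y}}\right)^2 .$$

The quantity $Q = (n k - 1) \cdot s^2$ is called the sum of squares (SS).

$$Q = \sum_{i=1}^{n} \sum_{j=1}^{k} \left(y_{ij} - \bar{\bar{y}}\right)^2$$

$$Q = \sum_{i=1}^{n} \sum_{j=1}^{k} \left(y_{ij} - \bar{y}_j + \bar{y}_j - \bar{\bar{y}}\right)^2 \qquad \text{(expansion with zero)}$$

$$Q = \sum_{i=1}^{n} \sum_{j=1}^{k} \left[ \left(y_{ij} - \bar{y}_j\right)^2 + 2\,\left(y_{ij} - \bar{y}_j\right)\left(\bar{y}_j - \bar{\bar{y}}\right) + \left(\bar{y}_j - \bar{\bar{y}}\right)^2 \right]$$

If we first consider the middle term:

$$\sum_{i=1}^{n} \sum_{j=1}^{k} \left(y_{ij} - \bar{y}_j\right)\left(\bar{y}_j - \bar{\bar{y}}\right)$$

$$= \sum_{i=1}^{n} \sum_{j=1}^{k} \left(y_{ij} - \bar{y}_j\right)\bar{y}_j - \sum_{i=1}^{n} \sum_{j=1}^{k} \left(y_{ij} - \bar{y}_j\right)\bar{\bar{y}}$$

$$= \sum_{j=1}^{k} \bar{y}_j \sum_{i=1}^{n} \left(y_{ij} - \bar{y}_j\right) - \bar{\bar{y}} \sum_{i=1}^{n} \sum_{j=1}^{k} y_{ij} + \bar{\bar{y}} \sum_{i=1}^{n} \sum_{j=1}^{k} \bar{y}_j$$

$$= \sum_{j=1}^{k} \bar{y}_j \sum_{i=1}^{n} \left(y_{ij} - \bar{y}_j\right) - n\,k\,\bar{\bar{y}}^{\,2} + n\,k\,\bar{\bar{y}}^{\,2}$$

$$= \sum_{j=1}^{k} \bar{y}_j \left(n\,\bar{y}_j - n\,\bar{y}_j\right) = 0$$

Therefore:

$$Q = \sum_{i=1}^{n} \sum_{j=1}^{k} \left(y_{ij} - \bar{y}_j\right)^2 + \sum_{i=1}^{n} \sum_{j=1}^{k} \left(\bar{y}_j - \bar{\bar{y}}\right)^2$$

$$Q = \sum_{j=1}^{k} (n - 1) \cdot s_j^2 + \sum_{i=1}^{n} (k - 1) \cdot s_{\bar{y}}^2$$

$$(n k - 1) \cdot s^2 = k\,(n - 1) \cdot \bar{s}_y^2 + n\,(k - 1) \cdot s_{\bar{y}}^2$$

$$Q = Q_1 + Q_2$$

Overall variation = experimental noise + variation of mean values

Degrees of freedom of $Q_1$: $f_1 = k\,(n - 1)$
Degrees of freedom of $Q_2$: $f_2 = k - 1$
Degrees of freedom of $Q$: $f = n k - 1$

Equation of the number of degrees of freedom:

$$f = f_2 + f_1$$
$$n k - 1 = k - 1 + k\,(n - 1)$$
$$n k - 1 = n k - 1$$

Test statistic:

$$F = \frac{Q_2 / f_2}{Q_1 / f_1} = \frac{\dfrac{n\,(k - 1)}{k - 1} \cdot s_{\bar{y}}^2}{\dfrac{k\,(n - 1)}{k\,(n - 1)} \cdot \bar{s}_y^2} = \frac{n \cdot s_{\bar{y}}^2}{\bar{s}_y^2}$$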
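The decomposition Q = Q1 + Q2 can be checked numerically for any data set with equally sized rows. The following Python sketch is an illustration added here; the function name and the data are hypothetical. It computes the three sums of squares directly from their definitions and forms the test statistic from Q2/f2 and Q1/f1.

```python
from statistics import mean


def sums_of_squares(rows):
    """Return (Q, Q1, Q2): total, within-row and between-row sums of squares."""
    n = len(rows[0])  # measured values per row (assumed equal for all rows)
    all_values = [y for row in rows for y in row]
    grand_mean = mean(all_values)
    row_means = [mean(row) for row in rows]
    q_total = sum((y - grand_mean) ** 2 for y in all_values)
    q_within = sum((y - m) ** 2 for row, m in zip(rows, row_means) for y in row)
    q_between = n * sum((m - grand_mean) ** 2 for m in row_means)
    return q_total, q_within, q_between


# Hypothetical data: k = 3 rows with n = 4 measured values each.
rows = [
    [10.1, 9.8, 10.3, 10.0],
    [10.9, 11.2, 10.8, 11.1],
    [10.2, 10.4, 9.9, 10.3],
]
k, n = len(rows), len(rows[0])
q, q1, q2 = sums_of_squares(rows)
f_value = (q2 / (k - 1)) / (q1 / (k * (n - 1)))

print(f"Q = {q:.4f}, Q1 + Q2 = {q1 + q2:.4f}")
print(f"F = {f_value:.2f}")
```

Up to rounding errors of floating-point arithmetic, Q and Q1 + Q2 agree, which is exactly the statement that the middle term vanishes.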
6.2 Equality Test of Several Variances (According to Levene)
With the one-way analysis of variance, it is investigated whether a factor $A$ has a significant influence upon the result of measurement. Thus a determination is made as to whether the
mean values $\mu_1, \ldots, \mu_k$ of the measurement results which belong to the levels
$A_1, \ldots, A_k$ are significantly different.
Frequently the aim of the experiments in this case is to maximise or to minimise a target
quantity.
In connection with investigating disturbance-insensitive (robust) designs, it can be of
interest to find parameter settings at which the experimental results exhibit
as little variation (variance) as possible.
For this reason, it is sensible to initially check whether the variances of the results in the
individual experimental rows are significantly different.
Experiment No.   Results                               Mean           Variance
1                $x_{11}, x_{12}, \ldots, x_{1n}$      $\bar{x}_1$    $s_1^2$
2                $x_{21}, x_{22}, \ldots, x_{2n}$      $\bar{x}_2$    $s_2^2$
...
k                $x_{k1}, x_{k2}, \ldots, x_{kn}$      $\bar{x}_k$    $s_k^2$

Deviating from our notation to date, we designate the measured values with $x$
and calculate the row mean values $\bar{x}_i$ as well as the variances within the rows $s_i^2$.
To test the equality of these variances $s_i^2$, Levene proposes the following method:
0. Formulate the null hypothesis:
All results of measurement originate from populations with equal variance:
$\sigma_1^2 = \sigma_2^2 = \ldots = \sigma_k^2$.
1. Calculate the absolute deviations of the results of measurement $x_{ij}$ from the
mean values $\bar{x}_i$. This corresponds to a transformation according to the equation:
$y_{ij} = \left| x_{ij} - \bar{x}_i \right|$.
The transformed values $y_{ij}$ are entered in the evaluating scheme.
Further calculation is done exclusively with the transformed values $y_{ij}$.
Experiment No.   Results                               Mean           Variance
1                $y_{11}, y_{12}, \ldots, y_{1n}$      $\bar{y}_1$    $s_1^2$
2                $y_{21}, y_{22}, \ldots, y_{2n}$      $\bar{y}_2$    $s_2^2$
...
k                $y_{k1}, y_{k2}, \ldots, y_{kn}$      $\bar{y}_k$    $s_k^2$

2. Calculate the mean values $\bar{y}_i$ and variances $s_i^2$
3. Calculate the mean value of the variances $\bar{s}_y^2$
4. Calculate the variance $s_{\bar{y}}^2$ of the mean values $\bar{y}_i$
5. F test with the test statistic

$$F = \frac{n \cdot s_{\bar{y}}^2}{\bar{s}_y^2} \qquad \text{Degrees of freedom: } f_1 = k - 1,\; f_2 = (n - 1) \cdot k$$

If $F$, for example, is greater than the percentage point $F_{k-1;\,(n-1)k;\,0.99}$, then the null
hypothesis will be rejected with an error probability of 1%.
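Steps 1 to 5 can be sketched as follows. The Python example is an illustration added here; the function name and the data are hypothetical, and equal numbers of values per row are assumed. It performs the transformation to absolute deviations and then applies the same calculation as in the one-way analysis of variance to the transformed values.

```python
from statistics import mean, variance


def levene_f(rows):
    """Return (F, f1, f2) of the equality test of several variances described above."""
    k = len(rows)
    n = len(rows[0])
    if any(len(row) != n for row in rows):
        raise ValueError("all rows must contain the same number of values")
    # Step 1: absolute deviations from the row mean values.
    transformed = []
    for row in rows:
        row_mean = mean(row)
        transformed.append([abs(x - row_mean) for x in row])
    # Steps 2 to 4: row means, mean of the row variances, variance of the row means.
    row_means = [mean(row) for row in transformed]
    mean_variance = mean(variance(row) for row in transformed)
    variance_of_means = variance(row_means)
    # Step 5: test statistic and degrees of freedom.
    return n * variance_of_means / mean_variance, k - 1, k * (n - 1)


# Hypothetical results of three experimental rows with clearly different scatter.
rows = [
    [10.0, 10.1, 9.9, 10.2, 9.8],
    [10.0, 11.5, 8.4, 12.1, 8.0],
    [10.0, 10.4, 9.5, 10.6, 9.5],
]
f_value, f1, f2 = levene_f(rows)
print(f"F = {f_value:.2f} with f1 = {f1} and f2 = {f2} degrees of freedom")
```

The value of F has to be compared with the tabulated percentage point for the degrees of freedom f1 and f2.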
7. Design of Experiments with Orthogonal Arrays
and Evaluating such Experiments
In this section, two simple examples will be used to represent how orthogonal arrays are
applied:
Example 1: One-factor-at-a-time method
The change in length of an alloy is to be determined by experiment. Two experiments
are performed.
1. Experiment: length at $T_1 = 25$ °C
2. Experiment: length at $T_2 = 100$ °C
$L_1(25\ \text{°C}) = 100.04$ cm     $L_2(100\ \text{°C}) = 100.16$ cm
One assumes that a linear relationship exists between expansion and temperature and therefore wants to calculate the equation of the straight line in order to determine arbitrary intermediate values.
Equation of the straight line: $L = A_0 + A_1 \cdot T$
Through a coordinate transformation, as represented schematically in Figure 7.0.1 by the second x-axis, the pair of values ($T_1$, $T_2$) is formally transformed into (-1, +1).
Figure 7.0.1
The transformation equation is

$$x = \frac{T - \dfrac{T_2 + T_1}{2}}{\dfrac{T_2 - T_1}{2}} .$$

Remark:
This equation can be written in the form given in 7.1 through a simple transformation:

$$x = \frac{2}{T_2 - T_1} \cdot (T - T_2) + 1 .$$

Substituting the values $T_1 = 25$ °C and $T_2 = 100$ °C gives:

$$x = \frac{T - 62.5}{37.5} .$$

For $T = T_2$ follows: $x = +1$.
For $T = T_1$ follows: $x = -1$.
In the transformed coordinate system, the straight line equation is: $L = a_0 + a_1 \cdot x$.
From this follows for $x = +1$: $100.16 = a_0 + a_1$,
for $x = -1$: $100.04 = a_0 - a_1$.
At this point the reason for the coordinate transformation becomes clear: the coefficients $a_0$ and
$a_1$ are easy to calculate by addition or subtraction of both equations:

$$a_0 = \frac{100.16 + 100.04}{2} = 100.1 \qquad a_1 = \frac{100.16 - 100.04}{2} = 0.06 .$$

The coefficient $a_0$ is the mean value of both lengths: $a_0 = \frac{L_2 + L_1}{2}$.
The coefficient $a_1$ is the half effect (see Figure 7.0.1): $a_1 = \frac{L_2 - L_1}{2}$.
Thus, in the transformed system the equation of the straight line is:

$$L = \frac{L_2 + L_1}{2} + \frac{L_2 - L_1}{2} \cdot x$$

$$L = 100.1 + 0.06 \cdot x .$$

The equation of the straight line in the original system is found by reverse transformation:

$$L = 100.1 + 0.06 \cdot \frac{T - 62.5}{37.5} \qquad L = 100 + 0.0016 \cdot T .$$
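The arithmetic of Example 1 can be retraced with a short Python sketch. It is an illustration added here; the function and variable names are hypothetical, the numbers are those of the example.

```python
def to_coded(t, t_low, t_high):
    """Transform a temperature to the coded variable x (t_low -> -1, t_high -> +1)."""
    return (t - (t_high + t_low) / 2.0) / ((t_high - t_low) / 2.0)


t1, t2 = 25.0, 100.0      # temperatures in degrees Celsius
l1, l2 = 100.04, 100.16   # measured lengths in cm

a0 = (l2 + l1) / 2.0      # mean value of both lengths
a1 = (l2 - l1) / 2.0      # half effect


def length(t):
    """Straight line L = a0 + a1 * x, evaluated for a temperature t."""
    return a0 + a1 * to_coded(t, t1, t2)


# Reverse transformation to the form L = intercept + slope * T.
slope = a1 / ((t2 - t1) / 2.0)
intercept = a0 - slope * (t2 + t1) / 2.0

print(f"a0 = {a0:.2f}, a1 = {a1:.2f}")
print(f"L = {intercept:.2f} + {slope:.4f} * T")
```

Evaluating `length(62.5)` returns the mean value a0, because the centre of the temperature range corresponds to x = 0.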
Example 2: Two-Factor Design
This example is intended to clarify the mathematical procedure followed when evaluating
experiments using orthogonal arrays, applying a known and analytically exact physical law:
Ohm's law.
We put ourselves in the position of an experimenter who does not know the relationship between voltage, current and resistance and wants to investigate it with the help of a simple experiment.
We assume that he has conducted four individual experiments according to Figure 7.0.2, and we ignore experimental repetitions and measuring errors.
$R_1 = 20\ \Omega$     $R_2 = 60\ \Omega$     $I_1 = 4$ A     $I_2 = 12$ A
Sought: $U = f(R, I)$
Transformation:

$$x_1 = \frac{R - \dfrac{R_2 + R_1}{2}}{\dfrac{R_2 - R_1}{2}} = \frac{R - 40}{20}$$

$$x_2 = \frac{I - \dfrac{I_2 + I_1}{2}}{\dfrac{I_2 - I_1}{2}} = \frac{I - 8}{4}$$

Figure 7.0.2
Multilinear formulation of the solution:

$$U = a_0 + a_1 \cdot x_1 + a_2 \cdot x_2 + a_{12} \cdot x_1 \cdot x_2$$

1. $x_1 = -1$   $x_2 = -1$   $a_0 - a_1 - a_2 + a_{12} = 80$
2. $x_1 = +1$   $x_2 = -1$   $a_0 + a_1 - a_2 - a_{12} = 240$
3. $x_1 = -1$   $x_2 = +1$   $a_0 - a_1 + a_2 - a_{12} = 240$
4. $x_1 = +1$   $x_2 = +1$   $a_0 + a_1 + a_2 + a_{12} = 720$

On the right-hand side are the voltages $U$ determined in the individual experiment combinations.

$$a_0 = \frac{80 + 240 + 240 + 720}{4} = 320$$

$$a_1 = \frac{720 + 240}{4} - \frac{240 + 80}{4} = 160$$

$$a_2 = \frac{720 + 240}{4} - \frac{240 + 80}{4} = 160$$

$$a_{12} = \frac{720 + 80}{4} - \frac{240 + 240}{4} = 80$$

Substituted in the formulated solution, one gets: $U = 320 + 160 \cdot x_1 + 160 \cdot x_2 + 80 \cdot x_1 \cdot x_2$.
Reverse transformation:

$$U = 320 + 160 \cdot \frac{R - 40}{20} + 160 \cdot \frac{I - 8}{4} + 80 \cdot \frac{R - 40}{20} \cdot \frac{I - 8}{4}$$

$$U = R \cdot I$$
Remark:
In this example, the right solution (Ohm's law) is bound to come out because the multilinear form $U = a_0 + a_1 \cdot x_1 + a_2 \cdot x_2 + a_{12} \cdot x_1 \cdot x_2$ was exactly the right formulation. A more complex functional relationship with quotients or exponentials of the influence factors would be described by this formulation only approximately, or not at all (see 7.3).
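That the reverse transformation really reproduces U = R · I can be checked numerically. The following Python sketch is an illustration added here; the function and variable names are hypothetical, the four voltages are those of the example.

```python
def coded(value, low, high):
    """Transform a factor setting to the coded variable (low -> -1, high -> +1)."""
    return (value - (high + low) / 2.0) / ((high - low) / 2.0)


r_low, r_high = 20.0, 60.0   # resistance levels in ohms
i_low, i_high = 4.0, 12.0    # current levels in amperes

# Voltages of the four experiments in the order (-,-), (+,-), (-,+), (+,+).
y1, y2, y3, y4 = 80.0, 240.0, 240.0, 720.0

a0 = (y1 + y2 + y3 + y4) / 4.0
a1 = ((y2 + y4) - (y1 + y3)) / 4.0
a2 = ((y3 + y4) - (y1 + y2)) / 4.0
a12 = ((y1 + y4) - (y2 + y3)) / 4.0


def voltage(r, i):
    """Multilinear model U = a0 + a1*x1 + a2*x2 + a12*x1*x2."""
    x1 = coded(r, r_low, r_high)
    x2 = coded(i, i_low, i_high)
    return a0 + a1 * x1 + a2 * x2 + a12 * x1 * x2


print(a0, a1, a2, a12)
print(voltage(35.0, 7.0), 35.0 * 7.0)
```

For any pair (R, I), including settings between the levels, the model value coincides with the product R · I, because the multilinear form contains exactly the product term that Ohm's law requires.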
Generalization:
For two factors and two levels, the equation of the multilinear form in the transformed
system is:

$$y = a_0 + a_1 \cdot x_1 + a_2 \cdot x_2 + a_{12} \cdot x_1 \cdot x_2 .$$

The coefficients can easily be determined with the following matrix. One designates this
matrix as an orthogonal arrangement or an orthogonal array (see 7.4). The term
orthogonality in this connection means, simply put, that in each column both levels (-) and
(+) appear equally frequently (see also the general formulation scheme in 7.4.1).
Orthogonality is explained in [1] through mathematical orthogonality conditions.

I    x1   x2   x1x2   y
+    -    -    +      y1
+    +    -    -      y2
+    -    +    -      y3
+    +    +    +      y4

$$a_0 = \frac{y_1 + y_2 + y_3 + y_4}{4}$$

$$a_1 = \frac{(y_2 + y_4) - (y_3 + y_1)}{4}$$

$$a_2 = \frac{(y_3 + y_4) - (y_1 + y_2)}{4}$$

$$a_{12} = \frac{(y_1 + y_4) - (y_2 + y_3)}{4}$$
The coefficient $a_0$ is the mean value of all measurement results. The coefficient $a_1$ is
half the mean effect of a change of $x_1$ from -1 to +1:

$$a_1 = \frac{1}{2} \cdot \frac{\text{Effect}(x_2 = -1) + \text{Effect}(x_2 = +1)}{2}$$
The representation in Figure 7.1.3 shows the contours of a hill (see Figure 7.1.2), as are
found on topographic charts. In the example shown, a jump from a line to the neighbouring
line corresponds to a height difference of 10 m.
Closely neighbouring contours represent a steep ascent in a direction perpendicular to the
contours. If one remains on a closed contour, then one moves - pictorially speaking - at a
constant height around the hill.
Let us now move away from the picture of a hill and consider, instead of the height, a general
function $y$ which depends upon the parameters $A$ and $B$: $y = f(A, B)$.
Figure 7.1.2
Figure 7.1.3
$y$ is a target characteristic whose value is determined by the setting of the factors $A$ and $B$. Each setting $(A, B)$ then corresponds to a point in the A-B plane, and this in turn to a value $y = f(A, B)$.
One finds for instance the following results:
A B y
6 12 43
12 12 62
6 20 78
12 20 113
The four points (A,B) form a rectangle in Figure 7.1.3.
They are drawn in Figure 7.1.2 over the A-B plane with $y$ as the third coordinate, which corresponds to the height above this plane.
From this representation, it is just as apparent as in Figure 7.1.1, that when dealing with
factorial designs at two levels, a linear model (straight line, plane) is taken as a basis, in
order to approximate the unknown, in general, curved response surface.
Figure 7.1.4 shows a further way to represent these results. The target characteristic $y$ is entered as a function of $A$, with $B$ as a fixed parameter.
In Figure 7.1.3, an attempt is made to illustrate the three-dimensional surface $y = f(A, B)$, which corresponds to the hill surface, two-dimensionally as a function of both factors $A$ and $B$.
The dotted curves in Figure 7.1.4, on the contrary, represent the function $y$ when $B$ is fixed: $y = f(A, B = \text{const.})$
They are, as such, the intersection lines of a perpendicular cut through the hill's surface in which $B$ is constant (see Figure 7.1.2).
Figure 7.1.4
Analogously, the dotted lines in Figure 7.1.5 represent the function $y$ when $A$ is constant.
These facts are illustrated by the following figures, as further examples.
Figure 7.1.5
Figure 7.1.6
In principle, one can also use these methods for representing results of experiments.
The above scheme can be simplified by transforming the factor levels $A_1 = 6$, $A_2 = 12$, $B_1 = 12$, $B_2 = 20$ according to the following rule:

$$X^* = \frac{2}{X_2 - X_1} \cdot (X - X_2) + 1 .$$

Example:

$$A_1^* = \frac{2}{A_2 - A_1} \cdot (A_1 - A_2) + 1 = -1 \qquad A_2^* = \frac{2}{A_2 - A_1} \cdot (A_2 - A_2) + 1 = +1$$

$$B_1^* = \frac{2}{B_2 - B_1} \cdot (B_1 - B_2) + 1 = -1 \qquad B_2^* = \frac{2}{B_2 - B_1} \cdot (B_2 - B_2) + 1 = +1$$
If one considers only the attained signs, then after the coordinate transformation one at-
tains the following design matrix for the two-factor design with two levels, instead of the
above scheme.
No. A B y
1 - - y1
2 + - y2
3 - + y3
4 + + y4
The second row accordingly corresponds to an experiment in which the factor $A$ is set to
the upper level (+) and the factor $B$ to the lower level (-). Instead of using the form $A_1$, $A_2$ for
the settings of factor $A$, one frequently uses $A_-$ and $A_+$.
The column $y$ contains the results $y_1, \ldots, y_4$ of the four experiment rows. They can be
represented in the following form.
This form of representation is also applicable when one (or several) of the investigated
factors is not a quantitative, adjustable variable but a qualitative variable with fixed levels (e.g. material 1, material 2). Naturally, an interpolation of intermediate values
is not reasonable in this case.
The results for three influence factors can be represented graphically by expanding Figure
7.1.10 into the form of a cube. Each corner point then corresponds to a combination of
levels of the factors A, B and C. When dealing with more than three factors, only two- or
three-dimensional projections of the n-dimensional experimental space can be represented.
Figure 7.1.10
Figure 7.1.11
Fig. 7.1.13
Fig. 7.1.14
Fig. 7.1.15
These representations clearly show the basic appearance of a surface described by a multilinear form. The linearity with respect to both coordinates is obvious. In addition, one can see that the minimum and the maximum of every straight line considered lie on the boundary of the experimental space.
7.2 Calculating the Effects
The effect of a factor is the change in the target characteristic y when the factor is changed from the - level to the + level, averaged over the settings of all the other factors. Naturally, the effect depends upon the specific choice of the levels.
A graph of the effects, for the example of the two-factor design, is shown in Fig 7.2.1.
As long as the factors behave in an additive manner, both lines are parallel (see Figure 7.1.11). If, on the contrary, the effect of one factor depends upon the setting (level) of another, then an interaction of these factors exists, since they do not behave in an additive manner.
The evaluation matrix of the two-factor design contains a column AB for the interaction of these factors in addition to the columns for the factors A and B.
No. A B AB y
1 - - + y1
2 + - - y2
3 - + - y3
4 + + + y4
Fig. 7.2.1
The effect of a factor X is calculated as the difference between the mean of all y obtained when X is at the + level and the mean of all y obtained when X is at the - level. This calculation rule applies analogously to interactions and can be used generally for orthogonal designs with m factors.
For this example the following is valid:
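$$\text{Effect}(A) = \frac{\sum y(A_+)}{2^{m-1}} - \frac{\sum y(A_-)}{2^{m-1}} = \frac{y_2 + y_4}{2} - \frac{y_1 + y_3}{2}$$
$$\text{Effect}(B) = \frac{\sum y(B_+)}{2^{m-1}} - \frac{\sum y(B_-)}{2^{m-1}} = \frac{y_3 + y_4}{2} - \frac{y_1 + y_2}{2}$$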
Fig. 7.2.2
Fig. 7.2.3
$$\text{Effect}(AB) = \frac{\sum y(AB_+)}{2^{m-1}} - \frac{\sum y(AB_-)}{2^{m-1}} = \frac{y_1 + y_4}{2} - \frac{y_2 + y_3}{2} .$$
Here, designating the factor levels with + and -, as opposed to the frequently used notation 1 and 2, proves advantageous, since the signs of the $y_i$ on the right-hand side of these equations can be read directly from the evaluation matrix for A, B and AB. Furthermore, the column AB of the evaluation matrix can be determined sign by sign as the product of the columns A and B ($(-1) \cdot (-1) = +1$).
When dealing with fractional factorial designs, confounding of factors with interactions can occur. The effects of confounded quantities can then no longer be calculated separately.
Hint:
The calculation of mean effects is given here only for completeness. Using Figures 7.1.6 - 7.1.9, one can easily see that if a strong interaction AB exists, the mean effects of both factors A and B can become zero although each factor exhibits large total effects.
7.3 Regression Analysis
From the effects of the factors, the coefficients of the multilinear form (regression polynomial) can be calculated by using the coordinate transformation that converts the factor settings into the coded form (+ level, - level). In coded coordinates, the coefficients sought are equal to half of the effects.
Consider, as an example, the function: $y = 3 + 4 x_1 - 2 x_2 + 5 x_1 x_2$.
The four experiments with the settings
$A_- = 5$, $A_+ = 10$, $B_- = 6$, $B_+ = 12$
would accordingly deliver the following results if experimental noise is disregarded:
$$y_1 = 3 + 4 \cdot 5 - 2 \cdot 6 + 5 \cdot 5 \cdot 6 = 161$$
$$y_2 = 3 + 4 \cdot 10 - 2 \cdot 6 + 5 \cdot 10 \cdot 6 = 331$$
$$y_3 = 3 + 4 \cdot 5 - 2 \cdot 12 + 5 \cdot 5 \cdot 12 = 299$$
$$y_4 = 3 + 4 \cdot 10 - 2 \cdot 12 + 5 \cdot 10 \cdot 12 = 619 .$$
We now proceed as though the above initial polynomial were unknown and try to derive the coefficients from the experimental data (see 7.2).
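$$\text{Effect}(A) = \frac{y_2 + y_4}{2} - \frac{y_1 + y_3}{2} = 245$$
$$\text{Effect}(B) = \frac{y_3 + y_4}{2} - \frac{y_1 + y_2}{2} = 213$$
$$\text{Effect}(AB) = \frac{y_1 + y_4}{2} - \frac{y_2 + y_3}{2} = 75$$
$$\text{Constant term} = \frac{y_1 + y_2 + y_3 + y_4}{4} = 352.5 .$$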
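These numbers can be reproduced with a few lines of code. The sketch below uses the four results from the text and the sign columns of the evaluation matrix; the helper name `effect` is illustrative only.

```python
# Sign columns of the two-factor evaluation matrix (rows 1..4)
A = [-1, +1, -1, +1]
B = [-1, -1, +1, +1]
AB = [a * b for a, b in zip(A, B)]   # interaction column = product of A and B

y = [161, 331, 299, 619]             # results y1..y4 from the text


def effect(signs, results):
    """Mean of the results at the + level minus mean at the - level."""
    plus = [r for s, r in zip(signs, results) if s > 0]
    minus = [r for s, r in zip(signs, results) if s < 0]
    return sum(plus) / len(plus) - sum(minus) / len(minus)


effect_A = effect(A, y)
effect_B = effect(B, y)
effect_AB = effect(AB, y)
mean_y = sum(y) / len(y)

print(effect_A, effect_B, effect_AB, mean_y)  # 245.0 213.0 75.0 352.5
```

The same function works for any column of an orthogonal two-level design, since only the signs of the column are needed.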
If one now substitutes half of the effects as coefficients into the polynomial (model)
$$y = a_0 + a_1 x_1 + a_2 x_2 + a_{12} x_1 x_2$$
and considers the coordinate transformation (see Section 7.1)
$$X^* = \frac{2}{X_2 - X_1}\,(X - X_2) + 1 ,$$
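then this results in
$$y = 352.5 + \frac{245}{2}\left(\frac{2}{10-5}\,(A - 10) + 1\right) + \frac{213}{2}\left(\frac{2}{12-6}\,(B - 12) + 1\right) + \frac{75}{2}\left(\frac{2}{10-5}\,(A - 10) + 1\right)\left(\frac{2}{12-6}\,(B - 12) + 1\right)$$
and after expanding this expression:
$$y = 3 + 4A - 2B + 5AB .$$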
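The back-transformation can also be checked numerically instead of expanding the expression by hand. The sketch below, with illustrative function names, evaluates the coded model (constant term plus half effects) and the original polynomial at a few settings inside the experimental space and compares them.

```python
def code_level(x, low, high):
    """Coded value of x: low -> -1, high -> +1."""
    return 2.0 * (x - high) / (high - low) + 1.0


def fitted(a, b):
    """Model built from the constant term and half of the effects."""
    a_c = code_level(a, 5, 10)
    b_c = code_level(b, 6, 12)
    return 352.5 + 245 / 2 * a_c + 213 / 2 * b_c + 75 / 2 * a_c * b_c


def original(a, b):
    """Initial polynomial of the example."""
    return 3 + 4 * a - 2 * b + 5 * a * b


points = [(5, 6), (10, 12), (7.5, 9), (6, 11)]
differences = [fitted(a, b) - original(a, b) for a, b in points]
print(differences)
```

All differences vanish up to rounding, which confirms that the two forms describe the same surface.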
It is therefore possible to calculate, from the results of the experiment, the coefficients of the regression polynomial that was chosen as the model underlying the experimental design.
Consequently, interpolated values within the experimental space can be determined. If one or several additional experiments are conducted in the center of the experimental space (e.g. the rectangle in Figure 7.1.3) (design with center point), it is possible to obtain information about the adequacy of the underlying model, i.e. about the quality of the fit, by comparing the results for this point with the corresponding interpolated values.
If larger deviations occur between the results of the additional experiments and the values interpolated with the help of the regression polynomial, then this shows that the chosen model describes reality insufficiently, if not entirely wrongly.
Here the whole crux of DOE with orthogonal arrays shows itself: right results can only be attained with the right model.
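The center-point check can be illustrated with a small sketch. The response function below is purely hypothetical: it is the example polynomial with an added curvature term that a multilinear model cannot represent. In coded units the center has the coordinates (0, 0), so the interpolated value there is simply the mean of the four corner results.

```python
def true_response(a, b):
    # Hypothetical process (an assumption for illustration): the example
    # polynomial plus a curvature term in A
    return 3 + 4 * a - 2 * b + 5 * a * b - 2.0 * (a - 7.5) ** 2


corners = [(5, 6), (10, 6), (5, 12), (10, 12)]
y = [true_response(a, b) for a, b in corners]

predicted_center = sum(y) / len(y)        # multilinear model at coded (0, 0)
measured_center = true_response(7.5, 9)   # additional center-point experiment
deviation = measured_center - predicted_center

print(predicted_center, measured_center, deviation)
```

In a real experiment the deviation would have to be judged against the experimental noise before concluding that the model is inadequate.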
7.4 Factorial Designs
7.4.1 Design Matrix
In Section 7.1, the creation of a simple scheme for a $2^2$ design is shown by considering a coordinate transformation:
No. A B
1 - -
2 + -
3 - +
4 + +
Strictly speaking, one can interpret the first two rows of the design as a one-factor-at-a-time experiment, where the factor A is set to the lower (-) or upper (+) level, while the factor B is on the - level.
In rows 3 and 4, A is again set to the - and + levels, while B is held fixed at the + level.
This scheme is the basis for a general rule of factorial designs that is made clear by means
of the following representation.
Rows 1-4 (columns A, B) form the $2^2$ design, rows 1-8 (columns A to C) the $2^3$ design, rows 1-16 (columns A to D) the $2^4$ design, and rows 1-32 (columns A to E) the $2^5$ design.
Experiment A B C D E
1 - - - - -
2 + - - - -
3 - + - - -
4 + + - - -
5 - - + - -
6 + - + - -
7 - + + - -
8 + + + - -
9 - - - + -
10 + - - + -
11 - + - + -
12 + + - + -
13 - - + + -
14 + - + + -
15 - + + + -
16 + + + + -
17 - - - - +
18 + - - - +
19 - + - - +
20 + + - - +
21 - - + - +
22 + - + - +
23 - + + - +
24 + + + - +
25 - - - + +
26 + - - + +
27 - + - + +
28 + + - + +
29 - - + + +
30 + - + + +
31 - + + + +
32 + + + + +
Scheme for illustrating the general rule for factorial designs
(see [1] p. 53).
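The general rule behind this scheme (the first column alternates every row, the second every two rows, the third every four rows, and so on) can be sketched in code. The function name `full_factorial` is illustrative.

```python
def full_factorial(m):
    """Return the 2**m rows of a two-level design matrix in standard order.

    Column j (0-based) changes sign every 2**j rows, so the first 2**k rows,
    restricted to the first k columns, form the smaller 2**k design.
    """
    return [[+1 if (row >> j) & 1 else -1 for j in range(m)]
            for row in range(2 ** m)]


names = "ABCDE"
design = full_factorial(5)

print("No. " + " ".join(names))
for no, row in enumerate(design, start=1):
    print(f"{no:>3} " + " ".join("+" if s > 0 else "-" for s in row))
```

Calling `full_factorial(2)` returns the four rows of the two-factor design from Section 7.1.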
Remark:
In this model, the designations $x_1$, $x_2$ and $x_3$ are used instead of the names A, B and C for the three factors. Correspondingly, e.g. $a_{12}$ is the coefficient of the interaction AB.
The columns of the evaluation matrix assigned to the interactions can be calculated sign by sign as products of the columns of the related factors ($(-1) \cdot (-1) = +1$). For example, the column for the interaction AC results when one multiplies the columns of the factors A and C with each other.
7.4.3 Confounding
If all 8 experiments of a $2^3$ design are conducted, the effects and thus the coefficients of the model can be calculated separately for all factors and interactions. Mathematically, the calculation of the coefficients means solving a system of 8 equations with 8 unknowns (see model, design and evaluation matrix).
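$$\begin{aligned}
y_1 &= a_0 - a_1 - a_2 + a_{12} - a_3 + a_{13} + a_{23} - a_{123} \\
y_2 &= a_0 + a_1 - a_2 - a_{12} - a_3 - a_{13} + a_{23} + a_{123} \\
y_3 &= a_0 - a_1 + a_2 - a_{12} - a_3 + a_{13} - a_{23} + a_{123} \\
y_4 &= a_0 + a_1 + a_2 + a_{12} - a_3 - a_{13} - a_{23} - a_{123} \\
y_5 &= a_0 - a_1 - a_2 + a_{12} + a_3 - a_{13} - a_{23} + a_{123} \\
y_6 &= a_0 + a_1 - a_2 - a_{12} + a_3 + a_{13} - a_{23} - a_{123} \\
y_7 &= a_0 - a_1 + a_2 - a_{12} + a_3 - a_{13} + a_{23} - a_{123} \\
y_8 &= a_0 + a_1 + a_2 + a_{12} + a_3 + a_{13} + a_{23} + a_{123}
\end{aligned}$$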
The coefficients of this system of equations are easy to calculate due to its simple structure. For example, the constant $a_0$ can be determined by adding all rows and dividing the sum by 8 (mean of all results $y_i$, see 7.3, Regression Analysis).
Owing to the balanced nature of the system of equations (in front of every coefficient a plus sign appears as frequently as a minus sign), all terms on the right-hand side except $a_0$ cancel each other out when the rows are added. In order to calculate $a_1$, the rows 1, 3, 5 and 7 are each multiplied by -1, then all 8 rows are added together and the sum is again divided by 8. Again, apart from $a_1$, all terms on the right-hand side cancel each other out. The calculation of all the remaining coefficients is analogous. If one compares this procedure with the equations in Section 7.2, it becomes evident that calculating the coefficients of the system of equations and calculating half of the effects of the factors are identical processes.
Because a plus sign appears in front of $a_0$ in every row of the equation system, the evaluation matrix is often given a leading column consisting exclusively of plus signs, which is designated with I (for identity) or 0.
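The procedure of multiplying rows by the signs of a column, adding and dividing by 8 can be sketched as follows. The coefficient values are hypothetical and serve only to generate eight results from which the coefficients are then recovered; the function names are illustrative.

```python
from itertools import combinations
from math import prod


def evaluation_matrix(m):
    """Map each term (tuple of factor indices) to its sign column.

    The empty tuple () stands for the identity column I (all plus signs).
    """
    rows = [[+1 if (r >> j) & 1 else -1 for j in range(m)]
            for r in range(2 ** m)]
    terms = [c for k in range(m + 1) for c in combinations(range(m), k)]
    return {t: [prod(row[j] for j in t) for row in rows] for t in terms}


def coefficients(y, m):
    """Coefficient of a term = sum(sign * y) / number of runs (half the effect)."""
    cols = evaluation_matrix(m)
    return {t: sum(s * yi for s, yi in zip(col, y)) / len(y)
            for t, col in cols.items()}


# Hypothetical coefficients a0, a1, a2, a12, a3, a13, a23, a123
true = {(): 10.0, (0,): 2.0, (1,): -1.0, (0, 1): 0.5,
        (2,): 3.0, (0, 2): 0.0, (1, 2): -0.25, (0, 1, 2): 0.125}

cols = evaluation_matrix(3)
y = [sum(true[t] * cols[t][r] for t in true) for r in range(8)]
recovered = coefficients(y, 3)
print(recovered)
```

Because all eight columns are mutually orthogonal, each coefficient is recovered independently of the others.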
If fewer than 8 experiments are conducted, then it is clear that it is no longer possible to determine the coefficients separately. So-called confounding occurs. This is explained by means of the example of the $2^{3-1}$ fractional factorial design, in which three factors are to be investigated but only 4 experiments are conducted.
Design matrix of the $2^{3-1}$ design (see [9]):
No. A B C
1 - - +
2 + - -
3 - + -
4 + + +
We now consider what the interaction columns AB, AC and BC of the related evaluation matrix look like. They can be calculated as products of the corresponding columns of the design matrix.
If one compares these columns with the columns of the design matrix, then it is evident that AB is identical to C, AC to B and BC to A. Thus, the columns A and BC, B and AC, C and AB in the evaluation matrix are not distinguishable at all. One says that the factor A is confounded with the interaction BC, the factor B with the interaction AC and the factor C with the interaction AB.
No. A/BC B/AC C/AB
1 - - +
2 + - -
3 - + -
4 + + +
Evaluation matrix of the $2^{3-1}$ fractional factorial design
Interaction columns calculated from the design matrix:
No. BC AB AC
1 - + -
2 + - -
3 - - +
4 + + +
The occurrence of confounded factors becomes somewhat clearer still if one directly considers the incomplete system of equations corresponding to the $2^{3-1}$ design:
$$\begin{aligned}
y_1 &= a_0 - a_1 - a_2 + a_{12} + a_3 - a_{13} - a_{23} + a_{123} \\
y_2 &= a_0 + a_1 - a_2 - a_{12} - a_3 - a_{13} + a_{23} + a_{123} \\
y_3 &= a_0 - a_1 + a_2 - a_{12} - a_3 + a_{13} - a_{23} + a_{123} \\
y_4 &= a_0 + a_1 + a_2 + a_{12} + a_3 + a_{13} + a_{23} + a_{123} .
\end{aligned}$$
If, in this case, the first and third equations are multiplied by -1 and subsequently all four equations are added together, then all terms on the right-hand side apart from $a_1$ and $a_{23}$ cancel out. These are the coefficients assigned to the factor A and to the interaction BC. Therefore A and BC are confounded. The remaining confoundings follow analogously.
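The confounding can be made visible with a short sketch. The coefficient values are hypothetical; the point is that the contrast of column A returns the sum $a_1 + a_{23}$ and not $a_1$ alone.

```python
# Design matrix of the half fraction (4 runs, 3 factors)
A = [-1, +1, -1, +1]
B = [-1, -1, +1, +1]
C = [+1, -1, -1, +1]


def times(u, v):
    """Sign-wise product of two columns."""
    return [x * y for x, y in zip(u, v)]


print(times(B, C) == A)  # True: A is confounded with BC
print(times(A, C) == B)  # True: B is confounded with AC
print(times(A, B) == C)  # True: C is confounded with AB

# Hypothetical coefficients of the full model (assumed for illustration)
a0, a1, a2, a12, a3, a13, a23, a123 = 10.0, 2.0, -1.0, 0.5, 3.0, 0.0, -0.25, 0.125

y = [a0 + a1 * a + a2 * b + a12 * a * b + a3 * c
     + a13 * a * c + a23 * b * c + a123 * a * b * c
     for a, b, c in zip(A, B, C)]

estimate_A = sum(s * yi for s, yi in zip(A, y)) / len(y)
print(estimate_A)        # 1.75, i.e. a1 + a23
```

Only if $a_{23}$ is known to be zero does the contrast give the coefficient of A itself.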
Remark:
Strictly speaking, one should list extra columns in the evaluation matrix for the identity (column for the constant term $a_0$) and the three-factor interaction ABC. They are omitted for simplicity.
It is therefore not possible in the preceding example, for instance, to calculate the effect of factor A separately from the effect of the interaction BC.
At this point, a rather strange logic is used that is found in most of the literature on the subject of DOE. The effect of factor A can be determined if one assumes that the interaction BC does not exist. This means that one must be sure that the factors B and C behave purely additively. If this is clear, however, then it is sufficient to investigate B and C with the one-factor-at-a-time experiment.
In textbooks on DOE, it is often assumed that three-factor and higher interactions are improbable, and this assumption is exploited in order to formulate fractional factorial designs of the type $2^{m-1}$.
We investigate the evaluation matrix of the $2^{4-1}$ design as an example.
No. A B AB/CD C AC/BD BC/AD D/ABC
1 - - + - + + -
2 + - - - - + +
3 - + - - + - +
4 + + + - - - -
5 - - + + - - +
6 + - - + + - -
7 - + - + - + -
8 + + + + + + +
Instead of the $2^4 = 16$ experiments which would be necessary for investigating four factors on two levels each with the full factorial design, only 8 experiments are conducted here. If one determines the column of the interaction ABC, then one sees that it coincides with the column of factor D. Therefore, factor D is confounded with a three-factor interaction. When applying this design it is assumed that the three-factor interaction ABC does not exist. If this assumption is false, then a false effect results for D.
In addition, the two-factor interaction effects cannot be calculated separately. If, for instance, the third column turns out to be highly significant in the column-wise evaluation (factorial analysis of variance), then it cannot be determined whether this is due to the interaction AB or CD. Conversely, AB and CD can compensate each other (equally large, counteracting effects). This cannot be recognised from the evaluation. The reduction in the extent of experimentation is therefore a trade-off against the risk of a faulty result as well as a loss of information.
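Both points, the pairwise identical interaction columns and the possible compensation, can be sketched in a few lines. The interaction coefficients +2 and -2 and the constant 50 are hypothetical values chosen only to show the cancellation.

```python
# Columns A, B, C of the full three-factor design in standard order (8 runs)
runs = [[+1 if (r >> j) & 1 else -1 for j in range(3)] for r in range(8)]
A, B, C = ([run[j] for run in runs] for j in range(3))
D = [a * b * c for a, b, c in zip(A, B, C)]   # D is assigned to the ABC column


def times(u, v):
    """Sign-wise product of two columns."""
    return [x * y for x, y in zip(u, v)]


aliases = {
    "AB = CD": times(A, B) == times(C, D),
    "AC = BD": times(A, C) == times(B, D),
    "BC = AD": times(B, C) == times(A, D),
}
print(aliases)

# Hypothetical counteracting interactions: coefficient +2 for AB, -2 for CD
y = [50 + 2 * ab - 2 * cd for ab, cd in zip(times(A, B), times(C, D))]
contrast_AB_CD = sum(s * yi for s, yi in zip(times(A, B), y)) / len(y)
print(contrast_AB_CD)   # 0.0: the two interactions cancel in column AB/CD
```

The evaluation of column AB/CD then shows no effect at all, although both interactions are present in the assumed process.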
This statement applies all the more to fractional factorial designs in which the extent of experimentation is reduced to less than half (Taguchi method, see [10]).
The rows 1-8 of the $2^{4-1}$ design correspond to the rows 1, 10, 11, 4, 13, 6, 7, 16 of the complete $2^4$ design (see [9], Appendix). An experiment based on the $2^{4-1}$ design can still be rescued, if necessary, by adding the missing (complementary) eight rows. The c