8/12/2019 DOE in Excel
Experimental Design
A Factorial Design Example
Factorial designs are a type of experimental design for screening experiments.
The theory of factorial designs is explained in the document 'Factorial Designs', available for download as a pdf file.
Software suitable for analysis of factorial designs includes well-known programs such as Minitab and Statistica.
However, these packages are quite expensive and, as most experimenters have access to Excel, this spreadsheet
has been set up to illustrate using Excel to analyse a factorial design.
The data used for this design is from the article 'Screening and Sequential Experimentation: Simulations and
Flame Atomic Absorption Spectrometry Experiments', J. Chem. Ed., 74, 216 (Feb 1997).
The Design
The experiment in this case is the analysis of silver by flame AAS
To set up the experimental design the following steps need to be carried out:
1. Define the Variables
What variables affect the outcome of the experiment?
In this case 6 variables are considered to affect the result:
A Flame Height above Base (mm)
B Flame Stoichiometry
C Acetic Acid (%)
D Lamp Current (mA)
E Wavelength (nm)
F Slit Width (nm)
2. Define the Response Variable(s)
In this case the response variable is the AA signal (mAbs).
3. Define the Experimental Domain
We need to specify an appropriate range for each variable, i.e. a low and high value.
This range needs to be wide enough to include the optimal conditions but be within achievable
settings for the instrument. The following limits have been defined:
Variable Low High
A 6 12
B lean rich
C 0 5
D 4 8
E 328.1 338.1
F 0.2 0.7
4. Choice of Design
The aim of the experiment is to carry out a screening, i.e. determine which variables significantly affect the
response. If a variable doesn't significantly affect the result then it can be 'screened out'.
This means the variable is set at its mid-point value and not varied in subsequent experiments.
This is often a necessary step before a full optimization study, to reduce the number of variables to
a manageable number (preferably 2-4).
Factorial designs are commonly used for screening. In this case, with 6 variables, to carry out a full factorial
design - i.e. all combinations of each variable at the two levels - would require 2^6 = 64 experiments.
The full factorial design, in coded form, is shown on the next sheet.
In coded form the low settings for each variable are shown as -1 and the high settings as +1.
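The coding of settings as -1 and +1 can be sketched outside Excel as well. The Python below is only an illustration under stated assumptions: the function names to_coded and to_actual and the dictionary names are hypothetical, and the numeric limits are copied from the table of low and high settings.

```python
# Limits copied from the table of low and high settings.
# Variable B (lean/rich) is categorical, so it is coded by label instead.
LIMITS = {
    "A": (6.0, 12.0),
    "C": (0.0, 5.0),
    "D": (4.0, 8.0),
    "E": (328.1, 338.1),
    "F": (0.2, 0.7),
}
B_CODES = {"lean": -1, "rich": 1}


def to_coded(name, actual):
    """Map an actual setting onto the coded scale (-1 = low, +1 = high)."""
    low, high = LIMITS[name]
    centre = (low + high) / 2
    half_range = (high - low) / 2
    return (actual - centre) / half_range


def to_actual(name, coded):
    """Map a coded level back to the actual setting."""
    low, high = LIMITS[name]
    return (low + high) / 2 + coded * (high - low) / 2


print(to_coded("A", 6.0), to_coded("A", 12.0))  # coded low and high for variable A
print(to_actual("D", 0))                        # mid-point of variable D
```

A coded value of 0 corresponds to the mid-point of the range, which is the setting used when a variable is screened out.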
Full Factorial Design for 6 variables
A B C D E F
-1 -1 -1 -1 -1 -1
1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1
1 1 -1 -1 -1 -1
-1 -1 1 -1 -1 -1
1 -1 1 -1 -1 -1
-1 1 1 -1 -1 -1
1 1 1 -1 -1 -1
-1 -1 -1 1 -1 -1
1 -1 -1 1 -1 -1
-1 1 -1 1 -1 -1
1 1 -1 1 -1 -1
-1 -1 1 1 -1 -1
1 -1 1 1 -1 -1
-1 1 1 1 -1 -1
1 1 1 1 -1 -1
-1 -1 -1 -1 1 -1
1 -1 -1 -1 1 -1
-1 1 -1 -1 1 -1
1 1 -1 -1 1 -1
-1 -1 1 -1 1 -1
1 -1 1 -1 1 -1
-1 1 1 -1 1 -1
1 1 1 -1 1 -1
-1 -1 -1 1 1 -1
1 -1 -1 1 1 -1
-1 1 -1 1 1 -1
1 1 -1 1 1 -1
-1 -1 1 1 1 -1
1 -1 1 1 1 -1
-1 1 1 1 1 -1
1 1 1 1 1 -1
-1 -1 -1 -1 -1 1
1 -1 -1 -1 -1 1
-1 1 -1 -1 -1 1
1 1 -1 -1 -1 1
-1 -1 1 -1 -1 1
1 -1 1 -1 -1 1
-1 1 1 -1 -1 1
1 1 1 -1 -1 1
-1 -1 -1 1 -1 1
1 -1 -1 1 -1 1
-1 1 -1 1 -1 1
1 1 -1 1 -1 1
-1 -1 1 1 -1 1
1 -1 1 1 -1 1
-1 1 1 1 -1 1
1 1 1 1 -1 1
-1 -1 -1 -1 1 1
1 -1 -1 -1 1 1
-1 1 -1 -1 1 1
1 1 -1 -1 1 1
-1 -1 1 -1 1 1
1 -1 1 -1 1 1
-1 1 1 -1 1 1
1 1 1 -1 1 1
-1 -1 -1 1 1 1
1 -1 -1 1 1 1
-1 1 -1 1 1 1
1 1 -1 1 1 1
-1 -1 1 1 1 1
1 -1 1 1 1 1
-1 1 1 1 1 1
1 1 1 1 1 1
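The 64 rows of the full factorial design need not be typed by hand. A minimal Python sketch follows, assuming the same row order as the table (variable A alternating fastest); the function name full_factorial is hypothetical.

```python
from itertools import product

FACTORS = ["A", "B", "C", "D", "E", "F"]


def full_factorial(n_factors):
    """Return all 2**n_factors combinations of the coded levels -1 and +1.

    itertools.product varies its last position fastest, so each tuple is
    reversed to make the first variable (A) alternate fastest, as in the table.
    """
    return [tuple(reversed(levels)) for levels in product((-1, 1), repeat=n_factors)]


design = full_factorial(len(FACTORS))
print(len(design))       # number of experiments in the full design
for row in design[:4]:   # first few rows, for comparison with the table
    print(row)
```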
Fractional Factorial Design
It might be decided that the previous design contains too many experiments.
The following is a reduced Fractional Factorial Design containing 16 experiments.
A B C D E F
-1 -1 -1 -1 -1 -1
1 -1 -1 -1 1 -1
-1 1 -1 -1 1 1
1 1 -1 -1 -1 1
-1 -1 1 -1 1 1
1 -1 1 -1 -1 1
-1 1 1 -1 -1 -1
1 1 1 -1 1 -1
-1 -1 -1 1 -1 1
1 -1 -1 1 1 1
-1 1 -1 1 1 -1
1 1 -1 1 -1 -1
-1 -1 1 1 1 -1
1 -1 1 1 -1 -1
-1 1 1 1 -1 1
1 1 1 1 1 1
How was this design arrived at?
The columns A-D contain a full factorial design in these 4 variables, i.e. all combinations of the two levels.
Column E was created by multiplying the coefficients in columns A, B and C row-wise,
i.e. E = ABC, e.g. for row 10: -1 (cell F10) = -1 (B10) * -1 (C10) * -1 (D10).
Similarly, column F was created by B*C*D, i.e. F = BCD.
This creates a resolution 4 design, since the defining words are I = ABCE and I = BCDF (and their product, I = ADEF).
A fuller explanation is contained in the document 'Factorial Designs'.
On the next sheet the above design is displayed in actual levels. The responses were measured for the
16 experiments and the results displayed in the results column.
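The construction of the 16-run design (full factorial in A-D, then E = ABC and F = BCD) can be sketched in Python as follows. This is an illustration only; the function name fractional_design is hypothetical.

```python
from itertools import product


def fractional_design():
    """Build the 16-run design: full factorial in A-D, E = A*B*C, F = B*C*D."""
    rows = []
    # product varies its last position fastest, so unpack as (d, c, b, a)
    # to make column A alternate fastest, as in the table.
    for d, c, b, a in product((-1, 1), repeat=4):
        e = a * b * c   # generator E = ABC
        f = b * c * d   # generator F = BCD
        rows.append((a, b, c, d, e, f))
    return rows


design = fractional_design()
for row in design:
    print(row)
```

Because E = ABC and F = BCD, the products A*B*C*E and B*C*D*F equal +1 in every row, which is what the defining words I = ABCE and I = BCDF express.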
Response
A B C D E F Signal
6 lean 0 4 0.2 328.1 95
12 lean 0 4 0.7 328.1 41
6 rich 0 4 0.7 338.1 63
12 rich 0 4 0.2 338.1 83
6 lean 5 4 0.7 338.1 59
12 lean 5 4 0.2 338.1 114
6 rich 5 4 0.2 328.1 121
12 rich 5 4 0.7 328.1 59
6 lean 0 8 0.2 338.1 107
12 lean 0 8 0.7 338.1 38
6 rich 0 8 0.7 328.1 44
12 rich 0 8 0.2 328.1 73
6 lean 5 8 0.7 328.1 60
12 lean 5 8 0.2 328.1 97
6 rich 5 8 0.2 338.1 105
12 rich 5 8 0.7 338.1 53
Calculation of Main Effects
The first seven columns hold the coded design and the response; the last six columns hold the response multiplied by the coefficient of each variable.
A B C D E F Signal A B C D E F
-1 -1 -1 -1 -1 -1 95 -95 -95 -95 -95 -95 -95
1 -1 -1 -1 1 -1 41 41 -41 -41 -41 41 -41
-1 1 -1 -1 1 1 63 -63 63 -63 -63 63 63
1 1 -1 -1 -1 1 83 83 83 -83 -83 -83 83
-1 -1 1 -1 1 1 59 -59 -59 59 -59 59 59
1 -1 1 -1 -1 1 114 114 -114 114 -114 -114 114
-1 1 1 -1 -1 -1 121 -121 121 121 -121 -121 -121
1 1 1 -1 1 -1 59 59 59 59 -59 59 -59
-1 -1 -1 1 -1 1 107 -107 -107 -107 107 -107 107
1 -1 -1 1 1 1 38 38 -38 -38 38 38 38
-1 1 -1 1 1 -1 44 -44 44 -44 44 44 -44
1 1 -1 1 -1 -1 73 73 73 -73 73 -73 -73
-1 -1 1 1 1 -1 60 -60 -60 60 60 60 -60
1 -1 1 1 -1 -1 97 97 -97 97 97 -97 -97
-1 1 1 1 -1 1 105 -105 105 105 105 -105 105
1 1 1 1 1 1 53 53 53 53 53 53 53
Main effects -12 -1.25 15.5 -7.25 -47.25 4
How do we determine if the variable has a significant effect on the response?
To determine this the main effects for each variable are calculated.
To do this we average the responses for the variable at the high level and subtract from it the average response
at the low level.
This is equivalent to multiplying the response column (H) by the column of coefficients for the variable
(e.g. column B for variable A), summing the products and dividing by half the number of experiments (8).
How do we interpret the results?
The main effects give the relative importance of each variable. The (numerically) largest effect is for
wavelength (variable E), followed by % acetic acid (variable C) and flame height (variable A).
The sign of the effect also gives information. A negative effect means that the response is higher at the low setting.
In this case, for example, the absorbance is higher at the low wavelength setting of 328.1 nm.
From these experiments we could definitely 'screen out' flame stoichiometry and lamp current from further
experiments, i.e. set them at mid-point values (stoichiometry between lean and rich and current of 6 mA).
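The main-effect calculation described above (multiply the response by the coefficient column, sum, divide by 8) can be checked with a short Python sketch. The function name main_effects is hypothetical; the design is rebuilt from E = ABC and F = BCD, and the responses are copied from the Signal column.

```python
from itertools import product

FACTORS = "ABCDEF"

# Coded design rebuilt from the generators E = ABC and F = BCD (A alternates fastest).
DESIGN = [(a, b, c, d, a * b * c, b * c * d) for d, c, b, a in product((-1, 1), repeat=4)]

# Responses copied from the Signal column, in the same row order.
SIGNAL = [95, 41, 63, 83, 59, 114, 121, 59, 107, 38, 44, 73, 60, 97, 105, 53]


def main_effects(design, response):
    """Main effect = sum(coefficient * response) / (number of runs / 2)."""
    half = len(design) / 2
    return {
        name: sum(row[j] * y for row, y in zip(design, response)) / half
        for j, name in enumerate(FACTORS)
    }


effects = main_effects(DESIGN, SIGNAL)
# Print the variables from the numerically largest effect to the smallest.
for name, value in sorted(effects.items(), key=lambda kv: -abs(kv[1])):
    print(name, value)
```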
Main Effects Plots
These plots give us another way to compare the effects of the variables
This is the data used to calculate the main effects
A B C D E F
-95 -95 -95 -95 -95 -95
41 -41 -41 -41 41 -41
-63 63 -63 -63 63 63
83 83 -83 -83 -83 83
-59 -59 59 -59 59 59
114 -114 114 -114 -114 114
-121 121 121 -121 -121 -121
59 59 59 -59 59 -59
-107 -107 -107 107 -107 107
38 -38 -38 38 38 38
-44 44 -44 44 44 -44
73 73 -73 73 -73 -73
-60 -60 60 60 60 -60
97 -97 97 97 -97 -97
-105 105 105 105 -105 105
53 53 53 53 53 53
Step 1: Order each column from lowest to highest
A B C D E F
-121 -114 -107 -121 -121 -121
-107 -107 -95 -114 -114 -97
-105 -97 -83 -95 -107 -95
-95 -95 -73 -83 -105 -73
-63 -60 -63 -63 -97 -60
-60 -59 -44 -59 -95 -59
-59 -41 -41 -59 -83 -44
-44 -38 -38 -41 -73 -41
38 44 53 38 38 38
41 53 59 44 41 53
53 59 59 53 44 59
59 63 60 60 53 63
73 73 97 73 59 83
83 83 105 97 59 105
97 105 114 105 60 107
114 121 121 107 63 114
Step 2: In the above table responses at the low (-1) settings have a negative sign and responses at the high (+1) settings are positive.
We need to get the average of the absolute values at each setting. A main effects plot compares these two averages graphically.
A B C D E F
low 81.75 76.375 68 79.375 99.375 73.75
high 69.75 75.125 83.5 72.125 52.125 77.75
Step 3: Plot the data
[Figure: main effects plot of the average response (axis 0 to 120) at the low and high settings, one line for each of the variables A to F]
A variable with the biggest difference between the 'high' and 'low' values will be the most significant,
i.e. E followed by C and A. These are shown by the steepest slopes in the above graphs.
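The low and high averages used for the main effects plot can also be computed directly from the coded design, without sorting the columns. The sketch below is an illustration; the function name level_means is hypothetical and the data are the same 16 runs as before.

```python
from itertools import product

FACTORS = "ABCDEF"
DESIGN = [(a, b, c, d, a * b * c, b * c * d) for d, c, b, a in product((-1, 1), repeat=4)]
SIGNAL = [95, 41, 63, 83, 59, 114, 121, 59, 107, 38, 44, 73, 60, 97, 105, 53]


def level_means(design, response):
    """Average response at the low (-1) and high (+1) setting of each variable."""
    means = {}
    for j, name in enumerate(FACTORS):
        low = [y for row, y in zip(design, response) if row[j] == -1]
        high = [y for row, y in zip(design, response) if row[j] == 1]
        means[name] = (sum(low) / len(low), sum(high) / len(high))
    return means


means = level_means(DESIGN, SIGNAL)
# Rank the variables by the size of the difference between the two averages.
ranking = sorted(FACTORS, key=lambda n: -abs(means[n][1] - means[n][0]))
for name in ranking:
    low, high = means[name]
    print(name, low, high, high - low)
```

The difference high minus low for each variable is its main effect, so the plot and the main-effects table carry the same information.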
Interactions
The interactions between variables can also be calculated. The column of coded coefficients for each interaction
is calculated by multiplying the columns of coefficients of the corresponding variables.
The interaction effect is then found by multiplying the response column by this column of coefficients,
summing the column and dividing by 8.
Coefficients for main effects (A to F), the response, and coefficients for interactions (A*B to E*F):
A B C D E F Signal A*B A*C A*D A*E A*F B*C B*D B*E B*F C*D C*E C*F D*E D*F E*F
-1 -1 -1 -1 -1 -1 95 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 -1 -1 -1 1 -1 41 -1 -1 -1 1 -1 1 1 -1 1 1 -1 1 -1 1 -1
-1 1 -1 -1 1 1 63 -1 1 1 -1 -1 -1 -1 1 1 1 -1 -1 -1 -1 1
1 1 -1 -1 -1 1 83 1 -1 -1 -1 1 -1 -1 -1 1 1 1 -1 1 -1 -1
-1 -1 1 -1 1 1 59 1 -1 1 -1 -1 -1 1 -1 -1 -1 1 1 -1 -1 1
1 -1 1 -1 -1 1 114 -1 1 -1 -1 1 -1 1 1 -1 -1 -1 1 1 -1 -1
-1 1 1 -1 -1 -1 121 -1 -1 1 1 1 1 -1 -1 -1 -1 -1 -1 1 1 1
1 1 1 -1 1 -1 59 1 1 -1 1 -1 1 -1 1 -1 -1 1 -1 -1 1 -1
-1 -1 -1 1 -1 1 107 1 1 -1 1 -1 1 -1 1 -1 -1 1 -1 -1 1 -1
1 -1 -1 1 1 1 38 -1 -1 1 1 1 1 -1 -1 -1 -1 -1 -1 1 1 1
-1 1 -1 1 1 -1 44 -1 1 -1 -1 1 -1 1 1 -1 -1 -1 1 1 -1 -1
1 1 -1 1 -1 -1 73 1 -1 1 -1 -1 -1 1 -1 -1 -1 1 1 -1 -1 1
-1 -1 1 1 1 -1 60 1 -1 -1 -1 1 -1 -1 -1 1 1 1 -1 1 -1 -1
1 -1 1 1 -1 -1 97 -1 1 1 -1 -1 -1 -1 1 1 1 -1 -1 -1 -1 1
-1 1 1 1 -1 1 105 -1 -1 -1 1 -1 1 1 -1 1 1 -1 1 -1 1 -1
1 1 1 1 1 1 53 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Response multiplied by the interaction coefficients, with the interaction effects in the last row:
A*B A*C A*D A*E A*F B*C B*D B*E B*F C*D C*E C*F D*E D*F E*F
95 95 95 95 95 95 95 95 95 95 95 95 95 95 95
-41 -41 -41 41 -41 41 41 -41 41 41 -41 41 -41 41 -41
-63 63 63 -63 -63 -63 -63 63 63 63 -63 -63 -63 -63 63
83 -83 -83 -83 83 -83 -83 -83 83 83 83 -83 83 -83 -83
59 -59 59 -59 -59 -59 59 -59 -59 -59 59 59 -59 -59 59
-114 114 -114 -114 114 -114 114 114 -114 -114 -114 114 114 -114 -114
-121 -121 121 121 121 121 -121 -121 -121 -121 -121 -121 121 121 121
59 59 -59 59 -59 59 -59 59 -59 -59 59 -59 -59 59 -59
107 107 -107 107 -107 107 -107 107 -107 -107 107 -107 -107 107 -107
-38 -38 38 38 38 38 -38 -38 -38 -38 -38 -38 38 38 38
-44 44 -44 -44 44 -44 44 44 -44 -44 -44 44 44 -44 -44
73 -73 73 -73 -73 -73 73 -73 -73 -73 73 73 -73 -73 73
60 -60 -60 -60 60 -60 -60 -60 60 60 60 -60 60 -60 -60
-97 97 97 -97 -97 -97 -97 97 97 97 -97 -97 -97 -97 97
-105 -105 -105 105 -105 105 105 -105 105 105 -105 105 -105 105 -105
53 53 53 53 53 53 53 53 53 53 53 53 53 53 53
-4.25 6.5 -1.75 3.25 0.5 3.25 -5.5 6.5 -2.25 -2.25 -4.25 -5.5 0.5 3.25 -1.75
The largest interaction effects are A*C and B*E (both 6.5), followed by B*D and C*F (both -5.5).
CAUTION!
It is no coincidence that A*C and B*E are the same value. In experimental
design language the two interaction effects are confounded. Confoundings
occur because of the reduced nature of the fractional factorial design.
A discussion of confoundings (aliases) can be found in
the Factorial Designs document.
Alias Table
A BCE DEF ABCDF
B ACE CDF ABDEF
C ABE BDF ACDEF
D AEF BCF ABCDE
E ABC ADF BCDEF
F ADE BCD ABCEF
AB CE ACDF BDEF
AC BE ABDF CDEF
AD EF ABCF BCDE
AE BC DF ABCDEF
AF DE ABCD BCEF
BD CF ABEF ACDE
BF CD ABDE ACEF
ABD ACF BEF CDE
ABF ACD BDE CEF
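The interaction effects and the aliases can both be reproduced with a short Python sketch. It is an illustration under stated assumptions: the function names interaction_effects and aliases are hypothetical, and the third defining word ADEF is taken as the product of ABCE and BCDF. Multiplying a term by a defining word cancels any letter that appears twice, which is a symmetric difference of the two letter sets.

```python
from itertools import combinations, product

FACTORS = "ABCDEF"
DESIGN = [(a, b, c, d, a * b * c, b * c * d) for d, c, b, a in product((-1, 1), repeat=4)]
SIGNAL = [95, 41, 63, 83, 59, 114, 121, 59, 107, 38, 44, 73, 60, 97, 105, 53]

# Defining words of the design; ADEF is the product ABCE * BCDF.
DEFINING_WORDS = ("ABCE", "BCDF", "ADEF")


def interaction_effects(design, response):
    """Two-way interaction effect = sum(x_i * x_j * y) / (number of runs / 2)."""
    half = len(design) / 2
    return {
        FACTORS[i] + "*" + FACTORS[j]:
            sum(row[i] * row[j] * y for row, y in zip(design, response)) / half
        for i, j in combinations(range(len(FACTORS)), 2)
    }


def aliases(term):
    """Terms confounded with `term`: letters appearing twice cancel out."""
    return ["".join(sorted(set(term) ^ set(word))) for word in DEFINING_WORDS]


effects = interaction_effects(DESIGN, SIGNAL)
print(effects["A*C"], effects["B*E"])  # a confounded pair gives identical values
print(aliases("AC"))
```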
What don't these experiments tell us?
(1) What are the best settings for each variable? To determine this we would need to carry out a full
optimization design such as the Central Composite Design. Optimization designs need at least three settings
for each variable - this is why we carry out screening first.
(2) Is there curvature in the design? Consider variable D (lamp current) - although the main effect is small perhaps we are
missing something - perhaps the response is significantly higher (or lower) in the range between 4 - 8?
This would mean there is curvature in the design.
(3) The above analysis tells us the relative effect of each variable but it does not tell us absolutely
whether the variable has a significant effect.
(2) and (3) can be tested for by modifying the design to include centre points. These are
experiments with variables set at their mid-points, and given codes of 0.
The mid-point values are: 9 mm (A), lean/rich (B), 2.5% (C), 6 mA (D), 333.1 (E) and 0.45 nm (F).
The extended design, in coded form, is shown below, with the responses.
Response
A B C D E F Signal
-1 -1 -1 -1 -1 -1 95
1 -1 -1 -1 1 -1 41
-1 1 -1 -1 1 1 63
1 1 -1 -1 -1 1 83
-1 -1 1 -1 1 1 59
1 -1 1 -1 -1 1 114
-1 1 1 -1 -1 -1 121
1 1 1 -1 1 -1 59
-1 -1 -1 1 -1 1 107
1 -1 -1 1 1 1 38
-1 1 -1 1 1 -1 44
1 1 -1 1 -1 -1 73
-1 -1 1 1 1 -1 60
1 -1 1 1 -1 -1 97
-1 1 1 1 -1 1 105
1 1 1 1 1 1 53
0 0 0 0 0 0 79
0 0 0 0 0 0 74
0 0 0 0 0 0 77
We will illustrate an alternative analysis on the next sheet. The data will be fitted to a polynomial, with linear and
interaction terms, as follows:
constant y = b0
first order terms + b1*A + b2*B + b3*C + b4*D + b5*E + b6*F
two way interactions + b12*A*B + b13*A*C + b14*A*D + b15*A*E + b16*A*F + b24*B*D + b26*B*F
three way interactions + b134*A*B*D + b126*A*B*F
Note: not all possible terms can be included due to confounding (see alias table on previous sheet).
The coefficients are then determined using least squares regression. This can be carried out in Excel
using the array function LINEST (consult the Excel help files for use of this function and using array functions).
However, before performing regression, columns corresponding to the interaction coefficients need to be constructed
as previously shown.
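For readers without Excel, the same least-squares fit can be sketched in plain Python by solving the normal equations (X'X)b = X'y. This is a swapped-in illustration, not a reproduction of how LINEST works internally; the helper names model_row, solve and fit are hypothetical, and the coefficients are keyed by the term they multiply ("1" for the constant).

```python
from itertools import product
from math import prod

# Model terms: constant, six main effects, seven two-way and two three-way interactions.
TERMS = ["", "A", "B", "C", "D", "E", "F",
         "AB", "AC", "AD", "AE", "AF", "BD", "BF", "ABD", "ABF"]

# 16 factorial runs (E = ABC, F = BCD) followed by 3 centre points coded 0.
DESIGN = [(a, b, c, d, a * b * c, b * c * d) for d, c, b, a in product((-1, 1), repeat=4)]
DESIGN += [(0,) * 6] * 3
SIGNAL = [95, 41, 63, 83, 59, 114, 121, 59, 107, 38, 44, 73, 60, 97, 105, 53, 79, 74, 77]


def model_row(levels):
    """One row of the model matrix; the empty term gives the constant column of 1s."""
    coded = dict(zip("ABCDEF", levels))
    return [prod(coded[ch] for ch in term) for term in TERMS]


def solve(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit(rows, y):
    """Least-squares coefficients from the normal equations."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


rows = [model_row(levels) for levels in DESIGN]
coef = dict(zip(["1"] + TERMS[1:], fit(rows, SIGNAL)))
for term, value in coef.items():
    print(term, round(value, 5))
```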
Multivariate Regression
A B C D E F A*B A*C A*D A*E A*F B*D B*F A*B*D A*B*F Signal
-1 -1 -1 -1 -1 -1 1 1 1 1 1 1 1 -1 -1 95
1 -1 -1 -1 1 -1 -1 -1 -1 1 -1 1 1 1 1 41
-1 1 -1 -1 1 1 -1 1 1 -1 -1 -1 1 1 -1 63
1 1 -1 -1 -1 1 1 -1 -1 -1 1 -1 1 -1 1 83
-1 -1 1 -1 1 1 1 -1 1 -1 -1 1 -1 -1 1 59
1 -1 1 -1 -1 1 -1 1 -1 -1 1 1 -1 1 -1 114
-1 1 1 -1 -1 -1 -1 -1 1 1 1 -1 -1 1 1 121
1 1 1 -1 1 -1 1 1 -1 1 -1 -1 -1 -1 -1 59
-1 -1 -1 1 -1 1 1 1 -1 1 -1 -1 -1 1 1 107
1 -1 -1 1 1 1 -1 -1 1 1 1 -1 -1 -1 -1 38
-1 1 -1 1 1 -1 -1 1 -1 -1 1 1 -1 -1 1 44
1 1 -1 1 -1 -1 1 -1 1 -1 -1 1 -1 1 -1 73
-1 -1 1 1 1 -1 1 -1 -1 -1 1 -1 1 1 -1 60
1 -1 1 1 -1 -1 -1 1 1 -1 -1 -1 1 -1 1 97
-1 1 1 1 -1 1 -1 -1 -1 1 -1 1 1 -1 -1 105
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 53
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 79
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 74
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 77
LINEST output:
b126 b134 b26 b24 b16 b15 b14 b13 b12 b6 b5 b4 b3 b2 b1 b0
-0.125 3.25 -1.125 -2.75 0.25 1.625 -0.875 3.25 -2.125 2 -23.625 -3.625 7.75 -0.625 -6 75.89474
0.55508 0.55508 0.55508 0.55508 0.55508 0.55508 0.55508 0.55508 0.55508 0.55508 0.55508 0.55508 0.55508 0.55508 0.55508 0.509377
0.998699 2.220321
153.5552 3
11355 14.78947
The first line above contains the parameters. The second line is the standard errors, the third line is R^2 and standard error of y,
the fourth line is the F statistic and degrees of freedom and the fifth line is the regression and residual sum of squares.
critical t value 4.3
df = 2
(since the error is based on the 3 replicates of the centre point)
The confidence interval for each coefficient is b +/- t*se, where se is the standard error in the second line of the output.
For each of the coefficients b126 to b1, t*se = 2.386845.
[Figure: 'Regression coefficients' chart of b126 to b1 with their confidence intervals; vertical axis from -30 to 15]
Viewing this graph now gives us an answer to which effects are significant.
An effect is significant (at the 95% confidence level) if its regression coefficient is significantly non-zero,
i.e. its confidence interval does not include zero. This applies to b5, b4, b1, b3, b13, b24, b134.
This means the variables E (wavelength), A (flame height), C (% acetic acid) and D (lamp current)
are significant. There are also significant interactions but due to confounding we cannot definitely say
which are significant. The only way to remove the confounding is to do more experiments. However, since we can
screen out variables B and F we could do a full factorial on 4 variables = 16 experiments and determine all
interactions.
Note that it is still possible B and F may have significant interactions even though their main effects are not significant.
Note: You may see a connection between the regression coefficients and the main effects. The coefficients are
half the size of the main effects so both give the same information.
Curvature? Compare the average of the response for the factorial points (first 16) and the centre points (last 3).
average response for factorial points 75.75
average response for centre points 76.66667
Since these averages are very similar there is little curvature in the model.
Note that if significant curvature is indicated these experiments cannot tell which variable causes the
curvature; a design such as a central composite design is needed to determine this.
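The significance test and the curvature check described above can be sketched in Python as follows. The coefficient values, the standard error and the critical t value are copied from the LINEST output and the text; the variable names are hypothetical.

```python
# Coefficients and standard error copied from the LINEST output (constant b0 excluded).
COEFFS = {
    "b1": -6.0, "b2": -0.625, "b3": 7.75, "b4": -3.625, "b5": -23.625, "b6": 2.0,
    "b12": -2.125, "b13": 3.25, "b14": -0.875, "b15": 1.625, "b16": 0.25,
    "b24": -2.75, "b26": -1.125, "b134": 3.25, "b126": -0.125,
}
SE = 0.55508   # standard error of each coefficient
T_CRIT = 4.3   # critical t value quoted in the text (95%, df = 2)

# Half-width of the confidence interval b +/- t*se.
half_width = T_CRIT * SE

# A coefficient is significant when its interval does not include zero.
significant = [name for name, b in COEFFS.items() if abs(b) > half_width]
print(round(half_width, 6), significant)

# Curvature check: compare the mean of the factorial points with the centre points.
FACTORIAL = [95, 41, 63, 83, 59, 114, 121, 59, 107, 38, 44, 73, 60, 97, 105, 53]
CENTRE = [79, 74, 77]
mean_factorial = sum(FACTORIAL) / len(FACTORIAL)
mean_centre = sum(CENTRE) / len(CENTRE)
print(mean_factorial, round(mean_centre, 5))
```

The comparison of the two means here is only by eye, as in the text; it is not a formal test of curvature.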