a brief introduction of design of experiments and robust

Part 2: Analysis

Prepared by: Paul Funkenbusch, Department of Mechanical Engineering, University of Rochester

Review

ANOM

ANOVA

◦ Error estimate: replication vs. pooling

◦ ANOVA table

◦ Judging statistical significance

DOE mini-course, part 2, Paul Funkenbusch, 2015

Terms, with an example (measure the volume of a balloon as a function of temperature and pressure)

Factors: variables whose influence you want to study.
◦ Example: Temperature, Pressure

Levels: specific values given to a factor during experiments (initially limit ourselves to 2 levels).
◦ Example: 50C, 100C; 1Pa, 2Pa

Treatment condition: one running of the experiment.
◦ Example: set T = 50C, P = 1Pa and measure volume

Response: result measured for a treatment condition.
◦ Example: measured volume


Factor                 Level -1   Level +1
X1. Temperature (C)    50         100
X2. Pressure (Pa)      1          2

TC   X1   X2   y
1    -1   -1   y1
2    +1   -1   y2
3    -1   +1   y3
4    +1   +1   y4
etc.

Use X1, X2, etc. to designate factors.

Use -1, +1 to designate levels.
◦ X1 at level -1 means T = 50C

Use a table to show factor levels and response (y) for each treatment condition.
◦ For example, during TC2, set T = 100C and P = 1Pa, measure the balloon volume = y2

y: response
◦ y = volume of balloon

Test all combinations

Responses: 4 DOF
◦ 4 measured (y) values

Effects: 4 DOF
◦ 4 values calculated (TC numbered as in the table below)
◦ m* = (y1+y2+y3+y4)/4
◦ DX1 = (y3+y4)/2 - (y1+y2)/2
◦ DX2 = (y2+y4)/2 - (y1+y3)/2
◦ D12 = (y1+y4)/2 - (y2+y3)/2

Model: 4 DOF
◦ 4 constants in model
◦ ypred = a0 + a1X1 + a2X2 + a12X1X2

Factor               Level -1   Level +1
X1. Temp (C)         50         100
X2. Pressure (Pa)    1          2

TC   X1   X2   X1*X2   y
1    -1   -1   +1      y1
2    -1   +1   -1      y2
3    +1   -1   -1      y3
4    +1   +1   +1      y4
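As a minimal sketch, the four effect formulas above can be written directly as a small Python function. The function name `effects_2x2` and the balloon volumes passed to it are hypothetical, chosen only for illustration; the run order is assumed to match the table with the X1*X2 column.

```python
def effects_2x2(y1, y2, y3, y4):
    """Effects for a full factorial with two 2-level factors.

    Assumed run order (same as the table with the X1*X2 column):
    TC1 = (-1, -1), TC2 = (-1, +1), TC3 = (+1, -1), TC4 = (+1, +1).
    """
    m_star = (y1 + y2 + y3 + y4) / 4          # overall average
    d_x1 = (y3 + y4) / 2 - (y1 + y2) / 2      # X1 at +1 minus X1 at -1
    d_x2 = (y2 + y4) / 2 - (y1 + y3) / 2      # X2 at +1 minus X2 at -1
    d_12 = (y1 + y4) / 2 - (y2 + y3) / 2      # X1*X2 at +1 minus X1*X2 at -1
    return m_star, d_x1, d_x2, d_12


# Hypothetical balloon volumes (arbitrary units), for illustration only.
m_star, d_x1, d_x2, d_12 = effects_2x2(10.0, 6.0, 12.0, 7.0)
print(m_star, d_x1, d_x2, d_12)
```

Four responses go in and four values come out, which is the 4 DOF bookkeeping described above.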

Most basic level of analysis

Which effects are largest (which factors/interactions most important)?

Which levels produce the best (highest or lowest) responses?

Just based on the D values
◦ you've already done this

$$D = m_{+1} - m_{-1} = (\text{average response at level } +1) - (\text{average response at level } -1)$$

Sign and magnitude of D, graphically

Positive D
◦ +1 level should increase the response
◦ -1 level should decrease the response

Negative D: reversed

Magnitude of D
◦ Indicates relative importance

[Figure: average response at levels A-1, A+1, B-1, B+1, plotted relative to m*]

In the plotted example: choose A-1 and B+1 to increase the response; A is more important.

Can treat interaction terms the same way

Larger D: more important interaction

If interactions are large, the "best settings" to increase (or decrease) the response will depend on the combination of factor levels

Use the model to test different combinations
◦ ypred = a0 + a1X1 + a2X2 + a12X1X2



From part I: removal rate of osteotomy drills vs. applied load and number of cuts.

Factor                   Level -1   Level +1
X1. applied load (kg)    2          3
X2. previous cuts        0 (new)    20

TC   X1   X2   X1*X2   Removal rate (mm^3/s)
1    -1   -1   +1      3
2    -1   +1   -1      2
3    +1   -1   -1      5
4    +1   +1   +1      2

Which effects are most important?

How can you increase the removal rate?


Effects
◦ DX1 = +1
◦ DX2 = -2
◦ D12 = -1

Number of previous cuts is most important
◦ new drill (level -1) will increase removal rate the most

Applied load and the interaction are comparable
◦ higher load (level +1) will increase removal rate
◦ but need to test combinations (since interaction is important too)


ypred = 3.0 + (0.5)X1 - (1.0)X2 - (0.5)X1X2

Set X2 = -1 (its effect is much larger than those of X1 and the interaction)

What is the best level for X1?
◦ For X1 = -1, X2 = -1: ypred = 3.0
◦ For X1 = +1, X2 = -1: ypred = 5.0

Best settings: X1 = +1 (3 kg load), X2 = -1 (new drill)

Note: this is a synergistic interaction.
◦ Best level for interaction (-1) corresponds to best factor levels [X1X2 = (+1)(-1) = -1]
◦ Interaction enhances effects of "best" factor level choices

For an anti-synergistic interaction:
◦ Conflict between best factor settings and best interaction level
◦ Best overall settings then depend on relative strength of the interaction vs. factor
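When the interaction matters, one simple way to "test combinations" is to evaluate the model at every combination of coded levels. The sketch below does this for the drill model; the names `y_pred`, `predictions`, and `best_settings` are illustrative choices, not part of the course material.

```python
from itertools import product


def y_pred(x1, x2):
    """Drill model from this example; x1 and x2 are coded levels (-1 or +1)."""
    return 3.0 + 0.5 * x1 - 1.0 * x2 - 0.5 * x1 * x2


# Predicted removal rate for all four combinations of coded levels.
predictions = {(x1, x2): y_pred(x1, x2) for x1, x2 in product((-1, +1), repeat=2)}

# Settings that maximize the predicted removal rate.
best_settings = max(predictions, key=predictions.get)

for settings, value in predictions.items():
    print(settings, value)
print("best:", best_settings)
```

Because the model has four constants and there are four treatment conditions, the predictions reproduce the four measured removal rates exactly.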

Second level of analysis

Which of the observed effects are statistically significant?
◦ Based on comparing observed effects against an estimate of error.
◦ Compares D^2 for factors and interactions with D^2 for error.
◦ Actually compares "mean square" or "MS", which is proportional to D^2.

How much does each factor/interaction contribute to the total variance in the system?


Replication
◦ Repeat (replicate) each of the treatment conditions
◦ Independent experimental runs (not multiple measurements from the same TC)
◦ Differences in the responses measured for identical TCs run at different times provide the error estimate
◦ "Pure error": not dependent on modeling assumptions
◦ Best way to estimate error, but greatly increases effort

Pooling of higher-order interactions
◦ Assume that higher-order interactions are unimportant/zero
◦ Must choose these interactions upfront (before examining results); these form a "pool" for error
◦ Effects measured for pooled interactions are used to estimate error
◦ "Error" includes modeling error (i.e. assumptions about interactions)
◦ Requires less experimental effort, but error estimate is not as good

Original design: two factors at 2 levels
◦ 4 DOF: m*, DX1, DX2, D12

TC   X1   X2   y
1    -1   -1   y1
2    -1   +1   y2
3    +1   -1   y3
4    +1   +1   y4

Replicated design (2X): 8 DOF total
◦ 4 DOF: m*, DX1, DX2, D12
◦ + 4 DOF for error
◦ Contrast responses measured under nominally identical TC

TC   X1   X2   y
1a   -1   -1   y1a
2a   -1   +1   y2a
3a   +1   -1   y3a
4a   +1   +1   y4a
1b   -1   -1   y1b
2b   -1   +1   y2b
3b   +1   -1   y3b
4b   +1   +1   y4b


Test more factors. Increase size and DOF.

Example: three 2-level factors instead of two.
◦ 8 DOF total
◦ 1 DOF for m*
◦ 3 DOF for factors: DX1, DX2, DX3
◦ 3 DOF for 2-factor interactions: D12, D13, D23
◦ 1 DOF for 3-factor interaction: D123

Pool all interactions
◦ decide before examining results
◦ assess m*, DX1, DX2, DX3 (4 DOF)
◦ use D12, D13, D23, D123 for error (4 DOF)

TC   X1   X2   X3   y
1    -1   -1   -1   y1
2    -1   -1   +1   y2
3    -1   +1   -1   y3
4    -1   +1   +1   y4
5    +1   -1   -1   y5
6    +1   -1   +1   y6
7    +1   +1   -1   y7
8    +1   +1   +1   y8

Note: an alternative choice (pool only the highest-order, 3-factor interaction) is possible, but it leaves only 1 DOF for the error estimate, which is not desirable. For larger experiments this is not as much of a constraint (more higher-order interactions can be pooled).
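The pooling procedure for the three-factor example can be sketched in a few lines of Python. The helper names (`column`, `effect`) and the eight response values are hypothetical and serve only to illustrate the bookkeeping; the sum of squares formula SS = D^2 × (# of TC)/4 is the one used later in this part for 2-level designs.

```python
from itertools import product
from math import prod

# Eight treatment conditions in the same order as the table above.
runs = list(product((-1, +1), repeat=3))


def column(term):
    """Sign column for a factor or interaction, e.g. (0,) is X1, (0, 2) is X1*X3."""
    return [prod(run[i] for i in term) for run in runs]


def effect(term, y):
    """D = average response at +1 minus average response at -1 for this column."""
    col = column(term)
    plus = [yi for s, yi in zip(col, y) if s > 0]
    minus = [yi for s, yi in zip(col, y) if s < 0]
    return sum(plus) / len(plus) - sum(minus) / len(minus)


# Hypothetical responses y1..y8, for illustration only.
y = [10, 12, 14, 17, 20, 21, 25, 27]
n = len(y)

# Pool chosen before looking at the data: all 2- and 3-factor interactions.
pool = [(0, 1), (0, 2), (1, 2), (0, 1, 2)]
ss_error = sum(effect(term, y) ** 2 * n / 4 for term in pool)
ms_error = ss_error / len(pool)  # 4 DOF for error

# F ratio for each factor against the pooled error estimate.
f_ratio = {
    name: (effect((i,), y) ** 2 * n / 4) / ms_error
    for i, name in enumerate(("X1", "X2", "X3"))
}
print(ss_error, ms_error, f_ratio)
```

The error estimate here includes any real interaction effects, which is the modeling assumption that pooling trades for reduced experimental effort.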

Four 2-level factors: 2^4 = 16 TC = 16 DOF
◦ m*: 1 DOF
◦ factors: 4 DOF
◦ 2-factor inter.: 6 DOF
◦ 3-factor inter.: 4 DOF
◦ 4-factor inter.: 1 DOF

Pool 3- and 4-factor int.
◦ Assess factors and 2-factor interactions
◦ 5 DOF for error estimate

Five 2-level factors: 2^5 = 32 TC = 32 DOF
◦ m*: 1 DOF
◦ factors: 5 DOF
◦ 2-factor inter.: 10 DOF
◦ 3-factor inter.: 10 DOF
◦ 4-factor inter.: 5 DOF
◦ 5-factor inter.: 1 DOF

Pool 4- and 5-factor int.
◦ Assess factors and 2-factor and 3-factor interactions
◦ 6 DOF for error estimate
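The DOF counts above follow from counting combinations: with k two-level factors there are C(k, j) interactions of order j, each carrying 1 DOF. A small sketch (function names are illustrative):

```python
from math import comb


def dof_by_order(k):
    """DOF per term order for a full factorial with k two-level factors.

    Order 0 is the overall average m*, order 1 the factors,
    order 2 the 2-factor interactions, and so on.
    """
    return {order: comb(k, order) for order in range(k + 1)}


def pooled_error_dof(k, pooled_orders):
    """DOF available for error when all interactions of the given orders are pooled."""
    return sum(comb(k, order) for order in pooled_orders)


print(dof_by_order(4), pooled_error_dof(4, (3, 4)))
print(dof_by_order(5), pooled_error_dof(5, (4, 5)))
```

The counts for each k sum to 2^k, the number of treatment conditions.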

Replication
◦ "Pure error" (no modeling assumptions needed)
◦ Best for: small numbers of factors; systems with large uncertainties

Pooling
◦ Assess more factors for same effort (or same number of factors for less effort)
◦ Best for: large numbers of factors; systems with strong time/cost constraints on experimental size

Source   SS    DOF   MS    F    p       % SS
A        100   1     100   20   0.011   56
B        50    1     50    10   0.034   28
AxB      10    1     10    2    0.230   6
error    20    4     5     --   --      11
Total    180   7     --    --   --      100

Typical way of presenting ANOVA results.
◦ Will explain each column so you can interpret these results.
◦ Most software packages will output their analysis in some variant of this format.
◦ Will also show how you can do the calculations.



Source column

ANOVA works by determining how much of the variance in the experiment can be attributed to each source.

Sources are factors, interactions, and the error.
◦ Interactions "pooled" to get error are included in the error row and do not appear as a separate source (don't double count them).

The "Total" includes everything except terms which are attributed to the overall average (i.e. m*).



SS (sum of squares) column

Total: sum of squared deviations of the responses about the overall average (summation over all treatment conditions)

$$SS_{Total} = \sum_{i=1}^{n} (y_i - m^*)^2$$

Factors and interactions: measures the variance attributable to each effect (factor or interaction)
◦ For a 2-level factor or interaction in a factorial design: SS = D^2 × (# of TC)/4, where # of TC counts every run, including replicates

Error: can obtain the error term by subtraction.


DOF (degrees of freedom) column

Total: one less than the total number of treatment conditions (i.e. one less than the # of responses).
◦ Because 1 DOF is used in calculating m*

For 2-level factors or interactions: 1 DOF each

Error: can obtain the error term by subtraction.



MS (mean square) column

SS "normalized" against the DOF: MS = SS/DOF

For 2-level factors or interactions: 1 DOF, so MS = SS

For error: averages the different measurements of the error variance.



F column

Compares size of effect with error: F = MS (effect) / MS (error)

The larger the F, the more likely an effect is "real"

But judging statistical significance also depends on the DOF for the effect, the DOF for the error, and the chosen significance level.

Having at least 2 (and preferably 3 or 4) DOF for error greatly improves chances of identifying statistically significant effects.


Standard tables are available in most statistics textbooks to determine the critical value of F based on the DOF for error, the DOF for effect, and the chosen significance level (α).

α estimates the chance that an F value larger than the critical value could occur randomly (i.e. even if the effect was not real).

If F > Fcritical, the factor or interaction is judged statistically significant.

Critical F for α = 0.05 (portion of an α = 0.05 table)

               DOF (effect)
DOF (error)    1         2         3
1              161.45    199.50    215.71
2              18.51     19.00     19.16
3              10.13     9.55      9.28
4              7.71      6.94      6.59

(Note the very high values when there is only 1 DOF for error.)

Use α = 0.05: Fcritical = 7.71 (1 DOF for effect, 4 DOF for error)

F (X1) = 20 > Fcritical: X1 significant

F (X2) = 10 > Fcritical: X2 significant

F (X1X2) = 2 < Fcritical: X1X2 not significant

Source   SS    DOF   MS    F
X1       100   1     100   20
X2       50    1     50    10
X1X2     10    1     10    2
error    20    4     5     --
Total    180   7     --    --
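The comparison against the critical value is a table lookup followed by an inequality. The sketch below stores only the portion of the α = 0.05 table reproduced in this part; the dictionary and function names are illustrative, and any DOF combination outside that portion would need a fuller table.

```python
# Critical F values for alpha = 0.05, copied from the table portion in this part.
# Keys are (DOF effect, DOF error).
F_CRITICAL_05 = {
    (1, 1): 161.45, (2, 1): 199.50, (3, 1): 215.71,
    (1, 2): 18.51, (2, 2): 19.00, (3, 2): 19.16,
    (1, 3): 10.13, (2, 3): 9.55, (3, 3): 9.28,
    (1, 4): 7.71, (2, 4): 6.94, (3, 4): 6.59,
}


def is_significant(f_value, dof_effect, dof_error):
    """True if F exceeds the tabulated critical value at alpha = 0.05."""
    return f_value > F_CRITICAL_05[(dof_effect, dof_error)]


# F values from the example ANOVA table (1 DOF per effect, 4 DOF for error).
f_values = {"X1": 20, "X2": 10, "X1X2": 2}
verdicts = {name: is_significant(f, 1, 4) for name, f in f_values.items()}
print(verdicts)
```

With only 1 DOF for error the critical value would be 161.45, so none of these effects would be judged significant, which illustrates why several DOF for error are preferred.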


p column

Difficult/cumbersome to determine p values "manually", but they are commonly provided by statistical software packages.

Estimates how likely an F value as big as that observed is to have occurred randomly (i.e. if the effect was not real).

p values below a chosen α value (typically α = 0.05) indicate statistical significance.

p depends on the DOF (effect) and DOF (error) in addition to F.


Choose a significance level, α = 0.05

p (X1) = 0.011 < 0.05: X1 significant

p (X2) = 0.034 < 0.05: X2 significant

p (X1X2) = 0.23 > 0.05: X1X2 not significant
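For the specific case in this example (1 DOF for the effect, 4 DOF for the error) the p values can be reproduced without statistical software. The sketch below is a special-case shortcut, not a general method: it relies on the fact that an F statistic with 1 DOF in the numerator is the square of a Student's t statistic, and on the closed-form t distribution for 4 DOF. The function name is hypothetical; for other DOF combinations a statistics library (for example SciPy's `scipy.stats.f.sf`, if available) would be the usual route.

```python
from math import sqrt


def p_value_f_1_4(f_value):
    """p value for an F statistic with 1 DOF (effect) and 4 DOF (error).

    Assumption: valid only for this DOF pair. Uses F(1, 4) = t(4)^2 and the
    closed-form cumulative distribution of Student's t with 4 DOF.
    """
    u = f_value / (4.0 + f_value)                  # t^2 / (4 + t^2), between 0 and 1
    cdf = 0.5 + 0.75 * sqrt(u) * (1.0 - u / 3.0)   # P(T <= t) for t = sqrt(F)
    return 2.0 * (1.0 - cdf)                       # two-sided tail = P(F' > F)


# F values from the example ANOVA table.
p_values = {name: p_value_f_1_4(f) for name, f in {"X1": 20, "X2": 10, "X1X2": 2}.items()}
print({name: round(p, 3) for name, p in p_values.items()})
```

Rounded to three decimals these agree with the p column of the example table.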

% SS column

Source   SS    DOF   MS    F    p       % SS
X1       100   1     100   20   0.011   56
X2       50    1     50    10   0.034   28
X1X2     10    1     10    2    0.230   6
error    20    4     5     --   --      11
Total    180   7     --    --   --      100

Sometimes used to measure the "importance" of each factor/interaction

How much of the total variance in the experiment can be attributed to each of the factors/interactions?
◦ 84% of the total variance can be attributed to the two factor effects.
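Given only the SS and DOF columns, the MS, F, and % SS columns follow by arithmetic. A minimal sketch using the example table (variable names are illustrative):

```python
# (SS, DOF) for each source in the example ANOVA table.
rows = {"X1": (100, 1), "X2": (50, 1), "X1X2": (10, 1), "error": (20, 4)}

ss_total = sum(ss for ss, _ in rows.values())
dof_total = sum(dof for _, dof in rows.values())
ms_error = rows["error"][0] / rows["error"][1]

table = {}
for source, (ss, dof) in rows.items():
    ms = ss / dof                                       # mean square
    f = None if source == "error" else ms / ms_error    # F is not defined for the error row
    pct_ss = round(100 * ss / ss_total)                 # % SS, rounded as in the table
    table[source] = (ms, f, pct_ss)

for source, (ms, f, pct_ss) in table.items():
    print(source, ms, f, pct_ss)
print("Total", ss_total, dof_total)
```

The rounded percentages add to 101 rather than 100, which is a rounding artifact of the individual rows.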

Data on the removal rate of osteotomy drills is collected as a function of the load applied and the number of previous cuts made.

Assume the data was collected as part of an experiment with one replication (runs a and b).

TC   X1   X2   X1*X2   Removal rate (mm^3/s)
1a   -1   -1   +1      2.8
2a   -1   +1   -1      2.1
3a   +1   -1   -1      4.7
4a   +1   +1   +1      2.2
1b   -1   -1   +1      3.2
2b   -1   +1   -1      1.9
3b   +1   -1   -1      5.3
4b   +1   +1   +1      1.8


Effects: m* = +3, DX1 = +1, DX2 = -2, D12 = -1

X1
◦ SS = D^2 × (# of TC)/4 = (1)^2 × 8/4 = 2; DOF = 1

X2
◦ SS = D^2 × (# of TC)/4 = (-2)^2 × 8/4 = 8; DOF = 1

X1X2
◦ SS = D^2 × (# of TC)/4 = (-1)^2 × 8/4 = 2; DOF = 1

Total

$$SS_{Total} = \sum_{i=1}^{8} (y_i - m^*)^2 = (2.8 - 3)^2 + (2.1 - 3)^2 + \dots + (1.8 - 3)^2 = 12.36$$

◦ DOF = (# of TC) - 1 = 8 - 1 = 7

Error (by subtraction)
◦ SS = 12.36 - (2 + 8 + 2) = 0.36
◦ DOF = 7 - (1 + 1 + 1) = 4
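The same calculations can be run from the raw replicated data. The sketch below computes the effects, the SS terms, and the error by subtraction, then adds a cross-check that is not part of the slides: the "pure error" SS computed directly from the scatter of each replicate pair about its own mean. All names are illustrative.

```python
from collections import defaultdict

# (X1, X2, removal rate) for runs 1a-4a, then 1b-4b.
runs = [
    (-1, -1, 2.8), (-1, +1, 2.1), (+1, -1, 4.7), (+1, +1, 2.2),
    (-1, -1, 3.2), (-1, +1, 1.9), (+1, -1, 5.3), (+1, +1, 1.8),
]
n = len(runs)
m_star = sum(y for _, _, y in runs) / n


def effect(sign):
    """Average response where sign(x1, x2) is +1 minus average where it is -1."""
    plus = [y for x1, x2, y in runs if sign(x1, x2) > 0]
    minus = [y for x1, x2, y in runs if sign(x1, x2) < 0]
    return sum(plus) / len(plus) - sum(minus) / len(minus)


effects = {
    "X1": effect(lambda x1, x2: x1),
    "X2": effect(lambda x1, x2: x2),
    "X1X2": effect(lambda x1, x2: x1 * x2),
}
ss = {name: d ** 2 * n / 4 for name, d in effects.items()}
ss_total = sum((y - m_star) ** 2 for _, _, y in runs)
ss_error = ss_total - sum(ss.values())  # error by subtraction

# Cross-check (an addition, not from the slides): pure error from replicates.
groups = defaultdict(list)
for x1, x2, y in runs:
    groups[(x1, x2)].append(y)
ss_pure = sum(
    sum((y - sum(ys) / len(ys)) ** 2 for y in ys) for ys in groups.values()
)

print(effects, ss, round(ss_total, 2), round(ss_error, 2), round(ss_pure, 2))
```

In a replicated full factorial where every factor and interaction is assessed, the error obtained by subtraction and the pure error from the replicates coincide.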


Source   SS      DOF   MS     F
X1       2       1     2      22.2
X2       8       1     8      88.9
X1X2     2       1     2      22.2
error    0.36    4     0.09   --
Total    12.36   7     --     --

F critical = 7.71 for α = 0.05

Judge X1 (load), X2 (previous # of cuts), and X1X2 (interaction between load and # of cuts) significant.

ANalysis Of Means (ANOM)
◦ Which effects are largest (which factors/interactions most important)?
◦ Which levels produce the best (highest or lowest) responses?

ANalysis Of VAriance (ANOVA)
◦ Which of the observed effects are statistically significant?

Error estimate
◦ Replication: pure error, multiplies required effort
◦ Pooling: requires upfront assumptions (sparsity of effects)


This material is based on work supported by the National Science Foundation under grant CMMI-1100632.

The assistance of Prof. Amy Lerner and Mr. Alex Kotelsky in preparation of this material is gratefully acknowledged.

This material was originally presented as a module in the course BME 283/483, Biosolid Mechanics.
