One-Way Analysis of Variance by Abhishek Vijayvargiya



This paper was published in the Journal of Validation Technology, Vol. 15, No. 1, 2009. To contact the author, go to http://www.linkedin.com/in/abhishekvijayvargiya or e-mail him at [email protected].


ABOUT THE AUTHOR
Abhishek Vijayvargiya is a graduate student in the Mechanical Engineering Department at the University of Miami. He can be reached by e-mail at [email protected]. For more author information, go to gxpandjvt.com/bios.

One-Way Analysis of Variance
Abhishek Vijayvargiya

INTRODUCTION
Analysis of variance (ANOVA) is an extremely important method in exploratory and confirmatory data analysis. Unfortunately, in some complex problems it is not always easy to set up an appropriate ANOVA. ANOVA is commonly classified as one-way ANOVA or two-way ANOVA (1). This paper shows how one-way ANOVA can be used to determine whether there is a statistically significant difference among the group means in a data set.

ANOVA: TERM REVEALED
In statistics, ANOVA is a collection of statistical models, and their associated procedures, in which the observed variance is partitioned into components attributable to different explanatory variables. The initial techniques of ANOVA were developed by the statistician and geneticist R. A. Fisher. Usually the explanatory variables in an ANOVA are categorical, not continuous. For one-way ANOVA we make use of an F Table (2) with the value of alpha as 0.05.

ANOVA'S ROLE
ANOVA for balanced data does the following three things at once:

• It compares mean squares by means of F-tests (1, 2).

• Its sums of squares indicate the variance of each component of the decomposition (1).

• It is closely related to a linear model fit with coefficient estimates and standard errors.

IS ANOVA OBSOLETE?
What is the analysis of variance? Econometricians see it as an uninteresting special case of linear regression. Instructors see it as one of the hardest topics in classical statistics to teach. However, this paper shows how the ideas of ANOVA are useful in many applications of statistics. For the purpose of this paper, we consider the following sample problem (3).

A SAMPLE PROBLEM
A medical device manufacturing company runs three injection-molding machines for the production of silicone valves. There is reason to believe that the three molding processes may not be producing similar products. A company expert in designed experiments and statistical techniques is called in to assess the operation. The expert is presented with the data in Table I, which represent the “Shore A” hardness, or durometer, of the molded silicone valves.

The problem can be easily solved using one-way ANOVA and, after getting the final result, one can compare it with the value obtained from an F Table (2) for an alpha value of 0.05.

FORMULAS USED
To solve a one-way ANOVA, first find the mean of the total sample by adding all the data values and dividing the total sum by the total number of values. Next, find the correction factor (CF), which can be obtained by using Formula 1.

[Formula 1]

$$CF = \frac{\left(\sum_{i=1}^{N} x_i\right)^2}{N}$$

where x_i are the individual data values and N is the total number of data values.

For the problem presented in this paper, after finding the correction factor we find the sum of squares (SOS) for the machines, or machine sum of squares (MSOS): square the sum of the readings for each of the three machines, divide each square by the number of readings per machine, add the three results, and subtract the correction factor from that total (1). The total sum of squares (TSOS) is found by using Formula 2. The error sum of squares is then found by subtracting MSOS from TSOS.

PEER-REVIEWED

JOURNAL OF VALIDATION TECHNOLOGY [WINTER 2009], ivthome.com


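[Formula 2]

$$TSOS = \sum_{i=1}^{N} x_i^2 - CF$$

where x_i are the individual data values, N is the total number of data values, and CF is the correction factor.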
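The sums of squares described above can be checked with a short Python sketch. It is an illustration only, not the author's procedure: the function name `sums_of_squares` and the dictionary layout are hypothetical, and the sketch assumes that the correction factor is the squared grand total divided by the number of values, which reproduces the correction factor of 24579.46 reported in this paper.

```python
# Shore A hardness readings from Table I, one list per machine.
data = {
    "A": [40.0, 40.5, 40.0, 40.6, 40.2],
    "B": [39.8, 39.8, 40.5, 40.0, 41.0],
    "C": [40.5, 41.0, 41.0, 41.1, 41.2],
}


def sums_of_squares(groups):
    """Return (CF, MSOS, TSOS, error SOS) for a dict of group readings."""
    values = [x for readings in groups.values() for x in readings]
    n_total = len(values)

    # Correction factor (assumed form): squared grand total over N.
    cf = sum(values) ** 2 / n_total

    # Machine SOS: each machine's squared sum divided by its number of
    # readings, summed over machines, minus the correction factor.
    msos = sum(sum(r) ** 2 / len(r) for r in groups.values()) - cf

    # Total SOS: sum of the squared individual readings minus CF.
    tsos = sum(x * x for x in values) - cf

    # Error SOS: what remains of the total after removing the machine part.
    return cf, msos, tsos, tsos - msos


cf, msos, tsos, esos = sums_of_squares(data)
print(f"CF={cf:.2f}  MSOS={msos:.3f}  TSOS={tsos:.3f}  Error SOS={esos:.3f}")
```

Rounded to the precision used in the paper, the printed values should agree with the correction factor and sums of squares listed under PROBLEM SOLVING.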

PROBLEM SOLVING
Table II presents the machine sums and means used in the ANOVA for the sample problem.

Machine degrees of freedom (DOF) = Number of machines - 1 = 3 - 1 = 2

Error DOF = Total number of data values - Number of machines = 15 - 3 = 12

Total DOF = Total number of data values - 1 = 15 - 1 = 14

Correction factor = 24579.46

Machine sum of squares = 1.732

Total sum of squares = 3.424

Error SOS = 3.424 - 1.732 = 1.692

Machine mean square = (Machine SOS)/(Machine DOF) = 1.732/2 = 0.866

Error mean square = (Error SOS)/(Error DOF) = 1.692/12 = 0.141

Calculated F value = (Machine mean square)/(Error mean square) = 0.866/0.141 = 6.1418
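The degrees of freedom, mean squares, and calculated F value listed above follow mechanically from the two sums of squares. A minimal Python sketch, assuming the machine and total sums of squares from this paper as inputs (the function name `f_statistic` is hypothetical):

```python
def f_statistic(machine_sos, total_sos, n_values, n_machines):
    """Return (machine DOF, error DOF, machine MS, error MS, F)."""
    machine_dof = n_machines - 1        # number of machines - 1
    error_dof = n_values - n_machines   # data values - number of machines
    error_sos = total_sos - machine_sos

    machine_ms = machine_sos / machine_dof
    error_ms = error_sos / error_dof
    return machine_dof, error_dof, machine_ms, error_ms, machine_ms / error_ms


# Sums of squares from the sample problem: 15 readings on 3 machines.
machine_dof, error_dof, machine_ms, error_ms, f_value = f_statistic(
    machine_sos=1.732, total_sos=3.424, n_values=15, n_machines=3
)
print(machine_dof, error_dof, round(machine_ms, 3), round(error_ms, 3),
      round(f_value, 4))
```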

Use an F Table with alpha as 0.05 and degrees of freedom of 2 (machine, numerator) and 12 (error, denominator). The tabled F value can also be obtained from Excel using the following function:

Tabled F value = FINV(0.05, Machine DOF, Error DOF).

If the tabled F value is more than the calculated value, we fail to reject the hypothesis of equality of the means. If the tabled value is less than the calculated value, we reject the hypothesis of equality of the means. Table III is the final ANOVA table.

Because the calculated F value is 6.1418 and the value obtained from the F Table with 2 and 12 degrees of freedom is 3.8853, which is less, we reject the hypothesis of equality of the means: at the 0.05 level, the three machines do not all produce valves with the same mean hardness.
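When neither an F Table nor Excel is at hand, the tabled value for this particular problem can be sketched in plain Python. This is not the method used in the paper: it relies on the fact that, when the numerator has exactly 2 degrees of freedom, the upper-tail probability of the F distribution has the closed form (1 + 2F/d)^(-d/2), where d is the error DOF. The function names are hypothetical, and the shortcut does not apply to problems with a different number of groups.

```python
def f_critical_numerator_dof_2(alpha, error_dof):
    """Critical F for numerator DOF = 2 only (closed-form special case)."""
    # Solve (1 + 2F/d)^(-d/2) = alpha for F.
    return (error_dof / 2) * (alpha ** (-2 / error_dof) - 1)


def p_value_numerator_dof_2(f_value, error_dof):
    """Upper-tail probability P(F > f_value) for numerator DOF = 2 only."""
    return (1 + 2 * f_value / error_dof) ** (-error_dof / 2)


alpha = 0.05
f_calculated = 6.1418  # calculated F value from the sample problem
error_dof = 12

f_tabled = f_critical_numerator_dof_2(alpha, error_dof)
p_value = p_value_numerator_dof_2(f_calculated, error_dof)

# Decision rule from the text: reject equality of the means when the
# tabled value is less than the calculated value.
reject = f_tabled < f_calculated
print(f"tabled F = {f_tabled:.4f}, p = {p_value:.4f}, reject = {reject}")
```

Comparing the p-value with alpha leads to the same decision as comparing the calculated F value with the tabled one.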

CONCLUSIONS
The problem presented in this paper shows how one-way ANOVA can be used to test whether the means of several groups differ. The following should be kept in mind while using ANOVA:

• If you obtain a negative value when calculating a quantity in ANOVA that should be positive, such as the SOS, mean square, or F, check your work.

• Be sure to read the recommended references in this paper before attempting an ANOVA to be sure your data meet the assumptions (normality, variance homogeneity, independence, balance) of ANOVA.

• If the calculated F value is greater than the tabled F value, reject the hypothesis of equality of the means.

REFERENCES
1. Hicks, Charles Robert and Turner, Kenneth V., Jr., Fundamental Concepts in the Design of Experiments, New York: Oxford, 58-64, 507-520, 1999.

2. Table of F-Statistics P=0.05, http://www.statsoft.com/textbook/sttable.html.

3. Gelman, Andrew, Analysis of Variance: Why It Is More Important Than Ever, Columbia University, 2005.

ARTICLE ACRONYM LISTING

ANOVA Analysis of Variance

DOF Degrees of Freedom

MSOS Machine Sum of Squares

SOS Sum of Squares

Table I: “Shore A” hardness, or durometer, of the molded silicone valves.

Machine A    Machine B    Machine C
40           39.8         40.5
40.5         39.8         41
40           40.5         41
40.6         40           41.1
40.2         41           41.2

Table II: ANOVA for sample problem.

        Machine A    Machine B    Machine C
        40           39.8         40.5
        40.5         39.8         41
        40           40.5         41
        40.6         40           41.1
        40.2         41           41.2
Sum     201.3        201.1        204.8
Mean    40.26        40.22        40.96

Table III: Final ANOVA.

Source     SOS      DOF    Mean Square    F Value
Machine    1.732    2      0.866          6.1418
Error      1.692    12     0.141
Total      3.424    14
