some basic formula in statistics

Upload: raju-rimal

Post on 03-Apr-2018


  • 7/28/2019 Some basic formula in Statistics


    Formula related to STAT310

    2012

    FORMULA AND OTHER PRACTICAL THINGS TO KNOW

    RAJU RIMAL

    NORWEGIAN UNIVERSITY OF LIFE SCIENCES | Ås, Norway


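    For testing the differences in means:

    Hypothesis:

    $$H_0: \mu_i = \mu_j \qquad H_1: \mu_i \neq \mu_j$$

    Test Statistic:

    $$t = \frac{\bar{y}_{i.} - \bar{y}_{j.}}{s_p\sqrt{\frac{1}{n_i} + \frac{1}{n_j}}}$$

    Where,

    $$s_p^2 = \frac{(n_i - 1)s_i^2 + (n_j - 1)s_j^2}{n_i + n_j - 2}$$

    The confidence interval for the difference between two treatment means is given as,

    $$\bar{y}_{i.} - \bar{y}_{j.} \pm t_{\alpha/2,\; n_i + n_j - 2}\; s_p\sqrt{\frac{1}{n_i} + \frac{1}{n_j}}$$

    Fundamental Decomposition:

    $$SS_T = SS_{Treatment} + SS_E$$

    Where,

    $$SS_T = \sum_{i=1}^{a}\sum_{j=1}^{n}(y_{ij} - \bar{y}_{..})^2$$

    $$SS_{Treatment} = n\sum_{i=1}^{a}(\bar{y}_{i.} - \bar{y}_{..})^2$$

    $$SS_E = \sum_{i=1}^{a}\sum_{j=1}^{n}(y_{ij} - \bar{y}_{i.})^2$$

    Testing a two-way ANOVA model

    The model is given as,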

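    $$y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$$

    Where,

    $i = 1, 2, \ldots, a$ for Factor A
    $j = 1, 2, \ldots, b$ for Factor B
    $k = 1, 2, \ldots, n$ for Replication

    The estimates for the unknown parameters $\mu$, $\alpha_i$, $\beta_j$, $(\alpha\beta)_{ij}$ and the residual are given as,

    Parameter | Estimate
    $\mu$ | $\bar{y}_{...}$
    $\alpha_i$ | $\bar{y}_{i..} - \bar{y}_{...}$
    $\beta_j$ | $\bar{y}_{.j.} - \bar{y}_{...}$
    $(\alpha\beta)_{ij}$ | $\bar{y}_{ij.} - \bar{y}_{i..} - \bar{y}_{.j.} + \bar{y}_{...}$
    $\varepsilon_{ijk}$ (residual) | $y_{ijk} - \bar{y}_{ij.}$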

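    ANOVA Table

    Source | Degrees of Freedom | Sum of Square
    Total | $abn - 1$ | $\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n}(y_{ijk} - \bar{y}_{...})^2$
    Factor A | $a - 1$ | $bn\sum_{i=1}^{a}(\bar{y}_{i..} - \bar{y}_{...})^2$
    Factor B | $b - 1$ | $an\sum_{j=1}^{b}(\bar{y}_{.j.} - \bar{y}_{...})^2$
    Interaction | $(a - 1)(b - 1)$ | $n\sum_{i=1}^{a}\sum_{j=1}^{b}(\bar{y}_{ij.} - \bar{y}_{i..} - \bar{y}_{.j.} + \bar{y}_{...})^2$
    Error | $ab(n - 1)$ | $\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n}(y_{ijk} - \bar{y}_{ij.})^2$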
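    The sums of squares in a balanced two-way layout can be checked numerically. The sketch below is only an illustration: the function name `two_way_anova_ss`, the nested-list data layout and the numbers in `y` are assumptions made for this example, not part of the original notes. It assumes the same number of replicates in every cell.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def two_way_anova_ss(y):
    """Sums of squares and degrees of freedom for a balanced two-way layout.

    y[i][j] is the list of n replicates for level i of A and level j of B.
    """
    a, b, n = len(y), len(y[0]), len(y[0][0])

    cell = [[mean(y[i][j]) for j in range(b)] for i in range(a)]    # ybar_ij.
    row = [mean(cell[i]) for i in range(a)]                          # ybar_i..
    col = [mean([cell[i][j] for i in range(a)]) for j in range(b)]   # ybar_.j.
    grand = mean(row)                                                # ybar_...

    cells = [(i, j) for i in range(a) for j in range(b)]
    ss = {
        "A": b * n * sum((row[i] - grand) ** 2 for i in range(a)),
        "B": a * n * sum((col[j] - grand) ** 2 for j in range(b)),
        "AB": n * sum((cell[i][j] - row[i] - col[j] + grand) ** 2 for i, j in cells),
        "Error": sum((obs - cell[i][j]) ** 2 for i, j in cells for obs in y[i][j]),
        "Total": sum((obs - grand) ** 2 for i, j in cells for obs in y[i][j]),
    }
    df = {
        "A": a - 1,
        "B": b - 1,
        "AB": (a - 1) * (b - 1),
        "Error": a * b * (n - 1),
        "Total": a * b * n - 1,
    }
    return ss, df


# Made-up data: 2 levels of A, 3 levels of B, 2 replicates per cell.
y = [
    [[10.0, 12.0], [14.0, 15.0], [9.0, 11.0]],
    [[13.0, 14.0], [18.0, 20.0], [12.0, 12.0]],
]
ss, df = two_way_anova_ss(y)
print(ss)
print(df)
```

    For balanced data the four component sums of squares add up to the total, and the same holds for the degrees of freedom.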


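    Post-hoc Test

    Multiple Testing

    When testing the different levels of a factor with multiple pairwise t-tests, the chance of falsely rejecting at least one hypothesis grows with the number of tests. For instance, if a factor has 4 levels then there are 6 pairwise t-tests, so that, at the 0.05 level,

    $$P(\text{at least one false rejection}) = 1 - (1 - 0.05)^6 \approx 0.265$$

    This figure treats the 6 tests as independent, an assumption that is violated because the pairwise comparisons share the same group means; either way the overall error rate is well above 0.05. This is adjusted by Tukey's post-hoc test.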
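    The family-wise error rate for any number of factor levels can be computed the same way. This is a small sketch under the independence assumption used in the calculation above; the function name `familywise_error` is hypothetical.

```python
from math import comb


def familywise_error(levels, alpha=0.05):
    """Number of pairwise tests and P(at least one false rejection),
    treating the tests as independent."""
    m = comb(levels, 2)               # number of pairwise comparisons
    return m, 1 - (1 - alpha) ** m


m, fwer = familywise_error(4)
print(m, fwer)   # 6 tests, error rate of roughly 0.26
```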

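    Tukey's HSD Test

    Based on,

    $$|\bar{y}_{i.} - \bar{y}_{j.}|$$

    The two means are declared significantly different if,

    $$|\bar{y}_{i.} - \bar{y}_{j.}| > \frac{q_{\alpha,k,\nu}}{\sqrt{2}}\; s\sqrt{\frac{1}{n_i} + \frac{1}{n_j}}$$

    Where,

    $k$ = Number of groups
    $\nu$ = Degree of freedom for error
    $q_{\alpha,k,\nu}$ = From Table (studentized range)
    $s = \sqrt{MS_E}$

    The confidence interval for $\mu_i - \mu_j$ is given as,

    $$\bar{y}_{i.} - \bar{y}_{j.} \pm \frac{q_{\alpha,k,\nu}}{\sqrt{2}}\; s\sqrt{\frac{1}{n_i} + \frac{1}{n_j}}$$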
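    A minimal sketch of the comparison, assuming the Tukey-Kramer form in which the critical difference is $(q/\sqrt{2})\, s\sqrt{1/n_i + 1/n_j}$. The value of `q` must be read from a studentized-range table by the user; the function names and all numbers below are hypothetical inputs chosen for illustration.

```python
from math import sqrt


def tukey_half_width(q, s, n_i, n_j):
    """Half-width (q / sqrt(2)) * s * sqrt(1/n_i + 1/n_j) of the interval."""
    return q / sqrt(2) * s * sqrt(1 / n_i + 1 / n_j)


def tukey_compare(mean_i, mean_j, q, s, n_i, n_j):
    """Return (significant?, confidence interval) for mu_i - mu_j."""
    diff = mean_i - mean_j
    half_width = tukey_half_width(q, s, n_i, n_j)
    return abs(diff) > half_width, (diff - half_width, diff + half_width)


# Hypothetical inputs: q taken from a table, s = sqrt(MSE) from the ANOVA fit.
significant, interval = tukey_compare(25.0, 20.0, q=3.96, s=2.0, n_i=6, n_j=6)
print(significant, interval)
```

    With equal group sizes $n$ the half-width reduces to $q\, s/\sqrt{n}$, which is the form usually tabulated for balanced designs.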

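    Contrast

    Contrast is defined as,

    $$\Gamma = \sum_{i=1}^{a} c_i\mu_i, \qquad \sum_{i=1}^{a} c_i = 0$$

    For example, while testing the treatment totals, the contrast can be constructed as,

    $$C = \sum_{i=1}^{a} c_i y_{i.}$$

    The variance of $C$ is,

    $$\text{var}(C) = n\sigma^2\sum_{i=1}^{a} c_i^2$$

    For testing the contrast hypothesis $H_0: \Gamma = 0$, the test statistic is constructed as,

    $$t = \frac{\sum c_i y_{i.}}{\sigma\sqrt{n\sum c_i^2}}$$

    However, $\sigma$ is unknown, so it is replaced by $s = \sqrt{MS_E}$. An alternative approach is to construct the F-Statistic as,

    $$F = t^2 = \frac{\left(\sum c_i y_{i.}\right)^2}{n\, MS_E \sum c_i^2}$$

    Residual

    The residual is given by,

    $$e_{ij} = y_{ij} - \hat{y}_{ij}$$

    Standardized Residual

    The standardized residual is given as,

    $$r_{ij} = \frac{e_{ij}}{s\sqrt{1 - h_{ij}}}$$

    Where $h_{ij}$ is the leverage of the observation.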

    Outliers

    As a common rule of thumb, observations with a standardized residual greater than 3 or smaller than -3 are considered outliers.
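    The rule of thumb can be applied mechanically once residuals are available. The sketch below assumes the standardized residual is $e / (s\sqrt{1 - h})$ with leverage $h$; the function names and the example residuals, leverages and `s` are hypothetical.

```python
from math import sqrt


def standardized_residuals(residuals, leverages, s):
    """Divide each residual by s * sqrt(1 - h), with h the leverage."""
    return [e / (s * sqrt(1 - h)) for e, h in zip(residuals, leverages)]


def flag_outliers(std_residuals, cutoff=3.0):
    """Indices of observations whose standardized residual exceeds the cutoff."""
    return [i for i, r in enumerate(std_residuals) if abs(r) > cutoff]


# Made-up residuals from a fit with s = 2.0 and leverage 0.25 for every point.
std_res = standardized_residuals([0.5, -1.0, 7.0, -0.3], [0.25] * 4, s=2.0)
flagged = flag_outliers(std_res)
print(std_res, flagged)
```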

    Normality

    Normality is checked primarily with graphs. The scattered points in the Residuals vs Fitted graph should be random and should not follow any kind of pattern.

    Further, in the Normal Q-Q plot (Theoretical Quantiles vs Standardized Residuals) the points should lie close to the reference line. Points that are far from the line are considered outliers.

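    Confidence interval for $\sigma^2$

    The confidence interval for $\sigma^2$ is given as,

    $$\left[\frac{SS_E}{\chi^2_{\alpha/2,\,\nu}},\; \frac{SS_E}{\chi^2_{1-\alpha/2,\,\nu}}\right]$$

    Where $\nu$ is the error degree of freedom and $\chi^2_{p,\nu}$ is the upper $p$ point of the chi-square distribution.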

    Power of Test

    The power of a test is the probability of rejecting the Null Hypothesis when it is false.

    In any test, the possible outcomes are,

    | Accept $H_0$ | Reject $H_0$
    $H_0$ is true | Correct ($1 - \alpha$) | Type I error ($\alpha$)
    $H_0$ is false | Type II error ($\beta$) | Correct ($1 - \beta$)

    It is given as,

    > ( > )Here,

    Solving for

    , we get,

    ( )

    Partial F-Test

    For testing the effect of a factor in an experiment, we reduce the model and compare it with the full model. To reduce the model for testing a factor, the factor is removed along with all of its interactions. The reduced model is then fitted. For example, if we are testing the effect of factor C then the hypothesis is set as,

    $$H_0: \theta_C = 0$$

    where $\theta_C$ collects all the terms that involve C.

    The Test statistic is,

    $$F = \frac{\left(SS_E(\text{Reduced}) - SS_E(\text{Full})\right)/r}{MS_E(\text{Full})}$$

    This is F-distributed with $r$ and the error degree of freedom of the full model. Here $r$ is the number of parameters in $\theta_C$, i.e. the difference in degree of freedom of error for Full and Reduced Model.
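    The statistic is simple to compute from the two fits. A minimal sketch, assuming the error sums of squares and error degrees of freedom of both models are already known; the function name `partial_f` and the numbers are hypothetical.

```python
def partial_f(sse_reduced, df_reduced, sse_full, df_full):
    """Partial F statistic comparing a reduced model with the full model.

    df_reduced and df_full are the error degrees of freedom of each fit.
    Returns (F, numerator df, denominator df).
    """
    r = df_reduced - df_full              # parameters removed from the model
    mse_full = sse_full / df_full
    f_stat = ((sse_reduced - sse_full) / r) / mse_full
    return f_stat, r, df_full


# Made-up values: dropping the factor frees 4 error degrees of freedom.
f_stat, df_num, df_den = partial_f(sse_reduced=180.0, df_reduced=20,
                                   sse_full=120.0, df_full=16)
print(f_stat, df_num, df_den)
```

    The result is then compared with the upper $\alpha$ point of the F distribution with `df_num` and `df_den` degrees of freedom.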

    Latin Square Design

    This is a special case in which two or more factors are regarded as blocks and there are not enough observations to run a completely randomized block experiment.

    In this design, each treatment level is tested exactly once in each block of the first blocking factor (rows) and exactly once in each block of the second blocking factor (columns).

    Example:

    A B C
    B C A
    C A B
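    One simple way to build such a layout is to shift the treatment labels cyclically from row to row. The sketch below is an illustration only (the function names are hypothetical, and in practice rows, columns and labels would also be randomized); it also checks the once-per-row, once-per-column property.

```python
def cyclic_latin_square(treatments):
    """Build a Latin square by shifting the treatment list one step per row."""
    p = len(treatments)
    return [[treatments[(row + col) % p] for col in range(p)] for row in range(p)]


def is_latin_square(square):
    """True if every treatment occurs exactly once in each row and column."""
    p = len(square)
    symbols = set(square[0])
    rows_ok = all(len(row) == p and set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(p)} == symbols for c in range(p))
    return len(symbols) == p and rows_ok and cols_ok


square = cyclic_latin_square(["A", "B", "C"])
for row in square:
    print(" ".join(row))
print(is_latin_square(square))
```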

    $2^k$ Factorial Design

    The full factorial design with 3 factors is written as,

    $$y = \mu + A + B + C + AB + AC + BC + ABC + \varepsilon$$

    The effect and the sum of squares of the effect are,

    $$\text{Effect} = \frac{\text{Contrast}}{n\,2^{k-1}} \qquad \text{Sum of Square} = \frac{\text{Contrast}^2}{n\,2^k}$$

    Before fitting the full model, we can check which factors and interactions have a significant effect on the response. This can also be done with only one replication using the Normal Probability Plot.

    In the Normal Probability Plot, the negligible effects tend to fall along the line, while the significant effects have non-zero mean and fall off the line.

    Non-significant effects are candidates for removal from the model.
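    Contrasts, effects and sums of squares can be obtained from the treatment totals with a table of plus and minus signs. The sketch assumes the totals are listed in standard order, (1), a, b, ab, c, ..., and that Effect = Contrast / (n 2^(k-1)) and SS = Contrast^2 / (n 2^k); the function name and the totals are made up for illustration.

```python
from itertools import product


def factorial_effects(totals, factors, n):
    """Contrast, effect and sum of squares for every effect of a 2^k design.

    totals  : treatment totals in standard order, e.g. (1), a, b, ab for "AB"
    factors : string of factor letters, e.g. "AB" or "ABC"
    n       : number of replicates behind each total
    """
    k = len(factors)
    # Reversing each tuple makes the first factor change fastest (standard order).
    runs = [levels[::-1] for levels in product((-1, 1), repeat=k)]
    results = {}
    for mask in range(1, 2 ** k):
        name = "".join(f for bit, f in enumerate(factors) if (mask >> bit) & 1)
        contrast = 0.0
        for run, total in zip(runs, totals):
            sign = 1
            for bit in range(k):
                if (mask >> bit) & 1:
                    sign *= run[bit]       # product of the coded factor levels
            contrast += sign * total
        results[name] = {
            "contrast": contrast,
            "effect": contrast / (n * 2 ** (k - 1)),
            "ss": contrast ** 2 / (n * 2 ** k),
        }
    return results


# Made-up 2^2 example with n = 3 replicates: totals for (1), a, b, ab.
effects = factorial_effects([80.0, 100.0, 60.0, 90.0], "AB", n=3)
for name, values in effects.items():
    print(name, values)
```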

    Fractional Factorial Design

    A full design with $k$ factors requires $2^k$ experiments per replication, which increases rapidly for large $k$, while only a few degrees of freedom are used in estimating the main effects and lower order interaction effects.

    For instance, in a $2^6$ experiment, 64 runs are needed; the main effects use only 6 degrees of freedom and the two-factor interactions use 15. The other 42 degrees of freedom are associated with higher order interactions, which might have an insignificant effect on the response. If we can run only a fraction of the full experiment, it can save a lot of work and cost.
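    The degree-of-freedom count follows from binomial coefficients: with $k$ two-level factors there are C(k, m) interactions of order m, each using one degree of freedom. A short check (the function name is hypothetical):

```python
from math import comb


def df_by_order(k):
    """Degrees of freedom used by effects of each order in a 2^k design."""
    return {order: comb(k, order) for order in range(1, k + 1)}


df = df_by_order(6)
main_effects = df[1]
two_factor = df[2]
higher_order = sum(count for order, count in df.items() if order >= 3)
print(main_effects, two_factor, higher_order)
```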

    Aliases

    While running a fractional factorial design, some effects are confounded with others. For instance, if BC is confounded with A, then when estimating A we are actually estimating A+BC. Thus A and BC are aliases.

    Design Resolution

    The resolution of a design is equal to the smallest number of letters in any word of the defining relation.
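    Aliases can be worked out by multiplying an effect by each word of the defining relation, where a letter appearing twice cancels. This is a sketch under two assumptions: the complete defining relation is supplied (including all products of generators), and the resolution is taken as the length of its shortest word. The function names are hypothetical.

```python
def multiply(word_a, word_b):
    """Product of two effects: letters present in both cancel (A * A = I)."""
    return "".join(sorted(set(word_a) ^ set(word_b)))


def aliases(effect, defining_words):
    """Effects confounded with `effect` under the given defining relation."""
    return [multiply(effect, word) for word in defining_words]


def resolution(defining_words):
    """Length of the shortest word in the defining relation."""
    return min(len(word) for word in defining_words)


# Half fraction of a 2^3 design with defining relation I = ABC.
words = ["ABC"]
print(aliases("A", words))      # A is aliased with BC
print(resolution(words))        # resolution III
```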

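    Random Effect Model

    When a factor in a model is considered random, the restriction is imposed that its effects are independent and identically normally distributed.

    In the model,

    $$y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$$

    It is assumed that,

    $$\alpha_i \sim N(0, \sigma_\alpha^2), \qquad \varepsilon_{ij} \sim N(0, \sigma^2)$$

    The terms $\sigma_\alpha^2$ and $\sigma^2$ are called variance components. Thus,

    $$\text{var}(y_{ij}) = \text{var}(\mu + \alpha_i + \varepsilon_{ij}) = \sigma_\alpha^2 + \sigma^2$$

    Also, the covariance between two observations $y_{ij}$ and $y_{ij'}$ ($j \neq j'$) at the same level is given as,

    $$\text{cov}(y_{ij}, y_{ij'}) = \text{cov}(\mu + \alpha_i + \varepsilon_{ij},\; \mu + \alpha_i + \varepsilon_{ij'})$$

    $$= \text{cov}(\alpha_i, \alpha_i) + \text{cov}(\alpha_i, \varepsilon_{ij'}) + \text{cov}(\varepsilon_{ij}, \alpha_i) + \text{cov}(\varepsilon_{ij}, \varepsilon_{ij'})$$

    $$= \sigma_\alpha^2 + 0 + 0 + 0 = \sigma_\alpha^2$$

    Thus, the correlation between the two is given as,

    $$\text{cor}(y_{ij}, y_{ij'}) = \frac{\text{cov}(y_{ij}, y_{ij'})}{\sqrt{\text{var}(y_{ij})\,\text{var}(y_{ij'})}} = \frac{\sigma_\alpha^2}{\sigma_\alpha^2 + \sigma^2}$$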
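    As a numeric illustration of the variance components, the sketch below evaluates the variance, the within-level covariance and the resulting correlation for a one-factor random effect model. The function name, the argument names `sigma_alpha2` and `sigma2`, and the values are assumptions made for this example.

```python
def variance_summary(sigma_alpha2, sigma2):
    """Variance, within-level covariance and correlation of two observations
    sharing the same random effect, given the two variance components."""
    total = sigma_alpha2 + sigma2
    return {
        "var_y": total,                       # var(y_ij)
        "cov_same_level": sigma_alpha2,       # cov(y_ij, y_ij')
        "correlation": sigma_alpha2 / total,  # cor(y_ij, y_ij')
    }


summary = variance_summary(sigma_alpha2=4.0, sigma2=12.0)
print(summary)
```

    The correlation grows towards 1 as the between-level component dominates the error variance.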

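    Nested Design

    The designs discussed so far are all cross-sectional (crossed) designs. A design where the levels of one factor are nested under the levels of another factor is called a Nested Design.

    Example of nesting (herds nested within sires, cows nested within herds):

    Sire 1
        Herd 1: Cow 1, Cow 2
        Herd 2: Cow 3, Cow 4
    Sire 2
        Herd 3: Cow 5, Cow 6
        Herd 4: Cow 7, Cow 8

    A Two stage Nested Model is,

    $$y_{ijk} = \mu + \alpha_i + \beta_{j(i)} + \varepsilon_{k(ij)}$$

    Here, the factor B with $b$ levels is nested under factor A with $a$ levels. Interaction is not possible under a Nested Design.

    The Sum of square for a Nested Design can be obtained from a cross-sectional model with interaction as,

    $$SS_{B(A)} = SS_B + SS_{AB}$$

    Expected Mean Square

    Cross-sectional Designs

    Two Factor, both Fixed

    $$E(MS_A) = \sigma^2 + \frac{bn\sum\alpha_i^2}{a - 1}$$

    $$E(MS_B) = \sigma^2 + \frac{an\sum\beta_j^2}{b - 1}$$

    $$E(MS_{AB}) = \sigma^2 + \frac{n\sum\sum(\alpha\beta)_{ij}^2}{(a - 1)(b - 1)}$$

    Two Factor, both random

    $$E(MS_A) = \sigma^2 + n\sigma_{\alpha\beta}^2 + bn\sigma_\alpha^2$$

    $$E(MS_B) = \sigma^2 + n\sigma_{\alpha\beta}^2 + an\sigma_\beta^2$$

    $$E(MS_{AB}) = \sigma^2 + n\sigma_{\alpha\beta}^2$$

    Two Factor, one random (A fixed, B random)

    $$E(MS_A) = \sigma^2 + n\sigma_{\alpha\beta}^2 + \frac{bn\sum\alpha_i^2}{a - 1}$$

    $$E(MS_B) = \sigma^2 + an\sigma_\beta^2$$

    $$E(MS_{AB}) = \sigma^2 + n\sigma_{\alpha\beta}^2$$

    Nested Design

    Three Stage Nested - A and B Fixed, C Random

    $$E(MS_A) = \sigma^2 + n\sigma_\gamma^2 + \frac{bcn\sum\alpha_i^2}{a - 1}$$

    $$E(MS_{B(A)}) = \sigma^2 + n\sigma_\gamma^2 + \frac{cn\sum\sum\beta_{j(i)}^2}{a(b - 1)}$$

    $$E(MS_{C(B)}) = \sigma^2 + n\sigma_\gamma^2$$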

    ANCOVA

    ANCOVA is a combination of regression and a linear model with categorical factors (a model without covariates). An ANCOVA model may therefore contain both categorical and continuous variables in the same model.

    For example, the weight of a person can depend on sex and height. The model including these variables along with their interaction is,

    $$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \varepsilon$$

    Here, $x_1 = 1$ for female and $x_1 = -1$ for male, and $x_2$ is the measurement for the height.

    Then the separate regression lines for female and male can be obtained as,

    For Female:

    $$y = (\beta_0 + \beta_1) + (\beta_2 + \beta_3)x_2 + \varepsilon$$

    For Male:

    $$y = (\beta_0 - \beta_1) + (\beta_2 - \beta_3)x_2 + \varepsilon$$
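    The two group-specific lines follow directly from the fitted coefficients. A minimal sketch, assuming the +1 (female) / -1 (male) coding of the sex variable and an interaction term; the function names and the coefficient values are hypothetical and do not come from any real data.

```python
def group_lines(b0, b1, b2, b3):
    """(intercept, slope) of the height regression for each sex,
    with x1 = +1 for female and x1 = -1 for male."""
    return {
        "female": (b0 + b1, b2 + b3),
        "male": (b0 - b1, b2 - b3),
    }


def predict(b0, b1, b2, b3, x1, x2):
    """Fitted value from the full model with interaction."""
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2


# Hypothetical coefficients for weight (kg) against height (cm).
coef = dict(b0=-40.0, b1=-5.0, b2=0.65, b3=-0.02)
lines = group_lines(**coef)
intercept_f, slope_f = lines["female"]
print(lines)
print(predict(x1=1, x2=170.0, **coef), intercept_f + slope_f * 170.0)
```

    Both printed predictions agree, since the female line is just the full model evaluated at $x_1 = 1$.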