lecture 11


Upload: luh-putu-safitri-pratiwi

Post on 16-Dec-2015


  • Models Not of Full Rank

    Estimation/Hypothesis Testing

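    $$Y_{N \times 1} = X_{N \times p}\, b_{p \times 1} + e_{N \times 1}$$

    with $e = y - E(y)$, so that $E(e) = 0$ and $E(Y) = Xb$, and $\operatorname{var}(e) = E(ee') = \sigma^2 I_N$.

    This results in $e \sim (0, \sigma^2 I)$ and $Y \sim (Xb, \sigma^2 I)$.

    Normal equations:

    Using least squares we obtain

    $$X'X\hat{b} = X'Y$$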

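    Consider a completely randomized design

    $$y_{ij} = \mu + \alpha_i + e_{ij} \quad \text{for } i = 1, 2, 3$$

    Then $b' = [\mu \;\; \alpha_1 \;\; \alpha_2 \;\; \alpha_3]$ and the data are represented as

    $$\begin{array}{c|cccc}
    \text{observation} & \mu & \alpha_1 & \alpha_2 & \alpha_3 \\ \hline
    y_{11} & 1 & 1 & 0 & 0 \\
    y_{12} & 1 & 1 & 0 & 0 \\
    y_{13} & 1 & 1 & 0 & 0 \\
    y_{21} & 1 & 0 & 1 & 0 \\
    y_{22} & 1 & 0 & 1 & 0 \\
    y_{31} & 1 & 0 & 0 & 1
    \end{array}$$

    where the data are:

              normal   off-type   aberrant
              101      84         32
              105      88
               94
    totals    300      172        32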

    The sum of the last three columns of $X$ equals the first column: every $y_{ij}$ contains $\mu$, so the first column of $X$ is all ones, and every $y_{ij}$ contains exactly one $\alpha_i$, so the last three columns sum to a column of ones. Hence $X$ is not of full column rank. $X'X$ is square and symmetric; its elements are the inner products of the columns of $X$ with each other. Because $X$ is not of full column rank, $X'X$ is not of full rank either.
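The rank deficiency described above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the variable names are illustrative and not part of the lecture.

```python
import numpy as np

# Design matrix of the example; columns correspond to mu, alpha_1, alpha_2, alpha_3.
X = np.array([
    [1, 1, 0, 0],  # y11
    [1, 1, 0, 0],  # y12
    [1, 1, 0, 0],  # y13
    [1, 0, 1, 0],  # y21
    [1, 0, 1, 0],  # y22
    [1, 0, 0, 1],  # y31
], dtype=float)

# The last three columns add up to the first column (a column of ones).
last_three_sum_to_first = np.array_equal(X[:, 1:].sum(axis=1), X[:, 0])

# X has four columns, but only three of them are linearly independent.
rank_X = int(np.linalg.matrix_rank(X))
rank_XtX = int(np.linalg.matrix_rank(X.T @ X))

print(last_three_sum_to_first, rank_X, rank_XtX)
```

Both ranks come out one short of the number of columns, which is why the normal equations have no unique solution.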


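  • $$X'X = \begin{bmatrix} 6 & 3 & 2 & 1 \\ 3 & 3 & 0 & 0 \\ 2 & 0 & 2 & 0 \\ 1 & 0 & 0 & 1 \end{bmatrix}$$

    NOTE: the elements of $X'X$ are the number of times each parameter of the model occurs in a total, i.e. $\mu$ occurs 6 times in $y_{\cdot\cdot}$, $\alpha_1$ occurs 3 times in $y_{\cdot\cdot}$, $\alpha_2$ occurs 2 times in $y_{\cdot\cdot}$, and $\alpha_3$ occurs once in $y_{\cdot\cdot}$; $\mu$ occurs 3 times in $y_{1\cdot}$, $\alpha_1$ occurs 3 times in $y_{1\cdot}$, and $\alpha_2$ and $\alpha_3$ do not occur in $y_{1\cdot}$.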

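    and

    $$X'Y = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} y_{11} \\ y_{12} \\ y_{13} \\ y_{21} \\ y_{22} \\ y_{31} \end{bmatrix} = \begin{bmatrix} y_{\cdot\cdot} \\ y_{1\cdot} \\ y_{2\cdot} \\ y_{3\cdot} \end{bmatrix} = \begin{bmatrix} 504 \\ 300 \\ 172 \\ 32 \end{bmatrix}$$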

    For example, $101 = y_{11} = \mu + \alpha_1 + e_{11}$ and $105 = y_{12} = \mu + \alpha_1 + e_{12}$.

    $X'Y$ is a vector consisting of the inner products of the columns of $X$ with $Y$, and since the nonzero elements of $X$ are ones, we obtain the vector of totals $[y_{\cdot\cdot} \;\; y_{1\cdot} \;\; y_{2\cdot} \;\; y_{3\cdot}]'$.
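As a quick check of the two products above, here is a small sketch that forms $X'X$ and $X'Y$ from the example data. It assumes NumPy; the names `XtX` and `Xty` are illustrative.

```python
import numpy as np

# Design matrix (columns: mu, alpha_1, alpha_2, alpha_3) and the six observations.
X = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
], dtype=float)
y = np.array([101, 105, 94, 84, 88, 32], dtype=float)

# Entries of X'X count how often each parameter occurs in each total.
XtX = X.T @ X

# X'y collects the grand total followed by the three treatment totals.
Xty = X.T @ y

print(XtX)
print(Xty)
```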

    Since $X'X$ is not of full rank, there is no unique solution to the normal equations

    $$X'Xb^0 = X'Y$$


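  • where

    $$b^0 = \begin{bmatrix} \mu^0 \\ \alpha_1^0 \\ \alpha_2^0 \\ \alpha_3^0 \end{bmatrix}$$

    and, applying a generalized inverse $G$ of $X'X$, we write

    $$GX'Xb^0 = GX'Y, \qquad b^0 = GX'Y$$

    For the example, the normal equations are

    $$\begin{bmatrix} 6 & 3 & 2 & 1 \\ 3 & 3 & 0 & 0 \\ 2 & 0 & 2 & 0 \\ 1 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \mu^0 \\ \alpha_1^0 \\ \alpha_2^0 \\ \alpha_3^0 \end{bmatrix} = \begin{bmatrix} y_{\cdot\cdot} \\ y_{1\cdot} \\ y_{2\cdot} \\ y_{3\cdot} \end{bmatrix} = \begin{bmatrix} 504 \\ 300 \\ 172 \\ 32 \end{bmatrix}$$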

    The normal equations can be rewritten as

    $$E(X'Y) = X'Y$$

    with $b$ replaced by $b^0$ on the left-hand side: since $E(X'Y) = X'Xb$, this gives $X'Xb^0 = X'Y$.

    Hence a solution is $b^0 = GX'Y$.
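To illustrate that different generalized inverses give different solutions of the same normal equations, here is a hedged sketch using the example data. The two choices of $G$ below (a diagonal one that sets $\mu^0 = 0$, and the Moore-Penrose inverse from `np.linalg.pinv`) are assumptions made for illustration; the lecture does not prescribe a particular $G$.

```python
import numpy as np

X = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
], dtype=float)
y = np.array([101, 105, 94, 84, 88, 32], dtype=float)
XtX = X.T @ X
Xty = X.T @ y

# Generalized inverse 1: invert the nonsingular lower-right 3x3 block, zeros elsewhere.
G1 = np.diag([0.0, 1 / 3, 1 / 2, 1.0])

# Generalized inverse 2: the Moore-Penrose inverse (explicit cutoff for the zero singular value).
G2 = np.linalg.pinv(XtX, rcond=1e-10)

b0_1 = G1 @ Xty  # solution with mu^0 = 0: the treatment means
b0_2 = G2 @ Xty  # a different solution (the one of minimum norm)

# Both vectors satisfy the normal equations X'X b0 = X'y.
solves_1 = np.allclose(XtX @ b0_1, Xty)
solves_2 = np.allclose(XtX @ b0_2, Xty)

print(b0_1, b0_2, solves_1, solves_2)
```

The two solution vectors differ element by element, yet each one reproduces $X'Y$ when multiplied by $X'X$.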

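    Consequences of a solution: $b^0$ is a function of $Y$.

    a. $E(b^0) = GX'E(Y) = GX'Xb = Hb$, where $H = GX'X$

    b. $\operatorname{var}(b^0) = \operatorname{var}(GX'Y) = GX'\operatorname{var}(Y)XG' = GX'XG'\sigma^2$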
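The expectation $Hb$ can be made concrete for the example. The sketch below assumes the diagonal generalized inverse that sets $\mu^0 = 0$ and a purely hypothetical parameter vector `b_true`; both are illustrative choices, not values from the lecture.

```python
import numpy as np

X = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
], dtype=float)
XtX = X.T @ X

# One generalized inverse of X'X (assumed choice: zero row and column for mu).
G = np.diag([0.0, 1 / 3, 1 / 2, 1.0])

# H = G X'X maps the true parameters to the expected value of the solution b0.
H = G @ XtX

# Hypothetical true parameters (mu, alpha_1, alpha_2, alpha_3), for illustration only.
b_true = np.array([10.0, 1.0, 2.0, 3.0])
expected_b0 = H @ b_true

print(H)
print(expected_b0)
```

With this $G$, the rows of $H$ show that $b^0$ has expectation $(0, \mu + \alpha_1, \mu + \alpha_2, \mu + \alpha_3)'$ rather than $b$ itself, so $b^0$ is a solution of the normal equations and not an unbiased estimator of $b$.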

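    For symmetric $X'X$ there exists an orthogonal permutation matrix $P$ such that

  • $$(XP)'(XP) = P'X'XP = \begin{bmatrix} A_{11} & A_{12} \\ A_{12}' & A_{22} \end{bmatrix}$$

    with $A_{11}$ nonsingular and of the same rank as $X'X$; then

    $$G = P \begin{bmatrix} A_{11}^{-1} & 0 \\ 0 & 0 \end{bmatrix} P'$$

    and $GX'XG' = G$, so that $\operatorname{var}(b^0) = G\sigma^2$.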
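The construction of $G$ from a permuted, partitioned $X'X$ can be sketched as follows. The column order chosen here (moving the $\mu$ column last so that $A_{11}$ is diagonal) is an assumption for illustration; any permutation that leaves a nonsingular leading block of full rank would do.

```python
import numpy as np

X = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
], dtype=float)
XtX = X.T @ X
p = XtX.shape[0]
r = int(np.linalg.matrix_rank(XtX))

# Permutation matrix P: X @ P has columns in the order alpha_1, alpha_2, alpha_3, mu.
perm = [1, 2, 3, 0]
P = np.eye(p)[:, perm]

# Partition P'X'XP and invert the leading r x r block A11.
A = P.T @ XtX @ P
A11 = A[:r, :r]
block = np.zeros((p, p))
block[:r, :r] = np.linalg.inv(A11)

# G = P [A11^{-1} 0; 0 0] P'
G = P @ block @ P.T

is_g_inverse = np.allclose(XtX @ G @ XtX, XtX)  # X'X G X'X = X'X
is_reflexive = np.allclose(G @ XtX @ G.T, G)    # G X'X G' = G, so var(b0) = G sigma^2

print(G)
print(is_g_inverse, is_reflexive)
```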

    c. Estimating $E(y)$

    $$\widehat{E(y)} = \hat{y} = Xb^0 = XGX'y$$

    Note that this vector is invariant to $G$, since $XGX'$ is invariant to $G$; hence $\hat{y}$ is always the same regardless of which solution $b^0$ is used.
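The invariance of $XGX'$ can be verified on the example by comparing two different generalized inverses. This is a sketch under the assumption that NumPy is available; the two $G$ matrices are illustrative choices.

```python
import numpy as np

X = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
], dtype=float)
y = np.array([101, 105, 94, 84, 88, 32], dtype=float)
XtX = X.T @ X

# Two different generalized inverses of X'X.
G1 = np.diag([0.0, 1 / 3, 1 / 2, 1.0])
G2 = np.linalg.pinv(XtX, rcond=1e-10)

# X G X' is the same matrix for both choices ...
proj1 = X @ G1 @ X.T
proj2 = X @ G2 @ X.T
same_projection = np.allclose(proj1, proj2)

# ... so the fitted values do not depend on which solution b0 was used.
y_hat = proj1 @ y

print(same_projection, y_hat)
```

In this design the fitted value of each observation is simply its treatment mean.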

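    d. Residual Error Sum of Squares

    $$\begin{aligned}
    SSE &= (y - Xb^0)'(y - Xb^0) \\
        &= y'y - y'Xb^0 - (Xb^0)'y + (Xb^0)'(Xb^0) \\
        &= y'(I - XGX')(I - XGX')y \\
        &= y'(I - XGX')y \\
        &= y'y - y'XGX'y \\
        &= y'y - b^{0\prime}X'y \quad \text{(computational form)}
    \end{aligned}$$

    $XGX'$ is invariant to $G$, so SSE is invariant to $G$ and hence invariant to $b^0$.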
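For the example data, the definition and the computational form of SSE can be compared directly. A minimal sketch, assuming NumPy and the diagonal generalized inverse with $\mu^0 = 0$ (an illustrative choice):

```python
import numpy as np

X = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
], dtype=float)
y = np.array([101, 105, 94, 84, 88, 32], dtype=float)

G = np.diag([0.0, 1 / 3, 1 / 2, 1.0])  # assumed generalized inverse of X'X
b0 = G @ X.T @ y                       # one solution of the normal equations

# Definition: squared length of the residual vector.
resid = y - X @ b0
sse_direct = float(resid @ resid)

# Computational form: y'y - b0'X'y.
sse_computational = float(y @ y - b0 @ (X.T @ y))

print(sse_direct, sse_computational)
```

Both forms give 70 for these data: $y'y = 45886$ and $b^{0\prime}X'y = 45816$.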

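    e. Estimating residual error variance

    With $y \sim N(Xb, \sigma^2 I)$,

    $$\begin{aligned}
    E(SSE) &= E[y'(I - XGX')y] \\
           &= \operatorname{tr}[(I - XGX')\sigma^2 I] + b'X'(I - XGX')Xb \\
           &= \sigma^2 \operatorname{rank}(I - XGX') \\
           &= [N - \operatorname{rank}(X)]\sigma^2
    \end{aligned}$$

    The quadratic term vanishes because $XGX'X = X$, and the trace of the idempotent matrix $I - XGX'$ equals its rank.

  • Hence

    $$\hat{\sigma}^2 = \frac{SSE}{N - \operatorname{rank}(X)}$$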
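Applied to the example, the estimator divides SSE by $N - \operatorname{rank}(X) = 6 - 3 = 3$. The sketch below assumes NumPy and the same illustrative generalized inverse as before; it also checks that the trace of $I - XGX'$ equals $N - \operatorname{rank}(X)$.

```python
import numpy as np

X = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
], dtype=float)
y = np.array([101, 105, 94, 84, 88, 32], dtype=float)

G = np.diag([0.0, 1 / 3, 1 / 2, 1.0])  # assumed generalized inverse of X'X
N = X.shape[0]
r = int(np.linalg.matrix_rank(X))

# Residual-forming matrix I - XGX'; it is idempotent, so its trace equals its rank.
M = np.eye(N) - X @ G @ X.T
trace_M = float(np.trace(M))

sse = float(y @ M @ y)
sigma2_hat = sse / (N - r)

print(trace_M, sse, sigma2_hat)
```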

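    f. Partitioning the SST (sum of squares total):

    $$\begin{array}{lll}
    SST = y'y & SST = y'y & SST_m = y'y - N\bar{y}^2 \\
              & SSM = N\bar{y}^2 = y'(N^{-1}11')y & \\
    SSR = y'XGX'y = b^{0\prime}X'y & SSR_m = y'(XGX' - N^{-1}11')y = SSR - SSM & SSR_m = y'(XGX' - N^{-1}11')y \\
    SSE = y'(I - XGX')y & SSE = y'(I - XGX')y & SSE = y'(I - XGX')y
    \end{array}$$

    where $SSM$ is the sum of squares from fitting a general mean.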
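The quadratic forms in this partition can be evaluated for the example data. This is a sketch assuming NumPy; the variable names mirror the symbols above, and the generalized inverse is the same illustrative choice used elsewhere.

```python
import numpy as np

X = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
], dtype=float)
y = np.array([101, 105, 94, 84, 88, 32], dtype=float)

G = np.diag([0.0, 1 / 3, 1 / 2, 1.0])  # assumed generalized inverse of X'X
N = len(y)
ones = np.ones(N)

proj = X @ G @ X.T                 # X G X'
mean_proj = np.outer(ones, ones) / N  # N^{-1} 1 1'

SST = float(y @ y)                          # total, uncorrected
SSM = float(y @ mean_proj @ y)              # fitting a general mean
SSR = float(y @ proj @ y)                   # fitting the full model
SSE = float(y @ (np.eye(N) - proj) @ y)     # residual
SSR_m = float(y @ (proj - mean_proj) @ y)   # model, corrected for the mean
SST_m = SST - SSM                           # total, corrected for the mean

print(SST, SSM, SSR, SSE, SSR_m, SST_m)
```

For these data the three partitions read $45886 = 45816 + 70$, $45886 = 42336 + 3480 + 70$, and $3550 = 3480 + 70$.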

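    g. Coefficient of Determination $R^2$

    The estimated expected values of $y$ are $\hat{y}$. The coefficient of determination is the square of the product-moment correlation between the observations $y$ and $\hat{y}$, so

    $$R^2 = \frac{\left[\sum_{i=1}^{N} (y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})\right]^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2 \, \sum_{i=1}^{N} (\hat{y}_i - \bar{\hat{y}})^2}$$

    Note:

    $$X'XGX' = X'$$

    and because $1'$ is the first row of $X'$, $1'XGX' = 1'$, so $\bar{\hat{y}} = \bar{y}$ and thus

    $$R^2 = \frac{(SSR_m)^2}{SST_m \, SSR_m} = \frac{SSR_m}{SST_m}$$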
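The equality between the squared correlation and the ratio $SSR_m / SST_m$ can be confirmed numerically on the example. A minimal sketch, assuming NumPy and the same illustrative generalized inverse:

```python
import numpy as np

X = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
], dtype=float)
y = np.array([101, 105, 94, 84, 88, 32], dtype=float)

G = np.diag([0.0, 1 / 3, 1 / 2, 1.0])  # assumed generalized inverse of X'X
y_hat = X @ G @ X.T @ y

# The fitted values have the same mean as the observations.
same_mean = bool(np.isclose(y_hat.mean(), y.mean()))

# R^2 as the squared product-moment correlation between y and y_hat.
R2_corr = float(np.corrcoef(y, y_hat)[0, 1] ** 2)

# R^2 as the ratio of mean-corrected sums of squares.
SSR_m = float(np.sum((y_hat - y.mean()) ** 2))
SST_m = float(np.sum((y - y.mean()) ** 2))
R2_ratio = SSR_m / SST_m

print(same_mean, R2_corr, R2_ratio)
```

Both routes give $3480 / 3550 \approx 0.980$ for these data.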
