Exam Interpretation



Name: N.V. Shashidhar Parimi. Stat Exam Date: 04/04/11

Problem 3.1.1: Graphical representation for explaining variability between the BGS girls and BGS boys data.

    > library(alr3)

    > attach(BGSgirls)

    > names(BGSgirls)

    "Sex" "WT2" "HT2" "WT9" "HT9" "LG9" "ST9" "WT18" "HT18" "LG18 "ST18" "Soma"

    > pairs(~WT2+HT2+WT9+HT9+LG9+ST9+Soma)

[Scatterplot matrix of WT2, HT2, WT9, HT9, LG9, ST9, and Soma]

The above scatterplot matrix shows clear relationships among WT2, HT2, WT9, HT9, LG9, and ST9: when one of the growth variables increases, the others tend to increase as well, so the variables show a positive association with one another.
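These positive associations can also be checked numerically with the sample correlation matrix; a minimal sketch, assuming the alr3 package and the BGSgirls data used above:

# Sample correlations among the variables in the scatterplot matrix;
# values near +1 confirm the positive associations described above.
library(alr3)
round(cor(BGSgirls[, c("WT2", "HT2", "WT9", "HT9", "LG9", "ST9", "Soma")]), 2)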

    > pairs(HT2~HT9+Soma)


[Scatterplot matrix of HT2, HT9, and Soma]

The matrix above shows the pairwise relationships (sample correlations) between the height variables and Soma.

3.1.3 Fitting a multiple regression model for the mean function.

    > lm(Soma~HT2+WT2+HT9+WT9+ST9)

Call:

    lm(formula = Soma ~ HT2 + WT2 + HT9 + WT9 + ST9)

    Coefficients:

    (Intercept) HT2 WT2 HT9 WT9 ST9

    8.8590417 -0.0792535 -0.0409358 -0.0009613 0.1280506 -0.0092629

    > summary(lm(Soma~HT2+WT2+HT9+WT9+ST9))

    lm(formula = Soma ~ HT2 + WT2 + HT9 + WT9 + ST9)

The residual values are:


    Min 1Q Median 3Q Max

    -2.03132 -0.34062 0.01917 0.43939 0.97266

For each coefficient, the estimate, standard error, t value, and p-value Pr(>|t|) are:

             Estimate    Std. Error  t value  Pr(>|t|)
(Intercept)   8.8590417  2.3764431    3.728   0.00041
HT2          -0.0792535  0.0354034   -2.239   0.028668
WT2          -0.0409358  0.0754343   -0.543   0.589244
HT9          -0.0009613  0.0260735   -0.037   0.970704
WT9           0.1280506  0.0203544    6.291   3.2e-08
ST9          -0.0092629  0.0060130   -1.540   0.128373

The residual standard error is 0.5791 on 64 degrees of freedom, the multiple R-squared is 0.5211, the adjusted R-squared is 0.4837, and the F statistic is 13.93 on 5 and 64 degrees of freedom, with a p-value of 3.309e-09.

Yes, the multiple linear regression model is a good fit for these Berkeley Guidance Study (BGS) data on girls and boys.

    Executed
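The fit statistics quoted above can be read directly off the summary object; a minimal sketch, assuming the same model and the BGSgirls data:

fit <- lm(Soma ~ HT2 + WT2 + HT9 + WT9 + ST9, data = BGSgirls)
s <- summary(fit)
s$sigma          # residual standard error (about 0.58 on 64 df)
s$r.squared      # multiple R-squared (about 0.52)
s$adj.r.squared  # adjusted R-squared (about 0.48)
s$fstatistic     # F value with its numerator and denominator degrees of freedom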

Problem 5.6.1: Jevons's gold coin data.

    library(alr3)

    > attach(jevons)

    > names(jevons)

    [1] "Age" "n" "Weight" "SD" "Min" "Max"

    > plot(Weight~Age)


The above plot shows that Weight decreases steadily with Age. Although the values decrease, they do so in an orderly, roughly linear way.

Now, to see how SD behaves with Age, plot SD against Age:

    > plot(SD~Age)


The above plot shows that SD increases with increasing Age; SD is roughly proportional to Age.


Conclusion: From the scatterplot of Weight versus Age together with the SD values, the mean Weight decreases with Age while the SD increases with Age, i.e. the two move in opposite (reversed) order. The standard weight of a gold sovereign was supposed to be 7.9876 g and the minimum legal weight was 7.9379 g, and the final SD values are consistent with the data provided.
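The two reference weights quoted in the conclusion can be drawn on the plot as horizontal lines for comparison; a minimal sketch, assuming the jevons data from alr3:

library(alr3)
plot(Weight ~ Age, data = jevons)
abline(h = 7.9876, lty = 2)  # standard weight of a gold sovereign
abline(h = 7.9379, lty = 3)  # minimum legal weight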

Problem 6.17

Continuing with the Jevons data above, now using the delta method. The program follows: Weight is regressed on Age, with weights n/SD^2 used for the coefficients.

The summary of the coefficients is:

    > summary(lm(Weight~Age,weights=(n/SD^2)))$coef

    Estimate Std. Error t value Pr(>|t|)

    (Intercept) 7.99652179 0.0013219826 6048.88577 9.964249e-12

    Age -0.02375617 0.0008797498 -27.00333 1.114497e-04
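The later calls abline(m1) and delta.method(m1, ...) assume that this weighted fit has been stored as m1; a minimal sketch of that step, using the same weights n/SD^2:

m1 <- lm(Weight ~ Age, weights = n/SD^2, data = jevons)
summary(m1)$coef  # reproduces the coefficient table shown above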

Then Weight is plotted against Age again:

    > plot(Weight~Age)


[Plot of Weight versus Age for the jevons data]

    > abline(m1)

This draws the fitted line from m1; for comparison with the value 7.9379, a horizontal reference line is added:

    > abline(7.9379,0)

The plot shows that the two lines stay distinct and do not touch: the horizontal reference line does not follow the overall trend in the graph, which is fine.

Now applying the delta method directly:

    > delta.method(m1,"(7.9379-b0)/b1")

Estimate SE
(7.9379-b0)/b1 0.003317994 1.071017e-05

    (7.9379-7.996521)/(-0.002375615)

    [1] 24.67614


    > 2.467612-1.96*0.04940154

    [1] 2.370785

    > 2.467612+1.96*0.04940154

    [1] 2.564439
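These two computations are the usual Wald 95% interval, estimate ± 1.96 × SE; a minimal sketch of the step, reusing the estimate and standard error quoted above:

est <- 2.467612    # delta-method estimate used above
se  <- 0.04940154  # its standard error
c(lower = est - 1.96*se, upper = est + 1.96*se)  # approximate 95% confidence interval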

I was not fully able to follow how the standard error for the Jevons coin data was determined. The plot showed that the data have error in the SD, and the delta method tries to handle that error in more detail, but part of it remains unresolved and would need more careful work. The cut-off stated above is around 7.938; a value somewhere between 7.91 and 7.915 would also be reasonable. It is good to see that the values fall at roughly equal intervals.

1) Question)

Answer) The regression mean function cannot always be written as a linear combination of the terms; sometimes the data do not follow such a form, so we need to estimate a nonlinear regression model.

Example: the turkey diet supplement experiment described in the text.

The Gauss-Newton method is the standard approach for nonlinear regression; it relies on linear approximations to the nonlinear mean function.

    > library(alr3)

    > attach(lakemary)

    > names(lakemary)

    [1] "Age" "Length"


    > plot(Length~Age)

The plot above shows an increasing relationship between Age and Length; the points overlap somewhat, but the pattern is clearly increasing.

    > lm(log(1-(Length/200))~Age)

    Call:

    lm(formula = log(1 - (Length/200)) ~ Age)

    Coefficients:

    (Intercept) Age

    -0.03288 -0.36283

The intercept estimate is betahat0 = -0.03288 and the Age coefficient is betahat1 = -0.36283. With the asymptote fixed at 200, log(1 - Length/200) = -K(Age - t0) = K*t0 - K*Age, so the slope estimates -K and the intercept estimates K*t0. The starting values are therefore

K* = -betahat1

t* = betahat0/(-betahat1)

> -0.03288/0.36283

[1] -0.09062095
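The same starting values can be taken directly from the coefficients of the linearizing fit rather than retyped; a minimal sketch, assuming the lakemary data and the assumed asymptote of 200:

library(alr3)
b  <- coef(lm(log(1 - Length/200) ~ Age, data = lakemary))
k0 <- unname(-b[2])        # K* = -betahat1
t0 <- unname(b[1]/(-b[2])) # t* = betahat0/(-betahat1)
c(Linf = 200, k = k0, t0 = t0)  # candidate starting values for nls() below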

    > nls(Length~Linf*(1-exp(-k*(Age-t0))),start=list(Linf=200,k=0.36283,t0=0.09062095))


    Nonlinear regression model

    model: Length ~ Linf * (1 - exp(-k * (Age - t0)))

    data: parent.frame()

    Linf k t0

    192.81044 0.40628 0.08087

residual sum-of-squares: 9014

    Number of iterations to convergence: 4

    Achieved convergence tolerance: 1.288e-06

    > summary(nls(Length~Linf*(1-exp(-k*(Age-t0))),start=list(Linf=200,k=0.36283,t0=0.09062095)))

    Formula: Length ~ Linf * (1 - exp(-k * (Age - t0)))

    Parameters:

    Estimate Std. Error t value Pr(>|t|)

Linf 192.81044 13.08015 14.741 < 2e-16 ***
k 0.40628 0.08845 4.593 1.73e-05 ***
t0 0.08087 0.24019 0.337 0.737

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 10.96 on 75 degrees of freedom

The number of iterations to convergence is 4, and the achieved convergence tolerance is 1.288e-06.

Given the uncertain estimate of t0 above (p = 0.737), we bootstrap the fitted model, stored as m1, using a case bootstrap with B = 9 resamples:

    > bootCase(m1,B=9)

    [,1] [,2] [,3]

    [1,] 185.2013 0.4560830 0.07057400

    [2,] 168.6544 0.6901305 0.48709120

    [3,] 202.1633 0.3436856 -0.09846979


    [4,] 194.7148 0.4009924 0.02435563

    [5,] 188.7526 0.4779254 0.52709451

    [6,] 209.1252 0.3113117 -0.25082608

    [7,] 197.3969 0.3947679 0.08624991

    [8,] 194.8587 0.3807068 -0.02749190

    [9,] 199.5296 0.3629699 -0.07258126

> matrixboot <- bootCase(m1)

> hist(matrixboot[,1])

[Histogram of matrixboot[, 1]]

    > quantile(matrixboot[,1],0.025)

    2.5%

    172.4119

    > quantile(matrixboot[,1],0.975)

    97.5%

    231.5225

We can proceed in the same way with the remaining columns, since the 95% quantiles of the first variable are now in hand, and then state the final intervals.

    > quantile(matrixboot[,2],0.025)


    2.5%

    0.2520726

    > quantile(matrixboot[,2],0.975)

    97.5%

    0.6915002

    > quantile(matrixboot[,3],0.025)

    2.5%

    -0.3213763

Applying the quantile function to the bootstrap matrix values:

    > quantile(matrixboot[,3],0.975)

    97.5%

    0.8307757

The approximate 95% confidence intervals are:

1) Linf: (172.4119, 231.5225)

2) k: (0.2520726, 0.6915002)

3) t0: (-0.3213763, 0.8307757)

Hence, from these data we have fitted a nonlinear regression model; a compact sketch of the bootstrap workflow used for these intervals is given below.
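The whole bootstrap step can be written compactly by applying the percentile method to each column of the bootstrap matrix; a minimal sketch, assuming the lakemary data and the nls fit above (B = 999 is only an illustrative choice):

library(alr3)
m1 <- nls(Length ~ Linf*(1 - exp(-k*(Age - t0))), data = lakemary,
          start = list(Linf = 200, k = 0.36283, t0 = 0.09062095))
matrixboot <- bootCase(m1, B = 999)  # case-resampling bootstrap of the fitted coefficients
apply(matrixboot, 2, quantile, probs = c(0.025, 0.975))  # percentile 95% intervals for Linf, k, t0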

Another example can be written for a nonlinear model:

> library(car)

    > deltaMethod(mols, "-b1/(2*b2)")

    Estimate SE

    -b1/(2*b2) 183.1104 5.961452

183.1104 is the value of Day that maximizes E(LCPUE|Day).

    11.4.3


E(Y|X = x) = b0 + b1*x + b2*x^2

b0 = 0.0921151 (intercept)

xM = -b1/(2*b2) = -0.0466052/(2*(-0.0001273)) = 183.0526

is the value of the predictor that gives the maximum value of the response, previously computed with the delta method;

b2 = -0.0001273 (measure of curvature)
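For context, mols above is presumably the quadratic fit of LCPUE on Day whose coefficients appear in these formulas; that fitting step is not shown and the data set is not named in the transcript. A minimal sketch of the vertex calculation, using a hypothetical data frame fishdat with columns Day and LCPUE:

# fishdat is a hypothetical stand-in for the LCPUE/Day data used in the transcript
mols <- lm(LCPUE ~ Day + I(Day^2), data = fishdat)  # quadratic mean function
b <- coef(mols)
unname(-b[2]/(2*b[3]))  # Day value maximizing E(LCPUE|Day); compare with the delta-method estimate 183.11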

    > nls(LCPUE~th1+th2*(1-exp(-th3*Day)),start=list(th1=0.0921151, th2=183, th3=1))

    Nonlinear regression model

    model: LCPUE ~ th1 + th2 * (1 - exp(-th3 * Day))

    data: parent.frame()

    th1 th2 th3

    -0.03691 3.69560 0.01974

    residual sum-of-squares: 10.54

    Number of iterations to convergence: 16

    Achieved convergence tolerance: 9.227e-06

> m1 <- nls(LCPUE~th1+th2*(1-exp(-th3*Day)),start=list(th1=0.0921151, th2=183, th3=1))

> summary(m1)

    Formula: LCPUE ~ th1 + th2 * (1 - exp(-th3 * Day))

    Parameters:

    Estimate Std. Error t value Pr(>|t|)

    th1 -0.036912 0.385318 -0.096 0.924478

    th2 3.695597 0.421873 8.760 6.1e-09 ***

    th3 0.019743 0.004647 4.249 0.000281 ***

    Residual standard error: 0.6626 on 24 degrees of freedom

    Number of iterations to convergence: 16


    Achieved convergence tolerance: 9.227e-06

    > plot(LCPUE~Day)

> lines(Day, predict(m1, newdata = data.frame(Day = Day)))

[Plot of LCPUE versus Day with the fitted curve from m1]

Hence, the graph above shows the fitted nonlinear regression model: the fitted curve is not a straight line, and it describes how LCPUE rises with Day over the observed interval.

    2)

Answer) The simple linear regression model consists of a mean function and a constant variance function, but binomial regression deals with the number of successes out of m independent trials, each with the same probability of success.

In the binomial regression problem, the response Yi counts the number of successes in Mi trials, so Mi - Yi of the trials were failures. In addition, we have p terms or predictors xi, possibly including a constant for the intercept.

Yi = number of successes; Mi = number of trials.

In linear regression models, the mean function and the variance function generally have completely separate parameters, but that is not so for binomial regression. The value of the probability θ(xi) determines both the mean function and the variance function, so we need to estimate θ(xi).

So we cannot employ the simple linear regression model for the binomial model: the conception and the fitting approach are different, even if the same data were given.

Simple linear regression equations:

E(Y|X = x) = β0 + β1 x

Var(Y|X = x) = σ²    (2.1)

The parameters in the mean function are the intercept β0, which is the value of E(Y|X = x) when x equals zero, and the slope β1, which is the rate of change in E(Y|X = x) for a unit change in x.

The variance function is assumed to be constant in simple linear regression, with a positive value σ² that is usually unknown. Because the variance σ² > 0, the observed value of the ith response yi will typically not equal its expected value E(Y|X = xi).

Whereas in binomial regression:

Binomial equations:

y ~ Bin(m, θ)

Pr(y = j) = (m choose j) θ^j (1 - θ)^(m - j), where y equals a specific integer j = 0, 1, ..., m

(Y|X = xi) ~ Bin(mi, θ(xi)), i = 1, ..., n
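In practice, a binomial response is fitted with glm() and a binomial family rather than with lm(); a minimal illustrative sketch on simulated data (these variable names are hypothetical, not from the exam data):

set.seed(1)
xi <- seq(0.5, 2.5, by = 0.5)                                   # predictor values
mi <- rep(20, length(xi))                                       # number of trials at each xi
yi <- rbinom(length(xi), size = mi, prob = plogis(-2 + 1.5*xi)) # simulated success counts
# Binomial (logistic) regression: successes and failures on the left-hand side
fit <- glm(cbind(yi, mi - yi) ~ xi, family = binomial)
summary(fit)  # the fitted theta(x) determines both the mean and the variance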

    4) Problem ..

Answer)

> attach(d)

    > names(d)

    [1] "Country" "Ccode" "dUnRate" "dEmRate" "bbd"

    [6] "UnRate94" "EmRate94" "dEmComConst"