Regression Diagnostics Using R
Yiqing Xu
Based on Jens Hainmueller’s MIT Lecture 17800.10-1/2
Data used in this handout is based on Eggers-Hainmueller (2009)
1 Overview
1. Scatterplot: car and lattice
2. Regression Diagnostics
• Hat-values: identifying leverage points
• Studentized residuals: identifying outliers
• QQ plot: evaluating model fit and normality
• DFBetas: evaluating influence for each coefficient
• Cook’s distance: summarizing influence across coefficients
• Automatic regression diagnosis
3. Standard Error Adjustment
• Breusch-Pagan test
• Robust standard errors
2 Scatterplot
[Figure: scatterplot matrix of wealth, runage, and served]
R Code
library(foreign)
d<-read.dta("bp.dta") # loading the data
library(car)
scatterplotMatrix(d[,c("wealth","runage","served")])
[Figure: scatterplot of wealth (y-axis) against runage (x-axis), with each point labeled by name]
R Code
scatterplot(d$wealth~d$runage,xlab="runage",ylab="wealth")
text(y=d$wealth,x=d$runage,labels=d$name,pos=3,cex=.6,col=4)
[Figure: lattice plot of wealth (y-axis) against runage (x-axis) in panels conditioned on served, each with a linear fit and a loess curve]
R Code
library(lattice)
mypanel<-function(x,y,...) {
panel.xyplot(x,y,...)
panel.lmline(x,y)
panel.loess(x,y)
}
xyplot(wealth~runage|served,data=d,panel=mypanel)
3 Regression Diagnostics
3.1 Testing Two Models
----------------------------------------------------------
t test of coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2349006 702638 3.3431 0.0009808 ***
runage -40120 16862 -2.3794 0.0182380 *
served 691895 290422 2.3824 0.0180933 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
----------------------------------------------------------
----------------------------------------------------------
t test of coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 13.599589 0.451991 30.0882 < 2.2e-16 ***
runage -0.025181 0.010847 -2.3215 0.0212179 *
served 0.698336 0.186822 3.7380 0.0002392 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
----------------------------------------------------------
R Code
library(lmtest)
d<-na.omit(d)
mod1<-lm(wealth~runage+served,data=d)
mod2<-lm(log(wealth)~runage+served,data=d)
coeftest(mod1)
coeftest(mod2)
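Because the second model has log(wealth) on the left-hand side, its coefficients are easier to read after back-transforming. The sketch below copies the two slope estimates from the second table above and converts them to approximate percentage changes in wealth; the variable names are illustrative, and the reading assumes the log-linear specification is appropriate.

```r
# slope estimates copied from the log(wealth) table above
b_served <- 0.698336
b_runage <- -0.025181

# percentage change in wealth for a one-unit change in each regressor
pct_served <- 100 * (exp(b_served) - 1)
pct_runage <- 100 * (exp(b_runage) - 1)
round(c(served = pct_served, runage = pct_runage), 1)
```

Under this reading, the served coefficient corresponds to roughly a doubling of wealth, other things equal.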
3.2 Hat-values
[Figure: dot plot of hat-values (x-axis: hatv, from 0.05 to 0.20) by name for observations with hatv > .02, ordered from John,Grigg to James,Lindsay, with a vertical line at the cutoff]
R Code
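d$hatv <- hatvalues(mod1)
# sort a copy, so that d keeps the row order used to fit mod1
# (later code pairs d$name with rstudent(mod1), dfbetas(mod1), etc. by position)
dsort <- d[order(d$hatv),]
dsort$name <- factor(dsort$name,levels=dsort$name,ordered=TRUE)
n <- mod1$df.residual + mod1$rank # num of obs
k <- mod1$rank # num of coefficients (incl. intercept)
cutoffhatv <- 2*k/n
mypanel <- function(x,y,...){
  panel.dotplot(x,y,...)
  panel.abline(v=cutoffhatv,col="green")
}
dotplot(name~hatv,data=dsort[dsort$hatv>.02,],panel=mypanel)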
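To see where the hat-values and the 2k/n cutoff come from, here is a minimal sketch on simulated data (not bp.dta; every variable name in it is made up for illustration). It builds the hat matrix H = X(X'X)^-1 X' directly and checks its diagonal against hatvalues().

```r
# simulated data; names are illustrative only
set.seed(1)
n <- 50
x1 <- rnorm(n)
x2 <- rbinom(n, 1, 0.5)
y <- 1 + 2*x1 - x2 + rnorm(n)
fit <- lm(y ~ x1 + x2)

X <- model.matrix(fit)                   # n x k design matrix (incl. intercept)
H <- X %*% solve(crossprod(X)) %*% t(X)  # hat matrix
h_manual <- diag(H)

k <- ncol(X)
all.equal(unname(h_manual), unname(hatvalues(fit)))  # manual and built-in agree
sum(h_manual)            # hat-values sum to k, so their mean is k/n
which(h_manual > 2*k/n)  # observations above twice the mean
```

Since the hat-values sum to k, the cutoff 2k/n simply flags points with more than twice the average leverage.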
3.3 Studentized Residuals
[Figure: index plot of studentized residuals (y-axis, 0 to 8) against observation index (x-axis, 0 to 200), with horizontal lines for "Cutoff 95%" and "Adjusted Cutoff"; labeled points: Thomas,Beattie-Edwards; Marcus,Fox; Ian,Percival]
R Code
d$studresid <- rstudent(mod1)
cutoffstud <- qt(.025, n-k, lower.tail=FALSE) # two-sided 95% cutoff
cutoffstudadj <- qt(.025/(n-k), n-k, lower.tail=FALSE) # adjusted for multiple testing
plot(d$studresid, ylab="Studentized Residuals", pch=19)
abline(h=cutoffstud, col="blue")
abline(h=cutoffstudadj, col="red")
legend("topleft", legend=c("Cutoff 95%","Adjusted Cutoff"),
       lty=1, col=c("blue","red"), cex=.6)
out <- d$studresid > cutoffstudadj # points above the adjusted cutoff
text(y=d$studresid[out], x=which(out),
     labels=d$name[out], pos=1, cex=.6)
graphics.off() # closes all open graphics devices
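The studentized residuals returned by rstudent() can be reproduced without refitting the model n times. The sketch below does this on simulated data with one planted outlier (all names are illustrative). It uses n-k-1 degrees of freedom and a Bonferroni factor of n, which is one common convention and differs slightly from the n-k used in the code above.

```r
# simulated data with one planted outlier; names are illustrative only
set.seed(2)
n <- 40
x <- rnorm(n)
y <- 1 + 0.5*x + rnorm(n)
y[5] <- y[5] + 6
fit <- lm(y ~ x)

e <- resid(fit)
h <- hatvalues(fit)
k <- fit$rank
# residual variance with observation i left out, computed in closed form
s2_loo <- (sum(e^2) - e^2/(1 - h)) / (n - k - 1)
r_manual <- e / sqrt(s2_loo * (1 - h))
all.equal(unname(r_manual), unname(rstudent(fit)))  # manual and built-in agree

# two-sided p-values, Bonferroni-adjusted for n simultaneous tests
p_raw <- 2 * pt(abs(r_manual), df = n - k - 1, lower.tail = FALSE)
p_bonf <- pmin(1, n * p_raw)
which(p_bonf < 0.05)
```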
3.4 Q-Q Plot
[Figure (a) Model 1: Raw data. Q-Q plot of Studentized Residuals(mod1) against t quantiles]
[Figure (b) Model 2: After log transformation. Q-Q plot of Studentized Residuals(mod2) against t quantiles]
R Code
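# name the argument: in current versions of car, the second positional argument of qqPlot() is xlab
qqPlot(mod1,distribution="t",envelope=TRUE)
qqPlot(mod2,distribution="t",envelope=TRUE)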
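The points in such a Q-Q plot are the sorted studentized residuals paired with theoretical t quantiles. A base-R sketch on simulated heavy-tailed data (illustrative names; it draws a plain 45-degree reference line and does not reproduce car's confidence envelope):

```r
# simulated data with heavy-tailed errors; names are illustrative only
set.seed(7)
n <- 60
x <- rnorm(n)
y <- 1 + x + rt(n, df = 3)
fit <- lm(y ~ x)

r <- sort(rstudent(fit))               # ordered studentized residuals
df_t <- fit$df.residual - 1            # reference t distribution
q_theory <- qt(ppoints(n), df = df_t)  # theoretical quantiles

plot(q_theory, r, xlab = "t quantiles", ylab = "Studentized residuals")
abline(0, 1, lty = 2)  # points far from this line suggest a poor distributional fit
```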
3.5 DFBetas
[Figure: scatterplot of DFBETAS runage (y-axis) against DFBETAS served (x-axis), with each point labeled by name]
R Code
dfb <- dfbetas(mod1) # named dfb so the dfbetas() function is not masked
2/sqrt(n) # rule-of-thumb cutoff for |DFBETAS|
plot(dfb[,3],dfb[,2],xlab="DFBETAS served",ylab="DFBETAS runage")
text(dfb[,3],dfb[,2],labels=d$name,pos=1,cex=.6,col=4)
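DFBETAS measures how far each coefficient moves, in standard-error units, when one observation is dropped. The sketch below recomputes it by brute-force leave-one-out refits on simulated data (illustrative names) and compares magnitudes with dfbetas(); comparing absolute values sidesteps any question about the sign convention.

```r
# simulated data; names are illustrative only
set.seed(3)
n <- 30
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- 1 + x1 - 0.5*x2 + rnorm(n)
dat <- data.frame(y, x1, x2)
fit <- lm(y ~ x1 + x2, data = dat)

X <- model.matrix(fit)
xtx_inv_diag <- diag(solve(crossprod(X)))  # diagonal of (X'X)^-1
b <- coef(fit)

dfb_manual <- t(sapply(seq_len(n), function(i) {
  fit_i <- lm(y ~ x1 + x2, data = dat[-i, ])  # refit without observation i
  s_i <- summary(fit_i)$sigma                  # leave-one-out residual sd
  (b - coef(fit_i)) / (s_i * sqrt(xtx_inv_diag))
}))

max(abs(abs(dfb_manual) - abs(dfbetas(fit))))  # close to zero
which(abs(dfbetas(fit)) > 2/sqrt(n), arr.ind = TRUE)
```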
3.6 Influence Plots
[Figure: influence plot of studentized residuals (y-axis) against hat values (x-axis), circle size given by the square root of Cook's distance; labeled points: Donald,Thompson; Thomas,Beattie-Edwards; Marcus,Fox; William,Lowe; Ian,Percival; Cyril,Lipman]
R Code
# cooks.distance() from stats replaces the older cookd() from car
symbols(y=rstudent(mod1), x=hatvalues(mod1), circles=sqrt(cooks.distance(mod1)),
        ylab="studentized residuals",xlab="hat values",
        ylim=c(-1,9.5),xlim=c(0,.26))
abline(h=cutoffstud,col="blue")
abline(h=cutoffstudadj,col="red")
abline(v=cutoffhatv,col="green")
filter <- rstudent(mod1) > cutoffstudadj | hatvalues(mod1) > cutoffhatv
text(y=rstudent(mod1)[filter], x=hatvalues(mod1)[filter], labels=d$name[filter],pos=3)
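Cook's distance combines the two axes of this plot: it grows with both the size of the residual and the leverage. A short sketch on simulated data (illustrative names) rebuilds it from the standardized residuals and hat-values and checks the result against cooks.distance().

```r
# simulated data; names are illustrative only
set.seed(4)
n <- 40
x <- rnorm(n)
y <- 2 - x + rnorm(n)
fit <- lm(y ~ x)

h <- hatvalues(fit)
r <- rstandard(fit)  # internally standardized residuals
k <- fit$rank        # number of coefficients
d_manual <- (r^2 / k) * h / (1 - h)

all.equal(unname(d_manual), unname(cooks.distance(fit)))  # manual and built-in agree
head(sort(cooks.distance(fit), decreasing = TRUE), 3)     # three most influential points
```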
3.7 Automatic Regression Diagnostics
[Figure (c) Model 1: Raw data. Four panels from plot(mod1): Residuals vs Fitted, Normal Q-Q, Scale-Location, Residuals vs Leverage (with Cook's distance contours at 0.5 and 1); flagged observations: 160, 188, 62, and 205 in the leverage panel]
[Figure (d) Model 2: After log transformation. The same four panels from plot(mod2); flagged observations: 214, 224, 49, and 205, 224, 188 in the leverage panel]
R Code
par(mfrow=c(2,2))
plot(mod1)
par(mfrow=c(2,2))
plot(mod2)
4 SE Adjustment
4.1 Breusch-Pagan Test
----------------------------------------------------------
Breusch-Pagan test
data: mod1
BP = 83.4522, df = 2, p-value < 2.2e-16
----------------------------------------------------------
R Code
library(lmtest)
bptest(mod1, studentize = FALSE)
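The non-studentized Breusch-Pagan statistic can be computed by hand from an auxiliary regression of the scaled squared residuals on the regressors. The sketch below does this in base R on simulated heteroskedastic data (illustrative names); it is meant to mirror what bptest(..., studentize = FALSE) reports, under the assumption that the variance depends on the same regressors as the mean.

```r
# simulated data whose error sd grows with x; names are illustrative only
set.seed(5)
n <- 200
x <- runif(n, 1, 5)
y <- 1 + 2*x + rnorm(n, sd = x)
fit <- lm(y ~ x)

e2 <- resid(fit)^2
g <- e2 / mean(e2)   # squared residuals scaled by RSS/n
aux <- lm(g ~ x)     # auxiliary regression
ess <- sum((fitted(aux) - mean(g))^2)  # explained sum of squares
bp <- ess / 2        # Breusch-Pagan statistic
df <- aux$rank - 1   # number of regressors excluding the intercept
p_value <- pchisq(bp, df = df, lower.tail = FALSE)
c(BP = bp, df = df, p.value = p_value)
```

A small p-value is evidence against constant error variance, which is the case for the robust standard errors in the next subsection.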
4.2 Robust SE
----------------------------------------------------------
t test of coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2349006 702638 3.3431 0.0009808 ***
runage -40120 16862 -2.3794 0.0182380 *
served 691895 290422 2.3824 0.0180933 *
----------------------------------------------------------
t test of coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2349006 813865 2.8862 0.004306 **
runage -40120 19900 -2.0161 0.045065 *
served 691895 285120 2.4267 0.016081 *
----------------------------------------------------------
t test of coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2349006 819658 2.8658 0.004582 **
runage -40120 20042 -2.0018 0.046589 *
served 691895 287149 2.4095 0.016835 *
----------------------------------------------------------
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
R Code
library(sandwich)
library(lmtest)
coeftest(mod1) # homoskedasticity
coeftest(mod1,vcov=vcovHC(mod1,type="HC0")) # classic White
coeftest(mod1,vcov=vcovHC(mod1,type="HC1")) # small sample correction
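The HC0 and HC1 estimators are simple sandwich formulas. A base-R sketch on simulated heteroskedastic data (illustrative names; no packages required) builds both by hand and sets them next to the classical standard errors.

```r
# simulated data whose error sd grows with x; names are illustrative only
set.seed(6)
n <- 200
x <- runif(n, 1, 5)
y <- 1 + 2*x + rnorm(n, sd = x)
fit <- lm(y ~ x)

X <- model.matrix(fit)
e <- resid(fit)
k <- ncol(X)
bread <- solve(crossprod(X))        # (X'X)^-1
meat <- crossprod(X * e)            # X' diag(e^2) X
vc_hc0 <- bread %*% meat %*% bread  # White estimator (HC0)
vc_hc1 <- vc_hc0 * n / (n - k)      # degrees-of-freedom correction (HC1)

cbind(classical = sqrt(diag(vcov(fit))),
      HC0 = sqrt(diag(vc_hc0)),
      HC1 = sqrt(diag(vc_hc1)))
```

HC1 only rescales HC0 by n/(n-k), which is why the last two tables above differ so little.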