trellis graphics types of plots - huit sites...
TRANSCRIPT
Trellis graphics 1
Trellis graphics
Extremely useful approach for graphical exploratory data analysis (EDA)
Allows to examine for complicated, multiple variable relationships.
Types of plots
xyplot: scatterplot
bwplot: boxplots
stripplot: display univariate data against a numerical variable
dotplot: similar to stripplot
histogram
densityplot: kernel density estimates
barchart
piechart: (Not available in R)
splom: scatterplot matrices
contourplot: contour plot of a surface on a regular grid
levelplot: pseudo-colour plot of a surface on a rectangular grid
Trellis graphics 2
wireframe: perspective plot of a surface evaluated on a regular grid
cloud: perspective plot of a cloud of points (3D scatterplot)
Note in S-Plus, the trellis graphics functions are contained in the library, trellis. This library should be loaded automatically when starting S-Plus. However if it isn’t, give the command library(trellis). In R, the trellis graphics features are usually not automatically loaded. To load them you need to give the command library(lattice).
Trellis graphics 3
<digression>
. Fi r st ( ) and . Last ( ) functions
It is possible to configure S-Plus and R to load certain files, override defaults, etc when they start up with the use of the . Fi r st ( ) function. This is similar to the use of . cshr c/. bashr c and . l ogi n files in Unix.
Any commands that you wish to run everytime you start up S-Plus and R should go in the . Fi r st ( ) function. An example of a . Fi r st ( ) function is . Fi r st <- f unct i on( ) {
l i br ar y( l at t i ce)
opt i ons( cont r ast s=c( “ cont r . t r eat ment ” , “ cont r . pol y” ) )
}
So everytime that S-Plus/R is run from the directory containing this . Fi r st ( ) function, these two commands will be run. The opt i ons( ) command sets the default contrasts when dealing with factors in linear models.
The . Last ( ) function is similar to Unix’s . l ogout file in that it contains the commands you want run when quitting S-Plus/R.
</digression>
Trellis graphics 4
Trellis graph
dist
time
50
100
150
200
5 10 15 20 25
Usual approach
5 10 15 20 25
5010
015
020
0
dist
time
Trellis graphics 5
Trellis approach xyplot(time ~ dist, data=hills,
panel = function(x, y, ...) {
panel.xyplot(x, y, ...)
panel.lmline(x, y ,type="l")
panel.abline(ltsreg(x, y), lty=3)
}
)
Usual plotting approach attach(hills)
plot(dist, time)
abline(lm(time ~ dist))
abline(ltsreg(time ~ dist), lty=3)
detach()
Trellis graphics 6
Trellis approach
Speed of Light Data
Speed
Exp
erim
ent
No.
1
2
3
4
5
600 700 800 900 1000 1100
Usual approach
1 2 3 4 5
700
800
900
1000
Speed of Light Data
Experiment No.
Trellis graphics 7
Trellis approach bwplot(Expt ~ Speed, main="Speed of Light Data", ylab="Experiment No.", data=morley,
panel = function(x, y, ...) {
panel.bwplot(x, y, ...)
panel.abline(v=734.5)
}
)
Usual plotting approach attach(morley)
plot.factor(Expt, Speed, main="Speed of Light Data", xlab="Experiment No.")
abline(h=734.5)
detach(morley)
Instead of plot.factor, the boxplot command boxplot(split(Speed,Expt), main="Speed of Light Data", xlab="Experiment No.")
could be used as well.
Trellis graphics 8
Many trellis type plots can not be done using the more standard graphics commands, or are very complicated. One example is a scatter plot matrix
Scatter Plot Matrix
Fertility
4040
50
50
60
60
70
7080
8090 90
Agriculture
00
2020
40
40
40
40
60
6080 80
Examination
1010
20
20
20
2030
30
Education
00
1010
20
20
30
3040
4050 50
Catholic
00
2020
40
40
60
6080
80100 100
Infant.Mortality
1010
1515
20
2025 25
splom(~ swiss.df, aspect="fill",
panel = function(x, y, ...) {
panel.xyplot(x, y, ...); panel.loess(x, y, ...)
}
)
Trellis graphics 9
This example shows the power of panel functions, as the basic plot commands can easily be applied to multi-panel displays. In this example the scatterplot smoother loess is applied to each panel of the plot
Viscosity
Tim
e
0
50
100
150
200
250
0 50 100 150 200 250 300
Weight: 20 grams 50 grams 100 grams
sps<-trellis.par.get("superpose.symbol") sps$pch <- 1:7 trellis.par.set("superpose.symbol",sps)
xyplot(Time ~ Viscosity, data=stormer, groups = Wt, panel = panel.superpose, type="b", key = list(columns=3, text= list(paste(c("Weight: ","",""), unique(stormer$Wt), "grams")), points=Rows(sps,1:3)))
Trellis graphics 10
There are an number approaches with Trellis graphics for displaying 3 dimensional information
0
1
2
3
4
5
6
1 2 3 4 5 6
-700
-750
-800
-850
-900
-950
topo.plt <- expand.grid(topo.mar)
topo.plt$pred <- as.vector(predict(topo.loess,topo.plt))
levelplot(pred ~ x*y, topo.plt,aspect=1,
at = seq(690, 960, 10), xlab="", ylab="",
panel = function(x,y,subscripts, ...) {
panel.levelplot(x,y,subscripts, ...)
panel.xyplot(topo$x, topo$y, cex=0.5, col=1)
}
)
Trellis graphics 11
x
pred
pred
700
750
800
850
900
950
wireframe(pred ~ x*y, topo.plt,aspect=c(1,0.5), drape=T,
screen=list(z=-150, x=-60),
colorkey=list(space="right", height=0.6))
Trellis graphics 12
0
1
2
3
4
5
6
0 1 2 3 4 5 6
700 725 750 775 800
800
825
850
850
875
875
875 875 900 900 925
925
950
contourplot(z ~ x * y, mat2tr(topo.lo), aspect=1,
at = seq(700, 1000, 25), xlab="", ylab="",
panel = function(x,y,subscripts, ...) {
panel.contourplot(x,y,subscripts, ...)
panel.xyplot(topo$x, topo$y, cex=0.5) } )
Note that these last two plots are from S-Plus not R (where the rest of today’s plots are from. I couldn’t get the code to work properly in R for these two plots.
Trellis graphics 13
Where Trellis plots get particularly useful is when we condition.
Scatter Plot Matrix
Comp.1
-1-1
-0.5
-0.5
0
0
0.5
0.5
1
11.5 1.5
Comp.2
-0.1-0.1
-0.05
-0.05
0
0
0.05
0.05
0.1
0.10.15 0.15
Comp.3
-0.1-0.1
-0.05
-0.05
0
0
0
0
0.05
0.050.1 0.1
Blue male Blue female Orange male Orange female
lcrabs <- log(crabs[,4:8]) lcrabs.pc <- predict(princomp(lcrabs)) crabs.grp <- c("B","b","O","o")[rep(1:4,rep(50,4))] splom(~lcrabs.pc[,1:3], groups=crabs.grp, panel = panel.superpose, key = list(text=list(c("Blue male","Blue female", "Orange male", "Orange female")), points=Rows(trellis.par.get("superpose.symbol"),1:4), columns=4))
Trellis graphics 14
BlueFemale
Comp.1
Comp.2
Comp.3
OrangeFemale
Comp.1
Comp.2
Comp.3
BlueMale
Comp.1
Comp.2
Comp.3
OrangeMale
Comp.1
Comp.2
Comp.3
sex <- crabs$sex; levels(sex) <- c("Female","Male")
sp <- crabs$sp; levels(sp) <- c("Blue","Orange")
splom(~lcrabs.pc[,1:3] | sp*sex, pscales=0)
Trellis graphics 15
The princomp command calculates the principle components of the data. Principle components finds linear combinations of the data, such that each is orthogonal to each other, and they maximize the variance of each. The crab data is highly correlated and the principle component analysis rotates the data to 5 uncorrelated descriptors
Scatter Plot Matrix
FL
222.2
2.2
2.4
2.4
2.6
2.6
2.6
2.62.8
2.83 33.2
3.2
RW
1.81.8 22
2.2
2.2
2.4
2.4
2.4
2.4
2.6
2.62.8
2.83 3
CL
2.82.83
3
3.2
3.2
3.4
3.43.6
3.63.8 3.8
CW
2.82.8 33
3.2
3.2
3.4
3.4
3.4
3.4
3.6
3.63.8 3.84
4
BD
22
2.5
2.5
2.5
2.53 3
Blue male Blue female Orange male Orange female
Trellis graphics 16
> summary(princomp(lcrabs))
Importance of components:
Comp.1 Comp.2 Comp.3 Comp.4
Std deviation 0.5166405 0.07465358 0.047914392 0.024804021
Prop of Variance 0.9689051 0.02023046 0.008333672 0.002233308
Cum Proportion 0.9689051 0.98913557 0.997469244 0.999702552
Comp.5
Standard deviation 0.0090521887
Proportion of Variance 0.0002974484
Cumulative Proportion 1.0000000000
> loadings(princomp(lcrabs))
Loadings:
Comp.1 Comp.2 Comp.3 Comp.4 Comp.5
FL -0.452 -0.157 0.438 0.752 0.114
RW -0.387 0.911
CL -0.453 -0.204 -0.371 -0.784
CW -0.440 -0.672 0.591
BD -0.497 -0.315 0.458 -0.652 0.136
Trellis graphics 17
Days
Primary
First form
Second form
Third form
0 20 40 60 80
FemaleAverage learner
Aboriginal
MaleAverage learner
Aboriginal
0 20 40 60 80
FemaleSlow learnerAboriginal
MaleSlow learnerAboriginal
Primary
First form
Second form
Third form
FemaleAverage learnerNon-aboriginal
MaleAverage learnerNon-aboriginal
0 20 40 60 80
FemaleSlow learner
Non-aboriginal
MaleSlow learner
Non-aboriginal
0 20 40 60 80
Quine <- quine
levels(Quine$Eth) <- c("Aboriginal","Non-aboriginal")
levels(Quine$Sex) <- c("Female","Male")
levels(Quine$Age) <- c("Primary", "First form", "Second form", "Third form")
levels(Quine$Lrn) <- c("Average learner","Slow learner")
bwplot(Age ~ Days | Sex*Lrn*Eth,data=Quine,layout=c(4,2))
Trellis graphics 18
Primary
First form
Second form
Third form
0 20 40 60 80
Eth: AboriginalSex: Female
Eth: Non-aboriginalSex: Female
Primary
First form
Second form
Third form
Eth: AboriginalSex: Male
0 20 40 60 80
Eth: Non-aboriginalSex: Male
Days of Absence
Average learner Slow learner
stripplot(Age ~ Days | Eth*Sex, data=Quine, groups=Lrn, jitter=T, panel = function(x,y, subscripts, jitter.data=F, ...) { if(jitter.data) y <- jitter(y) panel.superpose(x,y,subscripts,...) }, xlab = "Days of Absence", between = list(y=1), par.strip.text = list(cex=1.2), key = list(columns=2, text=list(levels(Quine$Lrn)), points=Rows(trellis.par.get("superpose.symbol"), 1:2)), strip = function(...) strip.default(..., strip.names=c(T,T),style=1))
Trellis graphics 19
The conditioning variables in trellis plot are treated as categorical. For continuous variables, you can have lots of categories
Temperature
Ozo
ne
0
50
100
150
60 70 80 90
wind wind
60 70 80 90
wind wind
60 70 80 90
wind wind
60 70 80 90
wind
wind wind wind wind wind wind
0
50
100
150
wind 0
50
100
150
wind wind wind wind wind wind wind
wind wind
60 70 80 90
wind wind
60 70 80 90
wind wind
60 70 80 90
0
50
100
150
wind
xyplot(ozone ~ temperature | wind, data=environmental,
ylab="Ozone",xlab="Temperature")
> length(unique(environmental$wind))
[1] 28
Trellis graphics 20
Lets split wind into different levels (shingles) and condition on that instead. Wind <- equal.count(environmental$wind, number=6, overlap=0.2)
xyplot(ozone ~ temperature | Wind, data=environmental,
ylab="Ozone",xlab="Temperature")
Temperature
Ozo
ne
0
50
100
150
60 70 80 90
Wind Wind
60 70 80 90
Wind
Wind Wind
60 70 80 90
0
50
100
150
Wind
Trellis graphics 21
> Wind
Data:
[1] 7.4 8.0 12.6 11.5 8.6 13.8 20.1 9.7 9.2 10.9 13.2 11.5 12.0 18.4 11.5 [16] 9.7 9.7 16.6 9.7 12.0 12.0 14.9 5.7 7.4 9.7 13.8 11.5 8.0 14.9 20.7
and so on
[106] 10.3 16.6 6.9 14.3 8.0 11.5
Intervals:
min max count
1 2.05 7.15 24 2 6.65 8.25 22 3 7.75 9.95 25 4 9.45 11.75 35 5 10.65 14.05 27 6 12.35 20.95 23
Ovrlap between adjacent intervals:
[1] 6 7 9 16 7
Trellis graphics 22
The shingles can also be displayed graphically with plot(Wind,aspect=0.4)
Range
Pan
el
1
2
3
4
5
6
5 10 15 20
Range
Pan
el
1
2
3
4
5
6
5 10 15 20
Trellis graphics 23
Temperature
Ozo
ne
0
50
100
150
60 70 80 90
WindRad
WindRad
60 70 80 90
WindRad
WindRad
60 70 80 90
WindRad
WindRad
WindRad
WindRad
WindRad
WindRad
WindRad
0
50
100
150
WindRad
0
50
100
150
WindRad
WindRad
WindRad
WindRad
WindRad
WindRad
WindRad
WindRad
60 70 80 90
WindRad
WindRad
60 70 80 90
WindRad
0
50
100
150
WindRad
60 70 80 90
xyplot(ozone ~ temperature | Wind * Rad, data=environmental
Temperature
Ozo
ne
0 50100150
60 70 80 90
RadWind
RadWind
60 70 80 90
RadWind
RadWind
RadWind
RadWind
RadWind
0 50100150
RadWind
0 50100150
RadWind
RadWind
RadWind
RadWind
RadWind
RadWind
RadWind
0 50100150
RadWind
0 50100150
RadWind
RadWind
RadWind
RadWind
RadWind
RadWind
60 70 80 90
RadWind
0 50100150
RadWind
60 70 80 90
Trellis graphics 24
Height (inches)
Den
sity
0.00
0.05
0.10
0.15
0.20
60 65 70 75
Bass 2 Bass 1
60 65 70 75
Tenor 2 Tenor 1
Alto 2 Alto 1
60 65 70 75
Soprano 2
0.00
0.05
0.10
0.15
0.20
Soprano 1
60 65 70 75
histogram( ~ height | voice.part, data = singer,
xlab = "Height (inches)", type = "density",
panel = function(x, ...) {
panel.histogram(x, ...)
panel.mathdensity(dmath = dnorm,
args = list(mean=mean(x),sd=sd(x)),
col=1)
}
)
Trellis graphics 25
Height (inches)
Den
sity
0.00
0.05
0.10
0.15
0.20
60 65 70 75
Bass 2 Bass 1
60 65 70 75
Tenor 2 Tenor 1
Alto 2 Alto 1
60 65 70 75
Soprano 2
0.00
0.05
0.10
0.15
0.20
Soprano 1
60 65 70 75
histogram( ~ height | voice.part, data = singer,
xlab = "Height (inches)", type = "density",
panel = function(x, ...) {
panel.histogram(x, ...)
panel.densityplot(x, darg = list(bw=1.5),
col=1)
}
)
Trellis graphics 26
Note that it is not possible to update Trellis graphics after they have been plotted. A Trellis plotting command actually creates a Trellis object which can be saved for example, speed.tr <-bwplot(Expt ~ Speed, main="Speed of Light Data", ylab="Experiment No.", data=morley)
This command creates the object but doesn’t display it. To see what it looks then give the command speed.tr
This will display the graph in the current Trellis device, which is probably to the screen.
A different Trellis device can be set with a command like trellis.device("win.metafile", file="tdev.emf", color=F)
win.metafile (R) and win.graph (S-Plus version 6 or later) create Windows metafile which are great for plugging into Microsoft Word. To see what other devices are available, see help(Devices). Then to display the graph in the new device, run the commands speed.tr
dev.off()
This is similar to using postscript or pdf with regular plotting commands.