Visualization - Statistical Methods
Sarah Filippi, University of Oxford
20 October 2015, Michaelmas Term 2015
First step
ALL good statistical data analysis begins with graphical plots and summary statistics of the data.
ALWAYS, ALWAYS, ALWAYS, PLOT YOUR DATA!!!
Graphics reveal data and communicate complex ideas and dependencies with clarity, precision and efficiency.
Graphical excellence
Excellent graphics:
- show the data
- induce the viewer to think about the substance
- avoid bias
- make large complex data sets coherent
- encourage data exploration and debate
Categorical Data
Let’s start by not using one common graph type:
"There is no data that can be displayed in a pie chart that cannot be displayed BETTER in some other type of chart." (J. W. Tukey)
And let’s not even think about 3D and exploded pie charts.
What’s the matter with pie charts:
- people are not good at interpreting areas
- small and large slices are relatively distorted
- zero is often a very meaningful number but gets lost
- it is very hard to compare two pie charts
Barplots are usually a much better choice: barplot(height)
Suppose we have a few ordinal or categorical variables: the interesting questions are then how they vary together. Here is a cross-tabulation of the caffeine consumption (in mg/day) of women in a maternity ward by marital status (a contingency table).

| Marital status | 0 | 1-150 | 151-300 | 300+ |
|---|---|---|---|---|
| Married | 652 | 1537 | 598 | 242 |
| Prev. married | 36 | 46 | 38 | 21 |
| Single | 218 | 327 | 106 | 67 |

The next two slides show two graphical representations: a set of barplots (aka bar charts) and a mosaic plot.
Different versions of these plots and other plots for categorical data can be found in package vcd.
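A minimal base-R sketch of such plots, using the counts from the table above. The object names `caff` and `caff_prop` are made up for illustration, and the slides' own figures may have been drawn differently (for example with package vcd).

```r
# Caffeine consumption (mg/day) by marital status, copied from the table above.
caff <- matrix(c(652, 1537, 598, 242,
                  36,   46,  38,  21,
                 218,  327, 106,  67),
               nrow = 3, byrow = TRUE,
               dimnames = list(marital = c("Married", "Prev. married", "Single"),
                               caffeine = c("0", "1-150", "151-300", "300+")))
caff <- as.table(caff)

# Row proportions: the distribution of consumption within each marital status.
caff_prop <- prop.table(caff, margin = 1)

op <- par(mfrow = c(1, 3))
# Grouped bars of raw counts; t() gives one group of bars per marital status.
barplot(t(caff), beside = TRUE, legend.text = TRUE, ylab = "Count")
# Grouped bars of within-group proportions, comparable across groups.
barplot(t(caff_prop), beside = TRUE, ylim = c(0, 1), ylab = "Proportion")
# Mosaic plot: tile area is proportional to the cell count.
mosaicplot(caff, main = "Caffeine by marital status", color = TRUE)
par(op)
```

Plotting proportions rather than counts keeps the small "Prev. married" group readable next to the much larger "Married" group.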
[Figure: two panels showing caffeine consumption category by marital status (Married, Prev. married, Single); the vertical axes run from 0 to 1400 and from 0.0 to 1.0.]
A special case of a mosaic plot is sometimes called a spineplot.
[Figure: spineplot of caffeine consumption category by marital status; vertical axis from 0.0 to 1.0.]
Barplots
Barplots should not (!) be used to compare distributions of data across groups.
Box-and-whisker Plots (boxplots)
[Figure: annotated boxplot marking the median, the 1st and 3rd quartiles, the lower and upper whiskers, and outliers.]
There are about as many variations as software designers.
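As one concrete variation, here is a small sketch of R's default convention, in which the whiskers reach the most extreme data points within 1.5 box lengths of the hinges. The sample `x` is made up for illustration; other software may use other rules.

```r
# Hypothetical sample with one extreme value.
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 30)

# boxplot.stats() returns the numbers that boxplot() draws.
b <- boxplot.stats(x, coef = 1.5)
b$stats  # lower whisker, lower hinge, median, upper hinge, upper whisker
b$out    # points beyond the whiskers, drawn individually

boxplot(x, horizontal = TRUE, main = "Default R boxplot")
```

Here the hinges are 3 and 8, so the whiskers may extend at most 7.5 beyond them; the value 30 falls outside and is flagged as an outlier.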
Parallel box plots are often useful to show the differences between subgroups of the data.
[Figure: parallel boxplots of infant mortality (0 to 150) by GDP group (0,100], (100,1000], (1000,1e+04], (1e+04,1e+05].]
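A sketch of parallel boxplots, grouping a continuous variable with cut(). The UN data used in the slides is not shipped with base R, so this assumes the built-in mtcars data instead; the break points are chosen by hand for illustration.

```r
# Group horsepower into three intervals that cover the whole range of the data.
grp <- cut(mtcars$hp, breaks = c(50, 100, 200, 350))

# One box per group; the returned list holds the statistics that were drawn.
res <- boxplot(mtcars$mpg ~ grp, xlab = "hp group", ylab = "mpg")
res$n  # number of observations in each group
```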
Violin plots
Violin plots replace the representation in a boxplot by a variable-width box determined by a density estimate.
See e.g. the help for function vioplot() in the package of that name or the help for function panel.violin() in package lattice.
[Figure: violin plots of infant mortality (0 to 150) by GDP group (0,100], (100,1000], (1000,1e+04], (1e+04,1e+05].]
[Figure: infant.mortality (horizontal axis, 0 to 150) by GDP group (0,100], (100,1000], (1000,1e+04], (1e+04,1e+05].]
Histograms
- Very convenient for studying the shape of the distribution of the data.
- We can choose a set of breakpoints covering the data, and count how many points fall into each interval.
- Warning: depending on the software, the bars may show counts, proportions or percentages.
A true histogram has the area of each bar proportional to the count, and total area one. This matters if the breaks are not equally spaced. See function truehist() in package MASS.
How do we choose the number and position of the breaks?
hist(data, prob = FALSE, breaks = breaks)
truehist(data, x0 = x0)  # from package MASS; x0 must be passed by name
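A short sketch of the difference between the count and density scales, assuming the built-in faithful data (Old Faithful eruption durations). The unequal breaks are picked by hand for illustration only.

```r
x <- faithful$eruptions  # Old Faithful eruption durations, shipped with R

# Unequally spaced breaks covering the data (chosen for illustration only).
breaks <- c(1.5, 2, 2.5, 3.5, 4, 4.5, 5.5)

h <- hist(x, breaks = breaks, plot = FALSE)
h$counts   # points per interval
h$density  # counts / (n * bin width)

# Total area under the density-scale histogram.
sum(h$density * diff(h$breaks))

op <- par(mfrow = c(1, 2))
hist(x, breaks = 20, freq = TRUE, main = "Counts")
hist(x, breaks = breaks, freq = FALSE, main = "Density, unequal breaks")
par(op)
```

With unequal breaks only the density scale gives bars whose areas are proportional to the counts, which is why freq = FALSE is used in the second panel.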
[Figure: four histograms of infant.mortality drawn with different breaks, two on the Frequency scale and two on the Density scale.]
[Figure: two histograms of the duration of Old Faithful eruptions.]
Density Plots
- Histograms are density plots: the tops of the bars form a piecewise-constant estimator of the underlying pdf.
- We can use smooth estimates of density. Examples:
  - Kernel density estimates

    $$\hat{f}(x) = \frac{1}{n} \sum_{i=1}^{n} K_h(x - x_i)$$

    where $K_h$ is a kernel and $h$ is the bandwidth.
    density(): check arguments bw, from, to...
  - Splines or logsplines: a spline is a piecewise polynomial function that is smooth at the places where the polynomial pieces connect.
    logspline() in package polspline
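The kernel density formula can be evaluated directly. The sketch below assumes a Gaussian kernel, the built-in faithful data, and a hand-picked bandwidth; the helper name `kde` is made up for illustration.

```r
x <- faithful$eruptions
h <- 0.3  # bandwidth, picked by hand for illustration

# Direct evaluation of the formula with a Gaussian kernel:
# K_h(u) = dnorm(u / h) / h, which equals dnorm(u, sd = h).
kde <- function(x0, data, h) {
  sapply(x0, function(u) mean(dnorm(u - data, sd = h)))
}

grid <- seq(min(x) - 4 * h, max(x) + 4 * h, length.out = 512)
f_hat <- kde(grid, x, h)

# Riemann-sum check that the estimate integrates to about one.
area <- sum(f_hat) * diff(grid[1:2])
area

# density() computes the same kind of estimate on a grid, much faster.
d <- density(x, bw = h, from = min(grid), to = max(grid))
plot(d, main = "Kernel density estimate")
lines(grid, f_hat, lty = 2, col = "red")
```

The two curves should be nearly indistinguishable; density() uses a binned approximation, so small numerical differences are expected.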
[Figure: kernel density and logspline density estimates of infant.mortality.]
Rugs
We can add a rug below the x axis to highlight the positions of the data points. See the help for rug() and also jitter().
Semi-transparent grey rugs can be obtained by specifying the color: col=rgb(0,0,0,0.25).
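A brief sketch of a density plot with a rug, assuming the built-in faithful data; the variable name `rug_col` is only for illustration.

```r
x <- faithful$eruptions
plot(density(x), main = "Old Faithful eruption durations")

# Semi-transparent ticks; jitter() separates tied values slightly.
rug_col <- rgb(0, 0, 0, 0.25)
rug(jitter(x), col = rug_col)
```

With transparency, regions where many ticks overlap appear darker, which gives a rough visual impression of density.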
[Figure: kernel density and logspline density estimates of duration.]
Scatterplots
The canonical plot of two continuous variables is a scatterplot.
Example from the UN dataset:
plot(infant.mortality ~ gdp, UN, cex = 0.5)
plot(infant.mortality ~ gdp, UN, log = "xy", ...)
[Figure: scatterplots of infant.mortality against gdp, one on linear axes and one on log-log axes.]
Using scatterplot() from package car.
[Figure: log-log scatterplot of infant.mortality against gdp, with labelled points Tonga, Iraq, Afghanistan, Bosnia, Sao.Tome, Sudan, Gabon, Liberia, Korea.Dem.Peoples.Rep and French.Guiana.]
Smoother
- A fitted regression line and a smooth line have been automatically added.
- Such smooth curves often help to highlight trends in a scatterplot, but they can also be deceptive. Smoothers are things we will return to, but see the functions loess.smooth() and smooth.spline().
- With the scatterplot() function, outliers are automatically labelled. It is often best to do this manually with the identify() function.
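A base-R sketch that adds a regression line and two smoothers by hand, assuming the built-in cars data rather than the UN data from the slides.

```r
# cars: speed and stopping distance, a dataset shipped with R.
plot(dist ~ speed, data = cars)

# Least-squares line.
fit <- lm(dist ~ speed, data = cars)
abline(fit, lty = 2)

# Local regression evaluated on a grid of x values.
lo <- loess.smooth(cars$speed, cars$dist, span = 2/3)
lines(lo, col = "blue")

# Cubic smoothing spline; by default the amount of smoothing is
# chosen by generalised cross-validation.
ss <- smooth.spline(cars$speed, cars$dist)
lines(ss, col = "red")
```

Changing span, or supplying df to smooth.spline(), shows how strongly the apparent trend depends on the amount of smoothing.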
It is often useful to visually convey confidence in your plots.
[Figure: scatterplot of hp against mpg.]
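One way to convey confidence is to shade a pointwise confidence band around a fitted line. The sketch below assumes the built-in mtcars data (which has variables hp and mpg) and a simple linear model; the figure in the slides may have been produced differently.

```r
fit <- lm(hp ~ mpg, data = mtcars)

# Predictions and pointwise 95% confidence limits for the mean on a grid.
grid <- data.frame(mpg = seq(min(mtcars$mpg), max(mtcars$mpg), length.out = 100))
band <- predict(fit, newdata = grid, interval = "confidence", level = 0.95)

plot(hp ~ mpg, data = mtcars)
# Shaded band first, then the fitted line on top.
polygon(c(grid$mpg, rev(grid$mpg)), c(band[, "lwr"], rev(band[, "upr"])),
        col = rgb(0, 0, 0, 0.2), border = NA)
lines(grid$mpg, band[, "fit"])
```

The band is narrowest near the mean of mpg and widens towards the edges of the data, where the fitted line is less certain.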
Scatterplot with color by types
[Figure: scatterplot of prestige against income, with points distinguished by type (bc, prof, wc).]
scatterplot(prestige ~ income | type, data = Prestige,
            smoother = FALSE, reg.line = FALSE)  # in car >= 3.0: smooth = FALSE, regLine = FALSE
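The same idea can be sketched in base R without package car. This assumes the built-in iris data instead of Prestige, and the colour choices are arbitrary.

```r
cols <- c("red", "green3", "blue")
grp <- iris$Species  # a factor with three levels

# Index the colour vector by the integer codes of the factor.
plot(Sepal.Length ~ Petal.Length, data = iris,
     col = cols[as.integer(grp)], pch = 19)
legend("topleft", legend = levels(grp), col = cols, pch = 19, title = "Species")
```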
Scatterplot Matrices (or pairs plots)
[Figure: scatterplot matrix of Sepal.Length, Sepal.Width, Petal.Length and Petal.Width, titled "Anderson's Iris Data -- 3 species".]
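pairs(iris[1:4], main = "Anderson's Iris Data -- 3 species",
      pch = 21,  # fillable symbol, so that bg sets the fill colour
      bg = c("red", "green3", "blue")[unclass(iris$Species)])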
Image or contours
The functions image() and contour() are useful to explore three-dimensional data or to illustrate distributions in two dimensions.
[Figure: contour plot (levels 0.02 to 0.14) and image plot of a surface over the range -4 to 4 on both axes.]
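A sketch of both functions on a gridded surface. The slides do not say which distribution is plotted, so this assumes a standard bivariate normal density with independent components.

```r
# Standard bivariate normal density with independent components
# (an assumed example, not necessarily the one shown in the slides).
x <- seq(-4, 4, length.out = 101)
y <- seq(-4, 4, length.out = 101)
z <- outer(x, y, function(a, b) dnorm(a) * dnorm(b))

op <- par(mfrow = c(1, 2))
contour(x, y, z, xlab = "x", ylab = "y")  # labelled level curves
image(x, y, z, xlab = "x", ylab = "y")    # colour-coded grid
par(op)
```

Both functions expect z as a matrix with one row per x value and one column per y value, which is what outer() returns.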
Aspect ratio of plot
The aspect ratio of a plot is very important: Cleveland/McGill recommended an average slope of about 45° as the eye is most sensitive to departures from 45°.
[Figure: two Normal Q-Q plots (Sample Quantiles against Theoretical Quantiles) drawn with different aspect ratios.]
Arranging Several Plots on Single Page
par(mfrow=c(2,3))
for(i in 1:6) { plot(1:10) }
[Figure: six copies of plot(1:10) arranged in a grid of 2 rows and 3 columns.]
The layout() function divides the plotting device into variable numbers of rows and columns, with the column widths and the row heights specified in the respective arguments.
nf <- layout(matrix(c(1,2,3,3), 2, 2, byrow=TRUE),
c(3,7), c(5,5),respect=TRUE)
for(i in 1:3) { plot(1:10) }
[Figure: three copies of plot(1:10) arranged by layout(): two panels in the top row and one panel spanning the bottom row.]
Make your graphical display easy to understand
- Add labels to your axes (with appropriate font and font size)
- Control the scale of the axes using the arguments xlim and ylim. Also check the argument axes.
- Add clear captions
- Use appropriate colors
- ...
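A sketch that applies these points to one plot, assuming the built-in cars data. The title, limits, colour and tick positions are illustrative choices, not recommendations from the slides.

```r
plot(dist ~ speed, data = cars,
     main = "Stopping distance against speed",  # title
     xlab = "Speed (mph)", ylab = "Stopping distance (ft)",
     xlim = c(0, 30), ylim = c(0, 130),          # explicit axis ranges
     col = "steelblue", pch = 19,
     cex.lab = 1.2,                              # larger axis labels
     axes = FALSE)                               # suppress default axes
axis(1, at = seq(0, 30, by = 5))                 # custom x ticks
axis(2, at = seq(0, 120, by = 20), las = 1)      # horizontal y tick labels
box()

usr <- par("usr")  # plotting region actually used: x1, x2, y1, y2
usr
```

By default R extends each requested range by 4% at both ends, so par("usr") is slightly wider than xlim and ylim.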
Save a figure in a pdf file
I recommend always saving your figures as PDF, as this makes them easier to include in LaTeX. This can be done in R using the following commands:
pdf("filename.pdf")
...
dev.off()
You can also specify the size of the figure with the options width and height; the measures are in inches.
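A complete sketch of the pattern, writing to a temporary directory. The file name is hypothetical and the plotted data (built-in cars) are only a stand-in for your own figure.

```r
out <- file.path(tempdir(), "example-figure.pdf")  # hypothetical file name

pdf(out, width = 6, height = 4)  # size in inches
plot(dist ~ speed, data = cars,
     xlab = "Speed (mph)", ylab = "Stopping distance (ft)")
dev.off()                        # closes the device and writes the file

file.exists(out)
```

Nothing is written until dev.off() is called, so forgetting it is a common reason for empty or unreadable PDF files.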
To watch at home...
TED talk on The beauty of data visualization:
David McCandless turns complex data sets (like worldwide military spending, media buzz, Facebook status updates) into beautiful, simple diagrams that tease out unseen patterns and connections. Good design, he suggests, is the best way to navigate information glut, and it may just change the way we see the world.
Link: http://www.ted.com/talks/david_mccandless_the_beauty_of_data_visualization