R and Visualization: A Match Made in Heaven

Upload: edureka

Post on 14-Apr-2017

TRANSCRIPT

Page 1: R and Visualization: A match made in Heaven

www.edureka.co/r-for-analytics

R and Visualization A Match Made in Heaven

Page 2: R and Visualization: A match made in Heaven

Agenda

By the end of this session you will:

• Have a basic understanding of data visualization as a field
• Create basic and advanced graphs in R
• Change colors or use custom palettes
• Customize graphical parameters
• Learn the basics of the Grammar of Graphics
• Visualize spatial analysis

Page 3: R and Visualization: A match made in Heaven

Part 1 : What is Data Visualization?

Data visualization is the study of the visual representation of data.

• More than pretty graphs
• Gives insights
• Helps decision making
• Accurate and truthful

Why Data Visualization?

"Lies, damned lies, and statistics" is a phrase describing the persuasive power of numbers, particularly the use of statistics to bolster weak arguments.

Cue the Anscombe case study.

Source: Anscombe (1973), http://www.sjsu.edu/faculty/gerstman/StatPrimer/anscombe1973.pdf

Data Visualization In R
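The Anscombe case study can be reproduced with the `anscombe` data frame that ships with R. The sketch below is one possible way to show it, not the presenter's own code: the four x/y pairs have almost identical summary statistics, yet look completely different once plotted.

```r
# Anscombe's quartet: four x/y pairs stored as columns x1..x4 and y1..y4
data(anscombe)

# Summary statistics for each pair
stats <- sapply(1:4, function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  c(mean_x = mean(x), mean_y = mean(y), var_x = var(x), cor_xy = cor(x, y))
})
colnames(stats) <- paste0("set", 1:4)
print(round(stats, 2))

# Plotting reveals what the numbers hide
op <- par(mfrow = c(2, 2))
for (i in 1:4) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  plot(x, y, main = paste("Set", i), pch = 19)
  abline(lm(y ~ x), col = "red")   # least-squares line for this pair
}
par(op)
```

The printed table is nearly constant across the four columns; only the plots show the curve, the outlier and the single leverage point.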

Page 4: R and Visualization: A match made in Heaven

Part 1 : Does This Make Sense?

> cor(mtcars)

Page 5: R and Visualization: A match made in Heaven

Part 1 : Does This Make Better Sense?

> library(corrgram)
> corrgram(mtcars)

• RED is negative
• BLUE is positive
• The darker the color, the stronger the correlation
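If the corrgram package is not installed, a similar picture can be sketched with base R alone. The palette and layout below are this example's own choices (they are not taken from corrgram); the colors follow the slide's convention of red for negative and blue for positive.

```r
cm <- cor(mtcars)   # 11 x 11 correlation matrix
n  <- ncol(cm)

# Diverging palette: red (-1) through white (0) to blue (+1)
pal <- colorRampPalette(c("red", "white", "blue"))(21)

# image() draws matrix rows along x and columns along y; reversing the
# column order puts the diagonal from top-left to bottom-right
image(1:n, 1:n, cm[, n:1], col = pal, zlim = c(-1, 1),
      axes = FALSE, xlab = "", ylab = "", main = "Correlations in mtcars")
axis(1, at = 1:n, labels = rownames(cm), las = 2)
axis(2, at = 1:n, labels = rev(colnames(cm)), las = 2)
```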

Page 7: R and Visualization: A match made in Heaven

Part 2 : John Maeda on Laws of Simplicity

Also see http://lawsofsimplicity.com/

Page 8: R and Visualization: A match made in Heaven

Part 2 : Leland Wilkinson/Hadley Wickham on Grammar of Graphics

• When creating a plot we start with data
• We can create many different types of plots using this same basic specification (bars, lines, and points are all examples of geometric objects)
• We can scale the axes
• We can statistically transform the data (bins, aggregates)

The concept of layers:

Plot = data [1] + scales and coordinate system [2] + plot annotations [3]

[1] Data plot type
[2] Axes and legends
[3] Background and plot title

See http://vita.had.co.nz/papers/layered-grammar.pdf

Grammar of Graphics

Page 9: R and Visualization: A match made in Heaven

Part 2 : Leland Wilkinson/Hadley Wickham on Grammar of Graphics

The layered grammar defines the components of a plot as:

• A default dataset and set of mappings from variables to aesthetics
• One or more layers, with each layer having one geometric object, one statistical transformation, one position adjustment, and optionally, one dataset and set of aesthetic mappings
• One scale for each aesthetic mapping used
• A coordinate system
• The facet specification
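The components listed above map almost one-to-one onto calls in ggplot2, Hadley Wickham's implementation of the layered grammar. The following is a minimal sketch, assuming ggplot2 is installed; the choice of mtcars and of the variables plotted is this example's own.

```r
library(ggplot2)   # install.packages("ggplot2") if needed

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +                   # default dataset and aesthetic mappings
  geom_point(aes(colour = factor(cyl))) +                     # layer 1: point geom, identity stat
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +   # layer 2: line geom, smoothing stat
  scale_colour_brewer(palette = "Set1", name = "Cylinders") + # scale for the colour aesthetic
  coord_cartesian() +                                         # coordinate system
  facet_wrap(~ gear)                                          # facet specification

print(p)
```

Each `+` adds one component, so a layer, scale or facet can be swapped without touching the rest of the specification.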

Page 10: R and Visualization: A match made in Heaven

Part 3 : Basic graphs in R (and which one should we use when?)

• Pie Chart (never use them)
• Scatter Plot (always use them?)
• Line Graph (Linear Trend)
• Bar Graphs (When are they better than Line graphs?)
• Sunflower plot (overplotting)
• Rug Plot
• Density Plot
• Histograms (Give us a good break!)
• Box Plots

Basic graphs in R

Page 11: R and Visualization: A match made in Heaven

Part 3 : Basic graphs in R

plot(iris)

• Plot the entire object
• See how variables behave with each other

Page 12: R and Visualization: A match made in Heaven

Part 3 : Basic graphs in R

plot(iris$Sepal.Length, iris$Species)

Plot two variables at a time to closely examine their relationship

Page 13: R and Visualization: A match made in Heaven

Part 3 : Basic graphs in R

plot(iris$Species, iris$Sepal.Length)

• Plot two variables at a time
• Order is important
• Hint: keep factor variables on the X axis

Box plot: five numbers! Minimum, first quartile, median, third quartile, maximum.
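The five numbers behind a box plot can be inspected directly. This is a small sketch using base R's `fivenum()` and the value returned by `boxplot()`. Two caveats: `fivenum()` reports hinges, which can differ slightly from the quartiles given by `quantile()`, and by default the whiskers stop at the most extreme point within 1.5 times the box height, so a whisker end is not always the minimum or maximum.

```r
# Five-number summary of sepal length for each species
five <- sapply(split(iris$Sepal.Length, iris$Species), fivenum)
rownames(five) <- c("min", "lower hinge", "median", "upper hinge", "max")
print(five)

# boxplot() returns the statistics it actually draws
b <- boxplot(Sepal.Length ~ Species, data = iris, ylab = "Sepal length")
print(b$stats)   # whisker ends, hinges and median, one column per species
print(b$out)     # points drawn individually as outliers
```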

Page 14: R and Visualization: A match made in Heaven

Part 3 : Basic graphs in R

plot(iris$Sepal.Length)

Plot one variable

Scatterplot

Page 15: R and Visualization: A match made in Heaven

Part 3 : Basic graphs in R

Line graph

plot(iris$Sepal.Length, type='l')

• Plot with type='l'
• Use it when you need to show a trend (usually with respect to time)
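The rows of iris have no time order, so a line through them suggests a trend that is not there. As a hedged illustration of the "trend over time" case, the sketch below swaps in `Nile`, a time series that ships with R (annual flow of the Nile, 1871 to 1970); the dataset choice is this example's, not the presenter's.

```r
# Nile is a built-in annual time series, so the x axis is real time
plot(Nile, type = "l", xlab = "Year", ylab = "Annual flow",
     main = "Flow of the Nile")

# Add a smoothed trend on top of the raw series
trend <- lowess(as.numeric(time(Nile)), as.numeric(Nile))
lines(trend, col = "red", lwd = 2)
```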

Page 16: R and Visualization: A match made in Heaven

Part 3 : Basic graphs in R

plot(iris$Sepal.Length, type='h')

Graph with type='h' (a vertical line from the axis up to each value)

Page 17: R and Visualization: A match made in Heaven

Part 3 : Basic graphs in R

barplot(iris$Sepal.Length)

Bar graph
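Calling `barplot()` on the raw column draws one bar per flower, 150 in total. Bar graphs tend to read better when each bar is a summary of a category. A minimal sketch, where the choice of the mean as the summary is an assumption of this example:

```r
# One bar per species: mean sepal length
means <- tapply(iris$Sepal.Length, iris$Species, mean)
print(means)

barplot(means, ylab = "Mean sepal length",
        main = "Mean sepal length by species", col = "steelblue")
```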

Page 18: R and Visualization: A match made in Heaven

Part 3 : Basic graphs in R

pie(table(iris$Species))

Pie graph (NOT recommended)

Page 19: R and Visualization: A match made in Heaven

Part 3 : Basic graphs in R

hist(iris$Sepal.Length)

Page 20: R and Visualization: A match made in Heaven

Part 3 : Basic graphs in R

hist(iris$Sepal.Length, breaks=20)
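When `breaks` is a single number, `hist()` treats it as a suggestion and rounds to "pretty" bin edges, so the result may not have exactly 20 bins. The sketch below compares the bin counts and shows how to fix the edges exactly; the bin width of 0.25 is an arbitrary choice for illustration.

```r
x <- iris$Sepal.Length

# plot = FALSE returns the histogram object without drawing it
h_default <- hist(x, plot = FALSE)
h_20      <- hist(x, breaks = 20, plot = FALSE)
print(length(h_default$counts))   # number of bins R chose by default
print(length(h_20$counts))        # number of bins for breaks = 20

# Exact bin edges: pass a vector of break points instead of a single number
hist(x, breaks = seq(4, 8, by = 0.25), main = "Bin width 0.25",
     xlab = "Sepal length")
```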

Page 21: R and Visualization: A match made in Heaven

Part 3 : Basic graphs in R

plot(density(iris$Sepal.Length))
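A density plot is a smoothed histogram, and the amount of smoothing is controlled by the bandwidth. The following sketch overlays two density estimates on a histogram drawn on the density scale; the `adjust = 0.5` value is only an illustrative choice.

```r
x <- iris$Sepal.Length

d_default <- density(x)                 # default bandwidth
d_narrow  <- density(x, adjust = 0.5)   # half the bandwidth: more detail, more noise

# freq = FALSE puts the histogram on the same scale as the density curves
hist(x, freq = FALSE, breaks = 20, xlab = "Sepal length",
     main = "Histogram with density estimates")
lines(d_default, lwd = 2)
lines(d_narrow, col = "red", lty = 2)
```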

Page 22: R and Visualization: A match made in Heaven

Part 3 : Basic graphs in R

boxplot(iris$Sepal.Length)

Boxplot

Page 23: R and Visualization: A match made in Heaven

Part 3 : Basic graphs in R

Boxplot with Rug

> boxplot(iris$Sepal.Length)
> rug(iris$Sepal.Length, side=2)

rug() adds a rug representation (1-d plot) of the data to the plot.

Page 24: R and Visualization: A match made in Heaven

Part 3 : Customizing Graphs

Multiple graphs on same screen

> par(mfrow=c(3,2))
> sunflowerplot(iris$Sepal.Length)
> plot(iris$Sepal.Length)
> boxplot(iris$Sepal.Length)
> plot(iris$Sepal.Length, type="l")
> plot(density(iris$Sepal.Length))
> hist(iris$Sepal.Length)

Customizing Graphs
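Settings made with `par()` stay in effect for every later plot on the same device. A common idiom, sketched here, is to keep the value returned by `par()` (the previous settings) and restore it afterwards.

```r
op <- par(mfrow = c(3, 2))   # par() returns the old values of what it changes

plot(iris$Sepal.Length)
boxplot(iris$Sepal.Length)
hist(iris$Sepal.Length)

par(op)                      # back to one plot per screen
print(par("mfrow"))
```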

Page 26: R and Visualization: A match made in Heaven

Part 3 : Customizing Graphs

Multiple graphs on same screen (same code as on page 24)

Over-plotting

Page 27: R and Visualization: A match made in Heaven

Part 3 : Customizing Graphs

X Axis, Y Axis, Title, Color

> par(mfrow=c(1,2))
> plot(mtcars$mpg, mtcars$cyl, main="Example Title", col="blue", xlab="Miles per Gallon", ylab="Number of Cylinders")
> plot(mtcars$mpg, mtcars$cyl)

Page 28: R and Visualization: A match made in Heaven

Part 3 : Customizing Graphs

Background

Try a variation of this yourself:

par(bg="yellow")
boxplot(mtcars$mpg~mtcars$gear)

Page 29: R and Visualization: A match made in Heaven

Part 3 : Customizing Graphs

Use Color Palettes

> par(mfrow=c(3,2))
> hist(VADeaths,col=heat.colors(7),main="col=heat.colors(7)")
> hist(VADeaths,col=terrain.colors(7),main="col=terrain.colors(7)")
> hist(VADeaths,col=topo.colors(8),main="col=topo.colors(8)")
> hist(VADeaths,col=cm.colors(8),main="col=cm.colors(8)")
> hist(VADeaths,col=cm.colors(10),main="col=cm.colors(10)")
> hist(VADeaths,col=rainbow(8),main="col=rainbow(8)")

Source: http://decisionstats.com/2011/04/21/using-color-palettes-in-r/
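Besides the built-in palettes, base R can build a custom one with `colorRampPalette()`. The three anchor colors below are an arbitrary choice for this sketch; any valid R color names or hex codes would do.

```r
# colorRampPalette() returns a function that interpolates between the given colors
my_palette <- colorRampPalette(c("navy", "white", "firebrick"))
cols <- my_palette(7)
print(cols)   # seven hex color strings

hist(VADeaths, col = cols, main = "col=my_palette(7)")
```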

Page 30: R and Visualization: A match made in Heaven

Part 3 : Customizing Graphs

Use Color Palettes in RColorBrewer

> library(RColorBrewer)

> par(mfrow=c(2,3))

> hist(VADeaths,col=brewer.pal(3,"Set3"),main="Set3 3 colors")

> hist(VADeaths,col=brewer.pal(3,"Set2"),main="Set2 3 colors")

> hist(VADeaths,col=brewer.pal(3,"Set1"),main="Set1 3 colors")

> hist(VADeaths,col=brewer.pal(8,"Set3"),main="Set3 8 colors")

> hist(VADeaths,col=brewer.pal(8,"Greys"),main="Greys 8 colors")

> hist(VADeaths,col=brewer.pal(8,"Greens"),main="Greens 8 colors")

source- http://decisionstats.com/2012/04/08/color-palettes-in-r-using-rcolorbrewer-rstats/

Page 31: R and Visualization: A match made in Heaven

Part 4 : Advanced Graphs

Hexbin for over plotting (many data points at the same location)

library(hexbin)
plot(hexbin(iris$Species,iris$Sepal.Length))

Advanced Graphs

Page 32: R and Visualization: A match made in Heaven

Part 4 : Advanced Graphs

Hexbin for over plotting (many data points at the same location)

library(hexbin)
plot(hexbin(mtcars$mpg,mtcars$cyl))
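Hexagonal binning pays off mainly on large datasets, where a scatter plot turns into a solid blob. With only 32 rows in mtcars the effect is hard to see, so the sketch below uses simulated data (an assumption of this example, not part of the original slides) and assumes the hexbin package is installed.

```r
library(hexbin)   # install.packages("hexbin") if needed

set.seed(1)       # simulated data, for illustration only
x <- rnorm(10000)
y <- x + rnorm(10000)

plot(x, y, pch = 19)             # scatter plot: the dense centre is one blob

hb <- hexbin(x, y, xbins = 30)   # count the points falling in each hexagonal cell
plot(hb)                         # shade encodes the count in each cell
```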

Page 33: R and Visualization: A match made in Heaven

Part 4 : Advanced Graphs

Tabplot for visual summary of a dataset

library(tabplot)

tableplot(iris)

Page 34: R and Visualization: A match made in Heaven

Part 4 : Advanced Graphs

Tabplot for visual summary of a dataset

library(tabplot)

tableplot(mtcars)

Page 35: R and Visualization: A match made in Heaven

Part 4 : Advanced Graphs

Tabplot for visual summary of a dataset

Can summarize a lot of data relatively fast

library(tabplot)
library(ggplot2)
tableplot(diamonds)

Page 36: R and Visualization: A match made in Heaven

Part 4 : Advanced Graphs

• vcd for categorical data
• mosaic

library(vcd)
mosaic(HairEyeColor)

Page 37: R and Visualization: A match made in Heaven

Part 4 : Advanced Graphs

• vcd for categorical data

• mosaic

library(vcd)

mosaic(Titanic)

Page 38: R and Visualization: A match made in Heaven

Part 4 : Lots of Graphs in R

heatmap(as.matrix(mtcars))
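By default `heatmap()` centers and scales each row. In mtcars the variables are the columns and they are measured in very different units, so scaling by column is usually the more informative choice. A minimal sketch, where the palette and margins are arbitrary choices of this example:

```r
m <- as.matrix(mtcars)

# Scale each variable (column) so that mpg, hp, disp, ... become comparable
heatmap(m, scale = "column", col = heat.colors(16), margins = c(5, 8))
```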

Page 39: R and Visualization: A match made in Heaven

Part 5 : Spatial Analysis

Base R includes many functions that can be used for reading, visualising, and analysing spatial data. The focus is on "geographical" spatial data, where observations can be identified with geographical locations.

Sources:

http://spatial.ly/r/
http://cran.r-project.org/web/views/Spatial.html
http://rspatial.r-forge.r-project.org/

Spatial Analysis

Page 40: R and Visualization: A match made in Heaven

Part 5 : Spatial Analysis : Example

library(sp)
library(maptools)
nc <- readShapePoly(system.file("shapes/sids.shp", package="maptools")[1], proj4string=CRS("+proj=longlat +datum=NAD27"))
names(nc)
# create two dummy factor variables, with equal labels:
set.seed(31)
nc$f = factor(sample(1:5,100,replace=T),labels=letters[1:5])
nc$g = factor(sample(1:5,100,replace=T),labels=letters[1:5])
library(RColorBrewer)
## Two (dummy) factor variables shown with qualitative colour ramp; degrees in axes
spplot(nc, c("f","g"), col.regions=brewer.pal(5, "Set3"), scales=list(draw = TRUE))
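The maptools package was retired and archived from CRAN in 2023, so `readShapePoly()` may not be available in a current R installation. As a hedged alternative (a substitution made for this example, not the presenter's method), the same idea can be sketched with the sf package, which bundles its own copy of the North Carolina SIDS shapefile.

```r
library(sf)   # install.packages("sf") if needed

# North Carolina SIDS data bundled with sf
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
print(names(nc))

# Two dummy factor variables with equal labels, as on the slide
set.seed(31)
nc$f <- factor(sample(1:5, nrow(nc), replace = TRUE), levels = 1:5, labels = letters[1:5])
nc$g <- factor(sample(1:5, nrow(nc), replace = TRUE), levels = 1:5, labels = letters[1:5])

# One map panel per selected attribute
plot(nc[c("f", "g")])
```

Passing `levels = 1:5` keeps the labels valid even if a random sample happens to miss one of the five values.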

Page 42: R and Visualization: A match made in Heaven

Part 5 : Spatial Analysis : Example

library(raster)
alt <- getData('alt', country = "IND")
plot(alt)

Page 43: R and Visualization: A match made in Heaven

Part 5 : Spatial Analysis : Example

library(raster)
gadm <- getData('GADM', country = "IND", level=3)
head(gadm)
table(gadm$NAME_1)
gadm_GUJ = subset(gadm, gadm$NAME_1=="Gujarat")

Page 46: R and Visualization: A match made in Heaven

Survey

Your feedback is vital for us, be it a compliment, a suggestion or a complaint. It helps us to make your experience better!

Please spare a few minutes to take the survey after the webinar.
