introduction to the data...datacamp communicating with data in the tidyverse the data you are going...

25
DataCamp Communicating with Data in the Tidyverse Introduction to the data COMMUNICATING WITH DATA IN THE TIDYVERSE Timo Grossenbacher Data Journalist

Upload: others

Post on 04-Jul-2020

20 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Introduction to the data...DataCamp Communicating with Data in the Tidyverse The data you are going to work with ilo_working_hours # A tibble: 737 x 3 country year working_hours

DataCamp CommunicatingwithDataintheTidyverse

Introductiontothedata

COMMUNICATINGWITHDATAINTHETIDYVERSE

TimoGrossenbacherDataJournalist

Page 2: Introduction to the data...DataCamp Communicating with Data in the Tidyverse The data you are going to work with ilo_working_hours # A tibble: 737 x 3 country year working_hours

DataCamp CommunicatingwithDataintheTidyverse

ThisismeFindexamplesofdatajournalismonsrfdata.github.io

Page 3: Introduction to the data...DataCamp Communicating with Data in the Tidyverse The data you are going to work with ilo_working_hours # A tibble: 737 x 3 country year working_hours

DataCamp CommunicatingwithDataintheTidyverse

ThelaststepintheTidyverseprocess

[1]RforDataScience(http://r4ds.had.co.nz/communicate-intro.html)

Page 4: Introduction to the data...DataCamp Communicating with Data in the Tidyverse The data you are going to work with ilo_working_hours # A tibble: 737 x 3 country year working_hours

DataCamp CommunicatingwithDataintheTidyverse

Whatyouaregoingtocreate

Page 5: Introduction to the data...DataCamp Communicating with Data in the Tidyverse The data you are going to work with ilo_working_hours # A tibble: 737 x 3 country year working_hours

DataCamp CommunicatingwithDataintheTidyverse

Page 6: Introduction to the data...DataCamp Communicating with Data in the Tidyverse The data you are going to work with ilo_working_hours # A tibble: 737 x 3 country year working_hours

DataCamp CommunicatingwithDataintheTidyverse

Thedatayouaregoingtoworkwithilo_working_hours

#Atibble:737x3countryyearworking_hours<chr><chr><dbl>1Australia1980.034.578852Canada1980.034.850003Denmark1980.031.898084Finland1980.035.563465France1980.035.423086Iceland1980.035.846157Italy1980.035.746358Japan1980.040.788469Korea,Rep.1980.055.3076910Norway1980.030.37885#...with727morerows

Page 7: Introduction to the data...DataCamp Communicating with Data in the Tidyverse The data you are going to work with ilo_working_hours # A tibble: 737 x 3 country year working_hours

DataCamp CommunicatingwithDataintheTidyverse

Thedatayouaregoingtoworkwithilo_hourly_compensation

#Atibble:831x3countryyearhourly_compensation<chr><chr><dbl>1Australia1980.08.442Austria1980.08.873Belgium1980.011.744Canada1980.08.875Denmark1980.010.836Finland1980.08.617France1980.08.908Greece1980.03.729HongKong,China1980.01.5010Ireland1980.06.44#...with821morerows

Page 8: Introduction to the data...DataCamp Communicating with Data in the Tidyverse The data you are going to work with ilo_working_hours # A tibble: 737 x 3 country year working_hours

DataCamp CommunicatingwithDataintheTidyverse

Theinner_join()verb/function

x%>%inner_join(y,by="key")

#>#Atibble:2×3#>keyval_xval_y#><dbl><chr><chr>#>11x1y1#>22x2y2

[1]RforDataScience(http://r4ds.had.co.nz/relational-data.html#inner-join)

Page 9: Introduction to the data...DataCamp Communicating with Data in the Tidyverse The data you are going to work with ilo_working_hours # A tibble: 737 x 3 country year working_hours

DataCamp CommunicatingwithDataintheTidyverse

Let'sdothis!

COMMUNICATINGWITHDATAINTHETIDYVERSE

Page 10: Introduction to the data...DataCamp Communicating with Data in the Tidyverse The data you are going to work with ilo_working_hours # A tibble: 737 x 3 country year working_hours

DataCamp CommunicatingwithDataintheTidyverse

Filteringandplottingthedata

COMMUNICATINGWITHDATAINTHETIDYVERSE

TimoGrossenbacherDataJournalist

Page 11: Introduction to the data...DataCamp Communicating with Data in the Tidyverse The data you are going to work with ilo_working_hours # A tibble: 737 x 3 country year working_hours

DataCamp CommunicatingwithDataintheTidyverse

FilterthedataforEuropeancountries

ilo_data%>%filter(country=="Switzerland")

#Atibble:27x4countryyearhourly_compensationworking_hours<fctr><fctr><dbl><dbl>1Switzerland198010.9634.703852Switzerland198110.0134.334623Switzerland198210.3134.123084Switzerland198310.3333.842315Switzerland19849.5233.478856Switzerland19859.5533.359617Switzerland198613.6233.196158Switzerland198716.9033.173089Switzerland198817.8133.1626910Switzerland198916.5432.87308#...with17morerows

Page 12: Introduction to the data...DataCamp Communicating with Data in the Tidyverse The data you are going to work with ilo_working_hours # A tibble: 737 x 3 country year working_hours

DataCamp CommunicatingwithDataintheTidyverse

The%in%operator

...equivalentto:

ilo_data%>%filter(country%in%c("Sweden","Switzerland"))

#Atibble:54x4countryyearhourly_compensationworking_hours<fctr><fctr><dbl><dbl>1Sweden198012.4029.169232Switzerland198010.9634.703853Sweden198111.7029.007694Switzerland198110.0134.334625Sweden19829.9929.27885#...with49morerows

ilo_data%>%filter(country=="Sweden"|country=="Switzerland")

Page 13: Introduction to the data...DataCamp Communicating with Data in the Tidyverse The data you are going to work with ilo_working_hours # A tibble: 737 x 3 country year working_hours

DataCamp CommunicatingwithDataintheTidyverse

Therelationshipbetweenbothindicatorsplot_data<-ilo_data%>%filter(year==2006)

ggplot(plot_data)+geom_histogram(aes(x=working_hours))

plot_data<-ilo_data%>%filter(year==2006)

ggplot(plot_data)+geom_histogram(aes(x=hourly_compensation))

Page 14: Introduction to the data...DataCamp Communicating with Data in the Tidyverse The data you are going to work with ilo_working_hours # A tibble: 737 x 3 country year working_hours

DataCamp CommunicatingwithDataintheTidyverse

Therelationshipbetweenbothindicators

Page 15: Introduction to the data...DataCamp Communicating with Data in the Tidyverse The data you are going to work with ilo_working_hours # A tibble: 737 x 3 country year working_hours

DataCamp CommunicatingwithDataintheTidyverse

Addinglabelstotheplot

Page 16: Introduction to the data...DataCamp Communicating with Data in the Tidyverse The data you are going to work with ilo_working_hours # A tibble: 737 x 3 country year working_hours

DataCamp CommunicatingwithDataintheTidyverse

Somedplyrfunctionrepetition

ilo_data%>%group_by(country)%>%summarize(median_working_hours=median(working_hours))

#Atibble:17x2countrymedian_working_hours<fctr><dbl>1Austria31.699042Belgium32.038463CzechRep.39.100004Finland34.048085France32.34615#...with12morerows

Page 17: Introduction to the data...DataCamp Communicating with Data in the Tidyverse The data you are going to work with ilo_working_hours # A tibble: 737 x 3 country year working_hours

DataCamp CommunicatingwithDataintheTidyverse

Let'spractice!

COMMUNICATINGWITHDATAINTHETIDYVERSE

Page 18: Introduction to the data...DataCamp Communicating with Data in the Tidyverse The data you are going to work with ilo_working_hours # A tibble: 737 x 3 country year working_hours

DataCamp CommunicatingwithDataintheTidyverse

Customggplot2themes

COMMUNICATINGWITHDATAINTHETIDYVERSE

TimoGrossenbacherDataJournalist

Page 19: Introduction to the data...DataCamp Communicating with Data in the Tidyverse The data you are going to work with ilo_working_hours # A tibble: 737 x 3 country year working_hours

DataCamp CommunicatingwithDataintheTidyverse

Theadvantagesofacustomlook

Page 20: Introduction to the data...DataCamp Communicating with Data in the Tidyverse The data you are going to work with ilo_working_hours # A tibble: 737 x 3 country year working_hours

DataCamp CommunicatingwithDataintheTidyverse

Thetheme()functionggplot(plot_data)+geom_histogram(aes(x=working_hours))+labs(x="Workinghoursperweek",y="Numberofcountries")+

theme(text=element_text(family="Bookman",color="gray25"))

Page 21: Introduction to the data...DataCamp Communicating with Data in the Tidyverse The data you are going to work with ilo_working_hours # A tibble: 737 x 3 country year working_hours

DataCamp CommunicatingwithDataintheTidyverse

Defaultggplot2themesggplot(plot_data)+geom_histogram(aes(x=working_hours))+labs(x="Workinghoursperweek",y="Numberofcountries")+

theme_classic()

Page 22: Introduction to the data...DataCamp Communicating with Data in the Tidyverse The data you are going to work with ilo_working_hours # A tibble: 737 x 3 country year working_hours

DataCamp CommunicatingwithDataintheTidyverse

Chainingtheme()callsggplot(plot_data)+geom_histogram(aes(x=working_hours))+labs(x="Workinghoursperweek",y="Numberofcountries")+

theme_classic()+

theme(text=element_text(family="Bookman",color="gray25"))

Page 23: Introduction to the data...DataCamp Communicating with Data in the Tidyverse The data you are going to work with ilo_working_hours # A tibble: 737 x 3 country year working_hours

DataCamp CommunicatingwithDataintheTidyverse

Themeconfigurationoptions

axis.titlelabelofaxes(element_text;inheritsfromtext)

axis.title.xxaxislabel(element_text;inheritsfromaxis.title)

axis.title.x.topxaxislabelontopaxis(element_text;inheritsfromaxis.title.x)

axis.title.x.bottomxaxislabelonbottomaxis(element_text;inheritsfromaxis.title.x)

?theme

Page 24: Introduction to the data...DataCamp Communicating with Data in the Tidyverse The data you are going to work with ilo_working_hours # A tibble: 737 x 3 country year working_hours

DataCamp CommunicatingwithDataintheTidyverse

Theelement_*functionfamilyelement_text()element_rect()element_line()element_blank()

ggplot(plot_data)+geom_histogram(aes(x=working_hours))+labs(x="Workinghoursperweek",y="Numberofcountries")+

theme(text=element_blank())

Page 25: Introduction to the data...DataCamp Communicating with Data in the Tidyverse The data you are going to work with ilo_working_hours # A tibble: 737 x 3 country year working_hours

DataCamp CommunicatingwithDataintheTidyverse

Let'stryoutthemes!

COMMUNICATINGWITHDATAINTHETIDYVERSE