open data: analysis and visualisation
DESCRIPTION
This presentation gives an overview of open data. It presents a number of case studies on the spatio-temporal analysis and visualisation of social media data (Twitter), and explains how to create a heat map visualisation in R.
TRANSCRIPT
![Page 1: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/1.jpg)
Muhammad Adnan
Department of Geography, University College London
Web: http://www.uncertaintyofidentity.com
Twitter: @gisandtech
Open Data: Analysis and Visualisation
![Page 2: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/2.jpg)
Dr. Muhammad Adnan
• Research Associate
– Working on an EPSRC funded project “Uncertainty of Identity”
– http://www.uncertaintyofidentity.com
Research Interests
• Data Mining
• Social Media Analysis
• Data Visualisation
![Page 3: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/3.jpg)
Outline
• Open Data
• Crowd-Sourced Data (Social Media)
• Analysis and Visualisation Challenges
• Twitter Case Study
– Spatial Analysis
– Temporal Analysis
• R
– A brief introduction
– How to create heat maps
![Page 4: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/4.jpg)
Open data
Data that is:
• Open and free to the public
• Complete
• Accessible
• Timely
• Machine processable
• Non-discriminatory
![Page 5: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/5.jpg)
Dataset examples
• National budgets
• Car registries
• National roads
• Water heights
• Schools
• Weather
• Public transport
• Council tax bands
• And many more
![Page 6: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/6.jpg)
![Page 7: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/7.jpg)
![Page 8: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/8.jpg)
![Page 9: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/9.jpg)
![Page 10: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/10.jpg)
Census Profiler
• http://www.censusprofiler.org/
• Users can visualise 2001 Census data
![Page 11: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/11.jpg)
Education Profiler
• http://www.educationprofiler.org/
• Users can visualise education datasets
![Page 12: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/12.jpg)
Open Data Profiler
• http://www.opendataprofiler.com/
• Users can visualise 60 different 2011 Census datasets
![Page 13: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/13.jpg)
Crowd Sourced datasets
• Twitter
– Public streaming API can be used to download live tweets
• Foursquare
– Has an API which can be used to access the Foursquare data
• Facebook
– Facebook applications can access user information
• Flickr
• Wikipedia
• YouTube
![Page 14: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/14.jpg)
How big are crowd sourced datasets?
• Facebook
– Number of active users: 850 Million
– Average daily uploaded photos: 360 Million
– Total data size: 30+ Petabytes
• Twitter
– Number of active users: 200 Million
– Daily tweets (posts): 350 Million
• Foursquare
– Number of active users: 15 Million
– Total check-ins: 1.5 Billion
![Page 15: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/15.jpg)
What are the issues with these datasets ?
• How representative are social media data sets of the Census or Electoral Roll data?
• Who: ethnicity, gender, and age of social media users
• Where: where social media conversations are happening and who is leading them
– Intelligence about where people are located and what they are doing
• When: what time of day conversations happen
![Page 16: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/16.jpg)
Twitter (www.twitter.com)
• Online social-networking and microblogging service
• Launched in 2006
• Users can send messages of 140 characters or less
• Approximately 200 million active users
• 350 million tweets daily
• In 2012, the UK and London were ranked 4th and 3rd, respectively, in terms of the number of posted tweets
![Page 17: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/17.jpg)
Basic Analysis of the Twitter data
![Page 18: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/18.jpg)
Data available through the Twitter API
• User Creation Date
• Followers
• Friends
• User ID
• Language
• Location
• Name
• Screen Name
• Time Zone
• Geo Enabled
• Latitude
• Longitude
• Tweet date and time
• Tweet text
Users can download a 1% sample of the live tweets through the API
![Page 19: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/19.jpg)
Created with approx. 100 million tweets
![Page 20: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/20.jpg)
![Page 21: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/21.jpg)
4 million geo-tagged tweets downloaded during August and December, 2012
![Page 22: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/22.jpg)
4 million geo-tagged tweets downloaded during August and December, 2012
![Page 23: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/23.jpg)
Hourly and Daily Twitter Activity in London
![Page 24: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/24.jpg)
Hourly Twitter Activity in London
![Page 25: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/25.jpg)
Daily Twitter Activity in London
[Charts: activity by hour of day, one panel each for Monday, Tuesday, Wednesday, and Thursday; x-axis: Hour (1 to 24); y-axis ticks from 0 to 12,000]
![Page 26: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/26.jpg)
Daily Twitter Activity in London
[Charts: activity by hour of day, one panel each for Friday, Saturday, and Sunday; x-axis: Hour (1 to 24); y-axis ticks from 0 to 12,000]
![Page 27: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/27.jpg)
Analysis of User Names on Twitter
• A name is a statement of the person’s ethnic, linguistic, and cultural identity.
• E.g. Alex Singleton is an Anglo-Saxon name. Similarly, Pablo Mateos is a Spanish (Hispanic) name.
![Page 28: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/28.jpg)
Analysing Names on Twitter
• Some examples of NAME variations on Twitter
Real Names
Kevin Hodge
Andre Alves
Jose de Franco
Carolina Thomas, Dr.
Prof. Martha Del Val
Fabíola Sanchez Fernandes
Fake Names
JustinBieber_Home.
WHAT IS LOVE?
MysticMind
KIRILL_aka_KID
Vanessa
Petuna
![Page 29: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/29.jpg)
Analysing Names on Twitter
• Some examples of NAME variations on Twitter
Real Names
Kevin Hodge -> F: ‘Kevin’ ; S: ‘Hodge’
Andre Alves -> F: ‘Andre’ ; S: ‘Alves’
Jose De Franco -> F: ‘Jose’ ; S: ‘De Franco’
Carolina Thomas, Dr. -> F: ‘Carolina’ ; S: ‘Thomas’
Prof. Martha Del Val -> F: ‘Martha’ ; S: ‘Del Val’
Fabíola Sanchez Fernandes -> F: ‘Fabíola’ ; S: ‘Fernandes’
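The splits above can be approximated with a few lines of R. This is a minimal sketch, not the method used in the study: the function name `split_name`, the title list, and the particle list are all hypothetical choices made for illustration. It drops titles, takes the first remaining token as the forename and the last as the surname, and keeps a preceding particle such as "De" or "Del" with the surname. It makes no attempt to detect fake names.

```r
# Hypothetical helper: split a display name into forename (F) and surname (S).
# The title and particle lists are illustrative assumptions, not from the slides.
split_name <- function(name,
                       titles = c("dr", "prof", "mr", "mrs", "ms"),
                       particles = c("de", "del", "da", "van", "von")) {
  # Replace commas and full stops with spaces, then split on whitespace.
  tokens <- strsplit(gsub("[,.]", " ", name), "\\s+")[[1]]
  tokens <- tokens[nzchar(tokens)]
  # Drop titles such as "Dr." or "Prof." wherever they appear.
  tokens <- tokens[!(tolower(tokens) %in% titles)]
  n <- length(tokens)
  # A single token (e.g. "Vanessa") cannot be split into a pair.
  if (n < 2) return(c(forename = NA_character_, surname = NA_character_))
  surname <- tokens[n]
  # Keep a particle with the surname, e.g. "Del Val" or "De Franco".
  if (n >= 3 && tolower(tokens[n - 1]) %in% particles) {
    surname <- paste(tokens[n - 1], surname)
  }
  c(forename = tokens[1], surname = surname)
}

split_name("Kevin Hodge")
split_name("Carolina Thomas, Dr.")
split_name("Prof. Martha Del Val")
```

Taking the last token as the surname matches the slide's treatment of three-part names, but it is only a heuristic and will mis-split some naming conventions.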
![Page 30: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/30.jpg)
Where they tweet from:
Surname: JONES
![Page 31: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/31.jpg)
Where they tweet from:
Surname: DEE
![Page 32: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/32.jpg)
Where they tweet from:
Surname: SHAH
![Page 33: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/33.jpg)
Predicting Ethnicity of Twitter Users by using their ‘Names’
• A name is a statement of the person’s ethnic, linguistic, and cultural identity.
• E.g. Alex Singleton is an Anglo-Saxon name. Similarly, Pablo Mateos is a Spanish (Hispanic) name.
![Page 34: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/34.jpg)
Classifying Twitter Data to ethnic origins
• Applied ONOMAP (www.onomap.org) on FORENAME + SURNAME pairs
Kevin Hodge (English)
Pablo Mateos (Spanish)
…
![Page 35: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/35.jpg)
Top 10 Ethnic Groups of Twitter Users
![Page 36: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/36.jpg)
Tweeting Activity by different Ethnic Groups
[Map panels: English, Italian, Pakistani, Indian, Turkish, Greek, Bangladeshi, Spanish, German, French, Portuguese, Sikh]
![Page 37: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/37.jpg)
Comparison of Ethnic Groups between ‘2011 Census’ and ‘Twitter’
• Onomap groups were aggregated to match the appropriate groups from the Census

| London | Total | White British | White other | Indian | Pakistani | Bangladeshi | Black African | Chinese |
|---|---|---|---|---|---|---|---|---|
| Week Night | 53611 | 71.35% | 12.12% | 2.63% | 2.63% | 1.82% | 1.52% | 1.74% |
| Week Day | 80676 | 73.12% | 11.80% | 2.41% | 2.41% | 1.56% | 1.25% | 1.61% |
| Weekend | 67351 | 72.86% | 12.17% | 2.61% | 2.61% | 1.67% | 1.39% | 1.73% |
| 2011 Census | | 44.89% | 12.65% | 6.64% | 2.74% | 2.72% | 7.02% | 1.52% |
![Page 38: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/38.jpg)
Comparison of the distribution of ethnicity with the 2011 Census
[Maps: White British (quintiles), 2011 Census and Twitter side by side]
![Page 39: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/39.jpg)
Gender and Age Analysis of Twitter Users by using their ‘forenames’
![Page 40: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/40.jpg)
Gender Analysis of Twitter Users
[Bar chart: categories Male, Female, Unisex, and Not Found; series: Number of Tweets and Number of Unique Users; y-axis from 0% to 60%]
![Page 41: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/41.jpg)
Age estimation from ‘forenames’
[Chart: age profiles of the forenames PAUL, BETTY, GUY, and MUHAMMAD; x-axis: Age group (0-4 to 85+, in five-year bands); y-axis: Per cent (0% to 45%)]
Data: Monica (CACI, Ltd.) and Birth Certificate Data (Office for National Statistics)
![Page 42: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/42.jpg)
Age-Sex structure of Twitter Users and 2011 Census
Male Female
![Page 43: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/43.jpg)
Tweets by different Land-use Categories
![Page 44: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/44.jpg)
Temporal Activity: Tweets from different Land-use Categories
![Page 45: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/45.jpg)
Ethnic Segregation of Twitter Users
![Page 46: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/46.jpg)
Segregation Analysis
• To find out the level of integration/segregation of different types of Twitter users
• During different hours of the week and weekends
• Information Theory Index
![Page 47: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/47.jpg)
Segregation Analysis
• The value of the information theory index is between 0 (low segregation) and 1 (high segregation).

| Ethnic Groups | H (Domestic buildings and gardens) | H (Week Nights) | H (Week Days) | H (Weekend) |
|---|---|---|---|---|
| British | 0.483 | 0.401 | 0.211 | 0.315 |
| Irish | 0.670 | 0.571 | 0.357 | 0.475 |
| White Other | 0.630 | 0.510 | 0.303 | 0.420 |
| Pakistani | 0.765 | 0.679 | 0.488 | 0.633 |
| Indian | 0.748 | 0.673 | 0.451 | 0.590 |
| Bangladeshi | 0.864 | 0.834 | 0.671 | 0.784 |
| Black Caribbean | 0.831 | 0.808 | 0.548 | 0.666 |
| Black African | 0.764 | 0.704 | 0.492 | 0.640 |
| Chinese | 0.712 | 0.608 | 0.403 | 0.524 |
| Other | 0.710 | 0.593 | 0.374 | 0.497 |
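The slides name the index but do not give its formula. The sketch below assumes the standard two-group entropy (Theil) formulation, H = Σ t_j (E − E_j) / (T E), where t_j is the number of users in area j, T is the total, E is the entropy of the whole study area, and E_j is the entropy of area j. Whether the study used exactly this variant is an assumption; the function names and the counts are hypothetical.

```r
# Binary entropy of a proportion p, treating 0 * log(0) as 0.
entropy <- function(p) {
  q <- 1 - p
  e <- numeric(length(p))
  ok_p <- p > 0
  ok_q <- q > 0
  e[ok_p] <- e[ok_p] - p[ok_p] * log(p[ok_p])
  e[ok_q] <- e[ok_q] - q[ok_q] * log(q[ok_q])
  e
}

# Two-group information theory index H (assumed standard formulation).
# group: users of the ethnic group per area; total: all users per area.
info_theory_index <- function(group, total) {
  stopifnot(length(group) == length(total), all(total > 0), all(group <= total))
  E <- entropy(sum(group) / sum(total))   # entropy of the whole study area
  if (E == 0) return(NA_real_)            # undefined if the group is absent or universal
  E_j <- entropy(group / total)           # entropy of each area
  sum(total * (E - E_j)) / (sum(total) * E)
}

# Hypothetical counts for three areas.
group_users <- c(80, 20, 5)
all_users   <- c(100, 100, 100)
H <- info_theory_index(group_users, all_users)
```

With this formulation, a group spread evenly across all areas gives H = 0 and a group confined to areas that contain nobody else gives H = 1, which matches the range stated on the slide.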
![Page 48: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/48.jpg)
Extending the analysis to other cities
![Page 49: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/49.jpg)
Tweet density map of London
![Page 50: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/50.jpg)
Tweet density map of Paris
![Page 51: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/51.jpg)
Tweet density map of New York City
![Page 52: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/52.jpg)
Top 10 ethnic groups in London
![Page 53: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/53.jpg)
Top 10 ethnic groups in Paris
![Page 54: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/54.jpg)
Top 10 ethnic groups in NYC
![Page 55: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/55.jpg)
Tweeting Activity by different Ethnic Groups (NYC)
[Map panels: English, Spanish, German, Jewish, Irish, Italian, Portuguese, Scottish, Black Caribbean, Chinese]
![Page 56: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/56.jpg)
Tweeting Activity by different Ethnic Groups (Paris)
[Map panels: French, German, Turkish, Spanish, Italian, Portuguese, English, Polish]
![Page 57: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/57.jpg)
Gender Analysis
![Page 58: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/58.jpg)
Exploring the Languages on Twitter
![Page 59: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/59.jpg)
Data available through the Twitter API
• User Creation Date
• Followers
• Friends
• User ID
• Language
• Location
• Name
• Screen Name
• Time Zone
• Geo Enabled
• Latitude
• Longitude
• Tweet date and time
• Tweet text
![Page 60: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/60.jpg)
Twitter Languages (World)
![Page 61: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/61.jpg)
Twitter Languages (Europe)
![Page 62: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/62.jpg)
Twitter Language Maps
![Page 63: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/63.jpg)
Twitter Language Maps
![Page 64: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/64.jpg)
Twitter Language Maps
![Page 65: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/65.jpg)
Temporal Analysis of the data sets
![Page 66: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/66.jpg)
Temporal Analysis of the Twitter Data
• Data: 12 September, 2012 – 25 September, 2013
• We extracted a total of approx. 800 million tweets over the last year
• A temporal activity analysis of different cities could potentially reveal a lot of information about the residents of the city
• But Twitter data is not clean and has lots of problems !
![Page 67: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/67.jpg)
Problems with the data
1) Extracting the data for individual cities or places
• Use of bounding boxes to extract the data
• New York City NE: 40.91762, -73.7004; SW: 40.47662, -74.2589
• http://isithackday.com could be used to find the bounding boxes of different cities
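A bounding-box filter takes only a few lines of R. This is a minimal sketch, treating the two New York City corners on the slide as the north-east and south-west corners of the box. The `tweets` data frame, its column names, and the helper `in_bbox` are hypothetical, and the sample coordinates are illustrative only.

```r
# Bounding box for New York City, using the corner coordinates from the slide.
nyc <- list(south = 40.47662, west = -74.2589,
            north = 40.91762, east = -73.7004)

# TRUE for each point that falls inside the box (edges included).
# Assumes the box does not cross the 180 degree meridian.
in_bbox <- function(lat, lon, box) {
  lat >= box$south & lat <= box$north &
    lon >= box$west & lon <= box$east
}

# Hypothetical sample of geo-tagged tweets.
tweets <- data.frame(
  id  = 1:3,
  lat = c(40.7580, 51.5074, 40.7000),
  lon = c(-73.9855, -0.1278, -75.0000)
)

# Keep only the tweets located inside the New York City box.
nyc_tweets <- tweets[in_bbox(tweets$lat, tweets$lon, nyc), ]
```

A rectangular box is only an approximation of a city boundary, so some tweets from neighbouring areas will be included.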
![Page 68: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/68.jpg)
Problems with the data
2) The Twitter data has GMT and BST timestamps. Converting them to the local time of other cities is very time consuming.
• 12 p.m. GMT is 4 a.m. in Los Angeles on Pacific Standard Time (UTC-8).
• 12 p.m. BST is 11 a.m. GMT, so treating a BST timestamp as GMT (or the reverse) shifts every converted time by one hour.
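R's built-in date-time classes can do this conversion. The sketch below assumes the timestamps are stored as London local time strings, so the time zone database decides whether each one is GMT or BST. The function name `to_local` and the sample timestamps are hypothetical, and the zone names rely on the time zone database that normally ships with R or the operating system.

```r
# Convert London local timestamps to the local time of another city.
# from_tz and to_tz are tz database names; "Europe/London" covers both GMT and BST.
to_local <- function(stamp,
                     from_tz = "Europe/London",
                     to_tz = "America/Los_Angeles") {
  t <- as.POSIXct(stamp, tz = from_tz, format = "%Y-%m-%d %H:%M:%S")
  format(t, tz = to_tz, format = "%Y-%m-%d %H:%M")
}

# Hypothetical timestamps: one in winter (GMT) and one in summer (BST).
stamps <- c("2013-01-15 12:00:00", "2013-07-15 12:00:00")
los_angeles <- to_local(stamps)
jakarta     <- to_local(stamps, to_tz = "Asia/Jakarta")
```

Letting the time zone database choose the offset avoids hard-coding a fixed hour difference, which goes wrong whenever the two cities change their clocks on different dates.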
![Page 69: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/69.jpg)
Temporal Analysis of different cities
• Approx. 170 million tweets were sent from the following 30 cities.
[Bar chart: Number of Tweets (Millions) per city, axis from 0 to 40,000,000. Cities in chart order: Jakarta, Istanbul, Paris, Sao Paulo, New York City, London, Los Angeles, Rio de Janeiro, Mexico City, Riyadh, Tokyo, Chicago, Buenos Aires, Madrid, Dallas, Philadelphia, Manchester, Houston, Washington, Toronto, Boston, Seoul (Korea), Dubai, San Francisco, Osaka (Japan), Atlanta, Sydney, Melbourne, Glasgow, Dublin]
![Page 70: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/70.jpg)
Temporal Analysis of different cities
LONDON
![Page 71: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/71.jpg)
Temporal Analysis of different cities
LONDON
PARIS
![Page 72: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/72.jpg)
Temporal Analysis of different cities
JAKARTA
![Page 73: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/73.jpg)
Temporal Analysis of different cities
JAKARTA
RIYADH
![Page 74: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/74.jpg)
Temporal Analysis of different cities
JAKARTA
ISTANBUL
![Page 75: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/75.jpg)
Introduction to R
![Page 76: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/76.jpg)
What is R?
• R is a free, open-source statistical programming language based on the S language developed at Bell Labs.
• The language is very powerful for writing programs.
• Many statistical functions are already built in.
• It is very easy to create maps and other visualisations.
![Page 77: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/77.jpg)
What is R?
• You will have to write some code to get things done!
• R is available at www.r-project.org
• Supports 32-bit and 64-bit Windows PCs, Linux, Unix, and Mac OS operating systems
![Page 78: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/78.jpg)
Getting Started
• The R GUI?
![Page 79: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/79.jpg)
Getting Started
![Page 80: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/80.jpg)
Interacting with R
Math:
> 1 + 1
[1] 2
> 1 + 1 * 7
[1] 8
> (1 + 1) * 7
[1] 14
> sqrt(16)
[1] 4
Variables:
> x <- 1
> x
[1] 1
> y <- 2
> y
[1] 2
> z <- x+y
> z
[1] 3
![Page 81: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/81.jpg)
Importing Data
• How do we get data into R?
• First make sure your data is in an easy to read format such as CSV (Comma Separated Values)
• Use code:
– D <- read.csv("path", sep=",", header=TRUE)
– D <- read.table("path", sep=",", header=TRUE)
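To try the import without an existing file, the sketch below first writes a tiny CSV to a temporary location and then reads it back with both functions. The column names and values are made up for illustration; `wg` is borrowed from the histogram slide, and what it measures is not stated in the slides.

```r
# Write a small, hypothetical CSV to a temporary file so the example is self-contained.
path <- tempfile(fileext = ".csv")
writeLines(c("area,wg",
             "A,10",
             "B,12",
             "C,9"), path)

# Both calls read the same file; header = TRUE treats the first line as column names.
D  <- read.csv(path, sep = ",", header = TRUE)
D2 <- read.table(path, sep = ",", header = TRUE)

# Quick checks on what was read.
names(D)
nrow(D)
```

For a plain comma-separated file like this one the two calls give the same result; `read.csv` is `read.table` with defaults suited to CSV files.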
![Page 82: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/82.jpg)
Working with data.
• Accessing columns.
• D has our data in it, but you can’t see it directly.
• To select a column, use D$column.
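A few built-in functions make the contents of D visible. This sketch builds a small, hypothetical data frame in place of an imported file; the column names and values are assumptions for illustration.

```r
# Hypothetical data standing in for an imported CSV.
D <- data.frame(area = c("A", "B", "C"),
                wg   = c(10, 12, 9))

head(D)    # first rows of the data
str(D)     # column names and types
names(D)   # column names only

# Select one column with the $ operator and summarise it.
wg_values <- D$wg
wg_mean   <- mean(D$wg)
```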
![Page 83: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/83.jpg)
Basic Graphics
• Histogram
– hist(D$wg)
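The histogram call can be tried on simulated data. In this sketch the values in `wg` are random numbers generated for illustration, since the slides do not say what the column holds; the title, axis label, and colour are optional extras.

```r
# Simulated values standing in for the wg column (hypothetical data).
set.seed(1)
D <- data.frame(wg = rnorm(200, mean = 50, sd = 10))

# Draw the histogram; hist() also returns the bin breaks and counts.
h <- hist(D$wg,
          main = "Histogram of wg",
          xlab = "wg",
          col  = "grey")

h$breaks   # bin boundaries chosen by R
h$counts   # number of values in each bin
```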
![Page 84: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/84.jpg)
How to create a heat map in R ?
![Page 85: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/85.jpg)
How to create a heat map in R ?
• Three steps:
– Read a CSV file
– Choose the colours for the heat map
– Create the heat map
![Page 86: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/86.jpg)
How to create a heat map in R ?
• Step 1: Read a CSV file
read.csv("FILE NAME", sep=",", header=T)
![Page 87: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/87.jpg)
How to create a heat map in R ?
• Step 1: Read a CSV file
read.csv("FILE NAME", sep=",", header=T)
• Assign it to a variable
Input <- read.csv("FILE NAME", sep=",", header=T)
i.e. with the ‘<’ (less than) and ‘-’ (dash) symbols.
![Page 88: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/88.jpg)
How to create a heat map in R ?
• Step 1: Read a CSV file
![Page 89: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/89.jpg)
How to create a heat map in R ?
• Step 2: Choose the colours for the heat map
colours <- c(0) (Create a placeholder variable to hold the colours)
![Page 90: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/90.jpg)
How to create a heat map in R ?
• Step 2: Choose the colours for the heat map
colours <- c(0)
colours[1] <- "#FDD49E"
colours[2] <- "#FDBB84"
colours[3] <- "#FC8D59"
colours[4] <- "#EF6548"
colours[5] <- "#D7301F"
colours[6] <- "#B30000"
colours[7] <- "#7F0000"
![Page 91: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/91.jpg)
![Page 92: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/92.jpg)
How to create a heat map in R ?
• Step 3: Create the heat map
heatmap(Input1_matrix, scale="col", Rowv = NA, Colv = NA, col=colours)
![Page 93: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/93.jpg)
How to create a heat map in R ?
• Step 3: Create the heat map
heatmap(Input1_matrix, scale="col", Rowv = NA, Colv = NA, col=colours)
Input1_matrix: the input data (a numeric matrix)
![Page 94: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/94.jpg)
How to create a heat map in R ?
• Step 3: Create the heat map
heatmap(Input1_matrix, scale="col", Rowv = NA, Colv = NA, col=colours)
scale: whether to apply scaling to the data. Options are ‘row’, ‘column’ (which ‘col’ abbreviates), and ‘none’.
![Page 95: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/95.jpg)
How to create a heat map in R ?
• Step 3: Create the heat map
heatmap(Input1_matrix, scale="col", Rowv = NA, Colv = NA, col=colours)
Rowv and Colv: leave them as they are! (NA switches off the row and column dendrograms.)
![Page 96: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/96.jpg)
How to create a heat map in R ?
• Step 3: Create the heat map
heatmap(Input1_matrix, scale="col", Rowv = NA, Colv = NA, col=colours)
col: the colours
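The three steps can be put together in one short script. This is a sketch under stated assumptions: the CSV contents are made-up counts written to a temporary file, and the slides do not show how `Input` becomes `Input1_matrix`, so the conversion with `data.matrix()` (first column used as row labels) is one plausible way to bridge that gap rather than the presenter's own code.

```r
# Step 1: read a CSV file (hypothetical counts, written to a temporary file first).
path <- tempfile(fileext = ".csv")
writeLines(c("day,h00,h06,h12,h18",
             "Mon,120,340,910,760",
             "Tue,100,360,880,800",
             "Wed,140,330,930,790",
             "Sat,420,150,700,990",
             "Sun,460,130,650,870"), path)
Input <- read.csv(path, sep = ",", header = TRUE)

# Assumption: the first column holds row labels and the rest are numeric.
# heatmap() needs a numeric matrix, so convert the remaining columns.
Input1_matrix <- data.matrix(Input[, -1])
rownames(Input1_matrix) <- Input$day

# Step 2: choose the colours (the seven shades from the slides, light to dark).
colours <- c("#FDD49E", "#FDBB84", "#FC8D59", "#EF6548",
             "#D7301F", "#B30000", "#7F0000")

# Step 3: create the heat map. Rowv = NA and Colv = NA keep the original
# row and column order and suppress the dendrograms.
hm <- heatmap(Input1_matrix, scale = "column",
              Rowv = NA, Colv = NA, col = colours)
```

Building the colour vector in a single `c()` call gives the same result as assigning the seven elements one at a time.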
![Page 97: Open Data: Analysis and Visualisation](https://reader033.vdocuments.site/reader033/viewer/2022061306/54b3b5744a795958348b45fe/html5/thumbnails/97.jpg)
Any Questions ?
• Open Data
• Crowd-Sourced Data (Social Media)
• Analysis and Visualisation Challenges
• Twitter Case Study
– Spatial Analysis
– Temporal Analysis
• R
– A brief introduction
– How to create heat maps