Constantinos Antoniou (TUM) | Workshop on Smart Mobility | 1-2 June 2017
Emmanouil Chaniotakis and Constantinos Antoniou
Chair of Transportation Systems Engineering
Department of Civil, Geo and Environmental Engineering
Technical University of Munich (TUM)
Inferring Activities From Social Media Data
Facebook the 3rd most visited website
Twitter the 13th most visited website
Facebook 1.09 billion daily active users
Twitter 100 million daily active users
Motivation
[Figure labels: Facebook, Foursquare, Twitter, Travel Diary Survey]
Two major challenges
• Functional use of Social Media
• Data availability
Social Media in Transportation: Challenges
Social Media Functionalities
Chaniotakis, Antoniou, Pereira. "Mapping Social Media for Transportation Studies." IEEE Intelligent Systems 31.6 (2016): 64-70.
Social Media Data Availability
Chaniotakis, Antoniou, Pereira. "Mapping Social Media for Transportation Studies." IEEE Intelligent Systems 31.6 (2016): 64-70.
Social Media in Transportation
Chaniotakis, Antoniou, Pereira. "Mapping Social Media for Transportation Studies." IEEE Intelligent Systems 31.6 (2016): 64-70.
Empirical – Exploratory Study for understanding Social Media use
Goals
Empirically compare travel demand from Social Media and a conventional travel survey
Activities (based on destination type)
Temporal characteristics
• Distribution comparison
• Correlations in periods
Spatial Distribution
• Zonal Visits
• Concentration of Arrivals
Social Media for Travel Demand Data
Chaniotakis, Antoniou, Salanova, Dimitriou. "Can Social Media data augment travel demand survey data?" Intelligent Transportation Systems (ITSC), 2016 IEEE 19th International Conference on. IEEE, 2016.
Datasets
• 3 Social Media
  • Facebook
  • Twitter
  • Foursquare
• Recent travel diary survey
Study Area
• Thessaloniki, Greece
• 2nd largest city in Greece
• 1 million inhabitants (metropolitan area)
• Moderate Social Media use
1st Case Study: SM vs. Surveys
City Centre
Chaniotakis, Antoniou, Salanova, Dimitriou. "Can Social Media data augment travel demand survey data?" Intelligent Transportation Systems (ITSC), 2016 IEEE 19th International Conference on. IEEE, 2016.
Data from 3 SM sources:
• Twitter
• Facebook
• Foursquare
In all 3 cases
• Geographically bounded data collection
• Use of public APIs
• Automatic data collection programs
• Periodic collection of data
Social Media Data
Chaniotakis, Antoniou, Salanova, Dimitriou. "Can Social Media data augment travel demand survey data?" Intelligent Transportation Systems (ITSC), 2016 IEEE 19th International Conference on. IEEE, 2016.
Data Collection of both Real-time and Historical (per user) data:
• 7,856 Twitter users using Real-time data collection (~1 year)
• Users' timeline data collection (tweets not restricted to Thessaloniki)
• 49,169 geotagged tweets (only in Thessaloniki)
Type of information:
• Unique user ID
• Unique tweet ID
• Tweet text
• Time of tweet
• (lon, lat)
Twitter Data
Chaniotakis, Antoniou, Salanova, Dimitriou. "Can Social Media data augment travel demand survey data?" Intelligent Transportation Systems (ITSC), 2016 IEEE 19th International Conference on. IEEE, 2016.
Data Collection of venues check-ins (public data)
• 30 min query interval
• Area based search → venues & characteristics
• Centre (lat, lon) & radius (150 m)
• Only in the city centre of Thessaloniki (due to API request number limitations)
7,511 Venues from Facebook
9,135 Venues from Foursquare
Type of Information
• Venue ID
• Venue name
• Venue category
• Total number of visitors
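The area-based search above (a centre point plus a 150 m radius) implies that a study area has to be tiled with query circles. The slides do not say how the centres were chosen; the following is only a minimal sketch that assumes a square grid spaced so that neighbouring circles overlap. The function names, the 5% safety margin and the metres-per-degree constant are illustrative assumptions, not part of the original study.

```python
import math

EARTH_RADIUS_M = 6_371_000.0
METRES_PER_DEGREE = 111_320.0  # rough length of one degree of latitude (assumption)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def query_centres(south, west, north, east, radius_m=150.0):
    """Return (lat, lon) centres whose circles of radius_m cover the bounding box.

    A square grid with spacing r * sqrt(2) is the widest one that leaves no gaps;
    the spacing is shrunk by 5% to absorb the flat-earth approximation.
    """
    step_m = radius_m * math.sqrt(2) * 0.95
    mid_lat = (south + north) / 2
    dlat = step_m / METRES_PER_DEGREE
    dlon = step_m / (METRES_PER_DEGREE * math.cos(math.radians(mid_lat)))
    n_rows = math.ceil((north - south) / dlat)
    n_cols = math.ceil((east - west) / dlon)
    return [
        (south + (i + 0.5) * dlat, west + (j + 0.5) * dlon)
        for i in range(n_rows)
        for j in range(n_cols)
    ]
```

Each returned centre would correspond to one API request per 30 min query interval, which is consistent with the request-limit restriction to the city centre mentioned above.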
Facebook & Foursquare Data
Chaniotakis, Antoniou, Salanova, Dimitriou. "Can Social Media data augment travel demand survey data?" Intelligent Transportation Systems (ITSC), 2016 IEEE 19th International Conference on. IEEE, 2016.
Activities
Chaniotakis, Antoniou, Salanova, Dimitriou. "Can Social Media data augment travel demand survey data?" Intelligent Transportation Systems (ITSC), 2016 IEEE 19th International Conference on. IEEE, 2016.
Temporal Analysis (1)
Chaniotakis, Antoniou, Salanova, Dimitriou. "Can Social Media data augment travel demand survey data?" Intelligent Transportation Systems (ITSC), 2016 IEEE 19th International Conference on. IEEE, 2016.
Temporal Analysis (2)
Chaniotakis, Antoniou, Salanova, Dimitriou. "Can Social Media data augment travel demand survey data?" Intelligent Transportation Systems (ITSC), 2016 IEEE 19th International Conference on. IEEE, 2016.
• Text and geolocation are offered by one SM platform
• Location characterization is offered by another
• The connection between them may be available (URL)
• User-centric approach is possible
2nd Case study: Inferring Activities from Social Media
Possibly the quietest Wagamama I've ever been in - and in the middle of London!! https://t.co/pY3UVIV4iw
[Diagram labels: Twitter Dataset; Recurrence Classification (per user); Data Enrichment {subset}; Twitter & 4SQ Dataset; Location Type Text Classification {trained classification}]
Chaniotakis, E., Antoniou, C., Aifadopoulou, G., & Dimitriou, L., 2017. "Inferring Activities from Social Media Data." In Transportation Research Record (accepted).
Inferring Activities from Social Media (2)
[Flowchart, Start to End, in three stages operating on the Twitter Dataset and the Foursquare Dataset]
Recurrence classification
• Definition of DBSCAN parameters
• Per user DBSCAN application
• Characterization of tweets' clusters
• Output: Tweets dataset classified on recurrence
Enrichment
• Twitter Data Enrichment (Figure 1) {subset}
• Output: Combined Twitter & Foursquare Dataset
Location Characterization
• Text cleaning
• Document Term Matrix Creation
• Machine Learning Algorithms application {trained classification}
• Output: Enriched – Classified Dataset, Identified Activities
Chaniotakis, E., Antoniou, C., Aifadopoulou, G., & Dimitriou, L., 2017. "Inferring Activities from Social Media Data." In Transportation Research Record (accepted).
• 2 years' data collected from London (Twitter API)
• 482,883 unique users
• Collected timeline (for a random sample of 90,000 users)
• 11,060,814 tweets in total
Datasets (1)
Chaniotakis, E., Antoniou, C., Aifadopoulou, G., & Dimitriou, L., 2017. "Inferring Activities from Social Media Data." In Transportation Research Record (accepted).
Subset Formation
• 101.8 tweets with a standard deviation of 311.4
• 0.8% standard deviation of percentage posted per day (close to uniform)
Datasets (2)
London Twitter Dataset
• 482,883 Users
• 8,141,996 geotagged tweets
• 3,764,230 URLs
• 220,118 Foursquare URLs
{subset} {nTweet > 22 & mean dist > 1}
Examined Dataset
• 50,344 Users
• 5,080,362 geotagged tweets
• 2,127,071 URLs
• 145,192 Foursquare URLs
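The subset rule {nTweet > 22 & mean dist > 1} can be read as a per-user filter. The slide does not define "mean dist" or its unit, so the sketch below makes an explicit assumption: it takes the mean great-circle distance between a user's consecutive geotagged tweets, in kilometres. The function names are hypothetical, and the original analysis may have used a different distance definition.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def mean_consecutive_distance_km(points):
    """Mean distance between consecutive (lat, lon) points; 0.0 for fewer than two."""
    if len(points) < 2:
        return 0.0
    dists = [haversine_km(*a, *b) for a, b in zip(points, points[1:])]
    return sum(dists) / len(dists)


def keep_user(points, min_tweets=22, min_mean_dist=1.0):
    """Apply the subset rule: more than 22 tweets and mean distance above 1 (assumed km)."""
    return len(points) > min_tweets and mean_consecutive_distance_km(points) > min_mean_dist
```

Under this reading, a user who always posts from the same spot is dropped regardless of how many tweets they have, which fits the goal of keeping users whose posts reveal movement.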
Chaniotakis, E., Antoniou, C., Aifadopoulou, G., & Dimitriou, L., 2017. "Inferring Activities from Social Media Data." In Transportation Research Record (accepted).
• 8.3% of subset have posted at least one 4SQ link
• 39% of Twitter-4SQ users' posts include a 4SQ link
Density-based Spatial Clustering of Applications with Noise (DBSCAN)
• Point parameter (Eps) = 0.002 (GPS accuracy)
• Clustering > 5 posts per location
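To make the per-user DBSCAN step concrete, here is a minimal pure-Python sketch of the textbook algorithm applied to each user's points separately. It assumes that Eps = 0.002 is expressed in decimal degrees and compared with plain Euclidean distance in (lon, lat) space, and it reads "> 5 posts per location" as a minimum of 6 points (the point itself included); both readings are assumptions, as are all function names. The original study presumably used an existing library implementation.

```python
from collections import defaultdict
from math import hypot

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of point i (including i itself)."""
    xi, yi = points[i]
    return [j for j, (x, y) in enumerate(points) if hypot(x - xi, y - yi) <= eps]


def dbscan(points, eps=0.002, min_pts=6):
    """Return one label per point: 0, 1, ... for clusters, NOISE for outliers."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins the cluster, is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = region_query(points, j, eps)
            if len(reachable) >= min_pts:  # core point: keep growing the cluster
                queue.extend(reachable)
    return labels


def cluster_per_user(tweets, eps=0.002, min_pts=6):
    """tweets: iterable of (user_id, lon, lat). Returns {user_id: labels}."""
    by_user = defaultdict(list)
    for user_id, lon, lat in tweets:
        by_user[user_id].append((lon, lat))
    return {user: dbscan(pts, eps, min_pts) for user, pts in by_user.items()}
```

Running the clustering per user rather than on the pooled data means a cluster represents a location that one person returns to, which is what the recurrence classification needs.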
Recurrence Classification (1)
Chaniotakis, E., Antoniou, C., Aifadopoulou, G., & Dimitriou, L., 2017. "Inferring Activities from Social Media Data." In Transportation Research Record (accepted).
• Average number of clusters = 2.41 (standard deviation of 3.18)
• Significant number of users who tend to post only in locations
• Posted activities are performed scarcely
• From 65,806 tweets with activities → 172,675 known tweets
Recurrence Classification (2)
Chaniotakis, E., Antoniou, C., Aifadopoulou, G., & Dimitriou, L., 2017. "Inferring Activities from Social Media Data." In Transportation Research Record (accepted).
Combine 4SQ & Twitter Data
Data Enrichment
[Flowchart: Start → Twitter posts (Twitter API) → tweet post text → URL extraction (regex) → URL extension → 4SQ link? → if yes: 4SQ query (4SQ API) → JSON parser → combined information (tweets with venue information); repeated while n<N → End]
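The enrichment flow above (URL extraction by regex, URL extension, Foursquare link check, venue query) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the regular expression, the list of Foursquare host names, and the `expand` and `lookup_venue` callables are all hypothetical. In practice `expand` would follow the redirect of a shortened link and `lookup_venue` would call the Foursquare API and parse its JSON response; both are injected here so the sketch stays free of network calls.

```python
import re
from urllib.parse import urlparse

URL_RE = re.compile(r"https?://\S+")

# Host names assumed to indicate a Foursquare/Swarm check-in link.
FOURSQUARE_HOSTS = {
    "foursquare.com",
    "www.foursquare.com",
    "4sq.com",
    "swarmapp.com",
    "www.swarmapp.com",
}


def extract_urls(text):
    """Find URLs in tweet text, dropping trailing punctuation."""
    return [url.rstrip(".,;:!?)") for url in URL_RE.findall(text)]


def is_foursquare(url):
    """True if the (expanded) URL points at an assumed Foursquare host."""
    return (urlparse(url).hostname or "").lower() in FOURSQUARE_HOSTS


def enrich(tweets, expand, lookup_venue):
    """Attach venue information to tweets that link to a Foursquare check-in.

    tweets: list of dicts with at least a "text" key.
    expand: callable(short_url) -> full URL or None (hypothetical helper).
    lookup_venue: callable(full_url) -> venue dict or None (hypothetical helper).
    """
    combined = []
    for tweet in tweets:  # presumably the "n<N" loop of the flowchart
        for short_url in extract_urls(tweet["text"]):
            full_url = expand(short_url) or short_url
            if not is_foursquare(full_url):
                continue
            venue = lookup_venue(full_url)
            if venue is not None:
                combined.append({**tweet, "venue": venue})
                break  # one venue per tweet is enough for this sketch
    return combined
```

Tweets without a Foursquare link are simply left out of the combined dataset, matching the "No" branch of the flowchart.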
Chaniotakis, E., Antoniou, C., Aifadopoulou, G., & Dimitriou, L., 2017. "Inferring Activities from Social Media Data." In Transportation Research Record (accepted).
• Manually aggregated in 14 categories
• Tendency towards leisure activities
• Education and work more highly represented on weekdays
4SQ Activities Distribution
Chaniotakis, E., Antoniou, C., Aifadopoulou, G., & Dimitriou, L., 2017. "Inferring Activities from Social Media Data." In Transportation Research Record (accepted).
• Linked tweet text with activities
• Removed stop words, punctuation
Automatic Text Classification:
• Support Vector Machine (SVM)
• Generalized Linear Model via Penalized Maximum Likelihood (GLMPML)
• Maximum Entropy (MAXENT)
Setup:
• RTextTools library
• Repeated 10 times
• 20,000 randomly selected cases
• 85% (17,000 entries) training set
• 15% (3,000 remaining entries) testing set
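The preparation steps listed above (text cleaning, document-term matrix, repeated 85%/15% splits) can be illustrated with a small standard-library sketch. The study itself used the R package RTextTools; the Python below is not a port of it and trains no classifier. The stop-word list is a tiny illustrative stand-in, and all function names and the seed are assumptions.

```python
import random
import re
import string
from collections import Counter

# Tiny illustrative stop-word list; a real run would use a full list.
STOP_WORDS = {"a", "an", "and", "at", "been", "ever", "for", "i", "in",
              "is", "it", "of", "on", "or", "the", "to"}


def clean(text):
    """Lower-case, drop URLs and ASCII punctuation, remove stop words."""
    text = re.sub(r"https?://\S+", " ", text.lower())
    text = text.translate(str.maketrans("", "", string.punctuation))
    return [word for word in text.split() if word not in STOP_WORDS]


def document_term_matrix(docs):
    """Return (vocabulary, rows) where rows[d][t] counts term t in document d."""
    tokenised = [clean(doc) for doc in docs]
    vocab = sorted({word for tokens in tokenised for word in tokens})
    index = {word: i for i, word in enumerate(vocab)}
    rows = []
    for tokens in tokenised:
        row = [0] * len(vocab)
        for word, count in Counter(tokens).items():
            row[index[word]] = count
        rows.append(row)
    return vocab, rows


def repeated_splits(n_cases, train_share=0.85, repeats=10, seed=0):
    """Yield (train_indices, test_indices) for each of the repeated random splits."""
    rng = random.Random(seed)
    n_train = round(n_cases * train_share)
    for _ in range(repeats):
        indices = list(range(n_cases))
        rng.shuffle(indices)
        yield indices[:n_train], indices[n_train:]
```

With `n_cases=20000` each split yields 17,000 training and 3,000 testing indices, matching the figures on the slide; averaging a metric over the 10 splits gives the mean and standard deviation style of reporting used in the results.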
Location Type Classification
Chaniotakis, E., Antoniou, C., Aifadopoulou, G., & Dimitriou, L., 2017. "Inferring Activities from Social Media Data." In Transportation Research Record (accepted).
Acceptable results for MAXENT and GLM classification
• Limited text in Twitter
• Little manual intervention
• Limited cases used
Classification Results
Classifier  Avg. Precision  Precision SD  Avg. Recall  Recall SD  Avg. F-Score  F-Score SD
SVM         0.6374          0.0104        0.5808       0.0132     0.6006        0.0117
MAXENT      0.8060          0.0112        0.7773       0.0081     0.7881        0.0080
GLM         0.8392          0.0078        0.6752       0.0068     0.7249        0.0064
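For readers unfamiliar with the reported metrics, the sketch below shows how precision, recall and F-score are derived per class from a confusion matrix and then averaged. The slides do not state how the averages were formed; macro-averaging over classes is assumed here, and the function names are hypothetical. The reported average F-scores differ slightly from the harmonic mean of the average precision and recall, which is what one would expect if F-scores were averaged per class or per run rather than computed from the averages.

```python
def per_class_metrics(confusion, labels):
    """confusion[i][j] = number of cases of true class i predicted as class j.

    Returns {label: (precision, recall, f_score)}.
    """
    metrics = {}
    for k, label in enumerate(labels):
        true_positive = confusion[k][k]
        predicted = sum(row[k] for row in confusion)  # column total
        actual = sum(confusion[k])                    # row total
        precision = true_positive / predicted if predicted else 0.0
        recall = true_positive / actual if actual else 0.0
        denominator = precision + recall
        f_score = 2 * precision * recall / denominator if denominator else 0.0
        metrics[label] = (precision, recall, f_score)
    return metrics


def macro_average(metrics):
    """Unweighted mean of (precision, recall, f_score) over all classes."""
    n = len(metrics)
    return tuple(sum(values[i] for values in metrics.values()) / n for i in range(3))
```

The same per-class view is what makes frequently confused pairs visible: a low precision for one class together with a low recall for another points at cases flowing between the two.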
Chaniotakis, E., Antoniou, C., Aifadopoulou, G., & Dimitriou, L., 2017. "Inferring Activities from Social Media Data." In Transportation Research Record (accepted).
Max Entropy
• Confusion Matrix for Max Entropy
• Most confused: Bar–Pub and Restaurant pair
Classification Results
Max Entropy
Chaniotakis, E., Antoniou, C., Aifadopoulou, G., & Dimitriou, L., 2017. "Inferring Activities from Social Media Data." In Transportation Research Record (accepted).
Increased Data Availability from fused - diverse datasets
Data Enrichment based on both user-centric spatial and link information
Unsupervised Text Classification for Activity Generalization
Decoded social media data into useful, inexpensive information concerning activities
Large span of time in Social Media Data → long-term activity-chains
Can be used to supplement travel surveys
Transferability across places and platforms
Conclusions
Not activity-related tweets
Cross-validation of activities and user generated content information
Use of other classification methods
Expand the framework to other activity related aspects of tweets
Identify SM related biases
Future Work (short-term)
Infer mobility patterns and information from various data sources
• Traditional surveys
• Social media data
• Vehicle data (passenger, taxis, commercial, public transport, …)
• Telco data (cell-phone, wifi, etc.)
• Ride-sharing, car-sharing data
• …
Future Work (mid-term)
Thank you for your attention!
Prof. Constantinos Antoniou
Chair of Transportation Systems Engineering
Department of Civil, Geo and Environmental Engineering
Technical University of Munich