Exploring Metropolitan Dynamics with an Agent-Based Model Calibrated using Social Network Data

Nick Malleson & Mark Birkin
School of Geography, University
http://www.geog.leeds.ac.uk/people/n.malleson
http://nickmalleson.co.uk/

Post on 17-Feb-2016



ABSTRACT

Nick Malleson and Mark Birkin

Model calibration and validation are two of the most contentious issues surrounding the use of agent-based models and, in these respects, models of social phenomena are lagging behind other fields. For example, agent-based models rarely use dynamic data streams to improve their predictions as new social data become available. Contrast this with meteorology, where daily weather updates are constantly used to improve model predictions.

Data availability has been largely responsible for these problems, but in recent years new data sources have become available that contain a wealth of information about people's spatio-temporal behaviour. These are often spatially referenced, individual-level, updated continuously and have the potential to revolutionise our understanding of social phenomena and our approach to model evaluation.

The original feature of this research is the development of a framework for calibrating an agent-based model of metropolitan dynamics using a novel set of crowd-sourced data from Twitter. In these early research stages we incorporate two general dynamic behaviours, 'at home' and 'away from home', but, as our paper will show, a more advanced analysis of the data can provide a much greater level of detail about people's daily behaviours (such as 'working', 'shopping', 'education' etc.). Through this modelling process we seek to understand not just residential patterns within the city, but the dynamic ebb and flow of the population in everyday metropolitan life.

Outline

Research aim: develop a model of urban dynamics, calibrated using novel crowd-sourced data.
Background:
    Data for evaluating agent-based models
    Crowd-sourced data
Data and study area: Twitter in Leeds
Establishing behaviour from tweets
Integrating with a model of urban dynamics

Agent-Based Modelling

Autonomous, interacting agents
Represent individuals or groups
Usually spatial
Model social phenomena from the ground up
A natural way to describe systems
Ideal for social systems

Advantages of ABM

More natural for social systems than statistical approaches
Dynamic history of system
Can include physical space / social processes in models of social systems
Designed at abstract level: easy to change scale
Bridge between verbal theories and mathematical models

Disadvantages of ABM

Single model run reveals a theorem, but no information about robustness
Computationally expensive
Sensitivity analysis and many runs required
Small errors can be replicated in many agents
Methodological individualism
Modelling soft human factors
Lack of individual-level data for evaluation

Data in Agent-Based Models

Data required at every stage:
    Understanding the system
    Calibrating the model
    Validating the model
But high-quality data are hard to come by
    Many sources are too sparse, low spatial/temporal resolution
    Censuses focus on attributes rather than behaviour and occur infrequently

Crowd-Sourced Data for Social Science

"Crisis in empirical sociology" (Savage and Burrows, 2007)
    Traditional surveys are small and occur infrequently
    Often focus on population attributes rather than behaviour
    Often spatially / demographically aggregated
    http://www.guardian.co.uk/p/33p85
These are being superseded
    "Knowing capitalism": Amazon.com purchasing suggestions / supermarket reward cards
    Crowd-sourced data / volunteered geographical information, e.g. OpenStreetMap, Flickr, Twitter, FourSquare, Facebook
Potentially very useful for agent-based models
    Calibration / validation
    Evaluating models in situ

Data and Study Area

Twitter
    Social networking / microblogging service
    Users create public tweets of up to 140 characters
    For the most part, tweets are publicly available
    Include information about user, time/date, location, text etc.
    Streaming API provides real-time access to tweets
Collected data
    1.2M+ geo-located tweets around Leeds (June 2011 to March 2012)
    403,922 tweets within district
    2,683 individual users
    Highly skewed (10% of all tweets from 8 most prolific users)
    Filtered non-people
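The skew figure quoted on this slide (the share of all tweets produced by the most prolific users) is simple to compute once each tweet is tagged with a user identifier. The following is a minimal sketch only; the function name `top_user_share` and the plain list of user identifiers are illustrative assumptions, not part of the authors' pipeline.

```python
from collections import Counter

def top_user_share(user_ids, k=8):
    """Fraction of all tweets written by the k most prolific users.

    user_ids: one entry per tweet, giving the (hypothetical) author id.
    """
    counts = Counter(user_ids)
    if not counts:
        return 0.0
    # most_common(k) returns the k users with the highest tweet counts
    top = sum(n for _, n in counts.most_common(k))
    return top / sum(counts.values())

# Toy data: user "a" wrote 6 of 10 tweets, "b" wrote 3, "c" wrote 1
example = ["a"] * 6 + ["b"] * 3 + ["c"]
share_top1 = top_user_share(example, k=1)
share_top2 = top_user_share(example, k=2)
```

A high share for a small k is the signal that a handful of accounts could distort aggregate spatial patterns.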

Temporal Trends

Hourly peak in activity at 10pm
Daily peak on Tuesday to Thursday
General increase in activity over time

Spatial Overview

Point density appears to cluster around urban centres
Also able to distinguish roads in non-urban areas
General pattern somewhat distorted by locations of prolific users

Analysis of Individual Behaviour: Anchor Points

Spatial analysis to identify the home locations of individual users
Some clear spatio-temporal behaviour (e.g. commuting, socialising etc.)
Estimate home and then calculate distance from home at different times
Journey to work?

More important than aggregate patterns, we can identify the behaviour of individual users
Estimate home and then calculate distance at different times
Could estimate journey times, means of travel etc.
Very useful for calibration of an ABM
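The slides do not say which spatial analysis is used to find each user's home. As a minimal sketch, assume home is the centre of the grid cell in which a user tweets most often, and measure distance from home with the haversine formula. The function names, the `(hour, lat, lon)` tuple layout and the 0.001 degree cell size are all illustrative assumptions.

```python
import math
from collections import Counter

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def estimate_home(tweets, cell=0.001):
    """Assumed rule: home is the centre of the most-visited grid cell.

    tweets: list of (hour, lat, lon) tuples for a single user.
    """
    cells = Counter((round(lat / cell), round(lon / cell)) for _, lat, lon in tweets)
    (ci, cj), _ = cells.most_common(1)[0]
    return ci * cell, cj * cell

def mean_distance_by_hour(tweets, home):
    """Mean distance from home (km) for each hour that has at least one tweet."""
    totals = {}
    for hour, lat, lon in tweets:
        d = haversine_km(home[0], home[1], lat, lon)
        s, n = totals.get(hour, (0.0, 0))
        totals[hour] = (s + d, n + 1)
    return {h: s / n for h, (s, n) in totals.items()}

# Toy user: three evening tweets at one location, one afternoon tweet elsewhere
tweets = [(22, 53.80, -1.55), (23, 53.80, -1.55), (7, 53.80, -1.55), (14, 53.81, -1.50)]
home = estimate_home(tweets)
profile = mean_distance_by_hour(tweets, home)
```

A per-hour distance profile like this is the kind of output that could suggest a journey to work, although a real analysis would need to handle users with few tweets and several frequently visited places.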

Spatio-Temporal Behaviour

Activity Matrices (I)

Once the home location has been estimated, it is possible to build a profile of each user's daily activity
The most common behaviour at a given time period takes precedence
Raw behavioural profiles

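Interpolating to remove no-data

Legend: At Home / Away from Home / No Data

Raw profiles (columns are hours of the day, 0 to 23):

User   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
a      3  0  0  0  0  0  0  0  0  0  0  3  0  0  3  0  3  3  3  3  3  3  3  3
b      3  3  0  0  0  3  3  0  0  0  1  3  1  0  3  0  0  3  3  0  1  3  0  0
c      1  3  0  0  0  0  0  0  1  1  0  1  0  1  0  1  1  1  1  1  1  1  1  1
d      0  0  0  0  0  0  0  0  0  1  0  0  3  0  3  1  1  1  1  1  1  1  1  1
e      0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
f      1  0  0  0  0  3  3  3  3  3  3  3  3  3  3  3  3  1  3  3  3  1  1  1
g      0  0  0  0  0  0  0  0  3  3  0  0  1  3  0  0  0  0  3  3  3  0  3  0
h      0  0  0  0  0  0  0  3  3  0  0  0  3  0  0  0  0  0  0  0  0  0  0  0
i      0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  1  1  0  0  0  0  0  0

After interpolation:

User   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
a      3  3  3  3  3  3  3  3  3  3  3  3  3  3  3  3  3  3  3  3  3  3  3  3
b      3  3  3  3  3  3  3  3  3  1  1  3  1  3  3  3  3  3  3  3  1  3  1  1
c      1  3  3  3  3  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
d      1  1  1  1  1  1  1  1  1  1  1  3  3  3  3  1  1  1  1  1  1  1  1  1
e      1  1  1  1  1  1  1  3  3  3  3  3  3  3  3  3  3  3  1  1  1  1  1  1
f      1  1  1  3  3  3  3  3  3  3  3  3  3  3  3  3  3  1  3  3  3  1  1  1
g      3  3  3  3  3  3  3  3  3  3  3  1  1  3  3  3  3  3  3  3  3  3  3  3
h      1  1  1  1  3  3  3  3  3  3  3  3  3  3  3  3  3  3  1  1  1  1  1  1
i      1  1  1  1  1  1  1  3  3  3  3  3  3  3  3  1  1  1  1  1  1  1  1  1

Activity Matrices (II)

Overall, activity matrices appear reasonably realistic
    Peak in away from home at ~2pm
    Peak in at home activity at ~10pm
Next stages:
    Develop a more intelligent interpolation algorithm (borrow from GIS?)
    Spatio-temporal text mining routines to use textual content to improve behaviour classification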
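The two steps described on these slides (take the most common behaviour in each hour, then fill hours with no data) can be sketched as below. This is not the authors' routine, which the slides do not specify. The code-to-label mapping (0 = no data, 1 = at home, 3 = away from home), the 0.5 km home radius and the nearest-observed-hour fill rule are all assumptions made for illustration.

```python
from collections import Counter

# Assumed mapping of the numeric codes in the activity matrices
NO_DATA, AT_HOME, AWAY = 0, 1, 3

def daily_profile(observations, home_radius_km=0.5):
    """Build a 24-hour profile from (hour, distance_from_home_km) pairs.

    The most common behaviour in each hour takes precedence; if two
    behaviours tie, the one observed first in that hour is kept.
    """
    votes = [Counter() for _ in range(24)]
    for hour, dist in observations:
        votes[hour][AT_HOME if dist <= home_radius_km else AWAY] += 1
    return [v.most_common(1)[0][0] if v else NO_DATA for v in votes]

def fill_nearest(profile, no_data=NO_DATA):
    """Replace no-data hours with the value of the nearest observed hour.

    Distance is measured around the clock (hour 23 is adjacent to hour 0).
    Ties resolve to the smallest hour index. A profile with no
    observations at all is returned unchanged.
    """
    n = len(profile)
    observed = [h for h, v in enumerate(profile) if v != no_data]
    if not observed:
        return list(profile)
    filled = list(profile)
    for h, v in enumerate(profile):
        if v == no_data:
            nearest = min(observed, key=lambda o: min((h - o) % n, (o - h) % n))
            filled[h] = profile[nearest]
    return filled

# Raw row for user f, copied from the activity matrix
raw_f = [1, 0, 0, 0, 0, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1, 3, 3, 3, 1, 1, 1]
filled_f = fill_nearest(raw_f)
```

For user f this simple rule happens to give the same row as the interpolated matrix on the slide, but it does not reproduce every row (user i, for example, would be filled entirely with 1), so the authors' interpolation evidently uses a different or additional rule.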

Towards a Model of Urban Dynamics: Design

Use microsimulation to synthesise an initial population (all residents in a city)
Estimate where people go to work
Estimate when people go to work and how long they spend there (initial model parameters)
Calibrate these parameters to data from Twitter (e.g. activity matrices) using a genetic algorithm
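To make the calibration step concrete, the following is a toy sketch of a genetic algorithm that tunes two parameters, the hour an agent leaves home and the number of hours spent away, so that a simulated 24-hour away-from-home curve matches an observed one. The single-agent `simulate` function, the squared-error fitness, truncation selection and all parameter values are assumptions chosen for illustration; the slides do not describe the authors' model or genetic algorithm in this detail.

```python
import random

HOURS = 24

def simulate(leave, duration):
    """Toy model: 1 if the agent is away from home in hour h, else 0."""
    return [1 if (h - leave) % HOURS < duration else 0 for h in range(HOURS)]

def error(params, observed):
    """Sum of squared differences between simulated and observed curves."""
    return sum((m - o) ** 2 for m, o in zip(simulate(*params), observed))

def calibrate(observed, pop_size=30, generations=40, mutation_rate=0.2, seed=0):
    """Return (best_params, history of best error per generation)."""
    rng = random.Random(seed)
    pop = [(rng.randrange(HOURS), rng.randint(1, HOURS - 1)) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda p: error(p, observed))
        history.append(error(pop[0], observed))
        # Truncation selection with elitism: the better half survives unchanged
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = (a[0], b[1])  # crossover: leave hour from a, duration from b
            if rng.random() < mutation_rate:
                # Mutation: nudge both parameters by one hour, staying in range
                child = ((child[0] + rng.choice((-1, 1))) % HOURS,
                         min(HOURS - 1, max(1, child[1] + rng.choice((-1, 1)))))
            children.append(child)
        pop = parents + children
    best = min(pop, key=lambda p: error(p, observed))
    history.append(error(best, observed))
    return best, history

# Synthetic "observed" curve: away from 08:00 for nine hours
observed = simulate(8, 9)
best, history = calibrate(observed)
```

In practice the observed curve would come from the Twitter-derived activity matrices (for example, the fraction of users away from home in each hour) and the model would be the full agent-based simulation, which makes each fitness evaluation far more expensive than in this toy case.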

Prototype Model

Conclusions & Future Work

New crowd-sourced data can help to improve social models
Improved identification of behaviour
    Spatio-temporal text mining
    Methods to classify text based on spatio-temporal location as well as textual content
In situ model calibration

Thank you

Nick Malleson, School of Geography, University of Leeds

http://www.geog.leeds.ac.uk/people/n.malleson
http://nickmalleson.co.uk/