research article elm meets urban big data...

11
Research Article ELM Meets Urban Big Data Analysis: Case Studies Ningyu Zhang, 1 Huajun Chen, 2 Xi Chen, 2 and Jiaoyan Chen 2 1 College of Computer Science and Technology, Zhejiang University, Zhejiang 310014, China 2 Zhejiang University, Zhejiang, China Correspondence should be addressed to Ningyu Zhang; [email protected] Received 10 May 2016; Revised 3 July 2016; Accepted 1 August 2016 Academic Editor: Adel Alimi Copyright © 2016 Ningyu Zhang et al. is is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. In the latest years, the rapid progress of urban computing has engendered big issues, which creates both opportunities and challenges. e heterogeneous and big volume of data and the big difference between physical and virtual worlds have resulted in lots of problems in quickly solving practical problems in urban computing. In this paper, we propose a general application framework of ELM for urban computing. We present several real case studies of the framework like smog-related health hazard prediction and optimal retain store placement. Experiments involving urban data in China show the efficiency, accuracy, and flexibility of our proposed framework. 1. Introduction Urban computing is a process of acquisition, integration, and analysis of big and heterogeneous data generated by diverse sources in urban spaces, such as sensors, devices, vehicles, buildings, and humans, to tackle the major issues that cities face (e.g., air pollution, increased energy consumption, and traffic congestion) [1]. With the rapid progress of urban computing, lots of applications have appeared to analyze amounts of urban data using big data analysis technologies for real practical problems. Most of them use conventional machine learning and computational intelligence solutions. However, with the exponential growth of data and complexity of systems, those methods have suffered lots of bottlenecks in learning (e.g., intensive human intervention and convergence time). In fact, the urban data acquired in real situations normally are heterogeneous and in big volumes. us, according to studies [2], treating features extracted from different data sources equally does not achieve the best performance. Moreover, most of the models of applications are hard to implement in real situations because of the complexity and sparsity of some data. erefore, fast machine learning and computational intelligence techniques are needed. Extreme learning machine (ELM) [3] can be a good solu- tion for the learning of large sample data as it has high generalization performance and fast training speed. More- over, the stacked ELMs [4] perform well when facing extremely large and complex training datasets. Indeed, ELM is flexible and easy to implement. In this paper, we propose a general application framework of ELM for urban com- puting. To solve challenges in urban computing using ELM, we firstly analyze the types, sources, and structures of urban data. We standardize the form of data from different sources and fuse data across modalities as input of ELM. In fact, the urban data can be obtained from online social media and offline physical sensors. Online social media (twitter, user comments) have opinions about places. ese data are normally text (user ratings are numeric) and uncertain. e bad situation is that the amounts of these data are influenced by the populations or type of district to a great extent. For instance, it is quite easy to retrieve amounts of data in a metropolis. However, less-developed regions have small populations and, hence, relatively low social media activity. Meanwhile, offline data reflect the physical condition of this region such as the flow of taxis and buses, traffic congestion index, real estate, POIs, road network, air quality, and mete- orological elements. e amounts of these data in different regions are usually similar. Nevertheless, these data are from different sources and are heterogeneous. We divide the city into regions by the road network. We obtain huge amounts Hindawi Publishing Corporation Computational Intelligence and Neuroscience Volume 2016, Article ID 4970246, 10 pages http://dx.doi.org/10.1155/2016/4970246

Upload: others

Post on 20-Jun-2020

10 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Research Article ELM Meets Urban Big Data …downloads.hindawi.com/journals/cin/2016/4970246.pdfResearch Article ELM Meets Urban Big Data Analysis: Case Studies NingyuZhang, 1 HuajunChen,

Research ArticleELM Meets Urban Big Data Analysis Case Studies

Ningyu Zhang1 Huajun Chen2 Xi Chen2 and Jiaoyan Chen2

1College of Computer Science and Technology Zhejiang University Zhejiang 310014 China2Zhejiang University Zhejiang China

Correspondence should be addressed to Ningyu Zhang zhangningyuzjueducn

Received 10 May 2016 Revised 3 July 2016 Accepted 1 August 2016

Academic Editor Adel Alimi

Copyright copy 2016 Ningyu Zhang et al This is an open access article distributed under the Creative Commons Attribution Licensewhich permits unrestricted use distribution and reproduction in any medium provided the original work is properly cited

In the latest years the rapid progress of urban computing has engendered big issues which creates both opportunities andchallengesTheheterogeneous and big volumeof data and the big difference between physical and virtual worlds have resulted in lotsof problems in quickly solving practical problems in urban computing In this paper we propose a general application frameworkof ELM for urban computing We present several real case studies of the framework like smog-related health hazard predictionand optimal retain store placement Experiments involving urban data in China show the efficiency accuracy and flexibility of ourproposed framework

1 Introduction

Urban computing is a process of acquisition integration andanalysis of big and heterogeneous data generated by diversesources in urban spaces such as sensors devices vehiclesbuildings and humans to tackle the major issues that citiesface (eg air pollution increased energy consumption andtraffic congestion) [1] With the rapid progress of urbancomputing lots of applications have appeared to analyzeamounts of urban data using big data analysis technologiesfor real practical problems Most of them use conventionalmachine learning and computational intelligence solutionsHowever with the exponential growth of data and complexityof systems those methods have suffered lots of bottlenecks inlearning (eg intensive human intervention and convergencetime)

In fact the urban data acquired in real situations normallyare heterogeneous and in big volumes Thus according tostudies [2] treating features extracted from different datasources equally does not achieve the best performanceMoreover most of the models of applications are hard toimplement in real situations because of the complexity andsparsity of some data Therefore fast machine learning andcomputational intelligence techniques are needed

Extreme learning machine (ELM) [3] can be a good solu-tion for the learning of large sample data as it has high

generalization performance and fast training speed More-over the stacked ELMs [4] perform well when facingextremely large and complex training datasets Indeed ELMis flexible and easy to implement In this paper we proposea general application framework of ELM for urban com-puting

To solve challenges in urban computing using ELM wefirstly analyze the types sources and structures of urbandata We standardize the form of data from different sourcesand fuse data across modalities as input of ELM In factthe urban data can be obtained from online social mediaand offline physical sensors Online social media (twitteruser comments) have opinions about places These data arenormally text (user ratings are numeric) and uncertain Thebad situation is that the amounts of these data are influencedby the populations or type of district to a great extentFor instance it is quite easy to retrieve amounts of data ina metropolis However less-developed regions have smallpopulations and hence relatively low social media activityMeanwhile offline data reflect the physical condition of thisregion such as the flow of taxis and buses traffic congestionindex real estate POIs road network air quality and mete-orological elements The amounts of these data in differentregions are usually similar Nevertheless these data are fromdifferent sources and are heterogeneous We divide the cityinto regions by the road network We obtain huge amounts

Hindawi Publishing CorporationComputational Intelligence and NeuroscienceVolume 2016 Article ID 4970246 10 pageshttpdxdoiorg10115520164970246

2 Computational Intelligence and Neuroscience

of data in each region We formalize these data and buildstandard feature vectors including social view from onlinesocialmedia and physical view fromphysical sensors throughlocation based feature selection Social view andphysical vieware treated as different views on an object We adopt deepautoencoder method [5] to capture the middle-level featurerepresentation between two modalities (eg social view andphysical view) Then we train the data flexibly which meansthat the data with different sparse degree will be trained bydifferent kinds of ELM based methods

Furthermore we propose several case studies of ourframework For different real applications we can easily addor remove data sources and adjust parameters of the modelWe use the social media meteorological elements and airquality tomake smog-related health hazard prediction Trulythese data are heterogeneous while stacked ELM [4] canflexibly train these dataWe use user comments social mediathe flow of taxis and buses traffic congestion index realestate POIs and road network to help optimally retain storeplacement In real situations the user comments data sufferthe data sparsity problem We handle this by transferringknowledge from regions with huge data by domain adapta-tion transfer ELM [6]

The major contributions of this paper are as follows

(1) We propose a general application framework of ELMfor urban computing

(2) We propose several case studies for our frameworksuch as smog-related health hazard prediction andoptimal retain store placement

(3) We evaluate our framework in real urban datasets andreveal the advantages in precision and learning speed

The remainder of this paper is organized as followsSection 2 briefly reviews existing studies on ELM and urbancomputing Section 3 presents the framework and key chal-lenges Section 4 describes the case studies Section 5 presentsthe results of our experiments Finally Section 6 summarizesour findings and concludes the paper with a brief discussionon the scope for future work

2 Related Work

21 ELM Extreme learning machine (ELM) has recentlyattracted many researchersrsquo interest due to its very fastlearning speed good generalization ability and ease ofimplementation [3] Zhou et al [4] proposed stacked ELMs(S-ELMs) which are specially designed for solving largeand complex data problems The S-ELMs serially connectstacked small ELMs divided by a single large ELM networkand can approximate a very large ELM network with smallmemory requirement Moreover the S-ELMs support theELM autoencoder during each iteration which significantlyimproves the accuracy on big data problems Therefore ithas great chance to improve efficiency and precision using S-ELMs to train heterogeneous data However it is necessary tohave a standard data modal for it

With the rapid growth of cities huge amounts of data canbe obtained while there still exist data sparsity problems as

a result of sensor coverage difference of human activitiesand so on L Zhang and D Zhang [6] proposed a uni-fied framework referred to as Domain Adaptation ExtremeLearningMachine (DAELM) which learns a robust classifierby leveraging a limited number of labeled data from targetdomain for drift compensation as well as gases recognition inE-nose systems without loss of the computational efficiencyand learning ability of traditional ELM Nevertheless differ-ent urban data have different modalities and are difficult totransfer knowledge directly

22 Cross-Domain Data Fusion The data from differentdomains consist of multiple modalities According to [7]there are three categories of methods that can fuse multipledatasets (1) stage-based fusion methods [8] (2) feature-levelbasedmethods [5 9] and (3) semanticmeaning-basedmeth-ods [10ndash12] Ngiam et al [5] proposed deep autoencodermethod to capture the middle-level feature representationbetween two modalities (eg audio and video)

However there is a lack of research which utilized meth-ods to fuse both social media data and physical sensor datafor a general application framework for urban computingThis is mainly due to the lack of (1) systematic approachesfor collecting modeling and analyzing such information and(2) efficient machine learning framework which can combinefeatures from both social and physical views

23 Urban Computing The urban data of a city (eg humanmobility and the number of changes in a POI category) mayindicate real conditions and trends in the cities [13]

For instance the tweets from social media and meteoro-logical elements may indicate the existence of smog-relatedhealth hazards Chen et al [14] modeled smog-related healthhazards and smog severity through mining raw microblog-ging text and network information diffusion data and devel-oped an artificial neural network- (ANN-) based model toforecast smog-related health hazard with the current healthhazard and smog severity observationsThis paper is a follow-upwork and as socialmedia and physical sensors are differentmodalities we adopt urban knowledge fusion and propose ageneral framework

Furthermore human mobility with POIs may have con-tributions to the placement of some businesses Scellato etal [15] used the location based social network to study theoptimal retail store problem They collected and analyzeddata from Foursquare to understand the popularity of thethreemajor retail stores inNewYorkThey evaluated a diverseset of data mining features and analyzed the spatial andsemantic information about location andmode of user actionin the surrounding area The problem arises when we cannotobtain sufficient online data in some small cities This paperis also follow-up work of [16]

3 Approach

31 Key Challenges According to [1] urban computing hasthree main challenges urban sensing and data acquisitioncomputing with heterogeneous data and hybrid systemsblending the physical and virtual worlds In real situations

Computational Intelligence and Neuroscience 3

Datasets Knowledge fusion

Stacked ELM

Social view

Domain adaptation ELM

Applications

Physical view

Air quality

Waterlogging

Store placement

Figure 1 General application framework of ELM for urban computing

many conventional machine learning methods suffer keychallenges summarized as follows

(1) Poor Learning Efficiency As there are huge urban datagenerated from social media and physical sensors it is greatchallenge for the training efficiency ofmachine learning Seri-ously many applications of urban computing have timelinessrequirements such as disaster and traffic monitor It is veryurgent to train a model as fast as possible and get the finalresult

(2) Complex Model to Handle Heterogeneous Data Datafrom different sources consist of multiple modalities eachof which has different representation distribution scale anddensity For example text is usually represented as discretesparse word count vectors whereas an image is representedby pixel intensities or outputs of feature extractors whichare real-valued and dense POIs are represented by spatialpoints associated with a static category whereas air quality isrepresented using a geotagged time series Human mobilitydata is represented by trajectories whereas a road network isdenoted as a spatial graph Treating different datasets equallyor simply concatenating the features from disparate datasetscannot achieve a good performance in data mining tasksAs a result fusing data across modalities becomes a newchallenge in big data research calling for advanced datafusion technology

(3) Sparsity of Data and Hard for Implementation It is quiteeasy to retrieve kinds of data in a metropolis Neverthelesssome relatively small cities have small populations andhence relatively low socialmedia activity Specifically we facethe following two challenges (1) the label scarcity problemand (2) the data insufficiency problemTherefore it is difficultto train and implement in real situations

32 General Framework As Figure 1 shows our frameworkconsists of two parts urban knowledge fusion and trainingwhich is based on ELM We firstly fuse the data obtainedfrom both social media and physical sensors using deepautoencoder [5] For those with abundant data we usestacked ELM [4] to train For those with sparse data weuse DAELM [6] and transfer knowledge from regions withabundant data

Figure 2 Segmented regions

33 Typical Technologies

331 Map Segmentation We divide a city into disjointedblocks (httpsgithubcomzxlzrSegment-Maps) assumingthat placement in a block 119892 is uniform Road network isusually composed of a number of major roads such as thering road the city is divided into areas [13] We map theprojection of the vector-based road network on a planeThen the road network is converted into a raster model [17]Actually each pixel of a projected map image can be viewedas a block element of a raster map Consequently the roadnetwork is converted into a binary imageThenwe extract theskeleton of the road while retaining the original two-valueimage topology Figure 2 shows the result of the proceduredescribed above for Beijingrsquos road network Finally we get theblocks 119892 of cities

34 Location Based Feature Selection For each block wecan access the features over its neighbourhood and simplycalculate the average value However there exist problemsif ignoring the location information of the surroundingfeatures There are two reasons we need to consider (1) Fromthe distance perspective if the selected location is far from thetarget location it may have few impacts on the target locationand vice versa (2) From the density perspective if one of theneighbour blocks has a lot of stores while the others have fewmaybe this block has greater influence on the target locationSo we propose a location based feature selection (LBFS)

4 Computational Intelligence and Neuroscience

method to calculate features considering neighbourhoodrsquosimpact Suppose we want to calculate a feature of target block119892 119892 has 119905 neighbours 1198731 1198732 119873119905 119891119894 is the feature vectorof each block119873119894 119889119894 is the distance between119873119894 and 119892 119899119894 is thenumber of feature points (which means there are 119899 stores inblock 119894 and119898119894 is the measure of119873119894)(1) Distance Related Features Some kind of features mayreduce the impact of the storersquos score with the increase ofdistance So we weigh the features with distances Formallywe have

1198911198921 = sum 119891119894119889119894 (1)

(2) Density Related Features Several features may have fewerimpacts on the storersquos score with the increase of density ofsurrounding shops So we weigh the features with densityFormally we have

1198911198922 = sum 119891119894119899119894 (2)

(3) Measure Related Features There may exist some featuresthat may have fewer impacts on the storersquos score with theincrease of measure of surrounding regions So we weigh thefeatures with measure Formally we have

1198911198923 = sum 119891119894119898119894

(3)

In the real situation we use all three kinds of featureselection results Normally each kind of feature in a grid hasthree final feature values as a vector Formally we have

119891119892 = LBFS (119891 119889 119899119898) (4)

where 119891 are the feature vectors of the neighbours 119889 arethe distances between the middle of neighbours and middleof the target block 119899 are the numbers of feature points inneighbours and119898 are the measures of neighbours

341 Urban Knowledge Fusion For each block in cities weobtain the social view and physical view separately Each viewis represented as a feature vector Social view is composedof social media text user comment texts user ratios and soon according to different application requirements Physicalview is composed of physical sensor values like the flowof taxis and buses traffic congestion index real estatePOIs road network air quality meteorological elementsand so on regarding different application requirements Weadopt the deep autoencoder [5] to capture the middle-levelfeature representation between social view and physical viewAs Figure 3 depicts deep learning effectively learns (1) abetter single modality representation with the help of othermodalities and (2) the shared representations capturing thecorrelations across multiple modalities

342 Training Practically we obtain the middle-level fea-ture representation of each block For those with abundant

Social view reconstruction Physical view reconstruction

Social view input Physical view input

middot middot middot

middot middot middot

middot middot middot

middot middot middot

middot middot middot

middot middot middotmiddot middot middot

middot middot middot

middot middot middot

Figure 3 Deep autoencoder of social view and physical view

data we use stacked ELM [4] to train It can be used directlyto solve regression binary and multiclass classificationproblems regarding different applications For those withsparse data we use DAELM [6] to transfer knowledge fromregions with abundant data Actually we can treat differentcities as different domains because data from different citiesmay have different distributions in feature and label spaces[18]We use cities with abounded data as sources and transferknowledge to target ones with sparse data

4 Case Studies

41 Smog-Related Health Hazard Prediction In fact smog isa terrible health hazard that affect peoplersquos health accordingto recent research [19] It is necessary to analyze monitorand forecast smog-related health hazards in a timely mannerIn recent times social media has become an increasinglyimportant channel to observe sentiment trends and eventsFurthermore there are various physical sensors monitoringsmog status such as air quality stations weather stationsand earth observation satellites generating amounts of dataabout severity of smog In this paper we model the smog-related health hazards and smog severity by two indexes(PHI and SSI) using social media The urban smog-relatedhealth hazard prediction problem is a classification problemto assign appropriate class labels (Public Health Index) to theblocks of cities

Public Health Index (PHI) is the sum of total relativefrequencies of smog-related health hazard phrases in thecurrent tweets D-PHI is an enhanced Public Health Indexthat includes consideration of diffusion in social networks

Definition Smog Severity Index (SSI) is the weighted sumof total relative frequencies of smog severity phrases in thecurrent tweets D-SSI is an enhanced Smog Severity Indexthat includes consideration of diffusion in social networks

Firstly we extract both smog-related health hazardphrases and smog severity phrases Secondly we gather rawtweets with time and location tags fromWeibo (a twitter-like

Computational Intelligence and Neuroscience 5

website in China) Thirdly we calculate the daily relativefrequency rf of each phrase

rf (119901) = af (119901 119879119862) times idf (119901 119879119867)

af (119901 119879119862) =sum119889isin119879119862 f (119901 119889)

119879119862

idf (119901 119879119867) = log100381610038161003816100381611987911986710038161003816100381610038161003816100381610038161003816119889 isin 119879119867 119901 isin 1198891003816100381610038161003816

(5)

where 119879119867 and 119879119862 represent historical and current tweet setsrespectively 119901 represents a phrase 119889 represents a tweetf(119901 119889) represents the frequency of phrase 119901 in tweet 119889af(119901 119879119862) represents the average frequency of phrase 119901 in thecurrent tweet set 119879119862 and idf(119901 119879119867) represents the inverseddocument frequency of the tweets with phrase 119901 in thehistorical tweet set 119879119867 The logarithm function is to scale upthe fraction of rare tweets The above algorithm is derivedfrom the typical tf-idf algorithm [20] The difference lies inthe replacement of the largest word frequency in currenttweet set with the size of current tweet set which aims ateliminating the influence of other heat phrases Then PHIand SSI are calculated with the relative frequencies of all thephrases

PHI = sum119901isin1198751

rf (119901)

SSI = sum119901isin1198752

rf (119901) times order (119901) (6)

where 1198751 stands for the set of smog-related health hazardphrases 1198752 stands for the set of smog severity phrases Thensocial network diffusion is considered to calculate D-PHIand D-SSI We calculate the network diffusion-based averagefrequency

daf (119901 119879119862) =sum119889isin119879119862 (f (119901 119889) times (119892 (119889) + 1))

10038161003816100381610038161198791198621003816100381610038161003816 + sum119889isin119879119862 119892 (119889) (7)

where 119892(119889) represents a tweetrsquos total number of retweets andlikes Once daf is calculated we use it to replace the averagefrequency af to calculate the relative frequency rf and furthercompute the value of D-PHI and D-SSI Finally the PHI SSID-PHI and D-SSI make up the feature vectors of social viewin this problem

Moreover we also extract features from air qualityincluding both air pollution concentrations (CO NO2 SO2O3 PM25 and PM10) and air quality index (AQI) whichcomprehensively evaluates the air quality We extract recordsfrom various meteorological elements including humiditycloud value pressure temperature and wind speed all ofwhich have been proven to affect smog disasters greatly Forexample high wind speed and low cloud value usually leadssmog pollution to decrease in the next day

42 Optimal Retain Store Placement The optimal placementof a retail store has been of prime importance For example

a new restaurant set up in a street corner may attractlots of customers but it may close months later if it islocated in a few hundred meters down the road In thispaper the optimal retain store placement problem is a rankproblem We calculate scores for each of the candidate areasand rank them The top-ranked areas will be the optimalregion for placing We get the label data from Dianpingscore (httpwwwdianpingcom) and assume that the dataobserved can evaluate the popularity of a place For thisproblem we analyze the data obtained and build social andphysical view and then build a classifier according to ourframework

A strong regional economy usually indicates highdemand according to recent studies [15 21] Therefore wemined the blockrsquos neighbourrsquos user reviews from httpwwwdianpingcom

(1) Dianping Score For each region 119892 we will retrieveservice quality overall satisfaction environment class andconsumption level by mining the reviews of business venuesneighbourhood regions Region 119892 has 119905 neighbourhoods Weaccess the usersrsquo opinions over the neighbourhood region119873119894to form features

Overall Satisfaction Since the overall rating of a businessvenue in block 119892 represents the satisfaction of users we usethe LBFS to get the overall ratings of all business venueslocated in 119873119894 as a numeric score of overall satisfactionFormally we have

119865119904 = LBFS (overall satisfaction 119889 119899 119898) (8)

Service Quality Similarly we use the LBFS to compute theservice rating of business venues in119873119894 and express the servicequality of the neighbourhood as

119865119902 = LBFS (Service Rating 119889 119899 119898) (9)

Environment ClassThe level of the surrounding environmentcan reflect the current environment class of the area There-fore we use LBFS to get environment rating as

119865119890 = LBFS (environment rating 119889 119899 119898) (10)

Consumption Cost The average value of the consumerrsquosbehavior of the surrounding regions can reflect the currentconsumption of the area So we use LBFS to get the consump-tion cost of business venues of neighbourhood as a feature

119865119888 = LBFS (average cost 119889 119899 119898) (11)

Bus transits are slow and cheap and aremainly distributedin areas having a large number of IT and educational estab-lishments The price of real estate and the traffic congestionindex indicate whether the facility planning is balanced Weexploit these features to uncover the implicit preferences fora neighbourhood

Bus-Related Features Medium income residents choose busBecause most of the cityrsquos residents are from the backgroundof the middle class bus traffic may represent the bulk of

6 Computational Intelligence and Neuroscience

the cityrsquos flow We try to measure the arrival departure andthe volume of the transition on the streets of each block Fora region 119892 we try to use BT as the set of bus trajectories of acity each of which is denoted by a tuple ⟨119901 119889⟩ where 119901 is apickup bus stop and 119889 is a drop-off bus stop [21]

Bus Arriving Departing and Transition Volume We extractthe arriving departing and transition volumes of buses fromsmart card transactions Formally

119865BAV119887 = 1003816100381610038161003816⟨119901 119889⟩ isin BT 119901 notin 119892 119889 isin 1198921003816100381610038161003816 119865BLV119887 = 1003816100381610038161003816⟨119901 119889⟩ isin BT 119901 isin 119892 119889 notin 1198921003816100381610038161003816

119865BTV119887 = 1003816100381610038161003816⟨119901 119889⟩ isin BT 119901 isin 119892 119889 isin 1198921003816100381610038161003816

(12)

Density Recent studies have reported that price premiums ofup to 10 are estimated for retail stores within 400m of alarge number of bus stops The density of the bus station ispositively correlatedwith the value of the retail storeHere weuse smart card transactions and propose alternative methodsand strategies for density estimation of bus stop In factthe number of bus stops in a trip can be approximated byfare Therefore we calculate the distance to the ratio of theestimated density in the vicinity of the bus station

119865BSD119894 = sum119901isin119892119889isin119892 dist (119901 119889) fare (119901 119889)1003816100381610038161003816⟨119901 119889⟩ isin BT 119901 isin 119892 119889 isin 1198921003816100381610038161003816

(13)

Balance The balance of smart cards can show the pattern ofconsumption and supply behavior If some residents alwaysmaintain a high balance on the smart card the huge cost ofbus travel may mean (1) these residents are more dependenton the bus which shows the lack of subway and taxi nearbyand (2) these residents have to go to work in places far awayIn other words these places may be far away So we try to usethe smart card balance as a feature

119865SCB119894 = sum119901isin119892119889isin119892 balance (119901 119889)1003816100381610038161003816⟨119901 119889⟩ isin BT 119901 isin 119892 119889 isin 1198921003816100381610038161003816

(14)

Real Estate Features The real estate prices may reflect thepurchasing power and economic index of this region Firstwe collect the historical prices of each estate andwe use LBFSto calculate estate price of the neighbourhood of each blockFormally we have

119865RE119894 = LBFS (estate price 119889 119899 119898) (15)

Traffic Index Features The flexibility of a region like conve-nient traffic may contribute to the popularity of a regionWe obtain the traffic index from httpnitrafficindexcomFormally we have

119865TI119894 = LBFS (traffic index 119889 119899 119898) (16)

Intraclass Competitiveness We measure the proportion ofneighbouring places of the same type 120574119897 with respect to thetotal number of nearby placesThen we rank areas in reverse

order assuming that the least competitive area is the mostpromising one

119865Intra119894 = minus119873120574119897 (119897 119903)119873 (119897 119903) (17)

However it is worth noting that the retail industryrsquos com-petitive stores and marketing can have a positive or negativeimpact For example one would expect to place a bar ina region with plenty of nightlife locations because therealready exists a service system and there are a lot ofpeople attracted to the area However being surrounded bycompetitors also means to share customers

Interclass Competitiveness In order to consider the iterationsbetween different place categories we adopt the metricsdefined by Jensen [22] In practice we use the intercategorycoefficients described to weigh the desirability of the placesobserved in the area around the object that is the greater thenumber of the places in the area that attracts the object thebetter the quality of the locationMore formally we define thequality of location for a venue of type 120574119897 as

119865Inter119894 = sum

120574119901isinΓ

log (120594120574119901120574119897) times (119873120574119901(119897119903) minus 119873120574119901(119897119903)) (18)

where 119873120574119901(119897119903) means how many venues of type 120574119901 areobserved on average around the places of type 120574119897 Γ is theset of place types and 120594120574119901120574119897 are the intertype attractivenesscoefficients Formally we get

120594120574119901120574119897 =119873 minus119873120574119901119873120574119901 times 119873120574119897

sum119901

119873120574119897 (119901 119903)119873 (119901 119903) minus 119873120574119901

(19)

POIs The POIs indicate the latent patterns of this regionwhich may have contributed to the placement Moreover thecategory of a POI may have a causal relation to it Let ♯(119894 119888)denote the number of POIs of category 119888 isin 119862 located in119892119894 and let ♯(119894) be the total number of POIs of all categorieslocated in 119892119894 The entropy is defined as

119865POI119894 = minussum

119888isin119862

♯ (119894 119888)♯ (119894) times log ♯ (119894 119888)

♯ (119894) (20)

Business Areas Business areas have important influence onoptimal placement The locations will get more customersand resources in the business areas We retrieve the geo-graphic information of business areas from Baidu map apiFinally we build a feature related to business areas Formallywe have

1198651198891 = LBFS (is neighbour business areas 119889 119899 119898)

1198651198892 =

1 if 119892119894 isin Business Area

0 if 119892119894 notin Business Area

119865119889 = [1198651198891 1198651198892]

(21)

where ldquois neighbour business areasrdquo is a Boolean valuewhichmeanswhether the blockrsquos neighbour is a business area

Computational Intelligence and Neuroscience 7

Table 1 Details of the datasets

Datasets Size (M) SourcesComments 2523 httpdianpingcomTweets 11023 httpweibocom httptwittercomBuses 254 httpchelailenetcnTraffic 119 httpwwwnitrafficindexcomReal estate 35 httpwwwsoufuncomAir 534 httpwwwpm25inPOI business areas 10 httpmapbaiducomRoad network 9 httpopenstreetmaporgMeteorological 98 httpforecastio httpnoaagov

5 Experiments

51 UrbanData Table 1 lists the data sources For social viewwe crawl online business reviews from httpwwwdianpingcom which is a site for reviewing business establishmentsin China Each review includes the shop ID name addresslatitude longitude consumption cost star rating POI cat-egory city environment service overall ratings and com-ments We also crawl huge data from Weibo (a twitter-like website in China) and Twitter with geotags For phys-ical view we extract features from smart card transactionsfrom open web API Moreover we crawl the traffic indexfrom httpwwwnitrafficindexcom which is open to thepublic Finally we crawl the estate data from httpwwwsoufuncom which is the largest real estate online systemin China Moreover we crawl huge air quality data fromhttpwwwpm25in We collected POI and business areasdata from Baidu maps for each city The road network datawas gathered from OpenStreetMaps We collected meteo-rological data from the National Oceanic and AtmosphericAdministrationrsquos (NOAA) web service and httpforecastioevery hour All the experiments can be obtained from openweb service The source code of this paper can be obtainedfrom httpsgithubcomzxlzrELM-for-Urban-Computing

52 Metrics In total we have two main metrics for ourframework the precision and efficiency For specific casestudies for smog-related health hazard prediction we havethe following

Precision and Recall The precision and recall are given asfollows

Precision =1003816100381610038161003816cap (prediction set reference set)10038161003816100381610038161003816100381610038161003816prediction set1003816100381610038161003816

Recall =1003816100381610038161003816cap (prediction set reference set)1003816100381610038161003816

|reference set| (22)

For optimal retain store placement we have the following

Normalized Discounted Cumulative Gain The discountedcumulative gain (DCG119873) is given by

DCG [119899] =

rel1 if 119899 = 1

DCG [119899 minus 1] + rel119899log2119899

if 119899 ge 2 (23)

Later given the ideal discounted cumulative gain DCG1015840NDCG at the 119899th position can be computed as NDCG[119899] =DCG[119899]DCG1015840[119899] The larger the value of NDCG119873 thehigher the top-119873 ranking accuracy

Precision and Recall We choose to use a four-level ratingsystem as (3 gt 2 gt 1 gt 0) To simplify our evaluation taskwe treat the ratings less than 2 as low values and ratings of 3as high values In a top-119873 block list 119864119873 sorted in descendingorder of the prediction values we define the precision andrecall as precision119873 = (119864119873cap119864ge2)119873 and recall119873 = (119864119873cap119864ge2)119864ge2 where119864ge2 are blocks whose ratings are greater thanor equal to 2 For efficiency we recorded the training time andcompare with baselines

53 Results For smog-related health hazard prediction thestacked ELM is trained hidden layer ANN with 10 to 40hidden nodes the BP is trained ANN with 2 to 3 hiddenlayers and 8 to 15 nodes in each hidden layer Two classicSVM regression methods nu-SVR and epsilon-SVR areprovided by LIBSVM [23] Random forest regressionmethodis provided by sklearn Meanwhile The accuracies of thehealth hazard prediction models using our framework nu-SVR epsilon-SVR and random forest are shown in Table 2119901(nf) means the precision of methods without knowledgefusion We can find that two ANNsrsquo methods outperform theSVM regression methods and the random forest regressionmethod in forecasting the next day PHI and the ELMachieves slightly higher prediction accuracy than themultiplehidden layers ANNs trained by BP

For optimal retain store placement the models trainedwith a single cityrsquos data are used as baselines Our datasetshave the data obtained from five cities in China The blocksin each city have a label 0 1 2 3 from httpwwwdianpingcom based on the values observed by users For examplewe have ldquo2131rdquo ldquo2rdquo means the id of city ldquo13rdquo means the idof block and the final ldquo2rdquo is the label We treat each city as asingle domain containing hundreds of blocks For exampleif we want to do optimal retain store placement in Hangzhouwe use 75 of data in Hangzhou and all the other citiesrsquo datato train The remaining 25 of data in Hangzhou are fortesting Actually Hangzhou is treated as target domain whilethe other cities are source domains

Figure 4 shows the results of concatenation knowledgefusion and sampling methods for Starbucks The concate-nation method only concatenates social view and physicalview into one single view to adapt to the learning settingThe results show that knowledge fusion works better Thesampling method means we manually filter some negativesamples based on knowledge fusion which will make betterresults as figure shows

8 Computational Intelligence and Neuroscience

Table 2 The results of smog-related health hazard prediction

Cities Stacked ELM BP nu-SVR Epsilon-SVR Random forest119901 (nf) 119901 Time 119901 (nf) 119901 Time 119901 (nf) 119901 Time 119901 (nf) 119901 Time 119901 (nf) 119901 Time

Beijing 065 080 3 066 079 15 078 069 9 056 053 9 010 015 7Tianjin 092 089 3 056 059 13 058 073 10 066 073 11 015 014 8Shanghai 053 088 4 055 069 16 055 064 11 053 053 10 012 030 8Hangzhou 031 086 3 063 065 10 054 055 9 066 083 9 015 034 7Guangzhou 053 054 5 065 069 15 053 066 11 048 053 21 010 034 8Average 075 073 4 056 056 15 059 065 10 059 073 10 020 034 7

3 5 7 100

50

100

ConcatenationKnowledge fusionSampling

(a) NDCG119873

3 5 7 100

50

100

ConcatenationKnowledge fusionSampling

(b) Precision119873

3 5 7 10

ConcatenationKnowledge fusionSampling

0

2

4

6

8

(c) Recall119873

Figure 4 NDCG precision and recall of 119873 for Starbucks in Beijing

Computational Intelligence and Neuroscience 9

3 5 7 100

50

100

LBFSInflection rulesAll

(a) NDCG119873

3 5 7 100

50

100

LBFSInflection rulesAll

(b) Precision119873

3 5 7 10

LBFSInflection rulesAll

0

5

10

(c) Recall119873

Figure 5 NDCG precision and recall of 119873 for Starbucks in Beijing

Figure 5 shows the results of LBFS inflection rules andall methods for Starbucks The LBFS method means weuse location based feature selection to build features whilenot using the average value based on the methods in thepast paragraph Actually LBFS works better The inflectionrules method means we use rules to calculate the finalscore if the area has spatial news such as ldquonew subwaydemolitionrdquo The inflection rules consider the situations thatour algorithm may not cover and make our method morerobust The all method includes DAELM inflection rulesLBFS sampling and knowledge fusion There is no doubtthat the all method works best Moreover our frameworkperforms more efficiently In Table 3 we present the resultsobtained for the NDCG10 metric and training time for all

features across the three chains The numbers in the bracketsare minutes of training In all cases we observe a significantimprovement in precision and efficiency with respect to thebaseline

6 Conclusion

In this paper we propose a general application frameworkof ELM for urban computing and list three case studiesExperimental results showed that our approach is applicableand efficient compared with baselines

In the future we plan to apply our approach to moreapplications Moreover we would like to study the distribu-tion of our framework so it can handle more massive data

10 Computational Intelligence and Neuroscience

Table 3 The best average NDCG10 results of optimal retain storeplacement

Cities Starbucks TrueKungFu YongheKingMART (single city)

Beijing 0743 (15) 0643 (14) 0725 (14)Shanghai 0712 (15) 0689 (14) 0712 (12)Hangzhou 0576 (13) 0611 (12) 0691 (10)Guangzhou 0783 (15) 0691 (13) 0721 (12)Shenzhen 0781 (16) 0711 (15) 0722 (13)

RankBoost (single city)Beijing 0752 (23) 0678 (20) 0712 (19)Shanghai 0725 (25) 0667 (20) 0783 (21)Hangzhou 0723 (21) 0575 (19) 0724 (15)Guangzhou 0812 (22) 0782 (19) 0812 (18)Shenzhen 0724 (22) 0784 (17) 0712 (12)

BP (Single city)Beijing 0753 (45) 0658 (40) 0702 (41)Shanghai 0724 (44) 0657 (42) 07283 (44)Hangzhou 0725 (41) 0555 (40) 0714 (45)Guangzhou 0832 (45) 0772 (42) 0512 (41)Shenzhen 0722 (47) 0754 (42) 0702 (45)

DAELMBeijing 0755 (3) 0712 (2) 0755 (2)Shanghai 0745 (5) 0751 (3) 0783 (1)Hangzhou 0755 (5) 0711 (2) 0752 (5)Guangzhou 0810 (9) 0810 (5) 0823 (4)Shenzhen 0780 (9) 0783 (5) 0723 (8)

Competing Interests

The authors declare that they have no competing interests

References

[1] Y Zheng L Capra O Wolfson and H Yang ldquoUrban comput-ing concepts methodologies and applicationsrdquo ACM Transac-tions on Intelligent Systems and Technology vol 5 no 3 article38 2014

[2] Y Zheng F Liu andH-P Hsieh ldquoU-air when urban air qualityinference meets big datardquo KDD 2013

[3] G-B Huang Q-Y Zhu and C-K Siew ldquoExtreme learningmachine theory and applicationsrdquoNeurocomputing vol 70 no1ndash3 pp 489ndash501 2006

[4] H Zhou G-B Huang Z Lin HWang and Y C Soh ldquoStackedextreme learning machinesrdquo IEEE Transactions on Cyberneticsvol 45 no 9 pp 2013ndash2025 2015

[5] J Ngiam A Khosla M Kim J Nam H Lee and A Y NgldquoMultimodal deep learningrdquo in Proceedings of the 28th Inter-national Conference on Machine Learning (ICML rsquo11) BellevueWash USA 2011

[6] L Zhang and D Zhang ldquoDomain adaptation extreme learningmachines for drift compensation in e-nose systemsrdquo IEEETransactions on Instrumentation and Measurement vol 64 no7 pp 1790ndash1801 2015

[7] Y Zheng ldquoMethodologies for cross-domain data fusion anoverviewrdquo IEEE Transactions on Big Data vol 1 no 1 pp 16ndash342015

[8] Y Zheng Y Liu J Yuan and X Xie ldquoUrban computing withtaxicabsrdquo HUC 2011

[9] N Srivastava and R Salakhutdinov ldquoMultimodal learningwith deep Boltzmann machinesrdquo Journal of Machine LearningResearch vol 15 pp 2949ndash2980 2014

[10] A Blum and T Mitchell ldquoCombining labeled and unlabeleddata with co-trainingrdquo in Proceedings of the 11th Annual Confer-ence on Computational LearningTheory (COLT rsquo98) pp 92ndash100Madison Wis USA July 1998

[11] M Gonen and E Alpaydın ldquoMultiple kernel learning algo-rithmsrdquo Journal of Machine Learning Research vol 12 pp 2211ndash2268 2011

[12] Y Zheng X YiM Li et al ldquoForecasting fine-grained air qualitybased on big datardquo in Proceedings of the 21st ACM SIGKDDConference onKnowledge Discovery andDataMining (KDD rsquo15)pp 2267ndash2276 August 2015

[13] N J Yuan Y Zheng X Xie Y Wang K Zheng and H XiongldquoDiscovering urban functional zones using latent activity trajec-toriesrdquo IEEE Transactions on Knowledge and Data Engineeringvol 27 no 3 pp 712ndash725 2015

[14] J Chen H Chen Z Wu D Hu and J Z Pan ldquoForecastingsmog-related health hazard based on social media and physicalsensorrdquo Information Systems 2016

[15] S Scellato A Noulas and C Mascolo ldquoExploiting place fea-tures in link prediction on location-based social networksrdquo inProceedings of the 17th ACM SIGKDD International Conferenceon Knowledge Discovery and Data Mining (KDD rsquo11) pp 1046ndash1054 August 2011

[16] N Zhang H Chen X Chen and J Chen ldquoELM meets urbancomputing ensemble urban data for smart city applicationrdquo inProceedings of ELM-2015 Volume 1 pp 51ndash63 Springer 2016

[17] J Yuan Y Zheng and X Xie ldquoDiscovering regions of differentfunctions in a city using human mobility and POIsrdquo in Pro-ceedings of the 18th ACM SIGKDD International Conference onKnowledge Discovery and Data Mining (KDD rsquo12) pp 186ndash194ACM Beijing China August 2012

[18] Y Wei Y Zheng and Q Yang ldquoTransfer knowledge betweencitiesrdquo KDD 2016

[19] V Hughes ldquoWhere therersquos smokerdquoNature vol 489 no 7417 ppS18ndashS20 2012

[20] J Ramos ldquoUsing tf-idf to determine word relevance in doc-ument queriesrdquo in Proceedings of the 1st Instructional Confer-ence on Machine Learning 2003 httpswwwsemanticscholarorgpaperUsing-Tf-idf-to-Determine-Word-Relevance-in-Ramosb3bf6373ff41a115197cb5b30e57830c16130c2c

[21] Y Fu Y Ge Y Zheng et al ldquoSparse real estate rankingwith online user reviews and offline moving behaviorsrdquo inProceedings of the 14th IEEE International Conference on DataMining (ICDM rsquo14) pp 120ndash129 Shenzhen China December2014

[22] P Jensen ldquoNetwork-based predictions of retail store commer-cial categories and optimal locationsrdquo Physical Review E vol 74no 3 Article ID 035101 2006

[23] C-C Chang and C-J Lin ldquoLIBSVM a Library for supportvector machinesrdquo ACM Transactions on Intelligent Systems andTechnology vol 2 no 3 article 27 2011

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 2: Research Article ELM Meets Urban Big Data …downloads.hindawi.com/journals/cin/2016/4970246.pdfResearch Article ELM Meets Urban Big Data Analysis: Case Studies NingyuZhang, 1 HuajunChen,

2 Computational Intelligence and Neuroscience

of data in each region We formalize these data and buildstandard feature vectors including social view from onlinesocialmedia and physical view fromphysical sensors throughlocation based feature selection Social view andphysical vieware treated as different views on an object We adopt deepautoencoder method [5] to capture the middle-level featurerepresentation between two modalities (eg social view andphysical view) Then we train the data flexibly which meansthat the data with different sparse degree will be trained bydifferent kinds of ELM based methods

Furthermore we propose several case studies of ourframework For different real applications we can easily addor remove data sources and adjust parameters of the modelWe use the social media meteorological elements and airquality tomake smog-related health hazard prediction Trulythese data are heterogeneous while stacked ELM [4] canflexibly train these dataWe use user comments social mediathe flow of taxis and buses traffic congestion index realestate POIs and road network to help optimally retain storeplacement In real situations the user comments data sufferthe data sparsity problem We handle this by transferringknowledge from regions with huge data by domain adapta-tion transfer ELM [6]

The major contributions of this paper are as follows

(1) We propose a general application framework of ELMfor urban computing

(2) We propose several case studies for our frameworksuch as smog-related health hazard prediction andoptimal retain store placement

(3) We evaluate our framework in real urban datasets andreveal the advantages in precision and learning speed

The remainder of this paper is organized as followsSection 2 briefly reviews existing studies on ELM and urbancomputing Section 3 presents the framework and key chal-lenges Section 4 describes the case studies Section 5 presentsthe results of our experiments Finally Section 6 summarizesour findings and concludes the paper with a brief discussionon the scope for future work

2 Related Work

21 ELM Extreme learning machine (ELM) has recentlyattracted many researchersrsquo interest due to its very fastlearning speed good generalization ability and ease ofimplementation [3] Zhou et al [4] proposed stacked ELMs(S-ELMs) which are specially designed for solving largeand complex data problems The S-ELMs serially connectstacked small ELMs divided by a single large ELM networkand can approximate a very large ELM network with smallmemory requirement Moreover the S-ELMs support theELM autoencoder during each iteration which significantlyimproves the accuracy on big data problems Therefore ithas great chance to improve efficiency and precision using S-ELMs to train heterogeneous data However it is necessary tohave a standard data modal for it

With the rapid growth of cities huge amounts of data canbe obtained while there still exist data sparsity problems as

a result of sensor coverage difference of human activitiesand so on L Zhang and D Zhang [6] proposed a uni-fied framework referred to as Domain Adaptation ExtremeLearningMachine (DAELM) which learns a robust classifierby leveraging a limited number of labeled data from targetdomain for drift compensation as well as gases recognition inE-nose systems without loss of the computational efficiencyand learning ability of traditional ELM Nevertheless differ-ent urban data have different modalities and are difficult totransfer knowledge directly

22 Cross-Domain Data Fusion The data from differentdomains consist of multiple modalities According to [7]there are three categories of methods that can fuse multipledatasets (1) stage-based fusion methods [8] (2) feature-levelbasedmethods [5 9] and (3) semanticmeaning-basedmeth-ods [10ndash12] Ngiam et al [5] proposed deep autoencodermethod to capture the middle-level feature representationbetween two modalities (eg audio and video)

However there is a lack of research which utilized meth-ods to fuse both social media data and physical sensor datafor a general application framework for urban computingThis is mainly due to the lack of (1) systematic approachesfor collecting modeling and analyzing such information and(2) efficient machine learning framework which can combinefeatures from both social and physical views

23 Urban Computing The urban data of a city (eg humanmobility and the number of changes in a POI category) mayindicate real conditions and trends in the cities [13]

For instance the tweets from social media and meteoro-logical elements may indicate the existence of smog-relatedhealth hazards Chen et al [14] modeled smog-related healthhazards and smog severity through mining raw microblog-ging text and network information diffusion data and devel-oped an artificial neural network- (ANN-) based model toforecast smog-related health hazard with the current healthhazard and smog severity observationsThis paper is a follow-upwork and as socialmedia and physical sensors are differentmodalities we adopt urban knowledge fusion and propose ageneral framework

Furthermore human mobility with POIs may have con-tributions to the placement of some businesses Scellato etal [15] used the location based social network to study theoptimal retail store problem They collected and analyzeddata from Foursquare to understand the popularity of thethreemajor retail stores inNewYorkThey evaluated a diverseset of data mining features and analyzed the spatial andsemantic information about location andmode of user actionin the surrounding area The problem arises when we cannotobtain sufficient online data in some small cities This paperis also follow-up work of [16]

3 Approach

31 Key Challenges According to [1] urban computing hasthree main challenges urban sensing and data acquisitioncomputing with heterogeneous data and hybrid systemsblending the physical and virtual worlds In real situations

Computational Intelligence and Neuroscience 3

Datasets Knowledge fusion

Stacked ELM

Social view

Domain adaptation ELM

Applications

Physical view

Air quality

Waterlogging

Store placement

Figure 1 General application framework of ELM for urban computing

many conventional machine learning methods suffer keychallenges summarized as follows

(1) Poor Learning Efficiency As there are huge urban datagenerated from social media and physical sensors it is greatchallenge for the training efficiency ofmachine learning Seri-ously many applications of urban computing have timelinessrequirements such as disaster and traffic monitor It is veryurgent to train a model as fast as possible and get the finalresult

(2) Complex Model to Handle Heterogeneous Data Datafrom different sources consist of multiple modalities eachof which has different representation distribution scale anddensity For example text is usually represented as discretesparse word count vectors whereas an image is representedby pixel intensities or outputs of feature extractors whichare real-valued and dense POIs are represented by spatialpoints associated with a static category whereas air quality isrepresented using a geotagged time series Human mobilitydata is represented by trajectories whereas a road network isdenoted as a spatial graph Treating different datasets equallyor simply concatenating the features from disparate datasetscannot achieve a good performance in data mining tasksAs a result fusing data across modalities becomes a newchallenge in big data research calling for advanced datafusion technology

(3) Sparsity of Data and Hard for Implementation It is quiteeasy to retrieve kinds of data in a metropolis Neverthelesssome relatively small cities have small populations andhence relatively low socialmedia activity Specifically we facethe following two challenges (1) the label scarcity problemand (2) the data insufficiency problemTherefore it is difficultto train and implement in real situations

32 General Framework As Figure 1 shows our frameworkconsists of two parts urban knowledge fusion and trainingwhich is based on ELM We firstly fuse the data obtainedfrom both social media and physical sensors using deepautoencoder [5] For those with abundant data we usestacked ELM [4] to train For those with sparse data weuse DAELM [6] and transfer knowledge from regions withabundant data

Figure 2 Segmented regions

33 Typical Technologies

331 Map Segmentation We divide a city into disjointedblocks (httpsgithubcomzxlzrSegment-Maps) assumingthat placement in a block 119892 is uniform Road network isusually composed of a number of major roads such as thering road the city is divided into areas [13] We map theprojection of the vector-based road network on a planeThen the road network is converted into a raster model [17]Actually each pixel of a projected map image can be viewedas a block element of a raster map Consequently the roadnetwork is converted into a binary imageThenwe extract theskeleton of the road while retaining the original two-valueimage topology Figure 2 shows the result of the proceduredescribed above for Beijingrsquos road network Finally we get theblocks 119892 of cities

34 Location Based Feature Selection For each block wecan access the features over its neighbourhood and simplycalculate the average value However there exist problemsif ignoring the location information of the surroundingfeatures There are two reasons we need to consider (1) Fromthe distance perspective if the selected location is far from thetarget location it may have few impacts on the target locationand vice versa (2) From the density perspective if one of theneighbour blocks has a lot of stores while the others have fewmaybe this block has greater influence on the target locationSo we propose a location based feature selection (LBFS)

4 Computational Intelligence and Neuroscience

method to calculate features considering neighbourhoodrsquosimpact Suppose we want to calculate a feature of target block119892 119892 has 119905 neighbours 1198731 1198732 119873119905 119891119894 is the feature vectorof each block119873119894 119889119894 is the distance between119873119894 and 119892 119899119894 is thenumber of feature points (which means there are 119899 stores inblock 119894 and119898119894 is the measure of119873119894)(1) Distance Related Features Some kind of features mayreduce the impact of the storersquos score with the increase ofdistance So we weigh the features with distances Formallywe have

1198911198921 = sum 119891119894119889119894 (1)

(2) Density Related Features Several features may have fewerimpacts on the storersquos score with the increase of density ofsurrounding shops So we weigh the features with densityFormally we have

1198911198922 = sum 119891119894119899119894 (2)

(3) Measure Related Features There may exist some featuresthat may have fewer impacts on the storersquos score with theincrease of measure of surrounding regions So we weigh thefeatures with measure Formally we have

1198911198923 = sum 119891119894119898119894

(3)

In the real situation we use all three kinds of featureselection results Normally each kind of feature in a grid hasthree final feature values as a vector Formally we have

119891119892 = LBFS (119891 119889 119899119898) (4)

where 119891 are the feature vectors of the neighbours 119889 arethe distances between the middle of neighbours and middleof the target block 119899 are the numbers of feature points inneighbours and119898 are the measures of neighbours

341 Urban Knowledge Fusion For each block in cities weobtain the social view and physical view separately Each viewis represented as a feature vector Social view is composedof social media text user comment texts user ratios and soon according to different application requirements Physicalview is composed of physical sensor values like the flowof taxis and buses traffic congestion index real estatePOIs road network air quality meteorological elementsand so on regarding different application requirements Weadopt the deep autoencoder [5] to capture the middle-levelfeature representation between social view and physical viewAs Figure 3 depicts deep learning effectively learns (1) abetter single modality representation with the help of othermodalities and (2) the shared representations capturing thecorrelations across multiple modalities

342 Training Practically we obtain the middle-level fea-ture representation of each block For those with abundant

Social view reconstruction Physical view reconstruction

Social view input Physical view input

middot middot middot

middot middot middot

middot middot middot

middot middot middot

middot middot middot

middot middot middotmiddot middot middot

middot middot middot

middot middot middot

Figure 3 Deep autoencoder of social view and physical view

data we use stacked ELM [4] to train It can be used directlyto solve regression binary and multiclass classificationproblems regarding different applications For those withsparse data we use DAELM [6] to transfer knowledge fromregions with abundant data Actually we can treat differentcities as different domains because data from different citiesmay have different distributions in feature and label spaces[18]We use cities with abounded data as sources and transferknowledge to target ones with sparse data

4 Case Studies

41 Smog-Related Health Hazard Prediction In fact smog isa terrible health hazard that affect peoplersquos health accordingto recent research [19] It is necessary to analyze monitorand forecast smog-related health hazards in a timely mannerIn recent times social media has become an increasinglyimportant channel to observe sentiment trends and eventsFurthermore there are various physical sensors monitoringsmog status such as air quality stations weather stationsand earth observation satellites generating amounts of dataabout severity of smog In this paper we model the smog-related health hazards and smog severity by two indexes(PHI and SSI) using social media The urban smog-relatedhealth hazard prediction problem is a classification problemto assign appropriate class labels (Public Health Index) to theblocks of cities

Public Health Index (PHI) is the sum of total relativefrequencies of smog-related health hazard phrases in thecurrent tweets D-PHI is an enhanced Public Health Indexthat includes consideration of diffusion in social networks

Definition Smog Severity Index (SSI) is the weighted sumof total relative frequencies of smog severity phrases in thecurrent tweets D-SSI is an enhanced Smog Severity Indexthat includes consideration of diffusion in social networks

Firstly we extract both smog-related health hazardphrases and smog severity phrases Secondly we gather rawtweets with time and location tags fromWeibo (a twitter-like

Computational Intelligence and Neuroscience 5

website in China) Thirdly we calculate the daily relativefrequency rf of each phrase

rf (119901) = af (119901 119879119862) times idf (119901 119879119867)

af (119901 119879119862) =sum119889isin119879119862 f (119901 119889)

119879119862

idf (119901 119879119867) = log100381610038161003816100381611987911986710038161003816100381610038161003816100381610038161003816119889 isin 119879119867 119901 isin 1198891003816100381610038161003816

(5)

where 119879119867 and 119879119862 represent historical and current tweet setsrespectively 119901 represents a phrase 119889 represents a tweetf(119901 119889) represents the frequency of phrase 119901 in tweet 119889af(119901 119879119862) represents the average frequency of phrase 119901 in thecurrent tweet set 119879119862 and idf(119901 119879119867) represents the inverseddocument frequency of the tweets with phrase 119901 in thehistorical tweet set 119879119867 The logarithm function is to scale upthe fraction of rare tweets The above algorithm is derivedfrom the typical tf-idf algorithm [20] The difference lies inthe replacement of the largest word frequency in currenttweet set with the size of current tweet set which aims ateliminating the influence of other heat phrases Then PHIand SSI are calculated with the relative frequencies of all thephrases

PHI = sum119901isin1198751

rf (119901)

SSI = sum119901isin1198752

rf (119901) times order (119901) (6)

where 1198751 stands for the set of smog-related health hazardphrases 1198752 stands for the set of smog severity phrases Thensocial network diffusion is considered to calculate D-PHIand D-SSI We calculate the network diffusion-based averagefrequency

daf (119901 119879119862) =sum119889isin119879119862 (f (119901 119889) times (119892 (119889) + 1))

10038161003816100381610038161198791198621003816100381610038161003816 + sum119889isin119879119862 119892 (119889) (7)

where 119892(119889) represents a tweetrsquos total number of retweets andlikes Once daf is calculated we use it to replace the averagefrequency af to calculate the relative frequency rf and furthercompute the value of D-PHI and D-SSI Finally the PHI SSID-PHI and D-SSI make up the feature vectors of social viewin this problem

Moreover we also extract features from air qualityincluding both air pollution concentrations (CO NO2 SO2O3 PM25 and PM10) and air quality index (AQI) whichcomprehensively evaluates the air quality We extract recordsfrom various meteorological elements including humiditycloud value pressure temperature and wind speed all ofwhich have been proven to affect smog disasters greatly Forexample high wind speed and low cloud value usually leadssmog pollution to decrease in the next day

42 Optimal Retain Store Placement The optimal placementof a retail store has been of prime importance For example

a new restaurant set up in a street corner may attractlots of customers but it may close months later if it islocated in a few hundred meters down the road In thispaper the optimal retain store placement problem is a rankproblem We calculate scores for each of the candidate areasand rank them The top-ranked areas will be the optimalregion for placing We get the label data from Dianpingscore (httpwwwdianpingcom) and assume that the dataobserved can evaluate the popularity of a place For thisproblem we analyze the data obtained and build social andphysical view and then build a classifier according to ourframework

A strong regional economy usually indicates highdemand according to recent studies [15 21] Therefore wemined the blockrsquos neighbourrsquos user reviews from httpwwwdianpingcom

(1) Dianping Score For each region 119892 we will retrieveservice quality overall satisfaction environment class andconsumption level by mining the reviews of business venuesneighbourhood regions Region 119892 has 119905 neighbourhoods Weaccess the usersrsquo opinions over the neighbourhood region119873119894to form features

Overall Satisfaction Since the overall rating of a businessvenue in block 119892 represents the satisfaction of users we usethe LBFS to get the overall ratings of all business venueslocated in 119873119894 as a numeric score of overall satisfactionFormally we have

119865119904 = LBFS (overall satisfaction 119889 119899 119898) (8)

Service Quality Similarly we use the LBFS to compute theservice rating of business venues in119873119894 and express the servicequality of the neighbourhood as

119865119902 = LBFS (Service Rating 119889 119899 119898) (9)

Environment ClassThe level of the surrounding environmentcan reflect the current environment class of the area There-fore we use LBFS to get environment rating as

119865119890 = LBFS (environment rating 119889 119899 119898) (10)

Consumption Cost The average value of the consumerrsquosbehavior of the surrounding regions can reflect the currentconsumption of the area So we use LBFS to get the consump-tion cost of business venues of neighbourhood as a feature

119865119888 = LBFS (average cost 119889 119899 119898) (11)

Bus transits are slow and cheap and aremainly distributedin areas having a large number of IT and educational estab-lishments The price of real estate and the traffic congestionindex indicate whether the facility planning is balanced Weexploit these features to uncover the implicit preferences fora neighbourhood

Bus-Related Features Medium income residents choose busBecause most of the cityrsquos residents are from the backgroundof the middle class bus traffic may represent the bulk of

6 Computational Intelligence and Neuroscience

the cityrsquos flow We try to measure the arrival departure andthe volume of the transition on the streets of each block Fora region 119892 we try to use BT as the set of bus trajectories of acity each of which is denoted by a tuple ⟨119901 119889⟩ where 119901 is apickup bus stop and 119889 is a drop-off bus stop [21]

Bus Arriving Departing and Transition Volume We extractthe arriving departing and transition volumes of buses fromsmart card transactions Formally

119865BAV119887 = 1003816100381610038161003816⟨119901 119889⟩ isin BT 119901 notin 119892 119889 isin 1198921003816100381610038161003816 119865BLV119887 = 1003816100381610038161003816⟨119901 119889⟩ isin BT 119901 isin 119892 119889 notin 1198921003816100381610038161003816

119865BTV119887 = 1003816100381610038161003816⟨119901 119889⟩ isin BT 119901 isin 119892 119889 isin 1198921003816100381610038161003816

(12)

Density Recent studies have reported that price premiums ofup to 10 are estimated for retail stores within 400m of alarge number of bus stops The density of the bus station ispositively correlatedwith the value of the retail storeHere weuse smart card transactions and propose alternative methodsand strategies for density estimation of bus stop In factthe number of bus stops in a trip can be approximated byfare Therefore we calculate the distance to the ratio of theestimated density in the vicinity of the bus station

119865BSD119894 = sum119901isin119892119889isin119892 dist (119901 119889) fare (119901 119889)1003816100381610038161003816⟨119901 119889⟩ isin BT 119901 isin 119892 119889 isin 1198921003816100381610038161003816

(13)

Balance The balance of smart cards can show the pattern ofconsumption and supply behavior If some residents alwaysmaintain a high balance on the smart card the huge cost ofbus travel may mean (1) these residents are more dependenton the bus which shows the lack of subway and taxi nearbyand (2) these residents have to go to work in places far awayIn other words these places may be far away So we try to usethe smart card balance as a feature

119865SCB119894 = sum119901isin119892119889isin119892 balance (119901 119889)1003816100381610038161003816⟨119901 119889⟩ isin BT 119901 isin 119892 119889 isin 1198921003816100381610038161003816

(14)

Real Estate Features The real estate prices may reflect thepurchasing power and economic index of this region Firstwe collect the historical prices of each estate andwe use LBFSto calculate estate price of the neighbourhood of each blockFormally we have

119865RE119894 = LBFS (estate price 119889 119899 119898) (15)

Traffic Index Features The flexibility of a region like conve-nient traffic may contribute to the popularity of a regionWe obtain the traffic index from httpnitrafficindexcomFormally we have

119865TI119894 = LBFS (traffic index 119889 119899 119898) (16)

Intraclass Competitiveness We measure the proportion ofneighbouring places of the same type 120574119897 with respect to thetotal number of nearby placesThen we rank areas in reverse

order assuming that the least competitive area is the mostpromising one

119865Intra119894 = minus119873120574119897 (119897 119903)119873 (119897 119903) (17)

However it is worth noting that the retail industryrsquos com-petitive stores and marketing can have a positive or negativeimpact For example one would expect to place a bar ina region with plenty of nightlife locations because therealready exists a service system and there are a lot ofpeople attracted to the area However being surrounded bycompetitors also means to share customers

Interclass Competitiveness In order to consider the iterationsbetween different place categories we adopt the metricsdefined by Jensen [22] In practice we use the intercategorycoefficients described to weigh the desirability of the placesobserved in the area around the object that is the greater thenumber of the places in the area that attracts the object thebetter the quality of the locationMore formally we define thequality of location for a venue of type 120574119897 as

119865Inter119894 = sum

120574119901isinΓ

log (120594120574119901120574119897) times (119873120574119901(119897119903) minus 119873120574119901(119897119903)) (18)

where 119873120574119901(119897119903) means how many venues of type 120574119901 areobserved on average around the places of type 120574119897 Γ is theset of place types and 120594120574119901120574119897 are the intertype attractivenesscoefficients Formally we get

120594120574119901120574119897 =119873 minus119873120574119901119873120574119901 times 119873120574119897

sum119901

119873120574119897 (119901 119903)119873 (119901 119903) minus 119873120574119901

(19)

POIs The POIs indicate the latent patterns of this regionwhich may have contributed to the placement Moreover thecategory of a POI may have a causal relation to it Let ♯(119894 119888)denote the number of POIs of category 119888 isin 119862 located in119892119894 and let ♯(119894) be the total number of POIs of all categorieslocated in 119892119894 The entropy is defined as

119865POI119894 = minussum

119888isin119862

♯ (119894 119888)♯ (119894) times log ♯ (119894 119888)

♯ (119894) (20)

Business Areas Business areas have important influence onoptimal placement The locations will get more customersand resources in the business areas We retrieve the geo-graphic information of business areas from Baidu map apiFinally we build a feature related to business areas Formallywe have

1198651198891 = LBFS (is neighbour business areas 119889 119899 119898)

1198651198892 =

1 if 119892119894 isin Business Area

0 if 119892119894 notin Business Area

119865119889 = [1198651198891 1198651198892]

(21)

where ldquois neighbour business areasrdquo is a Boolean valuewhichmeanswhether the blockrsquos neighbour is a business area

Computational Intelligence and Neuroscience 7

Table 1 Details of the datasets

Datasets Size (M) SourcesComments 2523 httpdianpingcomTweets 11023 httpweibocom httptwittercomBuses 254 httpchelailenetcnTraffic 119 httpwwwnitrafficindexcomReal estate 35 httpwwwsoufuncomAir 534 httpwwwpm25inPOI business areas 10 httpmapbaiducomRoad network 9 httpopenstreetmaporgMeteorological 98 httpforecastio httpnoaagov

5 Experiments

51 UrbanData Table 1 lists the data sources For social viewwe crawl online business reviews from httpwwwdianpingcom which is a site for reviewing business establishmentsin China Each review includes the shop ID name addresslatitude longitude consumption cost star rating POI cat-egory city environment service overall ratings and com-ments We also crawl huge data from Weibo (a twitter-like website in China) and Twitter with geotags For phys-ical view we extract features from smart card transactionsfrom open web API Moreover we crawl the traffic indexfrom httpwwwnitrafficindexcom which is open to thepublic Finally we crawl the estate data from httpwwwsoufuncom which is the largest real estate online systemin China Moreover we crawl huge air quality data fromhttpwwwpm25in We collected POI and business areasdata from Baidu maps for each city The road network datawas gathered from OpenStreetMaps We collected meteo-rological data from the National Oceanic and AtmosphericAdministrationrsquos (NOAA) web service and httpforecastioevery hour All the experiments can be obtained from openweb service The source code of this paper can be obtainedfrom httpsgithubcomzxlzrELM-for-Urban-Computing

52 Metrics In total we have two main metrics for ourframework the precision and efficiency For specific casestudies for smog-related health hazard prediction we havethe following

Precision and Recall The precision and recall are given asfollows

Precision =1003816100381610038161003816cap (prediction set reference set)10038161003816100381610038161003816100381610038161003816prediction set1003816100381610038161003816

Recall =1003816100381610038161003816cap (prediction set reference set)1003816100381610038161003816

|reference set| (22)

For optimal retain store placement we have the following

Normalized Discounted Cumulative Gain The discountedcumulative gain (DCG119873) is given by

DCG [119899] =

rel1 if 119899 = 1

DCG [119899 minus 1] + rel119899log2119899

if 119899 ge 2 (23)

Later given the ideal discounted cumulative gain DCG1015840NDCG at the 119899th position can be computed as NDCG[119899] =DCG[119899]DCG1015840[119899] The larger the value of NDCG119873 thehigher the top-119873 ranking accuracy

Precision and Recall We choose to use a four-level ratingsystem as (3 gt 2 gt 1 gt 0) To simplify our evaluation taskwe treat the ratings less than 2 as low values and ratings of 3as high values In a top-119873 block list 119864119873 sorted in descendingorder of the prediction values we define the precision andrecall as precision119873 = (119864119873cap119864ge2)119873 and recall119873 = (119864119873cap119864ge2)119864ge2 where119864ge2 are blocks whose ratings are greater thanor equal to 2 For efficiency we recorded the training time andcompare with baselines

53 Results For smog-related health hazard prediction thestacked ELM is trained hidden layer ANN with 10 to 40hidden nodes the BP is trained ANN with 2 to 3 hiddenlayers and 8 to 15 nodes in each hidden layer Two classicSVM regression methods nu-SVR and epsilon-SVR areprovided by LIBSVM [23] Random forest regressionmethodis provided by sklearn Meanwhile The accuracies of thehealth hazard prediction models using our framework nu-SVR epsilon-SVR and random forest are shown in Table 2119901(nf) means the precision of methods without knowledgefusion We can find that two ANNsrsquo methods outperform theSVM regression methods and the random forest regressionmethod in forecasting the next day PHI and the ELMachieves slightly higher prediction accuracy than themultiplehidden layers ANNs trained by BP

For optimal retain store placement the models trainedwith a single cityrsquos data are used as baselines Our datasetshave the data obtained from five cities in China The blocksin each city have a label 0 1 2 3 from httpwwwdianpingcom based on the values observed by users For examplewe have ldquo2131rdquo ldquo2rdquo means the id of city ldquo13rdquo means the idof block and the final ldquo2rdquo is the label We treat each city as asingle domain containing hundreds of blocks For exampleif we want to do optimal retain store placement in Hangzhouwe use 75 of data in Hangzhou and all the other citiesrsquo datato train The remaining 25 of data in Hangzhou are fortesting Actually Hangzhou is treated as target domain whilethe other cities are source domains

Figure 4 shows the results of concatenation knowledgefusion and sampling methods for Starbucks The concate-nation method only concatenates social view and physicalview into one single view to adapt to the learning settingThe results show that knowledge fusion works better Thesampling method means we manually filter some negativesamples based on knowledge fusion which will make betterresults as figure shows

8 Computational Intelligence and Neuroscience

Table 2 The results of smog-related health hazard prediction

Cities Stacked ELM BP nu-SVR Epsilon-SVR Random forest119901 (nf) 119901 Time 119901 (nf) 119901 Time 119901 (nf) 119901 Time 119901 (nf) 119901 Time 119901 (nf) 119901 Time

Beijing 065 080 3 066 079 15 078 069 9 056 053 9 010 015 7Tianjin 092 089 3 056 059 13 058 073 10 066 073 11 015 014 8Shanghai 053 088 4 055 069 16 055 064 11 053 053 10 012 030 8Hangzhou 031 086 3 063 065 10 054 055 9 066 083 9 015 034 7Guangzhou 053 054 5 065 069 15 053 066 11 048 053 21 010 034 8Average 075 073 4 056 056 15 059 065 10 059 073 10 020 034 7

3 5 7 100

50

100

ConcatenationKnowledge fusionSampling

(a) NDCG119873

3 5 7 100

50

100

ConcatenationKnowledge fusionSampling

(b) Precision119873

3 5 7 10

ConcatenationKnowledge fusionSampling

0

2

4

6

8

(c) Recall119873

Figure 4 NDCG precision and recall of 119873 for Starbucks in Beijing

Computational Intelligence and Neuroscience 9

3 5 7 100

50

100

LBFSInflection rulesAll

(a) NDCG119873

3 5 7 100

50

100

LBFSInflection rulesAll

(b) Precision119873

3 5 7 10

LBFSInflection rulesAll

0

5

10

(c) Recall119873

Figure 5 NDCG precision and recall of 119873 for Starbucks in Beijing

Figure 5 shows the results of LBFS inflection rules andall methods for Starbucks The LBFS method means weuse location based feature selection to build features whilenot using the average value based on the methods in thepast paragraph Actually LBFS works better The inflectionrules method means we use rules to calculate the finalscore if the area has spatial news such as ldquonew subwaydemolitionrdquo The inflection rules consider the situations thatour algorithm may not cover and make our method morerobust The all method includes DAELM inflection rulesLBFS sampling and knowledge fusion There is no doubtthat the all method works best Moreover our frameworkperforms more efficiently In Table 3 we present the resultsobtained for the NDCG10 metric and training time for all

features across the three chains The numbers in the bracketsare minutes of training In all cases we observe a significantimprovement in precision and efficiency with respect to thebaseline

6 Conclusion

In this paper we propose a general application frameworkof ELM for urban computing and list three case studiesExperimental results showed that our approach is applicableand efficient compared with baselines

In the future we plan to apply our approach to moreapplications Moreover we would like to study the distribu-tion of our framework so it can handle more massive data

10 Computational Intelligence and Neuroscience

Table 3 The best average NDCG10 results of optimal retain storeplacement

Cities Starbucks TrueKungFu YongheKingMART (single city)

Beijing 0743 (15) 0643 (14) 0725 (14)Shanghai 0712 (15) 0689 (14) 0712 (12)Hangzhou 0576 (13) 0611 (12) 0691 (10)Guangzhou 0783 (15) 0691 (13) 0721 (12)Shenzhen 0781 (16) 0711 (15) 0722 (13)

RankBoost (single city)Beijing 0752 (23) 0678 (20) 0712 (19)Shanghai 0725 (25) 0667 (20) 0783 (21)Hangzhou 0723 (21) 0575 (19) 0724 (15)Guangzhou 0812 (22) 0782 (19) 0812 (18)Shenzhen 0724 (22) 0784 (17) 0712 (12)

BP (Single city)Beijing 0753 (45) 0658 (40) 0702 (41)Shanghai 0724 (44) 0657 (42) 07283 (44)Hangzhou 0725 (41) 0555 (40) 0714 (45)Guangzhou 0832 (45) 0772 (42) 0512 (41)Shenzhen 0722 (47) 0754 (42) 0702 (45)

DAELMBeijing 0755 (3) 0712 (2) 0755 (2)Shanghai 0745 (5) 0751 (3) 0783 (1)Hangzhou 0755 (5) 0711 (2) 0752 (5)Guangzhou 0810 (9) 0810 (5) 0823 (4)Shenzhen 0780 (9) 0783 (5) 0723 (8)

Competing Interests

The authors declare that they have no competing interests

References

[1] Y Zheng L Capra O Wolfson and H Yang ldquoUrban comput-ing concepts methodologies and applicationsrdquo ACM Transac-tions on Intelligent Systems and Technology vol 5 no 3 article38 2014

[2] Y Zheng F Liu andH-P Hsieh ldquoU-air when urban air qualityinference meets big datardquo KDD 2013

[3] G-B Huang Q-Y Zhu and C-K Siew ldquoExtreme learningmachine theory and applicationsrdquoNeurocomputing vol 70 no1ndash3 pp 489ndash501 2006

[4] H Zhou G-B Huang Z Lin HWang and Y C Soh ldquoStackedextreme learning machinesrdquo IEEE Transactions on Cyberneticsvol 45 no 9 pp 2013ndash2025 2015

[5] J Ngiam A Khosla M Kim J Nam H Lee and A Y NgldquoMultimodal deep learningrdquo in Proceedings of the 28th Inter-national Conference on Machine Learning (ICML rsquo11) BellevueWash USA 2011

[6] L Zhang and D Zhang ldquoDomain adaptation extreme learningmachines for drift compensation in e-nose systemsrdquo IEEETransactions on Instrumentation and Measurement vol 64 no7 pp 1790ndash1801 2015

[7] Y Zheng ldquoMethodologies for cross-domain data fusion anoverviewrdquo IEEE Transactions on Big Data vol 1 no 1 pp 16ndash342015

[8] Y Zheng Y Liu J Yuan and X Xie ldquoUrban computing withtaxicabsrdquo HUC 2011

[9] N Srivastava and R Salakhutdinov ldquoMultimodal learningwith deep Boltzmann machinesrdquo Journal of Machine LearningResearch vol 15 pp 2949ndash2980 2014

[10] A Blum and T Mitchell ldquoCombining labeled and unlabeleddata with co-trainingrdquo in Proceedings of the 11th Annual Confer-ence on Computational LearningTheory (COLT rsquo98) pp 92ndash100Madison Wis USA July 1998

[11] M Gonen and E Alpaydın ldquoMultiple kernel learning algo-rithmsrdquo Journal of Machine Learning Research vol 12 pp 2211ndash2268 2011

[12] Y Zheng X YiM Li et al ldquoForecasting fine-grained air qualitybased on big datardquo in Proceedings of the 21st ACM SIGKDDConference onKnowledge Discovery andDataMining (KDD rsquo15)pp 2267ndash2276 August 2015

[13] N J Yuan Y Zheng X Xie Y Wang K Zheng and H XiongldquoDiscovering urban functional zones using latent activity trajec-toriesrdquo IEEE Transactions on Knowledge and Data Engineeringvol 27 no 3 pp 712ndash725 2015

[14] J Chen H Chen Z Wu D Hu and J Z Pan ldquoForecastingsmog-related health hazard based on social media and physicalsensorrdquo Information Systems 2016

[15] S Scellato A Noulas and C Mascolo ldquoExploiting place fea-tures in link prediction on location-based social networksrdquo inProceedings of the 17th ACM SIGKDD International Conferenceon Knowledge Discovery and Data Mining (KDD rsquo11) pp 1046ndash1054 August 2011

[16] N Zhang H Chen X Chen and J Chen ldquoELM meets urbancomputing ensemble urban data for smart city applicationrdquo inProceedings of ELM-2015 Volume 1 pp 51ndash63 Springer 2016

[17] J Yuan Y Zheng and X Xie ldquoDiscovering regions of differentfunctions in a city using human mobility and POIsrdquo in Pro-ceedings of the 18th ACM SIGKDD International Conference onKnowledge Discovery and Data Mining (KDD rsquo12) pp 186ndash194ACM Beijing China August 2012

[18] Y Wei Y Zheng and Q Yang ldquoTransfer knowledge betweencitiesrdquo KDD 2016

[19] V Hughes ldquoWhere therersquos smokerdquoNature vol 489 no 7417 ppS18ndashS20 2012

[20] J Ramos ldquoUsing tf-idf to determine word relevance in doc-ument queriesrdquo in Proceedings of the 1st Instructional Confer-ence on Machine Learning 2003 httpswwwsemanticscholarorgpaperUsing-Tf-idf-to-Determine-Word-Relevance-in-Ramosb3bf6373ff41a115197cb5b30e57830c16130c2c

[21] Y Fu Y Ge Y Zheng et al ldquoSparse real estate rankingwith online user reviews and offline moving behaviorsrdquo inProceedings of the 14th IEEE International Conference on DataMining (ICDM rsquo14) pp 120ndash129 Shenzhen China December2014

[22] P Jensen ldquoNetwork-based predictions of retail store commer-cial categories and optimal locationsrdquo Physical Review E vol 74no 3 Article ID 035101 2006

[23] C-C Chang and C-J Lin ldquoLIBSVM a Library for supportvector machinesrdquo ACM Transactions on Intelligent Systems andTechnology vol 2 no 3 article 27 2011

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 3: Research Article ELM Meets Urban Big Data …downloads.hindawi.com/journals/cin/2016/4970246.pdfResearch Article ELM Meets Urban Big Data Analysis: Case Studies NingyuZhang, 1 HuajunChen,

Computational Intelligence and Neuroscience 3

Datasets Knowledge fusion

Stacked ELM

Social view

Domain adaptation ELM

Applications

Physical view

Air quality

Waterlogging

Store placement

Figure 1 General application framework of ELM for urban computing

many conventional machine learning methods suffer keychallenges summarized as follows

(1) Poor Learning Efficiency As there are huge urban datagenerated from social media and physical sensors it is greatchallenge for the training efficiency ofmachine learning Seri-ously many applications of urban computing have timelinessrequirements such as disaster and traffic monitor It is veryurgent to train a model as fast as possible and get the finalresult

(2) Complex Model to Handle Heterogeneous Data Datafrom different sources consist of multiple modalities eachof which has different representation distribution scale anddensity For example text is usually represented as discretesparse word count vectors whereas an image is representedby pixel intensities or outputs of feature extractors whichare real-valued and dense POIs are represented by spatialpoints associated with a static category whereas air quality isrepresented using a geotagged time series Human mobilitydata is represented by trajectories whereas a road network isdenoted as a spatial graph Treating different datasets equallyor simply concatenating the features from disparate datasetscannot achieve a good performance in data mining tasksAs a result fusing data across modalities becomes a newchallenge in big data research calling for advanced datafusion technology

(3) Sparsity of Data and Hard for Implementation It is quiteeasy to retrieve kinds of data in a metropolis Neverthelesssome relatively small cities have small populations andhence relatively low socialmedia activity Specifically we facethe following two challenges (1) the label scarcity problemand (2) the data insufficiency problemTherefore it is difficultto train and implement in real situations

32 General Framework As Figure 1 shows our frameworkconsists of two parts urban knowledge fusion and trainingwhich is based on ELM We firstly fuse the data obtainedfrom both social media and physical sensors using deepautoencoder [5] For those with abundant data we usestacked ELM [4] to train For those with sparse data weuse DAELM [6] and transfer knowledge from regions withabundant data

Figure 2 Segmented regions

33 Typical Technologies

331 Map Segmentation We divide a city into disjointedblocks (httpsgithubcomzxlzrSegment-Maps) assumingthat placement in a block 119892 is uniform Road network isusually composed of a number of major roads such as thering road the city is divided into areas [13] We map theprojection of the vector-based road network on a planeThen the road network is converted into a raster model [17]Actually each pixel of a projected map image can be viewedas a block element of a raster map Consequently the roadnetwork is converted into a binary imageThenwe extract theskeleton of the road while retaining the original two-valueimage topology Figure 2 shows the result of the proceduredescribed above for Beijingrsquos road network Finally we get theblocks 119892 of cities

34 Location Based Feature Selection For each block wecan access the features over its neighbourhood and simplycalculate the average value However there exist problemsif ignoring the location information of the surroundingfeatures There are two reasons we need to consider (1) Fromthe distance perspective if the selected location is far from thetarget location it may have few impacts on the target locationand vice versa (2) From the density perspective if one of theneighbour blocks has a lot of stores while the others have fewmaybe this block has greater influence on the target locationSo we propose a location based feature selection (LBFS)

4 Computational Intelligence and Neuroscience

method to calculate features considering neighbourhoodrsquosimpact Suppose we want to calculate a feature of target block119892 119892 has 119905 neighbours 1198731 1198732 119873119905 119891119894 is the feature vectorof each block119873119894 119889119894 is the distance between119873119894 and 119892 119899119894 is thenumber of feature points (which means there are 119899 stores inblock 119894 and119898119894 is the measure of119873119894)(1) Distance Related Features Some kind of features mayreduce the impact of the storersquos score with the increase ofdistance So we weigh the features with distances Formallywe have

1198911198921 = sum 119891119894119889119894 (1)

(2) Density Related Features Several features may have fewerimpacts on the storersquos score with the increase of density ofsurrounding shops So we weigh the features with densityFormally we have

1198911198922 = sum 119891119894119899119894 (2)

(3) Measure Related Features There may exist some featuresthat may have fewer impacts on the storersquos score with theincrease of measure of surrounding regions So we weigh thefeatures with measure Formally we have

1198911198923 = sum 119891119894119898119894

(3)

In the real situation we use all three kinds of featureselection results Normally each kind of feature in a grid hasthree final feature values as a vector Formally we have

119891119892 = LBFS (119891 119889 119899119898) (4)

where 119891 are the feature vectors of the neighbours 119889 arethe distances between the middle of neighbours and middleof the target block 119899 are the numbers of feature points inneighbours and119898 are the measures of neighbours

341 Urban Knowledge Fusion For each block in cities weobtain the social view and physical view separately Each viewis represented as a feature vector Social view is composedof social media text user comment texts user ratios and soon according to different application requirements Physicalview is composed of physical sensor values like the flowof taxis and buses traffic congestion index real estatePOIs road network air quality meteorological elementsand so on regarding different application requirements Weadopt the deep autoencoder [5] to capture the middle-levelfeature representation between social view and physical viewAs Figure 3 depicts deep learning effectively learns (1) abetter single modality representation with the help of othermodalities and (2) the shared representations capturing thecorrelations across multiple modalities

342 Training Practically we obtain the middle-level fea-ture representation of each block For those with abundant

Social view reconstruction Physical view reconstruction

Social view input Physical view input

middot middot middot

middot middot middot

middot middot middot

middot middot middot

middot middot middot

middot middot middotmiddot middot middot

middot middot middot

middot middot middot

Figure 3 Deep autoencoder of social view and physical view

data we use stacked ELM [4] to train It can be used directlyto solve regression binary and multiclass classificationproblems regarding different applications For those withsparse data we use DAELM [6] to transfer knowledge fromregions with abundant data Actually we can treat differentcities as different domains because data from different citiesmay have different distributions in feature and label spaces[18]We use cities with abounded data as sources and transferknowledge to target ones with sparse data

4 Case Studies

41 Smog-Related Health Hazard Prediction In fact smog isa terrible health hazard that affect peoplersquos health accordingto recent research [19] It is necessary to analyze monitorand forecast smog-related health hazards in a timely mannerIn recent times social media has become an increasinglyimportant channel to observe sentiment trends and eventsFurthermore there are various physical sensors monitoringsmog status such as air quality stations weather stationsand earth observation satellites generating amounts of dataabout severity of smog In this paper we model the smog-related health hazards and smog severity by two indexes(PHI and SSI) using social media The urban smog-relatedhealth hazard prediction problem is a classification problemto assign appropriate class labels (Public Health Index) to theblocks of cities

Public Health Index (PHI) is the sum of total relativefrequencies of smog-related health hazard phrases in thecurrent tweets D-PHI is an enhanced Public Health Indexthat includes consideration of diffusion in social networks

Definition Smog Severity Index (SSI) is the weighted sumof total relative frequencies of smog severity phrases in thecurrent tweets D-SSI is an enhanced Smog Severity Indexthat includes consideration of diffusion in social networks

Firstly we extract both smog-related health hazardphrases and smog severity phrases Secondly we gather rawtweets with time and location tags fromWeibo (a twitter-like

Computational Intelligence and Neuroscience 5

website in China) Thirdly we calculate the daily relativefrequency rf of each phrase

rf (119901) = af (119901 119879119862) times idf (119901 119879119867)

af (119901 119879119862) =sum119889isin119879119862 f (119901 119889)

119879119862

idf (119901 119879119867) = log100381610038161003816100381611987911986710038161003816100381610038161003816100381610038161003816119889 isin 119879119867 119901 isin 1198891003816100381610038161003816

(5)

where 119879119867 and 119879119862 represent historical and current tweet setsrespectively 119901 represents a phrase 119889 represents a tweetf(119901 119889) represents the frequency of phrase 119901 in tweet 119889af(119901 119879119862) represents the average frequency of phrase 119901 in thecurrent tweet set 119879119862 and idf(119901 119879119867) represents the inverseddocument frequency of the tweets with phrase 119901 in thehistorical tweet set 119879119867 The logarithm function is to scale upthe fraction of rare tweets The above algorithm is derivedfrom the typical tf-idf algorithm [20] The difference lies inthe replacement of the largest word frequency in currenttweet set with the size of current tweet set which aims ateliminating the influence of other heat phrases Then PHIand SSI are calculated with the relative frequencies of all thephrases

PHI = sum119901isin1198751

rf (119901)

SSI = sum119901isin1198752

rf (119901) times order (119901) (6)

where 1198751 stands for the set of smog-related health hazardphrases 1198752 stands for the set of smog severity phrases Thensocial network diffusion is considered to calculate D-PHIand D-SSI We calculate the network diffusion-based averagefrequency

daf (119901 119879119862) =sum119889isin119879119862 (f (119901 119889) times (119892 (119889) + 1))

10038161003816100381610038161198791198621003816100381610038161003816 + sum119889isin119879119862 119892 (119889) (7)

where 119892(119889) represents a tweetrsquos total number of retweets andlikes Once daf is calculated we use it to replace the averagefrequency af to calculate the relative frequency rf and furthercompute the value of D-PHI and D-SSI Finally the PHI SSID-PHI and D-SSI make up the feature vectors of social viewin this problem

Moreover we also extract features from air qualityincluding both air pollution concentrations (CO NO2 SO2O3 PM25 and PM10) and air quality index (AQI) whichcomprehensively evaluates the air quality We extract recordsfrom various meteorological elements including humiditycloud value pressure temperature and wind speed all ofwhich have been proven to affect smog disasters greatly Forexample high wind speed and low cloud value usually leadssmog pollution to decrease in the next day

42 Optimal Retain Store Placement The optimal placementof a retail store has been of prime importance For example

a new restaurant set up in a street corner may attractlots of customers but it may close months later if it islocated in a few hundred meters down the road In thispaper the optimal retain store placement problem is a rankproblem We calculate scores for each of the candidate areasand rank them The top-ranked areas will be the optimalregion for placing We get the label data from Dianpingscore (httpwwwdianpingcom) and assume that the dataobserved can evaluate the popularity of a place For thisproblem we analyze the data obtained and build social andphysical view and then build a classifier according to ourframework

A strong regional economy usually indicates highdemand according to recent studies [15 21] Therefore wemined the blockrsquos neighbourrsquos user reviews from httpwwwdianpingcom

(1) Dianping Score For each region 119892 we will retrieveservice quality overall satisfaction environment class andconsumption level by mining the reviews of business venuesneighbourhood regions Region 119892 has 119905 neighbourhoods Weaccess the usersrsquo opinions over the neighbourhood region119873119894to form features

Overall Satisfaction Since the overall rating of a businessvenue in block 119892 represents the satisfaction of users we usethe LBFS to get the overall ratings of all business venueslocated in 119873119894 as a numeric score of overall satisfactionFormally we have

119865119904 = LBFS (overall satisfaction 119889 119899 119898) (8)

Service Quality Similarly we use the LBFS to compute theservice rating of business venues in119873119894 and express the servicequality of the neighbourhood as

119865119902 = LBFS (Service Rating 119889 119899 119898) (9)

Environment ClassThe level of the surrounding environmentcan reflect the current environment class of the area There-fore we use LBFS to get environment rating as

119865119890 = LBFS (environment rating 119889 119899 119898) (10)

Consumption Cost The average value of the consumerrsquosbehavior of the surrounding regions can reflect the currentconsumption of the area So we use LBFS to get the consump-tion cost of business venues of neighbourhood as a feature

119865119888 = LBFS (average cost 119889 119899 119898) (11)

Bus transits are slow and cheap and aremainly distributedin areas having a large number of IT and educational estab-lishments The price of real estate and the traffic congestionindex indicate whether the facility planning is balanced Weexploit these features to uncover the implicit preferences fora neighbourhood

Bus-Related Features Medium income residents choose busBecause most of the cityrsquos residents are from the backgroundof the middle class bus traffic may represent the bulk of

6 Computational Intelligence and Neuroscience

the cityrsquos flow We try to measure the arrival departure andthe volume of the transition on the streets of each block Fora region 119892 we try to use BT as the set of bus trajectories of acity each of which is denoted by a tuple ⟨119901 119889⟩ where 119901 is apickup bus stop and 119889 is a drop-off bus stop [21]

Bus Arriving Departing and Transition Volume We extractthe arriving departing and transition volumes of buses fromsmart card transactions Formally

119865BAV119887 = 1003816100381610038161003816⟨119901 119889⟩ isin BT 119901 notin 119892 119889 isin 1198921003816100381610038161003816 119865BLV119887 = 1003816100381610038161003816⟨119901 119889⟩ isin BT 119901 isin 119892 119889 notin 1198921003816100381610038161003816

119865BTV119887 = 1003816100381610038161003816⟨119901 119889⟩ isin BT 119901 isin 119892 119889 isin 1198921003816100381610038161003816

(12)

Density Recent studies have reported that price premiums ofup to 10 are estimated for retail stores within 400m of alarge number of bus stops The density of the bus station ispositively correlatedwith the value of the retail storeHere weuse smart card transactions and propose alternative methodsand strategies for density estimation of bus stop In factthe number of bus stops in a trip can be approximated byfare Therefore we calculate the distance to the ratio of theestimated density in the vicinity of the bus station

119865BSD119894 = sum119901isin119892119889isin119892 dist (119901 119889) fare (119901 119889)1003816100381610038161003816⟨119901 119889⟩ isin BT 119901 isin 119892 119889 isin 1198921003816100381610038161003816

(13)

Balance The balance of smart cards can show the pattern ofconsumption and supply behavior If some residents alwaysmaintain a high balance on the smart card the huge cost ofbus travel may mean (1) these residents are more dependenton the bus which shows the lack of subway and taxi nearbyand (2) these residents have to go to work in places far awayIn other words these places may be far away So we try to usethe smart card balance as a feature

119865SCB119894 = sum119901isin119892119889isin119892 balance (119901 119889)1003816100381610038161003816⟨119901 119889⟩ isin BT 119901 isin 119892 119889 isin 1198921003816100381610038161003816

(14)

Real Estate Features The real estate prices may reflect thepurchasing power and economic index of this region Firstwe collect the historical prices of each estate andwe use LBFSto calculate estate price of the neighbourhood of each blockFormally we have

119865RE119894 = LBFS (estate price 119889 119899 119898) (15)

Traffic Index Features The flexibility of a region like conve-nient traffic may contribute to the popularity of a regionWe obtain the traffic index from httpnitrafficindexcomFormally we have

119865TI119894 = LBFS (traffic index 119889 119899 119898) (16)

Intraclass Competitiveness We measure the proportion ofneighbouring places of the same type 120574119897 with respect to thetotal number of nearby placesThen we rank areas in reverse

order assuming that the least competitive area is the mostpromising one

119865Intra119894 = minus119873120574119897 (119897 119903)119873 (119897 119903) (17)

However it is worth noting that the retail industryrsquos com-petitive stores and marketing can have a positive or negativeimpact For example one would expect to place a bar ina region with plenty of nightlife locations because therealready exists a service system and there are a lot ofpeople attracted to the area However being surrounded bycompetitors also means to share customers

Interclass Competitiveness In order to consider the iterationsbetween different place categories we adopt the metricsdefined by Jensen [22] In practice we use the intercategorycoefficients described to weigh the desirability of the placesobserved in the area around the object that is the greater thenumber of the places in the area that attracts the object thebetter the quality of the locationMore formally we define thequality of location for a venue of type 120574119897 as

119865Inter119894 = sum

120574119901isinΓ

log (120594120574119901120574119897) times (119873120574119901(119897119903) minus 119873120574119901(119897119903)) (18)

where 119873120574119901(119897119903) means how many venues of type 120574119901 areobserved on average around the places of type 120574119897 Γ is theset of place types and 120594120574119901120574119897 are the intertype attractivenesscoefficients Formally we get

120594120574119901120574119897 =119873 minus119873120574119901119873120574119901 times 119873120574119897

sum119901

119873120574119897 (119901 119903)119873 (119901 119903) minus 119873120574119901

(19)

POIs The POIs indicate the latent patterns of this regionwhich may have contributed to the placement Moreover thecategory of a POI may have a causal relation to it Let ♯(119894 119888)denote the number of POIs of category 119888 isin 119862 located in119892119894 and let ♯(119894) be the total number of POIs of all categorieslocated in 119892119894 The entropy is defined as

119865POI119894 = minussum

119888isin119862

♯ (119894 119888)♯ (119894) times log ♯ (119894 119888)

♯ (119894) (20)

Business Areas Business areas have important influence onoptimal placement The locations will get more customersand resources in the business areas We retrieve the geo-graphic information of business areas from Baidu map apiFinally we build a feature related to business areas Formallywe have

1198651198891 = LBFS (is neighbour business areas 119889 119899 119898)

1198651198892 =

1 if 119892119894 isin Business Area

0 if 119892119894 notin Business Area

119865119889 = [1198651198891 1198651198892]

(21)

where ldquois neighbour business areasrdquo is a Boolean valuewhichmeanswhether the blockrsquos neighbour is a business area

Computational Intelligence and Neuroscience 7

Table 1 Details of the datasets

Datasets Size (M) SourcesComments 2523 httpdianpingcomTweets 11023 httpweibocom httptwittercomBuses 254 httpchelailenetcnTraffic 119 httpwwwnitrafficindexcomReal estate 35 httpwwwsoufuncomAir 534 httpwwwpm25inPOI business areas 10 httpmapbaiducomRoad network 9 httpopenstreetmaporgMeteorological 98 httpforecastio httpnoaagov

5 Experiments

51 UrbanData Table 1 lists the data sources For social viewwe crawl online business reviews from httpwwwdianpingcom which is a site for reviewing business establishmentsin China Each review includes the shop ID name addresslatitude longitude consumption cost star rating POI cat-egory city environment service overall ratings and com-ments We also crawl huge data from Weibo (a twitter-like website in China) and Twitter with geotags For phys-ical view we extract features from smart card transactionsfrom open web API Moreover we crawl the traffic indexfrom httpwwwnitrafficindexcom which is open to thepublic Finally we crawl the estate data from httpwwwsoufuncom which is the largest real estate online systemin China Moreover we crawl huge air quality data fromhttpwwwpm25in We collected POI and business areasdata from Baidu maps for each city The road network datawas gathered from OpenStreetMaps We collected meteo-rological data from the National Oceanic and AtmosphericAdministrationrsquos (NOAA) web service and httpforecastioevery hour All the experiments can be obtained from openweb service The source code of this paper can be obtainedfrom httpsgithubcomzxlzrELM-for-Urban-Computing

52 Metrics In total we have two main metrics for ourframework the precision and efficiency For specific casestudies for smog-related health hazard prediction we havethe following

Precision and Recall The precision and recall are given asfollows

Precision =1003816100381610038161003816cap (prediction set reference set)10038161003816100381610038161003816100381610038161003816prediction set1003816100381610038161003816

Recall =1003816100381610038161003816cap (prediction set reference set)1003816100381610038161003816

|reference set| (22)

For optimal retain store placement we have the following

Normalized Discounted Cumulative Gain The discountedcumulative gain (DCG119873) is given by

DCG [119899] =

rel1 if 119899 = 1

DCG [119899 minus 1] + rel119899log2119899

if 119899 ge 2 (23)

Later given the ideal discounted cumulative gain DCG1015840NDCG at the 119899th position can be computed as NDCG[119899] =DCG[119899]DCG1015840[119899] The larger the value of NDCG119873 thehigher the top-119873 ranking accuracy

Precision and Recall We choose to use a four-level ratingsystem as (3 gt 2 gt 1 gt 0) To simplify our evaluation taskwe treat the ratings less than 2 as low values and ratings of 3as high values In a top-119873 block list 119864119873 sorted in descendingorder of the prediction values we define the precision andrecall as precision119873 = (119864119873cap119864ge2)119873 and recall119873 = (119864119873cap119864ge2)119864ge2 where119864ge2 are blocks whose ratings are greater thanor equal to 2 For efficiency we recorded the training time andcompare with baselines

53 Results For smog-related health hazard prediction thestacked ELM is trained hidden layer ANN with 10 to 40hidden nodes the BP is trained ANN with 2 to 3 hiddenlayers and 8 to 15 nodes in each hidden layer Two classicSVM regression methods nu-SVR and epsilon-SVR areprovided by LIBSVM [23] Random forest regressionmethodis provided by sklearn Meanwhile The accuracies of thehealth hazard prediction models using our framework nu-SVR epsilon-SVR and random forest are shown in Table 2119901(nf) means the precision of methods without knowledgefusion We can find that two ANNsrsquo methods outperform theSVM regression methods and the random forest regressionmethod in forecasting the next day PHI and the ELMachieves slightly higher prediction accuracy than themultiplehidden layers ANNs trained by BP

For optimal retain store placement the models trainedwith a single cityrsquos data are used as baselines Our datasetshave the data obtained from five cities in China The blocksin each city have a label 0 1 2 3 from httpwwwdianpingcom based on the values observed by users For examplewe have ldquo2131rdquo ldquo2rdquo means the id of city ldquo13rdquo means the idof block and the final ldquo2rdquo is the label We treat each city as asingle domain containing hundreds of blocks For exampleif we want to do optimal retain store placement in Hangzhouwe use 75 of data in Hangzhou and all the other citiesrsquo datato train The remaining 25 of data in Hangzhou are fortesting Actually Hangzhou is treated as target domain whilethe other cities are source domains

Figure 4 shows the results of concatenation knowledgefusion and sampling methods for Starbucks The concate-nation method only concatenates social view and physicalview into one single view to adapt to the learning settingThe results show that knowledge fusion works better Thesampling method means we manually filter some negativesamples based on knowledge fusion which will make betterresults as figure shows

8 Computational Intelligence and Neuroscience

Table 2 The results of smog-related health hazard prediction

Cities Stacked ELM BP nu-SVR Epsilon-SVR Random forest119901 (nf) 119901 Time 119901 (nf) 119901 Time 119901 (nf) 119901 Time 119901 (nf) 119901 Time 119901 (nf) 119901 Time

Beijing 065 080 3 066 079 15 078 069 9 056 053 9 010 015 7Tianjin 092 089 3 056 059 13 058 073 10 066 073 11 015 014 8Shanghai 053 088 4 055 069 16 055 064 11 053 053 10 012 030 8Hangzhou 031 086 3 063 065 10 054 055 9 066 083 9 015 034 7Guangzhou 053 054 5 065 069 15 053 066 11 048 053 21 010 034 8Average 075 073 4 056 056 15 059 065 10 059 073 10 020 034 7

3 5 7 100

50

100

ConcatenationKnowledge fusionSampling

(a) NDCG119873

3 5 7 100

50

100

ConcatenationKnowledge fusionSampling

(b) Precision119873

3 5 7 10

ConcatenationKnowledge fusionSampling

0

2

4

6

8

(c) Recall119873

Figure 4 NDCG precision and recall of 119873 for Starbucks in Beijing

Computational Intelligence and Neuroscience 9

3 5 7 100

50

100

LBFSInflection rulesAll

(a) NDCG119873

3 5 7 100

50

100

LBFSInflection rulesAll

(b) Precision119873

3 5 7 10

LBFSInflection rulesAll

0

5

10

(c) Recall119873

Figure 5 NDCG precision and recall of 119873 for Starbucks in Beijing

Figure 5 shows the results of LBFS inflection rules andall methods for Starbucks The LBFS method means weuse location based feature selection to build features whilenot using the average value based on the methods in thepast paragraph Actually LBFS works better The inflectionrules method means we use rules to calculate the finalscore if the area has spatial news such as ldquonew subwaydemolitionrdquo The inflection rules consider the situations thatour algorithm may not cover and make our method morerobust The all method includes DAELM inflection rulesLBFS sampling and knowledge fusion There is no doubtthat the all method works best Moreover our frameworkperforms more efficiently In Table 3 we present the resultsobtained for the NDCG10 metric and training time for all

features across the three chains The numbers in the bracketsare minutes of training In all cases we observe a significantimprovement in precision and efficiency with respect to thebaseline

6 Conclusion

In this paper we propose a general application frameworkof ELM for urban computing and list three case studiesExperimental results showed that our approach is applicableand efficient compared with baselines

In the future we plan to apply our approach to moreapplications Moreover we would like to study the distribu-tion of our framework so it can handle more massive data

10 Computational Intelligence and Neuroscience

Table 3 The best average NDCG10 results of optimal retain storeplacement

Cities Starbucks TrueKungFu YongheKingMART (single city)

Beijing 0743 (15) 0643 (14) 0725 (14)Shanghai 0712 (15) 0689 (14) 0712 (12)Hangzhou 0576 (13) 0611 (12) 0691 (10)Guangzhou 0783 (15) 0691 (13) 0721 (12)Shenzhen 0781 (16) 0711 (15) 0722 (13)

RankBoost (single city)Beijing 0752 (23) 0678 (20) 0712 (19)Shanghai 0725 (25) 0667 (20) 0783 (21)Hangzhou 0723 (21) 0575 (19) 0724 (15)Guangzhou 0812 (22) 0782 (19) 0812 (18)Shenzhen 0724 (22) 0784 (17) 0712 (12)

BP (Single city)Beijing 0753 (45) 0658 (40) 0702 (41)Shanghai 0724 (44) 0657 (42) 07283 (44)Hangzhou 0725 (41) 0555 (40) 0714 (45)Guangzhou 0832 (45) 0772 (42) 0512 (41)Shenzhen 0722 (47) 0754 (42) 0702 (45)

DAELMBeijing 0755 (3) 0712 (2) 0755 (2)Shanghai 0745 (5) 0751 (3) 0783 (1)Hangzhou 0755 (5) 0711 (2) 0752 (5)Guangzhou 0810 (9) 0810 (5) 0823 (4)Shenzhen 0780 (9) 0783 (5) 0723 (8)

Competing Interests

The authors declare that they have no competing interests

References

[1] Y Zheng L Capra O Wolfson and H Yang ldquoUrban comput-ing concepts methodologies and applicationsrdquo ACM Transac-tions on Intelligent Systems and Technology vol 5 no 3 article38 2014

[2] Y Zheng F Liu andH-P Hsieh ldquoU-air when urban air qualityinference meets big datardquo KDD 2013

[3] G-B Huang Q-Y Zhu and C-K Siew ldquoExtreme learningmachine theory and applicationsrdquoNeurocomputing vol 70 no1ndash3 pp 489ndash501 2006

[4] H Zhou G-B Huang Z Lin HWang and Y C Soh ldquoStackedextreme learning machinesrdquo IEEE Transactions on Cyberneticsvol 45 no 9 pp 2013ndash2025 2015

[5] J Ngiam A Khosla M Kim J Nam H Lee and A Y NgldquoMultimodal deep learningrdquo in Proceedings of the 28th Inter-national Conference on Machine Learning (ICML rsquo11) BellevueWash USA 2011

[6] L Zhang and D Zhang ldquoDomain adaptation extreme learningmachines for drift compensation in e-nose systemsrdquo IEEETransactions on Instrumentation and Measurement vol 64 no7 pp 1790ndash1801 2015

[7] Y Zheng ldquoMethodologies for cross-domain data fusion anoverviewrdquo IEEE Transactions on Big Data vol 1 no 1 pp 16ndash342015

[8] Y Zheng Y Liu J Yuan and X Xie ldquoUrban computing withtaxicabsrdquo HUC 2011

[9] N Srivastava and R Salakhutdinov ldquoMultimodal learningwith deep Boltzmann machinesrdquo Journal of Machine LearningResearch vol 15 pp 2949ndash2980 2014

[10] A Blum and T Mitchell ldquoCombining labeled and unlabeleddata with co-trainingrdquo in Proceedings of the 11th Annual Confer-ence on Computational LearningTheory (COLT rsquo98) pp 92ndash100Madison Wis USA July 1998

[11] M Gonen and E Alpaydın ldquoMultiple kernel learning algo-rithmsrdquo Journal of Machine Learning Research vol 12 pp 2211ndash2268 2011

[12] Y Zheng X YiM Li et al ldquoForecasting fine-grained air qualitybased on big datardquo in Proceedings of the 21st ACM SIGKDDConference onKnowledge Discovery andDataMining (KDD rsquo15)pp 2267ndash2276 August 2015

[13] N J Yuan Y Zheng X Xie Y Wang K Zheng and H XiongldquoDiscovering urban functional zones using latent activity trajec-toriesrdquo IEEE Transactions on Knowledge and Data Engineeringvol 27 no 3 pp 712ndash725 2015

[14] J Chen H Chen Z Wu D Hu and J Z Pan ldquoForecastingsmog-related health hazard based on social media and physicalsensorrdquo Information Systems 2016

[15] S Scellato A Noulas and C Mascolo ldquoExploiting place fea-tures in link prediction on location-based social networksrdquo inProceedings of the 17th ACM SIGKDD International Conferenceon Knowledge Discovery and Data Mining (KDD rsquo11) pp 1046ndash1054 August 2011

[16] N Zhang H Chen X Chen and J Chen ldquoELM meets urbancomputing ensemble urban data for smart city applicationrdquo inProceedings of ELM-2015 Volume 1 pp 51ndash63 Springer 2016

[17] J Yuan Y Zheng and X Xie ldquoDiscovering regions of differentfunctions in a city using human mobility and POIsrdquo in Pro-ceedings of the 18th ACM SIGKDD International Conference onKnowledge Discovery and Data Mining (KDD rsquo12) pp 186ndash194ACM Beijing China August 2012

[18] Y Wei Y Zheng and Q Yang ldquoTransfer knowledge betweencitiesrdquo KDD 2016

[19] V Hughes ldquoWhere therersquos smokerdquoNature vol 489 no 7417 ppS18ndashS20 2012

[20] J Ramos ldquoUsing tf-idf to determine word relevance in doc-ument queriesrdquo in Proceedings of the 1st Instructional Confer-ence on Machine Learning 2003 httpswwwsemanticscholarorgpaperUsing-Tf-idf-to-Determine-Word-Relevance-in-Ramosb3bf6373ff41a115197cb5b30e57830c16130c2c

[21] Y Fu Y Ge Y Zheng et al ldquoSparse real estate rankingwith online user reviews and offline moving behaviorsrdquo inProceedings of the 14th IEEE International Conference on DataMining (ICDM rsquo14) pp 120ndash129 Shenzhen China December2014

[22] P Jensen ldquoNetwork-based predictions of retail store commer-cial categories and optimal locationsrdquo Physical Review E vol 74no 3 Article ID 035101 2006

[23] C-C Chang and C-J Lin ldquoLIBSVM a Library for supportvector machinesrdquo ACM Transactions on Intelligent Systems andTechnology vol 2 no 3 article 27 2011

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 4: Research Article ELM Meets Urban Big Data …downloads.hindawi.com/journals/cin/2016/4970246.pdfResearch Article ELM Meets Urban Big Data Analysis: Case Studies NingyuZhang, 1 HuajunChen,

4 Computational Intelligence and Neuroscience

method to calculate features considering neighbourhoodrsquosimpact Suppose we want to calculate a feature of target block119892 119892 has 119905 neighbours 1198731 1198732 119873119905 119891119894 is the feature vectorof each block119873119894 119889119894 is the distance between119873119894 and 119892 119899119894 is thenumber of feature points (which means there are 119899 stores inblock 119894 and119898119894 is the measure of119873119894)(1) Distance Related Features Some kind of features mayreduce the impact of the storersquos score with the increase ofdistance So we weigh the features with distances Formallywe have

1198911198921 = sum 119891119894119889119894 (1)

(2) Density Related Features Several features may have fewerimpacts on the storersquos score with the increase of density ofsurrounding shops So we weigh the features with densityFormally we have

1198911198922 = sum 119891119894119899119894 (2)

(3) Measure Related Features There may exist some featuresthat may have fewer impacts on the storersquos score with theincrease of measure of surrounding regions So we weigh thefeatures with measure Formally we have

1198911198923 = sum 119891119894119898119894

(3)

In the real situation we use all three kinds of featureselection results Normally each kind of feature in a grid hasthree final feature values as a vector Formally we have

119891119892 = LBFS (119891 119889 119899119898) (4)

where 119891 are the feature vectors of the neighbours 119889 arethe distances between the middle of neighbours and middleof the target block 119899 are the numbers of feature points inneighbours and119898 are the measures of neighbours

341 Urban Knowledge Fusion For each block in cities weobtain the social view and physical view separately Each viewis represented as a feature vector Social view is composedof social media text user comment texts user ratios and soon according to different application requirements Physicalview is composed of physical sensor values like the flowof taxis and buses traffic congestion index real estatePOIs road network air quality meteorological elementsand so on regarding different application requirements Weadopt the deep autoencoder [5] to capture the middle-levelfeature representation between social view and physical viewAs Figure 3 depicts deep learning effectively learns (1) abetter single modality representation with the help of othermodalities and (2) the shared representations capturing thecorrelations across multiple modalities

342 Training Practically we obtain the middle-level fea-ture representation of each block For those with abundant

Social view reconstruction Physical view reconstruction

Social view input Physical view input

middot middot middot

middot middot middot

middot middot middot

middot middot middot

middot middot middot

middot middot middotmiddot middot middot

middot middot middot

middot middot middot

Figure 3 Deep autoencoder of social view and physical view

data we use stacked ELM [4] to train It can be used directlyto solve regression binary and multiclass classificationproblems regarding different applications For those withsparse data we use DAELM [6] to transfer knowledge fromregions with abundant data Actually we can treat differentcities as different domains because data from different citiesmay have different distributions in feature and label spaces[18]We use cities with abounded data as sources and transferknowledge to target ones with sparse data

4 Case Studies

41 Smog-Related Health Hazard Prediction In fact smog isa terrible health hazard that affect peoplersquos health accordingto recent research [19] It is necessary to analyze monitorand forecast smog-related health hazards in a timely mannerIn recent times social media has become an increasinglyimportant channel to observe sentiment trends and eventsFurthermore there are various physical sensors monitoringsmog status such as air quality stations weather stationsand earth observation satellites generating amounts of dataabout severity of smog In this paper we model the smog-related health hazards and smog severity by two indexes(PHI and SSI) using social media The urban smog-relatedhealth hazard prediction problem is a classification problemto assign appropriate class labels (Public Health Index) to theblocks of cities

Public Health Index (PHI) is the sum of total relativefrequencies of smog-related health hazard phrases in thecurrent tweets D-PHI is an enhanced Public Health Indexthat includes consideration of diffusion in social networks

Definition Smog Severity Index (SSI) is the weighted sumof total relative frequencies of smog severity phrases in thecurrent tweets D-SSI is an enhanced Smog Severity Indexthat includes consideration of diffusion in social networks

Firstly we extract both smog-related health hazardphrases and smog severity phrases Secondly we gather rawtweets with time and location tags fromWeibo (a twitter-like

Computational Intelligence and Neuroscience 5

website in China) Thirdly we calculate the daily relativefrequency rf of each phrase

rf (119901) = af (119901 119879119862) times idf (119901 119879119867)

af (119901 119879119862) =sum119889isin119879119862 f (119901 119889)

119879119862

idf (119901 119879119867) = log100381610038161003816100381611987911986710038161003816100381610038161003816100381610038161003816119889 isin 119879119867 119901 isin 1198891003816100381610038161003816

(5)

where 119879119867 and 119879119862 represent historical and current tweet setsrespectively 119901 represents a phrase 119889 represents a tweetf(119901 119889) represents the frequency of phrase 119901 in tweet 119889af(119901 119879119862) represents the average frequency of phrase 119901 in thecurrent tweet set 119879119862 and idf(119901 119879119867) represents the inverseddocument frequency of the tweets with phrase 119901 in thehistorical tweet set 119879119867 The logarithm function is to scale upthe fraction of rare tweets The above algorithm is derivedfrom the typical tf-idf algorithm [20] The difference lies inthe replacement of the largest word frequency in currenttweet set with the size of current tweet set which aims ateliminating the influence of other heat phrases Then PHIand SSI are calculated with the relative frequencies of all thephrases

PHI = sum119901isin1198751

rf (119901)

SSI = sum119901isin1198752

rf (119901) times order (119901) (6)

where 1198751 stands for the set of smog-related health hazardphrases 1198752 stands for the set of smog severity phrases Thensocial network diffusion is considered to calculate D-PHIand D-SSI We calculate the network diffusion-based averagefrequency

daf (119901 119879119862) =sum119889isin119879119862 (f (119901 119889) times (119892 (119889) + 1))

10038161003816100381610038161198791198621003816100381610038161003816 + sum119889isin119879119862 119892 (119889) (7)

where 119892(119889) represents a tweetrsquos total number of retweets andlikes Once daf is calculated we use it to replace the averagefrequency af to calculate the relative frequency rf and furthercompute the value of D-PHI and D-SSI Finally the PHI SSID-PHI and D-SSI make up the feature vectors of social viewin this problem

Moreover we also extract features from air qualityincluding both air pollution concentrations (CO NO2 SO2O3 PM25 and PM10) and air quality index (AQI) whichcomprehensively evaluates the air quality We extract recordsfrom various meteorological elements including humiditycloud value pressure temperature and wind speed all ofwhich have been proven to affect smog disasters greatly Forexample high wind speed and low cloud value usually leadssmog pollution to decrease in the next day

42 Optimal Retain Store Placement The optimal placementof a retail store has been of prime importance For example

a new restaurant set up in a street corner may attractlots of customers but it may close months later if it islocated in a few hundred meters down the road In thispaper the optimal retain store placement problem is a rankproblem We calculate scores for each of the candidate areasand rank them The top-ranked areas will be the optimalregion for placing We get the label data from Dianpingscore (httpwwwdianpingcom) and assume that the dataobserved can evaluate the popularity of a place For thisproblem we analyze the data obtained and build social andphysical view and then build a classifier according to ourframework

A strong regional economy usually indicates highdemand according to recent studies [15 21] Therefore wemined the blockrsquos neighbourrsquos user reviews from httpwwwdianpingcom

(1) Dianping Score For each region 119892 we will retrieveservice quality overall satisfaction environment class andconsumption level by mining the reviews of business venuesneighbourhood regions Region 119892 has 119905 neighbourhoods Weaccess the usersrsquo opinions over the neighbourhood region119873119894to form features

Overall Satisfaction Since the overall rating of a businessvenue in block 119892 represents the satisfaction of users we usethe LBFS to get the overall ratings of all business venueslocated in 119873119894 as a numeric score of overall satisfactionFormally we have

119865119904 = LBFS (overall satisfaction 119889 119899 119898) (8)

Service Quality Similarly we use the LBFS to compute theservice rating of business venues in119873119894 and express the servicequality of the neighbourhood as

119865119902 = LBFS (Service Rating 119889 119899 119898) (9)

Environment ClassThe level of the surrounding environmentcan reflect the current environment class of the area There-fore we use LBFS to get environment rating as

119865119890 = LBFS (environment rating 119889 119899 119898) (10)

Consumption Cost The average value of the consumerrsquosbehavior of the surrounding regions can reflect the currentconsumption of the area So we use LBFS to get the consump-tion cost of business venues of neighbourhood as a feature

119865119888 = LBFS (average cost 119889 119899 119898) (11)

Bus transits are slow and cheap and aremainly distributedin areas having a large number of IT and educational estab-lishments The price of real estate and the traffic congestionindex indicate whether the facility planning is balanced Weexploit these features to uncover the implicit preferences fora neighbourhood

Bus-Related Features Medium income residents choose busBecause most of the cityrsquos residents are from the backgroundof the middle class bus traffic may represent the bulk of

6 Computational Intelligence and Neuroscience

the cityrsquos flow We try to measure the arrival departure andthe volume of the transition on the streets of each block Fora region 119892 we try to use BT as the set of bus trajectories of acity each of which is denoted by a tuple ⟨119901 119889⟩ where 119901 is apickup bus stop and 119889 is a drop-off bus stop [21]

Bus Arriving Departing and Transition Volume We extractthe arriving departing and transition volumes of buses fromsmart card transactions Formally

119865BAV119887 = 1003816100381610038161003816⟨119901 119889⟩ isin BT 119901 notin 119892 119889 isin 1198921003816100381610038161003816 119865BLV119887 = 1003816100381610038161003816⟨119901 119889⟩ isin BT 119901 isin 119892 119889 notin 1198921003816100381610038161003816

119865BTV119887 = 1003816100381610038161003816⟨119901 119889⟩ isin BT 119901 isin 119892 119889 isin 1198921003816100381610038161003816

(12)

Density Recent studies have reported that price premiums ofup to 10 are estimated for retail stores within 400m of alarge number of bus stops The density of the bus station ispositively correlatedwith the value of the retail storeHere weuse smart card transactions and propose alternative methodsand strategies for density estimation of bus stop In factthe number of bus stops in a trip can be approximated byfare Therefore we calculate the distance to the ratio of theestimated density in the vicinity of the bus station

119865BSD119894 = sum119901isin119892119889isin119892 dist (119901 119889) fare (119901 119889)1003816100381610038161003816⟨119901 119889⟩ isin BT 119901 isin 119892 119889 isin 1198921003816100381610038161003816

(13)

Balance The balance of smart cards can show the pattern ofconsumption and supply behavior If some residents alwaysmaintain a high balance on the smart card the huge cost ofbus travel may mean (1) these residents are more dependenton the bus which shows the lack of subway and taxi nearbyand (2) these residents have to go to work in places far awayIn other words these places may be far away So we try to usethe smart card balance as a feature

119865SCB119894 = sum119901isin119892119889isin119892 balance (119901 119889)1003816100381610038161003816⟨119901 119889⟩ isin BT 119901 isin 119892 119889 isin 1198921003816100381610038161003816

(14)

Real Estate Features The real estate prices may reflect thepurchasing power and economic index of this region Firstwe collect the historical prices of each estate andwe use LBFSto calculate estate price of the neighbourhood of each blockFormally we have

119865RE119894 = LBFS (estate price 119889 119899 119898) (15)

Traffic Index Features The flexibility of a region like conve-nient traffic may contribute to the popularity of a regionWe obtain the traffic index from httpnitrafficindexcomFormally we have

119865TI119894 = LBFS (traffic index 119889 119899 119898) (16)

Intraclass Competitiveness We measure the proportion ofneighbouring places of the same type 120574119897 with respect to thetotal number of nearby placesThen we rank areas in reverse

order assuming that the least competitive area is the mostpromising one

119865Intra119894 = minus119873120574119897 (119897 119903)119873 (119897 119903) (17)

However it is worth noting that the retail industryrsquos com-petitive stores and marketing can have a positive or negativeimpact For example one would expect to place a bar ina region with plenty of nightlife locations because therealready exists a service system and there are a lot ofpeople attracted to the area However being surrounded bycompetitors also means to share customers

Interclass Competitiveness In order to consider the iterationsbetween different place categories we adopt the metricsdefined by Jensen [22] In practice we use the intercategorycoefficients described to weigh the desirability of the placesobserved in the area around the object that is the greater thenumber of the places in the area that attracts the object thebetter the quality of the locationMore formally we define thequality of location for a venue of type 120574119897 as

119865Inter119894 = sum

120574119901isinΓ

log (120594120574119901120574119897) times (119873120574119901(119897119903) minus 119873120574119901(119897119903)) (18)

where 119873120574119901(119897119903) means how many venues of type 120574119901 areobserved on average around the places of type 120574119897 Γ is theset of place types and 120594120574119901120574119897 are the intertype attractivenesscoefficients Formally we get

120594120574119901120574119897 =119873 minus119873120574119901119873120574119901 times 119873120574119897

sum119901

119873120574119897 (119901 119903)119873 (119901 119903) minus 119873120574119901

(19)

POIs The POIs indicate the latent patterns of this regionwhich may have contributed to the placement Moreover thecategory of a POI may have a causal relation to it Let ♯(119894 119888)denote the number of POIs of category 119888 isin 119862 located in119892119894 and let ♯(119894) be the total number of POIs of all categorieslocated in 119892119894 The entropy is defined as

119865POI119894 = minussum

119888isin119862

♯ (119894 119888)♯ (119894) times log ♯ (119894 119888)

♯ (119894) (20)

Business Areas Business areas have important influence onoptimal placement The locations will get more customersand resources in the business areas We retrieve the geo-graphic information of business areas from Baidu map apiFinally we build a feature related to business areas Formallywe have

1198651198891 = LBFS (is neighbour business areas 119889 119899 119898)

1198651198892 =

1 if 119892119894 isin Business Area

0 if 119892119894 notin Business Area

119865119889 = [1198651198891 1198651198892]

(21)

where ldquois neighbour business areasrdquo is a Boolean valuewhichmeanswhether the blockrsquos neighbour is a business area

Computational Intelligence and Neuroscience 7

Table 1 Details of the datasets

Datasets Size (M) SourcesComments 2523 httpdianpingcomTweets 11023 httpweibocom httptwittercomBuses 254 httpchelailenetcnTraffic 119 httpwwwnitrafficindexcomReal estate 35 httpwwwsoufuncomAir 534 httpwwwpm25inPOI business areas 10 httpmapbaiducomRoad network 9 httpopenstreetmaporgMeteorological 98 httpforecastio httpnoaagov

5 Experiments

51 UrbanData Table 1 lists the data sources For social viewwe crawl online business reviews from httpwwwdianpingcom which is a site for reviewing business establishmentsin China Each review includes the shop ID name addresslatitude longitude consumption cost star rating POI cat-egory city environment service overall ratings and com-ments We also crawl huge data from Weibo (a twitter-like website in China) and Twitter with geotags For phys-ical view we extract features from smart card transactionsfrom open web API Moreover we crawl the traffic indexfrom httpwwwnitrafficindexcom which is open to thepublic Finally we crawl the estate data from httpwwwsoufuncom which is the largest real estate online systemin China Moreover we crawl huge air quality data fromhttpwwwpm25in We collected POI and business areasdata from Baidu maps for each city The road network datawas gathered from OpenStreetMaps We collected meteo-rological data from the National Oceanic and AtmosphericAdministrationrsquos (NOAA) web service and httpforecastioevery hour All the experiments can be obtained from openweb service The source code of this paper can be obtainedfrom httpsgithubcomzxlzrELM-for-Urban-Computing

52 Metrics In total we have two main metrics for ourframework the precision and efficiency For specific casestudies for smog-related health hazard prediction we havethe following

Precision and Recall The precision and recall are given asfollows

Precision =1003816100381610038161003816cap (prediction set reference set)10038161003816100381610038161003816100381610038161003816prediction set1003816100381610038161003816

Recall =1003816100381610038161003816cap (prediction set reference set)1003816100381610038161003816

|reference set| (22)

For optimal retain store placement we have the following

Normalized Discounted Cumulative Gain The discountedcumulative gain (DCG119873) is given by

DCG [119899] =

rel1 if 119899 = 1

DCG [119899 minus 1] + rel119899log2119899

if 119899 ge 2 (23)

Later given the ideal discounted cumulative gain DCG1015840NDCG at the 119899th position can be computed as NDCG[119899] =DCG[119899]DCG1015840[119899] The larger the value of NDCG119873 thehigher the top-119873 ranking accuracy

Precision and Recall We choose to use a four-level ratingsystem as (3 gt 2 gt 1 gt 0) To simplify our evaluation taskwe treat the ratings less than 2 as low values and ratings of 3as high values In a top-119873 block list 119864119873 sorted in descendingorder of the prediction values we define the precision andrecall as precision119873 = (119864119873cap119864ge2)119873 and recall119873 = (119864119873cap119864ge2)119864ge2 where119864ge2 are blocks whose ratings are greater thanor equal to 2 For efficiency we recorded the training time andcompare with baselines

53 Results For smog-related health hazard prediction thestacked ELM is trained hidden layer ANN with 10 to 40hidden nodes the BP is trained ANN with 2 to 3 hiddenlayers and 8 to 15 nodes in each hidden layer Two classicSVM regression methods nu-SVR and epsilon-SVR areprovided by LIBSVM [23] Random forest regressionmethodis provided by sklearn Meanwhile The accuracies of thehealth hazard prediction models using our framework nu-SVR epsilon-SVR and random forest are shown in Table 2119901(nf) means the precision of methods without knowledgefusion We can find that two ANNsrsquo methods outperform theSVM regression methods and the random forest regressionmethod in forecasting the next day PHI and the ELMachieves slightly higher prediction accuracy than themultiplehidden layers ANNs trained by BP

For optimal retain store placement the models trainedwith a single cityrsquos data are used as baselines Our datasetshave the data obtained from five cities in China The blocksin each city have a label 0 1 2 3 from httpwwwdianpingcom based on the values observed by users For examplewe have ldquo2131rdquo ldquo2rdquo means the id of city ldquo13rdquo means the idof block and the final ldquo2rdquo is the label We treat each city as asingle domain containing hundreds of blocks For exampleif we want to do optimal retain store placement in Hangzhouwe use 75 of data in Hangzhou and all the other citiesrsquo datato train The remaining 25 of data in Hangzhou are fortesting Actually Hangzhou is treated as target domain whilethe other cities are source domains

Figure 4 shows the results of concatenation knowledgefusion and sampling methods for Starbucks The concate-nation method only concatenates social view and physicalview into one single view to adapt to the learning settingThe results show that knowledge fusion works better Thesampling method means we manually filter some negativesamples based on knowledge fusion which will make betterresults as figure shows

8 Computational Intelligence and Neuroscience

Table 2 The results of smog-related health hazard prediction

Cities Stacked ELM BP nu-SVR Epsilon-SVR Random forest119901 (nf) 119901 Time 119901 (nf) 119901 Time 119901 (nf) 119901 Time 119901 (nf) 119901 Time 119901 (nf) 119901 Time

Beijing 065 080 3 066 079 15 078 069 9 056 053 9 010 015 7Tianjin 092 089 3 056 059 13 058 073 10 066 073 11 015 014 8Shanghai 053 088 4 055 069 16 055 064 11 053 053 10 012 030 8Hangzhou 031 086 3 063 065 10 054 055 9 066 083 9 015 034 7Guangzhou 053 054 5 065 069 15 053 066 11 048 053 21 010 034 8Average 075 073 4 056 056 15 059 065 10 059 073 10 020 034 7

3 5 7 100

50

100

ConcatenationKnowledge fusionSampling

(a) NDCG119873

3 5 7 100

50

100

ConcatenationKnowledge fusionSampling

(b) Precision119873

3 5 7 10

ConcatenationKnowledge fusionSampling

0

2

4

6

8

(c) Recall119873

Figure 4 NDCG precision and recall of 119873 for Starbucks in Beijing

Computational Intelligence and Neuroscience 9

3 5 7 100

50

100

LBFSInflection rulesAll

(a) NDCG119873

3 5 7 100

50

100

LBFSInflection rulesAll

(b) Precision119873

3 5 7 10

LBFSInflection rulesAll

0

5

10

(c) Recall119873

Figure 5 NDCG precision and recall of 119873 for Starbucks in Beijing

Figure 5 shows the results of LBFS inflection rules andall methods for Starbucks The LBFS method means weuse location based feature selection to build features whilenot using the average value based on the methods in thepast paragraph Actually LBFS works better The inflectionrules method means we use rules to calculate the finalscore if the area has spatial news such as ldquonew subwaydemolitionrdquo The inflection rules consider the situations thatour algorithm may not cover and make our method morerobust The all method includes DAELM inflection rulesLBFS sampling and knowledge fusion There is no doubtthat the all method works best Moreover our frameworkperforms more efficiently In Table 3 we present the resultsobtained for the NDCG10 metric and training time for all

features across the three chains The numbers in the bracketsare minutes of training In all cases we observe a significantimprovement in precision and efficiency with respect to thebaseline

6 Conclusion

In this paper we propose a general application frameworkof ELM for urban computing and list three case studiesExperimental results showed that our approach is applicableand efficient compared with baselines

In the future we plan to apply our approach to moreapplications Moreover we would like to study the distribu-tion of our framework so it can handle more massive data

10 Computational Intelligence and Neuroscience

Table 3 The best average NDCG10 results of optimal retain storeplacement

Cities Starbucks TrueKungFu YongheKingMART (single city)

Beijing 0743 (15) 0643 (14) 0725 (14)Shanghai 0712 (15) 0689 (14) 0712 (12)Hangzhou 0576 (13) 0611 (12) 0691 (10)Guangzhou 0783 (15) 0691 (13) 0721 (12)Shenzhen 0781 (16) 0711 (15) 0722 (13)

RankBoost (single city)Beijing 0752 (23) 0678 (20) 0712 (19)Shanghai 0725 (25) 0667 (20) 0783 (21)Hangzhou 0723 (21) 0575 (19) 0724 (15)Guangzhou 0812 (22) 0782 (19) 0812 (18)Shenzhen 0724 (22) 0784 (17) 0712 (12)

BP (Single city)Beijing 0753 (45) 0658 (40) 0702 (41)Shanghai 0724 (44) 0657 (42) 07283 (44)Hangzhou 0725 (41) 0555 (40) 0714 (45)Guangzhou 0832 (45) 0772 (42) 0512 (41)Shenzhen 0722 (47) 0754 (42) 0702 (45)

DAELMBeijing 0755 (3) 0712 (2) 0755 (2)Shanghai 0745 (5) 0751 (3) 0783 (1)Hangzhou 0755 (5) 0711 (2) 0752 (5)Guangzhou 0810 (9) 0810 (5) 0823 (4)Shenzhen 0780 (9) 0783 (5) 0723 (8)

Competing Interests

The authors declare that they have no competing interests

References

[1] Y Zheng L Capra O Wolfson and H Yang ldquoUrban comput-ing concepts methodologies and applicationsrdquo ACM Transac-tions on Intelligent Systems and Technology vol 5 no 3 article38 2014

[2] Y Zheng F Liu andH-P Hsieh ldquoU-air when urban air qualityinference meets big datardquo KDD 2013

[3] G-B Huang Q-Y Zhu and C-K Siew ldquoExtreme learningmachine theory and applicationsrdquoNeurocomputing vol 70 no1ndash3 pp 489ndash501 2006

[4] H Zhou G-B Huang Z Lin HWang and Y C Soh ldquoStackedextreme learning machinesrdquo IEEE Transactions on Cyberneticsvol 45 no 9 pp 2013ndash2025 2015

[5] J Ngiam A Khosla M Kim J Nam H Lee and A Y NgldquoMultimodal deep learningrdquo in Proceedings of the 28th Inter-national Conference on Machine Learning (ICML rsquo11) BellevueWash USA 2011

[6] L Zhang and D Zhang ldquoDomain adaptation extreme learningmachines for drift compensation in e-nose systemsrdquo IEEETransactions on Instrumentation and Measurement vol 64 no7 pp 1790ndash1801 2015

[7] Y Zheng ldquoMethodologies for cross-domain data fusion anoverviewrdquo IEEE Transactions on Big Data vol 1 no 1 pp 16ndash342015

[8] Y Zheng Y Liu J Yuan and X Xie ldquoUrban computing withtaxicabsrdquo HUC 2011

[9] N Srivastava and R Salakhutdinov ldquoMultimodal learningwith deep Boltzmann machinesrdquo Journal of Machine LearningResearch vol 15 pp 2949ndash2980 2014

[10] A Blum and T Mitchell ldquoCombining labeled and unlabeleddata with co-trainingrdquo in Proceedings of the 11th Annual Confer-ence on Computational LearningTheory (COLT rsquo98) pp 92ndash100Madison Wis USA July 1998

[11] M Gonen and E Alpaydın ldquoMultiple kernel learning algo-rithmsrdquo Journal of Machine Learning Research vol 12 pp 2211ndash2268 2011

[12] Y Zheng X YiM Li et al ldquoForecasting fine-grained air qualitybased on big datardquo in Proceedings of the 21st ACM SIGKDDConference onKnowledge Discovery andDataMining (KDD rsquo15)pp 2267ndash2276 August 2015

[13] N J Yuan Y Zheng X Xie Y Wang K Zheng and H XiongldquoDiscovering urban functional zones using latent activity trajec-toriesrdquo IEEE Transactions on Knowledge and Data Engineeringvol 27 no 3 pp 712ndash725 2015

[14] J Chen H Chen Z Wu D Hu and J Z Pan ldquoForecastingsmog-related health hazard based on social media and physicalsensorrdquo Information Systems 2016

[15] S Scellato A Noulas and C Mascolo ldquoExploiting place fea-tures in link prediction on location-based social networksrdquo inProceedings of the 17th ACM SIGKDD International Conferenceon Knowledge Discovery and Data Mining (KDD rsquo11) pp 1046ndash1054 August 2011

[16] N Zhang H Chen X Chen and J Chen ldquoELM meets urbancomputing ensemble urban data for smart city applicationrdquo inProceedings of ELM-2015 Volume 1 pp 51ndash63 Springer 2016

[17] J Yuan Y Zheng and X Xie ldquoDiscovering regions of differentfunctions in a city using human mobility and POIsrdquo in Pro-ceedings of the 18th ACM SIGKDD International Conference onKnowledge Discovery and Data Mining (KDD rsquo12) pp 186ndash194ACM Beijing China August 2012

[18] Y Wei Y Zheng and Q Yang ldquoTransfer knowledge betweencitiesrdquo KDD 2016

[19] V Hughes ldquoWhere therersquos smokerdquoNature vol 489 no 7417 ppS18ndashS20 2012

[20] J Ramos ldquoUsing tf-idf to determine word relevance in doc-ument queriesrdquo in Proceedings of the 1st Instructional Confer-ence on Machine Learning 2003 httpswwwsemanticscholarorgpaperUsing-Tf-idf-to-Determine-Word-Relevance-in-Ramosb3bf6373ff41a115197cb5b30e57830c16130c2c

[21] Y Fu Y Ge Y Zheng et al ldquoSparse real estate rankingwith online user reviews and offline moving behaviorsrdquo inProceedings of the 14th IEEE International Conference on DataMining (ICDM rsquo14) pp 120ndash129 Shenzhen China December2014

[22] P Jensen ldquoNetwork-based predictions of retail store commer-cial categories and optimal locationsrdquo Physical Review E vol 74no 3 Article ID 035101 2006

[23] C-C Chang and C-J Lin ldquoLIBSVM a Library for supportvector machinesrdquo ACM Transactions on Intelligent Systems andTechnology vol 2 no 3 article 27 2011

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 5: Research Article ELM Meets Urban Big Data …downloads.hindawi.com/journals/cin/2016/4970246.pdfResearch Article ELM Meets Urban Big Data Analysis: Case Studies NingyuZhang, 1 HuajunChen,

Computational Intelligence and Neuroscience 5

website in China) Thirdly we calculate the daily relativefrequency rf of each phrase

rf (119901) = af (119901 119879119862) times idf (119901 119879119867)

af (119901 119879119862) =sum119889isin119879119862 f (119901 119889)

119879119862

idf (119901 119879119867) = log100381610038161003816100381611987911986710038161003816100381610038161003816100381610038161003816119889 isin 119879119867 119901 isin 1198891003816100381610038161003816

(5)

where 119879119867 and 119879119862 represent historical and current tweet setsrespectively 119901 represents a phrase 119889 represents a tweetf(119901 119889) represents the frequency of phrase 119901 in tweet 119889af(119901 119879119862) represents the average frequency of phrase 119901 in thecurrent tweet set 119879119862 and idf(119901 119879119867) represents the inverseddocument frequency of the tweets with phrase 119901 in thehistorical tweet set 119879119867 The logarithm function is to scale upthe fraction of rare tweets The above algorithm is derivedfrom the typical tf-idf algorithm [20] The difference lies inthe replacement of the largest word frequency in currenttweet set with the size of current tweet set which aims ateliminating the influence of other heat phrases Then PHIand SSI are calculated with the relative frequencies of all thephrases

PHI = sum119901isin1198751

rf (119901)

SSI = sum119901isin1198752

rf (119901) times order (119901) (6)

where 1198751 stands for the set of smog-related health hazardphrases 1198752 stands for the set of smog severity phrases Thensocial network diffusion is considered to calculate D-PHIand D-SSI We calculate the network diffusion-based averagefrequency

daf (119901 119879119862) =sum119889isin119879119862 (f (119901 119889) times (119892 (119889) + 1))

10038161003816100381610038161198791198621003816100381610038161003816 + sum119889isin119879119862 119892 (119889) (7)

where 119892(119889) represents a tweetrsquos total number of retweets andlikes Once daf is calculated we use it to replace the averagefrequency af to calculate the relative frequency rf and furthercompute the value of D-PHI and D-SSI Finally the PHI SSID-PHI and D-SSI make up the feature vectors of social viewin this problem

Moreover we also extract features from air qualityincluding both air pollution concentrations (CO NO2 SO2O3 PM25 and PM10) and air quality index (AQI) whichcomprehensively evaluates the air quality We extract recordsfrom various meteorological elements including humiditycloud value pressure temperature and wind speed all ofwhich have been proven to affect smog disasters greatly Forexample high wind speed and low cloud value usually leadssmog pollution to decrease in the next day

42 Optimal Retain Store Placement The optimal placementof a retail store has been of prime importance For example

a new restaurant set up in a street corner may attractlots of customers but it may close months later if it islocated in a few hundred meters down the road In thispaper the optimal retain store placement problem is a rankproblem We calculate scores for each of the candidate areasand rank them The top-ranked areas will be the optimalregion for placing We get the label data from Dianpingscore (httpwwwdianpingcom) and assume that the dataobserved can evaluate the popularity of a place For thisproblem we analyze the data obtained and build social andphysical view and then build a classifier according to ourframework

A strong regional economy usually indicates highdemand according to recent studies [15 21] Therefore wemined the blockrsquos neighbourrsquos user reviews from httpwwwdianpingcom

(1) Dianping Score For each region 119892 we will retrieveservice quality overall satisfaction environment class andconsumption level by mining the reviews of business venuesneighbourhood regions Region 119892 has 119905 neighbourhoods Weaccess the usersrsquo opinions over the neighbourhood region119873119894to form features

Overall Satisfaction Since the overall rating of a businessvenue in block 119892 represents the satisfaction of users we usethe LBFS to get the overall ratings of all business venueslocated in 119873119894 as a numeric score of overall satisfactionFormally we have

119865119904 = LBFS (overall satisfaction 119889 119899 119898) (8)

Service Quality Similarly we use the LBFS to compute theservice rating of business venues in119873119894 and express the servicequality of the neighbourhood as

119865119902 = LBFS (Service Rating 119889 119899 119898) (9)

Environment ClassThe level of the surrounding environmentcan reflect the current environment class of the area There-fore we use LBFS to get environment rating as

119865119890 = LBFS (environment rating 119889 119899 119898) (10)

Consumption Cost The average value of the consumerrsquosbehavior of the surrounding regions can reflect the currentconsumption of the area So we use LBFS to get the consump-tion cost of business venues of neighbourhood as a feature

119865119888 = LBFS (average cost 119889 119899 119898) (11)

Bus transits are slow and cheap and aremainly distributedin areas having a large number of IT and educational estab-lishments The price of real estate and the traffic congestionindex indicate whether the facility planning is balanced Weexploit these features to uncover the implicit preferences fora neighbourhood

Bus-Related Features Medium income residents choose busBecause most of the cityrsquos residents are from the backgroundof the middle class bus traffic may represent the bulk of

6 Computational Intelligence and Neuroscience

the cityrsquos flow We try to measure the arrival departure andthe volume of the transition on the streets of each block Fora region 119892 we try to use BT as the set of bus trajectories of acity each of which is denoted by a tuple ⟨119901 119889⟩ where 119901 is apickup bus stop and 119889 is a drop-off bus stop [21]

Bus Arriving Departing and Transition Volume We extractthe arriving departing and transition volumes of buses fromsmart card transactions Formally

119865BAV119887 = 1003816100381610038161003816⟨119901 119889⟩ isin BT 119901 notin 119892 119889 isin 1198921003816100381610038161003816 119865BLV119887 = 1003816100381610038161003816⟨119901 119889⟩ isin BT 119901 isin 119892 119889 notin 1198921003816100381610038161003816

119865BTV119887 = 1003816100381610038161003816⟨119901 119889⟩ isin BT 119901 isin 119892 119889 isin 1198921003816100381610038161003816

(12)

Density Recent studies have reported that price premiums ofup to 10 are estimated for retail stores within 400m of alarge number of bus stops The density of the bus station ispositively correlatedwith the value of the retail storeHere weuse smart card transactions and propose alternative methodsand strategies for density estimation of bus stop In factthe number of bus stops in a trip can be approximated byfare Therefore we calculate the distance to the ratio of theestimated density in the vicinity of the bus station

119865BSD119894 = sum119901isin119892119889isin119892 dist (119901 119889) fare (119901 119889)1003816100381610038161003816⟨119901 119889⟩ isin BT 119901 isin 119892 119889 isin 1198921003816100381610038161003816

(13)

Balance The balance of smart cards can show the pattern ofconsumption and supply behavior If some residents alwaysmaintain a high balance on the smart card the huge cost ofbus travel may mean (1) these residents are more dependenton the bus which shows the lack of subway and taxi nearbyand (2) these residents have to go to work in places far awayIn other words these places may be far away So we try to usethe smart card balance as a feature

119865SCB119894 = sum119901isin119892119889isin119892 balance (119901 119889)1003816100381610038161003816⟨119901 119889⟩ isin BT 119901 isin 119892 119889 isin 1198921003816100381610038161003816

(14)

Real Estate Features The real estate prices may reflect thepurchasing power and economic index of this region Firstwe collect the historical prices of each estate andwe use LBFSto calculate estate price of the neighbourhood of each blockFormally we have

119865RE119894 = LBFS (estate price 119889 119899 119898) (15)

Traffic Index Features The flexibility of a region like conve-nient traffic may contribute to the popularity of a regionWe obtain the traffic index from httpnitrafficindexcomFormally we have

119865TI119894 = LBFS (traffic index 119889 119899 119898) (16)

Intraclass Competitiveness We measure the proportion ofneighbouring places of the same type 120574119897 with respect to thetotal number of nearby placesThen we rank areas in reverse

order assuming that the least competitive area is the mostpromising one

119865Intra119894 = minus119873120574119897 (119897 119903)119873 (119897 119903) (17)

However it is worth noting that the retail industryrsquos com-petitive stores and marketing can have a positive or negativeimpact For example one would expect to place a bar ina region with plenty of nightlife locations because therealready exists a service system and there are a lot ofpeople attracted to the area However being surrounded bycompetitors also means to share customers

Interclass Competitiveness In order to consider the iterationsbetween different place categories we adopt the metricsdefined by Jensen [22] In practice we use the intercategorycoefficients described to weigh the desirability of the placesobserved in the area around the object that is the greater thenumber of the places in the area that attracts the object thebetter the quality of the locationMore formally we define thequality of location for a venue of type 120574119897 as

119865Inter119894 = sum

120574119901isinΓ

log (120594120574119901120574119897) times (119873120574119901(119897119903) minus 119873120574119901(119897119903)) (18)

where 119873120574119901(119897119903) means how many venues of type 120574119901 areobserved on average around the places of type 120574119897 Γ is theset of place types and 120594120574119901120574119897 are the intertype attractivenesscoefficients Formally we get

120594120574119901120574119897 =119873 minus119873120574119901119873120574119901 times 119873120574119897

sum119901

119873120574119897 (119901 119903)119873 (119901 119903) minus 119873120574119901

(19)

POIs The POIs indicate the latent patterns of this regionwhich may have contributed to the placement Moreover thecategory of a POI may have a causal relation to it Let ♯(119894 119888)denote the number of POIs of category 119888 isin 119862 located in119892119894 and let ♯(119894) be the total number of POIs of all categorieslocated in 119892119894 The entropy is defined as

119865POI119894 = minussum

119888isin119862

♯ (119894 119888)♯ (119894) times log ♯ (119894 119888)

♯ (119894) (20)

Business Areas Business areas have important influence onoptimal placement The locations will get more customersand resources in the business areas We retrieve the geo-graphic information of business areas from Baidu map apiFinally we build a feature related to business areas Formallywe have

1198651198891 = LBFS (is neighbour business areas 119889 119899 119898)

1198651198892 =

1 if 119892119894 isin Business Area

0 if 119892119894 notin Business Area

119865119889 = [1198651198891 1198651198892]

(21)

where ldquois neighbour business areasrdquo is a Boolean valuewhichmeanswhether the blockrsquos neighbour is a business area

Computational Intelligence and Neuroscience 7

Table 1 Details of the datasets

Datasets Size (M) SourcesComments 2523 httpdianpingcomTweets 11023 httpweibocom httptwittercomBuses 254 httpchelailenetcnTraffic 119 httpwwwnitrafficindexcomReal estate 35 httpwwwsoufuncomAir 534 httpwwwpm25inPOI business areas 10 httpmapbaiducomRoad network 9 httpopenstreetmaporgMeteorological 98 httpforecastio httpnoaagov

5 Experiments

51 UrbanData Table 1 lists the data sources For social viewwe crawl online business reviews from httpwwwdianpingcom which is a site for reviewing business establishmentsin China Each review includes the shop ID name addresslatitude longitude consumption cost star rating POI cat-egory city environment service overall ratings and com-ments We also crawl huge data from Weibo (a twitter-like website in China) and Twitter with geotags For phys-ical view we extract features from smart card transactionsfrom open web API Moreover we crawl the traffic indexfrom httpwwwnitrafficindexcom which is open to thepublic Finally we crawl the estate data from httpwwwsoufuncom which is the largest real estate online systemin China Moreover we crawl huge air quality data fromhttpwwwpm25in We collected POI and business areasdata from Baidu maps for each city The road network datawas gathered from OpenStreetMaps We collected meteo-rological data from the National Oceanic and AtmosphericAdministrationrsquos (NOAA) web service and httpforecastioevery hour All the experiments can be obtained from openweb service The source code of this paper can be obtainedfrom httpsgithubcomzxlzrELM-for-Urban-Computing

52 Metrics In total we have two main metrics for ourframework the precision and efficiency For specific casestudies for smog-related health hazard prediction we havethe following

Precision and Recall The precision and recall are given asfollows

Precision =1003816100381610038161003816cap (prediction set reference set)10038161003816100381610038161003816100381610038161003816prediction set1003816100381610038161003816

Recall =1003816100381610038161003816cap (prediction set reference set)1003816100381610038161003816

|reference set| (22)

For optimal retain store placement we have the following

Normalized Discounted Cumulative Gain The discountedcumulative gain (DCG119873) is given by

DCG [119899] =

rel1 if 119899 = 1

DCG [119899 minus 1] + rel119899log2119899

if 119899 ge 2 (23)

Later given the ideal discounted cumulative gain DCG1015840NDCG at the 119899th position can be computed as NDCG[119899] =DCG[119899]DCG1015840[119899] The larger the value of NDCG119873 thehigher the top-119873 ranking accuracy

Precision and Recall We choose to use a four-level ratingsystem as (3 gt 2 gt 1 gt 0) To simplify our evaluation taskwe treat the ratings less than 2 as low values and ratings of 3as high values In a top-119873 block list 119864119873 sorted in descendingorder of the prediction values we define the precision andrecall as precision119873 = (119864119873cap119864ge2)119873 and recall119873 = (119864119873cap119864ge2)119864ge2 where119864ge2 are blocks whose ratings are greater thanor equal to 2 For efficiency we recorded the training time andcompare with baselines

53 Results For smog-related health hazard prediction thestacked ELM is trained hidden layer ANN with 10 to 40hidden nodes the BP is trained ANN with 2 to 3 hiddenlayers and 8 to 15 nodes in each hidden layer Two classicSVM regression methods nu-SVR and epsilon-SVR areprovided by LIBSVM [23] Random forest regressionmethodis provided by sklearn Meanwhile The accuracies of thehealth hazard prediction models using our framework nu-SVR epsilon-SVR and random forest are shown in Table 2119901(nf) means the precision of methods without knowledgefusion We can find that two ANNsrsquo methods outperform theSVM regression methods and the random forest regressionmethod in forecasting the next day PHI and the ELMachieves slightly higher prediction accuracy than themultiplehidden layers ANNs trained by BP

For optimal retain store placement the models trainedwith a single cityrsquos data are used as baselines Our datasetshave the data obtained from five cities in China The blocksin each city have a label 0 1 2 3 from httpwwwdianpingcom based on the values observed by users For examplewe have ldquo2131rdquo ldquo2rdquo means the id of city ldquo13rdquo means the idof block and the final ldquo2rdquo is the label We treat each city as asingle domain containing hundreds of blocks For exampleif we want to do optimal retain store placement in Hangzhouwe use 75 of data in Hangzhou and all the other citiesrsquo datato train The remaining 25 of data in Hangzhou are fortesting Actually Hangzhou is treated as target domain whilethe other cities are source domains

Figure 4 shows the results of concatenation knowledgefusion and sampling methods for Starbucks The concate-nation method only concatenates social view and physicalview into one single view to adapt to the learning settingThe results show that knowledge fusion works better Thesampling method means we manually filter some negativesamples based on knowledge fusion which will make betterresults as figure shows

8 Computational Intelligence and Neuroscience

Table 2 The results of smog-related health hazard prediction

Cities Stacked ELM BP nu-SVR Epsilon-SVR Random forest119901 (nf) 119901 Time 119901 (nf) 119901 Time 119901 (nf) 119901 Time 119901 (nf) 119901 Time 119901 (nf) 119901 Time

Beijing 065 080 3 066 079 15 078 069 9 056 053 9 010 015 7Tianjin 092 089 3 056 059 13 058 073 10 066 073 11 015 014 8Shanghai 053 088 4 055 069 16 055 064 11 053 053 10 012 030 8Hangzhou 031 086 3 063 065 10 054 055 9 066 083 9 015 034 7Guangzhou 053 054 5 065 069 15 053 066 11 048 053 21 010 034 8Average 075 073 4 056 056 15 059 065 10 059 073 10 020 034 7

3 5 7 100

50

100

ConcatenationKnowledge fusionSampling

(a) NDCG119873

3 5 7 100

50

100

ConcatenationKnowledge fusionSampling

(b) Precision119873

3 5 7 10

ConcatenationKnowledge fusionSampling

0

2

4

6

8

(c) Recall119873

Figure 4 NDCG precision and recall of 119873 for Starbucks in Beijing

Computational Intelligence and Neuroscience 9

3 5 7 100

50

100

LBFSInflection rulesAll

(a) NDCG119873

3 5 7 100

50

100

LBFSInflection rulesAll

(b) Precision119873

3 5 7 10

LBFSInflection rulesAll

0

5

10

(c) Recall119873

Figure 5 NDCG precision and recall of 119873 for Starbucks in Beijing

Figure 5 shows the results of LBFS inflection rules andall methods for Starbucks The LBFS method means weuse location based feature selection to build features whilenot using the average value based on the methods in thepast paragraph Actually LBFS works better The inflectionrules method means we use rules to calculate the finalscore if the area has spatial news such as ldquonew subwaydemolitionrdquo The inflection rules consider the situations thatour algorithm may not cover and make our method morerobust The all method includes DAELM inflection rulesLBFS sampling and knowledge fusion There is no doubtthat the all method works best Moreover our frameworkperforms more efficiently In Table 3 we present the resultsobtained for the NDCG10 metric and training time for all

features across the three chains The numbers in the bracketsare minutes of training In all cases we observe a significantimprovement in precision and efficiency with respect to thebaseline

6 Conclusion

In this paper we propose a general application frameworkof ELM for urban computing and list three case studiesExperimental results showed that our approach is applicableand efficient compared with baselines

In the future we plan to apply our approach to moreapplications Moreover we would like to study the distribu-tion of our framework so it can handle more massive data

10 Computational Intelligence and Neuroscience

Table 3 The best average NDCG10 results of optimal retain storeplacement

Cities Starbucks TrueKungFu YongheKingMART (single city)

Beijing 0743 (15) 0643 (14) 0725 (14)Shanghai 0712 (15) 0689 (14) 0712 (12)Hangzhou 0576 (13) 0611 (12) 0691 (10)Guangzhou 0783 (15) 0691 (13) 0721 (12)Shenzhen 0781 (16) 0711 (15) 0722 (13)

RankBoost (single city)Beijing 0752 (23) 0678 (20) 0712 (19)Shanghai 0725 (25) 0667 (20) 0783 (21)Hangzhou 0723 (21) 0575 (19) 0724 (15)Guangzhou 0812 (22) 0782 (19) 0812 (18)Shenzhen 0724 (22) 0784 (17) 0712 (12)

BP (Single city)Beijing 0753 (45) 0658 (40) 0702 (41)Shanghai 0724 (44) 0657 (42) 07283 (44)Hangzhou 0725 (41) 0555 (40) 0714 (45)Guangzhou 0832 (45) 0772 (42) 0512 (41)Shenzhen 0722 (47) 0754 (42) 0702 (45)

DAELMBeijing 0755 (3) 0712 (2) 0755 (2)Shanghai 0745 (5) 0751 (3) 0783 (1)Hangzhou 0755 (5) 0711 (2) 0752 (5)Guangzhou 0810 (9) 0810 (5) 0823 (4)Shenzhen 0780 (9) 0783 (5) 0723 (8)

Competing Interests

The authors declare that they have no competing interests

References

[1] Y Zheng L Capra O Wolfson and H Yang ldquoUrban comput-ing concepts methodologies and applicationsrdquo ACM Transac-tions on Intelligent Systems and Technology vol 5 no 3 article38 2014

[2] Y Zheng F Liu andH-P Hsieh ldquoU-air when urban air qualityinference meets big datardquo KDD 2013

[3] G-B Huang Q-Y Zhu and C-K Siew ldquoExtreme learningmachine theory and applicationsrdquoNeurocomputing vol 70 no1ndash3 pp 489ndash501 2006

[4] H Zhou G-B Huang Z Lin HWang and Y C Soh ldquoStackedextreme learning machinesrdquo IEEE Transactions on Cyberneticsvol 45 no 9 pp 2013ndash2025 2015

[5] J Ngiam A Khosla M Kim J Nam H Lee and A Y NgldquoMultimodal deep learningrdquo in Proceedings of the 28th Inter-national Conference on Machine Learning (ICML rsquo11) BellevueWash USA 2011

[6] L Zhang and D Zhang ldquoDomain adaptation extreme learningmachines for drift compensation in e-nose systemsrdquo IEEETransactions on Instrumentation and Measurement vol 64 no7 pp 1790ndash1801 2015

[7] Y Zheng ldquoMethodologies for cross-domain data fusion anoverviewrdquo IEEE Transactions on Big Data vol 1 no 1 pp 16ndash342015

[8] Y Zheng Y Liu J Yuan and X Xie ldquoUrban computing withtaxicabsrdquo HUC 2011

[9] N Srivastava and R Salakhutdinov ldquoMultimodal learningwith deep Boltzmann machinesrdquo Journal of Machine LearningResearch vol 15 pp 2949ndash2980 2014

[10] A Blum and T Mitchell ldquoCombining labeled and unlabeleddata with co-trainingrdquo in Proceedings of the 11th Annual Confer-ence on Computational LearningTheory (COLT rsquo98) pp 92ndash100Madison Wis USA July 1998

[11] M Gonen and E Alpaydın ldquoMultiple kernel learning algo-rithmsrdquo Journal of Machine Learning Research vol 12 pp 2211ndash2268 2011

[12] Y Zheng X YiM Li et al ldquoForecasting fine-grained air qualitybased on big datardquo in Proceedings of the 21st ACM SIGKDDConference onKnowledge Discovery andDataMining (KDD rsquo15)pp 2267ndash2276 August 2015

[13] N J Yuan Y Zheng X Xie Y Wang K Zheng and H XiongldquoDiscovering urban functional zones using latent activity trajec-toriesrdquo IEEE Transactions on Knowledge and Data Engineeringvol 27 no 3 pp 712ndash725 2015

[14] J Chen H Chen Z Wu D Hu and J Z Pan ldquoForecastingsmog-related health hazard based on social media and physicalsensorrdquo Information Systems 2016

[15] S Scellato A Noulas and C Mascolo ldquoExploiting place fea-tures in link prediction on location-based social networksrdquo inProceedings of the 17th ACM SIGKDD International Conferenceon Knowledge Discovery and Data Mining (KDD rsquo11) pp 1046ndash1054 August 2011

[16] N Zhang H Chen X Chen and J Chen ldquoELM meets urbancomputing ensemble urban data for smart city applicationrdquo inProceedings of ELM-2015 Volume 1 pp 51ndash63 Springer 2016

[17] J Yuan Y Zheng and X Xie ldquoDiscovering regions of differentfunctions in a city using human mobility and POIsrdquo in Pro-ceedings of the 18th ACM SIGKDD International Conference onKnowledge Discovery and Data Mining (KDD rsquo12) pp 186ndash194ACM Beijing China August 2012

[18] Y Wei Y Zheng and Q Yang ldquoTransfer knowledge betweencitiesrdquo KDD 2016

[19] V Hughes ldquoWhere therersquos smokerdquoNature vol 489 no 7417 ppS18ndashS20 2012

[20] J Ramos ldquoUsing tf-idf to determine word relevance in doc-ument queriesrdquo in Proceedings of the 1st Instructional Confer-ence on Machine Learning 2003 httpswwwsemanticscholarorgpaperUsing-Tf-idf-to-Determine-Word-Relevance-in-Ramosb3bf6373ff41a115197cb5b30e57830c16130c2c

[21] Y Fu Y Ge Y Zheng et al ldquoSparse real estate rankingwith online user reviews and offline moving behaviorsrdquo inProceedings of the 14th IEEE International Conference on DataMining (ICDM rsquo14) pp 120ndash129 Shenzhen China December2014

[22] P Jensen ldquoNetwork-based predictions of retail store commer-cial categories and optimal locationsrdquo Physical Review E vol 74no 3 Article ID 035101 2006

[23] C-C Chang and C-J Lin ldquoLIBSVM a Library for supportvector machinesrdquo ACM Transactions on Intelligent Systems andTechnology vol 2 no 3 article 27 2011

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 6: Research Article ELM Meets Urban Big Data …downloads.hindawi.com/journals/cin/2016/4970246.pdfResearch Article ELM Meets Urban Big Data Analysis: Case Studies NingyuZhang, 1 HuajunChen,

6 Computational Intelligence and Neuroscience

the cityrsquos flow We try to measure the arrival departure andthe volume of the transition on the streets of each block Fora region 119892 we try to use BT as the set of bus trajectories of acity each of which is denoted by a tuple ⟨119901 119889⟩ where 119901 is apickup bus stop and 119889 is a drop-off bus stop [21]

Bus Arriving Departing and Transition Volume We extractthe arriving departing and transition volumes of buses fromsmart card transactions Formally

119865BAV119887 = 1003816100381610038161003816⟨119901 119889⟩ isin BT 119901 notin 119892 119889 isin 1198921003816100381610038161003816 119865BLV119887 = 1003816100381610038161003816⟨119901 119889⟩ isin BT 119901 isin 119892 119889 notin 1198921003816100381610038161003816

119865BTV119887 = 1003816100381610038161003816⟨119901 119889⟩ isin BT 119901 isin 119892 119889 isin 1198921003816100381610038161003816

(12)

Density Recent studies have reported that price premiums ofup to 10 are estimated for retail stores within 400m of alarge number of bus stops The density of the bus station ispositively correlatedwith the value of the retail storeHere weuse smart card transactions and propose alternative methodsand strategies for density estimation of bus stop In factthe number of bus stops in a trip can be approximated byfare Therefore we calculate the distance to the ratio of theestimated density in the vicinity of the bus station

119865BSD119894 = sum119901isin119892119889isin119892 dist (119901 119889) fare (119901 119889)1003816100381610038161003816⟨119901 119889⟩ isin BT 119901 isin 119892 119889 isin 1198921003816100381610038161003816

(13)

Balance The balance of smart cards can show the pattern ofconsumption and supply behavior If some residents alwaysmaintain a high balance on the smart card the huge cost ofbus travel may mean (1) these residents are more dependenton the bus which shows the lack of subway and taxi nearbyand (2) these residents have to go to work in places far awayIn other words these places may be far away So we try to usethe smart card balance as a feature

119865SCB119894 = sum119901isin119892119889isin119892 balance (119901 119889)1003816100381610038161003816⟨119901 119889⟩ isin BT 119901 isin 119892 119889 isin 1198921003816100381610038161003816

(14)

Real Estate Features The real estate prices may reflect thepurchasing power and economic index of this region Firstwe collect the historical prices of each estate andwe use LBFSto calculate estate price of the neighbourhood of each blockFormally we have

119865RE119894 = LBFS (estate price 119889 119899 119898) (15)

Traffic Index Features The flexibility of a region like conve-nient traffic may contribute to the popularity of a regionWe obtain the traffic index from httpnitrafficindexcomFormally we have

119865TI119894 = LBFS (traffic index 119889 119899 119898) (16)

Intraclass Competitiveness We measure the proportion ofneighbouring places of the same type 120574119897 with respect to thetotal number of nearby placesThen we rank areas in reverse

order assuming that the least competitive area is the mostpromising one

119865Intra119894 = minus119873120574119897 (119897 119903)119873 (119897 119903) (17)

However it is worth noting that the retail industryrsquos com-petitive stores and marketing can have a positive or negativeimpact For example one would expect to place a bar ina region with plenty of nightlife locations because therealready exists a service system and there are a lot ofpeople attracted to the area However being surrounded bycompetitors also means to share customers

Interclass Competitiveness In order to consider the iterationsbetween different place categories we adopt the metricsdefined by Jensen [22] In practice we use the intercategorycoefficients described to weigh the desirability of the placesobserved in the area around the object that is the greater thenumber of the places in the area that attracts the object thebetter the quality of the locationMore formally we define thequality of location for a venue of type 120574119897 as

119865Inter119894 = sum

120574119901isinΓ

log (120594120574119901120574119897) times (119873120574119901(119897119903) minus 119873120574119901(119897119903)) (18)

where 119873120574119901(119897119903) means how many venues of type 120574119901 areobserved on average around the places of type 120574119897 Γ is theset of place types and 120594120574119901120574119897 are the intertype attractivenesscoefficients Formally we get

120594120574119901120574119897 =119873 minus119873120574119901119873120574119901 times 119873120574119897

sum119901

119873120574119897 (119901 119903)119873 (119901 119903) minus 119873120574119901

(19)

POIs The POIs indicate the latent patterns of this regionwhich may have contributed to the placement Moreover thecategory of a POI may have a causal relation to it Let ♯(119894 119888)denote the number of POIs of category 119888 isin 119862 located in119892119894 and let ♯(119894) be the total number of POIs of all categorieslocated in 119892119894 The entropy is defined as

119865POI119894 = minussum

119888isin119862

♯ (119894 119888)♯ (119894) times log ♯ (119894 119888)

♯ (119894) (20)

Business Areas Business areas have important influence onoptimal placement The locations will get more customersand resources in the business areas We retrieve the geo-graphic information of business areas from Baidu map apiFinally we build a feature related to business areas Formallywe have

1198651198891 = LBFS (is neighbour business areas 119889 119899 119898)

1198651198892 =

1 if 119892119894 isin Business Area

0 if 119892119894 notin Business Area

119865119889 = [1198651198891 1198651198892]

(21)

where ldquois neighbour business areasrdquo is a Boolean valuewhichmeanswhether the blockrsquos neighbour is a business area

Computational Intelligence and Neuroscience 7

Table 1 Details of the datasets

Datasets Size (M) SourcesComments 2523 httpdianpingcomTweets 11023 httpweibocom httptwittercomBuses 254 httpchelailenetcnTraffic 119 httpwwwnitrafficindexcomReal estate 35 httpwwwsoufuncomAir 534 httpwwwpm25inPOI business areas 10 httpmapbaiducomRoad network 9 httpopenstreetmaporgMeteorological 98 httpforecastio httpnoaagov

5 Experiments

51 UrbanData Table 1 lists the data sources For social viewwe crawl online business reviews from httpwwwdianpingcom which is a site for reviewing business establishmentsin China Each review includes the shop ID name addresslatitude longitude consumption cost star rating POI cat-egory city environment service overall ratings and com-ments We also crawl huge data from Weibo (a twitter-like website in China) and Twitter with geotags For phys-ical view we extract features from smart card transactionsfrom open web API Moreover we crawl the traffic indexfrom httpwwwnitrafficindexcom which is open to thepublic Finally we crawl the estate data from httpwwwsoufuncom which is the largest real estate online systemin China Moreover we crawl huge air quality data fromhttpwwwpm25in We collected POI and business areasdata from Baidu maps for each city The road network datawas gathered from OpenStreetMaps We collected meteo-rological data from the National Oceanic and AtmosphericAdministrationrsquos (NOAA) web service and httpforecastioevery hour All the experiments can be obtained from openweb service The source code of this paper can be obtainedfrom httpsgithubcomzxlzrELM-for-Urban-Computing

52 Metrics In total we have two main metrics for ourframework the precision and efficiency For specific casestudies for smog-related health hazard prediction we havethe following

Precision and Recall The precision and recall are given asfollows

Precision =1003816100381610038161003816cap (prediction set reference set)10038161003816100381610038161003816100381610038161003816prediction set1003816100381610038161003816

Recall =1003816100381610038161003816cap (prediction set reference set)1003816100381610038161003816

|reference set| (22)

For optimal retain store placement we have the following

Normalized Discounted Cumulative Gain The discountedcumulative gain (DCG119873) is given by

DCG [119899] =

rel1 if 119899 = 1

DCG [119899 minus 1] + rel119899log2119899

if 119899 ge 2 (23)

Later given the ideal discounted cumulative gain DCG1015840NDCG at the 119899th position can be computed as NDCG[119899] =DCG[119899]DCG1015840[119899] The larger the value of NDCG119873 thehigher the top-119873 ranking accuracy

Precision and Recall We choose to use a four-level ratingsystem as (3 gt 2 gt 1 gt 0) To simplify our evaluation taskwe treat the ratings less than 2 as low values and ratings of 3as high values In a top-119873 block list 119864119873 sorted in descendingorder of the prediction values we define the precision andrecall as precision119873 = (119864119873cap119864ge2)119873 and recall119873 = (119864119873cap119864ge2)119864ge2 where119864ge2 are blocks whose ratings are greater thanor equal to 2 For efficiency we recorded the training time andcompare with baselines

53 Results For smog-related health hazard prediction thestacked ELM is trained hidden layer ANN with 10 to 40hidden nodes the BP is trained ANN with 2 to 3 hiddenlayers and 8 to 15 nodes in each hidden layer Two classicSVM regression methods nu-SVR and epsilon-SVR areprovided by LIBSVM [23] Random forest regressionmethodis provided by sklearn Meanwhile The accuracies of thehealth hazard prediction models using our framework nu-SVR epsilon-SVR and random forest are shown in Table 2119901(nf) means the precision of methods without knowledgefusion We can find that two ANNsrsquo methods outperform theSVM regression methods and the random forest regressionmethod in forecasting the next day PHI and the ELMachieves slightly higher prediction accuracy than themultiplehidden layers ANNs trained by BP

For optimal retain store placement the models trainedwith a single cityrsquos data are used as baselines Our datasetshave the data obtained from five cities in China The blocksin each city have a label 0 1 2 3 from httpwwwdianpingcom based on the values observed by users For examplewe have ldquo2131rdquo ldquo2rdquo means the id of city ldquo13rdquo means the idof block and the final ldquo2rdquo is the label We treat each city as asingle domain containing hundreds of blocks For exampleif we want to do optimal retain store placement in Hangzhouwe use 75 of data in Hangzhou and all the other citiesrsquo datato train The remaining 25 of data in Hangzhou are fortesting Actually Hangzhou is treated as target domain whilethe other cities are source domains

Figure 4 shows the results of concatenation knowledgefusion and sampling methods for Starbucks The concate-nation method only concatenates social view and physicalview into one single view to adapt to the learning settingThe results show that knowledge fusion works better Thesampling method means we manually filter some negativesamples based on knowledge fusion which will make betterresults as figure shows

8 Computational Intelligence and Neuroscience

Table 2 The results of smog-related health hazard prediction

Cities Stacked ELM BP nu-SVR Epsilon-SVR Random forest119901 (nf) 119901 Time 119901 (nf) 119901 Time 119901 (nf) 119901 Time 119901 (nf) 119901 Time 119901 (nf) 119901 Time

Beijing 065 080 3 066 079 15 078 069 9 056 053 9 010 015 7Tianjin 092 089 3 056 059 13 058 073 10 066 073 11 015 014 8Shanghai 053 088 4 055 069 16 055 064 11 053 053 10 012 030 8Hangzhou 031 086 3 063 065 10 054 055 9 066 083 9 015 034 7Guangzhou 053 054 5 065 069 15 053 066 11 048 053 21 010 034 8Average 075 073 4 056 056 15 059 065 10 059 073 10 020 034 7

3 5 7 100

50

100

ConcatenationKnowledge fusionSampling

(a) NDCG119873

3 5 7 100

50

100

ConcatenationKnowledge fusionSampling

(b) Precision119873

3 5 7 10

ConcatenationKnowledge fusionSampling

0

2

4

6

8

(c) Recall119873

Figure 4 NDCG precision and recall of 119873 for Starbucks in Beijing

Computational Intelligence and Neuroscience 9

3 5 7 100

50

100

LBFSInflection rulesAll

(a) NDCG119873

3 5 7 100

50

100

LBFSInflection rulesAll

(b) Precision119873

3 5 7 10

LBFSInflection rulesAll

0

5

10

(c) Recall119873

Figure 5 NDCG precision and recall of 119873 for Starbucks in Beijing

Figure 5 shows the results of LBFS inflection rules andall methods for Starbucks The LBFS method means weuse location based feature selection to build features whilenot using the average value based on the methods in thepast paragraph Actually LBFS works better The inflectionrules method means we use rules to calculate the finalscore if the area has spatial news such as ldquonew subwaydemolitionrdquo The inflection rules consider the situations thatour algorithm may not cover and make our method morerobust The all method includes DAELM inflection rulesLBFS sampling and knowledge fusion There is no doubtthat the all method works best Moreover our frameworkperforms more efficiently In Table 3 we present the resultsobtained for the NDCG10 metric and training time for all

features across the three chains The numbers in the bracketsare minutes of training In all cases we observe a significantimprovement in precision and efficiency with respect to thebaseline

6 Conclusion

In this paper we propose a general application frameworkof ELM for urban computing and list three case studiesExperimental results showed that our approach is applicableand efficient compared with baselines

In the future we plan to apply our approach to moreapplications Moreover we would like to study the distribu-tion of our framework so it can handle more massive data

10 Computational Intelligence and Neuroscience

Table 3 The best average NDCG10 results of optimal retain storeplacement

Cities Starbucks TrueKungFu YongheKingMART (single city)

Beijing 0743 (15) 0643 (14) 0725 (14)Shanghai 0712 (15) 0689 (14) 0712 (12)Hangzhou 0576 (13) 0611 (12) 0691 (10)Guangzhou 0783 (15) 0691 (13) 0721 (12)Shenzhen 0781 (16) 0711 (15) 0722 (13)

RankBoost (single city)Beijing 0752 (23) 0678 (20) 0712 (19)Shanghai 0725 (25) 0667 (20) 0783 (21)Hangzhou 0723 (21) 0575 (19) 0724 (15)Guangzhou 0812 (22) 0782 (19) 0812 (18)Shenzhen 0724 (22) 0784 (17) 0712 (12)

BP (Single city)Beijing 0753 (45) 0658 (40) 0702 (41)Shanghai 0724 (44) 0657 (42) 07283 (44)Hangzhou 0725 (41) 0555 (40) 0714 (45)Guangzhou 0832 (45) 0772 (42) 0512 (41)Shenzhen 0722 (47) 0754 (42) 0702 (45)

DAELMBeijing 0755 (3) 0712 (2) 0755 (2)Shanghai 0745 (5) 0751 (3) 0783 (1)Hangzhou 0755 (5) 0711 (2) 0752 (5)Guangzhou 0810 (9) 0810 (5) 0823 (4)Shenzhen 0780 (9) 0783 (5) 0723 (8)

Competing Interests

The authors declare that they have no competing interests

References

[1] Y Zheng L Capra O Wolfson and H Yang ldquoUrban comput-ing concepts methodologies and applicationsrdquo ACM Transac-tions on Intelligent Systems and Technology vol 5 no 3 article38 2014

[2] Y Zheng F Liu andH-P Hsieh ldquoU-air when urban air qualityinference meets big datardquo KDD 2013

[3] G-B Huang Q-Y Zhu and C-K Siew ldquoExtreme learningmachine theory and applicationsrdquoNeurocomputing vol 70 no1ndash3 pp 489ndash501 2006

[4] H Zhou G-B Huang Z Lin HWang and Y C Soh ldquoStackedextreme learning machinesrdquo IEEE Transactions on Cyberneticsvol 45 no 9 pp 2013ndash2025 2015

[5] J Ngiam A Khosla M Kim J Nam H Lee and A Y NgldquoMultimodal deep learningrdquo in Proceedings of the 28th Inter-national Conference on Machine Learning (ICML rsquo11) BellevueWash USA 2011

[6] L Zhang and D Zhang ldquoDomain adaptation extreme learningmachines for drift compensation in e-nose systemsrdquo IEEETransactions on Instrumentation and Measurement vol 64 no7 pp 1790ndash1801 2015

[7] Y Zheng ldquoMethodologies for cross-domain data fusion anoverviewrdquo IEEE Transactions on Big Data vol 1 no 1 pp 16ndash342015

[8] Y Zheng Y Liu J Yuan and X Xie ldquoUrban computing withtaxicabsrdquo HUC 2011

[9] N Srivastava and R Salakhutdinov ldquoMultimodal learningwith deep Boltzmann machinesrdquo Journal of Machine LearningResearch vol 15 pp 2949ndash2980 2014

[10] A Blum and T Mitchell ldquoCombining labeled and unlabeleddata with co-trainingrdquo in Proceedings of the 11th Annual Confer-ence on Computational LearningTheory (COLT rsquo98) pp 92ndash100Madison Wis USA July 1998

[11] M Gonen and E Alpaydın ldquoMultiple kernel learning algo-rithmsrdquo Journal of Machine Learning Research vol 12 pp 2211ndash2268 2011

[12] Y Zheng X YiM Li et al ldquoForecasting fine-grained air qualitybased on big datardquo in Proceedings of the 21st ACM SIGKDDConference onKnowledge Discovery andDataMining (KDD rsquo15)pp 2267ndash2276 August 2015

[13] N J Yuan Y Zheng X Xie Y Wang K Zheng and H XiongldquoDiscovering urban functional zones using latent activity trajec-toriesrdquo IEEE Transactions on Knowledge and Data Engineeringvol 27 no 3 pp 712ndash725 2015

[14] J Chen H Chen Z Wu D Hu and J Z Pan ldquoForecastingsmog-related health hazard based on social media and physicalsensorrdquo Information Systems 2016

[15] S Scellato A Noulas and C Mascolo ldquoExploiting place fea-tures in link prediction on location-based social networksrdquo inProceedings of the 17th ACM SIGKDD International Conferenceon Knowledge Discovery and Data Mining (KDD rsquo11) pp 1046ndash1054 August 2011

[16] N Zhang H Chen X Chen and J Chen ldquoELM meets urbancomputing ensemble urban data for smart city applicationrdquo inProceedings of ELM-2015 Volume 1 pp 51ndash63 Springer 2016

[17] J Yuan Y Zheng and X Xie ldquoDiscovering regions of differentfunctions in a city using human mobility and POIsrdquo in Pro-ceedings of the 18th ACM SIGKDD International Conference onKnowledge Discovery and Data Mining (KDD rsquo12) pp 186ndash194ACM Beijing China August 2012

[18] Y Wei Y Zheng and Q Yang ldquoTransfer knowledge betweencitiesrdquo KDD 2016

[19] V Hughes ldquoWhere therersquos smokerdquoNature vol 489 no 7417 ppS18ndashS20 2012

[20] J Ramos ldquoUsing tf-idf to determine word relevance in doc-ument queriesrdquo in Proceedings of the 1st Instructional Confer-ence on Machine Learning 2003 httpswwwsemanticscholarorgpaperUsing-Tf-idf-to-Determine-Word-Relevance-in-Ramosb3bf6373ff41a115197cb5b30e57830c16130c2c

[21] Y Fu Y Ge Y Zheng et al ldquoSparse real estate rankingwith online user reviews and offline moving behaviorsrdquo inProceedings of the 14th IEEE International Conference on DataMining (ICDM rsquo14) pp 120ndash129 Shenzhen China December2014

[22] P Jensen ldquoNetwork-based predictions of retail store commer-cial categories and optimal locationsrdquo Physical Review E vol 74no 3 Article ID 035101 2006

[23] C-C Chang and C-J Lin ldquoLIBSVM a Library for supportvector machinesrdquo ACM Transactions on Intelligent Systems andTechnology vol 2 no 3 article 27 2011

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 7: Research Article ELM Meets Urban Big Data …downloads.hindawi.com/journals/cin/2016/4970246.pdfResearch Article ELM Meets Urban Big Data Analysis: Case Studies NingyuZhang, 1 HuajunChen,

Computational Intelligence and Neuroscience 7

Table 1 Details of the datasets

Datasets Size (M) SourcesComments 2523 httpdianpingcomTweets 11023 httpweibocom httptwittercomBuses 254 httpchelailenetcnTraffic 119 httpwwwnitrafficindexcomReal estate 35 httpwwwsoufuncomAir 534 httpwwwpm25inPOI business areas 10 httpmapbaiducomRoad network 9 httpopenstreetmaporgMeteorological 98 httpforecastio httpnoaagov

5 Experiments

51 UrbanData Table 1 lists the data sources For social viewwe crawl online business reviews from httpwwwdianpingcom which is a site for reviewing business establishmentsin China Each review includes the shop ID name addresslatitude longitude consumption cost star rating POI cat-egory city environment service overall ratings and com-ments We also crawl huge data from Weibo (a twitter-like website in China) and Twitter with geotags For phys-ical view we extract features from smart card transactionsfrom open web API Moreover we crawl the traffic indexfrom httpwwwnitrafficindexcom which is open to thepublic Finally we crawl the estate data from httpwwwsoufuncom which is the largest real estate online systemin China Moreover we crawl huge air quality data fromhttpwwwpm25in We collected POI and business areasdata from Baidu maps for each city The road network datawas gathered from OpenStreetMaps We collected meteo-rological data from the National Oceanic and AtmosphericAdministrationrsquos (NOAA) web service and httpforecastioevery hour All the experiments can be obtained from openweb service The source code of this paper can be obtainedfrom httpsgithubcomzxlzrELM-for-Urban-Computing

52 Metrics In total we have two main metrics for ourframework the precision and efficiency For specific casestudies for smog-related health hazard prediction we havethe following

Precision and Recall The precision and recall are given asfollows

Precision =1003816100381610038161003816cap (prediction set reference set)10038161003816100381610038161003816100381610038161003816prediction set1003816100381610038161003816

Recall =1003816100381610038161003816cap (prediction set reference set)1003816100381610038161003816

|reference set| (22)

For optimal retain store placement we have the following

Normalized Discounted Cumulative Gain The discountedcumulative gain (DCG119873) is given by

DCG [119899] =

rel1 if 119899 = 1

DCG [119899 minus 1] + rel119899log2119899

if 119899 ge 2 (23)

Later given the ideal discounted cumulative gain DCG1015840NDCG at the 119899th position can be computed as NDCG[119899] =DCG[119899]DCG1015840[119899] The larger the value of NDCG119873 thehigher the top-119873 ranking accuracy

Precision and Recall We choose to use a four-level ratingsystem as (3 gt 2 gt 1 gt 0) To simplify our evaluation taskwe treat the ratings less than 2 as low values and ratings of 3as high values In a top-119873 block list 119864119873 sorted in descendingorder of the prediction values we define the precision andrecall as precision119873 = (119864119873cap119864ge2)119873 and recall119873 = (119864119873cap119864ge2)119864ge2 where119864ge2 are blocks whose ratings are greater thanor equal to 2 For efficiency we recorded the training time andcompare with baselines

53 Results For smog-related health hazard prediction thestacked ELM is trained hidden layer ANN with 10 to 40hidden nodes the BP is trained ANN with 2 to 3 hiddenlayers and 8 to 15 nodes in each hidden layer Two classicSVM regression methods nu-SVR and epsilon-SVR areprovided by LIBSVM [23] Random forest regressionmethodis provided by sklearn Meanwhile The accuracies of thehealth hazard prediction models using our framework nu-SVR epsilon-SVR and random forest are shown in Table 2119901(nf) means the precision of methods without knowledgefusion We can find that two ANNsrsquo methods outperform theSVM regression methods and the random forest regressionmethod in forecasting the next day PHI and the ELMachieves slightly higher prediction accuracy than themultiplehidden layers ANNs trained by BP

For optimal retain store placement the models trainedwith a single cityrsquos data are used as baselines Our datasetshave the data obtained from five cities in China The blocksin each city have a label 0 1 2 3 from httpwwwdianpingcom based on the values observed by users For examplewe have ldquo2131rdquo ldquo2rdquo means the id of city ldquo13rdquo means the idof block and the final ldquo2rdquo is the label We treat each city as asingle domain containing hundreds of blocks For exampleif we want to do optimal retain store placement in Hangzhouwe use 75 of data in Hangzhou and all the other citiesrsquo datato train The remaining 25 of data in Hangzhou are fortesting Actually Hangzhou is treated as target domain whilethe other cities are source domains

Figure 4 shows the results of concatenation knowledgefusion and sampling methods for Starbucks The concate-nation method only concatenates social view and physicalview into one single view to adapt to the learning settingThe results show that knowledge fusion works better Thesampling method means we manually filter some negativesamples based on knowledge fusion which will make betterresults as figure shows

8 Computational Intelligence and Neuroscience

Table 2 The results of smog-related health hazard prediction

Cities Stacked ELM BP nu-SVR Epsilon-SVR Random forest119901 (nf) 119901 Time 119901 (nf) 119901 Time 119901 (nf) 119901 Time 119901 (nf) 119901 Time 119901 (nf) 119901 Time

Beijing 065 080 3 066 079 15 078 069 9 056 053 9 010 015 7Tianjin 092 089 3 056 059 13 058 073 10 066 073 11 015 014 8Shanghai 053 088 4 055 069 16 055 064 11 053 053 10 012 030 8Hangzhou 031 086 3 063 065 10 054 055 9 066 083 9 015 034 7Guangzhou 053 054 5 065 069 15 053 066 11 048 053 21 010 034 8Average 075 073 4 056 056 15 059 065 10 059 073 10 020 034 7

3 5 7 100

50

100

ConcatenationKnowledge fusionSampling

(a) NDCG119873

3 5 7 100

50

100

ConcatenationKnowledge fusionSampling

(b) Precision119873

3 5 7 10

ConcatenationKnowledge fusionSampling

0

2

4

6

8

(c) Recall119873

Figure 4 NDCG precision and recall of 119873 for Starbucks in Beijing

Computational Intelligence and Neuroscience 9

3 5 7 100

50

100

LBFSInflection rulesAll

(a) NDCG119873

3 5 7 100

50

100

LBFSInflection rulesAll

(b) Precision119873

3 5 7 10

LBFSInflection rulesAll

0

5

10

(c) Recall119873

Figure 5 NDCG precision and recall of 119873 for Starbucks in Beijing

Figure 5 shows the results of LBFS inflection rules andall methods for Starbucks The LBFS method means weuse location based feature selection to build features whilenot using the average value based on the methods in thepast paragraph Actually LBFS works better The inflectionrules method means we use rules to calculate the finalscore if the area has spatial news such as ldquonew subwaydemolitionrdquo The inflection rules consider the situations thatour algorithm may not cover and make our method morerobust The all method includes DAELM inflection rulesLBFS sampling and knowledge fusion There is no doubtthat the all method works best Moreover our frameworkperforms more efficiently In Table 3 we present the resultsobtained for the NDCG10 metric and training time for all

features across the three chains The numbers in the bracketsare minutes of training In all cases we observe a significantimprovement in precision and efficiency with respect to thebaseline

6 Conclusion

In this paper we propose a general application frameworkof ELM for urban computing and list three case studiesExperimental results showed that our approach is applicableand efficient compared with baselines

In the future we plan to apply our approach to moreapplications Moreover we would like to study the distribu-tion of our framework so it can handle more massive data

10 Computational Intelligence and Neuroscience

Table 3 The best average NDCG10 results of optimal retain storeplacement

Cities Starbucks TrueKungFu YongheKingMART (single city)

Beijing 0743 (15) 0643 (14) 0725 (14)Shanghai 0712 (15) 0689 (14) 0712 (12)Hangzhou 0576 (13) 0611 (12) 0691 (10)Guangzhou 0783 (15) 0691 (13) 0721 (12)Shenzhen 0781 (16) 0711 (15) 0722 (13)

RankBoost (single city)Beijing 0752 (23) 0678 (20) 0712 (19)Shanghai 0725 (25) 0667 (20) 0783 (21)Hangzhou 0723 (21) 0575 (19) 0724 (15)Guangzhou 0812 (22) 0782 (19) 0812 (18)Shenzhen 0724 (22) 0784 (17) 0712 (12)

BP (Single city)Beijing 0753 (45) 0658 (40) 0702 (41)Shanghai 0724 (44) 0657 (42) 07283 (44)Hangzhou 0725 (41) 0555 (40) 0714 (45)Guangzhou 0832 (45) 0772 (42) 0512 (41)Shenzhen 0722 (47) 0754 (42) 0702 (45)

DAELMBeijing 0755 (3) 0712 (2) 0755 (2)Shanghai 0745 (5) 0751 (3) 0783 (1)Hangzhou 0755 (5) 0711 (2) 0752 (5)Guangzhou 0810 (9) 0810 (5) 0823 (4)Shenzhen 0780 (9) 0783 (5) 0723 (8)

Competing Interests

The authors declare that they have no competing interests

References

[1] Y Zheng L Capra O Wolfson and H Yang ldquoUrban comput-ing concepts methodologies and applicationsrdquo ACM Transac-tions on Intelligent Systems and Technology vol 5 no 3 article38 2014

[2] Y Zheng F Liu andH-P Hsieh ldquoU-air when urban air qualityinference meets big datardquo KDD 2013

[3] G-B Huang Q-Y Zhu and C-K Siew ldquoExtreme learningmachine theory and applicationsrdquoNeurocomputing vol 70 no1ndash3 pp 489ndash501 2006

[4] H Zhou G-B Huang Z Lin HWang and Y C Soh ldquoStackedextreme learning machinesrdquo IEEE Transactions on Cyberneticsvol 45 no 9 pp 2013ndash2025 2015

[5] J Ngiam A Khosla M Kim J Nam H Lee and A Y NgldquoMultimodal deep learningrdquo in Proceedings of the 28th Inter-national Conference on Machine Learning (ICML rsquo11) BellevueWash USA 2011

[6] L Zhang and D Zhang ldquoDomain adaptation extreme learningmachines for drift compensation in e-nose systemsrdquo IEEETransactions on Instrumentation and Measurement vol 64 no7 pp 1790ndash1801 2015

[7] Y Zheng ldquoMethodologies for cross-domain data fusion anoverviewrdquo IEEE Transactions on Big Data vol 1 no 1 pp 16ndash342015

[8] Y Zheng Y Liu J Yuan and X Xie ldquoUrban computing withtaxicabsrdquo HUC 2011

[9] N Srivastava and R Salakhutdinov ldquoMultimodal learningwith deep Boltzmann machinesrdquo Journal of Machine LearningResearch vol 15 pp 2949ndash2980 2014

[10] A Blum and T Mitchell ldquoCombining labeled and unlabeleddata with co-trainingrdquo in Proceedings of the 11th Annual Confer-ence on Computational LearningTheory (COLT rsquo98) pp 92ndash100Madison Wis USA July 1998

[11] M Gonen and E Alpaydın ldquoMultiple kernel learning algo-rithmsrdquo Journal of Machine Learning Research vol 12 pp 2211ndash2268 2011

[12] Y Zheng X YiM Li et al ldquoForecasting fine-grained air qualitybased on big datardquo in Proceedings of the 21st ACM SIGKDDConference onKnowledge Discovery andDataMining (KDD rsquo15)pp 2267ndash2276 August 2015

[13] N J Yuan Y Zheng X Xie Y Wang K Zheng and H XiongldquoDiscovering urban functional zones using latent activity trajec-toriesrdquo IEEE Transactions on Knowledge and Data Engineeringvol 27 no 3 pp 712ndash725 2015

[14] J Chen H Chen Z Wu D Hu and J Z Pan ldquoForecastingsmog-related health hazard based on social media and physicalsensorrdquo Information Systems 2016

[15] S Scellato A Noulas and C Mascolo ldquoExploiting place fea-tures in link prediction on location-based social networksrdquo inProceedings of the 17th ACM SIGKDD International Conferenceon Knowledge Discovery and Data Mining (KDD rsquo11) pp 1046ndash1054 August 2011

[16] N Zhang H Chen X Chen and J Chen ldquoELM meets urbancomputing ensemble urban data for smart city applicationrdquo inProceedings of ELM-2015 Volume 1 pp 51ndash63 Springer 2016

[17] J Yuan Y Zheng and X Xie ldquoDiscovering regions of differentfunctions in a city using human mobility and POIsrdquo in Pro-ceedings of the 18th ACM SIGKDD International Conference onKnowledge Discovery and Data Mining (KDD rsquo12) pp 186ndash194ACM Beijing China August 2012

[18] Y Wei Y Zheng and Q Yang ldquoTransfer knowledge betweencitiesrdquo KDD 2016

[19] V Hughes ldquoWhere therersquos smokerdquoNature vol 489 no 7417 ppS18ndashS20 2012

[20] J Ramos ldquoUsing tf-idf to determine word relevance in doc-ument queriesrdquo in Proceedings of the 1st Instructional Confer-ence on Machine Learning 2003 httpswwwsemanticscholarorgpaperUsing-Tf-idf-to-Determine-Word-Relevance-in-Ramosb3bf6373ff41a115197cb5b30e57830c16130c2c

[21] Y Fu Y Ge Y Zheng et al ldquoSparse real estate rankingwith online user reviews and offline moving behaviorsrdquo inProceedings of the 14th IEEE International Conference on DataMining (ICDM rsquo14) pp 120ndash129 Shenzhen China December2014

[22] P Jensen ldquoNetwork-based predictions of retail store commer-cial categories and optimal locationsrdquo Physical Review E vol 74no 3 Article ID 035101 2006

[23] C-C Chang and C-J Lin ldquoLIBSVM a Library for supportvector machinesrdquo ACM Transactions on Intelligent Systems andTechnology vol 2 no 3 article 27 2011

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 8: Research Article ELM Meets Urban Big Data …downloads.hindawi.com/journals/cin/2016/4970246.pdfResearch Article ELM Meets Urban Big Data Analysis: Case Studies NingyuZhang, 1 HuajunChen,

8 Computational Intelligence and Neuroscience

Table 2 The results of smog-related health hazard prediction

Cities Stacked ELM BP nu-SVR Epsilon-SVR Random forest119901 (nf) 119901 Time 119901 (nf) 119901 Time 119901 (nf) 119901 Time 119901 (nf) 119901 Time 119901 (nf) 119901 Time

Beijing 065 080 3 066 079 15 078 069 9 056 053 9 010 015 7Tianjin 092 089 3 056 059 13 058 073 10 066 073 11 015 014 8Shanghai 053 088 4 055 069 16 055 064 11 053 053 10 012 030 8Hangzhou 031 086 3 063 065 10 054 055 9 066 083 9 015 034 7Guangzhou 053 054 5 065 069 15 053 066 11 048 053 21 010 034 8Average 075 073 4 056 056 15 059 065 10 059 073 10 020 034 7

3 5 7 100

50

100

ConcatenationKnowledge fusionSampling

(a) NDCG119873

3 5 7 100

50

100

ConcatenationKnowledge fusionSampling

(b) Precision119873

3 5 7 10

ConcatenationKnowledge fusionSampling

0

2

4

6

8

(c) Recall119873

Figure 4 NDCG precision and recall of 119873 for Starbucks in Beijing

Computational Intelligence and Neuroscience 9

3 5 7 100

50

100

LBFSInflection rulesAll

(a) NDCG119873

3 5 7 100

50

100

LBFSInflection rulesAll

(b) Precision119873

3 5 7 10

LBFSInflection rulesAll

0

5

10

(c) Recall119873

Figure 5 NDCG precision and recall of 119873 for Starbucks in Beijing

Figure 5 shows the results of LBFS inflection rules andall methods for Starbucks The LBFS method means weuse location based feature selection to build features whilenot using the average value based on the methods in thepast paragraph Actually LBFS works better The inflectionrules method means we use rules to calculate the finalscore if the area has spatial news such as ldquonew subwaydemolitionrdquo The inflection rules consider the situations thatour algorithm may not cover and make our method morerobust The all method includes DAELM inflection rulesLBFS sampling and knowledge fusion There is no doubtthat the all method works best Moreover our frameworkperforms more efficiently In Table 3 we present the resultsobtained for the NDCG10 metric and training time for all

features across the three chains The numbers in the bracketsare minutes of training In all cases we observe a significantimprovement in precision and efficiency with respect to thebaseline

6 Conclusion

In this paper we propose a general application frameworkof ELM for urban computing and list three case studiesExperimental results showed that our approach is applicableand efficient compared with baselines

In the future we plan to apply our approach to moreapplications Moreover we would like to study the distribu-tion of our framework so it can handle more massive data

10 Computational Intelligence and Neuroscience

Table 3 The best average NDCG10 results of optimal retain storeplacement

Cities Starbucks TrueKungFu YongheKingMART (single city)

Beijing 0743 (15) 0643 (14) 0725 (14)Shanghai 0712 (15) 0689 (14) 0712 (12)Hangzhou 0576 (13) 0611 (12) 0691 (10)Guangzhou 0783 (15) 0691 (13) 0721 (12)Shenzhen 0781 (16) 0711 (15) 0722 (13)

RankBoost (single city)Beijing 0752 (23) 0678 (20) 0712 (19)Shanghai 0725 (25) 0667 (20) 0783 (21)Hangzhou 0723 (21) 0575 (19) 0724 (15)Guangzhou 0812 (22) 0782 (19) 0812 (18)Shenzhen 0724 (22) 0784 (17) 0712 (12)

BP (Single city)Beijing 0753 (45) 0658 (40) 0702 (41)Shanghai 0724 (44) 0657 (42) 07283 (44)Hangzhou 0725 (41) 0555 (40) 0714 (45)Guangzhou 0832 (45) 0772 (42) 0512 (41)Shenzhen 0722 (47) 0754 (42) 0702 (45)

DAELMBeijing 0755 (3) 0712 (2) 0755 (2)Shanghai 0745 (5) 0751 (3) 0783 (1)Hangzhou 0755 (5) 0711 (2) 0752 (5)Guangzhou 0810 (9) 0810 (5) 0823 (4)Shenzhen 0780 (9) 0783 (5) 0723 (8)

Competing Interests

The authors declare that they have no competing interests

References

[1] Y Zheng L Capra O Wolfson and H Yang ldquoUrban comput-ing concepts methodologies and applicationsrdquo ACM Transac-tions on Intelligent Systems and Technology vol 5 no 3 article38 2014

[2] Y Zheng F Liu andH-P Hsieh ldquoU-air when urban air qualityinference meets big datardquo KDD 2013

[3] G-B Huang Q-Y Zhu and C-K Siew ldquoExtreme learningmachine theory and applicationsrdquoNeurocomputing vol 70 no1ndash3 pp 489ndash501 2006

[4] H Zhou G-B Huang Z Lin HWang and Y C Soh ldquoStackedextreme learning machinesrdquo IEEE Transactions on Cyberneticsvol 45 no 9 pp 2013ndash2025 2015

[5] J Ngiam A Khosla M Kim J Nam H Lee and A Y NgldquoMultimodal deep learningrdquo in Proceedings of the 28th Inter-national Conference on Machine Learning (ICML rsquo11) BellevueWash USA 2011

[6] L Zhang and D Zhang ldquoDomain adaptation extreme learningmachines for drift compensation in e-nose systemsrdquo IEEETransactions on Instrumentation and Measurement vol 64 no7 pp 1790ndash1801 2015

[7] Y Zheng ldquoMethodologies for cross-domain data fusion anoverviewrdquo IEEE Transactions on Big Data vol 1 no 1 pp 16ndash342015

[8] Y Zheng Y Liu J Yuan and X Xie ldquoUrban computing withtaxicabsrdquo HUC 2011

[9] N Srivastava and R Salakhutdinov ldquoMultimodal learningwith deep Boltzmann machinesrdquo Journal of Machine LearningResearch vol 15 pp 2949ndash2980 2014

[10] A Blum and T Mitchell ldquoCombining labeled and unlabeleddata with co-trainingrdquo in Proceedings of the 11th Annual Confer-ence on Computational LearningTheory (COLT rsquo98) pp 92ndash100Madison Wis USA July 1998

[11] M Gonen and E Alpaydın ldquoMultiple kernel learning algo-rithmsrdquo Journal of Machine Learning Research vol 12 pp 2211ndash2268 2011

[12] Y Zheng X YiM Li et al ldquoForecasting fine-grained air qualitybased on big datardquo in Proceedings of the 21st ACM SIGKDDConference onKnowledge Discovery andDataMining (KDD rsquo15)pp 2267ndash2276 August 2015

[13] N J Yuan Y Zheng X Xie Y Wang K Zheng and H XiongldquoDiscovering urban functional zones using latent activity trajec-toriesrdquo IEEE Transactions on Knowledge and Data Engineeringvol 27 no 3 pp 712ndash725 2015

[14] J Chen H Chen Z Wu D Hu and J Z Pan ldquoForecastingsmog-related health hazard based on social media and physicalsensorrdquo Information Systems 2016

[15] S Scellato A Noulas and C Mascolo ldquoExploiting place fea-tures in link prediction on location-based social networksrdquo inProceedings of the 17th ACM SIGKDD International Conferenceon Knowledge Discovery and Data Mining (KDD rsquo11) pp 1046ndash1054 August 2011

[16] N Zhang H Chen X Chen and J Chen ldquoELM meets urbancomputing ensemble urban data for smart city applicationrdquo inProceedings of ELM-2015 Volume 1 pp 51ndash63 Springer 2016

[17] J Yuan Y Zheng and X Xie ldquoDiscovering regions of differentfunctions in a city using human mobility and POIsrdquo in Pro-ceedings of the 18th ACM SIGKDD International Conference onKnowledge Discovery and Data Mining (KDD rsquo12) pp 186ndash194ACM Beijing China August 2012

[18] Y Wei Y Zheng and Q Yang ldquoTransfer knowledge betweencitiesrdquo KDD 2016

[19] V Hughes ldquoWhere therersquos smokerdquoNature vol 489 no 7417 ppS18ndashS20 2012

[20] J Ramos ldquoUsing tf-idf to determine word relevance in doc-ument queriesrdquo in Proceedings of the 1st Instructional Confer-ence on Machine Learning 2003 httpswwwsemanticscholarorgpaperUsing-Tf-idf-to-Determine-Word-Relevance-in-Ramosb3bf6373ff41a115197cb5b30e57830c16130c2c

[21] Y Fu Y Ge Y Zheng et al ldquoSparse real estate rankingwith online user reviews and offline moving behaviorsrdquo inProceedings of the 14th IEEE International Conference on DataMining (ICDM rsquo14) pp 120ndash129 Shenzhen China December2014

[22] P Jensen ldquoNetwork-based predictions of retail store commer-cial categories and optimal locationsrdquo Physical Review E vol 74no 3 Article ID 035101 2006

[23] C-C Chang and C-J Lin ldquoLIBSVM a Library for supportvector machinesrdquo ACM Transactions on Intelligent Systems andTechnology vol 2 no 3 article 27 2011

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 9: Research Article ELM Meets Urban Big Data …downloads.hindawi.com/journals/cin/2016/4970246.pdfResearch Article ELM Meets Urban Big Data Analysis: Case Studies NingyuZhang, 1 HuajunChen,

Computational Intelligence and Neuroscience 9

3 5 7 100

50

100

LBFSInflection rulesAll

(a) NDCG119873

3 5 7 100

50

100

LBFSInflection rulesAll

(b) Precision119873

3 5 7 10

LBFSInflection rulesAll

0

5

10

(c) Recall119873

Figure 5 NDCG precision and recall of 119873 for Starbucks in Beijing

Figure 5 shows the results of LBFS inflection rules andall methods for Starbucks The LBFS method means weuse location based feature selection to build features whilenot using the average value based on the methods in thepast paragraph Actually LBFS works better The inflectionrules method means we use rules to calculate the finalscore if the area has spatial news such as ldquonew subwaydemolitionrdquo The inflection rules consider the situations thatour algorithm may not cover and make our method morerobust The all method includes DAELM inflection rulesLBFS sampling and knowledge fusion There is no doubtthat the all method works best Moreover our frameworkperforms more efficiently In Table 3 we present the resultsobtained for the NDCG10 metric and training time for all

features across the three chains The numbers in the bracketsare minutes of training In all cases we observe a significantimprovement in precision and efficiency with respect to thebaseline

6 Conclusion

In this paper we propose a general application frameworkof ELM for urban computing and list three case studiesExperimental results showed that our approach is applicableand efficient compared with baselines

In the future we plan to apply our approach to moreapplications Moreover we would like to study the distribu-tion of our framework so it can handle more massive data

10 Computational Intelligence and Neuroscience

Table 3 The best average NDCG10 results of optimal retain storeplacement

Cities Starbucks TrueKungFu YongheKingMART (single city)

Beijing 0743 (15) 0643 (14) 0725 (14)Shanghai 0712 (15) 0689 (14) 0712 (12)Hangzhou 0576 (13) 0611 (12) 0691 (10)Guangzhou 0783 (15) 0691 (13) 0721 (12)Shenzhen 0781 (16) 0711 (15) 0722 (13)

RankBoost (single city)Beijing 0752 (23) 0678 (20) 0712 (19)Shanghai 0725 (25) 0667 (20) 0783 (21)Hangzhou 0723 (21) 0575 (19) 0724 (15)Guangzhou 0812 (22) 0782 (19) 0812 (18)Shenzhen 0724 (22) 0784 (17) 0712 (12)

BP (Single city)Beijing 0753 (45) 0658 (40) 0702 (41)Shanghai 0724 (44) 0657 (42) 07283 (44)Hangzhou 0725 (41) 0555 (40) 0714 (45)Guangzhou 0832 (45) 0772 (42) 0512 (41)Shenzhen 0722 (47) 0754 (42) 0702 (45)

DAELMBeijing 0755 (3) 0712 (2) 0755 (2)Shanghai 0745 (5) 0751 (3) 0783 (1)Hangzhou 0755 (5) 0711 (2) 0752 (5)Guangzhou 0810 (9) 0810 (5) 0823 (4)Shenzhen 0780 (9) 0783 (5) 0723 (8)

Competing Interests

The authors declare that they have no competing interests

References

[1] Y Zheng L Capra O Wolfson and H Yang ldquoUrban comput-ing concepts methodologies and applicationsrdquo ACM Transac-tions on Intelligent Systems and Technology vol 5 no 3 article38 2014

[2] Y Zheng F Liu andH-P Hsieh ldquoU-air when urban air qualityinference meets big datardquo KDD 2013

[3] G-B Huang Q-Y Zhu and C-K Siew ldquoExtreme learningmachine theory and applicationsrdquoNeurocomputing vol 70 no1ndash3 pp 489ndash501 2006

[4] H Zhou G-B Huang Z Lin HWang and Y C Soh ldquoStackedextreme learning machinesrdquo IEEE Transactions on Cyberneticsvol 45 no 9 pp 2013ndash2025 2015

[5] J Ngiam A Khosla M Kim J Nam H Lee and A Y NgldquoMultimodal deep learningrdquo in Proceedings of the 28th Inter-national Conference on Machine Learning (ICML rsquo11) BellevueWash USA 2011

[6] L Zhang and D Zhang ldquoDomain adaptation extreme learningmachines for drift compensation in e-nose systemsrdquo IEEETransactions on Instrumentation and Measurement vol 64 no7 pp 1790ndash1801 2015

[7] Y Zheng ldquoMethodologies for cross-domain data fusion anoverviewrdquo IEEE Transactions on Big Data vol 1 no 1 pp 16ndash342015

[8] Y Zheng Y Liu J Yuan and X Xie ldquoUrban computing withtaxicabsrdquo HUC 2011

[9] N Srivastava and R Salakhutdinov ldquoMultimodal learningwith deep Boltzmann machinesrdquo Journal of Machine LearningResearch vol 15 pp 2949ndash2980 2014

[10] A Blum and T Mitchell ldquoCombining labeled and unlabeleddata with co-trainingrdquo in Proceedings of the 11th Annual Confer-ence on Computational LearningTheory (COLT rsquo98) pp 92ndash100Madison Wis USA July 1998

[11] M Gonen and E Alpaydın ldquoMultiple kernel learning algo-rithmsrdquo Journal of Machine Learning Research vol 12 pp 2211ndash2268 2011

[12] Y Zheng X YiM Li et al ldquoForecasting fine-grained air qualitybased on big datardquo in Proceedings of the 21st ACM SIGKDDConference onKnowledge Discovery andDataMining (KDD rsquo15)pp 2267ndash2276 August 2015

[13] N J Yuan Y Zheng X Xie Y Wang K Zheng and H XiongldquoDiscovering urban functional zones using latent activity trajec-toriesrdquo IEEE Transactions on Knowledge and Data Engineeringvol 27 no 3 pp 712ndash725 2015

[14] J Chen H Chen Z Wu D Hu and J Z Pan ldquoForecastingsmog-related health hazard based on social media and physicalsensorrdquo Information Systems 2016

[15] S Scellato A Noulas and C Mascolo ldquoExploiting place fea-tures in link prediction on location-based social networksrdquo inProceedings of the 17th ACM SIGKDD International Conferenceon Knowledge Discovery and Data Mining (KDD rsquo11) pp 1046ndash1054 August 2011

[16] N Zhang H Chen X Chen and J Chen ldquoELM meets urbancomputing ensemble urban data for smart city applicationrdquo inProceedings of ELM-2015 Volume 1 pp 51ndash63 Springer 2016

[17] J Yuan Y Zheng and X Xie ldquoDiscovering regions of differentfunctions in a city using human mobility and POIsrdquo in Pro-ceedings of the 18th ACM SIGKDD International Conference onKnowledge Discovery and Data Mining (KDD rsquo12) pp 186ndash194ACM Beijing China August 2012

[18] Y Wei Y Zheng and Q Yang ldquoTransfer knowledge betweencitiesrdquo KDD 2016

[19] V Hughes ldquoWhere therersquos smokerdquoNature vol 489 no 7417 ppS18ndashS20 2012

[20] J Ramos ldquoUsing tf-idf to determine word relevance in doc-ument queriesrdquo in Proceedings of the 1st Instructional Confer-ence on Machine Learning 2003 httpswwwsemanticscholarorgpaperUsing-Tf-idf-to-Determine-Word-Relevance-in-Ramosb3bf6373ff41a115197cb5b30e57830c16130c2c

[21] Y Fu Y Ge Y Zheng et al ldquoSparse real estate rankingwith online user reviews and offline moving behaviorsrdquo inProceedings of the 14th IEEE International Conference on DataMining (ICDM rsquo14) pp 120ndash129 Shenzhen China December2014

[22] P Jensen ldquoNetwork-based predictions of retail store commer-cial categories and optimal locationsrdquo Physical Review E vol 74no 3 Article ID 035101 2006

[23] C-C Chang and C-J Lin ldquoLIBSVM a Library for supportvector machinesrdquo ACM Transactions on Intelligent Systems andTechnology vol 2 no 3 article 27 2011

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 10: Research Article ELM Meets Urban Big Data …downloads.hindawi.com/journals/cin/2016/4970246.pdfResearch Article ELM Meets Urban Big Data Analysis: Case Studies NingyuZhang, 1 HuajunChen,

10 Computational Intelligence and Neuroscience

Table 3 The best average NDCG10 results of optimal retain storeplacement

Cities Starbucks TrueKungFu YongheKingMART (single city)

Beijing 0743 (15) 0643 (14) 0725 (14)Shanghai 0712 (15) 0689 (14) 0712 (12)Hangzhou 0576 (13) 0611 (12) 0691 (10)Guangzhou 0783 (15) 0691 (13) 0721 (12)Shenzhen 0781 (16) 0711 (15) 0722 (13)

RankBoost (single city)Beijing 0752 (23) 0678 (20) 0712 (19)Shanghai 0725 (25) 0667 (20) 0783 (21)Hangzhou 0723 (21) 0575 (19) 0724 (15)Guangzhou 0812 (22) 0782 (19) 0812 (18)Shenzhen 0724 (22) 0784 (17) 0712 (12)

BP (Single city)Beijing 0753 (45) 0658 (40) 0702 (41)Shanghai 0724 (44) 0657 (42) 07283 (44)Hangzhou 0725 (41) 0555 (40) 0714 (45)Guangzhou 0832 (45) 0772 (42) 0512 (41)Shenzhen 0722 (47) 0754 (42) 0702 (45)

DAELMBeijing 0755 (3) 0712 (2) 0755 (2)Shanghai 0745 (5) 0751 (3) 0783 (1)Hangzhou 0755 (5) 0711 (2) 0752 (5)Guangzhou 0810 (9) 0810 (5) 0823 (4)Shenzhen 0780 (9) 0783 (5) 0723 (8)

Competing Interests

The authors declare that they have no competing interests

References

[1] Y Zheng L Capra O Wolfson and H Yang ldquoUrban comput-ing concepts methodologies and applicationsrdquo ACM Transac-tions on Intelligent Systems and Technology vol 5 no 3 article38 2014

[2] Y Zheng F Liu andH-P Hsieh ldquoU-air when urban air qualityinference meets big datardquo KDD 2013

[3] G-B Huang Q-Y Zhu and C-K Siew ldquoExtreme learningmachine theory and applicationsrdquoNeurocomputing vol 70 no1ndash3 pp 489ndash501 2006

[4] H Zhou G-B Huang Z Lin HWang and Y C Soh ldquoStackedextreme learning machinesrdquo IEEE Transactions on Cyberneticsvol 45 no 9 pp 2013ndash2025 2015

[5] J Ngiam A Khosla M Kim J Nam H Lee and A Y NgldquoMultimodal deep learningrdquo in Proceedings of the 28th Inter-national Conference on Machine Learning (ICML rsquo11) BellevueWash USA 2011

[6] L Zhang and D Zhang ldquoDomain adaptation extreme learningmachines for drift compensation in e-nose systemsrdquo IEEETransactions on Instrumentation and Measurement vol 64 no7 pp 1790ndash1801 2015

[7] Y Zheng ldquoMethodologies for cross-domain data fusion anoverviewrdquo IEEE Transactions on Big Data vol 1 no 1 pp 16ndash342015

[8] Y Zheng Y Liu J Yuan and X Xie ldquoUrban computing withtaxicabsrdquo HUC 2011

[9] N Srivastava and R Salakhutdinov ldquoMultimodal learningwith deep Boltzmann machinesrdquo Journal of Machine LearningResearch vol 15 pp 2949ndash2980 2014

[10] A Blum and T Mitchell ldquoCombining labeled and unlabeleddata with co-trainingrdquo in Proceedings of the 11th Annual Confer-ence on Computational LearningTheory (COLT rsquo98) pp 92ndash100Madison Wis USA July 1998

[11] M Gonen and E Alpaydın ldquoMultiple kernel learning algo-rithmsrdquo Journal of Machine Learning Research vol 12 pp 2211ndash2268 2011

[12] Y Zheng X YiM Li et al ldquoForecasting fine-grained air qualitybased on big datardquo in Proceedings of the 21st ACM SIGKDDConference onKnowledge Discovery andDataMining (KDD rsquo15)pp 2267ndash2276 August 2015

[13] N J Yuan Y Zheng X Xie Y Wang K Zheng and H XiongldquoDiscovering urban functional zones using latent activity trajec-toriesrdquo IEEE Transactions on Knowledge and Data Engineeringvol 27 no 3 pp 712ndash725 2015

[14] J Chen H Chen Z Wu D Hu and J Z Pan ldquoForecastingsmog-related health hazard based on social media and physicalsensorrdquo Information Systems 2016

[15] S Scellato A Noulas and C Mascolo ldquoExploiting place fea-tures in link prediction on location-based social networksrdquo inProceedings of the 17th ACM SIGKDD International Conferenceon Knowledge Discovery and Data Mining (KDD rsquo11) pp 1046ndash1054 August 2011

[16] N Zhang H Chen X Chen and J Chen ldquoELM meets urbancomputing ensemble urban data for smart city applicationrdquo inProceedings of ELM-2015 Volume 1 pp 51ndash63 Springer 2016

[17] J Yuan Y Zheng and X Xie ldquoDiscovering regions of differentfunctions in a city using human mobility and POIsrdquo in Pro-ceedings of the 18th ACM SIGKDD International Conference onKnowledge Discovery and Data Mining (KDD rsquo12) pp 186ndash194ACM Beijing China August 2012

[18] Y Wei Y Zheng and Q Yang ldquoTransfer knowledge betweencitiesrdquo KDD 2016

[19] V Hughes ldquoWhere therersquos smokerdquoNature vol 489 no 7417 ppS18ndashS20 2012

[20] J Ramos ldquoUsing tf-idf to determine word relevance in doc-ument queriesrdquo in Proceedings of the 1st Instructional Confer-ence on Machine Learning 2003 httpswwwsemanticscholarorgpaperUsing-Tf-idf-to-Determine-Word-Relevance-in-Ramosb3bf6373ff41a115197cb5b30e57830c16130c2c

[21] Y Fu Y Ge Y Zheng et al ldquoSparse real estate rankingwith online user reviews and offline moving behaviorsrdquo inProceedings of the 14th IEEE International Conference on DataMining (ICDM rsquo14) pp 120ndash129 Shenzhen China December2014

[22] P Jensen ldquoNetwork-based predictions of retail store commer-cial categories and optimal locationsrdquo Physical Review E vol 74no 3 Article ID 035101 2006

[23] C-C Chang and C-J Lin ldquoLIBSVM a Library for supportvector machinesrdquo ACM Transactions on Intelligent Systems andTechnology vol 2 no 3 article 27 2011

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 11: Research Article ELM Meets Urban Big Data …downloads.hindawi.com/journals/cin/2016/4970246.pdfResearch Article ELM Meets Urban Big Data Analysis: Case Studies NingyuZhang, 1 HuajunChen,

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014