big aggregatedata

20.016 Urban Analysis BIG, AGGREGATE DATA Tanjong Pagar, Arab Street and Jalan Besar Clifford Mario Kosasih (1000294) Goh Pei Xuan (1000286) Kevin Josiah Neo Jun Hao (1000133) Oor Eiffel (1000293) Sharon Ho Jia Jia (100091)

Upload: clifford-mario-kosasih

Post on 15-Jul-2016


DESCRIPTION

Big Data in Architecture and Urbanism

TRANSCRIPT

Page 1: Big AggregateData

20.016 Urban Analysis

BIG, AGGREGATE DATA
Tanjong Pagar, Arab Street and Jalan Besar

Clifford Mario Kosasih (1000294)

Goh Pei Xuan (1000286)

Kevin Josiah Neo Jun Hao (1000133)

Oor Eiffel (1000293)

Sharon Ho Jia Jia (100091)


TABLE OF CONTENTS

1. INTRODUCTION

2. RESEARCH QUESTION

3. LITERATURE REVIEW

4. METHODOLOGY

5. HYPOTHESES

6. MAPS

7. BIG, AGGREGATE DATA

8. ANALYSIS

9. CONCLUSION

10. APPLICATION AND DESIGN RECOMMENDATIONS


1. INTRODUCTION

The rapid rise in the use of digital devices and geo-tagging technology has affected the fields of human geography, social science and urban planning in recent decades. The ease of accessing large amounts of data from diverse samples of people has eased the burden of collecting data through manual surveys. This experiment capitalizes on this data mining technology to predict and analyze collective human behavior and activities in three different cafe districts in Singapore: Tanjong Pagar, Arab Street and Jalan Besar. Using Foursquare check-in and Instagram post data, we compare the temporal variation of human activity in each area, how those collective activities relate to the diversity of the area, and how they relate to its proximity and accessibility to tourist destinations.

2. RESEARCH QUESTION

We are interested in understanding the spatio-temporal patterns of human activity generated from social media activity in different cafe districts, with respect to their urban attributes and characteristics. The three sites that we have chosen, Tanjong Pagar, Arab Street and Jalan Besar, are dedicated commercial districts filled with eating places such as restaurants and cafes.


3. LITERATURE REVIEW

Utilising location-based services to organise public life and plan urban spaces is increasingly becoming a reality with the availability and use of such services. Ahas R., Ülar M. (2005) discuss the implications and possibilities of communication technology. They state that, through social positioning, the use of real-time data would allow for a better understanding of the results and implications of policy decisions. This is made more significant by the increasing precision of mobile positioning, which allows for more accurate studies of patterns in the space-time movement of society.

Using check-ins from location sharing services to study social and temporal characteristics and to model patterns of mobility requires extracting data on temporal and geographic characteristics. Cheng, Z., Caverlee, J., Lee, K., & Sui, D. Z. (n.d.) have highlighted a clear process of formatting data and filtering noise from check-in collections to prepare information for analysis. They also analysed users' spatial activity using the radius of gyration, which indicates the spatial extent of a user's activity, as one way of making the available metadata more informative.

To properly utilise big data as a research method for understanding urban settings, it is essential to process information in relation to hypotheses about the urban environment, leading to quantifiable results that provide relevant understanding and grounds for urban intervention. Chen and Zhang (2012) demonstrate the relation of social media patterns to culture, community and food diversity.


4. METHODOLOGY

We used both Foursquare and Instagram data for this experiment. The Foursquare data was provided, and we used it to obtain the Instagram data. Using a given Foursquare location ID, we can retrieve the corresponding Instagram location ID, which can then be used to track the number of media items (images and videos) posted there, along with their attributes such as number of likes, number of comments, hashtags and the media itself.

Foursquare ID: 4eaa97b68b8180c90b73fa03

Key in: https://api.instagram.com/v1/locations/search?foursquare_v2_id=4eaa97b68b8180c90b73fa03

Output: "latitude": 1.302992956, "id": "9339214", "longitude": 103.859214062, "name": "Hara Village Restaurant"

Key in: https://api.instagram.com/v1/locations/9339214/media/recent

Output:
"attribution": null,
"tags": [ "thecurrypuffincident" ],
"type": "image",
"location": { "latitude": 1.302992956, "name": "Hara Village Restaurant", "longitude": 103.859214062, "id": 9339214 },
"comments": { "count": 0, "data": [] },
"filter": "X-Pro II",
"created_time": "1388222018",
"link": "https://instagram.com/p/idbm4JPCrX/",
"likes": { "count": 0, "data": [] },
"images": { "low_resolution": { "url": "https://scontent.cdninstagram.com/hphotos-xfp1/t51.2885-15/s306x306/e15/1515649_190096791186808_1241163523_n.jpg", "width": 306, "height": 306 },
"users_in_photo": [],
"caption": { "created_time": "1388222018", "text": "very crispy but thick crusty cp.\n#thecurrypuffincident", "from": { "username": "iotz", "profile_picture": "https://igcdn-photos-d-a.akamaihd.net/hphotos-ak-xaf1/t51.2885-19/10860232_787456637956747_1693700934_a.jpg", "id": "2338849", "full_name": "Lifei" },
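The two lookups above can be sketched in Python. This is a minimal sketch rather than the script used in the study: it only builds the two request URLs quoted in the text and summarises one media record of the shape shown in the output. The function names and the optional `access_token` parameter are assumptions made for illustration. No network request is made, since the Instagram v1 endpoints quoted here have since been retired.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

API_ROOT = "https://api.instagram.com/v1"  # base URL as quoted in the text


def location_search_url(foursquare_id, access_token=None):
    """Build the URL that maps a Foursquare venue ID to an Instagram location."""
    params = {"foursquare_v2_id": foursquare_id}
    if access_token:  # assumption: the text does not show how requests were authorised
        params["access_token"] = access_token
    return f"{API_ROOT}/locations/search?{urlencode(params)}"


def recent_media_url(instagram_location_id, access_token=None):
    """Build the URL that lists recent media posted at an Instagram location."""
    url = f"{API_ROOT}/locations/{instagram_location_id}/media/recent"
    if access_token:
        url += "?" + urlencode({"access_token": access_token})
    return url


def summarise_media(item):
    """Keep only the attributes the study tracks for one media record."""
    return {
        "type": item.get("type"),
        "tags": item.get("tags", []),
        "likes": item.get("likes", {}).get("count", 0),
        "comments": item.get("comments", {}).get("count", 0),
        # created_time is a Unix timestamp stored as a string
        "created_utc": datetime.fromtimestamp(int(item["created_time"]), tz=timezone.utc),
    }


# Record trimmed from the sample output in the text
sample = {
    "tags": ["thecurrypuffincident"],
    "type": "image",
    "comments": {"count": 0, "data": []},
    "created_time": "1388222018",
    "likes": {"count": 0, "data": []},
}

print(location_search_url("4eaa97b68b8180c90b73fa03"))
print(recent_media_url("9339214"))
print(summarise_media(sample))
```

Converting `created_time` to a datetime is what makes the temporal comparison possible: posts can then be grouped by hour or by day of the week.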


5. HYPOTHESES

Our first hypothesis states that cafe districts in close proximity to tourist destination districts have a lower user to check-in ratio. A lower user to check-in ratio means that check-ins to the place are more unique, with fewer repeated check-ins from users who frequent it. This rests on the assumption that no single user is disproportionately responsible for repeated check-ins to the same location; rather, every user has an equal chance of making repeated check-ins to a particular location.

Our second hypothesis states that cafe districts with a larger quantity and diversity of food options have a higher number of check-ins. Having more food choices attracts more consumers to the area, as they recognise these areas as food districts and tend to go there more frequently for their meals. This in turn raises the chance that people visit other nearby attractions before or after their meal and check in there too, increasing the total number of check-ins across the whole cafe district.

Our third hypothesis states that food outlets closer to transportation nodes have higher patronage. Greater accessibility makes it easier for customers to get to these food outlets. It also increases the number of passers-by, creating a higher chance of them patronizing the store.


6. MAPS

Tanjong Pagar

3316 points
1111 Food

10160 Instagram media posts
4701 Instagram food-related media posts


Arab Street

2270 points
563 Food

6408 Instagram media posts
2267 Instagram food-related media posts


Jalan Besar

1001 points
224 Food

2413 Instagram media posts
648 Instagram food-related media posts
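As a worked check on the counts listed for the three maps, the share of food-related entries in each district can be computed directly. This is only an illustrative sketch: the counts are copied from the maps above, while the dictionary layout and the function name `food_share` are assumptions, and "points" is taken at face value as the count printed on each map.

```python
# Counts copied from the three district maps
districts = {
    "Tanjong Pagar": {"points": 3316, "food_points": 1111, "posts": 10160, "food_posts": 4701},
    "Arab Street": {"points": 2270, "food_points": 563, "posts": 6408, "food_posts": 2267},
    "Jalan Besar": {"points": 1001, "food_points": 224, "posts": 2413, "food_posts": 648},
}


def food_share(food, total):
    """Return the food-related share of a total, as a percentage."""
    return 100.0 * food / total


for name, c in districts.items():
    point_share = food_share(c["food_points"], c["points"])
    post_share = food_share(c["food_posts"], c["posts"])
    print(f"{name}: {point_share:.1f}% of points and {post_share:.1f}% of Instagram posts are food-related")
```

Under this reading, the food-related share is higher among Instagram posts than among points in all three districts.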


7. BIG, AGGREGATE DATA

Instagram posts in Cafe districts (13 April - 19 April 2015): http://cdb.io/1OuoLe8

Instagram posts count: http://cdb.io/1D2Tc4k

Jalan Besar Map: http://cdb.io/1ItEWGL

Arab Street Map: http://cdb.io/1ItF1KL

Tanjong Pagar Map: http://cdb.io/1ItF4WS


8. ANALYSIS

[Figure: user to check-in ratio by category (scale 0 to 7), shown as three charts: Jalan Besar, Tanjong Pagar and Arab Street]

RELATIONSHIP BETWEEN PROXIMITY TO TOURIST DESTINATION WITH USER AND CHECK-IN RATIO

Among our 3 sites, we have identified Arab Street as the cafe district located nearest to a tourist destination, Kampong Glam. A lower user to check-in ratio means that there are more unique check-ins to the area. Comparing the 3 graphs against each other, we calculated that Arab Street has the lowest total user to check-in ratio at 22.3, compared with 28.9 for Jalan Besar and 35.0 for Tanjong Pagar.

This supports our hypothesis: users checking in to Arab Street are more unique, as they are more likely to be tourists visiting the nearby Kampong Glam. Tourists are normally first-time visitors and are unlikely to make repeated trips to the same location, as they are only here for a short period and move on to other tourist attractions. Furthermore, the only peak in the Arab Street graph belongs to the Professional category (offices). It makes sense for that category's user to check-in ratio to be higher, as people working in the area are likely to make multiple check-ins at their workplace.
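The text does not spell out how the ratio is computed. One reading consistent with the reported totals (all well above 1, with lower values meaning fewer repeat visits) is check-ins divided by unique users, computed per category and then summed over categories. The sketch below follows that assumed reading; the records, category names and function name are hypothetical and are not the study's data.

```python
from collections import defaultdict


def checkins_per_user_by_category(checkins):
    """Return {category: check-ins / unique users} for (user_id, category) records."""
    totals = defaultdict(int)
    users = defaultdict(set)
    for user_id, category in checkins:
        totals[category] += 1
        users[category].add(user_id)
    return {category: totals[category] / len(users[category]) for category in totals}


# Hypothetical check-in records: (user ID, venue category)
records = [
    ("u1", "Food"), ("u2", "Food"), ("u3", "Food"),
    ("u4", "Professional"), ("u4", "Professional"), ("u4", "Professional"),
    ("u5", "Professional"),
]

ratios = checkins_per_user_by_category(records)
district_total = sum(ratios.values())  # assumed meaning of the "total" ratio
print(ratios)
print(district_total)
```

In this toy data, every Food check-in comes from a different user, while one office worker accounts for most of the Professional check-ins, which is the pattern the paragraph above describes for Arab Street.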


RELATIONSHIP BETWEEN DIVERSITY OF FOOD OPTIONS AND NUMBER OF CHECK-INS

Diversity is defined by the number of different types of food options, as well as the quantity of each option. In this analysis, the diversity index is calculated using the Simpson’s Index of Diversity method¹, which takes into account the number of different food outlet types as well as the relative abundance of each. An area with a higher diversity of food options has more choices, with each choice present in similar quantity. A higher index value represents greater food diversity in the cafe district.

Simpson’s Index of Diversity (D) Formula:

$$D = 1 - \frac{\sum n(n-1)}{N(N-1)}$$

where
n: quantity of each food option
N: total number of food outlets
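A minimal Python sketch of the index, assuming the common form D = 1 - Σ n(n-1) / (N(N-1)) with n and N as defined above. The outlet counts below are hypothetical and serve only to show how the index responds to evenness; `simpson_diversity` is a name chosen here.

```python
def simpson_diversity(counts):
    """Simpson's Index of Diversity: 1 - sum(n * (n - 1)) / (N * (N - 1))."""
    counts = [n for n in counts if n > 0]
    total = sum(counts)  # N: total number of food outlets
    if total < 2:
        raise ValueError("need at least two outlets to compute the index")
    return 1.0 - sum(n * (n - 1) for n in counts) / (total * (total - 1))


# Hypothetical outlet counts per food type
even_mix = {"cafe": 10, "bakery": 10, "restaurant": 10}
skewed_mix = {"cafe": 26, "bakery": 2, "restaurant": 2}

print(simpson_diversity(even_mix.values()))
print(simpson_diversity(skewed_mix.values()))
```

Both districts in this example have 30 outlets across three types, but the evenly mixed one scores higher, matching the statement that diversity depends on relative abundance as well as the number of types.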

Looking at the food diversity indices of the 3 cafe districts, Arab Street has the highest food diversity index, followed by Tanjong Pagar and lastly Jalan Besar. However, this does not show a clear correlation between the diversity of food options and the total number of check-ins to the districts. Jalan Besar is consistent with the hypothesis, having both the lowest food diversity index and the lowest number of check-ins. Arab Street, however, has the highest food diversity index but not the most check-ins. This could be due to the sheer number of attractions in Tanjong Pagar, which can accommodate more people in the area and thus result in the greatest number of check-ins to the cafe district.

¹ Simpson’s Diversity Index. (2013, May 13). Retrieved April 19, 2015, from http://geographyfieldwork.com/Simpson’sDiversityIndex.htm


However, there is a strong correlation (R² = 1) between the food diversity index and the percentage of check-ins to food attractions across the three cafe districts. The greater the food diversity, the larger the proportion of total check-ins due to food. This suggests that a wider variety of food options attracts proportionately more people to the district for food, as the area becomes better known for its food options.
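For reference, the R² of a simple linear fit equals the square of Pearson's correlation coefficient and can be computed as sketched below. The diversity and percentage values are hypothetical placeholders, since the report does not list the underlying figures, and `r_squared` is a name chosen here.

```python
def r_squared(xs, ys):
    """Coefficient of determination of a simple linear fit (squared Pearson r)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("R^2 is undefined when one variable is constant")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical values for three districts (not the study's figures)
diversity_index = [0.60, 0.70, 0.80]
food_checkin_pct = [20.0, 27.0, 31.0]

print(r_squared(diversity_index, food_checkin_pct))
```

A value of exactly 1 means the three points lie on a single straight line.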

An interesting observation during the analysis was the disparity between the results from Instagram and Foursquare. The proportion of check-ins to food attractions was much higher on Instagram than on Foursquare. This could be due to the nature of the platforms: Instagram is more visual than Foursquare, and pictures of food are particularly enticing on a social media platform. It could also be due to the higher usage of Instagram² compared to Foursquare in Singapore.

² Social Media in Singapore 2014 [Infographic] | Social Media Statistics. (n.d.). Retrieved April 19, 2015, from http://www.hashmeta.com/social-media-singapore-infographic/


FOOD AND TRANSPORT (JALAN BESAR)

HIPSTER CAFES + ANCHORS

Mapping the food clusters against a heat map of bus check-in locations shows little correlation between the areas of highest accessibility and those of higher patronage.

Instead, competitive clustering of specialty food outlets increases patronage in the area. Clustering of cafes facilitates cafe hopping, where patrons visit cafe after cafe in the vicinity, which might also be a main contributor to the areas of higher patronage in Jalan Besar.

Specialty Food Outlets


FOOD AND TRANSPORT (ARAB STREET)

In the case of Arab Street’s vicinity, we can see that while the area of highest patronage does not coincide with a specific transportation node, it is however in the centre of several transportation nodes, making it highly accessible.

The most clustered area is along Bussorah Street, where the cuisine portrays the rich culture of the area.

Rich Culture


FOOD AND TRANSPORT (TANJONG PAGAR)

Patronage in the Tanjong Pagar area is generally well distributed, with both food and transportation nodes dispersed in the area.

A slightly higher patronage is observed along Smith Street, which is a small food street with a variety of international cuisine.

Diversity


9. CONCLUSION

Cafe districts that are in close proximity to tourist destination districts have a lower user to check-in ratio.
There is a positive correlation between the proximity of cafe districts to tourist destination districts and a lower user to check-in ratio. This illustrates the extent of uniqueness of users to such cafe districts and the implied gravity of tourist attractions as anchors in districts.

Cafe districts with larger quantity and diversity of food options have a higher number of check-ins.
This was supported only to a limited extent: there is a positive correlation between a larger diversity of food options and a higher proportion of check-ins to food attractions in the same district. However, this does not apply to the general patronage of the cafe district, as there are other attractor points present in these areas.

Food outlets that are closer to transportation nodes do not necessarily have higher patronage.
This disproved our initial hypothesis that more accessible food outlets have a greater number of check-ins. From the data collected, we observe that the location of the food outlet within the cafe district has a greater impact on patronage, due to the positive externalities of business clustering. Additionally, customers may have walked or taken private transport to the area, which is less influenced by the position of transportation nodes in the district.

Our experiment utilised data gathered from Instagram and Foursquare, which may not have captured the entire demographic of visitors to these cafe districts, especially the older generation. However, it highlights the key features of a particular location and is representative of the patronage trends in these cafe districts. Since the younger generation visits these areas more regularly, it is assumed that they form the majority of customers in the area. Check-in data could also be gathered from other social media, such as Facebook and Twitter, to get a more comprehensive picture of the check-in trends in these districts.


10. DESIGN RECOMMENDATIONS


As stated by Ahas R., Ülar M. (2005), Social Positioning Methods (SPM) have the capacity to influence the development of the urban environment through planning.

1. Advertising potential of cafes in cafe districts closer to tourist destinations

Our studies have indicated a correlation between lower user to check-in ratios and proximity to tourist destinations. This indicates the uniqueness of human traffic at such locations, which could be better served in the urban setting. More advertising platforms could be introduced at these locations to reach a more diverse spread of users, based on the implied diversity of human traffic present in such areas.

2. Wayfinding methods using tourist destinations as primary nodes and food outlets as secondary nodes

Informal directional instructions for wayfinding in neighbourhoods near tourist destinations might increasingly utilise food outlets as secondary placemakers to the primary placemakers of key tourist destinations. These might increasingly be used in signage and directional pamphlets to guide potential clientele and the public to and around specific businesses in the location.

3. Business strategies for businesses in clusters near tourist destinations

Collaborative business strategies could be executed by cafes located in such clusters near tourist destinations. An urban identity could be forged based on the existing demographic diversity of users around tourist destinations, creating a stronger link in the urban neighbourhood and a more cohesive and integrated urban environment.


11. BIBLIOGRAPHY

Ahas R., Ülar M., 2005, Location based services—new challenges for planning and public administration?, Futures, 37:6 547-561

Cheng, Z., Caverlee, J., Lee, K., & Sui, D. Z. (n.d.). Exploring millions of footprints in location sharing services. ICWSM, 2011

Eagle, N., & Pentland, S. (2007). Eigenbehaviors: Identifying Structure in Routine. Behavioral Ecology and Sociobiology.

González, C. M., & Barabási, A.-L. (2008). Understanding individual human mobility patterns. Nature, 453, 779–782.

Liu, Liu (2014). C-IMAGE: city cognitive mapping through geo-tagged photos. MIT Thesis.

Sevtsuk, A., Ratti, C., 2010, Does Urban Mobility Have a Daily Routine? Learning from the Aggregate Data of Mobile Networks, Journal of Urban Technology, Volume: 17, Issue: 1, Pages: 41-60.

Simpson’s Diversity Index. (2013, May 13). Retrieved April 19, 2015, from http://geographyfieldwork.com/Simpson’sDiversityIndex.htm

Social Media in Singapore 2014 [Infographic] | Social Media Statistics. (n.d.). Retrieved April 19, 2015, from http://www.hashmeta.com/social-media-singapore-infographic/
