discoveringthegraph-basedflowpatternsofcartouristsusing...

15
ResearchArticle Discovering the Graph-Based Flow Patterns of Car Tourists Using License Plate Data: A Case Study in Shenzhen, China He Bing, 1 Kong Bo , 2 Yin Ling, 1 Wu Qin, 3 Hu Jinxing, 1 Huang Dian, 4 and Ma Zhanwu 1 1 Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China 2 Institute of Mountain Hazards and Environment, Chinese Academy of Sciences, Chengdu 610041, China 3 Chengdu University of Information Technology, School of Computer Science, Chengdu 610225, China 4 National Supercomputing Center in Shenzhen, Shenzhen 518055, China Correspondence should be addressed to Kong Bo; [email protected] Received 2 January 2020; Revised 16 March 2020; Accepted 18 July 2020; Published 6 August 2020 Academic Editor: Young-Ji Byon Copyright©2020HeBingetal.isisanopenaccessarticledistributedundertheCreativeCommonsAttributionLicense,which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Identifying flow patterns from massive trajectories of car tourists is considered a promising way to improve the management of tourism traffic. Previous researches have mainly focused on tourist movements at the macro-scale, such as inbound, domestic, and urban tourism using flow maps. Compared with modeling the flow patterns of tourists at the macro-scale, modeling tourist flow at the microscale is more complicated. is paper takes Dapeng Island located in Shenzhen as the study area and uses the car recognition devices to collect traffic flow. Firstly, car tourists are separated from the mixed traffic flow after analyzing the spatial- temporal characteristics of tourists and residents. Next, daily graphs of tourist movements between road segments and tourist attractions are constructed. Finally, a frequent subgraph mining algorithm is used to extract the flow patterns of car tourists. e experimental results show that (1) car tourists have obvious preferences in the selection of trip time and tourist attractions; (2) the intercity tourists tend to take multidestination trips rather than a single destination trip in the same type of attractions; (3) car tourists are inclined to park their cars in an easy-to-access place, even if the attractions visited are changed. e main contribution of this paper is to present a new method for discovering the flow patterns of car tourists hidden in massive amounts of license plate data. 1. Introduction Due to the flexibility and convenience of road trans- portation, car-based tourism (travel in owned or rented cars, also named driving tours [1], car tourism [2], and self- driving tours [3]; for simplicity, this paper uses the term car tourism) has been one of the popular forms for leisure and recreation. Recently, car tourism has been growing rapidly in China, and its scale is continuing to expand with the im- provement of road infrastructure and the growth of car ownership. A statistical report indicated that by 2015, there were 2.34 billion car tourists in China, accounting for more than 58.5% of the total domestic tourists [4]. It can be foreseen that the percentage of car tourists will increase over time. However, car tourists need to share roads in urban areas or tourist attractions with residents, and they depend on the road network to achieve circulation between the places of origin and multiple tourist attractions. Currently, urban roads are heavily crowded. When a large number of tourist cars enter the road network during peak tourist season, the pressure on road traffic management may be increased. It is worth noting that this phenomenon is severe for coastal tourist attractions. Coastal islands are one of the favorite tourist destina- tions. e development of road network on the islands often precedes the development of tourist attractions, and new infrastructures and facilities are being built to handle the increase in tourist traffic. us, tourism activities tend to be superimposed on a spatial system and infrastructure net- work that was not explicitly designed to cater to them and tourism activities can be unevenly distributed [5]. Addi- tionally, some islands are connected to the mainland. e roads entering and leaving these islands have become bot- tlenecks for tourism transportation, which poses challenges Hindawi Journal of Advanced Transportation Volume 2020, Article ID 4795830, 15 pages https://doi.org/10.1155/2020/4795830

Upload: others

Post on 09-Aug-2020

8 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: DiscoveringtheGraph-BasedFlowPatternsofCarTouristsUsing ...downloads.hindawi.com/journals/jat/2020/4795830.pdfResearchArticle DiscoveringtheGraph-BasedFlowPatternsofCarTouristsUsing

Research ArticleDiscovering the Graph-Based Flow Patterns of Car Tourists UsingLicense Plate Data A Case Study in Shenzhen China

He Bing1 Kong Bo 2 Yin Ling1 Wu Qin3 Hu Jinxing1 Huang Dian4 and Ma Zhanwu1

1Shenzhen Institutes of Advanced Technology Chinese Academy of Sciences Shenzhen 518055 China2Institute of Mountain Hazards and Environment Chinese Academy of Sciences Chengdu 610041 China3Chengdu University of Information Technology School of Computer Science Chengdu 610225 China4National Supercomputing Center in Shenzhen Shenzhen 518055 China

Correspondence should be addressed to Kong Bo kongbo827imdeaccn

Received 2 January 2020 Revised 16 March 2020 Accepted 18 July 2020 Published 6 August 2020

Academic Editor Young-Ji Byon

Copyright copy 2020He Bing et al+is is an open access article distributed under the Creative Commons Attribution License whichpermits unrestricted use distribution and reproduction in any medium provided the original work is properly cited

Identifying flow patterns from massive trajectories of car tourists is considered a promising way to improve the management oftourism traffic Previous researches have mainly focused on tourist movements at the macro-scale such as inbound domestic andurban tourism using flowmaps Compared with modeling the flow patterns of tourists at the macro-scale modeling tourist flow atthe microscale is more complicated +is paper takes Dapeng Island located in Shenzhen as the study area and uses the carrecognition devices to collect traffic flow Firstly car tourists are separated from the mixed traffic flow after analyzing the spatial-temporal characteristics of tourists and residents Next daily graphs of tourist movements between road segments and touristattractions are constructed Finally a frequent subgraph mining algorithm is used to extract the flow patterns of car tourists +eexperimental results show that (1) car tourists have obvious preferences in the selection of trip time and tourist attractions (2) theintercity tourists tend to take multidestination trips rather than a single destination trip in the same type of attractions (3) cartourists are inclined to park their cars in an easy-to-access place even if the attractions visited are changed+emain contributionof this paper is to present a new method for discovering the flow patterns of car tourists hidden in massive amounts of licenseplate data

1 Introduction

Due to the flexibility and convenience of road trans-portation car-based tourism (travel in owned or rented carsalso named driving tours [1] car tourism [2] and self-driving tours [3] for simplicity this paper uses the term cartourism) has been one of the popular forms for leisure andrecreation Recently car tourism has been growing rapidly inChina and its scale is continuing to expand with the im-provement of road infrastructure and the growth of carownership A statistical report indicated that by 2015 therewere 234 billion car tourists in China accounting for morethan 585 of the total domestic tourists [4] It can beforeseen that the percentage of car tourists will increase overtime However car tourists need to share roads in urbanareas or tourist attractions with residents and they dependon the road network to achieve circulation between the

places of origin and multiple tourist attractions Currentlyurban roads are heavily crowded When a large number oftourist cars enter the road network during peak touristseason the pressure on road traffic management may beincreased It is worth noting that this phenomenon is severefor coastal tourist attractions

Coastal islands are one of the favorite tourist destina-tions +e development of road network on the islands oftenprecedes the development of tourist attractions and newinfrastructures and facilities are being built to handle theincrease in tourist traffic +us tourism activities tend to besuperimposed on a spatial system and infrastructure net-work that was not explicitly designed to cater to them andtourism activities can be unevenly distributed [5] Addi-tionally some islands are connected to the mainland +eroads entering and leaving these islands have become bot-tlenecks for tourism transportation which poses challenges

HindawiJournal of Advanced TransportationVolume 2020 Article ID 4795830 15 pageshttpsdoiorg10115520204795830

to the coordination of traffic on and off islands Moreover incontrast to commuting transportation tourism trans-portation has different characteristics in time and space andit requires more comfort and convenience +e problemsmentioned above show that if tourism transportation is nottaken seriously in traffic management it is likely to increasetravel difficulties for residents and it will affect the travelwillingness of tourists and the sustainable development oftourism transportation

+e recording and analysis of trajectories are essentialfor understanding the movement of tourists and themanagement of tourist traffic such as the optimal locationand development of transportation facilities and the re-distribution of tourists However the lack of practicalapproaches for the collection of relevant data limits thedetailed exploration of tourist mobility +e traditionalmethod involves paper-and-pencil or computer interviewswhich are expensive and time-consuming +e collecteddata are also typically limited in terms of personal infor-mation such as family composition age structure andfavorite tourist attractions [6] Recently with the devel-opment of sensors such as GPS tracker video recognitiondevice and RFID which can capture movement data inreal-time and with spatial and temporal details the tra-jectory-based data analysis methods have been widely usedin transportation research +e analysis results providereal-time and future traffic information for road trafficmanagers and travelers as well as technical support for therelief of traffic jams However current observations of roadtraffic are limited to statistical information such as trafficvolume occupancy and speed Movement patterns aredepicted in a flow graph or reported by visual descriptionsrather than exploring flow patterns Additionally roadtraffic has the characteristics of variability and correlationin time and space Previous researches have demonstratedthat sectional traffic flow is interrelated to the distances andlocations of monitoring points and the topology of roadnetwork +erefore it is necessary to consider the structureof road network and the correlation between time andspace in the analysis of tourist traffic +is consideration ismore useful in explaining the deeper behavior of touristtraffic

+is study is an attempt to investigate the flow patternsof car tourists by applying a frequent subgraph miningalgorithm +is algorithm can take into account thecorrelation of traffic flows captured by video sensorsFrom the graph and flow perspective a coastal island isused as an experimental area to explore the dynamicrelationship between multiple tourist attractions and keyroad segments+is paper is organized as follows+e nextsection reviews related work on movement pattern miningand the methods for analyzing trajectory-based data ontourists and traffic flows Section 3 introduces the studyarea (Dapeng Island Shenzhen China) Section 4 de-scribes the distribution of the monitoring points in detailSection 5 introduces the data and methods used in thispaper Section 6 presents the results of flow patternsgenerated by car tourists Finally we finish with a dis-cussion and conclusion

2 Related Work

In recent years movement patterns have been analyzedfrequently from transportation to tourism such as theresearch of movement patterns hidden in taxis [7ndash11]buses [12 13] railways [14] tourist movements [15ndash20]and even in geo-tagged media datasets [21 22] In terms oftourism transportation research the efficient manage-ment of tourist traffic requires a sound understanding ofcar touristsrsquo spatial movement patterns because thesepatterns provide critical information eg the flow vol-ume and spatial transfer direction for the planning of newtransportation facilities and the redistribution of touristflow As is well known movement is an intrinsic attributeof traffic flow that changes over time with respect to thespatial location of people goods and cars +e patternsimplied in moving datasets are not repeatedly producedby a single car tourist but rather by a huge number of carsthat appear in the same area In most cases the collectedmoving datasets of traffic entities are relatively large involume and complex in structure +erefore it is neces-sary to use data mining algorithms and visual analyticstechniques to extract useful and relevant informationregularities and structures from massive movementdatasets +e data mining algorithms used in trans-portation are varied +ese algorithms focus on clustering[8] density and sequential characteristics [9 10] in timeand space +e leisure activities of car tourists are carriedout in a road network +e activity sequences can bemodeled in a graph that consists of different nodes (forexample parking lots and cultural sites) and edges withdirection that are the order of locations visited For thiskind of dataset graph mining is a widely used method thatfinds interesting patterns in graph representation data[23] +e detected patterns are typically expressed asgraphs which may be subgraphs of graphical data or moreabstract expressions of the trends reflected in data [24]One form of graph mining is frequent subgraph miningwhich is used to identify frequently occurring patterns(subgraphs) across a collection of ldquosmallrdquo graphs or in aldquolargerdquo graph [25] Various subgraph mining algorithmshave been proposed +ese algorithms can be furtherclassified based on the search strategies ie eitherbreadth-first or depth-first searches +e depth-firstsearch strategy is more computationally efficient such asin gSpan (graph-based Substructure pattern mining) [26]MoFa (Molecule Fragment Miner) [27] FFSM (FastFrequent Subgraph Mining) [28] and Gaston (GrAphSequenceTree extractiON) [29] SPIN (Spanning treebased maximal graph mining) [30] However FFSM andGaston cannot be used for directed graphs without majorchanges Only MoFa is suitable for finding directed fre-quent subgraphs and for gSpan only minor changes arenecessary [31] Other related works include significantpattern mining Leap [32] maximal frequent subgraphmining Margin [33] and frequent subgraphs in multi-graphs [34]

During the past few years trajectory-based methodshave been used to analyze transportation systems

2 Journal of Advanced Transportation

[10 11 13 14] In many applications moving entities areconsidered moving points whose trajectories (ie pathsthrough space and time) can be visualized and analyzed Intransportation the collected trajectory data can be presentedin origin-destination (OD) data with aggregation methods[35] Such OD data can be visualized with a set of techniquesincluding flow maps [36 37] and OD maps [38] Never-theless the study of the spatial dimensions of tourism re-mains amostly underexplored area of research although thisresearch area is expanding due to the advances of new in-formation and communication technologies (ICT) Tradi-tional approaches in tourism can be divided into twocategories which are direct observation techniques (eginterviews trip diaries and recall diaries) and non-observation techniques (eg GPS tracking and videotracking) but the use of nonparticipant observation only isthe best technique for privacy reasons [5] Even with ICTsupport this technique has difficulties in data collection andthe large-scale sampling of passenger data is costly Mostpublished studies related to movement patterns are stilldescriptive and they employ small sample sizes that arehighly controlled Moreover this kind of research is focusedon the human movement in tourist intradestination +erehave been a few studies that have explored car-related spatialmovement patterns in tourist destinations Even so theresearch was aimed at large-scale car touristsrsquo activities [39]or used the questionnaire method which is prone to biasesand errors [40] +erefore given the requirements of pro-tecting privacy increasing data volume and avoiding in-vestigator biases it is necessary to conduct flow patternresearch based on continuous time-series data acquiredfrom sensors

3 Study Area

Dapeng Island located in the east of Shenzhen (as shown inFigure 1) is an essential node in the ldquoGuangdong-HongKong-Macao Greater Bay Areardquo and it is the only pioneerzone of national tourism reform and innovation in Guang-dong province Dapeng has abundant tourism resources suchas Dapeng Ancient City National Geological Park and FolkVillage +e ldquoShenzhen Tourism Statistics Bulletinrdquo showedthat a total of 139 million tourists visited this city in 2018+eincrease was 597 percent each year of which only one-tenthwas group tourists +is indicates that most tourism activitiesare carried out by individual visitors Because of topo-graphical constraints tourism transportation on the islandhas not been fully developed Owned and rented car tours arethe main modes of visiting the island for individual tourists+e Dapeng Transportation Bureau has analyzed the trend ofmotorized travel demand on the island and predicted that thetotal annual traffic flow would be 504000 carsyear At thepeak time of ldquoGolden Weekrdquo about 35000 carsday enteredinto the island of which 79 were car tourists

4 Distribution of Monitoring Points

Urban transportation systems usually employ GPS tech-nology to capture taxi and bus tracks Different from this

kind of public transportation research this paper aims toanalyze the flow patterns of car tourists at multiple attrac-tions It is difficult to install GPS device on each personal car+erefore we chose roadside monitoring devices to collecttraffic flow In addition urban road network includes ex-pressways ordinary roads and community roads It has alarge number of nodes and complex structure In order tomonitor each road segment many devices will need to bedeployed on roads So the key road segments and touristattractions were selected as the locations of monitoringpoints Five video devices were deployed at key roads andtwo video devices were deployed in parking lots +e labelsof monitoring devices are A1 A2 A5 A7 A10 P1 and P2(as shown in Figure 2)

+e detected tourist flow at each monitoring point isshown in Table 1

5 Methods

In this study a cloud-based database system was establishedto store traffic data after they were uploaded via 4G com-munication technologies +e collected data includes licenseplate numbers time of passage and labels for monitoringpoints In order to protect touristsrsquo private informationlicense plate numbers were changed into car IDs and onlythe registration places of cars were extracted After datacollection period the license plate numbers will be deletedfrom the database

+e proposed approach is outlined in Figure 3 +issection introduces the detailed steps for the mining offrequent flow patterns of car tourists at the microscale in atourist intradestination First car tourists were separatedfrom mixed traffic flow after analyzing the temporalcharacteristics of collected data Second spatial move-ment graphs of traffic flow were reconstructed for eachday Each movement graph is a connected and directedgraph where vertices are monitoring points and directededges are tourist flow between two monitoring points+en a frequent subgraph mining algorithm (gSpan) wasused to detect the flow patterns between road segmentsand tourist attractions Next in order to reduce thenumber of frequent flow patterns small overlappingsubgraphs were removed from the results Finally weanalyzed the spatial-temporal characteristics of flowpatterns intradestination

51 Preprocessing Source Data When collecting data it isinevitable that problem data will be collected +is can becaused by a problem with the device such as an aging ordamaged camera Additionally a license plate may beblurred blocked or damaged especially in bad weatherwhich can affect the efficiency of car recognition Fur-thermore it is challenging to identify some specialcharacters and confusing numbers on license plates +eabove issues can lead to data distortion To ensure theaccuracy and reliability of the analyzed results the col-lected data need to be processed at first +e rules ofprocessing are as follows

Journal of Advanced Transportation 3

22deg50prime0PrimeN

113deg50prime0PrimeE 114deg0prime0PrimeE 114deg10prime0PrimeE 114deg20prime0PrimeE 114deg30prime0PrimeE

113deg50prime0PrimeE 114deg0prime0PrimeE 114deg10prime0PrimeE 114deg20prime0PrimeE 114deg30prime0PrimeE

22deg40prime0PrimeN

22deg30prime0PrimeN

22deg50prime0PrimeN

22deg40prime0PrimeN

22deg30prime0PrimeN

Figure 1 +e location of dapeng island Shenzhen

Kilometers

0 05 1 2 3 4

N

LandscapeMonitoring pointRoad network

Figure 2 Distribution of monitoring points for tourist traffic flow

4 Journal of Advanced Transportation

(1) Deleted irrelevant fields +e primary informationsuch as car license plate number label of monitoringdevice and collection time is retained

(2) Removed null values and corrected license plateattributions

(3) Eliminated invalid data such as special car licenseplates and duplicate data

(4) Corrected confusing letters and numbers in caricense plates

52 Identifying Car Tourists In this study tourist flow wasdivided into three types One type consisted of commuterson the island the next consisted of intracity tourists (localweekend tourists in Shenzhen) and the last type consisted ofintercity tourists (leisure tourists from outside of Shenzhen)+e collected data came from the cameras on the roads andin parking lots +ese three types of flows were mixed to-gether in the collected data It is necessary to separate thedifferent types of traffic flows +e detailed steps are shownin Figure 4

+e data collected from the parking lots were classifiedinto intracity tourists and intercity tourists according to theregistration location of car license plates

As we know the number of trips made by tourists andlocal commuters is different Tourists only visit the islandoccasionally on weekends or holidays Local residents on theislandmight drive more times per week+erefore we took aweek as a unit and determined if a car had visited the islandduring that week If so this car would be tagged once +en

we counted the number of weeks a car appeared in eachmonth If the number of weeks visited exceeded a predefinedthreshold this car was considered to be a commuterOtherwise this car was considered to be a tourist

+erefore for the data collected from roads we first set athreshold manually based on the statistics of the number ofweeks visited in one month to distinguish island commutersand tourists Next we categorized visitors into intracity cartourists and intercity car tourists based on where theirlicense plates are registered Finally different types of trafficflows were separated and aggregated

In order to verify the usability and reliability of the proposedmethod the visiting characteristics of all cars were analyzed+eresult is shown in Figure 5 As can be seen the proportion ofweeks in one month in which the car appears is the highestreaching 8886+e percentage of cars appearing on the islandfor less than two weeks is 95 In addition we counted thepercentage of cars in the parking lots relative to the totalnumber of cars +e ratio is 786 which is close to the ratio of79 counted by Dapeng Transportation Bureau during peaktourist periods +e ratio of the number of weeks visited in onemonth (8886) is greater than the statistical result of Trans-portation Bureau (79) We think that the TransportationBureau only considered the tourists in parking lots +ereforethe threshold in this study is oneweek for extracting car tourists

53 Reconstructing the SpatialMovementGraphs In order tomodel the flow patterns of car tourists labeled direct graphsare used to construct movement relationships betweenmonitoring points In particular each vertex of the direct

Table 1 Car tourists detected at monitoring points

Label Location Detected car touristsA1 Kuinan Road Dapeng Community and Xinda CommunityA2 Dongchong Road Dongchong CommunityA5 Pengfei Road Dapeng community and Dapeng Ancient City Jiaochangwei Folk Village Dongshan TempleA7 Fumin Road Nanrsquoao CommunityA10 Nanxi Road Xichong CommunityP1 Jiaochangwei Park Lots Jiaochangwei Folk VillageP2 Ancient City Park Lots Dapeng Ancient City Dongshan Temple

Recognition devices forcars

Cloud databasesystem

Preprocess data

Extract the valid recordsExtract continuous data during

this period all monitoringdevice is normal operation

Analyze the time characteristicsof source data

Find the vehicles in parking lotduring the research period

Find the vehicles that visit theisland each week

Count the number of weeks thateach car appears in a month

Create graph dataset

Create the traffic spatialmovement graph from the

sequence data each day

Construct the sequence data ofmonitoring points that each

vehicle passes each day inchronological order

Separate car tourists andcommuting residents based onthe weekly threshold and the

historical vehicles in theparking lots

Frequent sub-graph mining

Remove small overlappingsub-graphs

Spatio-temporal charateristicsof vehicles

Analyze spatio-temporalcharacteristics and tourist flow

patterns

Spatial flow patterns of cartourists

Figure 3 +e detailed steps for mining frequent patterns in tourist flow

Journal of Advanced Transportation 5

graph corresponds to a monitoring point and each edgecorresponds to a directed connection between two moni-toring points passed by car tourists +e related definitionsare as follows

Definition 1 Label Graph Given a set of verticesV v1 v2 vk1113864 1113865 a set of edges connecting two vertex inV E eh (vi vj)|vi vj isin V1113966 1113967 a set of vertex labelsL(V) lb(vi)|forallvi isin V1113864 1113865 and a set of edge labelsL(E) lb(eh) |foralleh isin E1113864 1113865 eh is a direct edge that has the startvertex vi and end vertex vj then a label graph G is repre-sented as

G (V E L(V) L(E)) (1)

In the graph dataset the label of an edge is represented by alabel pair of twomonitoring points in the tourist visiting orderA graph consists of edges connecting the monitoring pointsvisited by each tourist in one day +e advantage of using thevertex label pair as an edge label is that it maintains thetemporal and spatial order of the two monitoring points thattourists pass through When mining a labeled graph thespatial-temporal order of monitoring points in results could be

preserved Using this representation the problem of findingfrequent flow patterns of car tourists becomes a problem ofmining frequent subgraphs in all movement graphs

54 Mining Frequent Subgraphs Some definitions related tofrequent subgraph mining are given below

Definition 2 Subgraph A subgraph g2 (V2 E2) of graphg1 is a graph in which V2 subeV1 E2 E1 cap (V2 times V1)

Definition 3 Support of a subgraph g Given a labeled graphdataset GD g1 g2 gn1113864 1113865 the support or frequency of asubgraph g is the percentage (or number) of graphs in GD

Definition 4 Frequent subgraph A frequent subgraph is agraph whose support is not less than a minimum supportthreshold +e minimum support threshold represents theminimum number of occurrences of a subgraph To obtainthe frequent patterns we chose to manually set the value ofminimum threshold

Definition 5 Mini Code First a depth-first search is per-formed on the graph to form a DFS (depth-first search) treeand then this tree is scanned +e order of the scanned edgesconstitutes a sequence called the DFS Code +e DFS Codesare sorted in a lexicographic order to find the smallest DFScode that uniquely identifies the graph +is minimum DFSCode is called the mini Code

After constructing the movement graphs of tourist flowthe subgraphminingmethodwas used to explore the frequentpatterns As demonstrated in related works there are manykinds of subgraph mining algorithms +e AGM (Apriori-based GraphMining) [41] can discover all frequent subgraphs(both connected and disconnected) in a graph database thatsatisfy a specific minimum support constraint+is algorithmuses an approach similar to Apriori and it requires 40minutes to 8 days to find the result subgraphs in a datasetcontaining 300 chemical compounds +e algorithm FSG(finding frequently occurring subgraphs in large graph) [42]adopts an adjacent representation of a graph and an edge-

100

80

60

40

20

0

Ratio

()

1 2 3 4 5Number of weeks a car appeared in one month

8886

Figure 5 Number of weeks visited by tourists in one month

e data collected from parking lots

e data collected from roads

Tourists

Inter-city car touristsoutside Shenzhen

Inter-city car tourists inShenzhen

Registration place

Number of weeksvisited

Commuterson the island

Figure 4 +e workflow of traffic flow separation

6 Journal of Advanced Transportation

growing strategy to find all of the connected subgraphs thatfrequently appear in a graph database+e results have shownthat FSG can be finished in 600 seconds gSpan [26] isdesigned to reduce or avoid the candidate generation and

pruning false positives used in AGM and FSG gSpan cancomplete the same task in 10 seconds Considering the ef-ficiency in this study gSpan was used to mine frequentsubgraphs in a directed graph dataset and then find the

Num

ber o

f car

s at e

ach

poin

t1000

02500

01000

02500

0

01000

5000

02500

00 50 100 150 200 250 300

Days

A1

A5

A7

A10

P1

P2

A2

(a)

Num

ber o

f car

s at e

ach

poin

t

A1

A5

A7

A10

P1

P2

A2

0 10 20 30 40 50 60 70Days

10000

10000

500

02500

01000

5003000200010002500

0

(b)

Figure 6 Daily traffic flow at each monitoring point

C1

C216000

14000

12000

10000

8000

6000

4000

2000

0

Traffi

c vol

ume

A5P1 A5P2 P2P1 A10A2 A1P1A1A10 A1P2 P1A10 A5A1 A2A1 A2P1A10A7A7P1 A10P2 A7A2 A1A7 A2P2 P2A7 A5A10A7A5 A5A2

Links

Kilometers

0 05 1 2 3 4

Figure 7 Flow map of tourism traffic

Table 2 Types of car tourist visiting the island

Types Number of records Number of graphsAll car tourists 361693 76Intracity car tourists in Shenzhen 240452 76Intercity car tourists outside Shenzhen 121241 75

Journal of Advanced Transportation 7

frequent subgraphs withmaximum length in the results as theflow patterns of car tourists +e algorithmic details of gSpanare available in reference [26] An extended instruction isgiven below +e method consisted of two steps (1) Findingthe frequent subgraphs using gSpan First the frequencies ofedges and nodes of all graphs was calculated Second thefrequencies were compared with the minimum supportthreshold and the infrequent edges and nodes were removed+en the remaining nodes and edges were reorderedaccording to the frequency And the frequency of each edgewas calculated again Finally the subgraphs of the restoredgraph were mined according to mini Code and it was de-termined whether the current DFS encoding is the minimumcode or not If so current edges were added to the results and

further attempts were made to add possible edges If not themining process was finished (2) Finding frequent subgraphswithmaximum length+ere are a large number of subgraphsin the obtained results and some subgraphs are partial graphsof the others +erefore this kind of subgraph was deleted bycomparing the labels of nodes and edges and the final resultswere the maximum frequent subgraphs

6 Results

In this section we first analyzed the spatial and temporaldistribution of tourist flows by statistical methods and maps+en we divided the tourist flows into intracity car touristsand intercity car tourists and used a frequent subgraphmining algorithm for pattern recognition Finally wesummarized the movement patterns of all tourists

61 Spatial-Temporal Characteristics of the TouristTraffic Flow

611 Temporal Characteristics Figure 6(a) depicts 295days of data before any preprocessing was applied Due to

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 74

C1

(a)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 74

C1

(b)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C1

C2

S = 73

(c)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C1

C2

S = 73

(d)

Figure 8 Flow patterns inferred from intracity tourists

Table 3 +e list of flow patterns inferred from intracity tourists

Figures PatternsFigure 8(a) (A1⟶ P2 andA1hArrP1)

Figure 8(b) A1⟶ P2⟶ P1⟶ A1Figure 8(c) A1⟶ A2 andA1hArrP1hArrP2Figure 8(d) A1⟶ A10 andA1hArrP1hArrP2

8 Journal of Advanced Transportation

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 62

C1

(a)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 62

C1

(b)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 61

C1

(c)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 57

C2

C1

(d)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 56

C1

C2

(e)

Figure 9 Flow patterns inferred from intercity tourists

Journal of Advanced Transportation 9

the failure of device communication or power the con-structed movements graphs may be incomplete whichwould lead to the loss of frequent subgraphs +ereforewe selected 76 days of valid data as the dataset for frequentpattern mining Figure 6(b) shows that (1) the touristflows at A2 and A10 near the sandy beaches has similartemporal characteristics +eir peak hours of touristvolume occur both on holidays and weekends while onweekdays the flow curves are relatively stable (2) P1 andP2 are located in two parking lots Although there aredifferences in the tourist volumes the trends are similar(3) For the two entrances to the tourist attractions theaverage daily volume of tourist traffic at A1 is 07 timesthat of A5 After sorting the volume of tourist trafficPengcheng Community has the most car tourists followedby two tourist attractions with a sandy beach (ieDongchong and Xichong Community) and the leastvisited attraction is Nanao Community

612 Spatial Characteristics Figure 7 shows the flowmap ofthe aggregated tourists transferring in multiple monitoringpoint pairs +e tourist volumes are depicted and sorted inthe left bar chart in the figure and the link thickness rep-resents the traffic volume As can be seen the links with thelargest traffic volume are A5P1 A5P2 P2P1 A10A2A1P1 and A1P2 (A5P1 is a simplified form for A5harrP1which represents the forth and back tourist flow between A5and P1) +ese links could be divided into two areasC1(A1 A5 P1 P2) and C2(A2 A10) Further inspectionreveals that the number of tourists transferring between P1and P2 in area C1 is close to the value between A2 and A10in area C2 but the number of tourists visiting P1 and P2 is312 times that of A2 and A10

62 Analyses of the Detected Flow Patterns +e flow map isintuitive but it suffers from serious visual clutter and it isdifficult to read because of overlapping flows It can also beseen in Figure 7 that the flow map only shows the trafficvolumes between the monitoring points in the touristdestination But it could not express the spatial transferdirections of car tourists +erefore these facts motivate usto find a new approach to solve these problems +is sectiondescribes the use of frequent subgraph mining algorithm toexplore the spatial flow patterns with directions in touristtraffic and to obtain the maximum frequent patterns fromthe daily tourist movement graphs according to a predefinedminimum support threshold By using the identificationmethod of car tourists introduced in Section 5 two types of

data were extracted and used for the subsequent flow patternmining (listed in Table 2)

621 Flow Patterns of Intracity Tourists in Shenzhen+e experiments were conducted with the dataset of intracitycar tourists in Shenzhen (as shown in Table 2) +e mini-mum support threshold for gSpan was set to 73 +e spatialflow patterns are represented in Figure 8 and listed in Ta-ble 3+e inferred patterns could be divided into two groupsOne group consists of the patterns shown in Figures 8(a) and8(b) which show the tourist spatial transfer process in areaC1 +e other group contains the patterns shownFigures 8(c) and 8(d) which represent the tourists trans-ferring between area C1 and area C2 +e two groupsdemonstrate that the intracity car tourists who arrived at P1and P2 preferred to choose Kuinan Road at A1 instead ofPengfei Road at A5 +e difference between Figures 8(a) and8(b) is the existence of the circle tourist flow +e differencebetween Figures 8(c) and 8(d) is the presence of tourists flowback and forth+e tourist flow from A1 to A2 have only onedirection One of the reasons may be that this study failed tofind a suitable monitoring point on Pingxi Road resulting inthe loss of directionality for this part of tourist flow

622 Flow Patterns of Intercity Car Tourists Figure 9 showsflow patterns of intercity car tourists+e dataset used in thissection is composed of intercity tourists (as shown in Ta-ble 2) +e minimum support threshold for gSpan was set to56 +e result flow patterns are sorted by the supportthreshold as listed in Table 4

From these patterns the following could be con-cluded (1) +e most frequent patterns are shown inFigures 9(a)ndash9(c) +e frequencies of the discovered flowpatterns are 62 62 and 61 respectively +ese threepatterns describe the preference of intercity tourists forarea C1+is kind of tourists first visited one of attractionsnear a parking lot P1 or P2 and then visited anotherattraction or just visited one tourist attraction near P1 orP2 (2)+e above three patterns are different from those ofthe intracity tourists in Shenzhen +ere is no circle tourbetween P1 P2 and A1 +e reason for this is that sometourists chose to continue drive to area C2 (3) +eminimum support threshold of these two patterns asshown in Figures 9(d) and 9(e) is significantly smallerthan that shown in Figures 9(a)ndash9(e) but it reflects thespatial transfer preferences of tourists from neighboringcities in areas C1 and C2 In Figure 9(d) the car touriststended to drive directly from A5 to A10 after visiting

Table 4 +e list of the flow patterns inferred from intercity tourists

Figures PatternsFigure 9(a) A1⟶ P2⟶ P1Figure 9(b) A1hArrP1hArrP2Figure 9(c) A1⟶ P1 andA1⟶ P2Figure 9(d) A5⟶ P1⟶ A10Figure 9(e) A10⟶ P1hArrP2 andA10hArrA2

10 Journal of Advanced Transportation

attractions near P1 As shown in Figure 9(e) the cartourists that flowed between A10 and A2 went back to P1and changed car parking lots between P1 and P2

However there is no tourist flow to A1 and A5 +e reasonfor this may be that the traffic flow of Pingxi Road (theexpressway in and out of the peninsula) was not

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(a)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(b)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(c)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(d)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(e)

Figure 10 Flow patterns of all tourists

Journal of Advanced Transportation 11

monitored +is part of traffic flow could directly arriveand then leave from A10

623 Flow Patterns of All Tourists +is section presents anexploration of the spatial flow patterns of all car tourists +eresults are shown in Figure 10 +e minimum supportthreshold for gSpan was set to 73 which means that 73 of the76 graphs contained the discovered flow pattern +e resultpatterns are listed in Table 5

All of the patterns shown in Figure 10 have touristflow between area C1 and area C2 +is is consistent withthe trend in the flow map (shown in Figure 7) As thepicture shows A1 is the main entrance for tourists toenter and exit Dapeng Island Some of the car touristsdrove to P1 or P2 and the rest flowed to A10 or A2Figures 10(a)ndash10(c) show the directions of tourist flowsbetween the Dapeng Community Pengcheng Commu-nity and Xichong Community Figure 10(d) shows thedirections of tourist flows between Dapeng CommunityPengcheng Community and Dongchong CommunityFigure 10(e) represents only the directions of tourist flowsbetween Dapeng Community and Dongchong Commu-nity Furthermore the directions of tourist flows in areaC1 and area C2 or between area C1 and area C2 aredifferent Taking Figures 10(a)ndash10(c) as examples al-though the orders of P1 and P2 that are accessed from A1have a lack of regularity they have their own charac-teristics when considering the accompanying paths toA10 +ese examples indicate that if there is a tourist flowbetween A1 and A10 it could be divided into two cases Inone case some of the tourists returned directly and in theother case some of the tourists flowed to P1 In the firstcase either the tourist flow passing by A1 first accessed P2and then visited P1 or both P1 and P2 had tourists at thesame time In the second case some of the touristsreturned from A10 visited P1 and P2 and then left theisland

7 Conclusions and Discussions

71 Conclusions In order to facilitate the management oftourist traffic flow the car tourists in Dapeng Island weretaken as a research case +e experimental analysis used realdata captured by video devices in the research area Due to thelack of suitable device installation locations the capturedpicture from the video device had a certain distance from theroad+e catch rate of the touristsrsquo cars was low However thedetailed time-series data in one day could be collectedCompared with the manual survey the collected data was

improved in terms of reliability and richness A day waschosen as the time unit for frequent pattern mining Afterselecting the available data we divided the data of cars intointracity and intercity tourists Next the license plate datawere transformed into movement graphs according to thevisited location sequence between multiple monitoringpoints +e intricate flow patterns of the car tourists werediscovered by gSpan algorithm which had the best perfor-mance in terms of the quality of the results and the executiontime and which had already proven to be efficient for frequentsubgraph mining +e conclusions are as follows

(1) +e car tourists had obvious preferences in the se-lection of trip time and tourist attractions (shown inFigures 6 and 7) In terms of time the curves oftourist flow at each monitoring point were similar+ere were a large number of car tourists at variousattractions on holidays but the volume of car touristswas relatively lower on weekdays +e same types ofattractions had the similar trends in tourist flowsuch as the two attractions with a sandy beach(Dongchong Community and Xichong Community)and the ancient city and cultural attractions (DapengAncient City and Dongshan Temple) In terms ofspace the attractions that are close to the entrance ofa scenic area and rich in tourist resources were morepopular with tourists However due to the terrainbarrier in the scenic area the traffic conditions af-fected the movements of tourists between multipleattractions

(2) Different types of car tourists had similar spatialchoices in scenic area (shown in Figures 8ndash10) Forexample different types of tourists had flow patternsthat described the movements in one area (as shownin Figures 8(a) 8(b) and 9(a)ndash9(c)) and the move-ments between different areas (as shown inFigures 8(c) 8(d) 9(d) and 9(e)) +e intercitytourists and intracity tourists had different choices inscenic spots +e intercity tourists would takemultidestination trips instead of single destinationtrips in the same type of attractions As can be seen inFigures 8 and 9 there are the two sandy beach at-tractions in areaC2 the intracity tourists tend to visitone of them while the intercity tourists would visitboth attractions Specifically to save time andmoney intercity tourists would visit multiple at-tractions in one trip instead of making multiple tripsAdditionally another difference was that there wasno circlet in area C1 for intercity tourists and notraffic flow between A2 and A10 for intracity tourists

Table 5 +e list of flow patterns inferred from all tourists

Figures PatternsFigure 10(a) (A1⟶ A10⟶ A1) and (A10⟶ P1hArrP2) and (P1⟶ A1)

Figure 10(b) (A1hArrA10 andA1⟶ P1 andA1⟶ P2)

Figure 10(c) (A1hArrP10 andA1⟶ P2⟶ P1⟶ A1)

Figure 10(d) (A1hArrP1hArrP2 andA1hArrA10hArrA2)

Figure 10(e) (A1hArrP1hArrP2 andA1⟶ A2)

12 Journal of Advanced Transportation

(3) Although the patterns depicted on the map lookcomplicated and messy after the patterns are con-verted to rules they become clear In pattern mapsonly Figures 8(b) 9(a) and 9(d) show a clear uni-directionality +e rest is complex and is difficult tocompare As we can see from Figure 8(a) theintracity tourists passing point A1 could be dividedinto two groups one group flowed to parking lot P1and then leaved the scenic area from P1 +e othergroup flowed to parking lot P2 Two groups havesimultaneity in tourist routes So we could convertthe flow patterns to rules and use ldquoandrdquo to illustratethe simultaneity of different routes in the samepattern (as shown in Tables 3ndash5) By this way allpatterns can be applied to traffic control system forregional tourism

(4) Large primary attractions are more attractive thansmaller secondary attractions Looking from thearrow on the pattern maps tourists always visit areaC1 first and then select area C2 +e main reason isthat area C1 has abundant tourism resources anddiverse tourism activities For example there arecultural attractions (eg Dapeng Ancient City andDongshan Temple) sandy beach entertainmentsand large parking lots for tourists in area C1 +esefactors are also often considered in the evaluation ofthe importance of attractions in tourism network

(5) +e tourists tended to park their cars in an easy-to-access place even if the visited attractions arechanged as shown in Figures 8(a)ndash8(d) 9(b) and9(e) Again here we take area C1 as an example +edistance between the Dapeng Ancient City andDongshan Temple is about 1 km On the patternmaps we can see that there are two-way arrowspointing to the two parking lots P1 and P2 +isindicates that the car tourists tended to park theircars to the nearest parking lot so that they could pickthem up when the tour destination is changed

72 Discussions Tourist flow is the key to traffic man-agement in tourism destinations and it affects the de-velopment of tourism on an island and the experience oftourists +e recent development of transport technolo-gies has shown that traffic flow data will be increasinglycollected and it will be available for data analysis+erefore advanced data analytics should be used tointerpret and depict the complex movements of cartourists +e proposed approach in this study wasintended to find (i) the statistical summaries of the spatial-temporal characteristics of car tourists in the researcharea helping to discover patterns from the mass carlicense plate data (ii) the flow patterns of intercity andintracity tourists helping to illustrate the different pref-erences of the two types of car tourists and (iii) the mostfrequent patterns of all tourists helping to identify the lawof tourist movement and make efficient policy for themanagement of tourist traffic flow +e presented ap-proach enriched the analytical methodology of tourist

traffic flow and suggested a shift from the conventionaland complicated paper or computer interview-basedmethod to a dynamic flow graph-based method Fur-thermore we have shown how transportation data pro-vides hard-to-obtain insights and quantitative results fortourists

In order to illustrate the law of spatial movement oftourists related studies have proposed a variety of macroflow patterns [43ndash45] For example in 2008 McKercher [46]proposed 11 prominent route styles in urban destinations+e macropatterns retain only the main components andsimplify the details and are often used in tourism man-agement to guide destination development Compared withmodeling the flow patterns of tourists at the macrolevelmodeling tourist flow at the microscale is more complicatedLew and McKercher [17] noted that it is a challenge tobalance model effectiveness and usability +e reason is thatsimple patterns may not provide enough details for use andcomplex patterns may be difficult to interpret and apply Inthis study we used the video devices installed at key nodes ofroad network to collect tourist traffic flows and used thefrequent subgraph mining algorithm to discover flow pat-terns at the microscale +e extracted patterns can beconverted into rules and applied to the traffic control systemfor the management of regional tourists +e difficulty offinding and applying patterns at microscale could beovercome by this way

During the peak period of tourism the number of cartourists in the scenic area increases sharply It is easy toresult in road congestion and uneven distribution oftourists between attractions With the use of traffic flowdata the daily monthly and seasonal characteristics oftourists can be analyzed the future tourist flow can bepredicted and the flow patterns can be obtained by datamining methods +us the traffic management departmentcan effectively control the tourist traffic flow and thetourism department can develop attractive tourism prod-ucts to achieve a spatial balance of the distribution oftourists reduce traffic congestion and harmful gas emis-sions and weaken the impact of tourism environment andhuman body

Inside a scenic area attractions form a complex net-work of tourist flow due to the frequent spatial interactionEach attraction is both the origin and destination of cartourists Due to the differences in attractiveness degree ofdevelopment and convenience of transportation touristshave shown special preferences when choosing attractionsand trip routes Accordingly effective identification ofthese preferences will be beneficial to the development oftourist market and will also helpful for tour planners tounderstand how tourists see the spatial connection ofmultiple attractions However traditional manual surveysare time-consuming and laborious and the amount of dataobtained is small It is difficult to reveal tourist preferencesby this way Frequent subgraph mining algorithms providea desirable method for identifying the spatial preferences oftourists +is kind of method could perform well under thesupport of a large amount of movement data betweenattractions

Journal of Advanced Transportation 13

+is study has several limitations Firstly due to lack ofpower supply facilities it was unable to collect the traffic datafor each day When applying the model to the actual controlof tourist traffic flow it is necessary to further co-operatewith the traffic management department to obtain com-prehensive traffic flow data Secondly this study focused onthe mining of flow patterns the influencing factors behindthe identified patterns were not further analyzed As men-tioned by Lew andMcKercher [17] factors related to touristsand destinations can affect how tourists move or travel in adestination +ese factors include family composition in-come and valid information obtained before travelling It isdifficult to obtain these factors by relying only on the trafficflow data collected in the traffic monitoring system+erefore in future work field investigation will benecessary

Data Availability

As the data also form part of an ongoing study the raw dataneeded to reproduce these findings cannot be shared at thistime

Conflicts of Interest

+e authors declare no conflicts of interest

Acknowledgments

+e authors gratefully acknowledge the support of this re-search from the National Science Foundation of China(41701167) and the Basic Research Project of Shenzhen City(JCYJ20170307164104491 and JCYJ20190812171419161)

References

[1] D W Eby and L J Molnar ldquoImportance of scenic byways inroute choice a survey of driving tourists in the United StatesrdquoTransportation Research Part A (Policy and Practice) vol 36no 2 pp 95ndash106 2003

[2] B Wang Z Yang F Han and H Shi ldquoCar tourism inxinjiang the mediation effect of perceived value and touristsatisfaction on the relationship between destination imageand loyaltyrdquo Sustainability vol 9 no 1 pp 1ndash16 2016

[3] X M Zhang and Q Y Pen ldquoA study on tourism informationand self-driving tour trafficrdquo in Proceedings of the First Inter-national Conference on Transportation Engineering SouthwestJiaotong University Chengdu China July 2007

[4] China National Tourism Administration 7e Yearbook ofChina Tourism Statistics 2016 China Travel amp Tourism PressBeijing China 2017

[5] M A Mohamad Toha and H N Ismail ldquoA heritage tourismand tourist flow pattern a perspective on traditional versusmodern technologies in tracking the touristsrdquo InternationalJournal of Built Environment and Sustainability vol 2 no 2pp 85ndash92 2015

[6] K Siła-Nowicka and A S Fotheringham ldquoCalibrating spatialinteraction models from gps tracking data an example ofretail behaviourrdquo Computers Environment and Urban Sys-tems vol 74 pp 136ndash150 2019

[7] F Mao M Ji and T Liu ldquoMining spatiotemporal patterns ofurban dwellers from taxi trajectory datardquo Frontiers of EarthScience vol 10 no 2 pp 205ndash221 2016

[8] L J Zheng D Xia X Zhao and W L Liu ldquoMining tripattractive areas using large-scale taxi trajectory datardquo inProceedings of the 2017 IEEE International Symposium onParallel and Distributed Processing with Applications and 2017IEEE International Conference on Ubiquitous Computing andCommunications pp 1217ndash1222 ISPAIUCC GuangzhouChina December 2017

[9] I Rami and M O Shafiq ldquoDetecting taxi movements usingrandom swap clustering and sequential pattern miningrdquoJournal of Big Data vol 6 no 39 pp 1ndash26 2019

[10] G Kautsar and S Akbar ldquoTrajectory pattern mining using se-quential pattern mining and k-means for predicting future lo-cationrdquo Journal of Physics Conference Series vol 801 no 1 2017

[11] D Guo X Zhu H Jin P Gao and C Andris ldquoDiscoveringspatial patterns in origin-destination mobility datardquo Trans-actions in GIS vol 16 no 3 pp 411ndash429 2012

[12] J D Mazimpaka and S Timpf ldquoA visual and computationalanalysis approach for exploring significant locations and timeperiods along a bus routerdquo in Proceedings of the 9th ACMSIGSPATIAL International Workshop on ComputationalTransportation Science pp 43ndash48 San Francisco CA USAOctober 2016

[13] H Faroqi M Mesbah and J Kim ldquoSpatial-temporal simi-larity correlation between public transit passengers usingsmart card datardquo Journal of Advanced Transportationvol 2017 Article ID 1318945 14 pages 2017

[14] Z Wang Y Hu P Zhu Y Qin and L Jia ldquoRing aggregationpattern of metro passenger trips a study using smart carddatardquo Physica A Statistical Mechanics and Its Applicationsvol 491 pp 471ndash479 2018

[15] X Zhao X Lu Y Liu J Lin and J An ldquoTourist movementpatterns understanding from the perspective of travel partysize using mobile tracking data a case study of Xirsquoan ChinardquoTourism Management vol 69 pp 368ndash383 2018

[16] M Versichele L De Groote M Claeys Bouuaert T NeutensI Moerman and N Van De Weghe ldquoPattern mining intourist attraction visits through association rule learning onbluetooth tracking data a case study of ghent BelgiumrdquoTourism Management vol 44 pp 67ndash81 2014

[17] A Lew and B Mckercher ldquoModeling tourist movementsrdquoAnnals of Tourism Research vol 33 no 2 pp 403ndash423 2006

[18] A Birenboim S Anton-Clave A P Russo and N ShovalldquoTemporal activity patterns of theme park visitorsrdquo TourismGeographies vol 15 no 4 pp 601ndash619 2013

[19] H J Yun and M H Park ldquoTime-space movement of festivalvisitors in rural areas using a smart phone applicationrdquo AsiaPacific Journal of Tourism Research vol 20 no 11pp 1246ndash1265 2015

[20] S Kang ldquoAssociations between space-time constraints andspatial patterns of travelsrdquo Annals of Tourism Researchvol 61 pp 127ndash141 2016

[21] C Comito D Falcone and D Talia ldquoMining human mobilitypatterns from social geo-tagged datardquo Pervasive and MobileComputing vol 33 pp 91ndash107 2016

[22] Y T Zheng Y Li Z J Zha and T S Chua ldquoMining travelpatterns fromGPS-tagged photosrdquo in Lecture Notes in ComputerScience K T LeeWH Tsai H YM Liao T Chen JWHsiehand C C Tseng Eds Springer Berlin Germany 2011

[23] K M Borgwardt H P Kriegel and P WackersreutherldquoPattern mining in frequent dynamic subgraphsrdquo in Pro-ceedings of the 6th IEEE International Conference on Data

14 Journal of Advanced Transportation

Mining (ICDM 2006) Hong Kong China IEEE December2006

[24] D J Cook and L B HolderMining Graph Data Wiley PressHoboken NJ USA 2015

[25] C Jiang F Coenen andM Zito ldquoFrequent sub-graph miningon edge weighted graphsrdquo in Lecture Notes in ComputerScience T Bach Pedersen M K Mohania and A M TjoaEds Springer Berlin Germany pp 77ndash88 2010

[26] X Yan and J Han ldquogSpan graph-based substructure patternminingrdquo in Proceedings of the 2002 IEEE InternationalConference on Data Mining ICDMrsquo02 pp 721ndash724 IEEEComputer Society Maebashi City Japan December 2002

[27] C Borgelt and M R Berthold ldquoMining molecular fragmentsfinding relevant substructures of moleculesrdquo in Proceedings ofthe 2002 IEEE International Conference on Data MiningICDMrsquo02 pp 51ndash58 IEEE Computer Society Maebashi CityJapan December 2002

[28] J Huan W Wang and J Prins ldquoEfficient mining of frequentsubgraphs in the presence of isomorphismrdquo in Proceedings ofthe 7ird IEEE International Conference on Data MiningICDMrsquo03 pp 549ndash552 IEEE Computer Society MelbourneFL USA November 2003

[29] S Nijssen ldquoA quickstart in frequent structure mining canmake a differencerdquo in Proceedings of the Tenth ACM SIGKDDInternational Conference on Knowledge Discovery and DataMining KDDrsquo04 pp 647ndash652 ACM Seattle WA USAAugust 2004

[30] J Huan W Wang J Prins and J Yang ldquoSPIN miningmaximal frequent subgraphs from graph databasesrdquo in Pro-ceedings of the Tenth ACM SIGKDD International Conferenceon Knowledge Discovery and Data Mining Seattle WA USAACM August 2004

[31] M Worlein T Meinl I Fischer and M ampPhilippsen ldquoAquantitative comparison of the subgraph miners MoFagSpan FFSM and gastonrdquo in Proceedings of the 9th EuropeanConference on Principles and Practice of Knowledge Discoveryin Databases PKDDrsquo05 Springer Porto Portugal pp 392ndash403 2005

[32] X Yan H Cheng J Han and P ampYu ldquoMining significantgraph patterns by leap searchrdquo in Proceedings of the 2008ACM SIGMOD pp 433ndash444 Vancouver Canada June 2008

[33] L T +omas S R Valluri and K Karlapalem ldquoMarginmaximal frequent subgraph miningrdquo in Proceedings of theSixth International Conference on Data Mining (ICDMrsquo06)Hong Kong China December 2006

[34] V Ingalalli D Ienco and P Poncelet ldquoMining frequentsubgraphs in multigraphsrdquo Information Sciences vol 451-452pp 50ndash66 2018

[35] L Min Z C Wang J Liang and X R Yuan ldquoOD-wheelvisual design to explore OD patterns of a central regionrdquo inProceedings of the Visualization Symposium 2015 HangzhouChina April 2015

[36] B Jenny D M Stephen I Muehlenhaus and B E MarstonldquoDesign principles for origin-destination flowmapsrdquo Car-tography and Geographic Information Science vol 45 no 1pp 1ndash15 2016

[37] S Kim S Jeong I Woo Y Jang R Maciejewski andD S Ebert ldquoData flow analysis and visualization for spa-tiotemporal statistical data without trajectory informationrdquoIEEE Transactions on Visualization and Computer Graphicsvol 24 no 3 pp 1287ndash1300 2018

[38] J Wood J Dykes and A Slingsby ldquoVisualisation of originsdestinations and flows with OD mapsrdquo 7e CartographicJournal vol 47 no 2 pp 117ndash129 2010

[39] M Zillinger ldquoTourist routes a time-geographical approach onGerman car-tourists in Swedenrdquo Tourism Geographies vol 9no 1 pp 64ndash83 2007

[40] S Page and J Connell ldquoExploring the spatial patterns of car-based tourist travel in loch lomond and trossachs nationalpark Scotlandrdquo Tourism Management vol 29 no 3pp 561ndash580 2008

[41] A Inokuchi T Washio and H ampMotoda ldquoAn apriori-basedalgorithm for mining frequent substructures from graphdatardquo in Proceedings of the 4th European Conference onPrinciples of Data Mining and Knowledge DiscoveryPKDDrsquo00 pp 13ndash23 Lyon France September 2000

[42] M Kuramochi and G Karypis ldquoFrequent subgraph discov-eryrdquo in Proceedings of the 2001 IEEE International Conferenceon Data Mining ICDMrsquo01 pp 313ndash320 IEEE ComputerSociety San Jose CA USA August 2001

[43] M Oppermann ldquoA model of travel itinerariesrdquo Journal ofTravel Research vol 33 no 4 pp 57ndash61 1995

[44] N Shoval and A Raveh ldquoCategorization of tourist attractionsand the modeling of tourist cities based on the co-plotmethod of multivariate analysisrdquo Tourism Managementvol 25 no 6 pp 741ndash750 2004

[45] G Lau and B McKercher ldquoUnderstanding tourist movementpatterns in a destination a gis approachrdquo Tourism andHospitality Research vol 9 no 1 pp 39ndash49 2007

[46] B McKercher and G Lau ldquoMovement patterns of touristswithin a destinationrdquo Tourism Geographies vol 10 no 3pp 355ndash374 2008

Journal of Advanced Transportation 15

Page 2: DiscoveringtheGraph-BasedFlowPatternsofCarTouristsUsing ...downloads.hindawi.com/journals/jat/2020/4795830.pdfResearchArticle DiscoveringtheGraph-BasedFlowPatternsofCarTouristsUsing

to the coordination of traffic on and off islands Moreover incontrast to commuting transportation tourism trans-portation has different characteristics in time and space andit requires more comfort and convenience +e problemsmentioned above show that if tourism transportation is nottaken seriously in traffic management it is likely to increasetravel difficulties for residents and it will affect the travelwillingness of tourists and the sustainable development oftourism transportation

+e recording and analysis of trajectories are essentialfor understanding the movement of tourists and themanagement of tourist traffic such as the optimal locationand development of transportation facilities and the re-distribution of tourists However the lack of practicalapproaches for the collection of relevant data limits thedetailed exploration of tourist mobility +e traditionalmethod involves paper-and-pencil or computer interviewswhich are expensive and time-consuming +e collecteddata are also typically limited in terms of personal infor-mation such as family composition age structure andfavorite tourist attractions [6] Recently with the devel-opment of sensors such as GPS tracker video recognitiondevice and RFID which can capture movement data inreal-time and with spatial and temporal details the tra-jectory-based data analysis methods have been widely usedin transportation research +e analysis results providereal-time and future traffic information for road trafficmanagers and travelers as well as technical support for therelief of traffic jams However current observations of roadtraffic are limited to statistical information such as trafficvolume occupancy and speed Movement patterns aredepicted in a flow graph or reported by visual descriptionsrather than exploring flow patterns Additionally roadtraffic has the characteristics of variability and correlationin time and space Previous researches have demonstratedthat sectional traffic flow is interrelated to the distances andlocations of monitoring points and the topology of roadnetwork +erefore it is necessary to consider the structureof road network and the correlation between time andspace in the analysis of tourist traffic +is consideration ismore useful in explaining the deeper behavior of touristtraffic

+is study is an attempt to investigate the flow patternsof car tourists by applying a frequent subgraph miningalgorithm +is algorithm can take into account thecorrelation of traffic flows captured by video sensorsFrom the graph and flow perspective a coastal island isused as an experimental area to explore the dynamicrelationship between multiple tourist attractions and keyroad segments+is paper is organized as follows+e nextsection reviews related work on movement pattern miningand the methods for analyzing trajectory-based data ontourists and traffic flows Section 3 introduces the studyarea (Dapeng Island Shenzhen China) Section 4 de-scribes the distribution of the monitoring points in detailSection 5 introduces the data and methods used in thispaper Section 6 presents the results of flow patternsgenerated by car tourists Finally we finish with a dis-cussion and conclusion

2 Related Work

In recent years movement patterns have been analyzedfrequently from transportation to tourism such as theresearch of movement patterns hidden in taxis [7ndash11]buses [12 13] railways [14] tourist movements [15ndash20]and even in geo-tagged media datasets [21 22] In terms oftourism transportation research the efficient manage-ment of tourist traffic requires a sound understanding ofcar touristsrsquo spatial movement patterns because thesepatterns provide critical information eg the flow vol-ume and spatial transfer direction for the planning of newtransportation facilities and the redistribution of touristflow As is well known movement is an intrinsic attributeof traffic flow that changes over time with respect to thespatial location of people goods and cars +e patternsimplied in moving datasets are not repeatedly producedby a single car tourist but rather by a huge number of carsthat appear in the same area In most cases the collectedmoving datasets of traffic entities are relatively large involume and complex in structure +erefore it is neces-sary to use data mining algorithms and visual analyticstechniques to extract useful and relevant informationregularities and structures from massive movementdatasets +e data mining algorithms used in trans-portation are varied +ese algorithms focus on clustering[8] density and sequential characteristics [9 10] in timeand space +e leisure activities of car tourists are carriedout in a road network +e activity sequences can bemodeled in a graph that consists of different nodes (forexample parking lots and cultural sites) and edges withdirection that are the order of locations visited For thiskind of dataset graph mining is a widely used method thatfinds interesting patterns in graph representation data[23] +e detected patterns are typically expressed asgraphs which may be subgraphs of graphical data or moreabstract expressions of the trends reflected in data [24]One form of graph mining is frequent subgraph miningwhich is used to identify frequently occurring patterns(subgraphs) across a collection of ldquosmallrdquo graphs or in aldquolargerdquo graph [25] Various subgraph mining algorithmshave been proposed +ese algorithms can be furtherclassified based on the search strategies ie eitherbreadth-first or depth-first searches +e depth-firstsearch strategy is more computationally efficient such asin gSpan (graph-based Substructure pattern mining) [26]MoFa (Molecule Fragment Miner) [27] FFSM (FastFrequent Subgraph Mining) [28] and Gaston (GrAphSequenceTree extractiON) [29] SPIN (Spanning treebased maximal graph mining) [30] However FFSM andGaston cannot be used for directed graphs without majorchanges Only MoFa is suitable for finding directed fre-quent subgraphs and for gSpan only minor changes arenecessary [31] Other related works include significantpattern mining Leap [32] maximal frequent subgraphmining Margin [33] and frequent subgraphs in multi-graphs [34]

During the past few years trajectory-based methodshave been used to analyze transportation systems

2 Journal of Advanced Transportation

[10 11 13 14] In many applications moving entities areconsidered moving points whose trajectories (ie pathsthrough space and time) can be visualized and analyzed Intransportation the collected trajectory data can be presentedin origin-destination (OD) data with aggregation methods[35] Such OD data can be visualized with a set of techniquesincluding flow maps [36 37] and OD maps [38] Never-theless the study of the spatial dimensions of tourism re-mains amostly underexplored area of research although thisresearch area is expanding due to the advances of new in-formation and communication technologies (ICT) Tradi-tional approaches in tourism can be divided into twocategories which are direct observation techniques (eginterviews trip diaries and recall diaries) and non-observation techniques (eg GPS tracking and videotracking) but the use of nonparticipant observation only isthe best technique for privacy reasons [5] Even with ICTsupport this technique has difficulties in data collection andthe large-scale sampling of passenger data is costly Mostpublished studies related to movement patterns are stilldescriptive and they employ small sample sizes that arehighly controlled Moreover this kind of research is focusedon the human movement in tourist intradestination +erehave been a few studies that have explored car-related spatialmovement patterns in tourist destinations Even so theresearch was aimed at large-scale car touristsrsquo activities [39]or used the questionnaire method which is prone to biasesand errors [40] +erefore given the requirements of pro-tecting privacy increasing data volume and avoiding in-vestigator biases it is necessary to conduct flow patternresearch based on continuous time-series data acquiredfrom sensors

3 Study Area

Dapeng Island located in the east of Shenzhen (as shown inFigure 1) is an essential node in the ldquoGuangdong-HongKong-Macao Greater Bay Areardquo and it is the only pioneerzone of national tourism reform and innovation in Guang-dong province Dapeng has abundant tourism resources suchas Dapeng Ancient City National Geological Park and FolkVillage +e ldquoShenzhen Tourism Statistics Bulletinrdquo showedthat a total of 139 million tourists visited this city in 2018+eincrease was 597 percent each year of which only one-tenthwas group tourists +is indicates that most tourism activitiesare carried out by individual visitors Because of topo-graphical constraints tourism transportation on the islandhas not been fully developed Owned and rented car tours arethe main modes of visiting the island for individual tourists+e Dapeng Transportation Bureau has analyzed the trend ofmotorized travel demand on the island and predicted that thetotal annual traffic flow would be 504000 carsyear At thepeak time of ldquoGolden Weekrdquo about 35000 carsday enteredinto the island of which 79 were car tourists

4 Distribution of Monitoring Points

Urban transportation systems usually employ GPS tech-nology to capture taxi and bus tracks Different from this

kind of public transportation research this paper aims toanalyze the flow patterns of car tourists at multiple attrac-tions It is difficult to install GPS device on each personal car+erefore we chose roadside monitoring devices to collecttraffic flow In addition urban road network includes ex-pressways ordinary roads and community roads It has alarge number of nodes and complex structure In order tomonitor each road segment many devices will need to bedeployed on roads So the key road segments and touristattractions were selected as the locations of monitoringpoints Five video devices were deployed at key roads andtwo video devices were deployed in parking lots +e labelsof monitoring devices are A1 A2 A5 A7 A10 P1 and P2(as shown in Figure 2)

+e detected tourist flow at each monitoring point isshown in Table 1

5 Methods

In this study a cloud-based database system was establishedto store traffic data after they were uploaded via 4G com-munication technologies +e collected data includes licenseplate numbers time of passage and labels for monitoringpoints In order to protect touristsrsquo private informationlicense plate numbers were changed into car IDs and onlythe registration places of cars were extracted After datacollection period the license plate numbers will be deletedfrom the database

+e proposed approach is outlined in Figure 3 +issection introduces the detailed steps for the mining offrequent flow patterns of car tourists at the microscale in atourist intradestination First car tourists were separatedfrom mixed traffic flow after analyzing the temporalcharacteristics of collected data Second spatial move-ment graphs of traffic flow were reconstructed for eachday Each movement graph is a connected and directedgraph where vertices are monitoring points and directededges are tourist flow between two monitoring points+en a frequent subgraph mining algorithm (gSpan) wasused to detect the flow patterns between road segmentsand tourist attractions Next in order to reduce thenumber of frequent flow patterns small overlappingsubgraphs were removed from the results Finally weanalyzed the spatial-temporal characteristics of flowpatterns intradestination

51 Preprocessing Source Data When collecting data it isinevitable that problem data will be collected +is can becaused by a problem with the device such as an aging ordamaged camera Additionally a license plate may beblurred blocked or damaged especially in bad weatherwhich can affect the efficiency of car recognition Fur-thermore it is challenging to identify some specialcharacters and confusing numbers on license plates +eabove issues can lead to data distortion To ensure theaccuracy and reliability of the analyzed results the col-lected data need to be processed at first +e rules ofprocessing are as follows

Journal of Advanced Transportation 3

22deg50prime0PrimeN

113deg50prime0PrimeE 114deg0prime0PrimeE 114deg10prime0PrimeE 114deg20prime0PrimeE 114deg30prime0PrimeE

113deg50prime0PrimeE 114deg0prime0PrimeE 114deg10prime0PrimeE 114deg20prime0PrimeE 114deg30prime0PrimeE

22deg40prime0PrimeN

22deg30prime0PrimeN

22deg50prime0PrimeN

22deg40prime0PrimeN

22deg30prime0PrimeN

Figure 1 +e location of dapeng island Shenzhen

Kilometers

0 05 1 2 3 4

N

LandscapeMonitoring pointRoad network

Figure 2 Distribution of monitoring points for tourist traffic flow

4 Journal of Advanced Transportation

(1) Deleted irrelevant fields +e primary informationsuch as car license plate number label of monitoringdevice and collection time is retained

(2) Removed null values and corrected license plateattributions

(3) Eliminated invalid data such as special car licenseplates and duplicate data

(4) Corrected confusing letters and numbers in caricense plates

52 Identifying Car Tourists In this study tourist flow wasdivided into three types One type consisted of commuterson the island the next consisted of intracity tourists (localweekend tourists in Shenzhen) and the last type consisted ofintercity tourists (leisure tourists from outside of Shenzhen)+e collected data came from the cameras on the roads andin parking lots +ese three types of flows were mixed to-gether in the collected data It is necessary to separate thedifferent types of traffic flows +e detailed steps are shownin Figure 4

+e data collected from the parking lots were classifiedinto intracity tourists and intercity tourists according to theregistration location of car license plates

As we know the number of trips made by tourists andlocal commuters is different Tourists only visit the islandoccasionally on weekends or holidays Local residents on theislandmight drive more times per week+erefore we took aweek as a unit and determined if a car had visited the islandduring that week If so this car would be tagged once +en

we counted the number of weeks a car appeared in eachmonth If the number of weeks visited exceeded a predefinedthreshold this car was considered to be a commuterOtherwise this car was considered to be a tourist

+erefore for the data collected from roads we first set athreshold manually based on the statistics of the number ofweeks visited in one month to distinguish island commutersand tourists Next we categorized visitors into intracity cartourists and intercity car tourists based on where theirlicense plates are registered Finally different types of trafficflows were separated and aggregated

In order to verify the usability and reliability of the proposedmethod the visiting characteristics of all cars were analyzed+eresult is shown in Figure 5 As can be seen the proportion ofweeks in one month in which the car appears is the highestreaching 8886+e percentage of cars appearing on the islandfor less than two weeks is 95 In addition we counted thepercentage of cars in the parking lots relative to the totalnumber of cars +e ratio is 786 which is close to the ratio of79 counted by Dapeng Transportation Bureau during peaktourist periods +e ratio of the number of weeks visited in onemonth (8886) is greater than the statistical result of Trans-portation Bureau (79) We think that the TransportationBureau only considered the tourists in parking lots +ereforethe threshold in this study is oneweek for extracting car tourists

53 Reconstructing the SpatialMovementGraphs In order tomodel the flow patterns of car tourists labeled direct graphsare used to construct movement relationships betweenmonitoring points In particular each vertex of the direct

Table 1 Car tourists detected at monitoring points

Label Location Detected car touristsA1 Kuinan Road Dapeng Community and Xinda CommunityA2 Dongchong Road Dongchong CommunityA5 Pengfei Road Dapeng community and Dapeng Ancient City Jiaochangwei Folk Village Dongshan TempleA7 Fumin Road Nanrsquoao CommunityA10 Nanxi Road Xichong CommunityP1 Jiaochangwei Park Lots Jiaochangwei Folk VillageP2 Ancient City Park Lots Dapeng Ancient City Dongshan Temple

Recognition devices forcars

Cloud databasesystem

Preprocess data

Extract the valid recordsExtract continuous data during

this period all monitoringdevice is normal operation

Analyze the time characteristicsof source data

Find the vehicles in parking lotduring the research period

Find the vehicles that visit theisland each week

Count the number of weeks thateach car appears in a month

Create graph dataset

Create the traffic spatialmovement graph from the

sequence data each day

Construct the sequence data ofmonitoring points that each

vehicle passes each day inchronological order

Separate car tourists andcommuting residents based onthe weekly threshold and the

historical vehicles in theparking lots

Frequent sub-graph mining

Remove small overlappingsub-graphs

Spatio-temporal charateristicsof vehicles

Analyze spatio-temporalcharacteristics and tourist flow

patterns

Spatial flow patterns of cartourists

Figure 3 +e detailed steps for mining frequent patterns in tourist flow

Journal of Advanced Transportation 5

graph corresponds to a monitoring point and each edgecorresponds to a directed connection between two moni-toring points passed by car tourists +e related definitionsare as follows

Definition 1 Label Graph Given a set of verticesV v1 v2 vk1113864 1113865 a set of edges connecting two vertex inV E eh (vi vj)|vi vj isin V1113966 1113967 a set of vertex labelsL(V) lb(vi)|forallvi isin V1113864 1113865 and a set of edge labelsL(E) lb(eh) |foralleh isin E1113864 1113865 eh is a direct edge that has the startvertex vi and end vertex vj then a label graph G is repre-sented as

G (V E L(V) L(E)) (1)

In the graph dataset the label of an edge is represented by alabel pair of twomonitoring points in the tourist visiting orderA graph consists of edges connecting the monitoring pointsvisited by each tourist in one day +e advantage of using thevertex label pair as an edge label is that it maintains thetemporal and spatial order of the two monitoring points thattourists pass through When mining a labeled graph thespatial-temporal order of monitoring points in results could be

preserved Using this representation the problem of findingfrequent flow patterns of car tourists becomes a problem ofmining frequent subgraphs in all movement graphs

54 Mining Frequent Subgraphs Some definitions related tofrequent subgraph mining are given below

Definition 2 Subgraph A subgraph g2 (V2 E2) of graphg1 is a graph in which V2 subeV1 E2 E1 cap (V2 times V1)

Definition 3 Support of a subgraph g Given a labeled graphdataset GD g1 g2 gn1113864 1113865 the support or frequency of asubgraph g is the percentage (or number) of graphs in GD

Definition 4 Frequent subgraph A frequent subgraph is agraph whose support is not less than a minimum supportthreshold +e minimum support threshold represents theminimum number of occurrences of a subgraph To obtainthe frequent patterns we chose to manually set the value ofminimum threshold

Definition 5 Mini Code First a depth-first search is per-formed on the graph to form a DFS (depth-first search) treeand then this tree is scanned +e order of the scanned edgesconstitutes a sequence called the DFS Code +e DFS Codesare sorted in a lexicographic order to find the smallest DFScode that uniquely identifies the graph +is minimum DFSCode is called the mini Code

After constructing the movement graphs of tourist flowthe subgraphminingmethodwas used to explore the frequentpatterns As demonstrated in related works there are manykinds of subgraph mining algorithms +e AGM (Apriori-based GraphMining) [41] can discover all frequent subgraphs(both connected and disconnected) in a graph database thatsatisfy a specific minimum support constraint+is algorithmuses an approach similar to Apriori and it requires 40minutes to 8 days to find the result subgraphs in a datasetcontaining 300 chemical compounds +e algorithm FSG(finding frequently occurring subgraphs in large graph) [42]adopts an adjacent representation of a graph and an edge-

100

80

60

40

20

0

Ratio

()

1 2 3 4 5Number of weeks a car appeared in one month

8886

Figure 5 Number of weeks visited by tourists in one month

e data collected from parking lots

e data collected from roads

Tourists

Inter-city car touristsoutside Shenzhen

Inter-city car tourists inShenzhen

Registration place

Number of weeksvisited

Commuterson the island

Figure 4 +e workflow of traffic flow separation

6 Journal of Advanced Transportation

growing strategy to find all of the connected subgraphs thatfrequently appear in a graph database+e results have shownthat FSG can be finished in 600 seconds gSpan [26] isdesigned to reduce or avoid the candidate generation and

pruning false positives used in AGM and FSG gSpan cancomplete the same task in 10 seconds Considering the ef-ficiency in this study gSpan was used to mine frequentsubgraphs in a directed graph dataset and then find the

Num

ber o

f car

s at e

ach

poin

t1000

02500

01000

02500

0

01000

5000

02500

00 50 100 150 200 250 300

Days

A1

A5

A7

A10

P1

P2

A2

(a)

Num

ber o

f car

s at e

ach

poin

t

A1

A5

A7

A10

P1

P2

A2

0 10 20 30 40 50 60 70Days

10000

10000

500

02500

01000

5003000200010002500

0

(b)

Figure 6 Daily traffic flow at each monitoring point

C1

C216000

14000

12000

10000

8000

6000

4000

2000

0

Traffi

c vol

ume

A5P1 A5P2 P2P1 A10A2 A1P1A1A10 A1P2 P1A10 A5A1 A2A1 A2P1A10A7A7P1 A10P2 A7A2 A1A7 A2P2 P2A7 A5A10A7A5 A5A2

Links

Kilometers

0 05 1 2 3 4

Figure 7 Flow map of tourism traffic

Table 2 Types of car tourist visiting the island

Types Number of records Number of graphsAll car tourists 361693 76Intracity car tourists in Shenzhen 240452 76Intercity car tourists outside Shenzhen 121241 75

Journal of Advanced Transportation 7

frequent subgraphs withmaximum length in the results as theflow patterns of car tourists +e algorithmic details of gSpanare available in reference [26] An extended instruction isgiven below +e method consisted of two steps (1) Findingthe frequent subgraphs using gSpan First the frequencies ofedges and nodes of all graphs was calculated Second thefrequencies were compared with the minimum supportthreshold and the infrequent edges and nodes were removed+en the remaining nodes and edges were reorderedaccording to the frequency And the frequency of each edgewas calculated again Finally the subgraphs of the restoredgraph were mined according to mini Code and it was de-termined whether the current DFS encoding is the minimumcode or not If so current edges were added to the results and

further attempts were made to add possible edges If not themining process was finished (2) Finding frequent subgraphswithmaximum length+ere are a large number of subgraphsin the obtained results and some subgraphs are partial graphsof the others +erefore this kind of subgraph was deleted bycomparing the labels of nodes and edges and the final resultswere the maximum frequent subgraphs

6 Results

In this section we first analyzed the spatial and temporaldistribution of tourist flows by statistical methods and maps+en we divided the tourist flows into intracity car touristsand intercity car tourists and used a frequent subgraphmining algorithm for pattern recognition Finally wesummarized the movement patterns of all tourists

61 Spatial-Temporal Characteristics of the TouristTraffic Flow

611 Temporal Characteristics Figure 6(a) depicts 295days of data before any preprocessing was applied Due to

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 74

C1

(a)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 74

C1

(b)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C1

C2

S = 73

(c)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C1

C2

S = 73

(d)

Figure 8 Flow patterns inferred from intracity tourists

Table 3 +e list of flow patterns inferred from intracity tourists

Figures PatternsFigure 8(a) (A1⟶ P2 andA1hArrP1)

Figure 8(b) A1⟶ P2⟶ P1⟶ A1Figure 8(c) A1⟶ A2 andA1hArrP1hArrP2Figure 8(d) A1⟶ A10 andA1hArrP1hArrP2

8 Journal of Advanced Transportation

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 62

C1

(a)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 62

C1

(b)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 61

C1

(c)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 57

C2

C1

(d)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 56

C1

C2

(e)

Figure 9 Flow patterns inferred from intercity tourists

Journal of Advanced Transportation 9

the failure of device communication or power the con-structed movements graphs may be incomplete whichwould lead to the loss of frequent subgraphs +ereforewe selected 76 days of valid data as the dataset for frequentpattern mining Figure 6(b) shows that (1) the touristflows at A2 and A10 near the sandy beaches has similartemporal characteristics +eir peak hours of touristvolume occur both on holidays and weekends while onweekdays the flow curves are relatively stable (2) P1 andP2 are located in two parking lots Although there aredifferences in the tourist volumes the trends are similar(3) For the two entrances to the tourist attractions theaverage daily volume of tourist traffic at A1 is 07 timesthat of A5 After sorting the volume of tourist trafficPengcheng Community has the most car tourists followedby two tourist attractions with a sandy beach (ieDongchong and Xichong Community) and the leastvisited attraction is Nanao Community

612 Spatial Characteristics Figure 7 shows the flowmap ofthe aggregated tourists transferring in multiple monitoringpoint pairs +e tourist volumes are depicted and sorted inthe left bar chart in the figure and the link thickness rep-resents the traffic volume As can be seen the links with thelargest traffic volume are A5P1 A5P2 P2P1 A10A2A1P1 and A1P2 (A5P1 is a simplified form for A5harrP1which represents the forth and back tourist flow between A5and P1) +ese links could be divided into two areasC1(A1 A5 P1 P2) and C2(A2 A10) Further inspectionreveals that the number of tourists transferring between P1and P2 in area C1 is close to the value between A2 and A10in area C2 but the number of tourists visiting P1 and P2 is312 times that of A2 and A10

62 Analyses of the Detected Flow Patterns +e flow map isintuitive but it suffers from serious visual clutter and it isdifficult to read because of overlapping flows It can also beseen in Figure 7 that the flow map only shows the trafficvolumes between the monitoring points in the touristdestination But it could not express the spatial transferdirections of car tourists +erefore these facts motivate usto find a new approach to solve these problems +is sectiondescribes the use of frequent subgraph mining algorithm toexplore the spatial flow patterns with directions in touristtraffic and to obtain the maximum frequent patterns fromthe daily tourist movement graphs according to a predefinedminimum support threshold By using the identificationmethod of car tourists introduced in Section 5 two types of

data were extracted and used for the subsequent flow patternmining (listed in Table 2)

621 Flow Patterns of Intracity Tourists in Shenzhen+e experiments were conducted with the dataset of intracitycar tourists in Shenzhen (as shown in Table 2) +e mini-mum support threshold for gSpan was set to 73 +e spatialflow patterns are represented in Figure 8 and listed in Ta-ble 3+e inferred patterns could be divided into two groupsOne group consists of the patterns shown in Figures 8(a) and8(b) which show the tourist spatial transfer process in areaC1 +e other group contains the patterns shownFigures 8(c) and 8(d) which represent the tourists trans-ferring between area C1 and area C2 +e two groupsdemonstrate that the intracity car tourists who arrived at P1and P2 preferred to choose Kuinan Road at A1 instead ofPengfei Road at A5 +e difference between Figures 8(a) and8(b) is the existence of the circle tourist flow +e differencebetween Figures 8(c) and 8(d) is the presence of tourists flowback and forth+e tourist flow from A1 to A2 have only onedirection One of the reasons may be that this study failed tofind a suitable monitoring point on Pingxi Road resulting inthe loss of directionality for this part of tourist flow

622 Flow Patterns of Intercity Car Tourists Figure 9 showsflow patterns of intercity car tourists+e dataset used in thissection is composed of intercity tourists (as shown in Ta-ble 2) +e minimum support threshold for gSpan was set to56 +e result flow patterns are sorted by the supportthreshold as listed in Table 4

From these patterns the following could be con-cluded (1) +e most frequent patterns are shown inFigures 9(a)ndash9(c) +e frequencies of the discovered flowpatterns are 62 62 and 61 respectively +ese threepatterns describe the preference of intercity tourists forarea C1+is kind of tourists first visited one of attractionsnear a parking lot P1 or P2 and then visited anotherattraction or just visited one tourist attraction near P1 orP2 (2)+e above three patterns are different from those ofthe intracity tourists in Shenzhen +ere is no circle tourbetween P1 P2 and A1 +e reason for this is that sometourists chose to continue drive to area C2 (3) +eminimum support threshold of these two patterns asshown in Figures 9(d) and 9(e) is significantly smallerthan that shown in Figures 9(a)ndash9(e) but it reflects thespatial transfer preferences of tourists from neighboringcities in areas C1 and C2 In Figure 9(d) the car touriststended to drive directly from A5 to A10 after visiting

Table 4 +e list of the flow patterns inferred from intercity tourists

Figures PatternsFigure 9(a) A1⟶ P2⟶ P1Figure 9(b) A1hArrP1hArrP2Figure 9(c) A1⟶ P1 andA1⟶ P2Figure 9(d) A5⟶ P1⟶ A10Figure 9(e) A10⟶ P1hArrP2 andA10hArrA2

10 Journal of Advanced Transportation

attractions near P1 As shown in Figure 9(e) the cartourists that flowed between A10 and A2 went back to P1and changed car parking lots between P1 and P2

However there is no tourist flow to A1 and A5 +e reasonfor this may be that the traffic flow of Pingxi Road (theexpressway in and out of the peninsula) was not

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(a)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(b)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(c)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(d)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(e)

Figure 10 Flow patterns of all tourists

Journal of Advanced Transportation 11

monitored +is part of traffic flow could directly arriveand then leave from A10

623 Flow Patterns of All Tourists +is section presents anexploration of the spatial flow patterns of all car tourists +eresults are shown in Figure 10 +e minimum supportthreshold for gSpan was set to 73 which means that 73 of the76 graphs contained the discovered flow pattern +e resultpatterns are listed in Table 5

All of the patterns shown in Figure 10 have touristflow between area C1 and area C2 +is is consistent withthe trend in the flow map (shown in Figure 7) As thepicture shows A1 is the main entrance for tourists toenter and exit Dapeng Island Some of the car touristsdrove to P1 or P2 and the rest flowed to A10 or A2Figures 10(a)ndash10(c) show the directions of tourist flowsbetween the Dapeng Community Pengcheng Commu-nity and Xichong Community Figure 10(d) shows thedirections of tourist flows between Dapeng CommunityPengcheng Community and Dongchong CommunityFigure 10(e) represents only the directions of tourist flowsbetween Dapeng Community and Dongchong Commu-nity Furthermore the directions of tourist flows in areaC1 and area C2 or between area C1 and area C2 aredifferent Taking Figures 10(a)ndash10(c) as examples al-though the orders of P1 and P2 that are accessed from A1have a lack of regularity they have their own charac-teristics when considering the accompanying paths toA10 +ese examples indicate that if there is a tourist flowbetween A1 and A10 it could be divided into two cases Inone case some of the tourists returned directly and in theother case some of the tourists flowed to P1 In the firstcase either the tourist flow passing by A1 first accessed P2and then visited P1 or both P1 and P2 had tourists at thesame time In the second case some of the touristsreturned from A10 visited P1 and P2 and then left theisland

7 Conclusions and Discussions

71 Conclusions In order to facilitate the management oftourist traffic flow the car tourists in Dapeng Island weretaken as a research case +e experimental analysis used realdata captured by video devices in the research area Due to thelack of suitable device installation locations the capturedpicture from the video device had a certain distance from theroad+e catch rate of the touristsrsquo cars was low However thedetailed time-series data in one day could be collectedCompared with the manual survey the collected data was

improved in terms of reliability and richness A day waschosen as the time unit for frequent pattern mining Afterselecting the available data we divided the data of cars intointracity and intercity tourists Next the license plate datawere transformed into movement graphs according to thevisited location sequence between multiple monitoringpoints +e intricate flow patterns of the car tourists werediscovered by gSpan algorithm which had the best perfor-mance in terms of the quality of the results and the executiontime and which had already proven to be efficient for frequentsubgraph mining +e conclusions are as follows

(1) +e car tourists had obvious preferences in the se-lection of trip time and tourist attractions (shown inFigures 6 and 7) In terms of time the curves oftourist flow at each monitoring point were similar+ere were a large number of car tourists at variousattractions on holidays but the volume of car touristswas relatively lower on weekdays +e same types ofattractions had the similar trends in tourist flowsuch as the two attractions with a sandy beach(Dongchong Community and Xichong Community)and the ancient city and cultural attractions (DapengAncient City and Dongshan Temple) In terms ofspace the attractions that are close to the entrance ofa scenic area and rich in tourist resources were morepopular with tourists However due to the terrainbarrier in the scenic area the traffic conditions af-fected the movements of tourists between multipleattractions

(2) Different types of car tourists had similar spatialchoices in scenic area (shown in Figures 8ndash10) Forexample different types of tourists had flow patternsthat described the movements in one area (as shownin Figures 8(a) 8(b) and 9(a)ndash9(c)) and the move-ments between different areas (as shown inFigures 8(c) 8(d) 9(d) and 9(e)) +e intercitytourists and intracity tourists had different choices inscenic spots +e intercity tourists would takemultidestination trips instead of single destinationtrips in the same type of attractions As can be seen inFigures 8 and 9 there are the two sandy beach at-tractions in areaC2 the intracity tourists tend to visitone of them while the intercity tourists would visitboth attractions Specifically to save time andmoney intercity tourists would visit multiple at-tractions in one trip instead of making multiple tripsAdditionally another difference was that there wasno circlet in area C1 for intercity tourists and notraffic flow between A2 and A10 for intracity tourists

Table 5 +e list of flow patterns inferred from all tourists

Figures PatternsFigure 10(a) (A1⟶ A10⟶ A1) and (A10⟶ P1hArrP2) and (P1⟶ A1)

Figure 10(b) (A1hArrA10 andA1⟶ P1 andA1⟶ P2)

Figure 10(c) (A1hArrP10 andA1⟶ P2⟶ P1⟶ A1)

Figure 10(d) (A1hArrP1hArrP2 andA1hArrA10hArrA2)

Figure 10(e) (A1hArrP1hArrP2 andA1⟶ A2)

12 Journal of Advanced Transportation

(3) Although the patterns depicted on the map lookcomplicated and messy after the patterns are con-verted to rules they become clear In pattern mapsonly Figures 8(b) 9(a) and 9(d) show a clear uni-directionality +e rest is complex and is difficult tocompare As we can see from Figure 8(a) theintracity tourists passing point A1 could be dividedinto two groups one group flowed to parking lot P1and then leaved the scenic area from P1 +e othergroup flowed to parking lot P2 Two groups havesimultaneity in tourist routes So we could convertthe flow patterns to rules and use ldquoandrdquo to illustratethe simultaneity of different routes in the samepattern (as shown in Tables 3ndash5) By this way allpatterns can be applied to traffic control system forregional tourism

(4) Large primary attractions are more attractive thansmaller secondary attractions Looking from thearrow on the pattern maps tourists always visit areaC1 first and then select area C2 +e main reason isthat area C1 has abundant tourism resources anddiverse tourism activities For example there arecultural attractions (eg Dapeng Ancient City andDongshan Temple) sandy beach entertainmentsand large parking lots for tourists in area C1 +esefactors are also often considered in the evaluation ofthe importance of attractions in tourism network

(5) +e tourists tended to park their cars in an easy-to-access place even if the visited attractions arechanged as shown in Figures 8(a)ndash8(d) 9(b) and9(e) Again here we take area C1 as an example +edistance between the Dapeng Ancient City andDongshan Temple is about 1 km On the patternmaps we can see that there are two-way arrowspointing to the two parking lots P1 and P2 +isindicates that the car tourists tended to park theircars to the nearest parking lot so that they could pickthem up when the tour destination is changed

72 Discussions Tourist flow is the key to traffic man-agement in tourism destinations and it affects the de-velopment of tourism on an island and the experience oftourists +e recent development of transport technolo-gies has shown that traffic flow data will be increasinglycollected and it will be available for data analysis+erefore advanced data analytics should be used tointerpret and depict the complex movements of cartourists +e proposed approach in this study wasintended to find (i) the statistical summaries of the spatial-temporal characteristics of car tourists in the researcharea helping to discover patterns from the mass carlicense plate data (ii) the flow patterns of intercity andintracity tourists helping to illustrate the different pref-erences of the two types of car tourists and (iii) the mostfrequent patterns of all tourists helping to identify the lawof tourist movement and make efficient policy for themanagement of tourist traffic flow +e presented ap-proach enriched the analytical methodology of tourist

traffic flow and suggested a shift from the conventionaland complicated paper or computer interview-basedmethod to a dynamic flow graph-based method Fur-thermore we have shown how transportation data pro-vides hard-to-obtain insights and quantitative results fortourists

In order to illustrate the law of spatial movement oftourists related studies have proposed a variety of macroflow patterns [43ndash45] For example in 2008 McKercher [46]proposed 11 prominent route styles in urban destinations+e macropatterns retain only the main components andsimplify the details and are often used in tourism man-agement to guide destination development Compared withmodeling the flow patterns of tourists at the macrolevelmodeling tourist flow at the microscale is more complicatedLew and McKercher [17] noted that it is a challenge tobalance model effectiveness and usability +e reason is thatsimple patterns may not provide enough details for use andcomplex patterns may be difficult to interpret and apply Inthis study we used the video devices installed at key nodes ofroad network to collect tourist traffic flows and used thefrequent subgraph mining algorithm to discover flow pat-terns at the microscale +e extracted patterns can beconverted into rules and applied to the traffic control systemfor the management of regional tourists +e difficulty offinding and applying patterns at microscale could beovercome by this way

During the peak period of tourism the number of cartourists in the scenic area increases sharply It is easy toresult in road congestion and uneven distribution oftourists between attractions With the use of traffic flowdata the daily monthly and seasonal characteristics oftourists can be analyzed the future tourist flow can bepredicted and the flow patterns can be obtained by datamining methods +us the traffic management departmentcan effectively control the tourist traffic flow and thetourism department can develop attractive tourism prod-ucts to achieve a spatial balance of the distribution oftourists reduce traffic congestion and harmful gas emis-sions and weaken the impact of tourism environment andhuman body

Inside a scenic area attractions form a complex net-work of tourist flow due to the frequent spatial interactionEach attraction is both the origin and destination of cartourists Due to the differences in attractiveness degree ofdevelopment and convenience of transportation touristshave shown special preferences when choosing attractionsand trip routes Accordingly effective identification ofthese preferences will be beneficial to the development oftourist market and will also helpful for tour planners tounderstand how tourists see the spatial connection ofmultiple attractions However traditional manual surveysare time-consuming and laborious and the amount of dataobtained is small It is difficult to reveal tourist preferencesby this way Frequent subgraph mining algorithms providea desirable method for identifying the spatial preferences oftourists +is kind of method could perform well under thesupport of a large amount of movement data betweenattractions

Journal of Advanced Transportation 13

+is study has several limitations Firstly due to lack ofpower supply facilities it was unable to collect the traffic datafor each day When applying the model to the actual controlof tourist traffic flow it is necessary to further co-operatewith the traffic management department to obtain com-prehensive traffic flow data Secondly this study focused onthe mining of flow patterns the influencing factors behindthe identified patterns were not further analyzed As men-tioned by Lew andMcKercher [17] factors related to touristsand destinations can affect how tourists move or travel in adestination +ese factors include family composition in-come and valid information obtained before travelling It isdifficult to obtain these factors by relying only on the trafficflow data collected in the traffic monitoring system+erefore in future work field investigation will benecessary

Data Availability

As the data also form part of an ongoing study the raw dataneeded to reproduce these findings cannot be shared at thistime

Conflicts of Interest

+e authors declare no conflicts of interest

Acknowledgments

+e authors gratefully acknowledge the support of this re-search from the National Science Foundation of China(41701167) and the Basic Research Project of Shenzhen City(JCYJ20170307164104491 and JCYJ20190812171419161)

References

[1] D W Eby and L J Molnar ldquoImportance of scenic byways inroute choice a survey of driving tourists in the United StatesrdquoTransportation Research Part A (Policy and Practice) vol 36no 2 pp 95ndash106 2003

[2] B Wang Z Yang F Han and H Shi ldquoCar tourism inxinjiang the mediation effect of perceived value and touristsatisfaction on the relationship between destination imageand loyaltyrdquo Sustainability vol 9 no 1 pp 1ndash16 2016

[3] X M Zhang and Q Y Pen ldquoA study on tourism informationand self-driving tour trafficrdquo in Proceedings of the First Inter-national Conference on Transportation Engineering SouthwestJiaotong University Chengdu China July 2007

[4] China National Tourism Administration 7e Yearbook ofChina Tourism Statistics 2016 China Travel amp Tourism PressBeijing China 2017

[5] M A Mohamad Toha and H N Ismail ldquoA heritage tourismand tourist flow pattern a perspective on traditional versusmodern technologies in tracking the touristsrdquo InternationalJournal of Built Environment and Sustainability vol 2 no 2pp 85ndash92 2015

[6] K Siła-Nowicka and A S Fotheringham ldquoCalibrating spatialinteraction models from gps tracking data an example ofretail behaviourrdquo Computers Environment and Urban Sys-tems vol 74 pp 136ndash150 2019

[7] F Mao M Ji and T Liu ldquoMining spatiotemporal patterns ofurban dwellers from taxi trajectory datardquo Frontiers of EarthScience vol 10 no 2 pp 205ndash221 2016

[8] L J Zheng D Xia X Zhao and W L Liu ldquoMining tripattractive areas using large-scale taxi trajectory datardquo inProceedings of the 2017 IEEE International Symposium onParallel and Distributed Processing with Applications and 2017IEEE International Conference on Ubiquitous Computing andCommunications pp 1217ndash1222 ISPAIUCC GuangzhouChina December 2017

[9] I Rami and M O Shafiq ldquoDetecting taxi movements usingrandom swap clustering and sequential pattern miningrdquoJournal of Big Data vol 6 no 39 pp 1ndash26 2019

[10] G Kautsar and S Akbar ldquoTrajectory pattern mining using se-quential pattern mining and k-means for predicting future lo-cationrdquo Journal of Physics Conference Series vol 801 no 1 2017

[11] D Guo X Zhu H Jin P Gao and C Andris ldquoDiscoveringspatial patterns in origin-destination mobility datardquo Trans-actions in GIS vol 16 no 3 pp 411ndash429 2012

[12] J D Mazimpaka and S Timpf ldquoA visual and computationalanalysis approach for exploring significant locations and timeperiods along a bus routerdquo in Proceedings of the 9th ACMSIGSPATIAL International Workshop on ComputationalTransportation Science pp 43ndash48 San Francisco CA USAOctober 2016

[13] H Faroqi M Mesbah and J Kim ldquoSpatial-temporal simi-larity correlation between public transit passengers usingsmart card datardquo Journal of Advanced Transportationvol 2017 Article ID 1318945 14 pages 2017

[14] Z Wang Y Hu P Zhu Y Qin and L Jia ldquoRing aggregationpattern of metro passenger trips a study using smart carddatardquo Physica A Statistical Mechanics and Its Applicationsvol 491 pp 471ndash479 2018

[15] X Zhao X Lu Y Liu J Lin and J An ldquoTourist movementpatterns understanding from the perspective of travel partysize using mobile tracking data a case study of Xirsquoan ChinardquoTourism Management vol 69 pp 368ndash383 2018

[16] M Versichele L De Groote M Claeys Bouuaert T NeutensI Moerman and N Van De Weghe ldquoPattern mining intourist attraction visits through association rule learning onbluetooth tracking data a case study of ghent BelgiumrdquoTourism Management vol 44 pp 67ndash81 2014

[17] A Lew and B Mckercher ldquoModeling tourist movementsrdquoAnnals of Tourism Research vol 33 no 2 pp 403ndash423 2006

[18] A Birenboim S Anton-Clave A P Russo and N ShovalldquoTemporal activity patterns of theme park visitorsrdquo TourismGeographies vol 15 no 4 pp 601ndash619 2013

[19] H J Yun and M H Park ldquoTime-space movement of festivalvisitors in rural areas using a smart phone applicationrdquo AsiaPacific Journal of Tourism Research vol 20 no 11pp 1246ndash1265 2015

[20] S Kang ldquoAssociations between space-time constraints andspatial patterns of travelsrdquo Annals of Tourism Researchvol 61 pp 127ndash141 2016

[21] C Comito D Falcone and D Talia ldquoMining human mobilitypatterns from social geo-tagged datardquo Pervasive and MobileComputing vol 33 pp 91ndash107 2016

[22] Y T Zheng Y Li Z J Zha and T S Chua ldquoMining travelpatterns fromGPS-tagged photosrdquo in Lecture Notes in ComputerScience K T LeeWH Tsai H YM Liao T Chen JWHsiehand C C Tseng Eds Springer Berlin Germany 2011

[23] K M Borgwardt H P Kriegel and P WackersreutherldquoPattern mining in frequent dynamic subgraphsrdquo in Pro-ceedings of the 6th IEEE International Conference on Data

14 Journal of Advanced Transportation

Mining (ICDM 2006) Hong Kong China IEEE December2006

[24] D J Cook and L B HolderMining Graph Data Wiley PressHoboken NJ USA 2015

[25] C Jiang F Coenen andM Zito ldquoFrequent sub-graph miningon edge weighted graphsrdquo in Lecture Notes in ComputerScience T Bach Pedersen M K Mohania and A M TjoaEds Springer Berlin Germany pp 77ndash88 2010

[26] X Yan and J Han ldquogSpan graph-based substructure patternminingrdquo in Proceedings of the 2002 IEEE InternationalConference on Data Mining ICDMrsquo02 pp 721ndash724 IEEEComputer Society Maebashi City Japan December 2002

[27] C Borgelt and M R Berthold ldquoMining molecular fragmentsfinding relevant substructures of moleculesrdquo in Proceedings ofthe 2002 IEEE International Conference on Data MiningICDMrsquo02 pp 51ndash58 IEEE Computer Society Maebashi CityJapan December 2002

[28] J Huan W Wang and J Prins ldquoEfficient mining of frequentsubgraphs in the presence of isomorphismrdquo in Proceedings ofthe 7ird IEEE International Conference on Data MiningICDMrsquo03 pp 549ndash552 IEEE Computer Society MelbourneFL USA November 2003

[29] S Nijssen ldquoA quickstart in frequent structure mining canmake a differencerdquo in Proceedings of the Tenth ACM SIGKDDInternational Conference on Knowledge Discovery and DataMining KDDrsquo04 pp 647ndash652 ACM Seattle WA USAAugust 2004

[30] J Huan W Wang J Prins and J Yang ldquoSPIN miningmaximal frequent subgraphs from graph databasesrdquo in Pro-ceedings of the Tenth ACM SIGKDD International Conferenceon Knowledge Discovery and Data Mining Seattle WA USAACM August 2004

[31] M Worlein T Meinl I Fischer and M ampPhilippsen ldquoAquantitative comparison of the subgraph miners MoFagSpan FFSM and gastonrdquo in Proceedings of the 9th EuropeanConference on Principles and Practice of Knowledge Discoveryin Databases PKDDrsquo05 Springer Porto Portugal pp 392ndash403 2005

[32] X Yan H Cheng J Han and P ampYu ldquoMining significantgraph patterns by leap searchrdquo in Proceedings of the 2008ACM SIGMOD pp 433ndash444 Vancouver Canada June 2008

[33] L T +omas S R Valluri and K Karlapalem ldquoMarginmaximal frequent subgraph miningrdquo in Proceedings of theSixth International Conference on Data Mining (ICDMrsquo06)Hong Kong China December 2006

[34] V Ingalalli D Ienco and P Poncelet ldquoMining frequentsubgraphs in multigraphsrdquo Information Sciences vol 451-452pp 50ndash66 2018

[35] L Min Z C Wang J Liang and X R Yuan ldquoOD-wheelvisual design to explore OD patterns of a central regionrdquo inProceedings of the Visualization Symposium 2015 HangzhouChina April 2015

[36] B Jenny D M Stephen I Muehlenhaus and B E MarstonldquoDesign principles for origin-destination flowmapsrdquo Car-tography and Geographic Information Science vol 45 no 1pp 1ndash15 2016

[37] S Kim S Jeong I Woo Y Jang R Maciejewski andD S Ebert ldquoData flow analysis and visualization for spa-tiotemporal statistical data without trajectory informationrdquoIEEE Transactions on Visualization and Computer Graphicsvol 24 no 3 pp 1287ndash1300 2018

[38] J Wood J Dykes and A Slingsby ldquoVisualisation of originsdestinations and flows with OD mapsrdquo 7e CartographicJournal vol 47 no 2 pp 117ndash129 2010

[39] M Zillinger ldquoTourist routes a time-geographical approach onGerman car-tourists in Swedenrdquo Tourism Geographies vol 9no 1 pp 64ndash83 2007

[40] S Page and J Connell ldquoExploring the spatial patterns of car-based tourist travel in loch lomond and trossachs nationalpark Scotlandrdquo Tourism Management vol 29 no 3pp 561ndash580 2008

[41] A Inokuchi T Washio and H ampMotoda ldquoAn apriori-basedalgorithm for mining frequent substructures from graphdatardquo in Proceedings of the 4th European Conference onPrinciples of Data Mining and Knowledge DiscoveryPKDDrsquo00 pp 13ndash23 Lyon France September 2000

[42] M Kuramochi and G Karypis ldquoFrequent subgraph discov-eryrdquo in Proceedings of the 2001 IEEE International Conferenceon Data Mining ICDMrsquo01 pp 313ndash320 IEEE ComputerSociety San Jose CA USA August 2001

[43] M Oppermann ldquoA model of travel itinerariesrdquo Journal ofTravel Research vol 33 no 4 pp 57ndash61 1995

[44] N Shoval and A Raveh ldquoCategorization of tourist attractionsand the modeling of tourist cities based on the co-plotmethod of multivariate analysisrdquo Tourism Managementvol 25 no 6 pp 741ndash750 2004

[45] G Lau and B McKercher ldquoUnderstanding tourist movementpatterns in a destination a gis approachrdquo Tourism andHospitality Research vol 9 no 1 pp 39ndash49 2007

[46] B McKercher and G Lau ldquoMovement patterns of touristswithin a destinationrdquo Tourism Geographies vol 10 no 3pp 355ndash374 2008

Journal of Advanced Transportation 15

Page 3: DiscoveringtheGraph-BasedFlowPatternsofCarTouristsUsing ...downloads.hindawi.com/journals/jat/2020/4795830.pdfResearchArticle DiscoveringtheGraph-BasedFlowPatternsofCarTouristsUsing

[10 11 13 14] In many applications moving entities areconsidered moving points whose trajectories (ie pathsthrough space and time) can be visualized and analyzed Intransportation the collected trajectory data can be presentedin origin-destination (OD) data with aggregation methods[35] Such OD data can be visualized with a set of techniquesincluding flow maps [36 37] and OD maps [38] Never-theless the study of the spatial dimensions of tourism re-mains amostly underexplored area of research although thisresearch area is expanding due to the advances of new in-formation and communication technologies (ICT) Tradi-tional approaches in tourism can be divided into twocategories which are direct observation techniques (eginterviews trip diaries and recall diaries) and non-observation techniques (eg GPS tracking and videotracking) but the use of nonparticipant observation only isthe best technique for privacy reasons [5] Even with ICTsupport this technique has difficulties in data collection andthe large-scale sampling of passenger data is costly Mostpublished studies related to movement patterns are stilldescriptive and they employ small sample sizes that arehighly controlled Moreover this kind of research is focusedon the human movement in tourist intradestination +erehave been a few studies that have explored car-related spatialmovement patterns in tourist destinations Even so theresearch was aimed at large-scale car touristsrsquo activities [39]or used the questionnaire method which is prone to biasesand errors [40] +erefore given the requirements of pro-tecting privacy increasing data volume and avoiding in-vestigator biases it is necessary to conduct flow patternresearch based on continuous time-series data acquiredfrom sensors

3 Study Area

Dapeng Island located in the east of Shenzhen (as shown inFigure 1) is an essential node in the ldquoGuangdong-HongKong-Macao Greater Bay Areardquo and it is the only pioneerzone of national tourism reform and innovation in Guang-dong province Dapeng has abundant tourism resources suchas Dapeng Ancient City National Geological Park and FolkVillage +e ldquoShenzhen Tourism Statistics Bulletinrdquo showedthat a total of 139 million tourists visited this city in 2018+eincrease was 597 percent each year of which only one-tenthwas group tourists +is indicates that most tourism activitiesare carried out by individual visitors Because of topo-graphical constraints tourism transportation on the islandhas not been fully developed Owned and rented car tours arethe main modes of visiting the island for individual tourists+e Dapeng Transportation Bureau has analyzed the trend ofmotorized travel demand on the island and predicted that thetotal annual traffic flow would be 504000 carsyear At thepeak time of ldquoGolden Weekrdquo about 35000 carsday enteredinto the island of which 79 were car tourists

4 Distribution of Monitoring Points

Urban transportation systems usually employ GPS tech-nology to capture taxi and bus tracks Different from this

kind of public transportation research this paper aims toanalyze the flow patterns of car tourists at multiple attrac-tions It is difficult to install GPS device on each personal car+erefore we chose roadside monitoring devices to collecttraffic flow In addition urban road network includes ex-pressways ordinary roads and community roads It has alarge number of nodes and complex structure In order tomonitor each road segment many devices will need to bedeployed on roads So the key road segments and touristattractions were selected as the locations of monitoringpoints Five video devices were deployed at key roads andtwo video devices were deployed in parking lots +e labelsof monitoring devices are A1 A2 A5 A7 A10 P1 and P2(as shown in Figure 2)

+e detected tourist flow at each monitoring point isshown in Table 1

5 Methods

In this study a cloud-based database system was establishedto store traffic data after they were uploaded via 4G com-munication technologies +e collected data includes licenseplate numbers time of passage and labels for monitoringpoints In order to protect touristsrsquo private informationlicense plate numbers were changed into car IDs and onlythe registration places of cars were extracted After datacollection period the license plate numbers will be deletedfrom the database

+e proposed approach is outlined in Figure 3 +issection introduces the detailed steps for the mining offrequent flow patterns of car tourists at the microscale in atourist intradestination First car tourists were separatedfrom mixed traffic flow after analyzing the temporalcharacteristics of collected data Second spatial move-ment graphs of traffic flow were reconstructed for eachday Each movement graph is a connected and directedgraph where vertices are monitoring points and directededges are tourist flow between two monitoring points+en a frequent subgraph mining algorithm (gSpan) wasused to detect the flow patterns between road segmentsand tourist attractions Next in order to reduce thenumber of frequent flow patterns small overlappingsubgraphs were removed from the results Finally weanalyzed the spatial-temporal characteristics of flowpatterns intradestination

51 Preprocessing Source Data When collecting data it isinevitable that problem data will be collected +is can becaused by a problem with the device such as an aging ordamaged camera Additionally a license plate may beblurred blocked or damaged especially in bad weatherwhich can affect the efficiency of car recognition Fur-thermore it is challenging to identify some specialcharacters and confusing numbers on license plates +eabove issues can lead to data distortion To ensure theaccuracy and reliability of the analyzed results the col-lected data need to be processed at first +e rules ofprocessing are as follows

Journal of Advanced Transportation 3

22deg50prime0PrimeN

113deg50prime0PrimeE 114deg0prime0PrimeE 114deg10prime0PrimeE 114deg20prime0PrimeE 114deg30prime0PrimeE

113deg50prime0PrimeE 114deg0prime0PrimeE 114deg10prime0PrimeE 114deg20prime0PrimeE 114deg30prime0PrimeE

22deg40prime0PrimeN

22deg30prime0PrimeN

22deg50prime0PrimeN

22deg40prime0PrimeN

22deg30prime0PrimeN

Figure 1 +e location of dapeng island Shenzhen

Kilometers

0 05 1 2 3 4

N

LandscapeMonitoring pointRoad network

Figure 2 Distribution of monitoring points for tourist traffic flow

4 Journal of Advanced Transportation

(1) Deleted irrelevant fields +e primary informationsuch as car license plate number label of monitoringdevice and collection time is retained

(2) Removed null values and corrected license plateattributions

(3) Eliminated invalid data such as special car licenseplates and duplicate data

(4) Corrected confusing letters and numbers in caricense plates

52 Identifying Car Tourists In this study tourist flow wasdivided into three types One type consisted of commuterson the island the next consisted of intracity tourists (localweekend tourists in Shenzhen) and the last type consisted ofintercity tourists (leisure tourists from outside of Shenzhen)+e collected data came from the cameras on the roads andin parking lots +ese three types of flows were mixed to-gether in the collected data It is necessary to separate thedifferent types of traffic flows +e detailed steps are shownin Figure 4

+e data collected from the parking lots were classifiedinto intracity tourists and intercity tourists according to theregistration location of car license plates

As we know the number of trips made by tourists andlocal commuters is different Tourists only visit the islandoccasionally on weekends or holidays Local residents on theislandmight drive more times per week+erefore we took aweek as a unit and determined if a car had visited the islandduring that week If so this car would be tagged once +en

we counted the number of weeks a car appeared in eachmonth If the number of weeks visited exceeded a predefinedthreshold this car was considered to be a commuterOtherwise this car was considered to be a tourist

+erefore for the data collected from roads we first set athreshold manually based on the statistics of the number ofweeks visited in one month to distinguish island commutersand tourists Next we categorized visitors into intracity cartourists and intercity car tourists based on where theirlicense plates are registered Finally different types of trafficflows were separated and aggregated

In order to verify the usability and reliability of the proposedmethod the visiting characteristics of all cars were analyzed+eresult is shown in Figure 5 As can be seen the proportion ofweeks in one month in which the car appears is the highestreaching 8886+e percentage of cars appearing on the islandfor less than two weeks is 95 In addition we counted thepercentage of cars in the parking lots relative to the totalnumber of cars +e ratio is 786 which is close to the ratio of79 counted by Dapeng Transportation Bureau during peaktourist periods +e ratio of the number of weeks visited in onemonth (8886) is greater than the statistical result of Trans-portation Bureau (79) We think that the TransportationBureau only considered the tourists in parking lots +ereforethe threshold in this study is oneweek for extracting car tourists

53 Reconstructing the SpatialMovementGraphs In order tomodel the flow patterns of car tourists labeled direct graphsare used to construct movement relationships betweenmonitoring points In particular each vertex of the direct

Table 1 Car tourists detected at monitoring points

Label Location Detected car touristsA1 Kuinan Road Dapeng Community and Xinda CommunityA2 Dongchong Road Dongchong CommunityA5 Pengfei Road Dapeng community and Dapeng Ancient City Jiaochangwei Folk Village Dongshan TempleA7 Fumin Road Nanrsquoao CommunityA10 Nanxi Road Xichong CommunityP1 Jiaochangwei Park Lots Jiaochangwei Folk VillageP2 Ancient City Park Lots Dapeng Ancient City Dongshan Temple

Recognition devices forcars

Cloud databasesystem

Preprocess data

Extract the valid recordsExtract continuous data during

this period all monitoringdevice is normal operation

Analyze the time characteristicsof source data

Find the vehicles in parking lotduring the research period

Find the vehicles that visit theisland each week

Count the number of weeks thateach car appears in a month

Create graph dataset

Create the traffic spatialmovement graph from the

sequence data each day

Construct the sequence data ofmonitoring points that each

vehicle passes each day inchronological order

Separate car tourists andcommuting residents based onthe weekly threshold and the

historical vehicles in theparking lots

Frequent sub-graph mining

Remove small overlappingsub-graphs

Spatio-temporal charateristicsof vehicles

Analyze spatio-temporalcharacteristics and tourist flow

patterns

Spatial flow patterns of cartourists

Figure 3 +e detailed steps for mining frequent patterns in tourist flow

Journal of Advanced Transportation 5

graph corresponds to a monitoring point and each edgecorresponds to a directed connection between two moni-toring points passed by car tourists +e related definitionsare as follows

Definition 1 Label Graph Given a set of verticesV v1 v2 vk1113864 1113865 a set of edges connecting two vertex inV E eh (vi vj)|vi vj isin V1113966 1113967 a set of vertex labelsL(V) lb(vi)|forallvi isin V1113864 1113865 and a set of edge labelsL(E) lb(eh) |foralleh isin E1113864 1113865 eh is a direct edge that has the startvertex vi and end vertex vj then a label graph G is repre-sented as

G (V E L(V) L(E)) (1)

In the graph dataset the label of an edge is represented by alabel pair of twomonitoring points in the tourist visiting orderA graph consists of edges connecting the monitoring pointsvisited by each tourist in one day +e advantage of using thevertex label pair as an edge label is that it maintains thetemporal and spatial order of the two monitoring points thattourists pass through When mining a labeled graph thespatial-temporal order of monitoring points in results could be

preserved Using this representation the problem of findingfrequent flow patterns of car tourists becomes a problem ofmining frequent subgraphs in all movement graphs

54 Mining Frequent Subgraphs Some definitions related tofrequent subgraph mining are given below

Definition 2 Subgraph A subgraph g2 (V2 E2) of graphg1 is a graph in which V2 subeV1 E2 E1 cap (V2 times V1)

Definition 3 Support of a subgraph g Given a labeled graphdataset GD g1 g2 gn1113864 1113865 the support or frequency of asubgraph g is the percentage (or number) of graphs in GD

Definition 4 Frequent subgraph A frequent subgraph is agraph whose support is not less than a minimum supportthreshold +e minimum support threshold represents theminimum number of occurrences of a subgraph To obtainthe frequent patterns we chose to manually set the value ofminimum threshold

Definition 5 Mini Code First a depth-first search is per-formed on the graph to form a DFS (depth-first search) treeand then this tree is scanned +e order of the scanned edgesconstitutes a sequence called the DFS Code +e DFS Codesare sorted in a lexicographic order to find the smallest DFScode that uniquely identifies the graph +is minimum DFSCode is called the mini Code

After constructing the movement graphs of tourist flowthe subgraphminingmethodwas used to explore the frequentpatterns As demonstrated in related works there are manykinds of subgraph mining algorithms +e AGM (Apriori-based GraphMining) [41] can discover all frequent subgraphs(both connected and disconnected) in a graph database thatsatisfy a specific minimum support constraint+is algorithmuses an approach similar to Apriori and it requires 40minutes to 8 days to find the result subgraphs in a datasetcontaining 300 chemical compounds +e algorithm FSG(finding frequently occurring subgraphs in large graph) [42]adopts an adjacent representation of a graph and an edge-

100

80

60

40

20

0

Ratio

()

1 2 3 4 5Number of weeks a car appeared in one month

8886

Figure 5 Number of weeks visited by tourists in one month

e data collected from parking lots

e data collected from roads

Tourists

Inter-city car touristsoutside Shenzhen

Inter-city car tourists inShenzhen

Registration place

Number of weeksvisited

Commuterson the island

Figure 4 +e workflow of traffic flow separation

6 Journal of Advanced Transportation

growing strategy to find all of the connected subgraphs thatfrequently appear in a graph database+e results have shownthat FSG can be finished in 600 seconds gSpan [26] isdesigned to reduce or avoid the candidate generation and

pruning false positives used in AGM and FSG gSpan cancomplete the same task in 10 seconds Considering the ef-ficiency in this study gSpan was used to mine frequentsubgraphs in a directed graph dataset and then find the

Num

ber o

f car

s at e

ach

poin

t1000

02500

01000

02500

0

01000

5000

02500

00 50 100 150 200 250 300

Days

A1

A5

A7

A10

P1

P2

A2

(a)

Num

ber o

f car

s at e

ach

poin

t

A1

A5

A7

A10

P1

P2

A2

0 10 20 30 40 50 60 70Days

10000

10000

500

02500

01000

5003000200010002500

0

(b)

Figure 6 Daily traffic flow at each monitoring point

C1

C216000

14000

12000

10000

8000

6000

4000

2000

0

Traffi

c vol

ume

A5P1 A5P2 P2P1 A10A2 A1P1A1A10 A1P2 P1A10 A5A1 A2A1 A2P1A10A7A7P1 A10P2 A7A2 A1A7 A2P2 P2A7 A5A10A7A5 A5A2

Links

Kilometers

0 05 1 2 3 4

Figure 7 Flow map of tourism traffic

Table 2 Types of car tourist visiting the island

Types Number of records Number of graphsAll car tourists 361693 76Intracity car tourists in Shenzhen 240452 76Intercity car tourists outside Shenzhen 121241 75

Journal of Advanced Transportation 7

frequent subgraphs withmaximum length in the results as theflow patterns of car tourists +e algorithmic details of gSpanare available in reference [26] An extended instruction isgiven below +e method consisted of two steps (1) Findingthe frequent subgraphs using gSpan First the frequencies ofedges and nodes of all graphs was calculated Second thefrequencies were compared with the minimum supportthreshold and the infrequent edges and nodes were removed+en the remaining nodes and edges were reorderedaccording to the frequency And the frequency of each edgewas calculated again Finally the subgraphs of the restoredgraph were mined according to mini Code and it was de-termined whether the current DFS encoding is the minimumcode or not If so current edges were added to the results and

further attempts were made to add possible edges If not themining process was finished (2) Finding frequent subgraphswithmaximum length+ere are a large number of subgraphsin the obtained results and some subgraphs are partial graphsof the others +erefore this kind of subgraph was deleted bycomparing the labels of nodes and edges and the final resultswere the maximum frequent subgraphs

6 Results

In this section we first analyzed the spatial and temporaldistribution of tourist flows by statistical methods and maps+en we divided the tourist flows into intracity car touristsand intercity car tourists and used a frequent subgraphmining algorithm for pattern recognition Finally wesummarized the movement patterns of all tourists

61 Spatial-Temporal Characteristics of the TouristTraffic Flow

611 Temporal Characteristics Figure 6(a) depicts 295days of data before any preprocessing was applied Due to

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 74

C1

(a)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 74

C1

(b)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C1

C2

S = 73

(c)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C1

C2

S = 73

(d)

Figure 8 Flow patterns inferred from intracity tourists

Table 3 +e list of flow patterns inferred from intracity tourists

Figures PatternsFigure 8(a) (A1⟶ P2 andA1hArrP1)

Figure 8(b) A1⟶ P2⟶ P1⟶ A1Figure 8(c) A1⟶ A2 andA1hArrP1hArrP2Figure 8(d) A1⟶ A10 andA1hArrP1hArrP2

8 Journal of Advanced Transportation

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 62

C1

(a)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 62

C1

(b)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 61

C1

(c)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 57

C2

C1

(d)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 56

C1

C2

(e)

Figure 9 Flow patterns inferred from intercity tourists

Journal of Advanced Transportation 9

the failure of device communication or power the con-structed movements graphs may be incomplete whichwould lead to the loss of frequent subgraphs +ereforewe selected 76 days of valid data as the dataset for frequentpattern mining Figure 6(b) shows that (1) the touristflows at A2 and A10 near the sandy beaches has similartemporal characteristics +eir peak hours of touristvolume occur both on holidays and weekends while onweekdays the flow curves are relatively stable (2) P1 andP2 are located in two parking lots Although there aredifferences in the tourist volumes the trends are similar(3) For the two entrances to the tourist attractions theaverage daily volume of tourist traffic at A1 is 07 timesthat of A5 After sorting the volume of tourist trafficPengcheng Community has the most car tourists followedby two tourist attractions with a sandy beach (ieDongchong and Xichong Community) and the leastvisited attraction is Nanao Community

612 Spatial Characteristics Figure 7 shows the flowmap ofthe aggregated tourists transferring in multiple monitoringpoint pairs +e tourist volumes are depicted and sorted inthe left bar chart in the figure and the link thickness rep-resents the traffic volume As can be seen the links with thelargest traffic volume are A5P1 A5P2 P2P1 A10A2A1P1 and A1P2 (A5P1 is a simplified form for A5harrP1which represents the forth and back tourist flow between A5and P1) +ese links could be divided into two areasC1(A1 A5 P1 P2) and C2(A2 A10) Further inspectionreveals that the number of tourists transferring between P1and P2 in area C1 is close to the value between A2 and A10in area C2 but the number of tourists visiting P1 and P2 is312 times that of A2 and A10

62 Analyses of the Detected Flow Patterns +e flow map isintuitive but it suffers from serious visual clutter and it isdifficult to read because of overlapping flows It can also beseen in Figure 7 that the flow map only shows the trafficvolumes between the monitoring points in the touristdestination But it could not express the spatial transferdirections of car tourists +erefore these facts motivate usto find a new approach to solve these problems +is sectiondescribes the use of frequent subgraph mining algorithm toexplore the spatial flow patterns with directions in touristtraffic and to obtain the maximum frequent patterns fromthe daily tourist movement graphs according to a predefinedminimum support threshold By using the identificationmethod of car tourists introduced in Section 5 two types of

data were extracted and used for the subsequent flow patternmining (listed in Table 2)

621 Flow Patterns of Intracity Tourists in Shenzhen+e experiments were conducted with the dataset of intracitycar tourists in Shenzhen (as shown in Table 2) +e mini-mum support threshold for gSpan was set to 73 +e spatialflow patterns are represented in Figure 8 and listed in Ta-ble 3+e inferred patterns could be divided into two groupsOne group consists of the patterns shown in Figures 8(a) and8(b) which show the tourist spatial transfer process in areaC1 +e other group contains the patterns shownFigures 8(c) and 8(d) which represent the tourists trans-ferring between area C1 and area C2 +e two groupsdemonstrate that the intracity car tourists who arrived at P1and P2 preferred to choose Kuinan Road at A1 instead ofPengfei Road at A5 +e difference between Figures 8(a) and8(b) is the existence of the circle tourist flow +e differencebetween Figures 8(c) and 8(d) is the presence of tourists flowback and forth+e tourist flow from A1 to A2 have only onedirection One of the reasons may be that this study failed tofind a suitable monitoring point on Pingxi Road resulting inthe loss of directionality for this part of tourist flow

622 Flow Patterns of Intercity Car Tourists Figure 9 showsflow patterns of intercity car tourists+e dataset used in thissection is composed of intercity tourists (as shown in Ta-ble 2) +e minimum support threshold for gSpan was set to56 +e result flow patterns are sorted by the supportthreshold as listed in Table 4

From these patterns the following could be con-cluded (1) +e most frequent patterns are shown inFigures 9(a)ndash9(c) +e frequencies of the discovered flowpatterns are 62 62 and 61 respectively +ese threepatterns describe the preference of intercity tourists forarea C1+is kind of tourists first visited one of attractionsnear a parking lot P1 or P2 and then visited anotherattraction or just visited one tourist attraction near P1 orP2 (2)+e above three patterns are different from those ofthe intracity tourists in Shenzhen +ere is no circle tourbetween P1 P2 and A1 +e reason for this is that sometourists chose to continue drive to area C2 (3) +eminimum support threshold of these two patterns asshown in Figures 9(d) and 9(e) is significantly smallerthan that shown in Figures 9(a)ndash9(e) but it reflects thespatial transfer preferences of tourists from neighboringcities in areas C1 and C2 In Figure 9(d) the car touriststended to drive directly from A5 to A10 after visiting

Table 4 +e list of the flow patterns inferred from intercity tourists

Figures PatternsFigure 9(a) A1⟶ P2⟶ P1Figure 9(b) A1hArrP1hArrP2Figure 9(c) A1⟶ P1 andA1⟶ P2Figure 9(d) A5⟶ P1⟶ A10Figure 9(e) A10⟶ P1hArrP2 andA10hArrA2

10 Journal of Advanced Transportation

attractions near P1 As shown in Figure 9(e) the cartourists that flowed between A10 and A2 went back to P1and changed car parking lots between P1 and P2

However there is no tourist flow to A1 and A5 +e reasonfor this may be that the traffic flow of Pingxi Road (theexpressway in and out of the peninsula) was not

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(a)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(b)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(c)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(d)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(e)

Figure 10 Flow patterns of all tourists

Journal of Advanced Transportation 11

monitored +is part of traffic flow could directly arriveand then leave from A10

623 Flow Patterns of All Tourists +is section presents anexploration of the spatial flow patterns of all car tourists +eresults are shown in Figure 10 +e minimum supportthreshold for gSpan was set to 73 which means that 73 of the76 graphs contained the discovered flow pattern +e resultpatterns are listed in Table 5

All of the patterns shown in Figure 10 have touristflow between area C1 and area C2 +is is consistent withthe trend in the flow map (shown in Figure 7) As thepicture shows A1 is the main entrance for tourists toenter and exit Dapeng Island Some of the car touristsdrove to P1 or P2 and the rest flowed to A10 or A2Figures 10(a)ndash10(c) show the directions of tourist flowsbetween the Dapeng Community Pengcheng Commu-nity and Xichong Community Figure 10(d) shows thedirections of tourist flows between Dapeng CommunityPengcheng Community and Dongchong CommunityFigure 10(e) represents only the directions of tourist flowsbetween Dapeng Community and Dongchong Commu-nity Furthermore the directions of tourist flows in areaC1 and area C2 or between area C1 and area C2 aredifferent Taking Figures 10(a)ndash10(c) as examples al-though the orders of P1 and P2 that are accessed from A1have a lack of regularity they have their own charac-teristics when considering the accompanying paths toA10 +ese examples indicate that if there is a tourist flowbetween A1 and A10 it could be divided into two cases Inone case some of the tourists returned directly and in theother case some of the tourists flowed to P1 In the firstcase either the tourist flow passing by A1 first accessed P2and then visited P1 or both P1 and P2 had tourists at thesame time In the second case some of the touristsreturned from A10 visited P1 and P2 and then left theisland

7 Conclusions and Discussions

71 Conclusions In order to facilitate the management oftourist traffic flow the car tourists in Dapeng Island weretaken as a research case +e experimental analysis used realdata captured by video devices in the research area Due to thelack of suitable device installation locations the capturedpicture from the video device had a certain distance from theroad+e catch rate of the touristsrsquo cars was low However thedetailed time-series data in one day could be collectedCompared with the manual survey the collected data was

improved in terms of reliability and richness A day waschosen as the time unit for frequent pattern mining Afterselecting the available data we divided the data of cars intointracity and intercity tourists Next the license plate datawere transformed into movement graphs according to thevisited location sequence between multiple monitoringpoints +e intricate flow patterns of the car tourists werediscovered by gSpan algorithm which had the best perfor-mance in terms of the quality of the results and the executiontime and which had already proven to be efficient for frequentsubgraph mining +e conclusions are as follows

(1) +e car tourists had obvious preferences in the se-lection of trip time and tourist attractions (shown inFigures 6 and 7) In terms of time the curves oftourist flow at each monitoring point were similar+ere were a large number of car tourists at variousattractions on holidays but the volume of car touristswas relatively lower on weekdays +e same types ofattractions had the similar trends in tourist flowsuch as the two attractions with a sandy beach(Dongchong Community and Xichong Community)and the ancient city and cultural attractions (DapengAncient City and Dongshan Temple) In terms ofspace the attractions that are close to the entrance ofa scenic area and rich in tourist resources were morepopular with tourists However due to the terrainbarrier in the scenic area the traffic conditions af-fected the movements of tourists between multipleattractions

(2) Different types of car tourists had similar spatialchoices in scenic area (shown in Figures 8ndash10) Forexample different types of tourists had flow patternsthat described the movements in one area (as shownin Figures 8(a) 8(b) and 9(a)ndash9(c)) and the move-ments between different areas (as shown inFigures 8(c) 8(d) 9(d) and 9(e)) +e intercitytourists and intracity tourists had different choices inscenic spots +e intercity tourists would takemultidestination trips instead of single destinationtrips in the same type of attractions As can be seen inFigures 8 and 9 there are the two sandy beach at-tractions in areaC2 the intracity tourists tend to visitone of them while the intercity tourists would visitboth attractions Specifically to save time andmoney intercity tourists would visit multiple at-tractions in one trip instead of making multiple tripsAdditionally another difference was that there wasno circlet in area C1 for intercity tourists and notraffic flow between A2 and A10 for intracity tourists

Table 5 +e list of flow patterns inferred from all tourists

Figures PatternsFigure 10(a) (A1⟶ A10⟶ A1) and (A10⟶ P1hArrP2) and (P1⟶ A1)

Figure 10(b) (A1hArrA10 andA1⟶ P1 andA1⟶ P2)

Figure 10(c) (A1hArrP10 andA1⟶ P2⟶ P1⟶ A1)

Figure 10(d) (A1hArrP1hArrP2 andA1hArrA10hArrA2)

Figure 10(e) (A1hArrP1hArrP2 andA1⟶ A2)

12 Journal of Advanced Transportation

(3) Although the patterns depicted on the map lookcomplicated and messy after the patterns are con-verted to rules they become clear In pattern mapsonly Figures 8(b) 9(a) and 9(d) show a clear uni-directionality +e rest is complex and is difficult tocompare As we can see from Figure 8(a) theintracity tourists passing point A1 could be dividedinto two groups one group flowed to parking lot P1and then leaved the scenic area from P1 +e othergroup flowed to parking lot P2 Two groups havesimultaneity in tourist routes So we could convertthe flow patterns to rules and use ldquoandrdquo to illustratethe simultaneity of different routes in the samepattern (as shown in Tables 3ndash5) By this way allpatterns can be applied to traffic control system forregional tourism

(4) Large primary attractions are more attractive thansmaller secondary attractions Looking from thearrow on the pattern maps tourists always visit areaC1 first and then select area C2 +e main reason isthat area C1 has abundant tourism resources anddiverse tourism activities For example there arecultural attractions (eg Dapeng Ancient City andDongshan Temple) sandy beach entertainmentsand large parking lots for tourists in area C1 +esefactors are also often considered in the evaluation ofthe importance of attractions in tourism network

(5) +e tourists tended to park their cars in an easy-to-access place even if the visited attractions arechanged as shown in Figures 8(a)ndash8(d) 9(b) and9(e) Again here we take area C1 as an example +edistance between the Dapeng Ancient City andDongshan Temple is about 1 km On the patternmaps we can see that there are two-way arrowspointing to the two parking lots P1 and P2 +isindicates that the car tourists tended to park theircars to the nearest parking lot so that they could pickthem up when the tour destination is changed

72 Discussions Tourist flow is the key to traffic man-agement in tourism destinations and it affects the de-velopment of tourism on an island and the experience oftourists +e recent development of transport technolo-gies has shown that traffic flow data will be increasinglycollected and it will be available for data analysis+erefore advanced data analytics should be used tointerpret and depict the complex movements of cartourists +e proposed approach in this study wasintended to find (i) the statistical summaries of the spatial-temporal characteristics of car tourists in the researcharea helping to discover patterns from the mass carlicense plate data (ii) the flow patterns of intercity andintracity tourists helping to illustrate the different pref-erences of the two types of car tourists and (iii) the mostfrequent patterns of all tourists helping to identify the lawof tourist movement and make efficient policy for themanagement of tourist traffic flow +e presented ap-proach enriched the analytical methodology of tourist

traffic flow and suggested a shift from the conventionaland complicated paper or computer interview-basedmethod to a dynamic flow graph-based method Fur-thermore we have shown how transportation data pro-vides hard-to-obtain insights and quantitative results fortourists

In order to illustrate the law of spatial movement oftourists related studies have proposed a variety of macroflow patterns [43ndash45] For example in 2008 McKercher [46]proposed 11 prominent route styles in urban destinations+e macropatterns retain only the main components andsimplify the details and are often used in tourism man-agement to guide destination development Compared withmodeling the flow patterns of tourists at the macrolevelmodeling tourist flow at the microscale is more complicatedLew and McKercher [17] noted that it is a challenge tobalance model effectiveness and usability +e reason is thatsimple patterns may not provide enough details for use andcomplex patterns may be difficult to interpret and apply Inthis study we used the video devices installed at key nodes ofroad network to collect tourist traffic flows and used thefrequent subgraph mining algorithm to discover flow pat-terns at the microscale +e extracted patterns can beconverted into rules and applied to the traffic control systemfor the management of regional tourists +e difficulty offinding and applying patterns at microscale could beovercome by this way

During the peak period of tourism the number of cartourists in the scenic area increases sharply It is easy toresult in road congestion and uneven distribution oftourists between attractions With the use of traffic flowdata the daily monthly and seasonal characteristics oftourists can be analyzed the future tourist flow can bepredicted and the flow patterns can be obtained by datamining methods +us the traffic management departmentcan effectively control the tourist traffic flow and thetourism department can develop attractive tourism prod-ucts to achieve a spatial balance of the distribution oftourists reduce traffic congestion and harmful gas emis-sions and weaken the impact of tourism environment andhuman body

Inside a scenic area attractions form a complex net-work of tourist flow due to the frequent spatial interactionEach attraction is both the origin and destination of cartourists Due to the differences in attractiveness degree ofdevelopment and convenience of transportation touristshave shown special preferences when choosing attractionsand trip routes Accordingly effective identification ofthese preferences will be beneficial to the development oftourist market and will also helpful for tour planners tounderstand how tourists see the spatial connection ofmultiple attractions However traditional manual surveysare time-consuming and laborious and the amount of dataobtained is small It is difficult to reveal tourist preferencesby this way Frequent subgraph mining algorithms providea desirable method for identifying the spatial preferences oftourists +is kind of method could perform well under thesupport of a large amount of movement data betweenattractions

Journal of Advanced Transportation 13

+is study has several limitations Firstly due to lack ofpower supply facilities it was unable to collect the traffic datafor each day When applying the model to the actual controlof tourist traffic flow it is necessary to further co-operatewith the traffic management department to obtain com-prehensive traffic flow data Secondly this study focused onthe mining of flow patterns the influencing factors behindthe identified patterns were not further analyzed As men-tioned by Lew andMcKercher [17] factors related to touristsand destinations can affect how tourists move or travel in adestination +ese factors include family composition in-come and valid information obtained before travelling It isdifficult to obtain these factors by relying only on the trafficflow data collected in the traffic monitoring system+erefore in future work field investigation will benecessary

Data Availability

As the data also form part of an ongoing study the raw dataneeded to reproduce these findings cannot be shared at thistime

Conflicts of Interest

+e authors declare no conflicts of interest

Acknowledgments

+e authors gratefully acknowledge the support of this re-search from the National Science Foundation of China(41701167) and the Basic Research Project of Shenzhen City(JCYJ20170307164104491 and JCYJ20190812171419161)

References

[1] D W Eby and L J Molnar ldquoImportance of scenic byways inroute choice a survey of driving tourists in the United StatesrdquoTransportation Research Part A (Policy and Practice) vol 36no 2 pp 95ndash106 2003

[2] B Wang Z Yang F Han and H Shi ldquoCar tourism inxinjiang the mediation effect of perceived value and touristsatisfaction on the relationship between destination imageand loyaltyrdquo Sustainability vol 9 no 1 pp 1ndash16 2016

[3] X M Zhang and Q Y Pen ldquoA study on tourism informationand self-driving tour trafficrdquo in Proceedings of the First Inter-national Conference on Transportation Engineering SouthwestJiaotong University Chengdu China July 2007

[4] China National Tourism Administration 7e Yearbook ofChina Tourism Statistics 2016 China Travel amp Tourism PressBeijing China 2017

[5] M A Mohamad Toha and H N Ismail ldquoA heritage tourismand tourist flow pattern a perspective on traditional versusmodern technologies in tracking the touristsrdquo InternationalJournal of Built Environment and Sustainability vol 2 no 2pp 85ndash92 2015

[6] K Siła-Nowicka and A S Fotheringham ldquoCalibrating spatialinteraction models from gps tracking data an example ofretail behaviourrdquo Computers Environment and Urban Sys-tems vol 74 pp 136ndash150 2019

[7] F Mao M Ji and T Liu ldquoMining spatiotemporal patterns ofurban dwellers from taxi trajectory datardquo Frontiers of EarthScience vol 10 no 2 pp 205ndash221 2016

[8] L J Zheng D Xia X Zhao and W L Liu ldquoMining tripattractive areas using large-scale taxi trajectory datardquo inProceedings of the 2017 IEEE International Symposium onParallel and Distributed Processing with Applications and 2017IEEE International Conference on Ubiquitous Computing andCommunications pp 1217ndash1222 ISPAIUCC GuangzhouChina December 2017

[9] I Rami and M O Shafiq ldquoDetecting taxi movements usingrandom swap clustering and sequential pattern miningrdquoJournal of Big Data vol 6 no 39 pp 1ndash26 2019

[10] G Kautsar and S Akbar ldquoTrajectory pattern mining using se-quential pattern mining and k-means for predicting future lo-cationrdquo Journal of Physics Conference Series vol 801 no 1 2017

[11] D Guo X Zhu H Jin P Gao and C Andris ldquoDiscoveringspatial patterns in origin-destination mobility datardquo Trans-actions in GIS vol 16 no 3 pp 411ndash429 2012

[12] J D Mazimpaka and S Timpf ldquoA visual and computationalanalysis approach for exploring significant locations and timeperiods along a bus routerdquo in Proceedings of the 9th ACMSIGSPATIAL International Workshop on ComputationalTransportation Science pp 43ndash48 San Francisco CA USAOctober 2016

[13] H Faroqi M Mesbah and J Kim ldquoSpatial-temporal simi-larity correlation between public transit passengers usingsmart card datardquo Journal of Advanced Transportationvol 2017 Article ID 1318945 14 pages 2017

[14] Z Wang Y Hu P Zhu Y Qin and L Jia ldquoRing aggregationpattern of metro passenger trips a study using smart carddatardquo Physica A Statistical Mechanics and Its Applicationsvol 491 pp 471ndash479 2018

[15] X Zhao X Lu Y Liu J Lin and J An ldquoTourist movementpatterns understanding from the perspective of travel partysize using mobile tracking data a case study of Xirsquoan ChinardquoTourism Management vol 69 pp 368ndash383 2018

[16] M Versichele L De Groote M Claeys Bouuaert T NeutensI Moerman and N Van De Weghe ldquoPattern mining intourist attraction visits through association rule learning onbluetooth tracking data a case study of ghent BelgiumrdquoTourism Management vol 44 pp 67ndash81 2014

[17] A Lew and B Mckercher ldquoModeling tourist movementsrdquoAnnals of Tourism Research vol 33 no 2 pp 403ndash423 2006

[18] A Birenboim S Anton-Clave A P Russo and N ShovalldquoTemporal activity patterns of theme park visitorsrdquo TourismGeographies vol 15 no 4 pp 601ndash619 2013

[19] H J Yun and M H Park ldquoTime-space movement of festivalvisitors in rural areas using a smart phone applicationrdquo AsiaPacific Journal of Tourism Research vol 20 no 11pp 1246ndash1265 2015

[20] S Kang ldquoAssociations between space-time constraints andspatial patterns of travelsrdquo Annals of Tourism Researchvol 61 pp 127ndash141 2016

[21] C Comito D Falcone and D Talia ldquoMining human mobilitypatterns from social geo-tagged datardquo Pervasive and MobileComputing vol 33 pp 91ndash107 2016

[22] Y T Zheng Y Li Z J Zha and T S Chua ldquoMining travelpatterns fromGPS-tagged photosrdquo in Lecture Notes in ComputerScience K T LeeWH Tsai H YM Liao T Chen JWHsiehand C C Tseng Eds Springer Berlin Germany 2011

[23] K M Borgwardt H P Kriegel and P WackersreutherldquoPattern mining in frequent dynamic subgraphsrdquo in Pro-ceedings of the 6th IEEE International Conference on Data

14 Journal of Advanced Transportation

Mining (ICDM 2006) Hong Kong China IEEE December2006

[24] D J Cook and L B HolderMining Graph Data Wiley PressHoboken NJ USA 2015

[25] C Jiang F Coenen andM Zito ldquoFrequent sub-graph miningon edge weighted graphsrdquo in Lecture Notes in ComputerScience T Bach Pedersen M K Mohania and A M TjoaEds Springer Berlin Germany pp 77ndash88 2010

[26] X Yan and J Han ldquogSpan graph-based substructure patternminingrdquo in Proceedings of the 2002 IEEE InternationalConference on Data Mining ICDMrsquo02 pp 721ndash724 IEEEComputer Society Maebashi City Japan December 2002

[27] C Borgelt and M R Berthold ldquoMining molecular fragmentsfinding relevant substructures of moleculesrdquo in Proceedings ofthe 2002 IEEE International Conference on Data MiningICDMrsquo02 pp 51ndash58 IEEE Computer Society Maebashi CityJapan December 2002

[28] J Huan W Wang and J Prins ldquoEfficient mining of frequentsubgraphs in the presence of isomorphismrdquo in Proceedings ofthe 7ird IEEE International Conference on Data MiningICDMrsquo03 pp 549ndash552 IEEE Computer Society MelbourneFL USA November 2003

[29] S Nijssen ldquoA quickstart in frequent structure mining canmake a differencerdquo in Proceedings of the Tenth ACM SIGKDDInternational Conference on Knowledge Discovery and DataMining KDDrsquo04 pp 647ndash652 ACM Seattle WA USAAugust 2004

[30] J Huan W Wang J Prins and J Yang ldquoSPIN miningmaximal frequent subgraphs from graph databasesrdquo in Pro-ceedings of the Tenth ACM SIGKDD International Conferenceon Knowledge Discovery and Data Mining Seattle WA USAACM August 2004

[31] M Worlein T Meinl I Fischer and M ampPhilippsen ldquoAquantitative comparison of the subgraph miners MoFagSpan FFSM and gastonrdquo in Proceedings of the 9th EuropeanConference on Principles and Practice of Knowledge Discoveryin Databases PKDDrsquo05 Springer Porto Portugal pp 392ndash403 2005

[32] X Yan H Cheng J Han and P ampYu ldquoMining significantgraph patterns by leap searchrdquo in Proceedings of the 2008ACM SIGMOD pp 433ndash444 Vancouver Canada June 2008

[33] L T +omas S R Valluri and K Karlapalem ldquoMarginmaximal frequent subgraph miningrdquo in Proceedings of theSixth International Conference on Data Mining (ICDMrsquo06)Hong Kong China December 2006

[34] V Ingalalli D Ienco and P Poncelet ldquoMining frequentsubgraphs in multigraphsrdquo Information Sciences vol 451-452pp 50ndash66 2018

[35] L Min Z C Wang J Liang and X R Yuan ldquoOD-wheelvisual design to explore OD patterns of a central regionrdquo inProceedings of the Visualization Symposium 2015 HangzhouChina April 2015

[36] B Jenny D M Stephen I Muehlenhaus and B E MarstonldquoDesign principles for origin-destination flowmapsrdquo Car-tography and Geographic Information Science vol 45 no 1pp 1ndash15 2016

[37] S Kim S Jeong I Woo Y Jang R Maciejewski andD S Ebert ldquoData flow analysis and visualization for spa-tiotemporal statistical data without trajectory informationrdquoIEEE Transactions on Visualization and Computer Graphicsvol 24 no 3 pp 1287ndash1300 2018

[38] J Wood J Dykes and A Slingsby ldquoVisualisation of originsdestinations and flows with OD mapsrdquo 7e CartographicJournal vol 47 no 2 pp 117ndash129 2010

[39] M Zillinger ldquoTourist routes a time-geographical approach onGerman car-tourists in Swedenrdquo Tourism Geographies vol 9no 1 pp 64ndash83 2007

[40] S Page and J Connell ldquoExploring the spatial patterns of car-based tourist travel in loch lomond and trossachs nationalpark Scotlandrdquo Tourism Management vol 29 no 3pp 561ndash580 2008

[41] A Inokuchi T Washio and H ampMotoda ldquoAn apriori-basedalgorithm for mining frequent substructures from graphdatardquo in Proceedings of the 4th European Conference onPrinciples of Data Mining and Knowledge DiscoveryPKDDrsquo00 pp 13ndash23 Lyon France September 2000

[42] M Kuramochi and G Karypis ldquoFrequent subgraph discov-eryrdquo in Proceedings of the 2001 IEEE International Conferenceon Data Mining ICDMrsquo01 pp 313ndash320 IEEE ComputerSociety San Jose CA USA August 2001

[43] M Oppermann ldquoA model of travel itinerariesrdquo Journal ofTravel Research vol 33 no 4 pp 57ndash61 1995

[44] N Shoval and A Raveh ldquoCategorization of tourist attractionsand the modeling of tourist cities based on the co-plotmethod of multivariate analysisrdquo Tourism Managementvol 25 no 6 pp 741ndash750 2004

[45] G Lau and B McKercher ldquoUnderstanding tourist movementpatterns in a destination a gis approachrdquo Tourism andHospitality Research vol 9 no 1 pp 39ndash49 2007

[46] B McKercher and G Lau ldquoMovement patterns of touristswithin a destinationrdquo Tourism Geographies vol 10 no 3pp 355ndash374 2008

Journal of Advanced Transportation 15

Page 4: DiscoveringtheGraph-BasedFlowPatternsofCarTouristsUsing ...downloads.hindawi.com/journals/jat/2020/4795830.pdfResearchArticle DiscoveringtheGraph-BasedFlowPatternsofCarTouristsUsing

22deg50prime0PrimeN

113deg50prime0PrimeE 114deg0prime0PrimeE 114deg10prime0PrimeE 114deg20prime0PrimeE 114deg30prime0PrimeE

113deg50prime0PrimeE 114deg0prime0PrimeE 114deg10prime0PrimeE 114deg20prime0PrimeE 114deg30prime0PrimeE

22deg40prime0PrimeN

22deg30prime0PrimeN

22deg50prime0PrimeN

22deg40prime0PrimeN

22deg30prime0PrimeN

Figure 1 +e location of dapeng island Shenzhen

Kilometers

0 05 1 2 3 4

N

LandscapeMonitoring pointRoad network

Figure 2 Distribution of monitoring points for tourist traffic flow

4 Journal of Advanced Transportation

(1) Deleted irrelevant fields +e primary informationsuch as car license plate number label of monitoringdevice and collection time is retained

(2) Removed null values and corrected license plateattributions

(3) Eliminated invalid data such as special car licenseplates and duplicate data

(4) Corrected confusing letters and numbers in caricense plates

52 Identifying Car Tourists In this study tourist flow wasdivided into three types One type consisted of commuterson the island the next consisted of intracity tourists (localweekend tourists in Shenzhen) and the last type consisted ofintercity tourists (leisure tourists from outside of Shenzhen)+e collected data came from the cameras on the roads andin parking lots +ese three types of flows were mixed to-gether in the collected data It is necessary to separate thedifferent types of traffic flows +e detailed steps are shownin Figure 4

+e data collected from the parking lots were classifiedinto intracity tourists and intercity tourists according to theregistration location of car license plates

As we know the number of trips made by tourists andlocal commuters is different Tourists only visit the islandoccasionally on weekends or holidays Local residents on theislandmight drive more times per week+erefore we took aweek as a unit and determined if a car had visited the islandduring that week If so this car would be tagged once +en

we counted the number of weeks a car appeared in eachmonth If the number of weeks visited exceeded a predefinedthreshold this car was considered to be a commuterOtherwise this car was considered to be a tourist

+erefore for the data collected from roads we first set athreshold manually based on the statistics of the number ofweeks visited in one month to distinguish island commutersand tourists Next we categorized visitors into intracity cartourists and intercity car tourists based on where theirlicense plates are registered Finally different types of trafficflows were separated and aggregated

In order to verify the usability and reliability of the proposedmethod the visiting characteristics of all cars were analyzed+eresult is shown in Figure 5 As can be seen the proportion ofweeks in one month in which the car appears is the highestreaching 8886+e percentage of cars appearing on the islandfor less than two weeks is 95 In addition we counted thepercentage of cars in the parking lots relative to the totalnumber of cars +e ratio is 786 which is close to the ratio of79 counted by Dapeng Transportation Bureau during peaktourist periods +e ratio of the number of weeks visited in onemonth (8886) is greater than the statistical result of Trans-portation Bureau (79) We think that the TransportationBureau only considered the tourists in parking lots +ereforethe threshold in this study is oneweek for extracting car tourists

53 Reconstructing the SpatialMovementGraphs In order tomodel the flow patterns of car tourists labeled direct graphsare used to construct movement relationships betweenmonitoring points In particular each vertex of the direct

Table 1 Car tourists detected at monitoring points

Label Location Detected car touristsA1 Kuinan Road Dapeng Community and Xinda CommunityA2 Dongchong Road Dongchong CommunityA5 Pengfei Road Dapeng community and Dapeng Ancient City Jiaochangwei Folk Village Dongshan TempleA7 Fumin Road Nanrsquoao CommunityA10 Nanxi Road Xichong CommunityP1 Jiaochangwei Park Lots Jiaochangwei Folk VillageP2 Ancient City Park Lots Dapeng Ancient City Dongshan Temple

Recognition devices forcars

Cloud databasesystem

Preprocess data

Extract the valid recordsExtract continuous data during

this period all monitoringdevice is normal operation

Analyze the time characteristicsof source data

Find the vehicles in parking lotduring the research period

Find the vehicles that visit theisland each week

Count the number of weeks thateach car appears in a month

Create graph dataset

Create the traffic spatialmovement graph from the

sequence data each day

Construct the sequence data ofmonitoring points that each

vehicle passes each day inchronological order

Separate car tourists andcommuting residents based onthe weekly threshold and the

historical vehicles in theparking lots

Frequent sub-graph mining

Remove small overlappingsub-graphs

Spatio-temporal charateristicsof vehicles

Analyze spatio-temporalcharacteristics and tourist flow

patterns

Spatial flow patterns of cartourists

Figure 3 +e detailed steps for mining frequent patterns in tourist flow

Journal of Advanced Transportation 5

graph corresponds to a monitoring point and each edgecorresponds to a directed connection between two moni-toring points passed by car tourists +e related definitionsare as follows

Definition 1 Label Graph Given a set of verticesV v1 v2 vk1113864 1113865 a set of edges connecting two vertex inV E eh (vi vj)|vi vj isin V1113966 1113967 a set of vertex labelsL(V) lb(vi)|forallvi isin V1113864 1113865 and a set of edge labelsL(E) lb(eh) |foralleh isin E1113864 1113865 eh is a direct edge that has the startvertex vi and end vertex vj then a label graph G is repre-sented as

G (V E L(V) L(E)) (1)

In the graph dataset the label of an edge is represented by alabel pair of twomonitoring points in the tourist visiting orderA graph consists of edges connecting the monitoring pointsvisited by each tourist in one day +e advantage of using thevertex label pair as an edge label is that it maintains thetemporal and spatial order of the two monitoring points thattourists pass through When mining a labeled graph thespatial-temporal order of monitoring points in results could be

preserved Using this representation the problem of findingfrequent flow patterns of car tourists becomes a problem ofmining frequent subgraphs in all movement graphs

54 Mining Frequent Subgraphs Some definitions related tofrequent subgraph mining are given below

Definition 2 Subgraph A subgraph g2 (V2 E2) of graphg1 is a graph in which V2 subeV1 E2 E1 cap (V2 times V1)

Definition 3 Support of a subgraph g Given a labeled graphdataset GD g1 g2 gn1113864 1113865 the support or frequency of asubgraph g is the percentage (or number) of graphs in GD

Definition 4 Frequent subgraph A frequent subgraph is agraph whose support is not less than a minimum supportthreshold +e minimum support threshold represents theminimum number of occurrences of a subgraph To obtainthe frequent patterns we chose to manually set the value ofminimum threshold

Definition 5 Mini Code First a depth-first search is per-formed on the graph to form a DFS (depth-first search) treeand then this tree is scanned +e order of the scanned edgesconstitutes a sequence called the DFS Code +e DFS Codesare sorted in a lexicographic order to find the smallest DFScode that uniquely identifies the graph +is minimum DFSCode is called the mini Code

After constructing the movement graphs of tourist flowthe subgraphminingmethodwas used to explore the frequentpatterns As demonstrated in related works there are manykinds of subgraph mining algorithms +e AGM (Apriori-based GraphMining) [41] can discover all frequent subgraphs(both connected and disconnected) in a graph database thatsatisfy a specific minimum support constraint+is algorithmuses an approach similar to Apriori and it requires 40minutes to 8 days to find the result subgraphs in a datasetcontaining 300 chemical compounds +e algorithm FSG(finding frequently occurring subgraphs in large graph) [42]adopts an adjacent representation of a graph and an edge-

100

80

60

40

20

0

Ratio

()

1 2 3 4 5Number of weeks a car appeared in one month

8886

Figure 5 Number of weeks visited by tourists in one month

e data collected from parking lots

e data collected from roads

Tourists

Inter-city car touristsoutside Shenzhen

Inter-city car tourists inShenzhen

Registration place

Number of weeksvisited

Commuterson the island

Figure 4 +e workflow of traffic flow separation

6 Journal of Advanced Transportation

growing strategy to find all of the connected subgraphs thatfrequently appear in a graph database+e results have shownthat FSG can be finished in 600 seconds gSpan [26] isdesigned to reduce or avoid the candidate generation and

pruning false positives used in AGM and FSG gSpan cancomplete the same task in 10 seconds Considering the ef-ficiency in this study gSpan was used to mine frequentsubgraphs in a directed graph dataset and then find the

Num

ber o

f car

s at e

ach

poin

t1000

02500

01000

02500

0

01000

5000

02500

00 50 100 150 200 250 300

Days

A1

A5

A7

A10

P1

P2

A2

(a)

Num

ber o

f car

s at e

ach

poin

t

A1

A5

A7

A10

P1

P2

A2

0 10 20 30 40 50 60 70Days

10000

10000

500

02500

01000

5003000200010002500

0

(b)

Figure 6 Daily traffic flow at each monitoring point

C1

C216000

14000

12000

10000

8000

6000

4000

2000

0

Traffi

c vol

ume

A5P1 A5P2 P2P1 A10A2 A1P1A1A10 A1P2 P1A10 A5A1 A2A1 A2P1A10A7A7P1 A10P2 A7A2 A1A7 A2P2 P2A7 A5A10A7A5 A5A2

Links

Kilometers

0 05 1 2 3 4

Figure 7 Flow map of tourism traffic

Table 2 Types of car tourist visiting the island

Types Number of records Number of graphsAll car tourists 361693 76Intracity car tourists in Shenzhen 240452 76Intercity car tourists outside Shenzhen 121241 75

Journal of Advanced Transportation 7

frequent subgraphs withmaximum length in the results as theflow patterns of car tourists +e algorithmic details of gSpanare available in reference [26] An extended instruction isgiven below +e method consisted of two steps (1) Findingthe frequent subgraphs using gSpan First the frequencies ofedges and nodes of all graphs was calculated Second thefrequencies were compared with the minimum supportthreshold and the infrequent edges and nodes were removed+en the remaining nodes and edges were reorderedaccording to the frequency And the frequency of each edgewas calculated again Finally the subgraphs of the restoredgraph were mined according to mini Code and it was de-termined whether the current DFS encoding is the minimumcode or not If so current edges were added to the results and

further attempts were made to add possible edges If not themining process was finished (2) Finding frequent subgraphswithmaximum length+ere are a large number of subgraphsin the obtained results and some subgraphs are partial graphsof the others +erefore this kind of subgraph was deleted bycomparing the labels of nodes and edges and the final resultswere the maximum frequent subgraphs

6 Results

In this section we first analyzed the spatial and temporaldistribution of tourist flows by statistical methods and maps+en we divided the tourist flows into intracity car touristsand intercity car tourists and used a frequent subgraphmining algorithm for pattern recognition Finally wesummarized the movement patterns of all tourists

61 Spatial-Temporal Characteristics of the TouristTraffic Flow

611 Temporal Characteristics Figure 6(a) depicts 295days of data before any preprocessing was applied Due to

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 74

C1

(a)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 74

C1

(b)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C1

C2

S = 73

(c)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C1

C2

S = 73

(d)

Figure 8 Flow patterns inferred from intracity tourists

Table 3 +e list of flow patterns inferred from intracity tourists

Figures PatternsFigure 8(a) (A1⟶ P2 andA1hArrP1)

Figure 8(b) A1⟶ P2⟶ P1⟶ A1Figure 8(c) A1⟶ A2 andA1hArrP1hArrP2Figure 8(d) A1⟶ A10 andA1hArrP1hArrP2

8 Journal of Advanced Transportation

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 62

C1

(a)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 62

C1

(b)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 61

C1

(c)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 57

C2

C1

(d)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 56

C1

C2

(e)

Figure 9 Flow patterns inferred from intercity tourists

Journal of Advanced Transportation 9

the failure of device communication or power the con-structed movements graphs may be incomplete whichwould lead to the loss of frequent subgraphs +ereforewe selected 76 days of valid data as the dataset for frequentpattern mining Figure 6(b) shows that (1) the touristflows at A2 and A10 near the sandy beaches has similartemporal characteristics +eir peak hours of touristvolume occur both on holidays and weekends while onweekdays the flow curves are relatively stable (2) P1 andP2 are located in two parking lots Although there aredifferences in the tourist volumes the trends are similar(3) For the two entrances to the tourist attractions theaverage daily volume of tourist traffic at A1 is 07 timesthat of A5 After sorting the volume of tourist trafficPengcheng Community has the most car tourists followedby two tourist attractions with a sandy beach (ieDongchong and Xichong Community) and the leastvisited attraction is Nanao Community

612 Spatial Characteristics Figure 7 shows the flowmap ofthe aggregated tourists transferring in multiple monitoringpoint pairs +e tourist volumes are depicted and sorted inthe left bar chart in the figure and the link thickness rep-resents the traffic volume As can be seen the links with thelargest traffic volume are A5P1 A5P2 P2P1 A10A2A1P1 and A1P2 (A5P1 is a simplified form for A5harrP1which represents the forth and back tourist flow between A5and P1) +ese links could be divided into two areasC1(A1 A5 P1 P2) and C2(A2 A10) Further inspectionreveals that the number of tourists transferring between P1and P2 in area C1 is close to the value between A2 and A10in area C2 but the number of tourists visiting P1 and P2 is312 times that of A2 and A10

62 Analyses of the Detected Flow Patterns +e flow map isintuitive but it suffers from serious visual clutter and it isdifficult to read because of overlapping flows It can also beseen in Figure 7 that the flow map only shows the trafficvolumes between the monitoring points in the touristdestination But it could not express the spatial transferdirections of car tourists +erefore these facts motivate usto find a new approach to solve these problems +is sectiondescribes the use of frequent subgraph mining algorithm toexplore the spatial flow patterns with directions in touristtraffic and to obtain the maximum frequent patterns fromthe daily tourist movement graphs according to a predefinedminimum support threshold By using the identificationmethod of car tourists introduced in Section 5 two types of

data were extracted and used for the subsequent flow patternmining (listed in Table 2)

621 Flow Patterns of Intracity Tourists in Shenzhen+e experiments were conducted with the dataset of intracitycar tourists in Shenzhen (as shown in Table 2) +e mini-mum support threshold for gSpan was set to 73 +e spatialflow patterns are represented in Figure 8 and listed in Ta-ble 3+e inferred patterns could be divided into two groupsOne group consists of the patterns shown in Figures 8(a) and8(b) which show the tourist spatial transfer process in areaC1 +e other group contains the patterns shownFigures 8(c) and 8(d) which represent the tourists trans-ferring between area C1 and area C2 +e two groupsdemonstrate that the intracity car tourists who arrived at P1and P2 preferred to choose Kuinan Road at A1 instead ofPengfei Road at A5 +e difference between Figures 8(a) and8(b) is the existence of the circle tourist flow +e differencebetween Figures 8(c) and 8(d) is the presence of tourists flowback and forth+e tourist flow from A1 to A2 have only onedirection One of the reasons may be that this study failed tofind a suitable monitoring point on Pingxi Road resulting inthe loss of directionality for this part of tourist flow

622 Flow Patterns of Intercity Car Tourists Figure 9 showsflow patterns of intercity car tourists+e dataset used in thissection is composed of intercity tourists (as shown in Ta-ble 2) +e minimum support threshold for gSpan was set to56 +e result flow patterns are sorted by the supportthreshold as listed in Table 4

From these patterns the following could be con-cluded (1) +e most frequent patterns are shown inFigures 9(a)ndash9(c) +e frequencies of the discovered flowpatterns are 62 62 and 61 respectively +ese threepatterns describe the preference of intercity tourists forarea C1+is kind of tourists first visited one of attractionsnear a parking lot P1 or P2 and then visited anotherattraction or just visited one tourist attraction near P1 orP2 (2)+e above three patterns are different from those ofthe intracity tourists in Shenzhen +ere is no circle tourbetween P1 P2 and A1 +e reason for this is that sometourists chose to continue drive to area C2 (3) +eminimum support threshold of these two patterns asshown in Figures 9(d) and 9(e) is significantly smallerthan that shown in Figures 9(a)ndash9(e) but it reflects thespatial transfer preferences of tourists from neighboringcities in areas C1 and C2 In Figure 9(d) the car touriststended to drive directly from A5 to A10 after visiting

Table 4 +e list of the flow patterns inferred from intercity tourists

Figures PatternsFigure 9(a) A1⟶ P2⟶ P1Figure 9(b) A1hArrP1hArrP2Figure 9(c) A1⟶ P1 andA1⟶ P2Figure 9(d) A5⟶ P1⟶ A10Figure 9(e) A10⟶ P1hArrP2 andA10hArrA2

10 Journal of Advanced Transportation

attractions near P1 As shown in Figure 9(e) the cartourists that flowed between A10 and A2 went back to P1and changed car parking lots between P1 and P2

However there is no tourist flow to A1 and A5 +e reasonfor this may be that the traffic flow of Pingxi Road (theexpressway in and out of the peninsula) was not

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(a)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(b)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(c)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(d)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(e)

Figure 10 Flow patterns of all tourists

Journal of Advanced Transportation 11

monitored +is part of traffic flow could directly arriveand then leave from A10

623 Flow Patterns of All Tourists +is section presents anexploration of the spatial flow patterns of all car tourists +eresults are shown in Figure 10 +e minimum supportthreshold for gSpan was set to 73 which means that 73 of the76 graphs contained the discovered flow pattern +e resultpatterns are listed in Table 5

All of the patterns shown in Figure 10 have touristflow between area C1 and area C2 +is is consistent withthe trend in the flow map (shown in Figure 7) As thepicture shows A1 is the main entrance for tourists toenter and exit Dapeng Island Some of the car touristsdrove to P1 or P2 and the rest flowed to A10 or A2Figures 10(a)ndash10(c) show the directions of tourist flowsbetween the Dapeng Community Pengcheng Commu-nity and Xichong Community Figure 10(d) shows thedirections of tourist flows between Dapeng CommunityPengcheng Community and Dongchong CommunityFigure 10(e) represents only the directions of tourist flowsbetween Dapeng Community and Dongchong Commu-nity Furthermore the directions of tourist flows in areaC1 and area C2 or between area C1 and area C2 aredifferent Taking Figures 10(a)ndash10(c) as examples al-though the orders of P1 and P2 that are accessed from A1have a lack of regularity they have their own charac-teristics when considering the accompanying paths toA10 +ese examples indicate that if there is a tourist flowbetween A1 and A10 it could be divided into two cases Inone case some of the tourists returned directly and in theother case some of the tourists flowed to P1 In the firstcase either the tourist flow passing by A1 first accessed P2and then visited P1 or both P1 and P2 had tourists at thesame time In the second case some of the touristsreturned from A10 visited P1 and P2 and then left theisland

7 Conclusions and Discussions

71 Conclusions In order to facilitate the management oftourist traffic flow the car tourists in Dapeng Island weretaken as a research case +e experimental analysis used realdata captured by video devices in the research area Due to thelack of suitable device installation locations the capturedpicture from the video device had a certain distance from theroad+e catch rate of the touristsrsquo cars was low However thedetailed time-series data in one day could be collectedCompared with the manual survey the collected data was

improved in terms of reliability and richness A day waschosen as the time unit for frequent pattern mining Afterselecting the available data we divided the data of cars intointracity and intercity tourists Next the license plate datawere transformed into movement graphs according to thevisited location sequence between multiple monitoringpoints +e intricate flow patterns of the car tourists werediscovered by gSpan algorithm which had the best perfor-mance in terms of the quality of the results and the executiontime and which had already proven to be efficient for frequentsubgraph mining +e conclusions are as follows

(1) +e car tourists had obvious preferences in the se-lection of trip time and tourist attractions (shown inFigures 6 and 7) In terms of time the curves oftourist flow at each monitoring point were similar+ere were a large number of car tourists at variousattractions on holidays but the volume of car touristswas relatively lower on weekdays +e same types ofattractions had the similar trends in tourist flowsuch as the two attractions with a sandy beach(Dongchong Community and Xichong Community)and the ancient city and cultural attractions (DapengAncient City and Dongshan Temple) In terms ofspace the attractions that are close to the entrance ofa scenic area and rich in tourist resources were morepopular with tourists However due to the terrainbarrier in the scenic area the traffic conditions af-fected the movements of tourists between multipleattractions

(2) Different types of car tourists had similar spatialchoices in scenic area (shown in Figures 8ndash10) Forexample different types of tourists had flow patternsthat described the movements in one area (as shownin Figures 8(a) 8(b) and 9(a)ndash9(c)) and the move-ments between different areas (as shown inFigures 8(c) 8(d) 9(d) and 9(e)) +e intercitytourists and intracity tourists had different choices inscenic spots +e intercity tourists would takemultidestination trips instead of single destinationtrips in the same type of attractions As can be seen inFigures 8 and 9 there are the two sandy beach at-tractions in areaC2 the intracity tourists tend to visitone of them while the intercity tourists would visitboth attractions Specifically to save time andmoney intercity tourists would visit multiple at-tractions in one trip instead of making multiple tripsAdditionally another difference was that there wasno circlet in area C1 for intercity tourists and notraffic flow between A2 and A10 for intracity tourists

Table 5 +e list of flow patterns inferred from all tourists

Figures PatternsFigure 10(a) (A1⟶ A10⟶ A1) and (A10⟶ P1hArrP2) and (P1⟶ A1)

Figure 10(b) (A1hArrA10 andA1⟶ P1 andA1⟶ P2)

Figure 10(c) (A1hArrP10 andA1⟶ P2⟶ P1⟶ A1)

Figure 10(d) (A1hArrP1hArrP2 andA1hArrA10hArrA2)

Figure 10(e) (A1hArrP1hArrP2 andA1⟶ A2)

12 Journal of Advanced Transportation

(3) Although the patterns depicted on the map lookcomplicated and messy after the patterns are con-verted to rules they become clear In pattern mapsonly Figures 8(b) 9(a) and 9(d) show a clear uni-directionality +e rest is complex and is difficult tocompare As we can see from Figure 8(a) theintracity tourists passing point A1 could be dividedinto two groups one group flowed to parking lot P1and then leaved the scenic area from P1 +e othergroup flowed to parking lot P2 Two groups havesimultaneity in tourist routes So we could convertthe flow patterns to rules and use ldquoandrdquo to illustratethe simultaneity of different routes in the samepattern (as shown in Tables 3ndash5) By this way allpatterns can be applied to traffic control system forregional tourism

(4) Large primary attractions are more attractive thansmaller secondary attractions Looking from thearrow on the pattern maps tourists always visit areaC1 first and then select area C2 +e main reason isthat area C1 has abundant tourism resources anddiverse tourism activities For example there arecultural attractions (eg Dapeng Ancient City andDongshan Temple) sandy beach entertainmentsand large parking lots for tourists in area C1 +esefactors are also often considered in the evaluation ofthe importance of attractions in tourism network

(5) +e tourists tended to park their cars in an easy-to-access place even if the visited attractions arechanged as shown in Figures 8(a)ndash8(d) 9(b) and9(e) Again here we take area C1 as an example +edistance between the Dapeng Ancient City andDongshan Temple is about 1 km On the patternmaps we can see that there are two-way arrowspointing to the two parking lots P1 and P2 +isindicates that the car tourists tended to park theircars to the nearest parking lot so that they could pickthem up when the tour destination is changed

72 Discussions Tourist flow is the key to traffic man-agement in tourism destinations and it affects the de-velopment of tourism on an island and the experience oftourists +e recent development of transport technolo-gies has shown that traffic flow data will be increasinglycollected and it will be available for data analysis+erefore advanced data analytics should be used tointerpret and depict the complex movements of cartourists +e proposed approach in this study wasintended to find (i) the statistical summaries of the spatial-temporal characteristics of car tourists in the researcharea helping to discover patterns from the mass carlicense plate data (ii) the flow patterns of intercity andintracity tourists helping to illustrate the different pref-erences of the two types of car tourists and (iii) the mostfrequent patterns of all tourists helping to identify the lawof tourist movement and make efficient policy for themanagement of tourist traffic flow +e presented ap-proach enriched the analytical methodology of tourist

traffic flow and suggested a shift from the conventionaland complicated paper or computer interview-basedmethod to a dynamic flow graph-based method Fur-thermore we have shown how transportation data pro-vides hard-to-obtain insights and quantitative results fortourists

In order to illustrate the law of spatial movement oftourists related studies have proposed a variety of macroflow patterns [43ndash45] For example in 2008 McKercher [46]proposed 11 prominent route styles in urban destinations+e macropatterns retain only the main components andsimplify the details and are often used in tourism man-agement to guide destination development Compared withmodeling the flow patterns of tourists at the macrolevelmodeling tourist flow at the microscale is more complicatedLew and McKercher [17] noted that it is a challenge tobalance model effectiveness and usability +e reason is thatsimple patterns may not provide enough details for use andcomplex patterns may be difficult to interpret and apply Inthis study we used the video devices installed at key nodes ofroad network to collect tourist traffic flows and used thefrequent subgraph mining algorithm to discover flow pat-terns at the microscale +e extracted patterns can beconverted into rules and applied to the traffic control systemfor the management of regional tourists +e difficulty offinding and applying patterns at microscale could beovercome by this way

During the peak period of tourism the number of cartourists in the scenic area increases sharply It is easy toresult in road congestion and uneven distribution oftourists between attractions With the use of traffic flowdata the daily monthly and seasonal characteristics oftourists can be analyzed the future tourist flow can bepredicted and the flow patterns can be obtained by datamining methods +us the traffic management departmentcan effectively control the tourist traffic flow and thetourism department can develop attractive tourism prod-ucts to achieve a spatial balance of the distribution oftourists reduce traffic congestion and harmful gas emis-sions and weaken the impact of tourism environment andhuman body

Inside a scenic area attractions form a complex net-work of tourist flow due to the frequent spatial interactionEach attraction is both the origin and destination of cartourists Due to the differences in attractiveness degree ofdevelopment and convenience of transportation touristshave shown special preferences when choosing attractionsand trip routes Accordingly effective identification ofthese preferences will be beneficial to the development oftourist market and will also helpful for tour planners tounderstand how tourists see the spatial connection ofmultiple attractions However traditional manual surveysare time-consuming and laborious and the amount of dataobtained is small It is difficult to reveal tourist preferencesby this way Frequent subgraph mining algorithms providea desirable method for identifying the spatial preferences oftourists +is kind of method could perform well under thesupport of a large amount of movement data betweenattractions

Journal of Advanced Transportation 13

+is study has several limitations Firstly due to lack ofpower supply facilities it was unable to collect the traffic datafor each day When applying the model to the actual controlof tourist traffic flow it is necessary to further co-operatewith the traffic management department to obtain com-prehensive traffic flow data Secondly this study focused onthe mining of flow patterns the influencing factors behindthe identified patterns were not further analyzed As men-tioned by Lew andMcKercher [17] factors related to touristsand destinations can affect how tourists move or travel in adestination +ese factors include family composition in-come and valid information obtained before travelling It isdifficult to obtain these factors by relying only on the trafficflow data collected in the traffic monitoring system+erefore in future work field investigation will benecessary

Data Availability

As the data also form part of an ongoing study the raw dataneeded to reproduce these findings cannot be shared at thistime

Conflicts of Interest

+e authors declare no conflicts of interest

Acknowledgments

+e authors gratefully acknowledge the support of this re-search from the National Science Foundation of China(41701167) and the Basic Research Project of Shenzhen City(JCYJ20170307164104491 and JCYJ20190812171419161)

References

[1] D W Eby and L J Molnar ldquoImportance of scenic byways inroute choice a survey of driving tourists in the United StatesrdquoTransportation Research Part A (Policy and Practice) vol 36no 2 pp 95ndash106 2003

[2] B Wang Z Yang F Han and H Shi ldquoCar tourism inxinjiang the mediation effect of perceived value and touristsatisfaction on the relationship between destination imageand loyaltyrdquo Sustainability vol 9 no 1 pp 1ndash16 2016

[3] X M Zhang and Q Y Pen ldquoA study on tourism informationand self-driving tour trafficrdquo in Proceedings of the First Inter-national Conference on Transportation Engineering SouthwestJiaotong University Chengdu China July 2007

[4] China National Tourism Administration 7e Yearbook ofChina Tourism Statistics 2016 China Travel amp Tourism PressBeijing China 2017

[5] M A Mohamad Toha and H N Ismail ldquoA heritage tourismand tourist flow pattern a perspective on traditional versusmodern technologies in tracking the touristsrdquo InternationalJournal of Built Environment and Sustainability vol 2 no 2pp 85ndash92 2015

[6] K Siła-Nowicka and A S Fotheringham ldquoCalibrating spatialinteraction models from gps tracking data an example ofretail behaviourrdquo Computers Environment and Urban Sys-tems vol 74 pp 136ndash150 2019

[7] F Mao M Ji and T Liu ldquoMining spatiotemporal patterns ofurban dwellers from taxi trajectory datardquo Frontiers of EarthScience vol 10 no 2 pp 205ndash221 2016

[8] L J Zheng D Xia X Zhao and W L Liu ldquoMining tripattractive areas using large-scale taxi trajectory datardquo inProceedings of the 2017 IEEE International Symposium onParallel and Distributed Processing with Applications and 2017IEEE International Conference on Ubiquitous Computing andCommunications pp 1217ndash1222 ISPAIUCC GuangzhouChina December 2017

[9] I Rami and M O Shafiq ldquoDetecting taxi movements usingrandom swap clustering and sequential pattern miningrdquoJournal of Big Data vol 6 no 39 pp 1ndash26 2019

[10] G Kautsar and S Akbar ldquoTrajectory pattern mining using se-quential pattern mining and k-means for predicting future lo-cationrdquo Journal of Physics Conference Series vol 801 no 1 2017

[11] D Guo X Zhu H Jin P Gao and C Andris ldquoDiscoveringspatial patterns in origin-destination mobility datardquo Trans-actions in GIS vol 16 no 3 pp 411ndash429 2012

[12] J D Mazimpaka and S Timpf ldquoA visual and computationalanalysis approach for exploring significant locations and timeperiods along a bus routerdquo in Proceedings of the 9th ACMSIGSPATIAL International Workshop on ComputationalTransportation Science pp 43ndash48 San Francisco CA USAOctober 2016

[13] H Faroqi M Mesbah and J Kim ldquoSpatial-temporal simi-larity correlation between public transit passengers usingsmart card datardquo Journal of Advanced Transportationvol 2017 Article ID 1318945 14 pages 2017

[14] Z Wang Y Hu P Zhu Y Qin and L Jia ldquoRing aggregationpattern of metro passenger trips a study using smart carddatardquo Physica A Statistical Mechanics and Its Applicationsvol 491 pp 471ndash479 2018

[15] X Zhao X Lu Y Liu J Lin and J An ldquoTourist movementpatterns understanding from the perspective of travel partysize using mobile tracking data a case study of Xirsquoan ChinardquoTourism Management vol 69 pp 368ndash383 2018

[16] M Versichele L De Groote M Claeys Bouuaert T NeutensI Moerman and N Van De Weghe ldquoPattern mining intourist attraction visits through association rule learning onbluetooth tracking data a case study of ghent BelgiumrdquoTourism Management vol 44 pp 67ndash81 2014

[17] A Lew and B Mckercher ldquoModeling tourist movementsrdquoAnnals of Tourism Research vol 33 no 2 pp 403ndash423 2006

[18] A Birenboim S Anton-Clave A P Russo and N ShovalldquoTemporal activity patterns of theme park visitorsrdquo TourismGeographies vol 15 no 4 pp 601ndash619 2013

[19] H J Yun and M H Park ldquoTime-space movement of festivalvisitors in rural areas using a smart phone applicationrdquo AsiaPacific Journal of Tourism Research vol 20 no 11pp 1246ndash1265 2015

[20] S Kang ldquoAssociations between space-time constraints andspatial patterns of travelsrdquo Annals of Tourism Researchvol 61 pp 127ndash141 2016

[21] C Comito D Falcone and D Talia ldquoMining human mobilitypatterns from social geo-tagged datardquo Pervasive and MobileComputing vol 33 pp 91ndash107 2016

[22] Y T Zheng Y Li Z J Zha and T S Chua ldquoMining travelpatterns fromGPS-tagged photosrdquo in Lecture Notes in ComputerScience K T LeeWH Tsai H YM Liao T Chen JWHsiehand C C Tseng Eds Springer Berlin Germany 2011

[23] K M Borgwardt H P Kriegel and P WackersreutherldquoPattern mining in frequent dynamic subgraphsrdquo in Pro-ceedings of the 6th IEEE International Conference on Data

14 Journal of Advanced Transportation

Mining (ICDM 2006) Hong Kong China IEEE December2006

[24] D J Cook and L B HolderMining Graph Data Wiley PressHoboken NJ USA 2015

[25] C Jiang F Coenen andM Zito ldquoFrequent sub-graph miningon edge weighted graphsrdquo in Lecture Notes in ComputerScience T Bach Pedersen M K Mohania and A M TjoaEds Springer Berlin Germany pp 77ndash88 2010

[26] X Yan and J Han ldquogSpan graph-based substructure patternminingrdquo in Proceedings of the 2002 IEEE InternationalConference on Data Mining ICDMrsquo02 pp 721ndash724 IEEEComputer Society Maebashi City Japan December 2002

[27] C Borgelt and M R Berthold ldquoMining molecular fragmentsfinding relevant substructures of moleculesrdquo in Proceedings ofthe 2002 IEEE International Conference on Data MiningICDMrsquo02 pp 51ndash58 IEEE Computer Society Maebashi CityJapan December 2002

[28] J Huan W Wang and J Prins ldquoEfficient mining of frequentsubgraphs in the presence of isomorphismrdquo in Proceedings ofthe 7ird IEEE International Conference on Data MiningICDMrsquo03 pp 549ndash552 IEEE Computer Society MelbourneFL USA November 2003

[29] S Nijssen ldquoA quickstart in frequent structure mining canmake a differencerdquo in Proceedings of the Tenth ACM SIGKDDInternational Conference on Knowledge Discovery and DataMining KDDrsquo04 pp 647ndash652 ACM Seattle WA USAAugust 2004

[30] J Huan W Wang J Prins and J Yang ldquoSPIN miningmaximal frequent subgraphs from graph databasesrdquo in Pro-ceedings of the Tenth ACM SIGKDD International Conferenceon Knowledge Discovery and Data Mining Seattle WA USAACM August 2004

[31] M Worlein T Meinl I Fischer and M ampPhilippsen ldquoAquantitative comparison of the subgraph miners MoFagSpan FFSM and gastonrdquo in Proceedings of the 9th EuropeanConference on Principles and Practice of Knowledge Discoveryin Databases PKDDrsquo05 Springer Porto Portugal pp 392ndash403 2005

[32] X Yan H Cheng J Han and P ampYu ldquoMining significantgraph patterns by leap searchrdquo in Proceedings of the 2008ACM SIGMOD pp 433ndash444 Vancouver Canada June 2008

[33] L T +omas S R Valluri and K Karlapalem ldquoMarginmaximal frequent subgraph miningrdquo in Proceedings of theSixth International Conference on Data Mining (ICDMrsquo06)Hong Kong China December 2006

[34] V Ingalalli D Ienco and P Poncelet ldquoMining frequentsubgraphs in multigraphsrdquo Information Sciences vol 451-452pp 50ndash66 2018

[35] L Min Z C Wang J Liang and X R Yuan ldquoOD-wheelvisual design to explore OD patterns of a central regionrdquo inProceedings of the Visualization Symposium 2015 HangzhouChina April 2015

[36] B Jenny D M Stephen I Muehlenhaus and B E MarstonldquoDesign principles for origin-destination flowmapsrdquo Car-tography and Geographic Information Science vol 45 no 1pp 1ndash15 2016

[37] S Kim S Jeong I Woo Y Jang R Maciejewski andD S Ebert ldquoData flow analysis and visualization for spa-tiotemporal statistical data without trajectory informationrdquoIEEE Transactions on Visualization and Computer Graphicsvol 24 no 3 pp 1287ndash1300 2018

[38] J Wood J Dykes and A Slingsby ldquoVisualisation of originsdestinations and flows with OD mapsrdquo 7e CartographicJournal vol 47 no 2 pp 117ndash129 2010

[39] M Zillinger ldquoTourist routes a time-geographical approach onGerman car-tourists in Swedenrdquo Tourism Geographies vol 9no 1 pp 64ndash83 2007

[40] S Page and J Connell ldquoExploring the spatial patterns of car-based tourist travel in loch lomond and trossachs nationalpark Scotlandrdquo Tourism Management vol 29 no 3pp 561ndash580 2008

[41] A Inokuchi T Washio and H ampMotoda ldquoAn apriori-basedalgorithm for mining frequent substructures from graphdatardquo in Proceedings of the 4th European Conference onPrinciples of Data Mining and Knowledge DiscoveryPKDDrsquo00 pp 13ndash23 Lyon France September 2000

[42] M Kuramochi and G Karypis ldquoFrequent subgraph discov-eryrdquo in Proceedings of the 2001 IEEE International Conferenceon Data Mining ICDMrsquo01 pp 313ndash320 IEEE ComputerSociety San Jose CA USA August 2001

[43] M Oppermann ldquoA model of travel itinerariesrdquo Journal ofTravel Research vol 33 no 4 pp 57ndash61 1995

[44] N Shoval and A Raveh ldquoCategorization of tourist attractionsand the modeling of tourist cities based on the co-plotmethod of multivariate analysisrdquo Tourism Managementvol 25 no 6 pp 741ndash750 2004

[45] G Lau and B McKercher ldquoUnderstanding tourist movementpatterns in a destination a gis approachrdquo Tourism andHospitality Research vol 9 no 1 pp 39ndash49 2007

[46] B McKercher and G Lau ldquoMovement patterns of touristswithin a destinationrdquo Tourism Geographies vol 10 no 3pp 355ndash374 2008

Journal of Advanced Transportation 15

Page 5: DiscoveringtheGraph-BasedFlowPatternsofCarTouristsUsing ...downloads.hindawi.com/journals/jat/2020/4795830.pdfResearchArticle DiscoveringtheGraph-BasedFlowPatternsofCarTouristsUsing

(1) Deleted irrelevant fields +e primary informationsuch as car license plate number label of monitoringdevice and collection time is retained

(2) Removed null values and corrected license plateattributions

(3) Eliminated invalid data such as special car licenseplates and duplicate data

(4) Corrected confusing letters and numbers in caricense plates

52 Identifying Car Tourists In this study tourist flow wasdivided into three types One type consisted of commuterson the island the next consisted of intracity tourists (localweekend tourists in Shenzhen) and the last type consisted ofintercity tourists (leisure tourists from outside of Shenzhen)+e collected data came from the cameras on the roads andin parking lots +ese three types of flows were mixed to-gether in the collected data It is necessary to separate thedifferent types of traffic flows +e detailed steps are shownin Figure 4

+e data collected from the parking lots were classifiedinto intracity tourists and intercity tourists according to theregistration location of car license plates

As we know the number of trips made by tourists andlocal commuters is different Tourists only visit the islandoccasionally on weekends or holidays Local residents on theislandmight drive more times per week+erefore we took aweek as a unit and determined if a car had visited the islandduring that week If so this car would be tagged once +en

we counted the number of weeks a car appeared in eachmonth If the number of weeks visited exceeded a predefinedthreshold this car was considered to be a commuterOtherwise this car was considered to be a tourist

+erefore for the data collected from roads we first set athreshold manually based on the statistics of the number ofweeks visited in one month to distinguish island commutersand tourists Next we categorized visitors into intracity cartourists and intercity car tourists based on where theirlicense plates are registered Finally different types of trafficflows were separated and aggregated

In order to verify the usability and reliability of the proposedmethod the visiting characteristics of all cars were analyzed+eresult is shown in Figure 5 As can be seen the proportion ofweeks in one month in which the car appears is the highestreaching 8886+e percentage of cars appearing on the islandfor less than two weeks is 95 In addition we counted thepercentage of cars in the parking lots relative to the totalnumber of cars +e ratio is 786 which is close to the ratio of79 counted by Dapeng Transportation Bureau during peaktourist periods +e ratio of the number of weeks visited in onemonth (8886) is greater than the statistical result of Trans-portation Bureau (79) We think that the TransportationBureau only considered the tourists in parking lots +ereforethe threshold in this study is oneweek for extracting car tourists

53 Reconstructing the SpatialMovementGraphs In order tomodel the flow patterns of car tourists labeled direct graphsare used to construct movement relationships betweenmonitoring points In particular each vertex of the direct

Table 1 Car tourists detected at monitoring points

Label Location Detected car touristsA1 Kuinan Road Dapeng Community and Xinda CommunityA2 Dongchong Road Dongchong CommunityA5 Pengfei Road Dapeng community and Dapeng Ancient City Jiaochangwei Folk Village Dongshan TempleA7 Fumin Road Nanrsquoao CommunityA10 Nanxi Road Xichong CommunityP1 Jiaochangwei Park Lots Jiaochangwei Folk VillageP2 Ancient City Park Lots Dapeng Ancient City Dongshan Temple

Recognition devices forcars

Cloud databasesystem

Preprocess data

Extract the valid recordsExtract continuous data during

this period all monitoringdevice is normal operation

Analyze the time characteristicsof source data

Find the vehicles in parking lotduring the research period

Find the vehicles that visit theisland each week

Count the number of weeks thateach car appears in a month

Create graph dataset

Create the traffic spatialmovement graph from the

sequence data each day

Construct the sequence data ofmonitoring points that each

vehicle passes each day inchronological order

Separate car tourists andcommuting residents based onthe weekly threshold and the

historical vehicles in theparking lots

Frequent sub-graph mining

Remove small overlappingsub-graphs

Spatio-temporal charateristicsof vehicles

Analyze spatio-temporalcharacteristics and tourist flow

patterns

Spatial flow patterns of cartourists

Figure 3 +e detailed steps for mining frequent patterns in tourist flow

Journal of Advanced Transportation 5

graph corresponds to a monitoring point and each edgecorresponds to a directed connection between two moni-toring points passed by car tourists +e related definitionsare as follows

Definition 1 Label Graph Given a set of verticesV v1 v2 vk1113864 1113865 a set of edges connecting two vertex inV E eh (vi vj)|vi vj isin V1113966 1113967 a set of vertex labelsL(V) lb(vi)|forallvi isin V1113864 1113865 and a set of edge labelsL(E) lb(eh) |foralleh isin E1113864 1113865 eh is a direct edge that has the startvertex vi and end vertex vj then a label graph G is repre-sented as

G (V E L(V) L(E)) (1)

In the graph dataset the label of an edge is represented by alabel pair of twomonitoring points in the tourist visiting orderA graph consists of edges connecting the monitoring pointsvisited by each tourist in one day +e advantage of using thevertex label pair as an edge label is that it maintains thetemporal and spatial order of the two monitoring points thattourists pass through When mining a labeled graph thespatial-temporal order of monitoring points in results could be

preserved Using this representation the problem of findingfrequent flow patterns of car tourists becomes a problem ofmining frequent subgraphs in all movement graphs

54 Mining Frequent Subgraphs Some definitions related tofrequent subgraph mining are given below

Definition 2 Subgraph A subgraph g2 (V2 E2) of graphg1 is a graph in which V2 subeV1 E2 E1 cap (V2 times V1)

Definition 3 Support of a subgraph g Given a labeled graphdataset GD g1 g2 gn1113864 1113865 the support or frequency of asubgraph g is the percentage (or number) of graphs in GD

Definition 4 Frequent subgraph A frequent subgraph is agraph whose support is not less than a minimum supportthreshold +e minimum support threshold represents theminimum number of occurrences of a subgraph To obtainthe frequent patterns we chose to manually set the value ofminimum threshold

Definition 5 Mini Code First a depth-first search is per-formed on the graph to form a DFS (depth-first search) treeand then this tree is scanned +e order of the scanned edgesconstitutes a sequence called the DFS Code +e DFS Codesare sorted in a lexicographic order to find the smallest DFScode that uniquely identifies the graph +is minimum DFSCode is called the mini Code

After constructing the movement graphs of tourist flowthe subgraphminingmethodwas used to explore the frequentpatterns As demonstrated in related works there are manykinds of subgraph mining algorithms +e AGM (Apriori-based GraphMining) [41] can discover all frequent subgraphs(both connected and disconnected) in a graph database thatsatisfy a specific minimum support constraint+is algorithmuses an approach similar to Apriori and it requires 40minutes to 8 days to find the result subgraphs in a datasetcontaining 300 chemical compounds +e algorithm FSG(finding frequently occurring subgraphs in large graph) [42]adopts an adjacent representation of a graph and an edge-

100

80

60

40

20

0

Ratio

()

1 2 3 4 5Number of weeks a car appeared in one month

8886

Figure 5 Number of weeks visited by tourists in one month

e data collected from parking lots

e data collected from roads

Tourists

Inter-city car touristsoutside Shenzhen

Inter-city car tourists inShenzhen

Registration place

Number of weeksvisited

Commuterson the island

Figure 4 +e workflow of traffic flow separation

6 Journal of Advanced Transportation

growing strategy to find all of the connected subgraphs thatfrequently appear in a graph database+e results have shownthat FSG can be finished in 600 seconds gSpan [26] isdesigned to reduce or avoid the candidate generation and

pruning false positives used in AGM and FSG gSpan cancomplete the same task in 10 seconds Considering the ef-ficiency in this study gSpan was used to mine frequentsubgraphs in a directed graph dataset and then find the

Num

ber o

f car

s at e

ach

poin

t1000

02500

01000

02500

0

01000

5000

02500

00 50 100 150 200 250 300

Days

A1

A5

A7

A10

P1

P2

A2

(a)

Num

ber o

f car

s at e

ach

poin

t

A1

A5

A7

A10

P1

P2

A2

0 10 20 30 40 50 60 70Days

10000

10000

500

02500

01000

5003000200010002500

0

(b)

Figure 6 Daily traffic flow at each monitoring point

C1

C216000

14000

12000

10000

8000

6000

4000

2000

0

Traffi

c vol

ume

A5P1 A5P2 P2P1 A10A2 A1P1A1A10 A1P2 P1A10 A5A1 A2A1 A2P1A10A7A7P1 A10P2 A7A2 A1A7 A2P2 P2A7 A5A10A7A5 A5A2

Links

Kilometers

0 05 1 2 3 4

Figure 7 Flow map of tourism traffic

Table 2 Types of car tourist visiting the island

Types Number of records Number of graphsAll car tourists 361693 76Intracity car tourists in Shenzhen 240452 76Intercity car tourists outside Shenzhen 121241 75

Journal of Advanced Transportation 7

frequent subgraphs withmaximum length in the results as theflow patterns of car tourists +e algorithmic details of gSpanare available in reference [26] An extended instruction isgiven below +e method consisted of two steps (1) Findingthe frequent subgraphs using gSpan First the frequencies ofedges and nodes of all graphs was calculated Second thefrequencies were compared with the minimum supportthreshold and the infrequent edges and nodes were removed+en the remaining nodes and edges were reorderedaccording to the frequency And the frequency of each edgewas calculated again Finally the subgraphs of the restoredgraph were mined according to mini Code and it was de-termined whether the current DFS encoding is the minimumcode or not If so current edges were added to the results and

further attempts were made to add possible edges If not themining process was finished (2) Finding frequent subgraphswithmaximum length+ere are a large number of subgraphsin the obtained results and some subgraphs are partial graphsof the others +erefore this kind of subgraph was deleted bycomparing the labels of nodes and edges and the final resultswere the maximum frequent subgraphs

6 Results

In this section we first analyzed the spatial and temporaldistribution of tourist flows by statistical methods and maps+en we divided the tourist flows into intracity car touristsand intercity car tourists and used a frequent subgraphmining algorithm for pattern recognition Finally wesummarized the movement patterns of all tourists

61 Spatial-Temporal Characteristics of the TouristTraffic Flow

611 Temporal Characteristics Figure 6(a) depicts 295days of data before any preprocessing was applied Due to

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 74

C1

(a)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 74

C1

(b)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C1

C2

S = 73

(c)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C1

C2

S = 73

(d)

Figure 8 Flow patterns inferred from intracity tourists

Table 3 +e list of flow patterns inferred from intracity tourists

Figures PatternsFigure 8(a) (A1⟶ P2 andA1hArrP1)

Figure 8(b) A1⟶ P2⟶ P1⟶ A1Figure 8(c) A1⟶ A2 andA1hArrP1hArrP2Figure 8(d) A1⟶ A10 andA1hArrP1hArrP2

8 Journal of Advanced Transportation

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 62

C1

(a)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 62

C1

(b)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 61

C1

(c)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 57

C2

C1

(d)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 56

C1

C2

(e)

Figure 9 Flow patterns inferred from intercity tourists

Journal of Advanced Transportation 9

the failure of device communication or power the con-structed movements graphs may be incomplete whichwould lead to the loss of frequent subgraphs +ereforewe selected 76 days of valid data as the dataset for frequentpattern mining Figure 6(b) shows that (1) the touristflows at A2 and A10 near the sandy beaches has similartemporal characteristics +eir peak hours of touristvolume occur both on holidays and weekends while onweekdays the flow curves are relatively stable (2) P1 andP2 are located in two parking lots Although there aredifferences in the tourist volumes the trends are similar(3) For the two entrances to the tourist attractions theaverage daily volume of tourist traffic at A1 is 07 timesthat of A5 After sorting the volume of tourist trafficPengcheng Community has the most car tourists followedby two tourist attractions with a sandy beach (ieDongchong and Xichong Community) and the leastvisited attraction is Nanao Community

612 Spatial Characteristics Figure 7 shows the flowmap ofthe aggregated tourists transferring in multiple monitoringpoint pairs +e tourist volumes are depicted and sorted inthe left bar chart in the figure and the link thickness rep-resents the traffic volume As can be seen the links with thelargest traffic volume are A5P1 A5P2 P2P1 A10A2A1P1 and A1P2 (A5P1 is a simplified form for A5harrP1which represents the forth and back tourist flow between A5and P1) +ese links could be divided into two areasC1(A1 A5 P1 P2) and C2(A2 A10) Further inspectionreveals that the number of tourists transferring between P1and P2 in area C1 is close to the value between A2 and A10in area C2 but the number of tourists visiting P1 and P2 is312 times that of A2 and A10

62 Analyses of the Detected Flow Patterns +e flow map isintuitive but it suffers from serious visual clutter and it isdifficult to read because of overlapping flows It can also beseen in Figure 7 that the flow map only shows the trafficvolumes between the monitoring points in the touristdestination But it could not express the spatial transferdirections of car tourists +erefore these facts motivate usto find a new approach to solve these problems +is sectiondescribes the use of frequent subgraph mining algorithm toexplore the spatial flow patterns with directions in touristtraffic and to obtain the maximum frequent patterns fromthe daily tourist movement graphs according to a predefinedminimum support threshold By using the identificationmethod of car tourists introduced in Section 5 two types of

data were extracted and used for the subsequent flow patternmining (listed in Table 2)

621 Flow Patterns of Intracity Tourists in Shenzhen+e experiments were conducted with the dataset of intracitycar tourists in Shenzhen (as shown in Table 2) +e mini-mum support threshold for gSpan was set to 73 +e spatialflow patterns are represented in Figure 8 and listed in Ta-ble 3+e inferred patterns could be divided into two groupsOne group consists of the patterns shown in Figures 8(a) and8(b) which show the tourist spatial transfer process in areaC1 +e other group contains the patterns shownFigures 8(c) and 8(d) which represent the tourists trans-ferring between area C1 and area C2 +e two groupsdemonstrate that the intracity car tourists who arrived at P1and P2 preferred to choose Kuinan Road at A1 instead ofPengfei Road at A5 +e difference between Figures 8(a) and8(b) is the existence of the circle tourist flow +e differencebetween Figures 8(c) and 8(d) is the presence of tourists flowback and forth+e tourist flow from A1 to A2 have only onedirection One of the reasons may be that this study failed tofind a suitable monitoring point on Pingxi Road resulting inthe loss of directionality for this part of tourist flow

622 Flow Patterns of Intercity Car Tourists Figure 9 showsflow patterns of intercity car tourists+e dataset used in thissection is composed of intercity tourists (as shown in Ta-ble 2) +e minimum support threshold for gSpan was set to56 +e result flow patterns are sorted by the supportthreshold as listed in Table 4

From these patterns the following could be con-cluded (1) +e most frequent patterns are shown inFigures 9(a)ndash9(c) +e frequencies of the discovered flowpatterns are 62 62 and 61 respectively +ese threepatterns describe the preference of intercity tourists forarea C1+is kind of tourists first visited one of attractionsnear a parking lot P1 or P2 and then visited anotherattraction or just visited one tourist attraction near P1 orP2 (2)+e above three patterns are different from those ofthe intracity tourists in Shenzhen +ere is no circle tourbetween P1 P2 and A1 +e reason for this is that sometourists chose to continue drive to area C2 (3) +eminimum support threshold of these two patterns asshown in Figures 9(d) and 9(e) is significantly smallerthan that shown in Figures 9(a)ndash9(e) but it reflects thespatial transfer preferences of tourists from neighboringcities in areas C1 and C2 In Figure 9(d) the car touriststended to drive directly from A5 to A10 after visiting

Table 4 +e list of the flow patterns inferred from intercity tourists

Figures PatternsFigure 9(a) A1⟶ P2⟶ P1Figure 9(b) A1hArrP1hArrP2Figure 9(c) A1⟶ P1 andA1⟶ P2Figure 9(d) A5⟶ P1⟶ A10Figure 9(e) A10⟶ P1hArrP2 andA10hArrA2

10 Journal of Advanced Transportation

attractions near P1 As shown in Figure 9(e) the cartourists that flowed between A10 and A2 went back to P1and changed car parking lots between P1 and P2

However there is no tourist flow to A1 and A5 +e reasonfor this may be that the traffic flow of Pingxi Road (theexpressway in and out of the peninsula) was not

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(a)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(b)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(c)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(d)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(e)

Figure 10 Flow patterns of all tourists

Journal of Advanced Transportation 11

monitored +is part of traffic flow could directly arriveand then leave from A10

623 Flow Patterns of All Tourists +is section presents anexploration of the spatial flow patterns of all car tourists +eresults are shown in Figure 10 +e minimum supportthreshold for gSpan was set to 73 which means that 73 of the76 graphs contained the discovered flow pattern +e resultpatterns are listed in Table 5

All of the patterns shown in Figure 10 have touristflow between area C1 and area C2 +is is consistent withthe trend in the flow map (shown in Figure 7) As thepicture shows A1 is the main entrance for tourists toenter and exit Dapeng Island Some of the car touristsdrove to P1 or P2 and the rest flowed to A10 or A2Figures 10(a)ndash10(c) show the directions of tourist flowsbetween the Dapeng Community Pengcheng Commu-nity and Xichong Community Figure 10(d) shows thedirections of tourist flows between Dapeng CommunityPengcheng Community and Dongchong CommunityFigure 10(e) represents only the directions of tourist flowsbetween Dapeng Community and Dongchong Commu-nity Furthermore the directions of tourist flows in areaC1 and area C2 or between area C1 and area C2 aredifferent Taking Figures 10(a)ndash10(c) as examples al-though the orders of P1 and P2 that are accessed from A1have a lack of regularity they have their own charac-teristics when considering the accompanying paths toA10 +ese examples indicate that if there is a tourist flowbetween A1 and A10 it could be divided into two cases Inone case some of the tourists returned directly and in theother case some of the tourists flowed to P1 In the firstcase either the tourist flow passing by A1 first accessed P2and then visited P1 or both P1 and P2 had tourists at thesame time In the second case some of the touristsreturned from A10 visited P1 and P2 and then left theisland

7 Conclusions and Discussions

71 Conclusions In order to facilitate the management oftourist traffic flow the car tourists in Dapeng Island weretaken as a research case +e experimental analysis used realdata captured by video devices in the research area Due to thelack of suitable device installation locations the capturedpicture from the video device had a certain distance from theroad+e catch rate of the touristsrsquo cars was low However thedetailed time-series data in one day could be collectedCompared with the manual survey the collected data was

improved in terms of reliability and richness A day waschosen as the time unit for frequent pattern mining Afterselecting the available data we divided the data of cars intointracity and intercity tourists Next the license plate datawere transformed into movement graphs according to thevisited location sequence between multiple monitoringpoints +e intricate flow patterns of the car tourists werediscovered by gSpan algorithm which had the best perfor-mance in terms of the quality of the results and the executiontime and which had already proven to be efficient for frequentsubgraph mining +e conclusions are as follows

(1) +e car tourists had obvious preferences in the se-lection of trip time and tourist attractions (shown inFigures 6 and 7) In terms of time the curves oftourist flow at each monitoring point were similar+ere were a large number of car tourists at variousattractions on holidays but the volume of car touristswas relatively lower on weekdays +e same types ofattractions had the similar trends in tourist flowsuch as the two attractions with a sandy beach(Dongchong Community and Xichong Community)and the ancient city and cultural attractions (DapengAncient City and Dongshan Temple) In terms ofspace the attractions that are close to the entrance ofa scenic area and rich in tourist resources were morepopular with tourists However due to the terrainbarrier in the scenic area the traffic conditions af-fected the movements of tourists between multipleattractions

(2) Different types of car tourists had similar spatialchoices in scenic area (shown in Figures 8ndash10) Forexample different types of tourists had flow patternsthat described the movements in one area (as shownin Figures 8(a) 8(b) and 9(a)ndash9(c)) and the move-ments between different areas (as shown inFigures 8(c) 8(d) 9(d) and 9(e)) +e intercitytourists and intracity tourists had different choices inscenic spots +e intercity tourists would takemultidestination trips instead of single destinationtrips in the same type of attractions As can be seen inFigures 8 and 9 there are the two sandy beach at-tractions in areaC2 the intracity tourists tend to visitone of them while the intercity tourists would visitboth attractions Specifically to save time andmoney intercity tourists would visit multiple at-tractions in one trip instead of making multiple tripsAdditionally another difference was that there wasno circlet in area C1 for intercity tourists and notraffic flow between A2 and A10 for intracity tourists

Table 5 +e list of flow patterns inferred from all tourists

Figures PatternsFigure 10(a) (A1⟶ A10⟶ A1) and (A10⟶ P1hArrP2) and (P1⟶ A1)

Figure 10(b) (A1hArrA10 andA1⟶ P1 andA1⟶ P2)

Figure 10(c) (A1hArrP10 andA1⟶ P2⟶ P1⟶ A1)

Figure 10(d) (A1hArrP1hArrP2 andA1hArrA10hArrA2)

Figure 10(e) (A1hArrP1hArrP2 andA1⟶ A2)

12 Journal of Advanced Transportation

(3) Although the patterns depicted on the map lookcomplicated and messy after the patterns are con-verted to rules they become clear In pattern mapsonly Figures 8(b) 9(a) and 9(d) show a clear uni-directionality +e rest is complex and is difficult tocompare As we can see from Figure 8(a) theintracity tourists passing point A1 could be dividedinto two groups one group flowed to parking lot P1and then leaved the scenic area from P1 +e othergroup flowed to parking lot P2 Two groups havesimultaneity in tourist routes So we could convertthe flow patterns to rules and use ldquoandrdquo to illustratethe simultaneity of different routes in the samepattern (as shown in Tables 3ndash5) By this way allpatterns can be applied to traffic control system forregional tourism

(4) Large primary attractions are more attractive thansmaller secondary attractions Looking from thearrow on the pattern maps tourists always visit areaC1 first and then select area C2 +e main reason isthat area C1 has abundant tourism resources anddiverse tourism activities For example there arecultural attractions (eg Dapeng Ancient City andDongshan Temple) sandy beach entertainmentsand large parking lots for tourists in area C1 +esefactors are also often considered in the evaluation ofthe importance of attractions in tourism network

(5) +e tourists tended to park their cars in an easy-to-access place even if the visited attractions arechanged as shown in Figures 8(a)ndash8(d) 9(b) and9(e) Again here we take area C1 as an example +edistance between the Dapeng Ancient City andDongshan Temple is about 1 km On the patternmaps we can see that there are two-way arrowspointing to the two parking lots P1 and P2 +isindicates that the car tourists tended to park theircars to the nearest parking lot so that they could pickthem up when the tour destination is changed

72 Discussions Tourist flow is the key to traffic man-agement in tourism destinations and it affects the de-velopment of tourism on an island and the experience oftourists +e recent development of transport technolo-gies has shown that traffic flow data will be increasinglycollected and it will be available for data analysis+erefore advanced data analytics should be used tointerpret and depict the complex movements of cartourists +e proposed approach in this study wasintended to find (i) the statistical summaries of the spatial-temporal characteristics of car tourists in the researcharea helping to discover patterns from the mass carlicense plate data (ii) the flow patterns of intercity andintracity tourists helping to illustrate the different pref-erences of the two types of car tourists and (iii) the mostfrequent patterns of all tourists helping to identify the lawof tourist movement and make efficient policy for themanagement of tourist traffic flow +e presented ap-proach enriched the analytical methodology of tourist

traffic flow and suggested a shift from the conventionaland complicated paper or computer interview-basedmethod to a dynamic flow graph-based method Fur-thermore we have shown how transportation data pro-vides hard-to-obtain insights and quantitative results fortourists

In order to illustrate the law of spatial movement oftourists related studies have proposed a variety of macroflow patterns [43ndash45] For example in 2008 McKercher [46]proposed 11 prominent route styles in urban destinations+e macropatterns retain only the main components andsimplify the details and are often used in tourism man-agement to guide destination development Compared withmodeling the flow patterns of tourists at the macrolevelmodeling tourist flow at the microscale is more complicatedLew and McKercher [17] noted that it is a challenge tobalance model effectiveness and usability +e reason is thatsimple patterns may not provide enough details for use andcomplex patterns may be difficult to interpret and apply Inthis study we used the video devices installed at key nodes ofroad network to collect tourist traffic flows and used thefrequent subgraph mining algorithm to discover flow pat-terns at the microscale +e extracted patterns can beconverted into rules and applied to the traffic control systemfor the management of regional tourists +e difficulty offinding and applying patterns at microscale could beovercome by this way

During the peak period of tourism the number of cartourists in the scenic area increases sharply It is easy toresult in road congestion and uneven distribution oftourists between attractions With the use of traffic flowdata the daily monthly and seasonal characteristics oftourists can be analyzed the future tourist flow can bepredicted and the flow patterns can be obtained by datamining methods +us the traffic management departmentcan effectively control the tourist traffic flow and thetourism department can develop attractive tourism prod-ucts to achieve a spatial balance of the distribution oftourists reduce traffic congestion and harmful gas emis-sions and weaken the impact of tourism environment andhuman body

Inside a scenic area attractions form a complex net-work of tourist flow due to the frequent spatial interactionEach attraction is both the origin and destination of cartourists Due to the differences in attractiveness degree ofdevelopment and convenience of transportation touristshave shown special preferences when choosing attractionsand trip routes Accordingly effective identification ofthese preferences will be beneficial to the development oftourist market and will also helpful for tour planners tounderstand how tourists see the spatial connection ofmultiple attractions However traditional manual surveysare time-consuming and laborious and the amount of dataobtained is small It is difficult to reveal tourist preferencesby this way Frequent subgraph mining algorithms providea desirable method for identifying the spatial preferences oftourists +is kind of method could perform well under thesupport of a large amount of movement data betweenattractions

Journal of Advanced Transportation 13

+is study has several limitations Firstly due to lack ofpower supply facilities it was unable to collect the traffic datafor each day When applying the model to the actual controlof tourist traffic flow it is necessary to further co-operatewith the traffic management department to obtain com-prehensive traffic flow data Secondly this study focused onthe mining of flow patterns the influencing factors behindthe identified patterns were not further analyzed As men-tioned by Lew andMcKercher [17] factors related to touristsand destinations can affect how tourists move or travel in adestination +ese factors include family composition in-come and valid information obtained before travelling It isdifficult to obtain these factors by relying only on the trafficflow data collected in the traffic monitoring system+erefore in future work field investigation will benecessary

Data Availability

As the data also form part of an ongoing study the raw dataneeded to reproduce these findings cannot be shared at thistime

Conflicts of Interest

+e authors declare no conflicts of interest

Acknowledgments

+e authors gratefully acknowledge the support of this re-search from the National Science Foundation of China(41701167) and the Basic Research Project of Shenzhen City(JCYJ20170307164104491 and JCYJ20190812171419161)

References

[1] D W Eby and L J Molnar ldquoImportance of scenic byways inroute choice a survey of driving tourists in the United StatesrdquoTransportation Research Part A (Policy and Practice) vol 36no 2 pp 95ndash106 2003

[2] B Wang Z Yang F Han and H Shi ldquoCar tourism inxinjiang the mediation effect of perceived value and touristsatisfaction on the relationship between destination imageand loyaltyrdquo Sustainability vol 9 no 1 pp 1ndash16 2016

[3] X M Zhang and Q Y Pen ldquoA study on tourism informationand self-driving tour trafficrdquo in Proceedings of the First Inter-national Conference on Transportation Engineering SouthwestJiaotong University Chengdu China July 2007

[4] China National Tourism Administration 7e Yearbook ofChina Tourism Statistics 2016 China Travel amp Tourism PressBeijing China 2017

[5] M A Mohamad Toha and H N Ismail ldquoA heritage tourismand tourist flow pattern a perspective on traditional versusmodern technologies in tracking the touristsrdquo InternationalJournal of Built Environment and Sustainability vol 2 no 2pp 85ndash92 2015

[6] K Siła-Nowicka and A S Fotheringham ldquoCalibrating spatialinteraction models from gps tracking data an example ofretail behaviourrdquo Computers Environment and Urban Sys-tems vol 74 pp 136ndash150 2019

[7] F Mao M Ji and T Liu ldquoMining spatiotemporal patterns ofurban dwellers from taxi trajectory datardquo Frontiers of EarthScience vol 10 no 2 pp 205ndash221 2016

[8] L J Zheng D Xia X Zhao and W L Liu ldquoMining tripattractive areas using large-scale taxi trajectory datardquo inProceedings of the 2017 IEEE International Symposium onParallel and Distributed Processing with Applications and 2017IEEE International Conference on Ubiquitous Computing andCommunications pp 1217ndash1222 ISPAIUCC GuangzhouChina December 2017

[9] I Rami and M O Shafiq ldquoDetecting taxi movements usingrandom swap clustering and sequential pattern miningrdquoJournal of Big Data vol 6 no 39 pp 1ndash26 2019

[10] G Kautsar and S Akbar ldquoTrajectory pattern mining using se-quential pattern mining and k-means for predicting future lo-cationrdquo Journal of Physics Conference Series vol 801 no 1 2017

[11] D Guo X Zhu H Jin P Gao and C Andris ldquoDiscoveringspatial patterns in origin-destination mobility datardquo Trans-actions in GIS vol 16 no 3 pp 411ndash429 2012

[12] J D Mazimpaka and S Timpf ldquoA visual and computationalanalysis approach for exploring significant locations and timeperiods along a bus routerdquo in Proceedings of the 9th ACMSIGSPATIAL International Workshop on ComputationalTransportation Science pp 43ndash48 San Francisco CA USAOctober 2016

[13] H Faroqi M Mesbah and J Kim ldquoSpatial-temporal simi-larity correlation between public transit passengers usingsmart card datardquo Journal of Advanced Transportationvol 2017 Article ID 1318945 14 pages 2017

[14] Z Wang Y Hu P Zhu Y Qin and L Jia ldquoRing aggregationpattern of metro passenger trips a study using smart carddatardquo Physica A Statistical Mechanics and Its Applicationsvol 491 pp 471ndash479 2018

[15] X Zhao X Lu Y Liu J Lin and J An ldquoTourist movementpatterns understanding from the perspective of travel partysize using mobile tracking data a case study of Xirsquoan ChinardquoTourism Management vol 69 pp 368ndash383 2018

[16] M Versichele L De Groote M Claeys Bouuaert T NeutensI Moerman and N Van De Weghe ldquoPattern mining intourist attraction visits through association rule learning onbluetooth tracking data a case study of ghent BelgiumrdquoTourism Management vol 44 pp 67ndash81 2014

[17] A Lew and B Mckercher ldquoModeling tourist movementsrdquoAnnals of Tourism Research vol 33 no 2 pp 403ndash423 2006

[18] A Birenboim S Anton-Clave A P Russo and N ShovalldquoTemporal activity patterns of theme park visitorsrdquo TourismGeographies vol 15 no 4 pp 601ndash619 2013

[19] H J Yun and M H Park ldquoTime-space movement of festivalvisitors in rural areas using a smart phone applicationrdquo AsiaPacific Journal of Tourism Research vol 20 no 11pp 1246ndash1265 2015

[20] S Kang ldquoAssociations between space-time constraints andspatial patterns of travelsrdquo Annals of Tourism Researchvol 61 pp 127ndash141 2016

[21] C Comito D Falcone and D Talia ldquoMining human mobilitypatterns from social geo-tagged datardquo Pervasive and MobileComputing vol 33 pp 91ndash107 2016

[22] Y T Zheng Y Li Z J Zha and T S Chua ldquoMining travelpatterns fromGPS-tagged photosrdquo in Lecture Notes in ComputerScience K T LeeWH Tsai H YM Liao T Chen JWHsiehand C C Tseng Eds Springer Berlin Germany 2011

[23] K M Borgwardt H P Kriegel and P WackersreutherldquoPattern mining in frequent dynamic subgraphsrdquo in Pro-ceedings of the 6th IEEE International Conference on Data

14 Journal of Advanced Transportation

Mining (ICDM 2006) Hong Kong China IEEE December2006

[24] D J Cook and L B HolderMining Graph Data Wiley PressHoboken NJ USA 2015

[25] C Jiang F Coenen andM Zito ldquoFrequent sub-graph miningon edge weighted graphsrdquo in Lecture Notes in ComputerScience T Bach Pedersen M K Mohania and A M TjoaEds Springer Berlin Germany pp 77ndash88 2010

[26] X Yan and J Han ldquogSpan graph-based substructure patternminingrdquo in Proceedings of the 2002 IEEE InternationalConference on Data Mining ICDMrsquo02 pp 721ndash724 IEEEComputer Society Maebashi City Japan December 2002

[27] C Borgelt and M R Berthold ldquoMining molecular fragmentsfinding relevant substructures of moleculesrdquo in Proceedings ofthe 2002 IEEE International Conference on Data MiningICDMrsquo02 pp 51ndash58 IEEE Computer Society Maebashi CityJapan December 2002

[28] J Huan W Wang and J Prins ldquoEfficient mining of frequentsubgraphs in the presence of isomorphismrdquo in Proceedings ofthe 7ird IEEE International Conference on Data MiningICDMrsquo03 pp 549ndash552 IEEE Computer Society MelbourneFL USA November 2003

[29] S Nijssen ldquoA quickstart in frequent structure mining canmake a differencerdquo in Proceedings of the Tenth ACM SIGKDDInternational Conference on Knowledge Discovery and DataMining KDDrsquo04 pp 647ndash652 ACM Seattle WA USAAugust 2004

[30] J Huan W Wang J Prins and J Yang ldquoSPIN miningmaximal frequent subgraphs from graph databasesrdquo in Pro-ceedings of the Tenth ACM SIGKDD International Conferenceon Knowledge Discovery and Data Mining Seattle WA USAACM August 2004

[31] M Worlein T Meinl I Fischer and M ampPhilippsen ldquoAquantitative comparison of the subgraph miners MoFagSpan FFSM and gastonrdquo in Proceedings of the 9th EuropeanConference on Principles and Practice of Knowledge Discoveryin Databases PKDDrsquo05 Springer Porto Portugal pp 392ndash403 2005

[32] X Yan H Cheng J Han and P ampYu ldquoMining significantgraph patterns by leap searchrdquo in Proceedings of the 2008ACM SIGMOD pp 433ndash444 Vancouver Canada June 2008

[33] L T +omas S R Valluri and K Karlapalem ldquoMarginmaximal frequent subgraph miningrdquo in Proceedings of theSixth International Conference on Data Mining (ICDMrsquo06)Hong Kong China December 2006

[34] V Ingalalli D Ienco and P Poncelet ldquoMining frequentsubgraphs in multigraphsrdquo Information Sciences vol 451-452pp 50ndash66 2018

[35] L Min Z C Wang J Liang and X R Yuan ldquoOD-wheelvisual design to explore OD patterns of a central regionrdquo inProceedings of the Visualization Symposium 2015 HangzhouChina April 2015

[36] B Jenny D M Stephen I Muehlenhaus and B E MarstonldquoDesign principles for origin-destination flowmapsrdquo Car-tography and Geographic Information Science vol 45 no 1pp 1ndash15 2016

[37] S Kim S Jeong I Woo Y Jang R Maciejewski andD S Ebert ldquoData flow analysis and visualization for spa-tiotemporal statistical data without trajectory informationrdquoIEEE Transactions on Visualization and Computer Graphicsvol 24 no 3 pp 1287ndash1300 2018

[38] J Wood J Dykes and A Slingsby ldquoVisualisation of originsdestinations and flows with OD mapsrdquo 7e CartographicJournal vol 47 no 2 pp 117ndash129 2010

[39] M Zillinger ldquoTourist routes a time-geographical approach onGerman car-tourists in Swedenrdquo Tourism Geographies vol 9no 1 pp 64ndash83 2007

[40] S Page and J Connell ldquoExploring the spatial patterns of car-based tourist travel in loch lomond and trossachs nationalpark Scotlandrdquo Tourism Management vol 29 no 3pp 561ndash580 2008

[41] A Inokuchi T Washio and H ampMotoda ldquoAn apriori-basedalgorithm for mining frequent substructures from graphdatardquo in Proceedings of the 4th European Conference onPrinciples of Data Mining and Knowledge DiscoveryPKDDrsquo00 pp 13ndash23 Lyon France September 2000

[42] M Kuramochi and G Karypis ldquoFrequent subgraph discov-eryrdquo in Proceedings of the 2001 IEEE International Conferenceon Data Mining ICDMrsquo01 pp 313ndash320 IEEE ComputerSociety San Jose CA USA August 2001

[43] M Oppermann ldquoA model of travel itinerariesrdquo Journal ofTravel Research vol 33 no 4 pp 57ndash61 1995

[44] N Shoval and A Raveh ldquoCategorization of tourist attractionsand the modeling of tourist cities based on the co-plotmethod of multivariate analysisrdquo Tourism Managementvol 25 no 6 pp 741ndash750 2004

[45] G Lau and B McKercher ldquoUnderstanding tourist movementpatterns in a destination a gis approachrdquo Tourism andHospitality Research vol 9 no 1 pp 39ndash49 2007

[46] B McKercher and G Lau ldquoMovement patterns of touristswithin a destinationrdquo Tourism Geographies vol 10 no 3pp 355ndash374 2008

Journal of Advanced Transportation 15

Page 6: DiscoveringtheGraph-BasedFlowPatternsofCarTouristsUsing ...downloads.hindawi.com/journals/jat/2020/4795830.pdfResearchArticle DiscoveringtheGraph-BasedFlowPatternsofCarTouristsUsing

graph corresponds to a monitoring point and each edgecorresponds to a directed connection between two moni-toring points passed by car tourists +e related definitionsare as follows

Definition 1 Label Graph Given a set of verticesV v1 v2 vk1113864 1113865 a set of edges connecting two vertex inV E eh (vi vj)|vi vj isin V1113966 1113967 a set of vertex labelsL(V) lb(vi)|forallvi isin V1113864 1113865 and a set of edge labelsL(E) lb(eh) |foralleh isin E1113864 1113865 eh is a direct edge that has the startvertex vi and end vertex vj then a label graph G is repre-sented as

G (V E L(V) L(E)) (1)

In the graph dataset the label of an edge is represented by alabel pair of twomonitoring points in the tourist visiting orderA graph consists of edges connecting the monitoring pointsvisited by each tourist in one day +e advantage of using thevertex label pair as an edge label is that it maintains thetemporal and spatial order of the two monitoring points thattourists pass through When mining a labeled graph thespatial-temporal order of monitoring points in results could be

preserved Using this representation the problem of findingfrequent flow patterns of car tourists becomes a problem ofmining frequent subgraphs in all movement graphs

54 Mining Frequent Subgraphs Some definitions related tofrequent subgraph mining are given below

Definition 2 Subgraph A subgraph g2 (V2 E2) of graphg1 is a graph in which V2 subeV1 E2 E1 cap (V2 times V1)

Definition 3 Support of a subgraph g Given a labeled graphdataset GD g1 g2 gn1113864 1113865 the support or frequency of asubgraph g is the percentage (or number) of graphs in GD

Definition 4 Frequent subgraph A frequent subgraph is agraph whose support is not less than a minimum supportthreshold +e minimum support threshold represents theminimum number of occurrences of a subgraph To obtainthe frequent patterns we chose to manually set the value ofminimum threshold

Definition 5 Mini Code First a depth-first search is per-formed on the graph to form a DFS (depth-first search) treeand then this tree is scanned +e order of the scanned edgesconstitutes a sequence called the DFS Code +e DFS Codesare sorted in a lexicographic order to find the smallest DFScode that uniquely identifies the graph +is minimum DFSCode is called the mini Code

After constructing the movement graphs of tourist flowthe subgraphminingmethodwas used to explore the frequentpatterns As demonstrated in related works there are manykinds of subgraph mining algorithms +e AGM (Apriori-based GraphMining) [41] can discover all frequent subgraphs(both connected and disconnected) in a graph database thatsatisfy a specific minimum support constraint+is algorithmuses an approach similar to Apriori and it requires 40minutes to 8 days to find the result subgraphs in a datasetcontaining 300 chemical compounds +e algorithm FSG(finding frequently occurring subgraphs in large graph) [42]adopts an adjacent representation of a graph and an edge-

100

80

60

40

20

0

Ratio

()

1 2 3 4 5Number of weeks a car appeared in one month

8886

Figure 5 Number of weeks visited by tourists in one month

e data collected from parking lots

e data collected from roads

Tourists

Inter-city car touristsoutside Shenzhen

Inter-city car tourists inShenzhen

Registration place

Number of weeksvisited

Commuterson the island

Figure 4 +e workflow of traffic flow separation

6 Journal of Advanced Transportation

growing strategy to find all of the connected subgraphs thatfrequently appear in a graph database+e results have shownthat FSG can be finished in 600 seconds gSpan [26] isdesigned to reduce or avoid the candidate generation and

pruning false positives used in AGM and FSG gSpan cancomplete the same task in 10 seconds Considering the ef-ficiency in this study gSpan was used to mine frequentsubgraphs in a directed graph dataset and then find the

Num

ber o

f car

s at e

ach

poin

t1000

02500

01000

02500

0

01000

5000

02500

00 50 100 150 200 250 300

Days

A1

A5

A7

A10

P1

P2

A2

(a)

Num

ber o

f car

s at e

ach

poin

t

A1

A5

A7

A10

P1

P2

A2

0 10 20 30 40 50 60 70Days

10000

10000

500

02500

01000

5003000200010002500

0

(b)

Figure 6 Daily traffic flow at each monitoring point

C1

C216000

14000

12000

10000

8000

6000

4000

2000

0

Traffi

c vol

ume

A5P1 A5P2 P2P1 A10A2 A1P1A1A10 A1P2 P1A10 A5A1 A2A1 A2P1A10A7A7P1 A10P2 A7A2 A1A7 A2P2 P2A7 A5A10A7A5 A5A2

Links

Kilometers

0 05 1 2 3 4

Figure 7 Flow map of tourism traffic

Table 2 Types of car tourist visiting the island

Types Number of records Number of graphsAll car tourists 361693 76Intracity car tourists in Shenzhen 240452 76Intercity car tourists outside Shenzhen 121241 75

Journal of Advanced Transportation 7

frequent subgraphs withmaximum length in the results as theflow patterns of car tourists +e algorithmic details of gSpanare available in reference [26] An extended instruction isgiven below +e method consisted of two steps (1) Findingthe frequent subgraphs using gSpan First the frequencies ofedges and nodes of all graphs was calculated Second thefrequencies were compared with the minimum supportthreshold and the infrequent edges and nodes were removed+en the remaining nodes and edges were reorderedaccording to the frequency And the frequency of each edgewas calculated again Finally the subgraphs of the restoredgraph were mined according to mini Code and it was de-termined whether the current DFS encoding is the minimumcode or not If so current edges were added to the results and

further attempts were made to add possible edges If not themining process was finished (2) Finding frequent subgraphswithmaximum length+ere are a large number of subgraphsin the obtained results and some subgraphs are partial graphsof the others +erefore this kind of subgraph was deleted bycomparing the labels of nodes and edges and the final resultswere the maximum frequent subgraphs

6 Results

In this section we first analyzed the spatial and temporaldistribution of tourist flows by statistical methods and maps+en we divided the tourist flows into intracity car touristsand intercity car tourists and used a frequent subgraphmining algorithm for pattern recognition Finally wesummarized the movement patterns of all tourists

61 Spatial-Temporal Characteristics of the TouristTraffic Flow

611 Temporal Characteristics Figure 6(a) depicts 295days of data before any preprocessing was applied Due to

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 74

C1

(a)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 74

C1

(b)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C1

C2

S = 73

(c)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C1

C2

S = 73

(d)

Figure 8 Flow patterns inferred from intracity tourists

Table 3 +e list of flow patterns inferred from intracity tourists

Figures PatternsFigure 8(a) (A1⟶ P2 andA1hArrP1)

Figure 8(b) A1⟶ P2⟶ P1⟶ A1Figure 8(c) A1⟶ A2 andA1hArrP1hArrP2Figure 8(d) A1⟶ A10 andA1hArrP1hArrP2

8 Journal of Advanced Transportation

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 62

C1

(a)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 62

C1

(b)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 61

C1

(c)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 57

C2

C1

(d)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 56

C1

C2

(e)

Figure 9 Flow patterns inferred from intercity tourists

Journal of Advanced Transportation 9

the failure of device communication or power the con-structed movements graphs may be incomplete whichwould lead to the loss of frequent subgraphs +ereforewe selected 76 days of valid data as the dataset for frequentpattern mining Figure 6(b) shows that (1) the touristflows at A2 and A10 near the sandy beaches has similartemporal characteristics +eir peak hours of touristvolume occur both on holidays and weekends while onweekdays the flow curves are relatively stable (2) P1 andP2 are located in two parking lots Although there aredifferences in the tourist volumes the trends are similar(3) For the two entrances to the tourist attractions theaverage daily volume of tourist traffic at A1 is 07 timesthat of A5 After sorting the volume of tourist trafficPengcheng Community has the most car tourists followedby two tourist attractions with a sandy beach (ieDongchong and Xichong Community) and the leastvisited attraction is Nanao Community

612 Spatial Characteristics Figure 7 shows the flowmap ofthe aggregated tourists transferring in multiple monitoringpoint pairs +e tourist volumes are depicted and sorted inthe left bar chart in the figure and the link thickness rep-resents the traffic volume As can be seen the links with thelargest traffic volume are A5P1 A5P2 P2P1 A10A2A1P1 and A1P2 (A5P1 is a simplified form for A5harrP1which represents the forth and back tourist flow between A5and P1) +ese links could be divided into two areasC1(A1 A5 P1 P2) and C2(A2 A10) Further inspectionreveals that the number of tourists transferring between P1and P2 in area C1 is close to the value between A2 and A10in area C2 but the number of tourists visiting P1 and P2 is312 times that of A2 and A10

62 Analyses of the Detected Flow Patterns +e flow map isintuitive but it suffers from serious visual clutter and it isdifficult to read because of overlapping flows It can also beseen in Figure 7 that the flow map only shows the trafficvolumes between the monitoring points in the touristdestination But it could not express the spatial transferdirections of car tourists +erefore these facts motivate usto find a new approach to solve these problems +is sectiondescribes the use of frequent subgraph mining algorithm toexplore the spatial flow patterns with directions in touristtraffic and to obtain the maximum frequent patterns fromthe daily tourist movement graphs according to a predefinedminimum support threshold By using the identificationmethod of car tourists introduced in Section 5 two types of

data were extracted and used for the subsequent flow patternmining (listed in Table 2)

621 Flow Patterns of Intracity Tourists in Shenzhen+e experiments were conducted with the dataset of intracitycar tourists in Shenzhen (as shown in Table 2) +e mini-mum support threshold for gSpan was set to 73 +e spatialflow patterns are represented in Figure 8 and listed in Ta-ble 3+e inferred patterns could be divided into two groupsOne group consists of the patterns shown in Figures 8(a) and8(b) which show the tourist spatial transfer process in areaC1 +e other group contains the patterns shownFigures 8(c) and 8(d) which represent the tourists trans-ferring between area C1 and area C2 +e two groupsdemonstrate that the intracity car tourists who arrived at P1and P2 preferred to choose Kuinan Road at A1 instead ofPengfei Road at A5 +e difference between Figures 8(a) and8(b) is the existence of the circle tourist flow +e differencebetween Figures 8(c) and 8(d) is the presence of tourists flowback and forth+e tourist flow from A1 to A2 have only onedirection One of the reasons may be that this study failed tofind a suitable monitoring point on Pingxi Road resulting inthe loss of directionality for this part of tourist flow

622 Flow Patterns of Intercity Car Tourists Figure 9 showsflow patterns of intercity car tourists+e dataset used in thissection is composed of intercity tourists (as shown in Ta-ble 2) +e minimum support threshold for gSpan was set to56 +e result flow patterns are sorted by the supportthreshold as listed in Table 4

From these patterns the following could be con-cluded (1) +e most frequent patterns are shown inFigures 9(a)ndash9(c) +e frequencies of the discovered flowpatterns are 62 62 and 61 respectively +ese threepatterns describe the preference of intercity tourists forarea C1+is kind of tourists first visited one of attractionsnear a parking lot P1 or P2 and then visited anotherattraction or just visited one tourist attraction near P1 orP2 (2)+e above three patterns are different from those ofthe intracity tourists in Shenzhen +ere is no circle tourbetween P1 P2 and A1 +e reason for this is that sometourists chose to continue drive to area C2 (3) +eminimum support threshold of these two patterns asshown in Figures 9(d) and 9(e) is significantly smallerthan that shown in Figures 9(a)ndash9(e) but it reflects thespatial transfer preferences of tourists from neighboringcities in areas C1 and C2 In Figure 9(d) the car touriststended to drive directly from A5 to A10 after visiting

Table 4 +e list of the flow patterns inferred from intercity tourists

Figures PatternsFigure 9(a) A1⟶ P2⟶ P1Figure 9(b) A1hArrP1hArrP2Figure 9(c) A1⟶ P1 andA1⟶ P2Figure 9(d) A5⟶ P1⟶ A10Figure 9(e) A10⟶ P1hArrP2 andA10hArrA2

10 Journal of Advanced Transportation

attractions near P1 As shown in Figure 9(e) the cartourists that flowed between A10 and A2 went back to P1and changed car parking lots between P1 and P2

However there is no tourist flow to A1 and A5 +e reasonfor this may be that the traffic flow of Pingxi Road (theexpressway in and out of the peninsula) was not

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(a)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(b)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(c)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(d)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(e)

Figure 10 Flow patterns of all tourists

Journal of Advanced Transportation 11

monitored +is part of traffic flow could directly arriveand then leave from A10

623 Flow Patterns of All Tourists +is section presents anexploration of the spatial flow patterns of all car tourists +eresults are shown in Figure 10 +e minimum supportthreshold for gSpan was set to 73 which means that 73 of the76 graphs contained the discovered flow pattern +e resultpatterns are listed in Table 5

All of the patterns shown in Figure 10 have touristflow between area C1 and area C2 +is is consistent withthe trend in the flow map (shown in Figure 7) As thepicture shows A1 is the main entrance for tourists toenter and exit Dapeng Island Some of the car touristsdrove to P1 or P2 and the rest flowed to A10 or A2Figures 10(a)ndash10(c) show the directions of tourist flowsbetween the Dapeng Community Pengcheng Commu-nity and Xichong Community Figure 10(d) shows thedirections of tourist flows between Dapeng CommunityPengcheng Community and Dongchong CommunityFigure 10(e) represents only the directions of tourist flowsbetween Dapeng Community and Dongchong Commu-nity Furthermore the directions of tourist flows in areaC1 and area C2 or between area C1 and area C2 aredifferent Taking Figures 10(a)ndash10(c) as examples al-though the orders of P1 and P2 that are accessed from A1have a lack of regularity they have their own charac-teristics when considering the accompanying paths toA10 +ese examples indicate that if there is a tourist flowbetween A1 and A10 it could be divided into two cases Inone case some of the tourists returned directly and in theother case some of the tourists flowed to P1 In the firstcase either the tourist flow passing by A1 first accessed P2and then visited P1 or both P1 and P2 had tourists at thesame time In the second case some of the touristsreturned from A10 visited P1 and P2 and then left theisland

7 Conclusions and Discussions

71 Conclusions In order to facilitate the management oftourist traffic flow the car tourists in Dapeng Island weretaken as a research case +e experimental analysis used realdata captured by video devices in the research area Due to thelack of suitable device installation locations the capturedpicture from the video device had a certain distance from theroad+e catch rate of the touristsrsquo cars was low However thedetailed time-series data in one day could be collectedCompared with the manual survey the collected data was

improved in terms of reliability and richness A day waschosen as the time unit for frequent pattern mining Afterselecting the available data we divided the data of cars intointracity and intercity tourists Next the license plate datawere transformed into movement graphs according to thevisited location sequence between multiple monitoringpoints +e intricate flow patterns of the car tourists werediscovered by gSpan algorithm which had the best perfor-mance in terms of the quality of the results and the executiontime and which had already proven to be efficient for frequentsubgraph mining +e conclusions are as follows

(1) +e car tourists had obvious preferences in the se-lection of trip time and tourist attractions (shown inFigures 6 and 7) In terms of time the curves oftourist flow at each monitoring point were similar+ere were a large number of car tourists at variousattractions on holidays but the volume of car touristswas relatively lower on weekdays +e same types ofattractions had the similar trends in tourist flowsuch as the two attractions with a sandy beach(Dongchong Community and Xichong Community)and the ancient city and cultural attractions (DapengAncient City and Dongshan Temple) In terms ofspace the attractions that are close to the entrance ofa scenic area and rich in tourist resources were morepopular with tourists However due to the terrainbarrier in the scenic area the traffic conditions af-fected the movements of tourists between multipleattractions

(2) Different types of car tourists had similar spatialchoices in scenic area (shown in Figures 8ndash10) Forexample different types of tourists had flow patternsthat described the movements in one area (as shownin Figures 8(a) 8(b) and 9(a)ndash9(c)) and the move-ments between different areas (as shown inFigures 8(c) 8(d) 9(d) and 9(e)) +e intercitytourists and intracity tourists had different choices inscenic spots +e intercity tourists would takemultidestination trips instead of single destinationtrips in the same type of attractions As can be seen inFigures 8 and 9 there are the two sandy beach at-tractions in areaC2 the intracity tourists tend to visitone of them while the intercity tourists would visitboth attractions Specifically to save time andmoney intercity tourists would visit multiple at-tractions in one trip instead of making multiple tripsAdditionally another difference was that there wasno circlet in area C1 for intercity tourists and notraffic flow between A2 and A10 for intracity tourists

Table 5 +e list of flow patterns inferred from all tourists

Figures PatternsFigure 10(a) (A1⟶ A10⟶ A1) and (A10⟶ P1hArrP2) and (P1⟶ A1)

Figure 10(b) (A1hArrA10 andA1⟶ P1 andA1⟶ P2)

Figure 10(c) (A1hArrP10 andA1⟶ P2⟶ P1⟶ A1)

Figure 10(d) (A1hArrP1hArrP2 andA1hArrA10hArrA2)

Figure 10(e) (A1hArrP1hArrP2 andA1⟶ A2)

12 Journal of Advanced Transportation

(3) Although the patterns depicted on the map lookcomplicated and messy after the patterns are con-verted to rules they become clear In pattern mapsonly Figures 8(b) 9(a) and 9(d) show a clear uni-directionality +e rest is complex and is difficult tocompare As we can see from Figure 8(a) theintracity tourists passing point A1 could be dividedinto two groups one group flowed to parking lot P1and then leaved the scenic area from P1 +e othergroup flowed to parking lot P2 Two groups havesimultaneity in tourist routes So we could convertthe flow patterns to rules and use ldquoandrdquo to illustratethe simultaneity of different routes in the samepattern (as shown in Tables 3ndash5) By this way allpatterns can be applied to traffic control system forregional tourism

(4) Large primary attractions are more attractive thansmaller secondary attractions Looking from thearrow on the pattern maps tourists always visit areaC1 first and then select area C2 +e main reason isthat area C1 has abundant tourism resources anddiverse tourism activities For example there arecultural attractions (eg Dapeng Ancient City andDongshan Temple) sandy beach entertainmentsand large parking lots for tourists in area C1 +esefactors are also often considered in the evaluation ofthe importance of attractions in tourism network

(5) +e tourists tended to park their cars in an easy-to-access place even if the visited attractions arechanged as shown in Figures 8(a)ndash8(d) 9(b) and9(e) Again here we take area C1 as an example +edistance between the Dapeng Ancient City andDongshan Temple is about 1 km On the patternmaps we can see that there are two-way arrowspointing to the two parking lots P1 and P2 +isindicates that the car tourists tended to park theircars to the nearest parking lot so that they could pickthem up when the tour destination is changed

72 Discussions Tourist flow is the key to traffic man-agement in tourism destinations and it affects the de-velopment of tourism on an island and the experience oftourists +e recent development of transport technolo-gies has shown that traffic flow data will be increasinglycollected and it will be available for data analysis+erefore advanced data analytics should be used tointerpret and depict the complex movements of cartourists +e proposed approach in this study wasintended to find (i) the statistical summaries of the spatial-temporal characteristics of car tourists in the researcharea helping to discover patterns from the mass carlicense plate data (ii) the flow patterns of intercity andintracity tourists helping to illustrate the different pref-erences of the two types of car tourists and (iii) the mostfrequent patterns of all tourists helping to identify the lawof tourist movement and make efficient policy for themanagement of tourist traffic flow +e presented ap-proach enriched the analytical methodology of tourist

traffic flow and suggested a shift from the conventionaland complicated paper or computer interview-basedmethod to a dynamic flow graph-based method Fur-thermore we have shown how transportation data pro-vides hard-to-obtain insights and quantitative results fortourists

In order to illustrate the law of spatial movement oftourists related studies have proposed a variety of macroflow patterns [43ndash45] For example in 2008 McKercher [46]proposed 11 prominent route styles in urban destinations+e macropatterns retain only the main components andsimplify the details and are often used in tourism man-agement to guide destination development Compared withmodeling the flow patterns of tourists at the macrolevelmodeling tourist flow at the microscale is more complicatedLew and McKercher [17] noted that it is a challenge tobalance model effectiveness and usability +e reason is thatsimple patterns may not provide enough details for use andcomplex patterns may be difficult to interpret and apply Inthis study we used the video devices installed at key nodes ofroad network to collect tourist traffic flows and used thefrequent subgraph mining algorithm to discover flow pat-terns at the microscale +e extracted patterns can beconverted into rules and applied to the traffic control systemfor the management of regional tourists +e difficulty offinding and applying patterns at microscale could beovercome by this way

During the peak period of tourism the number of cartourists in the scenic area increases sharply It is easy toresult in road congestion and uneven distribution oftourists between attractions With the use of traffic flowdata the daily monthly and seasonal characteristics oftourists can be analyzed the future tourist flow can bepredicted and the flow patterns can be obtained by datamining methods +us the traffic management departmentcan effectively control the tourist traffic flow and thetourism department can develop attractive tourism prod-ucts to achieve a spatial balance of the distribution oftourists reduce traffic congestion and harmful gas emis-sions and weaken the impact of tourism environment andhuman body

Inside a scenic area attractions form a complex net-work of tourist flow due to the frequent spatial interactionEach attraction is both the origin and destination of cartourists Due to the differences in attractiveness degree ofdevelopment and convenience of transportation touristshave shown special preferences when choosing attractionsand trip routes Accordingly effective identification ofthese preferences will be beneficial to the development oftourist market and will also helpful for tour planners tounderstand how tourists see the spatial connection ofmultiple attractions However traditional manual surveysare time-consuming and laborious and the amount of dataobtained is small It is difficult to reveal tourist preferencesby this way Frequent subgraph mining algorithms providea desirable method for identifying the spatial preferences oftourists +is kind of method could perform well under thesupport of a large amount of movement data betweenattractions

Journal of Advanced Transportation 13

+is study has several limitations Firstly due to lack ofpower supply facilities it was unable to collect the traffic datafor each day When applying the model to the actual controlof tourist traffic flow it is necessary to further co-operatewith the traffic management department to obtain com-prehensive traffic flow data Secondly this study focused onthe mining of flow patterns the influencing factors behindthe identified patterns were not further analyzed As men-tioned by Lew andMcKercher [17] factors related to touristsand destinations can affect how tourists move or travel in adestination +ese factors include family composition in-come and valid information obtained before travelling It isdifficult to obtain these factors by relying only on the trafficflow data collected in the traffic monitoring system+erefore in future work field investigation will benecessary

Data Availability

As the data also form part of an ongoing study the raw dataneeded to reproduce these findings cannot be shared at thistime

Conflicts of Interest

+e authors declare no conflicts of interest

Acknowledgments

+e authors gratefully acknowledge the support of this re-search from the National Science Foundation of China(41701167) and the Basic Research Project of Shenzhen City(JCYJ20170307164104491 and JCYJ20190812171419161)

References

[1] D W Eby and L J Molnar ldquoImportance of scenic byways inroute choice a survey of driving tourists in the United StatesrdquoTransportation Research Part A (Policy and Practice) vol 36no 2 pp 95ndash106 2003

[2] B Wang Z Yang F Han and H Shi ldquoCar tourism inxinjiang the mediation effect of perceived value and touristsatisfaction on the relationship between destination imageand loyaltyrdquo Sustainability vol 9 no 1 pp 1ndash16 2016

[3] X M Zhang and Q Y Pen ldquoA study on tourism informationand self-driving tour trafficrdquo in Proceedings of the First Inter-national Conference on Transportation Engineering SouthwestJiaotong University Chengdu China July 2007

[4] China National Tourism Administration 7e Yearbook ofChina Tourism Statistics 2016 China Travel amp Tourism PressBeijing China 2017

[5] M A Mohamad Toha and H N Ismail ldquoA heritage tourismand tourist flow pattern a perspective on traditional versusmodern technologies in tracking the touristsrdquo InternationalJournal of Built Environment and Sustainability vol 2 no 2pp 85ndash92 2015

[6] K Siła-Nowicka and A S Fotheringham ldquoCalibrating spatialinteraction models from gps tracking data an example ofretail behaviourrdquo Computers Environment and Urban Sys-tems vol 74 pp 136ndash150 2019

[7] F Mao M Ji and T Liu ldquoMining spatiotemporal patterns ofurban dwellers from taxi trajectory datardquo Frontiers of EarthScience vol 10 no 2 pp 205ndash221 2016

[8] L J Zheng D Xia X Zhao and W L Liu ldquoMining tripattractive areas using large-scale taxi trajectory datardquo inProceedings of the 2017 IEEE International Symposium onParallel and Distributed Processing with Applications and 2017IEEE International Conference on Ubiquitous Computing andCommunications pp 1217ndash1222 ISPAIUCC GuangzhouChina December 2017

[9] I Rami and M O Shafiq ldquoDetecting taxi movements usingrandom swap clustering and sequential pattern miningrdquoJournal of Big Data vol 6 no 39 pp 1ndash26 2019

[10] G Kautsar and S Akbar ldquoTrajectory pattern mining using se-quential pattern mining and k-means for predicting future lo-cationrdquo Journal of Physics Conference Series vol 801 no 1 2017

[11] D Guo X Zhu H Jin P Gao and C Andris ldquoDiscoveringspatial patterns in origin-destination mobility datardquo Trans-actions in GIS vol 16 no 3 pp 411ndash429 2012

[12] J D Mazimpaka and S Timpf ldquoA visual and computationalanalysis approach for exploring significant locations and timeperiods along a bus routerdquo in Proceedings of the 9th ACMSIGSPATIAL International Workshop on ComputationalTransportation Science pp 43ndash48 San Francisco CA USAOctober 2016

[13] H Faroqi M Mesbah and J Kim ldquoSpatial-temporal simi-larity correlation between public transit passengers usingsmart card datardquo Journal of Advanced Transportationvol 2017 Article ID 1318945 14 pages 2017

[14] Z Wang Y Hu P Zhu Y Qin and L Jia ldquoRing aggregationpattern of metro passenger trips a study using smart carddatardquo Physica A Statistical Mechanics and Its Applicationsvol 491 pp 471ndash479 2018

[15] X Zhao X Lu Y Liu J Lin and J An ldquoTourist movementpatterns understanding from the perspective of travel partysize using mobile tracking data a case study of Xirsquoan ChinardquoTourism Management vol 69 pp 368ndash383 2018

[16] M Versichele L De Groote M Claeys Bouuaert T NeutensI Moerman and N Van De Weghe ldquoPattern mining intourist attraction visits through association rule learning onbluetooth tracking data a case study of ghent BelgiumrdquoTourism Management vol 44 pp 67ndash81 2014

[17] A Lew and B Mckercher ldquoModeling tourist movementsrdquoAnnals of Tourism Research vol 33 no 2 pp 403ndash423 2006

[18] A Birenboim S Anton-Clave A P Russo and N ShovalldquoTemporal activity patterns of theme park visitorsrdquo TourismGeographies vol 15 no 4 pp 601ndash619 2013

[19] H J Yun and M H Park ldquoTime-space movement of festivalvisitors in rural areas using a smart phone applicationrdquo AsiaPacific Journal of Tourism Research vol 20 no 11pp 1246ndash1265 2015

[20] S Kang ldquoAssociations between space-time constraints andspatial patterns of travelsrdquo Annals of Tourism Researchvol 61 pp 127ndash141 2016

[21] C Comito D Falcone and D Talia ldquoMining human mobilitypatterns from social geo-tagged datardquo Pervasive and MobileComputing vol 33 pp 91ndash107 2016

[22] Y T Zheng Y Li Z J Zha and T S Chua ldquoMining travelpatterns fromGPS-tagged photosrdquo in Lecture Notes in ComputerScience K T LeeWH Tsai H YM Liao T Chen JWHsiehand C C Tseng Eds Springer Berlin Germany 2011

[23] K M Borgwardt H P Kriegel and P WackersreutherldquoPattern mining in frequent dynamic subgraphsrdquo in Pro-ceedings of the 6th IEEE International Conference on Data

14 Journal of Advanced Transportation

Mining (ICDM 2006) Hong Kong China IEEE December2006

[24] D J Cook and L B HolderMining Graph Data Wiley PressHoboken NJ USA 2015

[25] C Jiang F Coenen andM Zito ldquoFrequent sub-graph miningon edge weighted graphsrdquo in Lecture Notes in ComputerScience T Bach Pedersen M K Mohania and A M TjoaEds Springer Berlin Germany pp 77ndash88 2010

[26] X Yan and J Han ldquogSpan graph-based substructure patternminingrdquo in Proceedings of the 2002 IEEE InternationalConference on Data Mining ICDMrsquo02 pp 721ndash724 IEEEComputer Society Maebashi City Japan December 2002

[27] C Borgelt and M R Berthold ldquoMining molecular fragmentsfinding relevant substructures of moleculesrdquo in Proceedings ofthe 2002 IEEE International Conference on Data MiningICDMrsquo02 pp 51ndash58 IEEE Computer Society Maebashi CityJapan December 2002

[28] J Huan W Wang and J Prins ldquoEfficient mining of frequentsubgraphs in the presence of isomorphismrdquo in Proceedings ofthe 7ird IEEE International Conference on Data MiningICDMrsquo03 pp 549ndash552 IEEE Computer Society MelbourneFL USA November 2003

[29] S Nijssen ldquoA quickstart in frequent structure mining canmake a differencerdquo in Proceedings of the Tenth ACM SIGKDDInternational Conference on Knowledge Discovery and DataMining KDDrsquo04 pp 647ndash652 ACM Seattle WA USAAugust 2004

[30] J Huan W Wang J Prins and J Yang ldquoSPIN miningmaximal frequent subgraphs from graph databasesrdquo in Pro-ceedings of the Tenth ACM SIGKDD International Conferenceon Knowledge Discovery and Data Mining Seattle WA USAACM August 2004

[31] M Worlein T Meinl I Fischer and M ampPhilippsen ldquoAquantitative comparison of the subgraph miners MoFagSpan FFSM and gastonrdquo in Proceedings of the 9th EuropeanConference on Principles and Practice of Knowledge Discoveryin Databases PKDDrsquo05 Springer Porto Portugal pp 392ndash403 2005

[32] X Yan H Cheng J Han and P ampYu ldquoMining significantgraph patterns by leap searchrdquo in Proceedings of the 2008ACM SIGMOD pp 433ndash444 Vancouver Canada June 2008

[33] L T +omas S R Valluri and K Karlapalem ldquoMarginmaximal frequent subgraph miningrdquo in Proceedings of theSixth International Conference on Data Mining (ICDMrsquo06)Hong Kong China December 2006

[34] V Ingalalli D Ienco and P Poncelet ldquoMining frequentsubgraphs in multigraphsrdquo Information Sciences vol 451-452pp 50ndash66 2018

[35] L Min Z C Wang J Liang and X R Yuan ldquoOD-wheelvisual design to explore OD patterns of a central regionrdquo inProceedings of the Visualization Symposium 2015 HangzhouChina April 2015

[36] B Jenny D M Stephen I Muehlenhaus and B E MarstonldquoDesign principles for origin-destination flowmapsrdquo Car-tography and Geographic Information Science vol 45 no 1pp 1ndash15 2016

[37] S Kim S Jeong I Woo Y Jang R Maciejewski andD S Ebert ldquoData flow analysis and visualization for spa-tiotemporal statistical data without trajectory informationrdquoIEEE Transactions on Visualization and Computer Graphicsvol 24 no 3 pp 1287ndash1300 2018

[38] J Wood J Dykes and A Slingsby ldquoVisualisation of originsdestinations and flows with OD mapsrdquo 7e CartographicJournal vol 47 no 2 pp 117ndash129 2010

[39] M Zillinger ldquoTourist routes a time-geographical approach onGerman car-tourists in Swedenrdquo Tourism Geographies vol 9no 1 pp 64ndash83 2007

[40] S Page and J Connell ldquoExploring the spatial patterns of car-based tourist travel in loch lomond and trossachs nationalpark Scotlandrdquo Tourism Management vol 29 no 3pp 561ndash580 2008

[41] A Inokuchi T Washio and H ampMotoda ldquoAn apriori-basedalgorithm for mining frequent substructures from graphdatardquo in Proceedings of the 4th European Conference onPrinciples of Data Mining and Knowledge DiscoveryPKDDrsquo00 pp 13ndash23 Lyon France September 2000

[42] M Kuramochi and G Karypis ldquoFrequent subgraph discov-eryrdquo in Proceedings of the 2001 IEEE International Conferenceon Data Mining ICDMrsquo01 pp 313ndash320 IEEE ComputerSociety San Jose CA USA August 2001

[43] M Oppermann ldquoA model of travel itinerariesrdquo Journal ofTravel Research vol 33 no 4 pp 57ndash61 1995

[44] N Shoval and A Raveh ldquoCategorization of tourist attractionsand the modeling of tourist cities based on the co-plotmethod of multivariate analysisrdquo Tourism Managementvol 25 no 6 pp 741ndash750 2004

[45] G Lau and B McKercher ldquoUnderstanding tourist movementpatterns in a destination a gis approachrdquo Tourism andHospitality Research vol 9 no 1 pp 39ndash49 2007

[46] B McKercher and G Lau ldquoMovement patterns of touristswithin a destinationrdquo Tourism Geographies vol 10 no 3pp 355ndash374 2008

Journal of Advanced Transportation 15

Page 7: DiscoveringtheGraph-BasedFlowPatternsofCarTouristsUsing ...downloads.hindawi.com/journals/jat/2020/4795830.pdfResearchArticle DiscoveringtheGraph-BasedFlowPatternsofCarTouristsUsing

growing strategy to find all of the connected subgraphs thatfrequently appear in a graph database+e results have shownthat FSG can be finished in 600 seconds gSpan [26] isdesigned to reduce or avoid the candidate generation and

pruning false positives used in AGM and FSG gSpan cancomplete the same task in 10 seconds Considering the ef-ficiency in this study gSpan was used to mine frequentsubgraphs in a directed graph dataset and then find the

Num

ber o

f car

s at e

ach

poin

t1000

02500

01000

02500

0

01000

5000

02500

00 50 100 150 200 250 300

Days

A1

A5

A7

A10

P1

P2

A2

(a)

Num

ber o

f car

s at e

ach

poin

t

A1

A5

A7

A10

P1

P2

A2

0 10 20 30 40 50 60 70Days

10000

10000

500

02500

01000

5003000200010002500

0

(b)

Figure 6 Daily traffic flow at each monitoring point

C1

C216000

14000

12000

10000

8000

6000

4000

2000

0

Traffi

c vol

ume

A5P1 A5P2 P2P1 A10A2 A1P1A1A10 A1P2 P1A10 A5A1 A2A1 A2P1A10A7A7P1 A10P2 A7A2 A1A7 A2P2 P2A7 A5A10A7A5 A5A2

Links

Kilometers

0 05 1 2 3 4

Figure 7 Flow map of tourism traffic

Table 2 Types of car tourist visiting the island

Types Number of records Number of graphsAll car tourists 361693 76Intracity car tourists in Shenzhen 240452 76Intercity car tourists outside Shenzhen 121241 75

Journal of Advanced Transportation 7

frequent subgraphs withmaximum length in the results as theflow patterns of car tourists +e algorithmic details of gSpanare available in reference [26] An extended instruction isgiven below +e method consisted of two steps (1) Findingthe frequent subgraphs using gSpan First the frequencies ofedges and nodes of all graphs was calculated Second thefrequencies were compared with the minimum supportthreshold and the infrequent edges and nodes were removed+en the remaining nodes and edges were reorderedaccording to the frequency And the frequency of each edgewas calculated again Finally the subgraphs of the restoredgraph were mined according to mini Code and it was de-termined whether the current DFS encoding is the minimumcode or not If so current edges were added to the results and

further attempts were made to add possible edges If not themining process was finished (2) Finding frequent subgraphswithmaximum length+ere are a large number of subgraphsin the obtained results and some subgraphs are partial graphsof the others +erefore this kind of subgraph was deleted bycomparing the labels of nodes and edges and the final resultswere the maximum frequent subgraphs

6 Results

In this section we first analyzed the spatial and temporaldistribution of tourist flows by statistical methods and maps+en we divided the tourist flows into intracity car touristsand intercity car tourists and used a frequent subgraphmining algorithm for pattern recognition Finally wesummarized the movement patterns of all tourists

61 Spatial-Temporal Characteristics of the TouristTraffic Flow

611 Temporal Characteristics Figure 6(a) depicts 295days of data before any preprocessing was applied Due to

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 74

C1

(a)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 74

C1

(b)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C1

C2

S = 73

(c)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C1

C2

S = 73

(d)

Figure 8 Flow patterns inferred from intracity tourists

Table 3 +e list of flow patterns inferred from intracity tourists

Figures PatternsFigure 8(a) (A1⟶ P2 andA1hArrP1)

Figure 8(b) A1⟶ P2⟶ P1⟶ A1Figure 8(c) A1⟶ A2 andA1hArrP1hArrP2Figure 8(d) A1⟶ A10 andA1hArrP1hArrP2

8 Journal of Advanced Transportation

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 62

C1

(a)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 62

C1

(b)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 61

C1

(c)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 57

C2

C1

(d)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 56

C1

C2

(e)

Figure 9 Flow patterns inferred from intercity tourists

Journal of Advanced Transportation 9

the failure of device communication or power the con-structed movements graphs may be incomplete whichwould lead to the loss of frequent subgraphs +ereforewe selected 76 days of valid data as the dataset for frequentpattern mining Figure 6(b) shows that (1) the touristflows at A2 and A10 near the sandy beaches has similartemporal characteristics +eir peak hours of touristvolume occur both on holidays and weekends while onweekdays the flow curves are relatively stable (2) P1 andP2 are located in two parking lots Although there aredifferences in the tourist volumes the trends are similar(3) For the two entrances to the tourist attractions theaverage daily volume of tourist traffic at A1 is 07 timesthat of A5 After sorting the volume of tourist trafficPengcheng Community has the most car tourists followedby two tourist attractions with a sandy beach (ieDongchong and Xichong Community) and the leastvisited attraction is Nanao Community

612 Spatial Characteristics Figure 7 shows the flowmap ofthe aggregated tourists transferring in multiple monitoringpoint pairs +e tourist volumes are depicted and sorted inthe left bar chart in the figure and the link thickness rep-resents the traffic volume As can be seen the links with thelargest traffic volume are A5P1 A5P2 P2P1 A10A2A1P1 and A1P2 (A5P1 is a simplified form for A5harrP1which represents the forth and back tourist flow between A5and P1) +ese links could be divided into two areasC1(A1 A5 P1 P2) and C2(A2 A10) Further inspectionreveals that the number of tourists transferring between P1and P2 in area C1 is close to the value between A2 and A10in area C2 but the number of tourists visiting P1 and P2 is312 times that of A2 and A10

62 Analyses of the Detected Flow Patterns +e flow map isintuitive but it suffers from serious visual clutter and it isdifficult to read because of overlapping flows It can also beseen in Figure 7 that the flow map only shows the trafficvolumes between the monitoring points in the touristdestination But it could not express the spatial transferdirections of car tourists +erefore these facts motivate usto find a new approach to solve these problems +is sectiondescribes the use of frequent subgraph mining algorithm toexplore the spatial flow patterns with directions in touristtraffic and to obtain the maximum frequent patterns fromthe daily tourist movement graphs according to a predefinedminimum support threshold By using the identificationmethod of car tourists introduced in Section 5 two types of

data were extracted and used for the subsequent flow patternmining (listed in Table 2)

621 Flow Patterns of Intracity Tourists in Shenzhen+e experiments were conducted with the dataset of intracitycar tourists in Shenzhen (as shown in Table 2) +e mini-mum support threshold for gSpan was set to 73 +e spatialflow patterns are represented in Figure 8 and listed in Ta-ble 3+e inferred patterns could be divided into two groupsOne group consists of the patterns shown in Figures 8(a) and8(b) which show the tourist spatial transfer process in areaC1 +e other group contains the patterns shownFigures 8(c) and 8(d) which represent the tourists trans-ferring between area C1 and area C2 +e two groupsdemonstrate that the intracity car tourists who arrived at P1and P2 preferred to choose Kuinan Road at A1 instead ofPengfei Road at A5 +e difference between Figures 8(a) and8(b) is the existence of the circle tourist flow +e differencebetween Figures 8(c) and 8(d) is the presence of tourists flowback and forth+e tourist flow from A1 to A2 have only onedirection One of the reasons may be that this study failed tofind a suitable monitoring point on Pingxi Road resulting inthe loss of directionality for this part of tourist flow

622 Flow Patterns of Intercity Car Tourists Figure 9 showsflow patterns of intercity car tourists+e dataset used in thissection is composed of intercity tourists (as shown in Ta-ble 2) +e minimum support threshold for gSpan was set to56 +e result flow patterns are sorted by the supportthreshold as listed in Table 4

From these patterns the following could be con-cluded (1) +e most frequent patterns are shown inFigures 9(a)ndash9(c) +e frequencies of the discovered flowpatterns are 62 62 and 61 respectively +ese threepatterns describe the preference of intercity tourists forarea C1+is kind of tourists first visited one of attractionsnear a parking lot P1 or P2 and then visited anotherattraction or just visited one tourist attraction near P1 orP2 (2)+e above three patterns are different from those ofthe intracity tourists in Shenzhen +ere is no circle tourbetween P1 P2 and A1 +e reason for this is that sometourists chose to continue drive to area C2 (3) +eminimum support threshold of these two patterns asshown in Figures 9(d) and 9(e) is significantly smallerthan that shown in Figures 9(a)ndash9(e) but it reflects thespatial transfer preferences of tourists from neighboringcities in areas C1 and C2 In Figure 9(d) the car touriststended to drive directly from A5 to A10 after visiting

Table 4 +e list of the flow patterns inferred from intercity tourists

Figures PatternsFigure 9(a) A1⟶ P2⟶ P1Figure 9(b) A1hArrP1hArrP2Figure 9(c) A1⟶ P1 andA1⟶ P2Figure 9(d) A5⟶ P1⟶ A10Figure 9(e) A10⟶ P1hArrP2 andA10hArrA2

10 Journal of Advanced Transportation

attractions near P1 As shown in Figure 9(e) the cartourists that flowed between A10 and A2 went back to P1and changed car parking lots between P1 and P2

However there is no tourist flow to A1 and A5 +e reasonfor this may be that the traffic flow of Pingxi Road (theexpressway in and out of the peninsula) was not

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(a)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(b)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(c)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(d)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(e)

Figure 10 Flow patterns of all tourists

Journal of Advanced Transportation 11

monitored +is part of traffic flow could directly arriveand then leave from A10

623 Flow Patterns of All Tourists +is section presents anexploration of the spatial flow patterns of all car tourists +eresults are shown in Figure 10 +e minimum supportthreshold for gSpan was set to 73 which means that 73 of the76 graphs contained the discovered flow pattern +e resultpatterns are listed in Table 5

All of the patterns shown in Figure 10 have touristflow between area C1 and area C2 +is is consistent withthe trend in the flow map (shown in Figure 7) As thepicture shows A1 is the main entrance for tourists toenter and exit Dapeng Island Some of the car touristsdrove to P1 or P2 and the rest flowed to A10 or A2Figures 10(a)ndash10(c) show the directions of tourist flowsbetween the Dapeng Community Pengcheng Commu-nity and Xichong Community Figure 10(d) shows thedirections of tourist flows between Dapeng CommunityPengcheng Community and Dongchong CommunityFigure 10(e) represents only the directions of tourist flowsbetween Dapeng Community and Dongchong Commu-nity Furthermore the directions of tourist flows in areaC1 and area C2 or between area C1 and area C2 aredifferent Taking Figures 10(a)ndash10(c) as examples al-though the orders of P1 and P2 that are accessed from A1have a lack of regularity they have their own charac-teristics when considering the accompanying paths toA10 +ese examples indicate that if there is a tourist flowbetween A1 and A10 it could be divided into two cases Inone case some of the tourists returned directly and in theother case some of the tourists flowed to P1 In the firstcase either the tourist flow passing by A1 first accessed P2and then visited P1 or both P1 and P2 had tourists at thesame time In the second case some of the touristsreturned from A10 visited P1 and P2 and then left theisland

7 Conclusions and Discussions

71 Conclusions In order to facilitate the management oftourist traffic flow the car tourists in Dapeng Island weretaken as a research case +e experimental analysis used realdata captured by video devices in the research area Due to thelack of suitable device installation locations the capturedpicture from the video device had a certain distance from theroad+e catch rate of the touristsrsquo cars was low However thedetailed time-series data in one day could be collectedCompared with the manual survey the collected data was

improved in terms of reliability and richness A day waschosen as the time unit for frequent pattern mining Afterselecting the available data we divided the data of cars intointracity and intercity tourists Next the license plate datawere transformed into movement graphs according to thevisited location sequence between multiple monitoringpoints +e intricate flow patterns of the car tourists werediscovered by gSpan algorithm which had the best perfor-mance in terms of the quality of the results and the executiontime and which had already proven to be efficient for frequentsubgraph mining +e conclusions are as follows

(1) +e car tourists had obvious preferences in the se-lection of trip time and tourist attractions (shown inFigures 6 and 7) In terms of time the curves oftourist flow at each monitoring point were similar+ere were a large number of car tourists at variousattractions on holidays but the volume of car touristswas relatively lower on weekdays +e same types ofattractions had the similar trends in tourist flowsuch as the two attractions with a sandy beach(Dongchong Community and Xichong Community)and the ancient city and cultural attractions (DapengAncient City and Dongshan Temple) In terms ofspace the attractions that are close to the entrance ofa scenic area and rich in tourist resources were morepopular with tourists However due to the terrainbarrier in the scenic area the traffic conditions af-fected the movements of tourists between multipleattractions

(2) Different types of car tourists had similar spatialchoices in scenic area (shown in Figures 8ndash10) Forexample different types of tourists had flow patternsthat described the movements in one area (as shownin Figures 8(a) 8(b) and 9(a)ndash9(c)) and the move-ments between different areas (as shown inFigures 8(c) 8(d) 9(d) and 9(e)) +e intercitytourists and intracity tourists had different choices inscenic spots +e intercity tourists would takemultidestination trips instead of single destinationtrips in the same type of attractions As can be seen inFigures 8 and 9 there are the two sandy beach at-tractions in areaC2 the intracity tourists tend to visitone of them while the intercity tourists would visitboth attractions Specifically to save time andmoney intercity tourists would visit multiple at-tractions in one trip instead of making multiple tripsAdditionally another difference was that there wasno circlet in area C1 for intercity tourists and notraffic flow between A2 and A10 for intracity tourists

Table 5 +e list of flow patterns inferred from all tourists

Figures PatternsFigure 10(a) (A1⟶ A10⟶ A1) and (A10⟶ P1hArrP2) and (P1⟶ A1)

Figure 10(b) (A1hArrA10 andA1⟶ P1 andA1⟶ P2)

Figure 10(c) (A1hArrP10 andA1⟶ P2⟶ P1⟶ A1)

Figure 10(d) (A1hArrP1hArrP2 andA1hArrA10hArrA2)

Figure 10(e) (A1hArrP1hArrP2 andA1⟶ A2)

12 Journal of Advanced Transportation

(3) Although the patterns depicted on the map lookcomplicated and messy after the patterns are con-verted to rules they become clear In pattern mapsonly Figures 8(b) 9(a) and 9(d) show a clear uni-directionality +e rest is complex and is difficult tocompare As we can see from Figure 8(a) theintracity tourists passing point A1 could be dividedinto two groups one group flowed to parking lot P1and then leaved the scenic area from P1 +e othergroup flowed to parking lot P2 Two groups havesimultaneity in tourist routes So we could convertthe flow patterns to rules and use ldquoandrdquo to illustratethe simultaneity of different routes in the samepattern (as shown in Tables 3ndash5) By this way allpatterns can be applied to traffic control system forregional tourism

(4) Large primary attractions are more attractive thansmaller secondary attractions Looking from thearrow on the pattern maps tourists always visit areaC1 first and then select area C2 +e main reason isthat area C1 has abundant tourism resources anddiverse tourism activities For example there arecultural attractions (eg Dapeng Ancient City andDongshan Temple) sandy beach entertainmentsand large parking lots for tourists in area C1 +esefactors are also often considered in the evaluation ofthe importance of attractions in tourism network

(5) +e tourists tended to park their cars in an easy-to-access place even if the visited attractions arechanged as shown in Figures 8(a)ndash8(d) 9(b) and9(e) Again here we take area C1 as an example +edistance between the Dapeng Ancient City andDongshan Temple is about 1 km On the patternmaps we can see that there are two-way arrowspointing to the two parking lots P1 and P2 +isindicates that the car tourists tended to park theircars to the nearest parking lot so that they could pickthem up when the tour destination is changed

72 Discussions Tourist flow is the key to traffic man-agement in tourism destinations and it affects the de-velopment of tourism on an island and the experience oftourists +e recent development of transport technolo-gies has shown that traffic flow data will be increasinglycollected and it will be available for data analysis+erefore advanced data analytics should be used tointerpret and depict the complex movements of cartourists +e proposed approach in this study wasintended to find (i) the statistical summaries of the spatial-temporal characteristics of car tourists in the researcharea helping to discover patterns from the mass carlicense plate data (ii) the flow patterns of intercity andintracity tourists helping to illustrate the different pref-erences of the two types of car tourists and (iii) the mostfrequent patterns of all tourists helping to identify the lawof tourist movement and make efficient policy for themanagement of tourist traffic flow +e presented ap-proach enriched the analytical methodology of tourist

traffic flow and suggested a shift from the conventionaland complicated paper or computer interview-basedmethod to a dynamic flow graph-based method Fur-thermore we have shown how transportation data pro-vides hard-to-obtain insights and quantitative results fortourists

In order to illustrate the law of spatial movement oftourists related studies have proposed a variety of macroflow patterns [43ndash45] For example in 2008 McKercher [46]proposed 11 prominent route styles in urban destinations+e macropatterns retain only the main components andsimplify the details and are often used in tourism man-agement to guide destination development Compared withmodeling the flow patterns of tourists at the macrolevelmodeling tourist flow at the microscale is more complicatedLew and McKercher [17] noted that it is a challenge tobalance model effectiveness and usability +e reason is thatsimple patterns may not provide enough details for use andcomplex patterns may be difficult to interpret and apply Inthis study we used the video devices installed at key nodes ofroad network to collect tourist traffic flows and used thefrequent subgraph mining algorithm to discover flow pat-terns at the microscale +e extracted patterns can beconverted into rules and applied to the traffic control systemfor the management of regional tourists +e difficulty offinding and applying patterns at microscale could beovercome by this way

During the peak period of tourism the number of cartourists in the scenic area increases sharply It is easy toresult in road congestion and uneven distribution oftourists between attractions With the use of traffic flowdata the daily monthly and seasonal characteristics oftourists can be analyzed the future tourist flow can bepredicted and the flow patterns can be obtained by datamining methods +us the traffic management departmentcan effectively control the tourist traffic flow and thetourism department can develop attractive tourism prod-ucts to achieve a spatial balance of the distribution oftourists reduce traffic congestion and harmful gas emis-sions and weaken the impact of tourism environment andhuman body

Inside a scenic area attractions form a complex net-work of tourist flow due to the frequent spatial interactionEach attraction is both the origin and destination of cartourists Due to the differences in attractiveness degree ofdevelopment and convenience of transportation touristshave shown special preferences when choosing attractionsand trip routes Accordingly effective identification ofthese preferences will be beneficial to the development oftourist market and will also helpful for tour planners tounderstand how tourists see the spatial connection ofmultiple attractions However traditional manual surveysare time-consuming and laborious and the amount of dataobtained is small It is difficult to reveal tourist preferencesby this way Frequent subgraph mining algorithms providea desirable method for identifying the spatial preferences oftourists +is kind of method could perform well under thesupport of a large amount of movement data betweenattractions

Journal of Advanced Transportation 13

+is study has several limitations Firstly due to lack ofpower supply facilities it was unable to collect the traffic datafor each day When applying the model to the actual controlof tourist traffic flow it is necessary to further co-operatewith the traffic management department to obtain com-prehensive traffic flow data Secondly this study focused onthe mining of flow patterns the influencing factors behindthe identified patterns were not further analyzed As men-tioned by Lew andMcKercher [17] factors related to touristsand destinations can affect how tourists move or travel in adestination +ese factors include family composition in-come and valid information obtained before travelling It isdifficult to obtain these factors by relying only on the trafficflow data collected in the traffic monitoring system+erefore in future work field investigation will benecessary

Data Availability

As the data also form part of an ongoing study the raw dataneeded to reproduce these findings cannot be shared at thistime

Conflicts of Interest

+e authors declare no conflicts of interest

Acknowledgments

+e authors gratefully acknowledge the support of this re-search from the National Science Foundation of China(41701167) and the Basic Research Project of Shenzhen City(JCYJ20170307164104491 and JCYJ20190812171419161)

References

[1] D W Eby and L J Molnar ldquoImportance of scenic byways inroute choice a survey of driving tourists in the United StatesrdquoTransportation Research Part A (Policy and Practice) vol 36no 2 pp 95ndash106 2003

[2] B Wang Z Yang F Han and H Shi ldquoCar tourism inxinjiang the mediation effect of perceived value and touristsatisfaction on the relationship between destination imageand loyaltyrdquo Sustainability vol 9 no 1 pp 1ndash16 2016

[3] X M Zhang and Q Y Pen ldquoA study on tourism informationand self-driving tour trafficrdquo in Proceedings of the First Inter-national Conference on Transportation Engineering SouthwestJiaotong University Chengdu China July 2007

[4] China National Tourism Administration 7e Yearbook ofChina Tourism Statistics 2016 China Travel amp Tourism PressBeijing China 2017

[5] M A Mohamad Toha and H N Ismail ldquoA heritage tourismand tourist flow pattern a perspective on traditional versusmodern technologies in tracking the touristsrdquo InternationalJournal of Built Environment and Sustainability vol 2 no 2pp 85ndash92 2015

[6] K Siła-Nowicka and A S Fotheringham ldquoCalibrating spatialinteraction models from gps tracking data an example ofretail behaviourrdquo Computers Environment and Urban Sys-tems vol 74 pp 136ndash150 2019

[7] F Mao M Ji and T Liu ldquoMining spatiotemporal patterns ofurban dwellers from taxi trajectory datardquo Frontiers of EarthScience vol 10 no 2 pp 205ndash221 2016

[8] L J Zheng D Xia X Zhao and W L Liu ldquoMining tripattractive areas using large-scale taxi trajectory datardquo inProceedings of the 2017 IEEE International Symposium onParallel and Distributed Processing with Applications and 2017IEEE International Conference on Ubiquitous Computing andCommunications pp 1217ndash1222 ISPAIUCC GuangzhouChina December 2017

[9] I Rami and M O Shafiq ldquoDetecting taxi movements usingrandom swap clustering and sequential pattern miningrdquoJournal of Big Data vol 6 no 39 pp 1ndash26 2019

[10] G Kautsar and S Akbar ldquoTrajectory pattern mining using se-quential pattern mining and k-means for predicting future lo-cationrdquo Journal of Physics Conference Series vol 801 no 1 2017

[11] D Guo X Zhu H Jin P Gao and C Andris ldquoDiscoveringspatial patterns in origin-destination mobility datardquo Trans-actions in GIS vol 16 no 3 pp 411ndash429 2012

[12] J D Mazimpaka and S Timpf ldquoA visual and computationalanalysis approach for exploring significant locations and timeperiods along a bus routerdquo in Proceedings of the 9th ACMSIGSPATIAL International Workshop on ComputationalTransportation Science pp 43ndash48 San Francisco CA USAOctober 2016

[13] H Faroqi M Mesbah and J Kim ldquoSpatial-temporal simi-larity correlation between public transit passengers usingsmart card datardquo Journal of Advanced Transportationvol 2017 Article ID 1318945 14 pages 2017

[14] Z Wang Y Hu P Zhu Y Qin and L Jia ldquoRing aggregationpattern of metro passenger trips a study using smart carddatardquo Physica A Statistical Mechanics and Its Applicationsvol 491 pp 471ndash479 2018

[15] X Zhao X Lu Y Liu J Lin and J An ldquoTourist movementpatterns understanding from the perspective of travel partysize using mobile tracking data a case study of Xirsquoan ChinardquoTourism Management vol 69 pp 368ndash383 2018

[16] M Versichele L De Groote M Claeys Bouuaert T NeutensI Moerman and N Van De Weghe ldquoPattern mining intourist attraction visits through association rule learning onbluetooth tracking data a case study of ghent BelgiumrdquoTourism Management vol 44 pp 67ndash81 2014

[17] A Lew and B Mckercher ldquoModeling tourist movementsrdquoAnnals of Tourism Research vol 33 no 2 pp 403ndash423 2006

[18] A Birenboim S Anton-Clave A P Russo and N ShovalldquoTemporal activity patterns of theme park visitorsrdquo TourismGeographies vol 15 no 4 pp 601ndash619 2013

[19] H J Yun and M H Park ldquoTime-space movement of festivalvisitors in rural areas using a smart phone applicationrdquo AsiaPacific Journal of Tourism Research vol 20 no 11pp 1246ndash1265 2015

[20] S Kang ldquoAssociations between space-time constraints andspatial patterns of travelsrdquo Annals of Tourism Researchvol 61 pp 127ndash141 2016

[21] C Comito D Falcone and D Talia ldquoMining human mobilitypatterns from social geo-tagged datardquo Pervasive and MobileComputing vol 33 pp 91ndash107 2016

[22] Y T Zheng Y Li Z J Zha and T S Chua ldquoMining travelpatterns fromGPS-tagged photosrdquo in Lecture Notes in ComputerScience K T LeeWH Tsai H YM Liao T Chen JWHsiehand C C Tseng Eds Springer Berlin Germany 2011

[23] K M Borgwardt H P Kriegel and P WackersreutherldquoPattern mining in frequent dynamic subgraphsrdquo in Pro-ceedings of the 6th IEEE International Conference on Data

14 Journal of Advanced Transportation

Mining (ICDM 2006) Hong Kong China IEEE December2006

[24] D J Cook and L B HolderMining Graph Data Wiley PressHoboken NJ USA 2015

[25] C Jiang F Coenen andM Zito ldquoFrequent sub-graph miningon edge weighted graphsrdquo in Lecture Notes in ComputerScience T Bach Pedersen M K Mohania and A M TjoaEds Springer Berlin Germany pp 77ndash88 2010

[26] X Yan and J Han ldquogSpan graph-based substructure patternminingrdquo in Proceedings of the 2002 IEEE InternationalConference on Data Mining ICDMrsquo02 pp 721ndash724 IEEEComputer Society Maebashi City Japan December 2002

[27] C Borgelt and M R Berthold ldquoMining molecular fragmentsfinding relevant substructures of moleculesrdquo in Proceedings ofthe 2002 IEEE International Conference on Data MiningICDMrsquo02 pp 51ndash58 IEEE Computer Society Maebashi CityJapan December 2002

[28] J Huan W Wang and J Prins ldquoEfficient mining of frequentsubgraphs in the presence of isomorphismrdquo in Proceedings ofthe 7ird IEEE International Conference on Data MiningICDMrsquo03 pp 549ndash552 IEEE Computer Society MelbourneFL USA November 2003

[29] S Nijssen ldquoA quickstart in frequent structure mining canmake a differencerdquo in Proceedings of the Tenth ACM SIGKDDInternational Conference on Knowledge Discovery and DataMining KDDrsquo04 pp 647ndash652 ACM Seattle WA USAAugust 2004

[30] J Huan W Wang J Prins and J Yang ldquoSPIN miningmaximal frequent subgraphs from graph databasesrdquo in Pro-ceedings of the Tenth ACM SIGKDD International Conferenceon Knowledge Discovery and Data Mining Seattle WA USAACM August 2004

[31] M Worlein T Meinl I Fischer and M ampPhilippsen ldquoAquantitative comparison of the subgraph miners MoFagSpan FFSM and gastonrdquo in Proceedings of the 9th EuropeanConference on Principles and Practice of Knowledge Discoveryin Databases PKDDrsquo05 Springer Porto Portugal pp 392ndash403 2005

[32] X Yan H Cheng J Han and P ampYu ldquoMining significantgraph patterns by leap searchrdquo in Proceedings of the 2008ACM SIGMOD pp 433ndash444 Vancouver Canada June 2008

[33] L T +omas S R Valluri and K Karlapalem ldquoMarginmaximal frequent subgraph miningrdquo in Proceedings of theSixth International Conference on Data Mining (ICDMrsquo06)Hong Kong China December 2006

[34] V Ingalalli D Ienco and P Poncelet ldquoMining frequentsubgraphs in multigraphsrdquo Information Sciences vol 451-452pp 50ndash66 2018

[35] L Min Z C Wang J Liang and X R Yuan ldquoOD-wheelvisual design to explore OD patterns of a central regionrdquo inProceedings of the Visualization Symposium 2015 HangzhouChina April 2015

[36] B Jenny D M Stephen I Muehlenhaus and B E MarstonldquoDesign principles for origin-destination flowmapsrdquo Car-tography and Geographic Information Science vol 45 no 1pp 1ndash15 2016

[37] S Kim S Jeong I Woo Y Jang R Maciejewski andD S Ebert ldquoData flow analysis and visualization for spa-tiotemporal statistical data without trajectory informationrdquoIEEE Transactions on Visualization and Computer Graphicsvol 24 no 3 pp 1287ndash1300 2018

[38] J Wood J Dykes and A Slingsby ldquoVisualisation of originsdestinations and flows with OD mapsrdquo 7e CartographicJournal vol 47 no 2 pp 117ndash129 2010

[39] M Zillinger ldquoTourist routes a time-geographical approach onGerman car-tourists in Swedenrdquo Tourism Geographies vol 9no 1 pp 64ndash83 2007

[40] S Page and J Connell ldquoExploring the spatial patterns of car-based tourist travel in loch lomond and trossachs nationalpark Scotlandrdquo Tourism Management vol 29 no 3pp 561ndash580 2008

[41] A Inokuchi T Washio and H ampMotoda ldquoAn apriori-basedalgorithm for mining frequent substructures from graphdatardquo in Proceedings of the 4th European Conference onPrinciples of Data Mining and Knowledge DiscoveryPKDDrsquo00 pp 13ndash23 Lyon France September 2000

[42] M Kuramochi and G Karypis ldquoFrequent subgraph discov-eryrdquo in Proceedings of the 2001 IEEE International Conferenceon Data Mining ICDMrsquo01 pp 313ndash320 IEEE ComputerSociety San Jose CA USA August 2001

[43] M Oppermann ldquoA model of travel itinerariesrdquo Journal ofTravel Research vol 33 no 4 pp 57ndash61 1995

[44] N Shoval and A Raveh ldquoCategorization of tourist attractionsand the modeling of tourist cities based on the co-plotmethod of multivariate analysisrdquo Tourism Managementvol 25 no 6 pp 741ndash750 2004

[45] G Lau and B McKercher ldquoUnderstanding tourist movementpatterns in a destination a gis approachrdquo Tourism andHospitality Research vol 9 no 1 pp 39ndash49 2007

[46] B McKercher and G Lau ldquoMovement patterns of touristswithin a destinationrdquo Tourism Geographies vol 10 no 3pp 355ndash374 2008

Journal of Advanced Transportation 15

Page 8: DiscoveringtheGraph-BasedFlowPatternsofCarTouristsUsing ...downloads.hindawi.com/journals/jat/2020/4795830.pdfResearchArticle DiscoveringtheGraph-BasedFlowPatternsofCarTouristsUsing

frequent subgraphs withmaximum length in the results as theflow patterns of car tourists +e algorithmic details of gSpanare available in reference [26] An extended instruction isgiven below +e method consisted of two steps (1) Findingthe frequent subgraphs using gSpan First the frequencies ofedges and nodes of all graphs was calculated Second thefrequencies were compared with the minimum supportthreshold and the infrequent edges and nodes were removed+en the remaining nodes and edges were reorderedaccording to the frequency And the frequency of each edgewas calculated again Finally the subgraphs of the restoredgraph were mined according to mini Code and it was de-termined whether the current DFS encoding is the minimumcode or not If so current edges were added to the results and

further attempts were made to add possible edges If not themining process was finished (2) Finding frequent subgraphswithmaximum length+ere are a large number of subgraphsin the obtained results and some subgraphs are partial graphsof the others +erefore this kind of subgraph was deleted bycomparing the labels of nodes and edges and the final resultswere the maximum frequent subgraphs

6 Results

In this section we first analyzed the spatial and temporaldistribution of tourist flows by statistical methods and maps+en we divided the tourist flows into intracity car touristsand intercity car tourists and used a frequent subgraphmining algorithm for pattern recognition Finally wesummarized the movement patterns of all tourists

61 Spatial-Temporal Characteristics of the TouristTraffic Flow

611 Temporal Characteristics Figure 6(a) depicts 295days of data before any preprocessing was applied Due to

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 74

C1

(a)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 74

C1

(b)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C1

C2

S = 73

(c)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C1

C2

S = 73

(d)

Figure 8 Flow patterns inferred from intracity tourists

Table 3 +e list of flow patterns inferred from intracity tourists

Figures PatternsFigure 8(a) (A1⟶ P2 andA1hArrP1)

Figure 8(b) A1⟶ P2⟶ P1⟶ A1Figure 8(c) A1⟶ A2 andA1hArrP1hArrP2Figure 8(d) A1⟶ A10 andA1hArrP1hArrP2

8 Journal of Advanced Transportation

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 62

C1

(a)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 62

C1

(b)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 61

C1

(c)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 57

C2

C1

(d)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 56

C1

C2

(e)

Figure 9 Flow patterns inferred from intercity tourists

Journal of Advanced Transportation 9

the failure of device communication or power the con-structed movements graphs may be incomplete whichwould lead to the loss of frequent subgraphs +ereforewe selected 76 days of valid data as the dataset for frequentpattern mining Figure 6(b) shows that (1) the touristflows at A2 and A10 near the sandy beaches has similartemporal characteristics +eir peak hours of touristvolume occur both on holidays and weekends while onweekdays the flow curves are relatively stable (2) P1 andP2 are located in two parking lots Although there aredifferences in the tourist volumes the trends are similar(3) For the two entrances to the tourist attractions theaverage daily volume of tourist traffic at A1 is 07 timesthat of A5 After sorting the volume of tourist trafficPengcheng Community has the most car tourists followedby two tourist attractions with a sandy beach (ieDongchong and Xichong Community) and the leastvisited attraction is Nanao Community

612 Spatial Characteristics Figure 7 shows the flowmap ofthe aggregated tourists transferring in multiple monitoringpoint pairs +e tourist volumes are depicted and sorted inthe left bar chart in the figure and the link thickness rep-resents the traffic volume As can be seen the links with thelargest traffic volume are A5P1 A5P2 P2P1 A10A2A1P1 and A1P2 (A5P1 is a simplified form for A5harrP1which represents the forth and back tourist flow between A5and P1) +ese links could be divided into two areasC1(A1 A5 P1 P2) and C2(A2 A10) Further inspectionreveals that the number of tourists transferring between P1and P2 in area C1 is close to the value between A2 and A10in area C2 but the number of tourists visiting P1 and P2 is312 times that of A2 and A10

62 Analyses of the Detected Flow Patterns +e flow map isintuitive but it suffers from serious visual clutter and it isdifficult to read because of overlapping flows It can also beseen in Figure 7 that the flow map only shows the trafficvolumes between the monitoring points in the touristdestination But it could not express the spatial transferdirections of car tourists +erefore these facts motivate usto find a new approach to solve these problems +is sectiondescribes the use of frequent subgraph mining algorithm toexplore the spatial flow patterns with directions in touristtraffic and to obtain the maximum frequent patterns fromthe daily tourist movement graphs according to a predefinedminimum support threshold By using the identificationmethod of car tourists introduced in Section 5 two types of

data were extracted and used for the subsequent flow patternmining (listed in Table 2)

621 Flow Patterns of Intracity Tourists in Shenzhen+e experiments were conducted with the dataset of intracitycar tourists in Shenzhen (as shown in Table 2) +e mini-mum support threshold for gSpan was set to 73 +e spatialflow patterns are represented in Figure 8 and listed in Ta-ble 3+e inferred patterns could be divided into two groupsOne group consists of the patterns shown in Figures 8(a) and8(b) which show the tourist spatial transfer process in areaC1 +e other group contains the patterns shownFigures 8(c) and 8(d) which represent the tourists trans-ferring between area C1 and area C2 +e two groupsdemonstrate that the intracity car tourists who arrived at P1and P2 preferred to choose Kuinan Road at A1 instead ofPengfei Road at A5 +e difference between Figures 8(a) and8(b) is the existence of the circle tourist flow +e differencebetween Figures 8(c) and 8(d) is the presence of tourists flowback and forth+e tourist flow from A1 to A2 have only onedirection One of the reasons may be that this study failed tofind a suitable monitoring point on Pingxi Road resulting inthe loss of directionality for this part of tourist flow

622 Flow Patterns of Intercity Car Tourists Figure 9 showsflow patterns of intercity car tourists+e dataset used in thissection is composed of intercity tourists (as shown in Ta-ble 2) +e minimum support threshold for gSpan was set to56 +e result flow patterns are sorted by the supportthreshold as listed in Table 4

From these patterns the following could be con-cluded (1) +e most frequent patterns are shown inFigures 9(a)ndash9(c) +e frequencies of the discovered flowpatterns are 62 62 and 61 respectively +ese threepatterns describe the preference of intercity tourists forarea C1+is kind of tourists first visited one of attractionsnear a parking lot P1 or P2 and then visited anotherattraction or just visited one tourist attraction near P1 orP2 (2)+e above three patterns are different from those ofthe intracity tourists in Shenzhen +ere is no circle tourbetween P1 P2 and A1 +e reason for this is that sometourists chose to continue drive to area C2 (3) +eminimum support threshold of these two patterns asshown in Figures 9(d) and 9(e) is significantly smallerthan that shown in Figures 9(a)ndash9(e) but it reflects thespatial transfer preferences of tourists from neighboringcities in areas C1 and C2 In Figure 9(d) the car touriststended to drive directly from A5 to A10 after visiting

Table 4 +e list of the flow patterns inferred from intercity tourists

Figures PatternsFigure 9(a) A1⟶ P2⟶ P1Figure 9(b) A1hArrP1hArrP2Figure 9(c) A1⟶ P1 andA1⟶ P2Figure 9(d) A5⟶ P1⟶ A10Figure 9(e) A10⟶ P1hArrP2 andA10hArrA2

10 Journal of Advanced Transportation

attractions near P1 As shown in Figure 9(e) the cartourists that flowed between A10 and A2 went back to P1and changed car parking lots between P1 and P2

However there is no tourist flow to A1 and A5 +e reasonfor this may be that the traffic flow of Pingxi Road (theexpressway in and out of the peninsula) was not

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(a)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(b)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(c)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(d)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(e)

Figure 10 Flow patterns of all tourists

Journal of Advanced Transportation 11

monitored +is part of traffic flow could directly arriveand then leave from A10

623 Flow Patterns of All Tourists +is section presents anexploration of the spatial flow patterns of all car tourists +eresults are shown in Figure 10 +e minimum supportthreshold for gSpan was set to 73 which means that 73 of the76 graphs contained the discovered flow pattern +e resultpatterns are listed in Table 5

All of the patterns shown in Figure 10 have touristflow between area C1 and area C2 +is is consistent withthe trend in the flow map (shown in Figure 7) As thepicture shows A1 is the main entrance for tourists toenter and exit Dapeng Island Some of the car touristsdrove to P1 or P2 and the rest flowed to A10 or A2Figures 10(a)ndash10(c) show the directions of tourist flowsbetween the Dapeng Community Pengcheng Commu-nity and Xichong Community Figure 10(d) shows thedirections of tourist flows between Dapeng CommunityPengcheng Community and Dongchong CommunityFigure 10(e) represents only the directions of tourist flowsbetween Dapeng Community and Dongchong Commu-nity Furthermore the directions of tourist flows in areaC1 and area C2 or between area C1 and area C2 aredifferent Taking Figures 10(a)ndash10(c) as examples al-though the orders of P1 and P2 that are accessed from A1have a lack of regularity they have their own charac-teristics when considering the accompanying paths toA10 +ese examples indicate that if there is a tourist flowbetween A1 and A10 it could be divided into two cases Inone case some of the tourists returned directly and in theother case some of the tourists flowed to P1 In the firstcase either the tourist flow passing by A1 first accessed P2and then visited P1 or both P1 and P2 had tourists at thesame time In the second case some of the touristsreturned from A10 visited P1 and P2 and then left theisland

7 Conclusions and Discussions

71 Conclusions In order to facilitate the management oftourist traffic flow the car tourists in Dapeng Island weretaken as a research case +e experimental analysis used realdata captured by video devices in the research area Due to thelack of suitable device installation locations the capturedpicture from the video device had a certain distance from theroad+e catch rate of the touristsrsquo cars was low However thedetailed time-series data in one day could be collectedCompared with the manual survey the collected data was

improved in terms of reliability and richness A day waschosen as the time unit for frequent pattern mining Afterselecting the available data we divided the data of cars intointracity and intercity tourists Next the license plate datawere transformed into movement graphs according to thevisited location sequence between multiple monitoringpoints +e intricate flow patterns of the car tourists werediscovered by gSpan algorithm which had the best perfor-mance in terms of the quality of the results and the executiontime and which had already proven to be efficient for frequentsubgraph mining +e conclusions are as follows

(1) +e car tourists had obvious preferences in the se-lection of trip time and tourist attractions (shown inFigures 6 and 7) In terms of time the curves oftourist flow at each monitoring point were similar+ere were a large number of car tourists at variousattractions on holidays but the volume of car touristswas relatively lower on weekdays +e same types ofattractions had the similar trends in tourist flowsuch as the two attractions with a sandy beach(Dongchong Community and Xichong Community)and the ancient city and cultural attractions (DapengAncient City and Dongshan Temple) In terms ofspace the attractions that are close to the entrance ofa scenic area and rich in tourist resources were morepopular with tourists However due to the terrainbarrier in the scenic area the traffic conditions af-fected the movements of tourists between multipleattractions

(2) Different types of car tourists had similar spatialchoices in scenic area (shown in Figures 8ndash10) Forexample different types of tourists had flow patternsthat described the movements in one area (as shownin Figures 8(a) 8(b) and 9(a)ndash9(c)) and the move-ments between different areas (as shown inFigures 8(c) 8(d) 9(d) and 9(e)) +e intercitytourists and intracity tourists had different choices inscenic spots +e intercity tourists would takemultidestination trips instead of single destinationtrips in the same type of attractions As can be seen inFigures 8 and 9 there are the two sandy beach at-tractions in areaC2 the intracity tourists tend to visitone of them while the intercity tourists would visitboth attractions Specifically to save time andmoney intercity tourists would visit multiple at-tractions in one trip instead of making multiple tripsAdditionally another difference was that there wasno circlet in area C1 for intercity tourists and notraffic flow between A2 and A10 for intracity tourists

Table 5 +e list of flow patterns inferred from all tourists

Figures PatternsFigure 10(a) (A1⟶ A10⟶ A1) and (A10⟶ P1hArrP2) and (P1⟶ A1)

Figure 10(b) (A1hArrA10 andA1⟶ P1 andA1⟶ P2)

Figure 10(c) (A1hArrP10 andA1⟶ P2⟶ P1⟶ A1)

Figure 10(d) (A1hArrP1hArrP2 andA1hArrA10hArrA2)

Figure 10(e) (A1hArrP1hArrP2 andA1⟶ A2)

12 Journal of Advanced Transportation

(3) Although the patterns depicted on the map lookcomplicated and messy after the patterns are con-verted to rules they become clear In pattern mapsonly Figures 8(b) 9(a) and 9(d) show a clear uni-directionality +e rest is complex and is difficult tocompare As we can see from Figure 8(a) theintracity tourists passing point A1 could be dividedinto two groups one group flowed to parking lot P1and then leaved the scenic area from P1 +e othergroup flowed to parking lot P2 Two groups havesimultaneity in tourist routes So we could convertthe flow patterns to rules and use ldquoandrdquo to illustratethe simultaneity of different routes in the samepattern (as shown in Tables 3ndash5) By this way allpatterns can be applied to traffic control system forregional tourism

(4) Large primary attractions are more attractive thansmaller secondary attractions Looking from thearrow on the pattern maps tourists always visit areaC1 first and then select area C2 +e main reason isthat area C1 has abundant tourism resources anddiverse tourism activities For example there arecultural attractions (eg Dapeng Ancient City andDongshan Temple) sandy beach entertainmentsand large parking lots for tourists in area C1 +esefactors are also often considered in the evaluation ofthe importance of attractions in tourism network

(5) +e tourists tended to park their cars in an easy-to-access place even if the visited attractions arechanged as shown in Figures 8(a)ndash8(d) 9(b) and9(e) Again here we take area C1 as an example +edistance between the Dapeng Ancient City andDongshan Temple is about 1 km On the patternmaps we can see that there are two-way arrowspointing to the two parking lots P1 and P2 +isindicates that the car tourists tended to park theircars to the nearest parking lot so that they could pickthem up when the tour destination is changed

72 Discussions Tourist flow is the key to traffic man-agement in tourism destinations and it affects the de-velopment of tourism on an island and the experience oftourists +e recent development of transport technolo-gies has shown that traffic flow data will be increasinglycollected and it will be available for data analysis+erefore advanced data analytics should be used tointerpret and depict the complex movements of cartourists +e proposed approach in this study wasintended to find (i) the statistical summaries of the spatial-temporal characteristics of car tourists in the researcharea helping to discover patterns from the mass carlicense plate data (ii) the flow patterns of intercity andintracity tourists helping to illustrate the different pref-erences of the two types of car tourists and (iii) the mostfrequent patterns of all tourists helping to identify the lawof tourist movement and make efficient policy for themanagement of tourist traffic flow +e presented ap-proach enriched the analytical methodology of tourist

traffic flow and suggested a shift from the conventionaland complicated paper or computer interview-basedmethod to a dynamic flow graph-based method Fur-thermore we have shown how transportation data pro-vides hard-to-obtain insights and quantitative results fortourists

In order to illustrate the law of spatial movement oftourists related studies have proposed a variety of macroflow patterns [43ndash45] For example in 2008 McKercher [46]proposed 11 prominent route styles in urban destinations+e macropatterns retain only the main components andsimplify the details and are often used in tourism man-agement to guide destination development Compared withmodeling the flow patterns of tourists at the macrolevelmodeling tourist flow at the microscale is more complicatedLew and McKercher [17] noted that it is a challenge tobalance model effectiveness and usability +e reason is thatsimple patterns may not provide enough details for use andcomplex patterns may be difficult to interpret and apply Inthis study we used the video devices installed at key nodes ofroad network to collect tourist traffic flows and used thefrequent subgraph mining algorithm to discover flow pat-terns at the microscale +e extracted patterns can beconverted into rules and applied to the traffic control systemfor the management of regional tourists +e difficulty offinding and applying patterns at microscale could beovercome by this way

During the peak period of tourism the number of cartourists in the scenic area increases sharply It is easy toresult in road congestion and uneven distribution oftourists between attractions With the use of traffic flowdata the daily monthly and seasonal characteristics oftourists can be analyzed the future tourist flow can bepredicted and the flow patterns can be obtained by datamining methods +us the traffic management departmentcan effectively control the tourist traffic flow and thetourism department can develop attractive tourism prod-ucts to achieve a spatial balance of the distribution oftourists reduce traffic congestion and harmful gas emis-sions and weaken the impact of tourism environment andhuman body

Inside a scenic area attractions form a complex net-work of tourist flow due to the frequent spatial interactionEach attraction is both the origin and destination of cartourists Due to the differences in attractiveness degree ofdevelopment and convenience of transportation touristshave shown special preferences when choosing attractionsand trip routes Accordingly effective identification ofthese preferences will be beneficial to the development oftourist market and will also helpful for tour planners tounderstand how tourists see the spatial connection ofmultiple attractions However traditional manual surveysare time-consuming and laborious and the amount of dataobtained is small It is difficult to reveal tourist preferencesby this way Frequent subgraph mining algorithms providea desirable method for identifying the spatial preferences oftourists +is kind of method could perform well under thesupport of a large amount of movement data betweenattractions

Journal of Advanced Transportation 13

+is study has several limitations Firstly due to lack ofpower supply facilities it was unable to collect the traffic datafor each day When applying the model to the actual controlof tourist traffic flow it is necessary to further co-operatewith the traffic management department to obtain com-prehensive traffic flow data Secondly this study focused onthe mining of flow patterns the influencing factors behindthe identified patterns were not further analyzed As men-tioned by Lew andMcKercher [17] factors related to touristsand destinations can affect how tourists move or travel in adestination +ese factors include family composition in-come and valid information obtained before travelling It isdifficult to obtain these factors by relying only on the trafficflow data collected in the traffic monitoring system+erefore in future work field investigation will benecessary

Data Availability

As the data also form part of an ongoing study the raw dataneeded to reproduce these findings cannot be shared at thistime

Conflicts of Interest

+e authors declare no conflicts of interest

Acknowledgments

+e authors gratefully acknowledge the support of this re-search from the National Science Foundation of China(41701167) and the Basic Research Project of Shenzhen City(JCYJ20170307164104491 and JCYJ20190812171419161)

References

[1] D W Eby and L J Molnar ldquoImportance of scenic byways inroute choice a survey of driving tourists in the United StatesrdquoTransportation Research Part A (Policy and Practice) vol 36no 2 pp 95ndash106 2003

[2] B Wang Z Yang F Han and H Shi ldquoCar tourism inxinjiang the mediation effect of perceived value and touristsatisfaction on the relationship between destination imageand loyaltyrdquo Sustainability vol 9 no 1 pp 1ndash16 2016

[3] X M Zhang and Q Y Pen ldquoA study on tourism informationand self-driving tour trafficrdquo in Proceedings of the First Inter-national Conference on Transportation Engineering SouthwestJiaotong University Chengdu China July 2007

[4] China National Tourism Administration 7e Yearbook ofChina Tourism Statistics 2016 China Travel amp Tourism PressBeijing China 2017

[5] M A Mohamad Toha and H N Ismail ldquoA heritage tourismand tourist flow pattern a perspective on traditional versusmodern technologies in tracking the touristsrdquo InternationalJournal of Built Environment and Sustainability vol 2 no 2pp 85ndash92 2015

[6] K Siła-Nowicka and A S Fotheringham ldquoCalibrating spatialinteraction models from gps tracking data an example ofretail behaviourrdquo Computers Environment and Urban Sys-tems vol 74 pp 136ndash150 2019

[7] F Mao M Ji and T Liu ldquoMining spatiotemporal patterns ofurban dwellers from taxi trajectory datardquo Frontiers of EarthScience vol 10 no 2 pp 205ndash221 2016

[8] L J Zheng D Xia X Zhao and W L Liu ldquoMining tripattractive areas using large-scale taxi trajectory datardquo inProceedings of the 2017 IEEE International Symposium onParallel and Distributed Processing with Applications and 2017IEEE International Conference on Ubiquitous Computing andCommunications pp 1217ndash1222 ISPAIUCC GuangzhouChina December 2017

[9] I Rami and M O Shafiq ldquoDetecting taxi movements usingrandom swap clustering and sequential pattern miningrdquoJournal of Big Data vol 6 no 39 pp 1ndash26 2019

[10] G Kautsar and S Akbar ldquoTrajectory pattern mining using se-quential pattern mining and k-means for predicting future lo-cationrdquo Journal of Physics Conference Series vol 801 no 1 2017

[11] D Guo X Zhu H Jin P Gao and C Andris ldquoDiscoveringspatial patterns in origin-destination mobility datardquo Trans-actions in GIS vol 16 no 3 pp 411ndash429 2012

[12] J D Mazimpaka and S Timpf ldquoA visual and computationalanalysis approach for exploring significant locations and timeperiods along a bus routerdquo in Proceedings of the 9th ACMSIGSPATIAL International Workshop on ComputationalTransportation Science pp 43ndash48 San Francisco CA USAOctober 2016

[13] H Faroqi M Mesbah and J Kim ldquoSpatial-temporal simi-larity correlation between public transit passengers usingsmart card datardquo Journal of Advanced Transportationvol 2017 Article ID 1318945 14 pages 2017

[14] Z Wang Y Hu P Zhu Y Qin and L Jia ldquoRing aggregationpattern of metro passenger trips a study using smart carddatardquo Physica A Statistical Mechanics and Its Applicationsvol 491 pp 471ndash479 2018

[15] X Zhao X Lu Y Liu J Lin and J An ldquoTourist movementpatterns understanding from the perspective of travel partysize using mobile tracking data a case study of Xirsquoan ChinardquoTourism Management vol 69 pp 368ndash383 2018

[16] M Versichele L De Groote M Claeys Bouuaert T NeutensI Moerman and N Van De Weghe ldquoPattern mining intourist attraction visits through association rule learning onbluetooth tracking data a case study of ghent BelgiumrdquoTourism Management vol 44 pp 67ndash81 2014

[17] A Lew and B Mckercher ldquoModeling tourist movementsrdquoAnnals of Tourism Research vol 33 no 2 pp 403ndash423 2006

[18] A Birenboim S Anton-Clave A P Russo and N ShovalldquoTemporal activity patterns of theme park visitorsrdquo TourismGeographies vol 15 no 4 pp 601ndash619 2013

[19] H J Yun and M H Park ldquoTime-space movement of festivalvisitors in rural areas using a smart phone applicationrdquo AsiaPacific Journal of Tourism Research vol 20 no 11pp 1246ndash1265 2015

[20] S Kang ldquoAssociations between space-time constraints andspatial patterns of travelsrdquo Annals of Tourism Researchvol 61 pp 127ndash141 2016

[21] C Comito D Falcone and D Talia ldquoMining human mobilitypatterns from social geo-tagged datardquo Pervasive and MobileComputing vol 33 pp 91ndash107 2016

[22] Y T Zheng Y Li Z J Zha and T S Chua ldquoMining travelpatterns fromGPS-tagged photosrdquo in Lecture Notes in ComputerScience K T LeeWH Tsai H YM Liao T Chen JWHsiehand C C Tseng Eds Springer Berlin Germany 2011

[23] K M Borgwardt H P Kriegel and P WackersreutherldquoPattern mining in frequent dynamic subgraphsrdquo in Pro-ceedings of the 6th IEEE International Conference on Data

14 Journal of Advanced Transportation

Mining (ICDM 2006) Hong Kong China IEEE December2006

[24] D J Cook and L B HolderMining Graph Data Wiley PressHoboken NJ USA 2015

[25] C Jiang F Coenen andM Zito ldquoFrequent sub-graph miningon edge weighted graphsrdquo in Lecture Notes in ComputerScience T Bach Pedersen M K Mohania and A M TjoaEds Springer Berlin Germany pp 77ndash88 2010

[26] X Yan and J Han ldquogSpan graph-based substructure patternminingrdquo in Proceedings of the 2002 IEEE InternationalConference on Data Mining ICDMrsquo02 pp 721ndash724 IEEEComputer Society Maebashi City Japan December 2002

[27] C Borgelt and M R Berthold ldquoMining molecular fragmentsfinding relevant substructures of moleculesrdquo in Proceedings ofthe 2002 IEEE International Conference on Data MiningICDMrsquo02 pp 51ndash58 IEEE Computer Society Maebashi CityJapan December 2002

[28] J Huan W Wang and J Prins ldquoEfficient mining of frequentsubgraphs in the presence of isomorphismrdquo in Proceedings ofthe 7ird IEEE International Conference on Data MiningICDMrsquo03 pp 549ndash552 IEEE Computer Society MelbourneFL USA November 2003

[29] S Nijssen ldquoA quickstart in frequent structure mining canmake a differencerdquo in Proceedings of the Tenth ACM SIGKDDInternational Conference on Knowledge Discovery and DataMining KDDrsquo04 pp 647ndash652 ACM Seattle WA USAAugust 2004

[30] J Huan W Wang J Prins and J Yang ldquoSPIN miningmaximal frequent subgraphs from graph databasesrdquo in Pro-ceedings of the Tenth ACM SIGKDD International Conferenceon Knowledge Discovery and Data Mining Seattle WA USAACM August 2004

[31] M Worlein T Meinl I Fischer and M ampPhilippsen ldquoAquantitative comparison of the subgraph miners MoFagSpan FFSM and gastonrdquo in Proceedings of the 9th EuropeanConference on Principles and Practice of Knowledge Discoveryin Databases PKDDrsquo05 Springer Porto Portugal pp 392ndash403 2005

[32] X Yan H Cheng J Han and P ampYu ldquoMining significantgraph patterns by leap searchrdquo in Proceedings of the 2008ACM SIGMOD pp 433ndash444 Vancouver Canada June 2008

[33] L T +omas S R Valluri and K Karlapalem ldquoMarginmaximal frequent subgraph miningrdquo in Proceedings of theSixth International Conference on Data Mining (ICDMrsquo06)Hong Kong China December 2006

[34] V Ingalalli D Ienco and P Poncelet ldquoMining frequentsubgraphs in multigraphsrdquo Information Sciences vol 451-452pp 50ndash66 2018

[35] L Min Z C Wang J Liang and X R Yuan ldquoOD-wheelvisual design to explore OD patterns of a central regionrdquo inProceedings of the Visualization Symposium 2015 HangzhouChina April 2015

[36] B Jenny D M Stephen I Muehlenhaus and B E MarstonldquoDesign principles for origin-destination flowmapsrdquo Car-tography and Geographic Information Science vol 45 no 1pp 1ndash15 2016

[37] S Kim S Jeong I Woo Y Jang R Maciejewski andD S Ebert ldquoData flow analysis and visualization for spa-tiotemporal statistical data without trajectory informationrdquoIEEE Transactions on Visualization and Computer Graphicsvol 24 no 3 pp 1287ndash1300 2018

[38] J Wood J Dykes and A Slingsby ldquoVisualisation of originsdestinations and flows with OD mapsrdquo 7e CartographicJournal vol 47 no 2 pp 117ndash129 2010

[39] M Zillinger ldquoTourist routes a time-geographical approach onGerman car-tourists in Swedenrdquo Tourism Geographies vol 9no 1 pp 64ndash83 2007

[40] S Page and J Connell ldquoExploring the spatial patterns of car-based tourist travel in loch lomond and trossachs nationalpark Scotlandrdquo Tourism Management vol 29 no 3pp 561ndash580 2008

[41] A Inokuchi T Washio and H ampMotoda ldquoAn apriori-basedalgorithm for mining frequent substructures from graphdatardquo in Proceedings of the 4th European Conference onPrinciples of Data Mining and Knowledge DiscoveryPKDDrsquo00 pp 13ndash23 Lyon France September 2000

[42] M Kuramochi and G Karypis ldquoFrequent subgraph discov-eryrdquo in Proceedings of the 2001 IEEE International Conferenceon Data Mining ICDMrsquo01 pp 313ndash320 IEEE ComputerSociety San Jose CA USA August 2001

[43] M Oppermann ldquoA model of travel itinerariesrdquo Journal ofTravel Research vol 33 no 4 pp 57ndash61 1995

[44] N Shoval and A Raveh ldquoCategorization of tourist attractionsand the modeling of tourist cities based on the co-plotmethod of multivariate analysisrdquo Tourism Managementvol 25 no 6 pp 741ndash750 2004

[45] G Lau and B McKercher ldquoUnderstanding tourist movementpatterns in a destination a gis approachrdquo Tourism andHospitality Research vol 9 no 1 pp 39ndash49 2007

[46] B McKercher and G Lau ldquoMovement patterns of touristswithin a destinationrdquo Tourism Geographies vol 10 no 3pp 355ndash374 2008

Journal of Advanced Transportation 15

Page 9: DiscoveringtheGraph-BasedFlowPatternsofCarTouristsUsing ...downloads.hindawi.com/journals/jat/2020/4795830.pdfResearchArticle DiscoveringtheGraph-BasedFlowPatternsofCarTouristsUsing

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 62

C1

(a)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 62

C1

(b)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 61

C1

(c)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 57

C2

C1

(d)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

S = 56

C1

C2

(e)

Figure 9 Flow patterns inferred from intercity tourists

Journal of Advanced Transportation 9

the failure of device communication or power the con-structed movements graphs may be incomplete whichwould lead to the loss of frequent subgraphs +ereforewe selected 76 days of valid data as the dataset for frequentpattern mining Figure 6(b) shows that (1) the touristflows at A2 and A10 near the sandy beaches has similartemporal characteristics +eir peak hours of touristvolume occur both on holidays and weekends while onweekdays the flow curves are relatively stable (2) P1 andP2 are located in two parking lots Although there aredifferences in the tourist volumes the trends are similar(3) For the two entrances to the tourist attractions theaverage daily volume of tourist traffic at A1 is 07 timesthat of A5 After sorting the volume of tourist trafficPengcheng Community has the most car tourists followedby two tourist attractions with a sandy beach (ieDongchong and Xichong Community) and the leastvisited attraction is Nanao Community

612 Spatial Characteristics Figure 7 shows the flowmap ofthe aggregated tourists transferring in multiple monitoringpoint pairs +e tourist volumes are depicted and sorted inthe left bar chart in the figure and the link thickness rep-resents the traffic volume As can be seen the links with thelargest traffic volume are A5P1 A5P2 P2P1 A10A2A1P1 and A1P2 (A5P1 is a simplified form for A5harrP1which represents the forth and back tourist flow between A5and P1) +ese links could be divided into two areasC1(A1 A5 P1 P2) and C2(A2 A10) Further inspectionreveals that the number of tourists transferring between P1and P2 in area C1 is close to the value between A2 and A10in area C2 but the number of tourists visiting P1 and P2 is312 times that of A2 and A10

62 Analyses of the Detected Flow Patterns +e flow map isintuitive but it suffers from serious visual clutter and it isdifficult to read because of overlapping flows It can also beseen in Figure 7 that the flow map only shows the trafficvolumes between the monitoring points in the touristdestination But it could not express the spatial transferdirections of car tourists +erefore these facts motivate usto find a new approach to solve these problems +is sectiondescribes the use of frequent subgraph mining algorithm toexplore the spatial flow patterns with directions in touristtraffic and to obtain the maximum frequent patterns fromthe daily tourist movement graphs according to a predefinedminimum support threshold By using the identificationmethod of car tourists introduced in Section 5 two types of

data were extracted and used for the subsequent flow patternmining (listed in Table 2)

621 Flow Patterns of Intracity Tourists in Shenzhen+e experiments were conducted with the dataset of intracitycar tourists in Shenzhen (as shown in Table 2) +e mini-mum support threshold for gSpan was set to 73 +e spatialflow patterns are represented in Figure 8 and listed in Ta-ble 3+e inferred patterns could be divided into two groupsOne group consists of the patterns shown in Figures 8(a) and8(b) which show the tourist spatial transfer process in areaC1 +e other group contains the patterns shownFigures 8(c) and 8(d) which represent the tourists trans-ferring between area C1 and area C2 +e two groupsdemonstrate that the intracity car tourists who arrived at P1and P2 preferred to choose Kuinan Road at A1 instead ofPengfei Road at A5 +e difference between Figures 8(a) and8(b) is the existence of the circle tourist flow +e differencebetween Figures 8(c) and 8(d) is the presence of tourists flowback and forth+e tourist flow from A1 to A2 have only onedirection One of the reasons may be that this study failed tofind a suitable monitoring point on Pingxi Road resulting inthe loss of directionality for this part of tourist flow

622 Flow Patterns of Intercity Car Tourists Figure 9 showsflow patterns of intercity car tourists+e dataset used in thissection is composed of intercity tourists (as shown in Ta-ble 2) +e minimum support threshold for gSpan was set to56 +e result flow patterns are sorted by the supportthreshold as listed in Table 4

From these patterns the following could be con-cluded (1) +e most frequent patterns are shown inFigures 9(a)ndash9(c) +e frequencies of the discovered flowpatterns are 62 62 and 61 respectively +ese threepatterns describe the preference of intercity tourists forarea C1+is kind of tourists first visited one of attractionsnear a parking lot P1 or P2 and then visited anotherattraction or just visited one tourist attraction near P1 orP2 (2)+e above three patterns are different from those ofthe intracity tourists in Shenzhen +ere is no circle tourbetween P1 P2 and A1 +e reason for this is that sometourists chose to continue drive to area C2 (3) +eminimum support threshold of these two patterns asshown in Figures 9(d) and 9(e) is significantly smallerthan that shown in Figures 9(a)ndash9(e) but it reflects thespatial transfer preferences of tourists from neighboringcities in areas C1 and C2 In Figure 9(d) the car touriststended to drive directly from A5 to A10 after visiting

Table 4 +e list of the flow patterns inferred from intercity tourists

Figures PatternsFigure 9(a) A1⟶ P2⟶ P1Figure 9(b) A1hArrP1hArrP2Figure 9(c) A1⟶ P1 andA1⟶ P2Figure 9(d) A5⟶ P1⟶ A10Figure 9(e) A10⟶ P1hArrP2 andA10hArrA2

10 Journal of Advanced Transportation

attractions near P1 As shown in Figure 9(e) the cartourists that flowed between A10 and A2 went back to P1and changed car parking lots between P1 and P2

However there is no tourist flow to A1 and A5 +e reasonfor this may be that the traffic flow of Pingxi Road (theexpressway in and out of the peninsula) was not

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(a)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(b)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(c)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(d)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(e)

Figure 10 Flow patterns of all tourists

Journal of Advanced Transportation 11

monitored +is part of traffic flow could directly arriveand then leave from A10

623 Flow Patterns of All Tourists +is section presents anexploration of the spatial flow patterns of all car tourists +eresults are shown in Figure 10 +e minimum supportthreshold for gSpan was set to 73 which means that 73 of the76 graphs contained the discovered flow pattern +e resultpatterns are listed in Table 5

All of the patterns shown in Figure 10 have touristflow between area C1 and area C2 +is is consistent withthe trend in the flow map (shown in Figure 7) As thepicture shows A1 is the main entrance for tourists toenter and exit Dapeng Island Some of the car touristsdrove to P1 or P2 and the rest flowed to A10 or A2Figures 10(a)ndash10(c) show the directions of tourist flowsbetween the Dapeng Community Pengcheng Commu-nity and Xichong Community Figure 10(d) shows thedirections of tourist flows between Dapeng CommunityPengcheng Community and Dongchong CommunityFigure 10(e) represents only the directions of tourist flowsbetween Dapeng Community and Dongchong Commu-nity Furthermore the directions of tourist flows in areaC1 and area C2 or between area C1 and area C2 aredifferent Taking Figures 10(a)ndash10(c) as examples al-though the orders of P1 and P2 that are accessed from A1have a lack of regularity they have their own charac-teristics when considering the accompanying paths toA10 +ese examples indicate that if there is a tourist flowbetween A1 and A10 it could be divided into two cases Inone case some of the tourists returned directly and in theother case some of the tourists flowed to P1 In the firstcase either the tourist flow passing by A1 first accessed P2and then visited P1 or both P1 and P2 had tourists at thesame time In the second case some of the touristsreturned from A10 visited P1 and P2 and then left theisland

7 Conclusions and Discussions

71 Conclusions In order to facilitate the management oftourist traffic flow the car tourists in Dapeng Island weretaken as a research case +e experimental analysis used realdata captured by video devices in the research area Due to thelack of suitable device installation locations the capturedpicture from the video device had a certain distance from theroad+e catch rate of the touristsrsquo cars was low However thedetailed time-series data in one day could be collectedCompared with the manual survey the collected data was

improved in terms of reliability and richness A day waschosen as the time unit for frequent pattern mining Afterselecting the available data we divided the data of cars intointracity and intercity tourists Next the license plate datawere transformed into movement graphs according to thevisited location sequence between multiple monitoringpoints +e intricate flow patterns of the car tourists werediscovered by gSpan algorithm which had the best perfor-mance in terms of the quality of the results and the executiontime and which had already proven to be efficient for frequentsubgraph mining +e conclusions are as follows

(1) +e car tourists had obvious preferences in the se-lection of trip time and tourist attractions (shown inFigures 6 and 7) In terms of time the curves oftourist flow at each monitoring point were similar+ere were a large number of car tourists at variousattractions on holidays but the volume of car touristswas relatively lower on weekdays +e same types ofattractions had the similar trends in tourist flowsuch as the two attractions with a sandy beach(Dongchong Community and Xichong Community)and the ancient city and cultural attractions (DapengAncient City and Dongshan Temple) In terms ofspace the attractions that are close to the entrance ofa scenic area and rich in tourist resources were morepopular with tourists However due to the terrainbarrier in the scenic area the traffic conditions af-fected the movements of tourists between multipleattractions

(2) Different types of car tourists had similar spatialchoices in scenic area (shown in Figures 8ndash10) Forexample different types of tourists had flow patternsthat described the movements in one area (as shownin Figures 8(a) 8(b) and 9(a)ndash9(c)) and the move-ments between different areas (as shown inFigures 8(c) 8(d) 9(d) and 9(e)) +e intercitytourists and intracity tourists had different choices inscenic spots +e intercity tourists would takemultidestination trips instead of single destinationtrips in the same type of attractions As can be seen inFigures 8 and 9 there are the two sandy beach at-tractions in areaC2 the intracity tourists tend to visitone of them while the intercity tourists would visitboth attractions Specifically to save time andmoney intercity tourists would visit multiple at-tractions in one trip instead of making multiple tripsAdditionally another difference was that there wasno circlet in area C1 for intercity tourists and notraffic flow between A2 and A10 for intracity tourists

Table 5 +e list of flow patterns inferred from all tourists

Figures PatternsFigure 10(a) (A1⟶ A10⟶ A1) and (A10⟶ P1hArrP2) and (P1⟶ A1)

Figure 10(b) (A1hArrA10 andA1⟶ P1 andA1⟶ P2)

Figure 10(c) (A1hArrP10 andA1⟶ P2⟶ P1⟶ A1)

Figure 10(d) (A1hArrP1hArrP2 andA1hArrA10hArrA2)

Figure 10(e) (A1hArrP1hArrP2 andA1⟶ A2)

12 Journal of Advanced Transportation

(3) Although the patterns depicted on the map lookcomplicated and messy after the patterns are con-verted to rules they become clear In pattern mapsonly Figures 8(b) 9(a) and 9(d) show a clear uni-directionality +e rest is complex and is difficult tocompare As we can see from Figure 8(a) theintracity tourists passing point A1 could be dividedinto two groups one group flowed to parking lot P1and then leaved the scenic area from P1 +e othergroup flowed to parking lot P2 Two groups havesimultaneity in tourist routes So we could convertthe flow patterns to rules and use ldquoandrdquo to illustratethe simultaneity of different routes in the samepattern (as shown in Tables 3ndash5) By this way allpatterns can be applied to traffic control system forregional tourism

(4) Large primary attractions are more attractive thansmaller secondary attractions Looking from thearrow on the pattern maps tourists always visit areaC1 first and then select area C2 +e main reason isthat area C1 has abundant tourism resources anddiverse tourism activities For example there arecultural attractions (eg Dapeng Ancient City andDongshan Temple) sandy beach entertainmentsand large parking lots for tourists in area C1 +esefactors are also often considered in the evaluation ofthe importance of attractions in tourism network

(5) +e tourists tended to park their cars in an easy-to-access place even if the visited attractions arechanged as shown in Figures 8(a)ndash8(d) 9(b) and9(e) Again here we take area C1 as an example +edistance between the Dapeng Ancient City andDongshan Temple is about 1 km On the patternmaps we can see that there are two-way arrowspointing to the two parking lots P1 and P2 +isindicates that the car tourists tended to park theircars to the nearest parking lot so that they could pickthem up when the tour destination is changed

72 Discussions Tourist flow is the key to traffic man-agement in tourism destinations and it affects the de-velopment of tourism on an island and the experience oftourists +e recent development of transport technolo-gies has shown that traffic flow data will be increasinglycollected and it will be available for data analysis+erefore advanced data analytics should be used tointerpret and depict the complex movements of cartourists +e proposed approach in this study wasintended to find (i) the statistical summaries of the spatial-temporal characteristics of car tourists in the researcharea helping to discover patterns from the mass carlicense plate data (ii) the flow patterns of intercity andintracity tourists helping to illustrate the different pref-erences of the two types of car tourists and (iii) the mostfrequent patterns of all tourists helping to identify the lawof tourist movement and make efficient policy for themanagement of tourist traffic flow +e presented ap-proach enriched the analytical methodology of tourist

traffic flow and suggested a shift from the conventionaland complicated paper or computer interview-basedmethod to a dynamic flow graph-based method Fur-thermore we have shown how transportation data pro-vides hard-to-obtain insights and quantitative results fortourists

In order to illustrate the law of spatial movement oftourists related studies have proposed a variety of macroflow patterns [43ndash45] For example in 2008 McKercher [46]proposed 11 prominent route styles in urban destinations+e macropatterns retain only the main components andsimplify the details and are often used in tourism man-agement to guide destination development Compared withmodeling the flow patterns of tourists at the macrolevelmodeling tourist flow at the microscale is more complicatedLew and McKercher [17] noted that it is a challenge tobalance model effectiveness and usability +e reason is thatsimple patterns may not provide enough details for use andcomplex patterns may be difficult to interpret and apply Inthis study we used the video devices installed at key nodes ofroad network to collect tourist traffic flows and used thefrequent subgraph mining algorithm to discover flow pat-terns at the microscale +e extracted patterns can beconverted into rules and applied to the traffic control systemfor the management of regional tourists +e difficulty offinding and applying patterns at microscale could beovercome by this way

During the peak period of tourism the number of cartourists in the scenic area increases sharply It is easy toresult in road congestion and uneven distribution oftourists between attractions With the use of traffic flowdata the daily monthly and seasonal characteristics oftourists can be analyzed the future tourist flow can bepredicted and the flow patterns can be obtained by datamining methods +us the traffic management departmentcan effectively control the tourist traffic flow and thetourism department can develop attractive tourism prod-ucts to achieve a spatial balance of the distribution oftourists reduce traffic congestion and harmful gas emis-sions and weaken the impact of tourism environment andhuman body

Inside a scenic area attractions form a complex net-work of tourist flow due to the frequent spatial interactionEach attraction is both the origin and destination of cartourists Due to the differences in attractiveness degree ofdevelopment and convenience of transportation touristshave shown special preferences when choosing attractionsand trip routes Accordingly effective identification ofthese preferences will be beneficial to the development oftourist market and will also helpful for tour planners tounderstand how tourists see the spatial connection ofmultiple attractions However traditional manual surveysare time-consuming and laborious and the amount of dataobtained is small It is difficult to reveal tourist preferencesby this way Frequent subgraph mining algorithms providea desirable method for identifying the spatial preferences oftourists +is kind of method could perform well under thesupport of a large amount of movement data betweenattractions

Journal of Advanced Transportation 13

+is study has several limitations Firstly due to lack ofpower supply facilities it was unable to collect the traffic datafor each day When applying the model to the actual controlof tourist traffic flow it is necessary to further co-operatewith the traffic management department to obtain com-prehensive traffic flow data Secondly this study focused onthe mining of flow patterns the influencing factors behindthe identified patterns were not further analyzed As men-tioned by Lew andMcKercher [17] factors related to touristsand destinations can affect how tourists move or travel in adestination +ese factors include family composition in-come and valid information obtained before travelling It isdifficult to obtain these factors by relying only on the trafficflow data collected in the traffic monitoring system+erefore in future work field investigation will benecessary

Data Availability

As the data also form part of an ongoing study the raw dataneeded to reproduce these findings cannot be shared at thistime

Conflicts of Interest

+e authors declare no conflicts of interest

Acknowledgments

+e authors gratefully acknowledge the support of this re-search from the National Science Foundation of China(41701167) and the Basic Research Project of Shenzhen City(JCYJ20170307164104491 and JCYJ20190812171419161)

References

[1] D W Eby and L J Molnar ldquoImportance of scenic byways inroute choice a survey of driving tourists in the United StatesrdquoTransportation Research Part A (Policy and Practice) vol 36no 2 pp 95ndash106 2003

[2] B Wang Z Yang F Han and H Shi ldquoCar tourism inxinjiang the mediation effect of perceived value and touristsatisfaction on the relationship between destination imageand loyaltyrdquo Sustainability vol 9 no 1 pp 1ndash16 2016

[3] X M Zhang and Q Y Pen ldquoA study on tourism informationand self-driving tour trafficrdquo in Proceedings of the First Inter-national Conference on Transportation Engineering SouthwestJiaotong University Chengdu China July 2007

[4] China National Tourism Administration 7e Yearbook ofChina Tourism Statistics 2016 China Travel amp Tourism PressBeijing China 2017

[5] M A Mohamad Toha and H N Ismail ldquoA heritage tourismand tourist flow pattern a perspective on traditional versusmodern technologies in tracking the touristsrdquo InternationalJournal of Built Environment and Sustainability vol 2 no 2pp 85ndash92 2015

[6] K Siła-Nowicka and A S Fotheringham ldquoCalibrating spatialinteraction models from gps tracking data an example ofretail behaviourrdquo Computers Environment and Urban Sys-tems vol 74 pp 136ndash150 2019

[7] F Mao M Ji and T Liu ldquoMining spatiotemporal patterns ofurban dwellers from taxi trajectory datardquo Frontiers of EarthScience vol 10 no 2 pp 205ndash221 2016

[8] L J Zheng D Xia X Zhao and W L Liu ldquoMining tripattractive areas using large-scale taxi trajectory datardquo inProceedings of the 2017 IEEE International Symposium onParallel and Distributed Processing with Applications and 2017IEEE International Conference on Ubiquitous Computing andCommunications pp 1217ndash1222 ISPAIUCC GuangzhouChina December 2017

[9] I Rami and M O Shafiq ldquoDetecting taxi movements usingrandom swap clustering and sequential pattern miningrdquoJournal of Big Data vol 6 no 39 pp 1ndash26 2019

[10] G Kautsar and S Akbar ldquoTrajectory pattern mining using se-quential pattern mining and k-means for predicting future lo-cationrdquo Journal of Physics Conference Series vol 801 no 1 2017

[11] D Guo X Zhu H Jin P Gao and C Andris ldquoDiscoveringspatial patterns in origin-destination mobility datardquo Trans-actions in GIS vol 16 no 3 pp 411ndash429 2012

[12] J D Mazimpaka and S Timpf ldquoA visual and computationalanalysis approach for exploring significant locations and timeperiods along a bus routerdquo in Proceedings of the 9th ACMSIGSPATIAL International Workshop on ComputationalTransportation Science pp 43ndash48 San Francisco CA USAOctober 2016

[13] H Faroqi M Mesbah and J Kim ldquoSpatial-temporal simi-larity correlation between public transit passengers usingsmart card datardquo Journal of Advanced Transportationvol 2017 Article ID 1318945 14 pages 2017

[14] Z Wang Y Hu P Zhu Y Qin and L Jia ldquoRing aggregationpattern of metro passenger trips a study using smart carddatardquo Physica A Statistical Mechanics and Its Applicationsvol 491 pp 471ndash479 2018

[15] X Zhao X Lu Y Liu J Lin and J An ldquoTourist movementpatterns understanding from the perspective of travel partysize using mobile tracking data a case study of Xirsquoan ChinardquoTourism Management vol 69 pp 368ndash383 2018

[16] M Versichele L De Groote M Claeys Bouuaert T NeutensI Moerman and N Van De Weghe ldquoPattern mining intourist attraction visits through association rule learning onbluetooth tracking data a case study of ghent BelgiumrdquoTourism Management vol 44 pp 67ndash81 2014

[17] A Lew and B Mckercher ldquoModeling tourist movementsrdquoAnnals of Tourism Research vol 33 no 2 pp 403ndash423 2006

[18] A Birenboim S Anton-Clave A P Russo and N ShovalldquoTemporal activity patterns of theme park visitorsrdquo TourismGeographies vol 15 no 4 pp 601ndash619 2013

[19] H J Yun and M H Park ldquoTime-space movement of festivalvisitors in rural areas using a smart phone applicationrdquo AsiaPacific Journal of Tourism Research vol 20 no 11pp 1246ndash1265 2015

[20] S Kang ldquoAssociations between space-time constraints andspatial patterns of travelsrdquo Annals of Tourism Researchvol 61 pp 127ndash141 2016

[21] C Comito D Falcone and D Talia ldquoMining human mobilitypatterns from social geo-tagged datardquo Pervasive and MobileComputing vol 33 pp 91ndash107 2016

[22] Y T Zheng Y Li Z J Zha and T S Chua ldquoMining travelpatterns fromGPS-tagged photosrdquo in Lecture Notes in ComputerScience K T LeeWH Tsai H YM Liao T Chen JWHsiehand C C Tseng Eds Springer Berlin Germany 2011

[23] K M Borgwardt H P Kriegel and P WackersreutherldquoPattern mining in frequent dynamic subgraphsrdquo in Pro-ceedings of the 6th IEEE International Conference on Data

14 Journal of Advanced Transportation

Mining (ICDM 2006) Hong Kong China IEEE December2006

[24] D J Cook and L B HolderMining Graph Data Wiley PressHoboken NJ USA 2015

[25] C Jiang F Coenen andM Zito ldquoFrequent sub-graph miningon edge weighted graphsrdquo in Lecture Notes in ComputerScience T Bach Pedersen M K Mohania and A M TjoaEds Springer Berlin Germany pp 77ndash88 2010

[26] X Yan and J Han ldquogSpan graph-based substructure patternminingrdquo in Proceedings of the 2002 IEEE InternationalConference on Data Mining ICDMrsquo02 pp 721ndash724 IEEEComputer Society Maebashi City Japan December 2002

[27] C Borgelt and M R Berthold ldquoMining molecular fragmentsfinding relevant substructures of moleculesrdquo in Proceedings ofthe 2002 IEEE International Conference on Data MiningICDMrsquo02 pp 51ndash58 IEEE Computer Society Maebashi CityJapan December 2002

[28] J Huan W Wang and J Prins ldquoEfficient mining of frequentsubgraphs in the presence of isomorphismrdquo in Proceedings ofthe 7ird IEEE International Conference on Data MiningICDMrsquo03 pp 549ndash552 IEEE Computer Society MelbourneFL USA November 2003

[29] S Nijssen ldquoA quickstart in frequent structure mining canmake a differencerdquo in Proceedings of the Tenth ACM SIGKDDInternational Conference on Knowledge Discovery and DataMining KDDrsquo04 pp 647ndash652 ACM Seattle WA USAAugust 2004

[30] J Huan W Wang J Prins and J Yang ldquoSPIN miningmaximal frequent subgraphs from graph databasesrdquo in Pro-ceedings of the Tenth ACM SIGKDD International Conferenceon Knowledge Discovery and Data Mining Seattle WA USAACM August 2004

[31] M Worlein T Meinl I Fischer and M ampPhilippsen ldquoAquantitative comparison of the subgraph miners MoFagSpan FFSM and gastonrdquo in Proceedings of the 9th EuropeanConference on Principles and Practice of Knowledge Discoveryin Databases PKDDrsquo05 Springer Porto Portugal pp 392ndash403 2005

[32] X Yan H Cheng J Han and P ampYu ldquoMining significantgraph patterns by leap searchrdquo in Proceedings of the 2008ACM SIGMOD pp 433ndash444 Vancouver Canada June 2008

[33] L T +omas S R Valluri and K Karlapalem ldquoMarginmaximal frequent subgraph miningrdquo in Proceedings of theSixth International Conference on Data Mining (ICDMrsquo06)Hong Kong China December 2006

[34] V Ingalalli D Ienco and P Poncelet ldquoMining frequentsubgraphs in multigraphsrdquo Information Sciences vol 451-452pp 50ndash66 2018

[35] L Min Z C Wang J Liang and X R Yuan ldquoOD-wheelvisual design to explore OD patterns of a central regionrdquo inProceedings of the Visualization Symposium 2015 HangzhouChina April 2015

[36] B Jenny D M Stephen I Muehlenhaus and B E MarstonldquoDesign principles for origin-destination flowmapsrdquo Car-tography and Geographic Information Science vol 45 no 1pp 1ndash15 2016

[37] S Kim S Jeong I Woo Y Jang R Maciejewski andD S Ebert ldquoData flow analysis and visualization for spa-tiotemporal statistical data without trajectory informationrdquoIEEE Transactions on Visualization and Computer Graphicsvol 24 no 3 pp 1287ndash1300 2018

[38] J Wood J Dykes and A Slingsby ldquoVisualisation of originsdestinations and flows with OD mapsrdquo 7e CartographicJournal vol 47 no 2 pp 117ndash129 2010

[39] M Zillinger ldquoTourist routes a time-geographical approach onGerman car-tourists in Swedenrdquo Tourism Geographies vol 9no 1 pp 64ndash83 2007

[40] S Page and J Connell ldquoExploring the spatial patterns of car-based tourist travel in loch lomond and trossachs nationalpark Scotlandrdquo Tourism Management vol 29 no 3pp 561ndash580 2008

[41] A Inokuchi T Washio and H ampMotoda ldquoAn apriori-basedalgorithm for mining frequent substructures from graphdatardquo in Proceedings of the 4th European Conference onPrinciples of Data Mining and Knowledge DiscoveryPKDDrsquo00 pp 13ndash23 Lyon France September 2000

[42] M Kuramochi and G Karypis ldquoFrequent subgraph discov-eryrdquo in Proceedings of the 2001 IEEE International Conferenceon Data Mining ICDMrsquo01 pp 313ndash320 IEEE ComputerSociety San Jose CA USA August 2001

[43] M Oppermann ldquoA model of travel itinerariesrdquo Journal ofTravel Research vol 33 no 4 pp 57ndash61 1995

[44] N Shoval and A Raveh ldquoCategorization of tourist attractionsand the modeling of tourist cities based on the co-plotmethod of multivariate analysisrdquo Tourism Managementvol 25 no 6 pp 741ndash750 2004

[45] G Lau and B McKercher ldquoUnderstanding tourist movementpatterns in a destination a gis approachrdquo Tourism andHospitality Research vol 9 no 1 pp 39ndash49 2007

[46] B McKercher and G Lau ldquoMovement patterns of touristswithin a destinationrdquo Tourism Geographies vol 10 no 3pp 355ndash374 2008

Journal of Advanced Transportation 15

Page 10: DiscoveringtheGraph-BasedFlowPatternsofCarTouristsUsing ...downloads.hindawi.com/journals/jat/2020/4795830.pdfResearchArticle DiscoveringtheGraph-BasedFlowPatternsofCarTouristsUsing

the failure of device communication or power the con-structed movements graphs may be incomplete whichwould lead to the loss of frequent subgraphs +ereforewe selected 76 days of valid data as the dataset for frequentpattern mining Figure 6(b) shows that (1) the touristflows at A2 and A10 near the sandy beaches has similartemporal characteristics +eir peak hours of touristvolume occur both on holidays and weekends while onweekdays the flow curves are relatively stable (2) P1 andP2 are located in two parking lots Although there aredifferences in the tourist volumes the trends are similar(3) For the two entrances to the tourist attractions theaverage daily volume of tourist traffic at A1 is 07 timesthat of A5 After sorting the volume of tourist trafficPengcheng Community has the most car tourists followedby two tourist attractions with a sandy beach (ieDongchong and Xichong Community) and the leastvisited attraction is Nanao Community

612 Spatial Characteristics Figure 7 shows the flowmap ofthe aggregated tourists transferring in multiple monitoringpoint pairs +e tourist volumes are depicted and sorted inthe left bar chart in the figure and the link thickness rep-resents the traffic volume As can be seen the links with thelargest traffic volume are A5P1 A5P2 P2P1 A10A2A1P1 and A1P2 (A5P1 is a simplified form for A5harrP1which represents the forth and back tourist flow between A5and P1) +ese links could be divided into two areasC1(A1 A5 P1 P2) and C2(A2 A10) Further inspectionreveals that the number of tourists transferring between P1and P2 in area C1 is close to the value between A2 and A10in area C2 but the number of tourists visiting P1 and P2 is312 times that of A2 and A10

62 Analyses of the Detected Flow Patterns +e flow map isintuitive but it suffers from serious visual clutter and it isdifficult to read because of overlapping flows It can also beseen in Figure 7 that the flow map only shows the trafficvolumes between the monitoring points in the touristdestination But it could not express the spatial transferdirections of car tourists +erefore these facts motivate usto find a new approach to solve these problems +is sectiondescribes the use of frequent subgraph mining algorithm toexplore the spatial flow patterns with directions in touristtraffic and to obtain the maximum frequent patterns fromthe daily tourist movement graphs according to a predefinedminimum support threshold By using the identificationmethod of car tourists introduced in Section 5 two types of

data were extracted and used for the subsequent flow patternmining (listed in Table 2)

621 Flow Patterns of Intracity Tourists in Shenzhen+e experiments were conducted with the dataset of intracitycar tourists in Shenzhen (as shown in Table 2) +e mini-mum support threshold for gSpan was set to 73 +e spatialflow patterns are represented in Figure 8 and listed in Ta-ble 3+e inferred patterns could be divided into two groupsOne group consists of the patterns shown in Figures 8(a) and8(b) which show the tourist spatial transfer process in areaC1 +e other group contains the patterns shownFigures 8(c) and 8(d) which represent the tourists trans-ferring between area C1 and area C2 +e two groupsdemonstrate that the intracity car tourists who arrived at P1and P2 preferred to choose Kuinan Road at A1 instead ofPengfei Road at A5 +e difference between Figures 8(a) and8(b) is the existence of the circle tourist flow +e differencebetween Figures 8(c) and 8(d) is the presence of tourists flowback and forth+e tourist flow from A1 to A2 have only onedirection One of the reasons may be that this study failed tofind a suitable monitoring point on Pingxi Road resulting inthe loss of directionality for this part of tourist flow

622 Flow Patterns of Intercity Car Tourists Figure 9 showsflow patterns of intercity car tourists+e dataset used in thissection is composed of intercity tourists (as shown in Ta-ble 2) +e minimum support threshold for gSpan was set to56 +e result flow patterns are sorted by the supportthreshold as listed in Table 4

From these patterns the following could be con-cluded (1) +e most frequent patterns are shown inFigures 9(a)ndash9(c) +e frequencies of the discovered flowpatterns are 62 62 and 61 respectively +ese threepatterns describe the preference of intercity tourists forarea C1+is kind of tourists first visited one of attractionsnear a parking lot P1 or P2 and then visited anotherattraction or just visited one tourist attraction near P1 orP2 (2)+e above three patterns are different from those ofthe intracity tourists in Shenzhen +ere is no circle tourbetween P1 P2 and A1 +e reason for this is that sometourists chose to continue drive to area C2 (3) +eminimum support threshold of these two patterns asshown in Figures 9(d) and 9(e) is significantly smallerthan that shown in Figures 9(a)ndash9(e) but it reflects thespatial transfer preferences of tourists from neighboringcities in areas C1 and C2 In Figure 9(d) the car touriststended to drive directly from A5 to A10 after visiting

Table 4 +e list of the flow patterns inferred from intercity tourists

Figures PatternsFigure 9(a) A1⟶ P2⟶ P1Figure 9(b) A1hArrP1hArrP2Figure 9(c) A1⟶ P1 andA1⟶ P2Figure 9(d) A5⟶ P1⟶ A10Figure 9(e) A10⟶ P1hArrP2 andA10hArrA2

10 Journal of Advanced Transportation

attractions near P1 As shown in Figure 9(e) the cartourists that flowed between A10 and A2 went back to P1and changed car parking lots between P1 and P2

However there is no tourist flow to A1 and A5 +e reasonfor this may be that the traffic flow of Pingxi Road (theexpressway in and out of the peninsula) was not

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(a)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(b)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(c)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(d)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(e)

Figure 10 Flow patterns of all tourists

Journal of Advanced Transportation 11

monitored +is part of traffic flow could directly arriveand then leave from A10

623 Flow Patterns of All Tourists +is section presents anexploration of the spatial flow patterns of all car tourists +eresults are shown in Figure 10 +e minimum supportthreshold for gSpan was set to 73 which means that 73 of the76 graphs contained the discovered flow pattern +e resultpatterns are listed in Table 5

All of the patterns shown in Figure 10 have touristflow between area C1 and area C2 +is is consistent withthe trend in the flow map (shown in Figure 7) As thepicture shows A1 is the main entrance for tourists toenter and exit Dapeng Island Some of the car touristsdrove to P1 or P2 and the rest flowed to A10 or A2Figures 10(a)ndash10(c) show the directions of tourist flowsbetween the Dapeng Community Pengcheng Commu-nity and Xichong Community Figure 10(d) shows thedirections of tourist flows between Dapeng CommunityPengcheng Community and Dongchong CommunityFigure 10(e) represents only the directions of tourist flowsbetween Dapeng Community and Dongchong Commu-nity Furthermore the directions of tourist flows in areaC1 and area C2 or between area C1 and area C2 aredifferent Taking Figures 10(a)ndash10(c) as examples al-though the orders of P1 and P2 that are accessed from A1have a lack of regularity they have their own charac-teristics when considering the accompanying paths toA10 +ese examples indicate that if there is a tourist flowbetween A1 and A10 it could be divided into two cases Inone case some of the tourists returned directly and in theother case some of the tourists flowed to P1 In the firstcase either the tourist flow passing by A1 first accessed P2and then visited P1 or both P1 and P2 had tourists at thesame time In the second case some of the touristsreturned from A10 visited P1 and P2 and then left theisland

7 Conclusions and Discussions

71 Conclusions In order to facilitate the management oftourist traffic flow the car tourists in Dapeng Island weretaken as a research case +e experimental analysis used realdata captured by video devices in the research area Due to thelack of suitable device installation locations the capturedpicture from the video device had a certain distance from theroad+e catch rate of the touristsrsquo cars was low However thedetailed time-series data in one day could be collectedCompared with the manual survey the collected data was

improved in terms of reliability and richness A day waschosen as the time unit for frequent pattern mining Afterselecting the available data we divided the data of cars intointracity and intercity tourists Next the license plate datawere transformed into movement graphs according to thevisited location sequence between multiple monitoringpoints +e intricate flow patterns of the car tourists werediscovered by gSpan algorithm which had the best perfor-mance in terms of the quality of the results and the executiontime and which had already proven to be efficient for frequentsubgraph mining +e conclusions are as follows

(1) +e car tourists had obvious preferences in the se-lection of trip time and tourist attractions (shown inFigures 6 and 7) In terms of time the curves oftourist flow at each monitoring point were similar+ere were a large number of car tourists at variousattractions on holidays but the volume of car touristswas relatively lower on weekdays +e same types ofattractions had the similar trends in tourist flowsuch as the two attractions with a sandy beach(Dongchong Community and Xichong Community)and the ancient city and cultural attractions (DapengAncient City and Dongshan Temple) In terms ofspace the attractions that are close to the entrance ofa scenic area and rich in tourist resources were morepopular with tourists However due to the terrainbarrier in the scenic area the traffic conditions af-fected the movements of tourists between multipleattractions

(2) Different types of car tourists had similar spatialchoices in scenic area (shown in Figures 8ndash10) Forexample different types of tourists had flow patternsthat described the movements in one area (as shownin Figures 8(a) 8(b) and 9(a)ndash9(c)) and the move-ments between different areas (as shown inFigures 8(c) 8(d) 9(d) and 9(e)) +e intercitytourists and intracity tourists had different choices inscenic spots +e intercity tourists would takemultidestination trips instead of single destinationtrips in the same type of attractions As can be seen inFigures 8 and 9 there are the two sandy beach at-tractions in areaC2 the intracity tourists tend to visitone of them while the intercity tourists would visitboth attractions Specifically to save time andmoney intercity tourists would visit multiple at-tractions in one trip instead of making multiple tripsAdditionally another difference was that there wasno circlet in area C1 for intercity tourists and notraffic flow between A2 and A10 for intracity tourists

Table 5 +e list of flow patterns inferred from all tourists

Figures PatternsFigure 10(a) (A1⟶ A10⟶ A1) and (A10⟶ P1hArrP2) and (P1⟶ A1)

Figure 10(b) (A1hArrA10 andA1⟶ P1 andA1⟶ P2)

Figure 10(c) (A1hArrP10 andA1⟶ P2⟶ P1⟶ A1)

Figure 10(d) (A1hArrP1hArrP2 andA1hArrA10hArrA2)

Figure 10(e) (A1hArrP1hArrP2 andA1⟶ A2)

12 Journal of Advanced Transportation

(3) Although the patterns depicted on the map lookcomplicated and messy after the patterns are con-verted to rules they become clear In pattern mapsonly Figures 8(b) 9(a) and 9(d) show a clear uni-directionality +e rest is complex and is difficult tocompare As we can see from Figure 8(a) theintracity tourists passing point A1 could be dividedinto two groups one group flowed to parking lot P1and then leaved the scenic area from P1 +e othergroup flowed to parking lot P2 Two groups havesimultaneity in tourist routes So we could convertthe flow patterns to rules and use ldquoandrdquo to illustratethe simultaneity of different routes in the samepattern (as shown in Tables 3ndash5) By this way allpatterns can be applied to traffic control system forregional tourism

(4) Large primary attractions are more attractive thansmaller secondary attractions Looking from thearrow on the pattern maps tourists always visit areaC1 first and then select area C2 +e main reason isthat area C1 has abundant tourism resources anddiverse tourism activities For example there arecultural attractions (eg Dapeng Ancient City andDongshan Temple) sandy beach entertainmentsand large parking lots for tourists in area C1 +esefactors are also often considered in the evaluation ofthe importance of attractions in tourism network

(5) +e tourists tended to park their cars in an easy-to-access place even if the visited attractions arechanged as shown in Figures 8(a)ndash8(d) 9(b) and9(e) Again here we take area C1 as an example +edistance between the Dapeng Ancient City andDongshan Temple is about 1 km On the patternmaps we can see that there are two-way arrowspointing to the two parking lots P1 and P2 +isindicates that the car tourists tended to park theircars to the nearest parking lot so that they could pickthem up when the tour destination is changed

72 Discussions Tourist flow is the key to traffic man-agement in tourism destinations and it affects the de-velopment of tourism on an island and the experience oftourists +e recent development of transport technolo-gies has shown that traffic flow data will be increasinglycollected and it will be available for data analysis+erefore advanced data analytics should be used tointerpret and depict the complex movements of cartourists +e proposed approach in this study wasintended to find (i) the statistical summaries of the spatial-temporal characteristics of car tourists in the researcharea helping to discover patterns from the mass carlicense plate data (ii) the flow patterns of intercity andintracity tourists helping to illustrate the different pref-erences of the two types of car tourists and (iii) the mostfrequent patterns of all tourists helping to identify the lawof tourist movement and make efficient policy for themanagement of tourist traffic flow +e presented ap-proach enriched the analytical methodology of tourist

traffic flow and suggested a shift from the conventionaland complicated paper or computer interview-basedmethod to a dynamic flow graph-based method Fur-thermore we have shown how transportation data pro-vides hard-to-obtain insights and quantitative results fortourists

In order to illustrate the law of spatial movement oftourists related studies have proposed a variety of macroflow patterns [43ndash45] For example in 2008 McKercher [46]proposed 11 prominent route styles in urban destinations+e macropatterns retain only the main components andsimplify the details and are often used in tourism man-agement to guide destination development Compared withmodeling the flow patterns of tourists at the macrolevelmodeling tourist flow at the microscale is more complicatedLew and McKercher [17] noted that it is a challenge tobalance model effectiveness and usability +e reason is thatsimple patterns may not provide enough details for use andcomplex patterns may be difficult to interpret and apply Inthis study we used the video devices installed at key nodes ofroad network to collect tourist traffic flows and used thefrequent subgraph mining algorithm to discover flow pat-terns at the microscale +e extracted patterns can beconverted into rules and applied to the traffic control systemfor the management of regional tourists +e difficulty offinding and applying patterns at microscale could beovercome by this way

During the peak period of tourism the number of cartourists in the scenic area increases sharply It is easy toresult in road congestion and uneven distribution oftourists between attractions With the use of traffic flowdata the daily monthly and seasonal characteristics oftourists can be analyzed the future tourist flow can bepredicted and the flow patterns can be obtained by datamining methods +us the traffic management departmentcan effectively control the tourist traffic flow and thetourism department can develop attractive tourism prod-ucts to achieve a spatial balance of the distribution oftourists reduce traffic congestion and harmful gas emis-sions and weaken the impact of tourism environment andhuman body

Inside a scenic area attractions form a complex net-work of tourist flow due to the frequent spatial interactionEach attraction is both the origin and destination of cartourists Due to the differences in attractiveness degree ofdevelopment and convenience of transportation touristshave shown special preferences when choosing attractionsand trip routes Accordingly effective identification ofthese preferences will be beneficial to the development oftourist market and will also helpful for tour planners tounderstand how tourists see the spatial connection ofmultiple attractions However traditional manual surveysare time-consuming and laborious and the amount of dataobtained is small It is difficult to reveal tourist preferencesby this way Frequent subgraph mining algorithms providea desirable method for identifying the spatial preferences oftourists +is kind of method could perform well under thesupport of a large amount of movement data betweenattractions

Journal of Advanced Transportation 13

+is study has several limitations Firstly due to lack ofpower supply facilities it was unable to collect the traffic datafor each day When applying the model to the actual controlof tourist traffic flow it is necessary to further co-operatewith the traffic management department to obtain com-prehensive traffic flow data Secondly this study focused onthe mining of flow patterns the influencing factors behindthe identified patterns were not further analyzed As men-tioned by Lew andMcKercher [17] factors related to touristsand destinations can affect how tourists move or travel in adestination +ese factors include family composition in-come and valid information obtained before travelling It isdifficult to obtain these factors by relying only on the trafficflow data collected in the traffic monitoring system+erefore in future work field investigation will benecessary

Data Availability

As the data also form part of an ongoing study the raw dataneeded to reproduce these findings cannot be shared at thistime

Conflicts of Interest

+e authors declare no conflicts of interest

Acknowledgments

+e authors gratefully acknowledge the support of this re-search from the National Science Foundation of China(41701167) and the Basic Research Project of Shenzhen City(JCYJ20170307164104491 and JCYJ20190812171419161)

References

[1] D W Eby and L J Molnar ldquoImportance of scenic byways inroute choice a survey of driving tourists in the United StatesrdquoTransportation Research Part A (Policy and Practice) vol 36no 2 pp 95ndash106 2003

[2] B Wang Z Yang F Han and H Shi ldquoCar tourism inxinjiang the mediation effect of perceived value and touristsatisfaction on the relationship between destination imageand loyaltyrdquo Sustainability vol 9 no 1 pp 1ndash16 2016

[3] X M Zhang and Q Y Pen ldquoA study on tourism informationand self-driving tour trafficrdquo in Proceedings of the First Inter-national Conference on Transportation Engineering SouthwestJiaotong University Chengdu China July 2007

[4] China National Tourism Administration 7e Yearbook ofChina Tourism Statistics 2016 China Travel amp Tourism PressBeijing China 2017

[5] M A Mohamad Toha and H N Ismail ldquoA heritage tourismand tourist flow pattern a perspective on traditional versusmodern technologies in tracking the touristsrdquo InternationalJournal of Built Environment and Sustainability vol 2 no 2pp 85ndash92 2015

[6] K Siła-Nowicka and A S Fotheringham ldquoCalibrating spatialinteraction models from gps tracking data an example ofretail behaviourrdquo Computers Environment and Urban Sys-tems vol 74 pp 136ndash150 2019

[7] F Mao M Ji and T Liu ldquoMining spatiotemporal patterns ofurban dwellers from taxi trajectory datardquo Frontiers of EarthScience vol 10 no 2 pp 205ndash221 2016

[8] L J Zheng D Xia X Zhao and W L Liu ldquoMining tripattractive areas using large-scale taxi trajectory datardquo inProceedings of the 2017 IEEE International Symposium onParallel and Distributed Processing with Applications and 2017IEEE International Conference on Ubiquitous Computing andCommunications pp 1217ndash1222 ISPAIUCC GuangzhouChina December 2017

[9] I Rami and M O Shafiq ldquoDetecting taxi movements usingrandom swap clustering and sequential pattern miningrdquoJournal of Big Data vol 6 no 39 pp 1ndash26 2019

[10] G Kautsar and S Akbar ldquoTrajectory pattern mining using se-quential pattern mining and k-means for predicting future lo-cationrdquo Journal of Physics Conference Series vol 801 no 1 2017

[11] D Guo X Zhu H Jin P Gao and C Andris ldquoDiscoveringspatial patterns in origin-destination mobility datardquo Trans-actions in GIS vol 16 no 3 pp 411ndash429 2012

[12] J D Mazimpaka and S Timpf ldquoA visual and computationalanalysis approach for exploring significant locations and timeperiods along a bus routerdquo in Proceedings of the 9th ACMSIGSPATIAL International Workshop on ComputationalTransportation Science pp 43ndash48 San Francisco CA USAOctober 2016

[13] H Faroqi M Mesbah and J Kim ldquoSpatial-temporal simi-larity correlation between public transit passengers usingsmart card datardquo Journal of Advanced Transportationvol 2017 Article ID 1318945 14 pages 2017

[14] Z Wang Y Hu P Zhu Y Qin and L Jia ldquoRing aggregationpattern of metro passenger trips a study using smart carddatardquo Physica A Statistical Mechanics and Its Applicationsvol 491 pp 471ndash479 2018

[15] X Zhao X Lu Y Liu J Lin and J An ldquoTourist movementpatterns understanding from the perspective of travel partysize using mobile tracking data a case study of Xirsquoan ChinardquoTourism Management vol 69 pp 368ndash383 2018

[16] M Versichele L De Groote M Claeys Bouuaert T NeutensI Moerman and N Van De Weghe ldquoPattern mining intourist attraction visits through association rule learning onbluetooth tracking data a case study of ghent BelgiumrdquoTourism Management vol 44 pp 67ndash81 2014

[17] A Lew and B Mckercher ldquoModeling tourist movementsrdquoAnnals of Tourism Research vol 33 no 2 pp 403ndash423 2006

[18] A Birenboim S Anton-Clave A P Russo and N ShovalldquoTemporal activity patterns of theme park visitorsrdquo TourismGeographies vol 15 no 4 pp 601ndash619 2013

[19] H J Yun and M H Park ldquoTime-space movement of festivalvisitors in rural areas using a smart phone applicationrdquo AsiaPacific Journal of Tourism Research vol 20 no 11pp 1246ndash1265 2015

[20] S Kang ldquoAssociations between space-time constraints andspatial patterns of travelsrdquo Annals of Tourism Researchvol 61 pp 127ndash141 2016

[21] C Comito D Falcone and D Talia ldquoMining human mobilitypatterns from social geo-tagged datardquo Pervasive and MobileComputing vol 33 pp 91ndash107 2016

[22] Y T Zheng Y Li Z J Zha and T S Chua ldquoMining travelpatterns fromGPS-tagged photosrdquo in Lecture Notes in ComputerScience K T LeeWH Tsai H YM Liao T Chen JWHsiehand C C Tseng Eds Springer Berlin Germany 2011

[23] K M Borgwardt H P Kriegel and P WackersreutherldquoPattern mining in frequent dynamic subgraphsrdquo in Pro-ceedings of the 6th IEEE International Conference on Data

14 Journal of Advanced Transportation

Mining (ICDM 2006) Hong Kong China IEEE December2006

[24] D J Cook and L B HolderMining Graph Data Wiley PressHoboken NJ USA 2015

[25] C Jiang F Coenen andM Zito ldquoFrequent sub-graph miningon edge weighted graphsrdquo in Lecture Notes in ComputerScience T Bach Pedersen M K Mohania and A M TjoaEds Springer Berlin Germany pp 77ndash88 2010

[26] X Yan and J Han ldquogSpan graph-based substructure patternminingrdquo in Proceedings of the 2002 IEEE InternationalConference on Data Mining ICDMrsquo02 pp 721ndash724 IEEEComputer Society Maebashi City Japan December 2002

[27] C Borgelt and M R Berthold ldquoMining molecular fragmentsfinding relevant substructures of moleculesrdquo in Proceedings ofthe 2002 IEEE International Conference on Data MiningICDMrsquo02 pp 51ndash58 IEEE Computer Society Maebashi CityJapan December 2002

[28] J Huan W Wang and J Prins ldquoEfficient mining of frequentsubgraphs in the presence of isomorphismrdquo in Proceedings ofthe 7ird IEEE International Conference on Data MiningICDMrsquo03 pp 549ndash552 IEEE Computer Society MelbourneFL USA November 2003

[29] S Nijssen ldquoA quickstart in frequent structure mining canmake a differencerdquo in Proceedings of the Tenth ACM SIGKDDInternational Conference on Knowledge Discovery and DataMining KDDrsquo04 pp 647ndash652 ACM Seattle WA USAAugust 2004

[30] J Huan W Wang J Prins and J Yang ldquoSPIN miningmaximal frequent subgraphs from graph databasesrdquo in Pro-ceedings of the Tenth ACM SIGKDD International Conferenceon Knowledge Discovery and Data Mining Seattle WA USAACM August 2004

[31] M Worlein T Meinl I Fischer and M ampPhilippsen ldquoAquantitative comparison of the subgraph miners MoFagSpan FFSM and gastonrdquo in Proceedings of the 9th EuropeanConference on Principles and Practice of Knowledge Discoveryin Databases PKDDrsquo05 Springer Porto Portugal pp 392ndash403 2005

[32] X Yan H Cheng J Han and P ampYu ldquoMining significantgraph patterns by leap searchrdquo in Proceedings of the 2008ACM SIGMOD pp 433ndash444 Vancouver Canada June 2008

[33] L T +omas S R Valluri and K Karlapalem ldquoMarginmaximal frequent subgraph miningrdquo in Proceedings of theSixth International Conference on Data Mining (ICDMrsquo06)Hong Kong China December 2006

[34] V Ingalalli D Ienco and P Poncelet ldquoMining frequentsubgraphs in multigraphsrdquo Information Sciences vol 451-452pp 50ndash66 2018

[35] L Min Z C Wang J Liang and X R Yuan ldquoOD-wheelvisual design to explore OD patterns of a central regionrdquo inProceedings of the Visualization Symposium 2015 HangzhouChina April 2015

[36] B Jenny D M Stephen I Muehlenhaus and B E MarstonldquoDesign principles for origin-destination flowmapsrdquo Car-tography and Geographic Information Science vol 45 no 1pp 1ndash15 2016

[37] S Kim S Jeong I Woo Y Jang R Maciejewski andD S Ebert ldquoData flow analysis and visualization for spa-tiotemporal statistical data without trajectory informationrdquoIEEE Transactions on Visualization and Computer Graphicsvol 24 no 3 pp 1287ndash1300 2018

[38] J Wood J Dykes and A Slingsby ldquoVisualisation of originsdestinations and flows with OD mapsrdquo 7e CartographicJournal vol 47 no 2 pp 117ndash129 2010

[39] M Zillinger ldquoTourist routes a time-geographical approach onGerman car-tourists in Swedenrdquo Tourism Geographies vol 9no 1 pp 64ndash83 2007

[40] S Page and J Connell ldquoExploring the spatial patterns of car-based tourist travel in loch lomond and trossachs nationalpark Scotlandrdquo Tourism Management vol 29 no 3pp 561ndash580 2008

[41] A Inokuchi T Washio and H ampMotoda ldquoAn apriori-basedalgorithm for mining frequent substructures from graphdatardquo in Proceedings of the 4th European Conference onPrinciples of Data Mining and Knowledge DiscoveryPKDDrsquo00 pp 13ndash23 Lyon France September 2000

[42] M Kuramochi and G Karypis ldquoFrequent subgraph discov-eryrdquo in Proceedings of the 2001 IEEE International Conferenceon Data Mining ICDMrsquo01 pp 313ndash320 IEEE ComputerSociety San Jose CA USA August 2001

[43] M Oppermann ldquoA model of travel itinerariesrdquo Journal ofTravel Research vol 33 no 4 pp 57ndash61 1995

[44] N Shoval and A Raveh ldquoCategorization of tourist attractionsand the modeling of tourist cities based on the co-plotmethod of multivariate analysisrdquo Tourism Managementvol 25 no 6 pp 741ndash750 2004

[45] G Lau and B McKercher ldquoUnderstanding tourist movementpatterns in a destination a gis approachrdquo Tourism andHospitality Research vol 9 no 1 pp 39ndash49 2007

[46] B McKercher and G Lau ldquoMovement patterns of touristswithin a destinationrdquo Tourism Geographies vol 10 no 3pp 355ndash374 2008

Journal of Advanced Transportation 15

Page 11: DiscoveringtheGraph-BasedFlowPatternsofCarTouristsUsing ...downloads.hindawi.com/journals/jat/2020/4795830.pdfResearchArticle DiscoveringtheGraph-BasedFlowPatternsofCarTouristsUsing

attractions near P1 As shown in Figure 9(e) the cartourists that flowed between A10 and A2 went back to P1and changed car parking lots between P1 and P2

However there is no tourist flow to A1 and A5 +e reasonfor this may be that the traffic flow of Pingxi Road (theexpressway in and out of the peninsula) was not

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(a)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(b)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(c)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(d)

2265

2260

2255

2250

2245

Latit

ude

11435 11440 11445 11450 11455 11460Longitude

C2

C1

S = 73

(e)

Figure 10 Flow patterns of all tourists

Journal of Advanced Transportation 11

monitored +is part of traffic flow could directly arriveand then leave from A10

623 Flow Patterns of All Tourists +is section presents anexploration of the spatial flow patterns of all car tourists +eresults are shown in Figure 10 +e minimum supportthreshold for gSpan was set to 73 which means that 73 of the76 graphs contained the discovered flow pattern +e resultpatterns are listed in Table 5

All of the patterns shown in Figure 10 have touristflow between area C1 and area C2 +is is consistent withthe trend in the flow map (shown in Figure 7) As thepicture shows A1 is the main entrance for tourists toenter and exit Dapeng Island Some of the car touristsdrove to P1 or P2 and the rest flowed to A10 or A2Figures 10(a)ndash10(c) show the directions of tourist flowsbetween the Dapeng Community Pengcheng Commu-nity and Xichong Community Figure 10(d) shows thedirections of tourist flows between Dapeng CommunityPengcheng Community and Dongchong CommunityFigure 10(e) represents only the directions of tourist flowsbetween Dapeng Community and Dongchong Commu-nity Furthermore the directions of tourist flows in areaC1 and area C2 or between area C1 and area C2 aredifferent Taking Figures 10(a)ndash10(c) as examples al-though the orders of P1 and P2 that are accessed from A1have a lack of regularity they have their own charac-teristics when considering the accompanying paths toA10 +ese examples indicate that if there is a tourist flowbetween A1 and A10 it could be divided into two cases Inone case some of the tourists returned directly and in theother case some of the tourists flowed to P1 In the firstcase either the tourist flow passing by A1 first accessed P2and then visited P1 or both P1 and P2 had tourists at thesame time In the second case some of the touristsreturned from A10 visited P1 and P2 and then left theisland

7 Conclusions and Discussions

71 Conclusions In order to facilitate the management oftourist traffic flow the car tourists in Dapeng Island weretaken as a research case +e experimental analysis used realdata captured by video devices in the research area Due to thelack of suitable device installation locations the capturedpicture from the video device had a certain distance from theroad+e catch rate of the touristsrsquo cars was low However thedetailed time-series data in one day could be collectedCompared with the manual survey the collected data was

improved in terms of reliability and richness A day waschosen as the time unit for frequent pattern mining Afterselecting the available data we divided the data of cars intointracity and intercity tourists Next the license plate datawere transformed into movement graphs according to thevisited location sequence between multiple monitoringpoints +e intricate flow patterns of the car tourists werediscovered by gSpan algorithm which had the best perfor-mance in terms of the quality of the results and the executiontime and which had already proven to be efficient for frequentsubgraph mining +e conclusions are as follows

(1) +e car tourists had obvious preferences in the se-lection of trip time and tourist attractions (shown inFigures 6 and 7) In terms of time the curves oftourist flow at each monitoring point were similar+ere were a large number of car tourists at variousattractions on holidays but the volume of car touristswas relatively lower on weekdays +e same types ofattractions had the similar trends in tourist flowsuch as the two attractions with a sandy beach(Dongchong Community and Xichong Community)and the ancient city and cultural attractions (DapengAncient City and Dongshan Temple) In terms ofspace the attractions that are close to the entrance ofa scenic area and rich in tourist resources were morepopular with tourists However due to the terrainbarrier in the scenic area the traffic conditions af-fected the movements of tourists between multipleattractions

(2) Different types of car tourists had similar spatialchoices in scenic area (shown in Figures 8ndash10) Forexample different types of tourists had flow patternsthat described the movements in one area (as shownin Figures 8(a) 8(b) and 9(a)ndash9(c)) and the move-ments between different areas (as shown inFigures 8(c) 8(d) 9(d) and 9(e)) +e intercitytourists and intracity tourists had different choices inscenic spots +e intercity tourists would takemultidestination trips instead of single destinationtrips in the same type of attractions As can be seen inFigures 8 and 9 there are the two sandy beach at-tractions in areaC2 the intracity tourists tend to visitone of them while the intercity tourists would visitboth attractions Specifically to save time andmoney intercity tourists would visit multiple at-tractions in one trip instead of making multiple tripsAdditionally another difference was that there wasno circlet in area C1 for intercity tourists and notraffic flow between A2 and A10 for intracity tourists

Table 5 +e list of flow patterns inferred from all tourists

Figures PatternsFigure 10(a) (A1⟶ A10⟶ A1) and (A10⟶ P1hArrP2) and (P1⟶ A1)

Figure 10(b) (A1hArrA10 andA1⟶ P1 andA1⟶ P2)

Figure 10(c) (A1hArrP10 andA1⟶ P2⟶ P1⟶ A1)

Figure 10(d) (A1hArrP1hArrP2 andA1hArrA10hArrA2)

Figure 10(e) (A1hArrP1hArrP2 andA1⟶ A2)

12 Journal of Advanced Transportation

(3) Although the patterns depicted on the map lookcomplicated and messy after the patterns are con-verted to rules they become clear In pattern mapsonly Figures 8(b) 9(a) and 9(d) show a clear uni-directionality +e rest is complex and is difficult tocompare As we can see from Figure 8(a) theintracity tourists passing point A1 could be dividedinto two groups one group flowed to parking lot P1and then leaved the scenic area from P1 +e othergroup flowed to parking lot P2 Two groups havesimultaneity in tourist routes So we could convertthe flow patterns to rules and use ldquoandrdquo to illustratethe simultaneity of different routes in the samepattern (as shown in Tables 3ndash5) By this way allpatterns can be applied to traffic control system forregional tourism

(4) Large primary attractions are more attractive thansmaller secondary attractions Looking from thearrow on the pattern maps tourists always visit areaC1 first and then select area C2 +e main reason isthat area C1 has abundant tourism resources anddiverse tourism activities For example there arecultural attractions (eg Dapeng Ancient City andDongshan Temple) sandy beach entertainmentsand large parking lots for tourists in area C1 +esefactors are also often considered in the evaluation ofthe importance of attractions in tourism network

(5) +e tourists tended to park their cars in an easy-to-access place even if the visited attractions arechanged as shown in Figures 8(a)ndash8(d) 9(b) and9(e) Again here we take area C1 as an example +edistance between the Dapeng Ancient City andDongshan Temple is about 1 km On the patternmaps we can see that there are two-way arrowspointing to the two parking lots P1 and P2 +isindicates that the car tourists tended to park theircars to the nearest parking lot so that they could pickthem up when the tour destination is changed

72 Discussions Tourist flow is the key to traffic man-agement in tourism destinations and it affects the de-velopment of tourism on an island and the experience oftourists +e recent development of transport technolo-gies has shown that traffic flow data will be increasinglycollected and it will be available for data analysis+erefore advanced data analytics should be used tointerpret and depict the complex movements of cartourists +e proposed approach in this study wasintended to find (i) the statistical summaries of the spatial-temporal characteristics of car tourists in the researcharea helping to discover patterns from the mass carlicense plate data (ii) the flow patterns of intercity andintracity tourists helping to illustrate the different pref-erences of the two types of car tourists and (iii) the mostfrequent patterns of all tourists helping to identify the lawof tourist movement and make efficient policy for themanagement of tourist traffic flow +e presented ap-proach enriched the analytical methodology of tourist

traffic flow and suggested a shift from the conventionaland complicated paper or computer interview-basedmethod to a dynamic flow graph-based method Fur-thermore we have shown how transportation data pro-vides hard-to-obtain insights and quantitative results fortourists

In order to illustrate the law of spatial movement oftourists related studies have proposed a variety of macroflow patterns [43ndash45] For example in 2008 McKercher [46]proposed 11 prominent route styles in urban destinations+e macropatterns retain only the main components andsimplify the details and are often used in tourism man-agement to guide destination development Compared withmodeling the flow patterns of tourists at the macrolevelmodeling tourist flow at the microscale is more complicatedLew and McKercher [17] noted that it is a challenge tobalance model effectiveness and usability +e reason is thatsimple patterns may not provide enough details for use andcomplex patterns may be difficult to interpret and apply Inthis study we used the video devices installed at key nodes ofroad network to collect tourist traffic flows and used thefrequent subgraph mining algorithm to discover flow pat-terns at the microscale +e extracted patterns can beconverted into rules and applied to the traffic control systemfor the management of regional tourists +e difficulty offinding and applying patterns at microscale could beovercome by this way

During the peak period of tourism the number of cartourists in the scenic area increases sharply It is easy toresult in road congestion and uneven distribution oftourists between attractions With the use of traffic flowdata the daily monthly and seasonal characteristics oftourists can be analyzed the future tourist flow can bepredicted and the flow patterns can be obtained by datamining methods +us the traffic management departmentcan effectively control the tourist traffic flow and thetourism department can develop attractive tourism prod-ucts to achieve a spatial balance of the distribution oftourists reduce traffic congestion and harmful gas emis-sions and weaken the impact of tourism environment andhuman body

Inside a scenic area attractions form a complex net-work of tourist flow due to the frequent spatial interactionEach attraction is both the origin and destination of cartourists Due to the differences in attractiveness degree ofdevelopment and convenience of transportation touristshave shown special preferences when choosing attractionsand trip routes Accordingly effective identification ofthese preferences will be beneficial to the development oftourist market and will also helpful for tour planners tounderstand how tourists see the spatial connection ofmultiple attractions However traditional manual surveysare time-consuming and laborious and the amount of dataobtained is small It is difficult to reveal tourist preferencesby this way Frequent subgraph mining algorithms providea desirable method for identifying the spatial preferences oftourists +is kind of method could perform well under thesupport of a large amount of movement data betweenattractions

Journal of Advanced Transportation 13

+is study has several limitations Firstly due to lack ofpower supply facilities it was unable to collect the traffic datafor each day When applying the model to the actual controlof tourist traffic flow it is necessary to further co-operatewith the traffic management department to obtain com-prehensive traffic flow data Secondly this study focused onthe mining of flow patterns the influencing factors behindthe identified patterns were not further analyzed As men-tioned by Lew andMcKercher [17] factors related to touristsand destinations can affect how tourists move or travel in adestination +ese factors include family composition in-come and valid information obtained before travelling It isdifficult to obtain these factors by relying only on the trafficflow data collected in the traffic monitoring system+erefore in future work field investigation will benecessary

Data Availability

As the data also form part of an ongoing study the raw dataneeded to reproduce these findings cannot be shared at thistime

Conflicts of Interest

+e authors declare no conflicts of interest

Acknowledgments

+e authors gratefully acknowledge the support of this re-search from the National Science Foundation of China(41701167) and the Basic Research Project of Shenzhen City(JCYJ20170307164104491 and JCYJ20190812171419161)

References

[1] D W Eby and L J Molnar ldquoImportance of scenic byways inroute choice a survey of driving tourists in the United StatesrdquoTransportation Research Part A (Policy and Practice) vol 36no 2 pp 95ndash106 2003

[2] B Wang Z Yang F Han and H Shi ldquoCar tourism inxinjiang the mediation effect of perceived value and touristsatisfaction on the relationship between destination imageand loyaltyrdquo Sustainability vol 9 no 1 pp 1ndash16 2016

[3] X M Zhang and Q Y Pen ldquoA study on tourism informationand self-driving tour trafficrdquo in Proceedings of the First Inter-national Conference on Transportation Engineering SouthwestJiaotong University Chengdu China July 2007

[4] China National Tourism Administration 7e Yearbook ofChina Tourism Statistics 2016 China Travel amp Tourism PressBeijing China 2017

[5] M A Mohamad Toha and H N Ismail ldquoA heritage tourismand tourist flow pattern a perspective on traditional versusmodern technologies in tracking the touristsrdquo InternationalJournal of Built Environment and Sustainability vol 2 no 2pp 85ndash92 2015

[6] K Siła-Nowicka and A S Fotheringham ldquoCalibrating spatialinteraction models from gps tracking data an example ofretail behaviourrdquo Computers Environment and Urban Sys-tems vol 74 pp 136ndash150 2019

[7] F Mao M Ji and T Liu ldquoMining spatiotemporal patterns ofurban dwellers from taxi trajectory datardquo Frontiers of EarthScience vol 10 no 2 pp 205ndash221 2016

[8] L J Zheng D Xia X Zhao and W L Liu ldquoMining tripattractive areas using large-scale taxi trajectory datardquo inProceedings of the 2017 IEEE International Symposium onParallel and Distributed Processing with Applications and 2017IEEE International Conference on Ubiquitous Computing andCommunications pp 1217ndash1222 ISPAIUCC GuangzhouChina December 2017

[9] I Rami and M O Shafiq ldquoDetecting taxi movements usingrandom swap clustering and sequential pattern miningrdquoJournal of Big Data vol 6 no 39 pp 1ndash26 2019

[10] G Kautsar and S Akbar ldquoTrajectory pattern mining using se-quential pattern mining and k-means for predicting future lo-cationrdquo Journal of Physics Conference Series vol 801 no 1 2017

[11] D Guo X Zhu H Jin P Gao and C Andris ldquoDiscoveringspatial patterns in origin-destination mobility datardquo Trans-actions in GIS vol 16 no 3 pp 411ndash429 2012

[12] J D Mazimpaka and S Timpf ldquoA visual and computationalanalysis approach for exploring significant locations and timeperiods along a bus routerdquo in Proceedings of the 9th ACMSIGSPATIAL International Workshop on ComputationalTransportation Science pp 43ndash48 San Francisco CA USAOctober 2016

[13] H Faroqi M Mesbah and J Kim ldquoSpatial-temporal simi-larity correlation between public transit passengers usingsmart card datardquo Journal of Advanced Transportationvol 2017 Article ID 1318945 14 pages 2017

[14] Z Wang Y Hu P Zhu Y Qin and L Jia ldquoRing aggregationpattern of metro passenger trips a study using smart carddatardquo Physica A Statistical Mechanics and Its Applicationsvol 491 pp 471ndash479 2018

[15] X Zhao X Lu Y Liu J Lin and J An ldquoTourist movementpatterns understanding from the perspective of travel partysize using mobile tracking data a case study of Xirsquoan ChinardquoTourism Management vol 69 pp 368ndash383 2018

[16] M Versichele L De Groote M Claeys Bouuaert T NeutensI Moerman and N Van De Weghe ldquoPattern mining intourist attraction visits through association rule learning onbluetooth tracking data a case study of ghent BelgiumrdquoTourism Management vol 44 pp 67ndash81 2014

[17] A Lew and B Mckercher ldquoModeling tourist movementsrdquoAnnals of Tourism Research vol 33 no 2 pp 403ndash423 2006

[18] A Birenboim S Anton-Clave A P Russo and N ShovalldquoTemporal activity patterns of theme park visitorsrdquo TourismGeographies vol 15 no 4 pp 601ndash619 2013

[19] H J Yun and M H Park ldquoTime-space movement of festivalvisitors in rural areas using a smart phone applicationrdquo AsiaPacific Journal of Tourism Research vol 20 no 11pp 1246ndash1265 2015

[20] S Kang ldquoAssociations between space-time constraints andspatial patterns of travelsrdquo Annals of Tourism Researchvol 61 pp 127ndash141 2016

[21] C Comito D Falcone and D Talia ldquoMining human mobilitypatterns from social geo-tagged datardquo Pervasive and MobileComputing vol 33 pp 91ndash107 2016

[22] Y T Zheng Y Li Z J Zha and T S Chua ldquoMining travelpatterns fromGPS-tagged photosrdquo in Lecture Notes in ComputerScience K T LeeWH Tsai H YM Liao T Chen JWHsiehand C C Tseng Eds Springer Berlin Germany 2011

[23] K M Borgwardt H P Kriegel and P WackersreutherldquoPattern mining in frequent dynamic subgraphsrdquo in Pro-ceedings of the 6th IEEE International Conference on Data

14 Journal of Advanced Transportation

Mining (ICDM 2006) Hong Kong China IEEE December2006

[24] D J Cook and L B HolderMining Graph Data Wiley PressHoboken NJ USA 2015

[25] C Jiang F Coenen andM Zito ldquoFrequent sub-graph miningon edge weighted graphsrdquo in Lecture Notes in ComputerScience T Bach Pedersen M K Mohania and A M TjoaEds Springer Berlin Germany pp 77ndash88 2010

[26] X Yan and J Han ldquogSpan graph-based substructure patternminingrdquo in Proceedings of the 2002 IEEE InternationalConference on Data Mining ICDMrsquo02 pp 721ndash724 IEEEComputer Society Maebashi City Japan December 2002

[27] C Borgelt and M R Berthold ldquoMining molecular fragmentsfinding relevant substructures of moleculesrdquo in Proceedings ofthe 2002 IEEE International Conference on Data MiningICDMrsquo02 pp 51ndash58 IEEE Computer Society Maebashi CityJapan December 2002

[28] J Huan W Wang and J Prins ldquoEfficient mining of frequentsubgraphs in the presence of isomorphismrdquo in Proceedings ofthe 7ird IEEE International Conference on Data MiningICDMrsquo03 pp 549ndash552 IEEE Computer Society MelbourneFL USA November 2003

[29] S Nijssen ldquoA quickstart in frequent structure mining canmake a differencerdquo in Proceedings of the Tenth ACM SIGKDDInternational Conference on Knowledge Discovery and DataMining KDDrsquo04 pp 647ndash652 ACM Seattle WA USAAugust 2004

[30] J Huan W Wang J Prins and J Yang ldquoSPIN miningmaximal frequent subgraphs from graph databasesrdquo in Pro-ceedings of the Tenth ACM SIGKDD International Conferenceon Knowledge Discovery and Data Mining Seattle WA USAACM August 2004

[31] M Worlein T Meinl I Fischer and M ampPhilippsen ldquoAquantitative comparison of the subgraph miners MoFagSpan FFSM and gastonrdquo in Proceedings of the 9th EuropeanConference on Principles and Practice of Knowledge Discoveryin Databases PKDDrsquo05 Springer Porto Portugal pp 392ndash403 2005

[32] X Yan H Cheng J Han and P ampYu ldquoMining significantgraph patterns by leap searchrdquo in Proceedings of the 2008ACM SIGMOD pp 433ndash444 Vancouver Canada June 2008

[33] L T +omas S R Valluri and K Karlapalem ldquoMarginmaximal frequent subgraph miningrdquo in Proceedings of theSixth International Conference on Data Mining (ICDMrsquo06)Hong Kong China December 2006

[34] V Ingalalli D Ienco and P Poncelet ldquoMining frequentsubgraphs in multigraphsrdquo Information Sciences vol 451-452pp 50ndash66 2018

[35] L Min Z C Wang J Liang and X R Yuan ldquoOD-wheelvisual design to explore OD patterns of a central regionrdquo inProceedings of the Visualization Symposium 2015 HangzhouChina April 2015

[36] B Jenny D M Stephen I Muehlenhaus and B E MarstonldquoDesign principles for origin-destination flowmapsrdquo Car-tography and Geographic Information Science vol 45 no 1pp 1ndash15 2016

[37] S Kim S Jeong I Woo Y Jang R Maciejewski andD S Ebert ldquoData flow analysis and visualization for spa-tiotemporal statistical data without trajectory informationrdquoIEEE Transactions on Visualization and Computer Graphicsvol 24 no 3 pp 1287ndash1300 2018

[38] J Wood J Dykes and A Slingsby ldquoVisualisation of originsdestinations and flows with OD mapsrdquo 7e CartographicJournal vol 47 no 2 pp 117ndash129 2010

[39] M Zillinger ldquoTourist routes a time-geographical approach onGerman car-tourists in Swedenrdquo Tourism Geographies vol 9no 1 pp 64ndash83 2007

[40] S Page and J Connell ldquoExploring the spatial patterns of car-based tourist travel in loch lomond and trossachs nationalpark Scotlandrdquo Tourism Management vol 29 no 3pp 561ndash580 2008

[41] A Inokuchi T Washio and H ampMotoda ldquoAn apriori-basedalgorithm for mining frequent substructures from graphdatardquo in Proceedings of the 4th European Conference onPrinciples of Data Mining and Knowledge DiscoveryPKDDrsquo00 pp 13ndash23 Lyon France September 2000

[42] M Kuramochi and G Karypis ldquoFrequent subgraph discov-eryrdquo in Proceedings of the 2001 IEEE International Conferenceon Data Mining ICDMrsquo01 pp 313ndash320 IEEE ComputerSociety San Jose CA USA August 2001

[43] M Oppermann ldquoA model of travel itinerariesrdquo Journal ofTravel Research vol 33 no 4 pp 57ndash61 1995

[44] N Shoval and A Raveh ldquoCategorization of tourist attractionsand the modeling of tourist cities based on the co-plotmethod of multivariate analysisrdquo Tourism Managementvol 25 no 6 pp 741ndash750 2004

[45] G Lau and B McKercher ldquoUnderstanding tourist movementpatterns in a destination a gis approachrdquo Tourism andHospitality Research vol 9 no 1 pp 39ndash49 2007

[46] B McKercher and G Lau ldquoMovement patterns of touristswithin a destinationrdquo Tourism Geographies vol 10 no 3pp 355ndash374 2008

Journal of Advanced Transportation 15

Page 12: DiscoveringtheGraph-BasedFlowPatternsofCarTouristsUsing ...downloads.hindawi.com/journals/jat/2020/4795830.pdfResearchArticle DiscoveringtheGraph-BasedFlowPatternsofCarTouristsUsing

monitored +is part of traffic flow could directly arriveand then leave from A10

623 Flow Patterns of All Tourists +is section presents anexploration of the spatial flow patterns of all car tourists +eresults are shown in Figure 10 +e minimum supportthreshold for gSpan was set to 73 which means that 73 of the76 graphs contained the discovered flow pattern +e resultpatterns are listed in Table 5

All of the patterns shown in Figure 10 have touristflow between area C1 and area C2 +is is consistent withthe trend in the flow map (shown in Figure 7) As thepicture shows A1 is the main entrance for tourists toenter and exit Dapeng Island Some of the car touristsdrove to P1 or P2 and the rest flowed to A10 or A2Figures 10(a)ndash10(c) show the directions of tourist flowsbetween the Dapeng Community Pengcheng Commu-nity and Xichong Community Figure 10(d) shows thedirections of tourist flows between Dapeng CommunityPengcheng Community and Dongchong CommunityFigure 10(e) represents only the directions of tourist flowsbetween Dapeng Community and Dongchong Commu-nity Furthermore the directions of tourist flows in areaC1 and area C2 or between area C1 and area C2 aredifferent Taking Figures 10(a)ndash10(c) as examples al-though the orders of P1 and P2 that are accessed from A1have a lack of regularity they have their own charac-teristics when considering the accompanying paths toA10 +ese examples indicate that if there is a tourist flowbetween A1 and A10 it could be divided into two cases Inone case some of the tourists returned directly and in theother case some of the tourists flowed to P1 In the firstcase either the tourist flow passing by A1 first accessed P2and then visited P1 or both P1 and P2 had tourists at thesame time In the second case some of the touristsreturned from A10 visited P1 and P2 and then left theisland

7 Conclusions and Discussions

71 Conclusions In order to facilitate the management oftourist traffic flow the car tourists in Dapeng Island weretaken as a research case +e experimental analysis used realdata captured by video devices in the research area Due to thelack of suitable device installation locations the capturedpicture from the video device had a certain distance from theroad+e catch rate of the touristsrsquo cars was low However thedetailed time-series data in one day could be collectedCompared with the manual survey the collected data was

improved in terms of reliability and richness A day waschosen as the time unit for frequent pattern mining Afterselecting the available data we divided the data of cars intointracity and intercity tourists Next the license plate datawere transformed into movement graphs according to thevisited location sequence between multiple monitoringpoints +e intricate flow patterns of the car tourists werediscovered by gSpan algorithm which had the best perfor-mance in terms of the quality of the results and the executiontime and which had already proven to be efficient for frequentsubgraph mining +e conclusions are as follows

(1) +e car tourists had obvious preferences in the se-lection of trip time and tourist attractions (shown inFigures 6 and 7) In terms of time the curves oftourist flow at each monitoring point were similar+ere were a large number of car tourists at variousattractions on holidays but the volume of car touristswas relatively lower on weekdays +e same types ofattractions had the similar trends in tourist flowsuch as the two attractions with a sandy beach(Dongchong Community and Xichong Community)and the ancient city and cultural attractions (DapengAncient City and Dongshan Temple) In terms ofspace the attractions that are close to the entrance ofa scenic area and rich in tourist resources were morepopular with tourists However due to the terrainbarrier in the scenic area the traffic conditions af-fected the movements of tourists between multipleattractions

(2) Different types of car tourists had similar spatialchoices in scenic area (shown in Figures 8ndash10) Forexample different types of tourists had flow patternsthat described the movements in one area (as shownin Figures 8(a) 8(b) and 9(a)ndash9(c)) and the move-ments between different areas (as shown inFigures 8(c) 8(d) 9(d) and 9(e)) +e intercitytourists and intracity tourists had different choices inscenic spots +e intercity tourists would takemultidestination trips instead of single destinationtrips in the same type of attractions As can be seen inFigures 8 and 9 there are the two sandy beach at-tractions in areaC2 the intracity tourists tend to visitone of them while the intercity tourists would visitboth attractions Specifically to save time andmoney intercity tourists would visit multiple at-tractions in one trip instead of making multiple tripsAdditionally another difference was that there wasno circlet in area C1 for intercity tourists and notraffic flow between A2 and A10 for intracity tourists

Table 5 +e list of flow patterns inferred from all tourists

Figures PatternsFigure 10(a) (A1⟶ A10⟶ A1) and (A10⟶ P1hArrP2) and (P1⟶ A1)

Figure 10(b) (A1hArrA10 andA1⟶ P1 andA1⟶ P2)

Figure 10(c) (A1hArrP10 andA1⟶ P2⟶ P1⟶ A1)

Figure 10(d) (A1hArrP1hArrP2 andA1hArrA10hArrA2)

Figure 10(e) (A1hArrP1hArrP2 andA1⟶ A2)

12 Journal of Advanced Transportation

(3) Although the patterns depicted on the map lookcomplicated and messy after the patterns are con-verted to rules they become clear In pattern mapsonly Figures 8(b) 9(a) and 9(d) show a clear uni-directionality +e rest is complex and is difficult tocompare As we can see from Figure 8(a) theintracity tourists passing point A1 could be dividedinto two groups one group flowed to parking lot P1and then leaved the scenic area from P1 +e othergroup flowed to parking lot P2 Two groups havesimultaneity in tourist routes So we could convertthe flow patterns to rules and use ldquoandrdquo to illustratethe simultaneity of different routes in the samepattern (as shown in Tables 3ndash5) By this way allpatterns can be applied to traffic control system forregional tourism

(4) Large primary attractions are more attractive thansmaller secondary attractions Looking from thearrow on the pattern maps tourists always visit areaC1 first and then select area C2 +e main reason isthat area C1 has abundant tourism resources anddiverse tourism activities For example there arecultural attractions (eg Dapeng Ancient City andDongshan Temple) sandy beach entertainmentsand large parking lots for tourists in area C1 +esefactors are also often considered in the evaluation ofthe importance of attractions in tourism network

(5) +e tourists tended to park their cars in an easy-to-access place even if the visited attractions arechanged as shown in Figures 8(a)ndash8(d) 9(b) and9(e) Again here we take area C1 as an example +edistance between the Dapeng Ancient City andDongshan Temple is about 1 km On the patternmaps we can see that there are two-way arrowspointing to the two parking lots P1 and P2 +isindicates that the car tourists tended to park theircars to the nearest parking lot so that they could pickthem up when the tour destination is changed

72 Discussions Tourist flow is the key to traffic man-agement in tourism destinations and it affects the de-velopment of tourism on an island and the experience oftourists +e recent development of transport technolo-gies has shown that traffic flow data will be increasinglycollected and it will be available for data analysis+erefore advanced data analytics should be used tointerpret and depict the complex movements of cartourists +e proposed approach in this study wasintended to find (i) the statistical summaries of the spatial-temporal characteristics of car tourists in the researcharea helping to discover patterns from the mass carlicense plate data (ii) the flow patterns of intercity andintracity tourists helping to illustrate the different pref-erences of the two types of car tourists and (iii) the mostfrequent patterns of all tourists helping to identify the lawof tourist movement and make efficient policy for themanagement of tourist traffic flow +e presented ap-proach enriched the analytical methodology of tourist

traffic flow and suggested a shift from the conventionaland complicated paper or computer interview-basedmethod to a dynamic flow graph-based method Fur-thermore we have shown how transportation data pro-vides hard-to-obtain insights and quantitative results fortourists

In order to illustrate the law of spatial movement oftourists related studies have proposed a variety of macroflow patterns [43ndash45] For example in 2008 McKercher [46]proposed 11 prominent route styles in urban destinations+e macropatterns retain only the main components andsimplify the details and are often used in tourism man-agement to guide destination development Compared withmodeling the flow patterns of tourists at the macrolevelmodeling tourist flow at the microscale is more complicatedLew and McKercher [17] noted that it is a challenge tobalance model effectiveness and usability +e reason is thatsimple patterns may not provide enough details for use andcomplex patterns may be difficult to interpret and apply Inthis study we used the video devices installed at key nodes ofroad network to collect tourist traffic flows and used thefrequent subgraph mining algorithm to discover flow pat-terns at the microscale +e extracted patterns can beconverted into rules and applied to the traffic control systemfor the management of regional tourists +e difficulty offinding and applying patterns at microscale could beovercome by this way

During the peak period of tourism the number of cartourists in the scenic area increases sharply It is easy toresult in road congestion and uneven distribution oftourists between attractions With the use of traffic flowdata the daily monthly and seasonal characteristics oftourists can be analyzed the future tourist flow can bepredicted and the flow patterns can be obtained by datamining methods +us the traffic management departmentcan effectively control the tourist traffic flow and thetourism department can develop attractive tourism prod-ucts to achieve a spatial balance of the distribution oftourists reduce traffic congestion and harmful gas emis-sions and weaken the impact of tourism environment andhuman body

Inside a scenic area attractions form a complex net-work of tourist flow due to the frequent spatial interactionEach attraction is both the origin and destination of cartourists Due to the differences in attractiveness degree ofdevelopment and convenience of transportation touristshave shown special preferences when choosing attractionsand trip routes Accordingly effective identification ofthese preferences will be beneficial to the development oftourist market and will also helpful for tour planners tounderstand how tourists see the spatial connection ofmultiple attractions However traditional manual surveysare time-consuming and laborious and the amount of dataobtained is small It is difficult to reveal tourist preferencesby this way Frequent subgraph mining algorithms providea desirable method for identifying the spatial preferences oftourists +is kind of method could perform well under thesupport of a large amount of movement data betweenattractions

Journal of Advanced Transportation 13

+is study has several limitations Firstly due to lack ofpower supply facilities it was unable to collect the traffic datafor each day When applying the model to the actual controlof tourist traffic flow it is necessary to further co-operatewith the traffic management department to obtain com-prehensive traffic flow data Secondly this study focused onthe mining of flow patterns the influencing factors behindthe identified patterns were not further analyzed As men-tioned by Lew andMcKercher [17] factors related to touristsand destinations can affect how tourists move or travel in adestination +ese factors include family composition in-come and valid information obtained before travelling It isdifficult to obtain these factors by relying only on the trafficflow data collected in the traffic monitoring system+erefore in future work field investigation will benecessary

Data Availability

As the data also form part of an ongoing study the raw dataneeded to reproduce these findings cannot be shared at thistime

Conflicts of Interest

+e authors declare no conflicts of interest

Acknowledgments

+e authors gratefully acknowledge the support of this re-search from the National Science Foundation of China(41701167) and the Basic Research Project of Shenzhen City(JCYJ20170307164104491 and JCYJ20190812171419161)

References

[1] D W Eby and L J Molnar ldquoImportance of scenic byways inroute choice a survey of driving tourists in the United StatesrdquoTransportation Research Part A (Policy and Practice) vol 36no 2 pp 95ndash106 2003

[2] B Wang Z Yang F Han and H Shi ldquoCar tourism inxinjiang the mediation effect of perceived value and touristsatisfaction on the relationship between destination imageand loyaltyrdquo Sustainability vol 9 no 1 pp 1ndash16 2016

[3] X M Zhang and Q Y Pen ldquoA study on tourism informationand self-driving tour trafficrdquo in Proceedings of the First Inter-national Conference on Transportation Engineering SouthwestJiaotong University Chengdu China July 2007

[4] China National Tourism Administration 7e Yearbook ofChina Tourism Statistics 2016 China Travel amp Tourism PressBeijing China 2017

[5] M A Mohamad Toha and H N Ismail ldquoA heritage tourismand tourist flow pattern a perspective on traditional versusmodern technologies in tracking the touristsrdquo InternationalJournal of Built Environment and Sustainability vol 2 no 2pp 85ndash92 2015

[6] K Siła-Nowicka and A S Fotheringham ldquoCalibrating spatialinteraction models from gps tracking data an example ofretail behaviourrdquo Computers Environment and Urban Sys-tems vol 74 pp 136ndash150 2019

[7] F Mao M Ji and T Liu ldquoMining spatiotemporal patterns ofurban dwellers from taxi trajectory datardquo Frontiers of EarthScience vol 10 no 2 pp 205ndash221 2016

[8] L J Zheng D Xia X Zhao and W L Liu ldquoMining tripattractive areas using large-scale taxi trajectory datardquo inProceedings of the 2017 IEEE International Symposium onParallel and Distributed Processing with Applications and 2017IEEE International Conference on Ubiquitous Computing andCommunications pp 1217ndash1222 ISPAIUCC GuangzhouChina December 2017

[9] I Rami and M O Shafiq ldquoDetecting taxi movements usingrandom swap clustering and sequential pattern miningrdquoJournal of Big Data vol 6 no 39 pp 1ndash26 2019

[10] G Kautsar and S Akbar ldquoTrajectory pattern mining using se-quential pattern mining and k-means for predicting future lo-cationrdquo Journal of Physics Conference Series vol 801 no 1 2017

[11] D Guo X Zhu H Jin P Gao and C Andris ldquoDiscoveringspatial patterns in origin-destination mobility datardquo Trans-actions in GIS vol 16 no 3 pp 411ndash429 2012

[12] J D Mazimpaka and S Timpf ldquoA visual and computationalanalysis approach for exploring significant locations and timeperiods along a bus routerdquo in Proceedings of the 9th ACMSIGSPATIAL International Workshop on ComputationalTransportation Science pp 43ndash48 San Francisco CA USAOctober 2016

[13] H Faroqi M Mesbah and J Kim ldquoSpatial-temporal simi-larity correlation between public transit passengers usingsmart card datardquo Journal of Advanced Transportationvol 2017 Article ID 1318945 14 pages 2017

[14] Z Wang Y Hu P Zhu Y Qin and L Jia ldquoRing aggregationpattern of metro passenger trips a study using smart carddatardquo Physica A Statistical Mechanics and Its Applicationsvol 491 pp 471ndash479 2018

[15] X Zhao X Lu Y Liu J Lin and J An ldquoTourist movementpatterns understanding from the perspective of travel partysize using mobile tracking data a case study of Xirsquoan ChinardquoTourism Management vol 69 pp 368ndash383 2018

[16] M Versichele L De Groote M Claeys Bouuaert T NeutensI Moerman and N Van De Weghe ldquoPattern mining intourist attraction visits through association rule learning onbluetooth tracking data a case study of ghent BelgiumrdquoTourism Management vol 44 pp 67ndash81 2014

[17] A Lew and B Mckercher ldquoModeling tourist movementsrdquoAnnals of Tourism Research vol 33 no 2 pp 403ndash423 2006

[18] A Birenboim S Anton-Clave A P Russo and N ShovalldquoTemporal activity patterns of theme park visitorsrdquo TourismGeographies vol 15 no 4 pp 601ndash619 2013

[19] H J Yun and M H Park ldquoTime-space movement of festivalvisitors in rural areas using a smart phone applicationrdquo AsiaPacific Journal of Tourism Research vol 20 no 11pp 1246ndash1265 2015

[20] S Kang ldquoAssociations between space-time constraints andspatial patterns of travelsrdquo Annals of Tourism Researchvol 61 pp 127ndash141 2016

[21] C Comito D Falcone and D Talia ldquoMining human mobilitypatterns from social geo-tagged datardquo Pervasive and MobileComputing vol 33 pp 91ndash107 2016

[22] Y T Zheng Y Li Z J Zha and T S Chua ldquoMining travelpatterns fromGPS-tagged photosrdquo in Lecture Notes in ComputerScience K T LeeWH Tsai H YM Liao T Chen JWHsiehand C C Tseng Eds Springer Berlin Germany 2011

[23] K M Borgwardt H P Kriegel and P WackersreutherldquoPattern mining in frequent dynamic subgraphsrdquo in Pro-ceedings of the 6th IEEE International Conference on Data

14 Journal of Advanced Transportation

Mining (ICDM 2006) Hong Kong China IEEE December2006

[24] D J Cook and L B HolderMining Graph Data Wiley PressHoboken NJ USA 2015

[25] C Jiang F Coenen andM Zito ldquoFrequent sub-graph miningon edge weighted graphsrdquo in Lecture Notes in ComputerScience T Bach Pedersen M K Mohania and A M TjoaEds Springer Berlin Germany pp 77ndash88 2010

[26] X Yan and J Han ldquogSpan graph-based substructure patternminingrdquo in Proceedings of the 2002 IEEE InternationalConference on Data Mining ICDMrsquo02 pp 721ndash724 IEEEComputer Society Maebashi City Japan December 2002

[27] C Borgelt and M R Berthold ldquoMining molecular fragmentsfinding relevant substructures of moleculesrdquo in Proceedings ofthe 2002 IEEE International Conference on Data MiningICDMrsquo02 pp 51ndash58 IEEE Computer Society Maebashi CityJapan December 2002

[28] J Huan W Wang and J Prins ldquoEfficient mining of frequentsubgraphs in the presence of isomorphismrdquo in Proceedings ofthe 7ird IEEE International Conference on Data MiningICDMrsquo03 pp 549ndash552 IEEE Computer Society MelbourneFL USA November 2003

[29] S Nijssen ldquoA quickstart in frequent structure mining canmake a differencerdquo in Proceedings of the Tenth ACM SIGKDDInternational Conference on Knowledge Discovery and DataMining KDDrsquo04 pp 647ndash652 ACM Seattle WA USAAugust 2004

[30] J Huan W Wang J Prins and J Yang ldquoSPIN miningmaximal frequent subgraphs from graph databasesrdquo in Pro-ceedings of the Tenth ACM SIGKDD International Conferenceon Knowledge Discovery and Data Mining Seattle WA USAACM August 2004

[31] M Worlein T Meinl I Fischer and M ampPhilippsen ldquoAquantitative comparison of the subgraph miners MoFagSpan FFSM and gastonrdquo in Proceedings of the 9th EuropeanConference on Principles and Practice of Knowledge Discoveryin Databases PKDDrsquo05 Springer Porto Portugal pp 392ndash403 2005

[32] X Yan H Cheng J Han and P ampYu ldquoMining significantgraph patterns by leap searchrdquo in Proceedings of the 2008ACM SIGMOD pp 433ndash444 Vancouver Canada June 2008

[33] L T +omas S R Valluri and K Karlapalem ldquoMarginmaximal frequent subgraph miningrdquo in Proceedings of theSixth International Conference on Data Mining (ICDMrsquo06)Hong Kong China December 2006

[34] V Ingalalli D Ienco and P Poncelet ldquoMining frequentsubgraphs in multigraphsrdquo Information Sciences vol 451-452pp 50ndash66 2018

[35] L Min Z C Wang J Liang and X R Yuan ldquoOD-wheelvisual design to explore OD patterns of a central regionrdquo inProceedings of the Visualization Symposium 2015 HangzhouChina April 2015

[36] B Jenny D M Stephen I Muehlenhaus and B E MarstonldquoDesign principles for origin-destination flowmapsrdquo Car-tography and Geographic Information Science vol 45 no 1pp 1ndash15 2016

[37] S Kim S Jeong I Woo Y Jang R Maciejewski andD S Ebert ldquoData flow analysis and visualization for spa-tiotemporal statistical data without trajectory informationrdquoIEEE Transactions on Visualization and Computer Graphicsvol 24 no 3 pp 1287ndash1300 2018

[38] J Wood J Dykes and A Slingsby ldquoVisualisation of originsdestinations and flows with OD mapsrdquo 7e CartographicJournal vol 47 no 2 pp 117ndash129 2010

[39] M Zillinger ldquoTourist routes a time-geographical approach onGerman car-tourists in Swedenrdquo Tourism Geographies vol 9no 1 pp 64ndash83 2007

[40] S Page and J Connell ldquoExploring the spatial patterns of car-based tourist travel in loch lomond and trossachs nationalpark Scotlandrdquo Tourism Management vol 29 no 3pp 561ndash580 2008

[41] A Inokuchi T Washio and H ampMotoda ldquoAn apriori-basedalgorithm for mining frequent substructures from graphdatardquo in Proceedings of the 4th European Conference onPrinciples of Data Mining and Knowledge DiscoveryPKDDrsquo00 pp 13ndash23 Lyon France September 2000

[42] M Kuramochi and G Karypis ldquoFrequent subgraph discov-eryrdquo in Proceedings of the 2001 IEEE International Conferenceon Data Mining ICDMrsquo01 pp 313ndash320 IEEE ComputerSociety San Jose CA USA August 2001

[43] M Oppermann ldquoA model of travel itinerariesrdquo Journal ofTravel Research vol 33 no 4 pp 57ndash61 1995

[44] N Shoval and A Raveh ldquoCategorization of tourist attractionsand the modeling of tourist cities based on the co-plotmethod of multivariate analysisrdquo Tourism Managementvol 25 no 6 pp 741ndash750 2004

[45] G Lau and B McKercher ldquoUnderstanding tourist movementpatterns in a destination a gis approachrdquo Tourism andHospitality Research vol 9 no 1 pp 39ndash49 2007

[46] B McKercher and G Lau ldquoMovement patterns of touristswithin a destinationrdquo Tourism Geographies vol 10 no 3pp 355ndash374 2008

Journal of Advanced Transportation 15

Page 13: DiscoveringtheGraph-BasedFlowPatternsofCarTouristsUsing ...downloads.hindawi.com/journals/jat/2020/4795830.pdfResearchArticle DiscoveringtheGraph-BasedFlowPatternsofCarTouristsUsing

(3) Although the patterns depicted on the map lookcomplicated and messy after the patterns are con-verted to rules they become clear In pattern mapsonly Figures 8(b) 9(a) and 9(d) show a clear uni-directionality +e rest is complex and is difficult tocompare As we can see from Figure 8(a) theintracity tourists passing point A1 could be dividedinto two groups one group flowed to parking lot P1and then leaved the scenic area from P1 +e othergroup flowed to parking lot P2 Two groups havesimultaneity in tourist routes So we could convertthe flow patterns to rules and use ldquoandrdquo to illustratethe simultaneity of different routes in the samepattern (as shown in Tables 3ndash5) By this way allpatterns can be applied to traffic control system forregional tourism

(4) Large primary attractions are more attractive thansmaller secondary attractions Looking from thearrow on the pattern maps tourists always visit areaC1 first and then select area C2 +e main reason isthat area C1 has abundant tourism resources anddiverse tourism activities For example there arecultural attractions (eg Dapeng Ancient City andDongshan Temple) sandy beach entertainmentsand large parking lots for tourists in area C1 +esefactors are also often considered in the evaluation ofthe importance of attractions in tourism network

(5) +e tourists tended to park their cars in an easy-to-access place even if the visited attractions arechanged as shown in Figures 8(a)ndash8(d) 9(b) and9(e) Again here we take area C1 as an example +edistance between the Dapeng Ancient City andDongshan Temple is about 1 km On the patternmaps we can see that there are two-way arrowspointing to the two parking lots P1 and P2 +isindicates that the car tourists tended to park theircars to the nearest parking lot so that they could pickthem up when the tour destination is changed

72 Discussions Tourist flow is the key to traffic man-agement in tourism destinations and it affects the de-velopment of tourism on an island and the experience oftourists +e recent development of transport technolo-gies has shown that traffic flow data will be increasinglycollected and it will be available for data analysis+erefore advanced data analytics should be used tointerpret and depict the complex movements of cartourists +e proposed approach in this study wasintended to find (i) the statistical summaries of the spatial-temporal characteristics of car tourists in the researcharea helping to discover patterns from the mass carlicense plate data (ii) the flow patterns of intercity andintracity tourists helping to illustrate the different pref-erences of the two types of car tourists and (iii) the mostfrequent patterns of all tourists helping to identify the lawof tourist movement and make efficient policy for themanagement of tourist traffic flow +e presented ap-proach enriched the analytical methodology of tourist

traffic flow and suggested a shift from the conventionaland complicated paper or computer interview-basedmethod to a dynamic flow graph-based method Fur-thermore we have shown how transportation data pro-vides hard-to-obtain insights and quantitative results fortourists

In order to illustrate the law of spatial movement oftourists related studies have proposed a variety of macroflow patterns [43ndash45] For example in 2008 McKercher [46]proposed 11 prominent route styles in urban destinations+e macropatterns retain only the main components andsimplify the details and are often used in tourism man-agement to guide destination development Compared withmodeling the flow patterns of tourists at the macrolevelmodeling tourist flow at the microscale is more complicatedLew and McKercher [17] noted that it is a challenge tobalance model effectiveness and usability +e reason is thatsimple patterns may not provide enough details for use andcomplex patterns may be difficult to interpret and apply Inthis study we used the video devices installed at key nodes ofroad network to collect tourist traffic flows and used thefrequent subgraph mining algorithm to discover flow pat-terns at the microscale +e extracted patterns can beconverted into rules and applied to the traffic control systemfor the management of regional tourists +e difficulty offinding and applying patterns at microscale could beovercome by this way

During the peak period of tourism the number of cartourists in the scenic area increases sharply It is easy toresult in road congestion and uneven distribution oftourists between attractions With the use of traffic flowdata the daily monthly and seasonal characteristics oftourists can be analyzed the future tourist flow can bepredicted and the flow patterns can be obtained by datamining methods +us the traffic management departmentcan effectively control the tourist traffic flow and thetourism department can develop attractive tourism prod-ucts to achieve a spatial balance of the distribution oftourists reduce traffic congestion and harmful gas emis-sions and weaken the impact of tourism environment andhuman body

Inside a scenic area attractions form a complex net-work of tourist flow due to the frequent spatial interactionEach attraction is both the origin and destination of cartourists Due to the differences in attractiveness degree ofdevelopment and convenience of transportation touristshave shown special preferences when choosing attractionsand trip routes Accordingly effective identification ofthese preferences will be beneficial to the development oftourist market and will also helpful for tour planners tounderstand how tourists see the spatial connection ofmultiple attractions However traditional manual surveysare time-consuming and laborious and the amount of dataobtained is small It is difficult to reveal tourist preferencesby this way Frequent subgraph mining algorithms providea desirable method for identifying the spatial preferences oftourists +is kind of method could perform well under thesupport of a large amount of movement data betweenattractions

Journal of Advanced Transportation 13

+is study has several limitations Firstly due to lack ofpower supply facilities it was unable to collect the traffic datafor each day When applying the model to the actual controlof tourist traffic flow it is necessary to further co-operatewith the traffic management department to obtain com-prehensive traffic flow data Secondly this study focused onthe mining of flow patterns the influencing factors behindthe identified patterns were not further analyzed As men-tioned by Lew andMcKercher [17] factors related to touristsand destinations can affect how tourists move or travel in adestination +ese factors include family composition in-come and valid information obtained before travelling It isdifficult to obtain these factors by relying only on the trafficflow data collected in the traffic monitoring system+erefore in future work field investigation will benecessary

Data Availability

As the data also form part of an ongoing study the raw dataneeded to reproduce these findings cannot be shared at thistime

Conflicts of Interest

+e authors declare no conflicts of interest

Acknowledgments

+e authors gratefully acknowledge the support of this re-search from the National Science Foundation of China(41701167) and the Basic Research Project of Shenzhen City(JCYJ20170307164104491 and JCYJ20190812171419161)

References

[1] D W Eby and L J Molnar ldquoImportance of scenic byways inroute choice a survey of driving tourists in the United StatesrdquoTransportation Research Part A (Policy and Practice) vol 36no 2 pp 95ndash106 2003

[2] B Wang Z Yang F Han and H Shi ldquoCar tourism inxinjiang the mediation effect of perceived value and touristsatisfaction on the relationship between destination imageand loyaltyrdquo Sustainability vol 9 no 1 pp 1ndash16 2016

[3] X M Zhang and Q Y Pen ldquoA study on tourism informationand self-driving tour trafficrdquo in Proceedings of the First Inter-national Conference on Transportation Engineering SouthwestJiaotong University Chengdu China July 2007

[4] China National Tourism Administration 7e Yearbook ofChina Tourism Statistics 2016 China Travel amp Tourism PressBeijing China 2017

[5] M A Mohamad Toha and H N Ismail ldquoA heritage tourismand tourist flow pattern a perspective on traditional versusmodern technologies in tracking the touristsrdquo InternationalJournal of Built Environment and Sustainability vol 2 no 2pp 85ndash92 2015

[6] K Siła-Nowicka and A S Fotheringham ldquoCalibrating spatialinteraction models from gps tracking data an example ofretail behaviourrdquo Computers Environment and Urban Sys-tems vol 74 pp 136ndash150 2019

[7] F Mao M Ji and T Liu ldquoMining spatiotemporal patterns ofurban dwellers from taxi trajectory datardquo Frontiers of EarthScience vol 10 no 2 pp 205ndash221 2016

[8] L J Zheng D Xia X Zhao and W L Liu ldquoMining tripattractive areas using large-scale taxi trajectory datardquo inProceedings of the 2017 IEEE International Symposium onParallel and Distributed Processing with Applications and 2017IEEE International Conference on Ubiquitous Computing andCommunications pp 1217ndash1222 ISPAIUCC GuangzhouChina December 2017

[9] I Rami and M O Shafiq ldquoDetecting taxi movements usingrandom swap clustering and sequential pattern miningrdquoJournal of Big Data vol 6 no 39 pp 1ndash26 2019

[10] G Kautsar and S Akbar ldquoTrajectory pattern mining using se-quential pattern mining and k-means for predicting future lo-cationrdquo Journal of Physics Conference Series vol 801 no 1 2017

[11] D Guo X Zhu H Jin P Gao and C Andris ldquoDiscoveringspatial patterns in origin-destination mobility datardquo Trans-actions in GIS vol 16 no 3 pp 411ndash429 2012

[12] J D Mazimpaka and S Timpf ldquoA visual and computationalanalysis approach for exploring significant locations and timeperiods along a bus routerdquo in Proceedings of the 9th ACMSIGSPATIAL International Workshop on ComputationalTransportation Science pp 43ndash48 San Francisco CA USAOctober 2016

[13] H Faroqi M Mesbah and J Kim ldquoSpatial-temporal simi-larity correlation between public transit passengers usingsmart card datardquo Journal of Advanced Transportationvol 2017 Article ID 1318945 14 pages 2017

[14] Z Wang Y Hu P Zhu Y Qin and L Jia ldquoRing aggregationpattern of metro passenger trips a study using smart carddatardquo Physica A Statistical Mechanics and Its Applicationsvol 491 pp 471ndash479 2018

[15] X Zhao X Lu Y Liu J Lin and J An ldquoTourist movementpatterns understanding from the perspective of travel partysize using mobile tracking data a case study of Xirsquoan ChinardquoTourism Management vol 69 pp 368ndash383 2018

[16] M Versichele L De Groote M Claeys Bouuaert T NeutensI Moerman and N Van De Weghe ldquoPattern mining intourist attraction visits through association rule learning onbluetooth tracking data a case study of ghent BelgiumrdquoTourism Management vol 44 pp 67ndash81 2014

[17] A Lew and B Mckercher ldquoModeling tourist movementsrdquoAnnals of Tourism Research vol 33 no 2 pp 403ndash423 2006

[18] A Birenboim S Anton-Clave A P Russo and N ShovalldquoTemporal activity patterns of theme park visitorsrdquo TourismGeographies vol 15 no 4 pp 601ndash619 2013

[19] H J Yun and M H Park ldquoTime-space movement of festivalvisitors in rural areas using a smart phone applicationrdquo AsiaPacific Journal of Tourism Research vol 20 no 11pp 1246ndash1265 2015

[20] S Kang ldquoAssociations between space-time constraints andspatial patterns of travelsrdquo Annals of Tourism Researchvol 61 pp 127ndash141 2016

[21] C Comito D Falcone and D Talia ldquoMining human mobilitypatterns from social geo-tagged datardquo Pervasive and MobileComputing vol 33 pp 91ndash107 2016

[22] Y T Zheng Y Li Z J Zha and T S Chua ldquoMining travelpatterns fromGPS-tagged photosrdquo in Lecture Notes in ComputerScience K T LeeWH Tsai H YM Liao T Chen JWHsiehand C C Tseng Eds Springer Berlin Germany 2011

[23] K M Borgwardt H P Kriegel and P WackersreutherldquoPattern mining in frequent dynamic subgraphsrdquo in Pro-ceedings of the 6th IEEE International Conference on Data

14 Journal of Advanced Transportation

Mining (ICDM 2006) Hong Kong China IEEE December2006

[24] D J Cook and L B HolderMining Graph Data Wiley PressHoboken NJ USA 2015

[25] C Jiang F Coenen andM Zito ldquoFrequent sub-graph miningon edge weighted graphsrdquo in Lecture Notes in ComputerScience T Bach Pedersen M K Mohania and A M TjoaEds Springer Berlin Germany pp 77ndash88 2010

[26] X Yan and J Han ldquogSpan graph-based substructure patternminingrdquo in Proceedings of the 2002 IEEE InternationalConference on Data Mining ICDMrsquo02 pp 721ndash724 IEEEComputer Society Maebashi City Japan December 2002

[27] C Borgelt and M R Berthold ldquoMining molecular fragmentsfinding relevant substructures of moleculesrdquo in Proceedings ofthe 2002 IEEE International Conference on Data MiningICDMrsquo02 pp 51ndash58 IEEE Computer Society Maebashi CityJapan December 2002

[28] J Huan W Wang and J Prins ldquoEfficient mining of frequentsubgraphs in the presence of isomorphismrdquo in Proceedings ofthe 7ird IEEE International Conference on Data MiningICDMrsquo03 pp 549ndash552 IEEE Computer Society MelbourneFL USA November 2003

[29] S Nijssen ldquoA quickstart in frequent structure mining canmake a differencerdquo in Proceedings of the Tenth ACM SIGKDDInternational Conference on Knowledge Discovery and DataMining KDDrsquo04 pp 647ndash652 ACM Seattle WA USAAugust 2004

[30] J Huan W Wang J Prins and J Yang ldquoSPIN miningmaximal frequent subgraphs from graph databasesrdquo in Pro-ceedings of the Tenth ACM SIGKDD International Conferenceon Knowledge Discovery and Data Mining Seattle WA USAACM August 2004

[31] M Worlein T Meinl I Fischer and M ampPhilippsen ldquoAquantitative comparison of the subgraph miners MoFagSpan FFSM and gastonrdquo in Proceedings of the 9th EuropeanConference on Principles and Practice of Knowledge Discoveryin Databases PKDDrsquo05 Springer Porto Portugal pp 392ndash403 2005

[32] X Yan H Cheng J Han and P ampYu ldquoMining significantgraph patterns by leap searchrdquo in Proceedings of the 2008ACM SIGMOD pp 433ndash444 Vancouver Canada June 2008

[33] L T +omas S R Valluri and K Karlapalem ldquoMarginmaximal frequent subgraph miningrdquo in Proceedings of theSixth International Conference on Data Mining (ICDMrsquo06)Hong Kong China December 2006

[34] V Ingalalli D Ienco and P Poncelet ldquoMining frequentsubgraphs in multigraphsrdquo Information Sciences vol 451-452pp 50ndash66 2018

[35] L Min Z C Wang J Liang and X R Yuan ldquoOD-wheelvisual design to explore OD patterns of a central regionrdquo inProceedings of the Visualization Symposium 2015 HangzhouChina April 2015

[36] B Jenny D M Stephen I Muehlenhaus and B E MarstonldquoDesign principles for origin-destination flowmapsrdquo Car-tography and Geographic Information Science vol 45 no 1pp 1ndash15 2016

[37] S Kim S Jeong I Woo Y Jang R Maciejewski andD S Ebert ldquoData flow analysis and visualization for spa-tiotemporal statistical data without trajectory informationrdquoIEEE Transactions on Visualization and Computer Graphicsvol 24 no 3 pp 1287ndash1300 2018

[38] J Wood J Dykes and A Slingsby ldquoVisualisation of originsdestinations and flows with OD mapsrdquo 7e CartographicJournal vol 47 no 2 pp 117ndash129 2010

[39] M Zillinger ldquoTourist routes a time-geographical approach onGerman car-tourists in Swedenrdquo Tourism Geographies vol 9no 1 pp 64ndash83 2007

[40] S Page and J Connell ldquoExploring the spatial patterns of car-based tourist travel in loch lomond and trossachs nationalpark Scotlandrdquo Tourism Management vol 29 no 3pp 561ndash580 2008

[41] A Inokuchi T Washio and H ampMotoda ldquoAn apriori-basedalgorithm for mining frequent substructures from graphdatardquo in Proceedings of the 4th European Conference onPrinciples of Data Mining and Knowledge DiscoveryPKDDrsquo00 pp 13ndash23 Lyon France September 2000

[42] M Kuramochi and G Karypis ldquoFrequent subgraph discov-eryrdquo in Proceedings of the 2001 IEEE International Conferenceon Data Mining ICDMrsquo01 pp 313ndash320 IEEE ComputerSociety San Jose CA USA August 2001

[43] M Oppermann ldquoA model of travel itinerariesrdquo Journal ofTravel Research vol 33 no 4 pp 57ndash61 1995

[44] N Shoval and A Raveh ldquoCategorization of tourist attractionsand the modeling of tourist cities based on the co-plotmethod of multivariate analysisrdquo Tourism Managementvol 25 no 6 pp 741ndash750 2004

[45] G Lau and B McKercher ldquoUnderstanding tourist movementpatterns in a destination a gis approachrdquo Tourism andHospitality Research vol 9 no 1 pp 39ndash49 2007

[46] B McKercher and G Lau ldquoMovement patterns of touristswithin a destinationrdquo Tourism Geographies vol 10 no 3pp 355ndash374 2008

Journal of Advanced Transportation 15

Page 14: DiscoveringtheGraph-BasedFlowPatternsofCarTouristsUsing ...downloads.hindawi.com/journals/jat/2020/4795830.pdfResearchArticle DiscoveringtheGraph-BasedFlowPatternsofCarTouristsUsing

+is study has several limitations Firstly due to lack ofpower supply facilities it was unable to collect the traffic datafor each day When applying the model to the actual controlof tourist traffic flow it is necessary to further co-operatewith the traffic management department to obtain com-prehensive traffic flow data Secondly this study focused onthe mining of flow patterns the influencing factors behindthe identified patterns were not further analyzed As men-tioned by Lew andMcKercher [17] factors related to touristsand destinations can affect how tourists move or travel in adestination +ese factors include family composition in-come and valid information obtained before travelling It isdifficult to obtain these factors by relying only on the trafficflow data collected in the traffic monitoring system+erefore in future work field investigation will benecessary

Data Availability

As the data also form part of an ongoing study the raw dataneeded to reproduce these findings cannot be shared at thistime

Conflicts of Interest

+e authors declare no conflicts of interest

Acknowledgments

+e authors gratefully acknowledge the support of this re-search from the National Science Foundation of China(41701167) and the Basic Research Project of Shenzhen City(JCYJ20170307164104491 and JCYJ20190812171419161)

References

[1] D W Eby and L J Molnar ldquoImportance of scenic byways inroute choice a survey of driving tourists in the United StatesrdquoTransportation Research Part A (Policy and Practice) vol 36no 2 pp 95ndash106 2003

[2] B Wang Z Yang F Han and H Shi ldquoCar tourism inxinjiang the mediation effect of perceived value and touristsatisfaction on the relationship between destination imageand loyaltyrdquo Sustainability vol 9 no 1 pp 1ndash16 2016

[3] X M Zhang and Q Y Pen ldquoA study on tourism informationand self-driving tour trafficrdquo in Proceedings of the First Inter-national Conference on Transportation Engineering SouthwestJiaotong University Chengdu China July 2007

[4] China National Tourism Administration 7e Yearbook ofChina Tourism Statistics 2016 China Travel amp Tourism PressBeijing China 2017

[5] M A Mohamad Toha and H N Ismail ldquoA heritage tourismand tourist flow pattern a perspective on traditional versusmodern technologies in tracking the touristsrdquo InternationalJournal of Built Environment and Sustainability vol 2 no 2pp 85ndash92 2015

[6] K Siła-Nowicka and A S Fotheringham ldquoCalibrating spatialinteraction models from gps tracking data an example ofretail behaviourrdquo Computers Environment and Urban Sys-tems vol 74 pp 136ndash150 2019

[7] F Mao M Ji and T Liu ldquoMining spatiotemporal patterns ofurban dwellers from taxi trajectory datardquo Frontiers of EarthScience vol 10 no 2 pp 205ndash221 2016

[8] L J Zheng D Xia X Zhao and W L Liu ldquoMining tripattractive areas using large-scale taxi trajectory datardquo inProceedings of the 2017 IEEE International Symposium onParallel and Distributed Processing with Applications and 2017IEEE International Conference on Ubiquitous Computing andCommunications pp 1217ndash1222 ISPAIUCC GuangzhouChina December 2017

[9] I Rami and M O Shafiq ldquoDetecting taxi movements usingrandom swap clustering and sequential pattern miningrdquoJournal of Big Data vol 6 no 39 pp 1ndash26 2019

[10] G Kautsar and S Akbar ldquoTrajectory pattern mining using se-quential pattern mining and k-means for predicting future lo-cationrdquo Journal of Physics Conference Series vol 801 no 1 2017

[11] D Guo X Zhu H Jin P Gao and C Andris ldquoDiscoveringspatial patterns in origin-destination mobility datardquo Trans-actions in GIS vol 16 no 3 pp 411ndash429 2012

[12] J D Mazimpaka and S Timpf ldquoA visual and computationalanalysis approach for exploring significant locations and timeperiods along a bus routerdquo in Proceedings of the 9th ACMSIGSPATIAL International Workshop on ComputationalTransportation Science pp 43ndash48 San Francisco CA USAOctober 2016

[13] H Faroqi M Mesbah and J Kim ldquoSpatial-temporal simi-larity correlation between public transit passengers usingsmart card datardquo Journal of Advanced Transportationvol 2017 Article ID 1318945 14 pages 2017

[14] Z Wang Y Hu P Zhu Y Qin and L Jia ldquoRing aggregationpattern of metro passenger trips a study using smart carddatardquo Physica A Statistical Mechanics and Its Applicationsvol 491 pp 471ndash479 2018

[15] X Zhao X Lu Y Liu J Lin and J An ldquoTourist movementpatterns understanding from the perspective of travel partysize using mobile tracking data a case study of Xirsquoan ChinardquoTourism Management vol 69 pp 368ndash383 2018

[16] M Versichele L De Groote M Claeys Bouuaert T NeutensI Moerman and N Van De Weghe ldquoPattern mining intourist attraction visits through association rule learning onbluetooth tracking data a case study of ghent BelgiumrdquoTourism Management vol 44 pp 67ndash81 2014

[17] A Lew and B Mckercher ldquoModeling tourist movementsrdquoAnnals of Tourism Research vol 33 no 2 pp 403ndash423 2006

[18] A Birenboim S Anton-Clave A P Russo and N ShovalldquoTemporal activity patterns of theme park visitorsrdquo TourismGeographies vol 15 no 4 pp 601ndash619 2013

[19] H J Yun and M H Park ldquoTime-space movement of festivalvisitors in rural areas using a smart phone applicationrdquo AsiaPacific Journal of Tourism Research vol 20 no 11pp 1246ndash1265 2015

[20] S Kang ldquoAssociations between space-time constraints andspatial patterns of travelsrdquo Annals of Tourism Researchvol 61 pp 127ndash141 2016

[21] C Comito D Falcone and D Talia ldquoMining human mobilitypatterns from social geo-tagged datardquo Pervasive and MobileComputing vol 33 pp 91ndash107 2016

[22] Y T Zheng Y Li Z J Zha and T S Chua ldquoMining travelpatterns fromGPS-tagged photosrdquo in Lecture Notes in ComputerScience K T LeeWH Tsai H YM Liao T Chen JWHsiehand C C Tseng Eds Springer Berlin Germany 2011

[23] K M Borgwardt H P Kriegel and P WackersreutherldquoPattern mining in frequent dynamic subgraphsrdquo in Pro-ceedings of the 6th IEEE International Conference on Data

14 Journal of Advanced Transportation

Mining (ICDM 2006) Hong Kong China IEEE December2006

[24] D J Cook and L B HolderMining Graph Data Wiley PressHoboken NJ USA 2015

[25] C Jiang F Coenen andM Zito ldquoFrequent sub-graph miningon edge weighted graphsrdquo in Lecture Notes in ComputerScience T Bach Pedersen M K Mohania and A M TjoaEds Springer Berlin Germany pp 77ndash88 2010

[26] X Yan and J Han ldquogSpan graph-based substructure patternminingrdquo in Proceedings of the 2002 IEEE InternationalConference on Data Mining ICDMrsquo02 pp 721ndash724 IEEEComputer Society Maebashi City Japan December 2002

[27] C Borgelt and M R Berthold ldquoMining molecular fragmentsfinding relevant substructures of moleculesrdquo in Proceedings ofthe 2002 IEEE International Conference on Data MiningICDMrsquo02 pp 51ndash58 IEEE Computer Society Maebashi CityJapan December 2002

[28] J Huan W Wang and J Prins ldquoEfficient mining of frequentsubgraphs in the presence of isomorphismrdquo in Proceedings ofthe 7ird IEEE International Conference on Data MiningICDMrsquo03 pp 549ndash552 IEEE Computer Society MelbourneFL USA November 2003

[29] S Nijssen ldquoA quickstart in frequent structure mining canmake a differencerdquo in Proceedings of the Tenth ACM SIGKDDInternational Conference on Knowledge Discovery and DataMining KDDrsquo04 pp 647ndash652 ACM Seattle WA USAAugust 2004

[30] J Huan W Wang J Prins and J Yang ldquoSPIN miningmaximal frequent subgraphs from graph databasesrdquo in Pro-ceedings of the Tenth ACM SIGKDD International Conferenceon Knowledge Discovery and Data Mining Seattle WA USAACM August 2004

[31] M Worlein T Meinl I Fischer and M ampPhilippsen ldquoAquantitative comparison of the subgraph miners MoFagSpan FFSM and gastonrdquo in Proceedings of the 9th EuropeanConference on Principles and Practice of Knowledge Discoveryin Databases PKDDrsquo05 Springer Porto Portugal pp 392ndash403 2005

[32] X Yan H Cheng J Han and P ampYu ldquoMining significantgraph patterns by leap searchrdquo in Proceedings of the 2008ACM SIGMOD pp 433ndash444 Vancouver Canada June 2008

[33] L T +omas S R Valluri and K Karlapalem ldquoMarginmaximal frequent subgraph miningrdquo in Proceedings of theSixth International Conference on Data Mining (ICDMrsquo06)Hong Kong China December 2006

[34] V Ingalalli D Ienco and P Poncelet ldquoMining frequentsubgraphs in multigraphsrdquo Information Sciences vol 451-452pp 50ndash66 2018

[35] L Min Z C Wang J Liang and X R Yuan ldquoOD-wheelvisual design to explore OD patterns of a central regionrdquo inProceedings of the Visualization Symposium 2015 HangzhouChina April 2015

[36] B Jenny D M Stephen I Muehlenhaus and B E MarstonldquoDesign principles for origin-destination flowmapsrdquo Car-tography and Geographic Information Science vol 45 no 1pp 1ndash15 2016

[37] S Kim S Jeong I Woo Y Jang R Maciejewski andD S Ebert ldquoData flow analysis and visualization for spa-tiotemporal statistical data without trajectory informationrdquoIEEE Transactions on Visualization and Computer Graphicsvol 24 no 3 pp 1287ndash1300 2018

[38] J Wood J Dykes and A Slingsby ldquoVisualisation of originsdestinations and flows with OD mapsrdquo 7e CartographicJournal vol 47 no 2 pp 117ndash129 2010

[39] M Zillinger ldquoTourist routes a time-geographical approach onGerman car-tourists in Swedenrdquo Tourism Geographies vol 9no 1 pp 64ndash83 2007

[40] S Page and J Connell ldquoExploring the spatial patterns of car-based tourist travel in loch lomond and trossachs nationalpark Scotlandrdquo Tourism Management vol 29 no 3pp 561ndash580 2008

[41] A Inokuchi T Washio and H ampMotoda ldquoAn apriori-basedalgorithm for mining frequent substructures from graphdatardquo in Proceedings of the 4th European Conference onPrinciples of Data Mining and Knowledge DiscoveryPKDDrsquo00 pp 13ndash23 Lyon France September 2000

[42] M Kuramochi and G Karypis ldquoFrequent subgraph discov-eryrdquo in Proceedings of the 2001 IEEE International Conferenceon Data Mining ICDMrsquo01 pp 313ndash320 IEEE ComputerSociety San Jose CA USA August 2001

[43] M Oppermann ldquoA model of travel itinerariesrdquo Journal ofTravel Research vol 33 no 4 pp 57ndash61 1995

[44] N Shoval and A Raveh ldquoCategorization of tourist attractionsand the modeling of tourist cities based on the co-plotmethod of multivariate analysisrdquo Tourism Managementvol 25 no 6 pp 741ndash750 2004

[45] G Lau and B McKercher ldquoUnderstanding tourist movementpatterns in a destination a gis approachrdquo Tourism andHospitality Research vol 9 no 1 pp 39ndash49 2007

[46] B McKercher and G Lau ldquoMovement patterns of touristswithin a destinationrdquo Tourism Geographies vol 10 no 3pp 355ndash374 2008

Journal of Advanced Transportation 15

Page 15: DiscoveringtheGraph-BasedFlowPatternsofCarTouristsUsing ...downloads.hindawi.com/journals/jat/2020/4795830.pdfResearchArticle DiscoveringtheGraph-BasedFlowPatternsofCarTouristsUsing

Mining (ICDM 2006) Hong Kong China IEEE December2006

[24] D J Cook and L B HolderMining Graph Data Wiley PressHoboken NJ USA 2015

[25] C Jiang F Coenen andM Zito ldquoFrequent sub-graph miningon edge weighted graphsrdquo in Lecture Notes in ComputerScience T Bach Pedersen M K Mohania and A M TjoaEds Springer Berlin Germany pp 77ndash88 2010

[26] X Yan and J Han ldquogSpan graph-based substructure patternminingrdquo in Proceedings of the 2002 IEEE InternationalConference on Data Mining ICDMrsquo02 pp 721ndash724 IEEEComputer Society Maebashi City Japan December 2002

[27] C Borgelt and M R Berthold ldquoMining molecular fragmentsfinding relevant substructures of moleculesrdquo in Proceedings ofthe 2002 IEEE International Conference on Data MiningICDMrsquo02 pp 51ndash58 IEEE Computer Society Maebashi CityJapan December 2002

[28] J Huan W Wang and J Prins ldquoEfficient mining of frequentsubgraphs in the presence of isomorphismrdquo in Proceedings ofthe 7ird IEEE International Conference on Data MiningICDMrsquo03 pp 549ndash552 IEEE Computer Society MelbourneFL USA November 2003

[29] S Nijssen ldquoA quickstart in frequent structure mining canmake a differencerdquo in Proceedings of the Tenth ACM SIGKDDInternational Conference on Knowledge Discovery and DataMining KDDrsquo04 pp 647ndash652 ACM Seattle WA USAAugust 2004

[30] J Huan W Wang J Prins and J Yang ldquoSPIN miningmaximal frequent subgraphs from graph databasesrdquo in Pro-ceedings of the Tenth ACM SIGKDD International Conferenceon Knowledge Discovery and Data Mining Seattle WA USAACM August 2004

[31] M Worlein T Meinl I Fischer and M ampPhilippsen ldquoAquantitative comparison of the subgraph miners MoFagSpan FFSM and gastonrdquo in Proceedings of the 9th EuropeanConference on Principles and Practice of Knowledge Discoveryin Databases PKDDrsquo05 Springer Porto Portugal pp 392ndash403 2005

[32] X Yan H Cheng J Han and P ampYu ldquoMining significantgraph patterns by leap searchrdquo in Proceedings of the 2008ACM SIGMOD pp 433ndash444 Vancouver Canada June 2008

[33] L T +omas S R Valluri and K Karlapalem ldquoMarginmaximal frequent subgraph miningrdquo in Proceedings of theSixth International Conference on Data Mining (ICDMrsquo06)Hong Kong China December 2006

[34] V Ingalalli D Ienco and P Poncelet ldquoMining frequentsubgraphs in multigraphsrdquo Information Sciences vol 451-452pp 50ndash66 2018

[35] L Min Z C Wang J Liang and X R Yuan ldquoOD-wheelvisual design to explore OD patterns of a central regionrdquo inProceedings of the Visualization Symposium 2015 HangzhouChina April 2015

[36] B Jenny D M Stephen I Muehlenhaus and B E MarstonldquoDesign principles for origin-destination flowmapsrdquo Car-tography and Geographic Information Science vol 45 no 1pp 1ndash15 2016

[37] S Kim S Jeong I Woo Y Jang R Maciejewski andD S Ebert ldquoData flow analysis and visualization for spa-tiotemporal statistical data without trajectory informationrdquoIEEE Transactions on Visualization and Computer Graphicsvol 24 no 3 pp 1287ndash1300 2018

[38] J Wood J Dykes and A Slingsby ldquoVisualisation of originsdestinations and flows with OD mapsrdquo 7e CartographicJournal vol 47 no 2 pp 117ndash129 2010

[39] M Zillinger ldquoTourist routes a time-geographical approach onGerman car-tourists in Swedenrdquo Tourism Geographies vol 9no 1 pp 64ndash83 2007

[40] S Page and J Connell ldquoExploring the spatial patterns of car-based tourist travel in loch lomond and trossachs nationalpark Scotlandrdquo Tourism Management vol 29 no 3pp 561ndash580 2008

[41] A Inokuchi T Washio and H ampMotoda ldquoAn apriori-basedalgorithm for mining frequent substructures from graphdatardquo in Proceedings of the 4th European Conference onPrinciples of Data Mining and Knowledge DiscoveryPKDDrsquo00 pp 13ndash23 Lyon France September 2000

[42] M Kuramochi and G Karypis ldquoFrequent subgraph discov-eryrdquo in Proceedings of the 2001 IEEE International Conferenceon Data Mining ICDMrsquo01 pp 313ndash320 IEEE ComputerSociety San Jose CA USA August 2001

[43] M Oppermann ldquoA model of travel itinerariesrdquo Journal ofTravel Research vol 33 no 4 pp 57ndash61 1995

[44] N Shoval and A Raveh ldquoCategorization of tourist attractionsand the modeling of tourist cities based on the co-plotmethod of multivariate analysisrdquo Tourism Managementvol 25 no 6 pp 741ndash750 2004

[45] G Lau and B McKercher ldquoUnderstanding tourist movementpatterns in a destination a gis approachrdquo Tourism andHospitality Research vol 9 no 1 pp 39ndash49 2007

[46] B McKercher and G Lau ldquoMovement patterns of touristswithin a destinationrdquo Tourism Geographies vol 10 no 3pp 355ndash374 2008

Journal of Advanced Transportation 15