Catch Me If You Can: Detecting Pickpocket Suspects from Large-Scale Transit Records
Bowen Du, State Key Lab of Software Development Environment
Beihang University, Beijing, China
Chuanren Liu, Decision Sciences and MIS, LeBow College of Business
Drexel University, Philadelphia, US
Wenjun Zhou, Business Analytics & Statistics, Haslam College of Business
University of Tennessee, Knoxville, US
Zhenshan Hou, State Key Lab of Software Development Environment
Beihang University, Beijing, China
Management Science and Information Systems, Rutgers University
New Jersey, US, hxiong@rutgers.edu
ABSTRACT
Massive data collected by automated fare collection (AFC) systems provide opportunities for studying both personal traveling behaviors and collective mobility patterns in urban areas. Existing studies on AFC data have primarily focused on identifying passengers' movement patterns. In this paper, however, we leverage such data to identify thieves in public transit systems. Stopping pickpockets in public transit systems is critical for improving passenger satisfaction and public safety, yet it is challenging to tell thieves from regular passengers in practice. To this end, we developed a suspect detection and surveillance system that identifies pickpocket suspects based on their daily transit records. Specifically, we first extracted a number of features from each passenger's daily activities in the transit system. Then, we took a two-step approach that exploits the strengths of unsupervised outlier detection and supervised classification models to identify thieves, who exhibit abnormal traveling behaviors. Experimental results demonstrated the effectiveness of our method. We also developed a prototype system with a user-friendly interface for security personnel.
CCS Concepts
Information systems → Spatial-temporal systems; Data mining; Computing methodologies → Anomaly detection;
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from email@example.com.
KDD '16, August 13-17, 2016, San Francisco, CA, USA
© 2016 ACM. ISBN 978-1-4503-4232-2/16/08...$15.00
Keywords
Automated Fare Collection; Travel Behaviors; Mobility Patterns; Public Safety; Anomaly Detection.
1. INTRODUCTION
Passengers in public transit systems are a main target for pickpockets. In many cities, thefts happen frequently in public transit systems, because passengers tend to pay less attention to their belongings when they are in a rush or in a crowded environment. For example, during the first nine months of 2014, 350 pickpockets were caught in the subway system and 490 were caught on buses¹ in Beijing, China. Many other big cities in the world, such as Barcelona, Prague, Rome, and Paris, are also reported to suffer from the pickpocket problem², which has led to public safety concerns [23, 7]. Indeed, it is challenging to detect theft committed by cunning thieves who know how to escape without being detected. Despite the substantial cost in manpower and resources, many thieves are still at large. It is therefore critical to provide a smart surveillance and tracking tool for the security personnel of transit systems.
With rapid advances in information technology and data processing capacities, transactional records collected by automated fare collection (AFC) systems have become valuable for understanding passengers' mobility patterns and urban dynamics [6, 3, 20, 29, 18]. However, most of the existing studies focused on identifying regular, collective mobility patterns, such as commute flows and transit networks. Our study is the first to focus on identifying thieves based on AFC data. It is possible to detect thieves using AFC records because behavioral differences are embedded in the mobility footprints, which can help to separate suspects from regular passengers. Examples of behaviors that can make suspects stand out include traveling for an extended length of time, making unnecessary transfers, and/or wandering on certain routes while making random stops.
¹ http://www.bjgaj.gov.cn/web/detail zxftDetail 397242.html
Figure 1: Trajectories of passengers.
However, detecting thieves based on AFC records is not a simple outlier detection problem. Figure 1 shows the difference between a known thief and an outlier. We can see a number of trajectories between hot regions A and B. On careful examination, most passengers move from one region to another using a near-optimal configuration (e.g., shortest time/distance, or a minimal number of transfers). However, a passenger (a known suspect) who took the path A → C → D → B looks suspicious because there is no need to make transfers at C and D in order to reach B. Based on this observation, passengers who exhibit such abnormal behaviors will be selected for further examination. In contrast, another passenger who travels from E to B is an outlier, since few passengers take the same path. However, this passenger is likely just a regular passenger who originates from a less crowded area.
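The distinction between a detour and a merely rare trip can be illustrated with a minimal sketch. The toy network below is an assumption made only for illustration (the region labels follow Figure 1, but the edges are hypothetical), and the helper names `min_hops` and `extra_hops` are not from the paper. The sketch compares the hops a passenger actually took with the fewest hops needed for the same origin and destination.

```python
from collections import deque

# Hypothetical toy network: region labels follow Figure 1, but the edges
# are assumptions made only for this illustration.
NETWORK = {
    "A": ["B", "C"],
    "B": ["A", "D", "E"],
    "C": ["A", "D"],
    "D": ["C", "B"],
    "E": ["B"],
}


def min_hops(network, origin, destination):
    """Breadth-first search: fewest hops between two regions, or None."""
    queue = deque([(origin, 0)])
    seen = {origin}
    while queue:
        node, hops = queue.popleft()
        if node == destination:
            return hops
        for neighbor in network[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, hops + 1))
    return None


def extra_hops(network, path):
    """Hops taken beyond the minimum for the same origin and destination."""
    best = min_hops(network, path[0], path[-1])
    if best is None:
        raise ValueError("destination unreachable from origin")
    return (len(path) - 1) - best


print(extra_hops(NETWORK, ["A", "C", "D", "B"]))  # 2: two unnecessary stops
print(extra_hops(NETWORK, ["E", "B"]))            # 0: rare origin, direct trip
```

Under these assumptions, the path A → C → D → B scores two extra hops, while the trip from E to B scores zero even though few passengers share it. A frequency-based outlier score alone would not separate the two cases.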
In summary, to identify thieves from AFC records, we are faced with a number of inherent challenges.
The first challenge is how to identify useful features to distinguish thieves from regular passengers. These features should not only help us understand the behaviors of pickpockets, but also help us build a suspect detection and tracking system for supporting the security personnel.
Second, using regular outlier detection methods tends to result in a large number of false positives. In particular, not every trip made by a regular passenger looks normal. Regular commuters may occasionally make trips to visit friends or places of interest, and some of these trips may look suspicious because of how much they deviate from regular behaviors.
Third, a large number of AFC records are being collected from millions of passengers, only a tiny fraction of whom are pickpockets. Identifying such a small group of people in such a large-scale dataset is like looking for a needle in a haystack.
Finally, we also need to effectively transform the knowledge gained from model development into a decision support system, so that real-time, personalized deployment recommendations can be made to guide security personnel to perform their work more efficiently.
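The second and third challenges compound each other, which a short base-rate calculation can make concrete. The sketch below is illustrative only: the population size, number of thieves, recall, and false positive rate are hypothetical values chosen for the arithmetic, not figures from this paper, and `expected_precision` is a helper name introduced here.

```python
def expected_precision(n_passengers, n_thieves, recall, false_positive_rate):
    """Expected share of flagged passengers who are actual thieves."""
    true_positives = recall * n_thieves
    false_positives = false_positive_rate * (n_passengers - n_thieves)
    flagged = true_positives + false_positives
    return true_positives / flagged if flagged else 0.0


# Hypothetical values, not figures from the paper.
p = expected_precision(1_000_000, 100, recall=0.9, false_positive_rate=0.01)
print(f"{p:.2%}")  # roughly 0.89%
```

With these assumed numbers, a detector that misfires on only 1% of regular passengers still produces about 10,000 false alarms against 90 true hits, so fewer than one in a hundred flagged passengers would be a thief.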
To this end, in this paper, a comprehensive approach is taken to meet the above challenges. Specifically, we first construct a feature representation for profiling passengers. Furthermore, we establish a two-step framework to separate normal movement patterns from irregular behaviors, and eventually, distinguish thieves from regular passengers. Finally, we leverage real-world datasets from multiple sources for model training and validation, and implement a prototype system for end users.
[Figure 2 components: mobility pattern extraction, regular passenger filter, suspect detection.]
Figure 2: The framework.
Figure 2 shows the overall architecture of our framework. We first partition the city area into regions with functional categories. Then, the mobility characteristics of passengers are extracted from transit records and incident reports. Moreover, we build an individual mobility database to store the profile of each passenger. Next, we implement our framework by regular passenger filtering and suspect detection. The system is efficient and interactive, with both mobile and desktop clients. Finally, the user feedback information, such as newly confirmed thieves, will be entered as ground truth for future model training.
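The filter-then-detect flow described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the models used in this paper: the unsupervised step is stood in for by a median/MAD deviation score, the supervised step by a nearest-centroid classifier, and the two features (trip duration in minutes, number of transfers), the threshold, and all passenger records are hypothetical.

```python
from statistics import mean, median


def deviation_scores(rows):
    """Per-feature |x - median| / MAD for a list of equal-length vectors."""
    cols = list(zip(*rows))
    meds = [median(c) for c in cols]
    # Median absolute deviation; fall back to 1.0 if a feature has no spread.
    mads = [median([abs(x - m) for x in c]) or 1.0 for c, m in zip(cols, meds)]
    return [[abs(x - m) / d for x, m, d in zip(row, meds, mads)] for row in rows]


def filter_regular(profiles, threshold=3.0):
    """Step 1 (unsupervised stand-in): keep passengers with an unusual feature."""
    ids = list(profiles)
    scored = deviation_scores([profiles[i] for i in ids])
    return [i for i, s in zip(ids, scored) if max(s) > threshold]


def centroid(rows):
    """Feature-wise mean of a list of vectors."""
    return [mean(c) for c in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def detect_suspects(profiles, candidates, known_thieves, known_regulars):
    """Step 2 (supervised stand-in): nearest-centroid labelling of candidates."""
    thief_c = centroid([profiles[i] for i in known_thieves])
    regular_c = centroid([profiles[i] for i in known_regulars])
    return [i for i in candidates
            if sq_dist(profiles[i], thief_c) < sq_dist(profiles[i], regular_c)]


# Hypothetical profiles: [trip duration in minutes, number of transfers].
profiles = {
    "p1": [30, 1], "p2": [32, 1], "p3": [28, 0], "p4": [31, 1],
    "p5": [29, 1], "p6": [33, 0], "p7": [30, 1], "p8": [27, 1],
    "t1": [120, 5], "t2": [110, 6],  # confirmed thieves (ground truth)
    "x1": [115, 5],                  # unlabeled, long trip with many transfers
    "x2": [60, 0],                   # unlabeled, long but direct trip
}
known_thieves = ["t1", "t2"]
known_regulars = ["p1", "p2", "p3", "p4", "p5", "p6", "p7", "p8"]

candidates = filter_regular(profiles)
unlabeled = [c for c in candidates if c not in known_thieves]
suspects = detect_suspects(profiles, unlabeled, known_thieves, known_regulars)
print(candidates)  # ['t1', 't2', 'x1', 'x2']
print(suspects)    # ['x1']
```

In this toy run, the first step discards the eight ordinary commuters and passes four unusual profiles on. The second step then clears `x2`, whose trip is long but direct, and flags only `x1`. Newly confirmed thieves would be appended to `known_thieves`, mirroring the feedback loop described above. The features are left unscaled here for brevity, so duration dominates the distance; a real system would need to normalize them.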
The remainder of this paper is organized as follows. Section 2 provides an overview of the AFC datasets on which we performed the study. A detailed description of the features that we extract to characterize the mobility profiles of passengers is presented in Section 3. A two-step framework for the suspect identification system is proposed in Section 4. Experimental results are summarized in Section 5, and an overview of the deployed system is presented in Section 6. Finally, we summarize related work in Section 7, and draw conclusions in Section 8.
2. DATA DESCRIPTION
The data for our study have been collected from multiple sources. These include transit records, geographical information, and theft incident reports. In this section, we provide an overview of the data.
2.1 Transit Records
Our study is based on a large-scale transit records dataset collected from a public transit system that includes buses and subways. Passengers utilizing the transit service are charged