2001/11/27 IDS Lab Seminar 1
Adaptive Fraud Detection
Advisor: Dr. Hsu
Graduate: Yung-Chu Lin
Source: Fawcett, Tom and Foster Provost, Journal of Data Mining and Knowledge Discovery, Volume 1, Issue 3, September 1997, pp. 291-316
Outline
Motivation & objective
Definition
What’s cloning fraud
Detriment of cloning fraud
Strategies for dealing with cloning fraud
The need to be adaptive
Problems of learning algorithms
The detector constructor framework
How the framework works
Experiments
Conclusion
Motivation
Cellular fraud costs hundreds of millions of dollars per year
Existing methods are ad hoc
Objective
Presenting a framework/system for automatically generating detectors
Definition
A customer’s account = MIN + ESN
  MIN (Mobile Identification Number)
  ESN (Electronic Serial Number)
Bandit: a cloned phone user
Carrier: the cellular service provider
What’s Cloning Fraud
A customer’s MIN and ESN are programmed into a phone not belonging to the customer
A bandit can then make virtually unlimited calls
The attraction of free and untraceable communication makes cloning fraud popular
Detriment of Cloning Fraud
Service may be denied to legitimate customers
The crediting process is costly to the carrier and inconvenient to the customer
Fraud incurs land-line usage charges
  Cellular carriers must pay costs to other carriers
Strategies for Dealing with Cloning Fraud
Pre-call methods
Post-call methods
User profiling
Pre-call Methods
Requiring a PIN (Personal Identification Number)
  The PIN is entered before every call
RF fingerprinting
  Identifying cellular phones by their transmission characteristics
Authentication
  A reliable and secure private-key encryption method
Post-call Methods
Collision detection
  Analyzing call data for temporally overlapping calls
Velocity checking
  Analyzing the locations and times of consecutive calls
Dialed digit analysis
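The first two post-call checks can be sketched in a few lines. This is only an illustration under stated assumptions: the `Call` record, its field names, and the 150 km/h speed limit are hypothetical and do not come from the source paper.

```python
from dataclasses import dataclass
from math import radians, sin, cos, asin, sqrt


@dataclass
class Call:
    # Hypothetical call record; field names are assumptions for this sketch.
    start: float     # call start time, in seconds
    duration: float  # call length, in seconds
    lat: float       # latitude of the originating cell site
    lon: float       # longitude of the originating cell site


def overlaps(a: Call, b: Call) -> bool:
    """Collision detection: two calls on one account overlap in time."""
    return a.start < b.start + b.duration and b.start < a.start + a.duration


def distance_km(a: Call, b: Call) -> float:
    """Great-circle (haversine) distance between the two call origins."""
    lat1, lon1, lat2, lon2 = map(radians, (a.lat, a.lon, b.lat, b.lon))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def velocity_alarm(a: Call, b: Call, max_kmh: float = 150.0) -> bool:
    """Velocity check: consecutive calls imply an implausible travel speed."""
    first, second = sorted((a, b), key=lambda c: c.start)
    gap_hours = (second.start - (first.start + first.duration)) / 3600.0
    if gap_hours <= 0:
        return False  # overlapping calls are left to the collision check
    return distance_km(first, second) / gap_hours > max_kmh
```

Both checks need a legitimate call to compare against, so they say little about accounts that are rarely used.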
User Profiling
Analyzing calling behavior to detect usage anomalies suggestive of fraud
Works well with low-usage subscribers
The Need to Be Adaptive
The patterns of fraud are dynamic
  Bandits constantly change their strategies
The environment is dynamic in other ways
Problems of Learning Algorithms
Context
  The discovery of context-sensitive fraud: which call features are important?
  The profiling of individual accounts: how should profiles be created?
Granularity
  Aggregating customer behavior, smoothing out the variation
  Watching for coarser-grained changes that have better predictive power: when should alarms be issued?
The Detector Constructor Framework
How the Framework Works
Learning Fraud Rules
Rule generation
  Rules are generated locally for each account
  Using the RL program
Rule selection
  Most of the rules created by the generation step are specific to single accounts
  A rule that is found in (“covers”) many accounts is worth using
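The selection idea can be illustrated with a simple coverage count. This is a simplified stand-in, not the selection procedure of the source paper: the function name `select_rules`, the `min_accounts` cutoff, and the string form of the rules are all assumptions made for the sketch.

```python
from collections import Counter


def select_rules(rules_by_account, min_accounts=2):
    """Keep rules that were generated for at least `min_accounts` accounts.

    `rules_by_account` maps an account id to the rules learned locally for
    that account (hypothetical structure, assumed for this sketch).
    """
    coverage = Counter()
    for rules in rules_by_account.values():
        coverage.update(set(rules))  # count each rule once per account
    kept = [(rule, n) for rule, n in coverage.items() if n >= min_accounts]
    # Most widely covering rules first; ties broken alphabetically.
    kept.sort(key=lambda item: (-item[1], item[0]))
    return [rule for rule, _ in kept]


rules_by_account = {
    "acct-1": ["TIME-OF-DAY = NIGHT", "LOCATION = BRONX"],
    "acct-2": ["TIME-OF-DAY = NIGHT"],
    "acct-3": ["LOCATION = BRONX", "TIME-OF-DAY = NIGHT", "DURATION > 30"],
}
selected = select_rules(rules_by_account)
```

Here the rule seen in only one account (`DURATION > 30`) is dropped, while the two rules shared across accounts are kept.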
Constructing Profiling Monitors (1/3)
Sensitivity to different users is accomplished by profiling monitors
Profiling phase
  The monitor is applied to a segment of an account’s typical (non-fraud) usage
Use phase
  The monitor processes a single account-day at a time
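The two phases can be sketched with a small class. This is an illustrative sketch only: it assumes a threshold-style monitor that counts the calls matching one rule, and the class name `ThresholdMonitor`, the dict-based call records, and the field `time_of_day` are hypothetical.

```python
class ThresholdMonitor:
    """Flags an account-day whose rule-matching call count exceeds the
    highest count seen during the account's fraud-free profiling period."""

    def __init__(self, rule):
        self.rule = rule        # predicate over a single call record
        self.threshold = None   # set by profile()

    def _count(self, account_day):
        return sum(1 for call in account_day if self.rule(call))

    def profile(self, typical_days):
        """Profiling phase: measure the account's normal (non-fraud) usage."""
        self.threshold = max((self._count(day) for day in typical_days), default=0)

    def use(self, account_day):
        """Use phase: process one account-day and return 1 for an anomaly."""
        if self.threshold is None:
            raise RuntimeError("profile() must be called before use()")
        return 1 if self._count(account_day) > self.threshold else 0


# Hypothetical rule: the call was placed at night.
night_call = lambda call: call["time_of_day"] == "night"

monitor = ThresholdMonitor(night_call)
monitor.profile([
    [{"time_of_day": "day"}, {"time_of_day": "night"}],
    [{"time_of_day": "day"}],
])
alarm = monitor.use([{"time_of_day": "night"}, {"time_of_day": "night"}])
```

Because the threshold comes from each account's own history, the same rule can be normal for one customer and suspicious for another.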
Constructing Profiling Monitors (2/3)
Profiling monitors are created by the monitor constructor, which employs a set of templates
Constructing Profiling Monitors (3/3)
Combining Evidence from the Monitors
The outputs of the monitors are used as features for a standard learning program
Using a Linear Threshold Unit (LTU)
  In training, the monitors’ outputs are presented along with the desired output
  The evidence combination weights the monitor outputs and learns a threshold on the sum
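A minimal sketch of such a unit follows. It assumes a perceptron-style update, which is one common way to train an LTU and may differ from the training procedure used in the source paper; the function names and the toy data are hypothetical.

```python
def ltu_output(weights, threshold, monitor_outputs):
    """Return 1 (alarm) if the weighted sum of monitor outputs exceeds the threshold."""
    total = sum(w * x for w, x in zip(weights, monitor_outputs))
    return 1 if total > threshold else 0


def train_ltu(samples, labels, epochs=50, lr=1.0):
    """Learn one weight per monitor and a threshold on the weighted sum.

    Perceptron-style training (an assumption of this sketch): after each
    misclassified account-day, nudge the weights and the threshold.
    """
    weights = [0.0] * len(samples[0])
    threshold = 0.0
    for _ in range(epochs):
        for x, desired in zip(samples, labels):
            error = desired - ltu_output(weights, threshold, x)
            for i, xi in enumerate(x):
                weights[i] += lr * error * xi
            threshold -= lr * error
    return weights, threshold


# Toy data: outputs of three monitors per account-day; the desired output
# is 1 (fraud) when at least two monitors fire.
samples = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 1), (0, 1, 1), (1, 1, 0)]
labels = [0, 0, 0, 1, 1, 1]
weights, threshold = train_ltu(samples, labels)
predictions = [ltu_output(weights, threshold, x) for x in samples]
```

Monitors that are unreliable on the training data end up with small weights, so the combination step also shows which monitors carry useful evidence.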
The Data
Records of cellular calls placed over four months by users in the New York City area
Each call is described by 31 attributes
Seven more attributes are added (TIME-OF-DAY, etc.)
Each call is given a class label of legitimate or fraudulent
Data Selection
Rule learning: 879 accounts, 500,000 calls
Profiling, training, testing: 3600 accounts
  30 days (fraud-free): profiling
  Remaining days: 96,000 account-days
  Randomly selecting 10,000 for training and 5000 for testing (20% fraud; 80% non-fraud)
Experiments
Rule learning generated 3630 rules
The rule selection process yielded 99 rules
Each of the 99 rules was used to instantiate 2 monitor templates, yielding 198 monitors
The final feature selection step reduced these to 7 monitors
Experiments
Conclusion
Fraud behavior changes frequently, and fraud detection systems should be adaptive as well
To build usage monitors we must know which aspects of customers’ behavior to profile
This framework is not specific to cloning fraud