A.C. Chen 2012/07/23 @ ADL 1
A FRAMEWORK FOR DETECTING MALFORMED SMS ATTACK
M Zubair Rafique
Muhammad Khurram Khan
Khaled Alghathbar
Muddassar Farooq
The 8th FTRA International Conference on Secure and Trust Computing, Data Management, and Applications (STA 2011)
Outline
• Introduction
• Malformed message detection framework
• Evaluation and experimental results
• Conclusion
SMS Delivery Process
SMS_SUBMIT
SMS_DELIVER
BSC: Base Station Controller
MSC: Mobile Switching Center
GMSC: Gateway MSC
IWMSC: Interworking MSC
Short Message Service (SMS)
• Messages sent to and from a mobile phone first pass through an intermediate component called the Short Message Service Center (SMSC)
• An SMS message exists in 2 formats
  • SMS_SUBMIT: mobile phone to SMSC
  • SMS_DELIVER: SMSC to mobile phone
GSM Modem
• The SMS received on a mobile phone is handled through the GSM modem
• Provides an interface with the GSM network and the application processor of a smart phone
• Controlled through standardized AT commands
[Figure: Apps / Telephony Stack / Modem layers. AT commands go to the modem and AT result codes come back from it. The modem is responsible for cellular communications; the telephony stack is responsible for the communication between the application processor and the modem.]
Example: SMS_DELIVER
• Line 1: AT result code + the length of the SMS
• Line 2: complete SMS string in hex
Malformed SMS attack
• Causes the application processor to reach an undefined state
  • Significant processing delays
  • Unauthorized access
  • Denying legitimate users access
  • …
• However, malformed message detection in mobile phones has received little attention
[Figure: Apps / Telephony Stack / Modem layers]
In this Paper…
• A malformed message detection framework is proposed
• It automatically extracts novel syntactical features to detect a malformed SMS at the access layer of mobile phones
Common Idea
• Anomalies are deviations from a learnt normal model [Patrick Düssel, et al.]
• Learning → Normal model → Anomaly detection
• Supported by our pilot studies
  • The distance values of malformed messages are normally greater than those of benign messages
SMS Detection Framework
Message Analyzer → Feature Extraction → Feature Selection → Classification
Message Analyzer
• Message dissection
  • Transforms incoming SMS messages into a format from which we can extract intelligent features
  • Extracts the complete SMS message string, i.e. the second line of the AT result code
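The dissection step can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes the modem reports an incoming SMS_DELIVER as a two-line +CMT unsolicited result code in PDU mode, and the names dissect_cmt and length_is_consistent, as well as the sample message, are illustrative and do not come from the slides.

```python
import re

# Hypothetical helper: split a "+CMT" notification into the declared
# TPDU length (line 1) and the hex-encoded SMS string (line 2).
CMT_HEADER = re.compile(r"^\+CMT:\s*[^,]*,\s*(\d+)\s*$")

def dissect_cmt(notification: str) -> tuple[int, str]:
    lines = [ln.strip() for ln in notification.splitlines() if ln.strip()]
    if len(lines) != 2:
        raise ValueError("expected a header line and a PDU line")
    header = CMT_HEADER.match(lines[0])
    if header is None:
        raise ValueError("first line is not a +CMT result code")
    pdu = lines[1].upper()
    if len(pdu) % 2 or re.fullmatch(r"[0-9A-F]+", pdu) is None:
        raise ValueError("PDU is not a whole number of hex octets")
    return int(header.group(1)), pdu

def length_is_consistent(declared: int, pdu: str) -> bool:
    # The declared length counts TPDU octets only, so skip the SMSC
    # address field: one length octet plus that many address octets.
    smsc_len = int(pdu[:2], 16)
    return declared == len(pdu) // 2 - 1 - smsc_len

# Illustrative sample, not taken from the slides.
sample = (
    "+CMT: ,28\r\n"
    "07917283010010F5040BC87238880900F10000993092516195800AE8329BFD4697D9EC37"
)
declared, pdu = dissect_cmt(sample)
```

A mismatch between the declared length and the actual PDU size is one cheap structural check that could run before feature extraction; whether the framework performs such a check is not stated in the slides.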
Extraction of String Features
• Mine features from an incoming SMS message
  • Exploit the properties of a suffix tree
  • Use a set of attribute strings to model the content of the incoming message
• Entrenching function: extracts the (attribute, value) pairs from the suffix tree
  • attribute: a feature string a
  • value: the frequency of a, taken from the nodes of the suffix tree
• Example: see "Example of a Suffix Tree" and "Example of Entrenching Function"
Raw Model Vectors
• For the purpose of training, we prepared a training data set
  • K: set of messages used for training, K = {m1, …, mk}
• After each mi passes through the entrenching function, we have our raw model
Feature Selection
• The high dimensionality of the raw model results in large processing overheads
• Remove redundant features that have low classification potential
  • But not at the cost of a high false alarm rate
Selection Techniques
• Use 3 selection mechanisms to obtain 3 distinct model sets of attributes
  • Information Gain (IG)
  • Gain Ratio (GR)
  • Chi-squared (CH)
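As an illustration of the first of these mechanisms, the sketch below ranks attributes by information gain, IG = H(labels) - H(labels | feature). It is a sketch under assumptions: the slides do not say how feature values are discretized or how many attributes are kept, so the presence/absence encoding, the helper names and the toy data are all hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(feature_values, labels):
    """IG = H(labels) - H(labels | feature)."""
    total = len(labels)
    conditional = 0.0
    for value in set(feature_values):
        subset = [lab for f, lab in zip(feature_values, labels) if f == value]
        conditional += len(subset) / total * entropy(subset)
    return entropy(labels) - conditional

def select_top_k(feature_table, labels, k):
    """feature_table maps an attribute string to its per-message values."""
    ranked = sorted(feature_table,
                    key=lambda a: information_gain(feature_table[a], labels),
                    reverse=True)
    return ranked[:k]

# Toy data, made up for illustration: 1 if the attribute occurs in a message.
labels = ["benign", "benign", "malformed", "malformed"]
table = {
    "23": [0, 0, 1, 1],  # separates the two classes perfectly
    "0": [1, 1, 1, 1],   # present everywhere, carries no information
}
best = select_top_k(table, labels, 1)
```

Gain Ratio and Chi-squared would slot into the same ranking step by swapping the scoring function.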
Distance/Divergence
• For a given vector of pairs, compute the deviation (message score, distance) of the vector
• Use 2 well-known distance measures to obtain the score
  • Manhattan distance (md)
  • Itakura-Saito Divergence (isd)
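Both measures are easy to state in code. The sketch below uses the textbook definitions, md = sum |x_i - y_i| and isd = sum (x_i/y_i - ln(x_i/y_i) - 1), applied to two frequency vectors over the same attributes. What the message vector is compared against, and how zero frequencies are handled, are not given in the slides; the model vector and the eps smoothing here are assumptions.

```python
import math

def manhattan_distance(x, y):
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(abs(a - b) for a, b in zip(x, y))

def itakura_saito_divergence(x, y, eps=1e-9):
    # eps smoothing is an assumption: the divergence is only defined
    # for strictly positive components.
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    total = 0.0
    for a, b in zip(x, y):
        ratio = (a + eps) / (b + eps)
        total += ratio - math.log(ratio) - 1.0
    return total

model = [2.0, 2.0, 1.0]    # hypothetical model frequencies
message = [2.0, 4.0, 1.0]  # hypothetical incoming-message frequencies
md = manhattan_distance(message, model)
isd = itakura_saito_divergence(message, model)
```

Note that the Itakura-Saito divergence is not symmetric, so the order of the two arguments matters.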
Classification
• Threshold value
  • The largest distance score of a message in the training model
• Raise an alarm
  • If the distance score of an incoming SMS is greater than the threshold value
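The decision rule above is small enough to sketch directly. The function names and the scores are hypothetical; only the rule itself (threshold = largest training score, alarm when an incoming score exceeds it) comes from the slide.

```python
def learn_threshold(training_scores):
    # Threshold = largest distance score seen in the training model.
    return max(training_scores)

def is_malformed(score, threshold):
    # Alarm only when the score is strictly greater than the threshold.
    return score > threshold

training_scores = [0.8, 1.3, 2.1]  # hypothetical training scores
threshold = learn_threshold(training_scores)
```

With this rule every training message scores at or below the threshold, so no training message raises an alarm.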
Review
• Training is only required in the beginning
[Figure: message score compared against the threshold]
Evaluation
• Collected a real-world dataset of SMS messages
  • ≥ 5000 benign messages
    • Developed a modem terminal interface to collect more than 5000 real-world benign SMS messages
  • ≥ 5000 malformed messages
    • SMS injection framework (Mulliner, C., et al., 2009)
Experimental Goal
• To select the best feature selection technique and distance measure
  • 3 feature selection modules
    • Information Gain (IG)
    • Gain Ratio (GR)
    • Chi-squared (CH)
  • 2 distance measures
    • Manhattan distance (md)
    • Itakura-Saito Divergence (isd)
Parameters and Definitions
• Used 4 parameters to define the detection accuracy and the false alarm rate
  • True Positive (TP), False Positive (FP), False Negative (FN), True Negative (TN)
• Detection Rate
• False Alarm Rate
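The formulas for the two rates did not survive in this transcript. The sketch below assumes the usual definitions, detection rate = TP / (TP + FN) and false alarm rate = FP / (FP + TN); the counts are made up for illustration and are not results from the paper.

```python
def detection_rate(tp, fn):
    # Share of malformed messages that were flagged.
    return tp / (tp + fn)

def false_alarm_rate(fp, tn):
    # Share of benign messages that were wrongly flagged.
    return fp / (fp + tn)

dr = detection_rate(tp=98, fn=2)      # hypothetical counts
far = false_alarm_rate(fp=5, tn=95)   # hypothetical counts
```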
Results: Receiver Operating Characteristic Curves
[Figure: two panels, "ROC using Manhattan Distance" and "ROC using Itakura-Saito Divergence"]
Results: Overheads
[Table: training and threshold calculation overheads in ms/100 SMS, and testing overheads in ms/1 SMS, using Information Gain, Gain Ratio and Chi-squared for Manhattan distance and Itakura-Saito Divergence; one entry is annotated "Provides the best performance"]
• Average training time = 3.5 s/100 SMS
• Average detection time of a malformed message = 10 ms
Conclusion
• A real-time malformed message detection framework
  • Tested on real datasets of SMS messages
  • Successfully detects malformed messages with a detection accuracy of more than 98%
• Future research will focus on further optimizing the framework and deploying it on real-world mobile devices and smart phones
Example of a Suffix Tree
• Extract feature strings from an incoming message m = 0110223
• The set of attribute strings is thus generated
Example of Entrenching Function
• Message m = 0110223
• Set of attributes: {3, 0, 1, 2, 23, 223, 110223, 10223, 0223, 0110223}
• Vector of pairs = (3, 1), (0, 2), (1, 2), (2, 2), (23, 1), (223, 1)…
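The example can be reproduced with a brute-force sketch. This is an assumption about how the attribute set is formed, not the authors' suffix-tree implementation: taking every suffix of m plus every single symbol yields exactly the ten attribute strings listed above, and counting (possibly overlapping) occurrences of each in m yields the listed frequencies. The names entrench and count_occurrences are illustrative.

```python
def count_occurrences(text, pattern):
    # Count occurrences of pattern in text, allowing overlaps.
    count, start = 0, text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count

def entrench(message):
    # Attribute strings: every suffix plus every single symbol.
    attributes = {message[i:] for i in range(len(message))}
    attributes.update(message)
    # Value of each attribute: its frequency in the message.
    return {a: count_occurrences(message, a) for a in attributes}

pairs = entrench("0110223")
```

A real suffix tree would give the same counts without rescanning the message for each attribute, which matters for the per-message detection time.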
The RIL in the context of Android's Telephony system architecture [ref]