TRANSCRIPT
Introduction Datasets Malware analysis Performances Conclusion
Challenges for Reliable and Large Scale Evaluation of Android Malware Analysis
Jean-François Lalande, Valérie Viet Triem Tong, Mourad Leslous, Pierre Graux
INVITED TALK – SHPCS 2018
July 19th 2018
Context
Google Play Store: 3.5 million applications (2017)
Malware is uploaded to the Play Store and to third-party markets
Research efforts:
detection, classification
payload extraction, unpacking, reverse engineering
execution, triggering
Difficulties for experimenting with Android malware samples
Analyzing malware
About papers that work on Android malware…
CopperDroid [Tam et al. 2015]: 1365 samples, 3% of payloads executed
IntelliDroid [Wong et al. 2016]: 10 samples, 90% of payloads executed
GroddDroid [us! 2015]: 100 samples, 24% of payloads executed
. . .
Challenges
About datasets: Are there any datasets? Can we build one?
About analysis: What are the difficulties? Is it reliable and reproducible? Are the samples really malware?
About performance: How much does it cost? Is it scalable?
1 Introduction
2 Datasets
3 Malware analysis
4 Performances
5 Conclusion
Why build a dataset?
Papers with Android malware experiments:
use extracts of reference datasets:
The Genome project (stopped!) [Zhou et al. 12]
Contagio mobile dataset [Mila Parkour]
Hand-crafted malicious apps (DroidBench [Arzt et al. 14])
Some security challenges' apps
need to be significant:
Tons of apps (e.g. 1.3 million for PhaLibs [Chen et al. 16])
A few apps (e.g. 11 for TriggerScope [Fratantonio et al. 16])
A well-documented dataset does not exist!
Online services give poor information!
Building a dataset
Collect malware from online sources or from researchers
Study the samples manually
Methodology:
manual reverse engineering of 7 samples
manual triggering (not obvious)
execution and information flow capture
[Image: Con-struct + Replicant community, CC BY-SA 3.0]
A collection of fully reversed malware
Kharon dataset: 7 malware samples [1]:
http://kharon.gforge.inria.fr/dataset
DroidKungFu, BadNews (2011, 2013)
WipeLocker (2014)
MobiDash (2015)
SaveMe, Cajino (2015)
SimpleLocker (2014)
[1] Approved by Inria's Operational Legal and Ethical Risk Assessment Committee: we warn readers that these samples must be used for research purposes only. We also advise carefully checking the SHA256 hashes of the studied malware samples and manipulating them in a sandboxed environment. In particular, manipulating these malware samples requires following the safety rules of your Institutional Review Board.
Remote Admin Tools
Install malicious apps:
BadNews: obeys a remote server and delays the attack
Triggering: patch the bytecode and build a fake server
DroidKungFu1 (well known): delays the attack
Triggering: modify 'start' to 1 in sstimestamp.xml and reboot the device
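Building a fake server for BadNews-style triggering can be prototyped with the JDK's built-in HTTP server. This is only a sketch under assumptions: the endpoint path and the command string below are invented for illustration and are not BadNews's real protocol.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

public class FakeC2Server {
    // Minimal fake command-and-control server: whatever the bot asks,
    // always reply with a command that triggers the payload.
    // Path and command format are illustrative assumptions.
    public static HttpServer start(int port, String command) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/", exchange -> {
            byte[] body = command.getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });
        server.start();
        return server;
    }
}
```

Pointing the patched sample's server URL at this process then lets the analyst observe the payload without contacting the real (possibly dead) infrastructure.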
Blocker / Eraser
Wipes the SD card and blocks social apps:
WipeLocker: delayed attack
Triggering: launch the app and reboot the device
Adware
Displays ads after some days:
MobiDash: delayed attack
Triggering: launch the application, reboot the device and modify com.cardgame.durak_preferences.xml
Spyware
Steals contacts, SMS, IMEI, …
SaveMe: verifies Internet access
Triggering: enable Internet access and launch the app
Cajino: obeys a Baidu remote server
Triggering: simulate a server command with an Intent
Ransomware
Encrypts the user's files and demands payment:
SimpleLocker: waits for the device to reboot
Triggering: send a BOOT_COMPLETED intent
More details about SimpleLocker...
Example: SimpleLocker
The main malicious functions:
org.simplelocker.MainService.onCreate()
org.simplelocker.MainService$4.run()
org.simplelocker.TorSender.sendCheck(final Context context)
org.simplelocker.FilesEncryptor.encrypt()
org.simplelocker.AesCrypt.AesCrypt(final String s)
The encryption loop:
final AesCrypt aesCrypt = new AesCrypt("jndlasf074hr");
for (final String s : this.filesToEncrypt) {
    aesCrypt.encrypt(s, String.valueOf(s) + ".enc");
    new File(s).delete();
}
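The AesCrypt class itself is not shown on the slide. A self-contained reconstruction of what AesCrypt("jndlasf074hr") plausibly does is sketched below; the key derivation (SHA-256 of the hard-coded password) and the cipher parameters (AES/CBC with a fixed IV) are assumptions for illustration, not the malware's verified code.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.CipherOutputStream;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class AesCrypt {
    // AES key derived by hashing the hard-coded password (assumption).
    private final SecretKeySpec key;
    // Fixed all-zero IV keeps the sketch deterministic; the real sample may differ.
    private final IvParameterSpec iv = new IvParameterSpec(new byte[16]);

    public AesCrypt(final String password) throws Exception {
        byte[] digest = MessageDigest.getInstance("SHA-256")
                .digest(password.getBytes(StandardCharsets.UTF_8));
        key = new SecretKeySpec(digest, "AES");
    }

    public void encrypt(String src, String dst) throws Exception {
        Cipher c = Cipher.getInstance("AES/CBC/PKCS5Padding");
        c.init(Cipher.ENCRYPT_MODE, key, iv);
        try (InputStream in = new FileInputStream(src);
             OutputStream out = new CipherOutputStream(new FileOutputStream(dst), c)) {
            in.transferTo(out);
        }
    }

    public void decrypt(String src, String dst) throws Exception {
        Cipher c = Cipher.getInstance("AES/CBC/PKCS5Padding");
        c.init(Cipher.DECRYPT_MODE, key, iv);
        try (InputStream in = new CipherInputStream(new FileInputStream(src), c);
             OutputStream out = new FileOutputStream(dst)) {
            in.transferTo(out);
        }
    }
}
```

Because the password is hard-coded in the APK, files encrypted this way are recoverable: deriving the same key suffices to decrypt.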
Dataset overview
Type        Name          Protection against dynamic analysis → Remediation

RAT         BadNews       Obeys a remote server and delays the attack
                          → Modify the APK; build a fake server
Ransomware  SimpleLocker  Waits for the device to reboot
                          → Send a BOOT_COMPLETED intent
RAT         DroidKungFu   Delayed attack
                          → Modify the value 'start' to 1 in sstimestamp.xml
Adware      MobiDash      Delayed attack
                          → Launch the infected application, reboot the device
                            and modify com.cardgame.durak_preferences.xml
Spyware     SaveMe        Verifies Internet access
                          → Enable Internet access and launch the application
Eraser+LK   WipeLocker    Delayed attack
                          → Press the icon launcher and reboot the device
Spyware     Cajino        Obeys a remote server
                          → Simulate the remote server by sending an intent
New recent datasets
AndroZoo [Allix et al. 2016]: 3 million apps, with pairs of applications (repackaged?)
The AMD dataset [Wei et al. 2017]: 24,650 samples, with contextual information (classes, actions, …)
We need more contextual information!
Where is the payload?
How do we trigger the payload?
Which device do we need?
1 Introduction
2 Datasets
3 Malware analysis
4 Performances
5 Conclusion
Designing an experiment from scratch
How to check that a sample is malware?
Manually… fine for 10 samples, but beyond that?
Ask VirusTotal!
~45 antivirus engines
Use a threshold to decide (e.g. 20 antiviruses)
Free upload API (a few samples per day)
Used by others in papers
Is it a good idea?
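The threshold vote itself is trivial to implement; a minimal sketch follows, using a simplified per-engine report (a plain map of engine name to verdict) rather than VirusTotal's actual response schema.

```java
import java.util.Map;

public class ThresholdVote {
    // Label a sample as malware when at least `threshold` engines flag it.
    public static boolean isMalware(Map<String, Boolean> verdicts, int threshold) {
        long positives = verdicts.values().stream().filter(v -> v).count();
        return positives >= threshold;
    }
}
```

The slide's question stands: with fresh samples, many engines lag behind, so a fixed threshold (e.g. 20) can misclassify genuinely new malware as benign.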
An experiment with 683 fresh samples
Threshold of x antiviruses recognizing a sample?
Designing an experiment from scratch
Analyzing malware
Main analysis methods are:
static analysis:⇒ try to recognize knowncharacteristics of malware in thecode/resources of studied applications
dynamic analysis:⇒ try to execute the malware
Countermeasures against static analysis: reflection, obfuscation, dynamic loading, encryption, native code
Countermeasures against dynamic analysis: logic bomb, time bomb, remote server
Solving attacker’s countermeasures
Possible solutions against attacker’s countermeasures.
Problem               Solution
malformed files       ignore them if possible
reflection            execute it
dynamic loading       execute it
logic/time bomb       force conditions
native code           watch it from the kernel
packing               ???
(dead) remote server  ???
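Forcing a logic/time bomb, as GroddDroid does by patching the bytecode, amounts to replacing the guard of the protected branch with one that always passes. The class below is a plain-Java model of that idea, not code from any actual tool or sample.

```java
import java.util.function.BooleanSupplier;

public class TimeBomb {
    // The payload only runs when the guard condition holds
    // (e.g. "enough days since installation have elapsed").
    private final BooleanSupplier guard;
    private boolean payloadRan = false;

    public TimeBomb(BooleanSupplier guard) { this.guard = guard; }

    public void tick() {
        if (guard.getAsBoolean()) {
            payloadRan = true; // stands in for the malicious payload
        }
    }

    public boolean payloadRan() { return payloadRan; }
}
```

An unmodified sample whose guard checks a 30-day delay stays dormant during a short analysis run; rewriting the guard to always return true is the "force conditions" remediation.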
Malware analysis
We have developed software for:
Static and dynamic analysis
Smartphone flashing with a custom kernel
More info: http://kharon.gforge.inria.fr
Dynamic analysis requires a lot of effort to automate.
1 Introduction
2 Datasets
3 Malware analysis
4 Performances
5 Conclusion
Challenges
Is it reliable?
How do we evaluate the results? (execution of the payload)
Is it scalable?
Some results with:
The AMD dataset [Wei et al. 2017]
A "native" dataset: apps using native code
Reliability
Some malware crash (and people don't care…)
Crash ratio (at launch time):
AMD dataset: 5%
Our native dataset: 20%
If there is no crash, how do we evaluate the results?
We need to know where the payload is!
Performances
Time evaluation (average):
For one app and one payload:
Flashing the device: 60 s
Static analysis: 7 s
Dynamic analysis (execution): 4 min
Total: ~5 min
From the AMD dataset: 135 samples
100 payloads per app
All Android OS versions: 8
Total: 1 year of experiments (with 1 device)
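The "1 year" total can be checked with simple arithmetic on the slide's numbers (5 minutes per run, 135 samples, 100 payloads each, 8 OS versions):

```java
public class ExperimentBudget {
    // Total wall-clock days for one device running every combination
    // of sample, payload and Android version sequentially.
    public static long totalDays(int samples, int payloadsPerApp,
                                 int osVersions, int minutesPerRun) {
        long minutes = (long) samples * payloadsPerApp * osVersions * minutesPerRun;
        return minutes / (60 * 24); // minutes in a day
    }
}
```

135 × 100 × 8 × 5 min = 540,000 minutes, i.e. 375 days: about one year on a single device, which is why scalability is the next question.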
Scalability
We need a real smartphone.
At this time we use:
A server running our software
A pool of 1 to 5 smartphones (USB limitations?)
Frontend visible at: http://kharon.irisa.fr
Conclusion
Designing experiments on Android malware is a difficult challenge!
We still have difficulties:
How do we scale experiments?
How do we evaluate payload execution?
http://kharon.gforge.inria.fr
http://kharon.gforge.inria.fr/dataset
http://kharon.irisa.fr
© Inria / C. Morel
Questions ?