

Download Malware? No, thanks.

How Formal Methods can Block Update Attacks

Francesco Mercaldo, Vittoria Nardone, Antonella Santone, Corrado Aaron Visaggio
Department of Engineering, University of Sannio, Italy

{fmercaldo, vnardone, santone, visaggio}@unisannio.it

ABSTRACT

In the mobile malware landscape there are many techniques for injecting a malicious payload into a trusted application; one of the most common is the so-called update attack. After an apparently innocuous application is installed on the victim’s device, the user is asked to update the application, and malicious behavior is added to it. In this paper we propose a static method, based on model checking, that is able to identify this kind of attack. In addition, our method is able to localize the malicious payload at method level. We obtain an accuracy very close to 1 in identifying families implementing the update attack, using a real Android dataset composed of 2,581 samples.

Keywords

Malware; Android; Security; Model Checking; Temporal logic

1. INTRODUCTION AND MOTIVATION

Mobile malware techniques for evading detection are becoming more and more sophisticated. One of them, which has recently become widely used, is the update attack: the malicious payload is downloaded to the victim’s device after the user has installed an app that does not exhibit any harmful behavior. This download is hidden behind an apparently normal update of the app. In this way the anti-malware does not detect any malicious component when scanning the app at the initial stages of its life on the victim’s device; additionally, as the payload is installed as a normal update, the unaware victim grants the proper privileges. As demonstrated in [1] and [2], it is easy to implement this technique in real-world environments.

The update attack leverages the current mobile scenario, where updates for apps and OSs become available every few months, as the marketplace requires more and more functionality to keep an app appealing and popular. Beyond the undetected installation of a malicious payload, another threat is caused by uncontrolled updating, namely Pileup (privilege escalation through updating) [3]. It allows an unprivileged malicious app to acquire system capabilities after the OS is updated, without being noticed by the phone user. The point is that this menace exploits the flaws in the updating mechanism of the new OS to which the current system is upgraded.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
FormaliSE’16, May 15 2016, Austin, TX, USA
© 2016 ACM. ISBN 978-1-4503-4159-2/16/05...$15.00
DOI: http://dx.doi.org/10.1145/2897667.2897673

Four main techniques are usually employed to implement the update attack on Android apps [4]: (i) notifying the user that an update (i.e., a complete replacement) of the original application is available; (ii) dynamically loading compiled Android code (i.e., executable DEX files) using Android’s DexClassLoader class, which allows the execution of code not installed as part of the application; (iii) dynamically loading a binary shared object file (also called a .so library) or an executable file containing native (i.e., machine) code, which can be executed using Java’s Runtime class; and (iv) dynamically loading a file (e.g., mp3, jpg, flash, or pdf) containing a malicious payload (i.e., shellcode) and executing it by exploiting vulnerabilities in the system libraries or external applications that handle the file type.

In 2010, Jon Oberheide demonstrated [5] how to download arbitrary code in Android applications at runtime, while Bellissimo et al. observed how and why many update mechanisms of applications and operating systems are insecure [6]. Malicious dynamic code has been increasingly adopted by malware writers to circumvent Google Bouncer, a tool used by Google to find apps containing security threats and ban their publication on the Google Store. Poeplau et al. [7] studied the extent of unsafe dynamic loading in the Android ecosystem. They found that, among 1,632 popular applications retrieved from Google Play, external code loading is implemented in an insecure way in as many as 9.25% of those applications and even in 16% of the top 50 free applications.

Static techniques are ineffective since the original version of the application is completely benign by itself. Dynamic analysis can be evaded by delaying or filtering the deployment of the malicious payload.

The main problem with update attack identification is that it is not possible to know in advance when the payload will be downloaded and integrated into the host application. Moreover, the update attack exploits dynamic loading, a possibility offered by Android for improving the flexibility of (benign) apps. Thus, distinguishing between a licit update and an illicit one is very hard with automatic techniques. As a matter of fact, this issue is scarcely discussed in the literature.

We propose a technique that applies formal methods to identify the update attack in the app and to localize the portion of the code that implements the download. The method brings two benefits: (1) it is static, so it has the typical advantage of static analysis: it can be run directly on the application without the need to execute it; (2) it is able to localize the portion of the code where the download is implemented, which can help malware dissection and analysis.


We designed and carried out an experiment with 2,581 Android applications, aimed at answering the following research question:

RQ: is it possible to identify the update attack by using formal methods?

The main contributions of our paper are: (i) to provide an effective and novel solution to the identification of an attack for which few effective solutions exist; and (ii) to offer a mechanism that localizes the part of the code where the attack is accomplished.

The paper proceeds as follows: Section 2 describes and motivates our detection method; Section 3 illustrates the results of the experiments; Section 4 discusses related work; finally, conclusions are drawn in Section 5.

2. THE APPROACH

In this section, we describe a model checking based methodology for update attack recognition. Before introducing our approach, we need some preliminaries on temporal logic and on formal specification languages.

2.1 Mu-Calculus Logic

We use the mu-calculus logic [8] as a branching temporal logic to express behavioural properties. The syntax of the mu-calculus is the following, where K ranges over sets of actions and Z ranges over variables:

$$\varphi ::= \mathit{tt} \mid \mathit{ff} \mid Z \mid \varphi \wedge \varphi \mid \varphi \vee \varphi \mid [K]\varphi \mid \langle K \rangle \varphi \mid \nu Z.\varphi \mid \mu Z.\varphi$$

A fixpoint formula has the form $\mu Z.\varphi$ (resp. $\nu Z.\varphi$), where $\mu Z$ (resp. $\nu Z$) binds free occurrences of $Z$ in $\varphi$. An occurrence of $Z$ is free if it is not within the scope of a binder $\mu Z$ (resp. $\nu Z$). A formula is closed if it contains no free variables. $\mu Z.\varphi$ is the least fixpoint of the recursive equation $Z = \varphi$, while $\nu Z.\varphi$ is the greatest one. From now on we consider only closed formulae.

The satisfaction of a formula $\varphi$ by a state $s$ of a transition system is defined as follows: each state satisfies $\mathit{tt}$ and no state satisfies $\mathit{ff}$; a state satisfies $\varphi_1 \vee \varphi_2$ ($\varphi_1 \wedge \varphi_2$) if it satisfies $\varphi_1$ or (and) $\varphi_2$. $[K]\varphi$ is satisfied by a state which, for every performance of an action in $K$, evolves to a state obeying $\varphi$. $\langle K \rangle \varphi$ is satisfied by a state which can evolve to a state obeying $\varphi$ by performing an action in $K$. For more details, the reader can refer to [8].
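The satisfaction relation above can be made concrete with a small evaluator over a finite transition system. The following Python fragment is only an illustrative sketch, not the algorithm used by the model checker adopted later in the paper: formulae are encoded as nested tuples, fixpoints are computed by naive iteration, and the state names, the action names (load, goto, invoke) and the example formula are assumptions introduced here for illustration.

```python
from typing import Dict, FrozenSet, Set, Tuple

# Labelled transition system: state -> set of (action, successor) pairs.
LTS = Dict[str, Set[Tuple[str, str]]]


def sat(f: tuple, lts: LTS, env=None) -> FrozenSet[str]:
    """Return the set of states of `lts` that satisfy formula `f`."""
    env = env or {}
    states = frozenset(lts)
    op = f[0]
    if op == "tt":                      # every state satisfies tt
        return states
    if op == "ff":                      # no state satisfies ff
        return frozenset()
    if op == "var":                     # fixpoint variable Z
        return env[f[1]]
    if op == "and":
        return sat(f[1], lts, env) & sat(f[2], lts, env)
    if op == "or":
        return sat(f[1], lts, env) | sat(f[2], lts, env)
    if op in ("box", "dia"):            # [K]f and <K>f
        actions, target = f[1], sat(f[2], lts, env)
        result = set()
        for s in states:
            succ = [t for (a, t) in lts[s] if a in actions]
            if op == "box" and all(t in target for t in succ):
                result.add(s)
            if op == "dia" and any(t in target for t in succ):
                result.add(s)
        return frozenset(result)
    if op in ("mu", "nu"):              # least / greatest fixpoint
        var, body = f[1], f[2]
        current = frozenset() if op == "mu" else states
        while True:                     # iterate until the set is stable
            nxt = sat(body, lts, {**env, var: current})
            if nxt == current:
                return current
            current = nxt
    raise ValueError(f"unknown operator: {op}")


# Hypothetical four-state system: s0 -load-> s1 -goto-> s2 -invoke-> s3.
example_lts: LTS = {
    "s0": {("load", "s1")},
    "s1": {("goto", "s2")},
    "s2": {("invoke", "s3")},
    "s3": set(),
}
ALL_ACTIONS = {"load", "goto", "invoke"}

# mu Z. <invoke>tt \/ <all actions>Z : "an invoke can eventually be performed".
eventually_invoke = (
    "mu", "Z",
    ("or", ("dia", {"invoke"}, ("tt",)), ("dia", ALL_ACTIONS, ("var", "Z"))),
)
```

Calling `sat(eventually_invoke, example_lts)` yields the states from which an invoke action is reachable. The iteration terminates because the syntax has no negation, so every formula body is monotone and the state space is finite.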

2.2 Calculus of Communicating Systems

Milner’s Calculus of Communicating Systems (CCS) [9] is one of the best-known process algebras. Readers unfamiliar with CCS are referred to [9] for further details. CCS contains basic operators to build finite processes, communication operators to express concurrency, and some notion of recursion to capture infinite behaviour.

We assume a set of actions $\mathcal{A}$, which also contains the action $\tau \in \mathcal{A}$, called the internal action. The $\tau$ actions allow some level of abstraction in the description of the processes. The set of visible actions $\mathcal{V}$ is defined as $\mathcal{A} \setminus \{\tau\}$.

The semantics of a CCS term p is precisely defined by means of the structural operational semantics. The semantic definition is given by a set of conditional rules describing the transition relation of the automaton corresponding to the behavior expression defining p. The CCS operators are the following:

• “nil” : nil represents a process that can do nothing.

• “+”: the process p + q is a process that non-deterministically behaves either as p or as q.

• “.”: this operator expresses sequentialization. The process a.q ($a \in \mathcal{A}$) means that a must be performed before the process q can start its execution.

• “|”: the process p|q represents the parallel execution of p and q. p and q may act independently or they may also be engaged in a communication (the $\tau$ action).

• “\L”: if L is a set of visible actions, p\L is a process that behaves as p except that it cannot perform any of the actions lying in L externally.

• “[f]”: the operator [f] expresses the relabelling of actions. If p can perform a ($a \in \mathcal{A}$) and become p′, then p[f] can perform f(a) and become p′[f].

• “x”: the behavior of the process x, with $x \stackrel{\mathrm{def}}{=} p$, is that of its definition p.
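The operators listed above can be read as rules that compute the outgoing transitions of a process term. The sketch below is a minimal Python rendering of that reading, under stated assumptions: terms are encoded as nested tuples, the complement of an action a is written 'a (the text above does not fix a notation for co-actions), the relabelling function is given as a tuple of (old, new) pairs, and recursion through constants is assumed to be guarded by a prefix.

```python
from typing import Dict, Set, Tuple

TAU = "tau"  # the internal action


def co(action: str) -> str:
    """Complement of a visible action (assumed notation: a <-> 'a)."""
    return action[1:] if action.startswith("'") else "'" + action


def rename(action: str, f: Dict[str, str]) -> str:
    """Apply a relabelling function; tau is never renamed."""
    if action == TAU:
        return action
    if action.startswith("'"):
        return "'" + f.get(action[1:], action[1:])
    return f.get(action, action)


def transitions(p: tuple, defs: Dict[str, tuple]) -> Set[Tuple[str, tuple]]:
    """Return the (action, successor) pairs that process term `p` can perform."""
    kind = p[0]
    if kind == "nil":                   # nil can do nothing
        return set()
    if kind == "prefix":                # a.q performs a, then behaves as q
        return {(p[1], p[2])}
    if kind == "choice":                # p + q behaves as p or as q
        return transitions(p[1], defs) | transitions(p[2], defs)
    if kind == "par":                   # p | q: interleaving plus communication
        left, right = transitions(p[1], defs), transitions(p[2], defs)
        moves = {(a, ("par", l, p[2])) for a, l in left}
        moves |= {(a, ("par", p[1], r)) for a, r in right}
        moves |= {(TAU, ("par", l, r))
                  for a, l in left for b, r in right
                  if a != TAU and b == co(a)}
        return moves
    if kind == "restrict":              # p \ L hides the actions in L
        hidden = p[2]
        return {(a, ("restrict", q, hidden))
                for a, q in transitions(p[1], defs)
                if a == TAU or (a not in hidden and co(a) not in hidden)}
    if kind == "relabel":               # p[f] renames actions through f
        f = dict(p[2])
        return {(rename(a, f), ("relabel", q, p[2]))
                for a, q in transitions(p[1], defs)}
    if kind == "const":                 # x behaves as its definition
        return transitions(defs[p[1]], defs)
    raise ValueError(f"unknown operator: {kind}")


# Hypothetical terms: a.nil | 'a.nil, the same term restricted on a, and x = goto.x.
NIL = ("nil",)
receiver = ("prefix", "a", NIL)
sender = ("prefix", "'a", NIL)
system = ("par", receiver, sender)
restricted = ("restrict", system, frozenset({"a"}))
DEFS = {"x": ("prefix", "goto", ("const", "x"))}
```

In this encoding `system` can perform a, 'a or the communication tau, while `restricted` can only perform tau, which matches the informal description of the restriction operator.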

The Concurrency Workbench of the New Century (CWB-NC) [10] is a formal verification environment that supports CCS as its input language and offers a model checker to verify mu-calculus formulae.

2.3 Model Checking for Update Attack Recognition

While model checking [11] was originally developed to verify the correctness of systems against specifications, it has recently been applied in a variety of disciplines such as business process management [12], biology [13], self-adaptive systems [14], secure information flow [15], and incremental design and system evolution scenarios [16, 17], among others. In this paper we present the use of model checking in the security field.

The first step of our methodology consists of describing apk (i.e., Android package) programs through CCS. We define an APK-to-CCS transform operator that applies directly to the Java Bytecode and translates it into CCS process specifications. More precisely, the bytecode of the analysed app, which resides in a class folder or in JAR files, is fed to a custom parser based on the Apache Commons Byte Code Engineering Library (BCEL)¹. The parsed Java Bytecode of the .class files is then translated into formal models. The transform operator is defined for each instruction of the Java Bytecode; thus, all instructions are translated into CCS processes. This translation has to be performed only once for each app to be analysed, and it has been completely automated.

Every instruction that is not a (conditional or unconditional) jump is represented by a process that, through the sequentialization operator (“.”), invokes the process corresponding to its immediately following instruction. Conditional jumps are instead specified as non-deterministic choices. An unconditional jump is represented by a CCS process that invokes the corresponding process of the jump target. Just to give the reader the flavour of the approach, we show here only the translation of the unconditional Java Bytecode instruction j : goto k. We associate a new CCS process with each Java Bytecode instruction; in this case we define the CCS constant $x_j$ corresponding to the instruction j. More precisely, the instruction j is translated as $x_j \stackrel{\mathrm{def}}{=} goto.x_k$, i.e., the CCS process $x_j$ performs the action goto and then jumps to the instruction k, corresponding to the CCS process $x_k$.

¹ http://commons.apache.org/bcel/
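The three translation rules described above (sequential instruction, conditional jump, unconditional jump) can be sketched in a few lines of Python. This is a simplified illustration and not the authors' transform operator: instructions are addressed by list index rather than by byte offset, the opcode sets are a small hypothetical selection, the exact shape of the non-deterministic choice for conditional jumps is an assumption, and mapping terminal instructions to nil is also an assumption, since the text does not say how they are handled.

```python
from typing import List, Optional, Tuple

# An instruction is (opcode, jump target index or None). Using list indices as
# targets is a simplification; real bytecode addresses instructions by offset.
Instruction = Tuple[str, Optional[int]]

UNCONDITIONAL = {"goto"}
CONDITIONAL = {"ifeq", "ifne", "iflt", "ifge", "ifgt", "ifle"}
TERMINAL = {"return", "ireturn", "areturn", "athrow"}  # assumed to end in nil


def to_ccs(code: List[Instruction]) -> List[str]:
    """Translate each instruction j into one CCS definition 'xj = ...'."""
    defs = []
    for j, (opcode, target) in enumerate(code):
        if opcode in UNCONDITIONAL:
            # j : goto k  becomes  xj = goto.xk
            body = f"{opcode}.x{target}"
        elif opcode in CONDITIONAL:
            # Non-deterministic choice between the jump target and fall-through.
            body = f"{opcode}.x{target} + {opcode}.x{j + 1}"
        elif opcode in TERMINAL or j + 1 == len(code):
            body = f"{opcode}.nil"
        else:
            # Any other instruction continues with its immediate successor.
            body = f"{opcode}.x{j + 1}"
        defs.append(f"x{j} = {body}")
    return defs


# Hypothetical four-instruction method body.
example_code: List[Instruction] = [
    ("iload", None),
    ("ifeq", 3),
    ("goto", 0),
    ("return", None),
]
example_defs = to_ccs(example_code)
```

For `example_code`, the third definition is `x2 = goto.x0`, which is the pattern given in the text for the instruction j : goto k.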


Malware programs that use the update attack technique share a common set of characteristic behaviours, which we want to encode using a temporal logic. Thus, the second step of our methodology tries to recognize specific and distinctive features of the update attack behaviour that distinguish it from all other malware families and from goodware too. This specific behaviour is written as a set of properties. In our approach, the mu-calculus logic is used. Four different families of Android malware that implement the update attack are known [18]: Plankton, AnserverBot, BaseBridge and DroidKungFuUpdate. The Plankton payload is able to silently forward information about the device to a remote location. It does not require root privileges, and the payload is downloaded from a remote location. Once installed on a device, Plankton sends details such as the IMEI, the browser history and the permissions granted.

When an app infected with the Plankton payload is installed, a service is launched in the background. The service checks the details of the installed application, including its security permissions, and sends the details to a hard-coded HTTP server. The server replies with a URL that is used to download a JAR file that represents the malicious payload. Once the JAR file is downloaded, Plankton uses the DexClassLoader object to load the Dex byte code from the downloaded file. The downloaded Dex code launches a connection to the Command server and listens for commands to execute. From manual inspection of a few samples we notice that the botnet commands are embedded into the package, while the payload contains the business logic able to activate the botnet: we start from this consideration to define the logic rule for the Plankton family.

In particular, Figure 1 shows the Java code (1), the bytecode (2-A, 2-B) and the derived logic rule (3)² regarding a number of basic bot-related commands that can be remotely invoked from the application. We highlight that the business logic implementing the bot commands is not yet present in the application, but will be downloaded at run-time.

The AnserverBot malicious behavior, unlike the Plankton one, is embedded into the host app at installation time, i.e., the malicious payload is not downloaded from a remote location but is stored in an external folder at installation time. Indeed, under the raw and the assets directories there are two hidden apps named anservera.db and anserverb.db. When the host app runs, it pops up a fake upgrade window to lure the user into installing the first payload, i.e., anservera.db. This payload consists of a bot program that runs silently in the background without showing any icon on the home screen after the installation. At runtime, both the host app and anservera.db can dynamically load and execute code in the payload anserverb.db through the built-in Dalvik class loading capability in Android, without actually installing it. In addition, anservera.db can make phone calls and download and upgrade anserverb.db, while anserverb.db is able to update itself and can independently talk to Command & Control (C&C) servers to fetch and execute subsequent commands.

The BaseBridge family presents anserverb.db as a payload embedded in an external folder, like the AnserverBot family. The BaseBridge payload is able to receive premium numbers from remote C&C servers and dial calls or send out SMS messages to them, incurring fees for users.

Figure 2 shows the Java code (A-1) and the derived logic rule (A-2) for AnserverBot samples, while box B-1 shows the Java code and box B-2 the derived logic rule for the BaseBridge family. We highlight that the AnserverBot sample also installs RageAgainstTheCage, used to gain root access on carrier-locked Android phones, and busybox, a Unix utility that enables the device to execute superuser commands such as chroot, cp, ifconfig and adduser. The installation scripts of RageAgainstTheCage and busybox are in the anserverb.db payload.

² Note that the CWB-NC represents $\mu Z.\varphi$ (resp. $\nu Z.\varphi$) as min Z = $\varphi$ (resp. max Z = $\varphi$).

DroidKungFuUpdate uses the update attack to download the actual payload. Instead of carrying or enclosing the “updated” version inside the original app, it remotely downloads a new version from the Web. Once the payload is downloaded, the samples exhibit the same behaviour as the DroidKungFu3 family.

Finally, we use the previous results to determine whether a given application belongs to those implementing the update attack. More precisely, once we have both the formal model and the set of temporal logic formulae, we invoke the CWB-NC. When the result of the CWB-NC model checker is true, the app under analysis implements the update attack; when it is false, it does not. Thanks to the very detailed CCS model and logic formulae, we are able to reach a good accuracy in the overall results, as explained in the following section.

To the authors’ knowledge, model checking has never been used before for update attack detection.

Our experiments, as shown in the following section, indicate that the methodology can effectively pinpoint malicious applications, including obfuscated ones. Note that the use of a process algebra like CCS allows us to propose, as opposed to signature-based methods, a behaviour-based method, i.e., the identification of common “behaviours” of the update attack. This approach can also be extended to other kinds of malware families: it is sufficient to provide the characterization of the malware family under consideration as logic formulae.
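The contrast between signature-based and behaviour-based identification can be illustrated with a toy Python example. It is a sketch under strong assumptions: the "signature" is simply a SHA-256 digest of the textual instructions, the behavioural abstraction keeps only the opcodes (the abstraction actually used in the CCS models may retain more information), and the class and method names are hypothetical. It covers identifier renaming only; transformations such as junk code insertion change the opcode sequence and are handled at the level of the logic formulae, not by this sketch.

```python
import hashlib
from typing import List, Tuple

# An instruction is (opcode, operand); operands carry the identifiers
# that a renaming obfuscator rewrites.
Instruction = Tuple[str, str]


def byte_signature(code: List[Instruction]) -> str:
    """Digest of the full instruction text, standing in for a signature."""
    text = "\n".join(f"{op} {arg}" for op, arg in code)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def action_trace(code: List[Instruction]) -> List[str]:
    """Behavioural abstraction (assumed): keep the opcodes, drop the operands."""
    return [op for op, _ in code]


# Hypothetical method body before and after identifier renaming.
original = [
    ("new", "com/example/Updater"),
    ("invokespecial", "com/example/Updater.<init>"),
    ("invokevirtual", "com/example/Updater.fetch"),
    ("return", ""),
]
renamed = [
    ("new", "a/b/C"),
    ("invokespecial", "a/b/C.<init>"),
    ("invokevirtual", "a/b/C.d"),
    ("return", ""),
]

signatures_differ = byte_signature(original) != byte_signature(renamed)
traces_match = action_trace(original) == action_trace(renamed)
```

Here the digest changes as soon as an identifier is renamed, while the sequence of actions seen by a behavioural model stays the same.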

3. RESULTS AND DISCUSSION

The malware samples examined in the experimentation were collected from two different datasets: the Drebin project [19, 20] and the Genome project [18]. Each malware sample in both datasets is labelled according to the malware family it belongs to: each family comprises samples that have the same payload in common. We retrieved the Plankton and BaseBridge families from the Drebin project, and the AnserverBot and DroidKungFuUpdate families from the Genome project, in order to cover all the existing update attack families, as stated in [18].

Furthermore, we developed a framework³ able to inject several obfuscation levels into Android applications: (i) changing the package name; (ii) identifier renaming; (iii) data encoding; (iv) call indirections; (v) code reordering; (vi) junk code insertion. We produced a morphed version of the applications in the malware dataset with our framework, then applied our method to the morphed dataset in order to verify whether it loses its effectiveness, since a previous study [21] demonstrated that current anti-malware solutions are not able to detect malware after trivial code transformations.

Table 1 characterizes the 2,581 samples belonging to the dataset we used to test the effectiveness of our method.

Regarding the update attack families (Plankton, BaseBridge, AnserverBot and DroidKungFuUpdate), Table 1 shows the number of original and morphed samples we tested for each family. In some cases our framework was not able to disassemble some of the selected samples; we had to discard them, which is why the number of morphed samples is smaller.

In order to test the capacity of our rules to identify only update attack families, we include in the dataset trusted samples and malware from other families (respectively Trusted and Repackaged Attack).

³ https://github.com/faber03/AndroidMalwareEvaluatingTools

Figure 1: Plankton sample identified by the hash 00ceaa5f8f9be7a9ce5ffe96b5b6fb2e7e73ad87c2f023db9fa399c40ac59b62. Box 1 contains the botnet Java code, boxes 2-A and 2-B show the bytecode extracted from the Java code snippet, and box 3 contains the derived logic rule.

Table 1: Dataset used in the Experimentation

Dataset             Original Samples   Morphed Samples   #Samples for Family
Plankton            625                543               1,168
BaseBridge          330                307               637
AnserverBot         187                187               374
DroidKungFuUpdate   1                  1                 2
Repackaged Attack   200                0                 200
Trusted             200                0                 200
Total               1,543              1,038             2,581

We retrieved the top 200 free Android applications available in December 2015 from the official Google app store⁴. We submitted the downloaded applications to the VirusTotal API: we marked as trusted only the samples that were considered clean by all the 57 anti-malware solutions provided by VirusTotal.

Table 2 provides a brief description of the payload of the malware families labelled as Repackaged Attack in Table 1, i.e., malware that does not adopt the update attack technique.

Table 2: Repackaged Attack Malware dataset

Malware Family   Payload description
FakeInstaller    server-side polymorphic family
DroidKungFu      it installs a backdoor
Opfake           first Android polymorphic malware
GinMaster        malicious service to root devices
Kmin             it sends info to premium-rate numbers

Each family we consider includes the malicious payload at installation time, i.e., these samples do not need to download the payload at run-time: this represents the so-called repackaged attack. We test 40 samples for each family. The malware was retrieved from the Drebin project [19, 20] (we take into account the five most populous families).

⁴ https://play.google.com/store

3.1 Empirical Evaluation Procedure

To estimate the detection performance of our methodology we compute the metrics of precision (PR), recall (RC), F-measure (Fm) and accuracy (Acc), defined as follows:

$$PR = \frac{TP}{TP + FP}; \qquad RC = \frac{TP}{TP + FN};$$

$$Fm = \frac{2 \, PR \, RC}{PR + RC}; \qquad Acc = \frac{TP + TN}{TP + FN + FP + TN}$$

where TP is the number of malware samples that were correctly identified in the right family (True Positives), TN is the number of malware samples correctly identified as not belonging to the family (True Negatives), FP is the number of malware samples that were incorrectly identified in the target family (False Positives), and FN is the number of malware samples that were not identified as belonging to the right family (False Negatives).
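The four metrics follow directly from the confusion-matrix counts. The short Python function below restates the definitions above; the counts used in the example are hypothetical and chosen only to make the arithmetic easy to follow, and the function assumes that the denominators are non-zero.

```python
from typing import Tuple


def metrics(tp: int, fp: int, fn: int, tn: int) -> Tuple[float, float, float, float]:
    """Return (PR, RC, Fm, Acc) for one family, assuming non-zero denominators."""
    pr = tp / (tp + fp)                     # precision
    rc = tp / (tp + fn)                     # recall
    fm = 2 * pr * rc / (pr + rc)            # F-measure
    acc = (tp + tn) / (tp + fn + fp + tn)   # accuracy
    return pr, rc, fm, acc


# Hypothetical counts: 90 true positives, 10 false positives,
# 30 false negatives and 870 true negatives.
pr, rc, fm, acc = metrics(tp=90, fp=10, fn=30, tn=870)
```

With these counts, precision is 90/100, recall is 90/120, and accuracy is 960/1000; the F-measure is the harmonic mean of precision and recall.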

3.2 Evaluation

In order to highlight the effectiveness of our method relative to current anti-malware, we compare our results with those obtained by current signature-based anti-malware technologies, submitting the full dataset (original samples and morphed ones) to the top 10 ranked mobile anti-malware from AV-TEST⁵ (an independent Security Institute for IT) by querying the VirusTotal API⁶.

⁵ https://www.av-test.org/en/antivirus/mobile-devices/

Figure 2: Boxes A-1 and A-2 show the AnserverBot sample identified by the hash 93d55a653b5e494eedf828b711e3b8149c7293f3f2d78ed061b0c4d166a5cca32. In the raw folder the anserverb payload is stored, along with the busybox and the rageagainstthecage script that will be executed at runtime. The boxes B-1 and B-2 contain respectively the bc407b8036b57b507881352cca4e899423bb60cd6e732a8a3fe2327a8fefc595 AnserverBot sample: the asset folder contains the anserverb payload. Boxes A-2 and B-2 list the logic rules derived from the code snippets in boxes A-1 and B-1.

Table 3 shows the comparison between the top 10 anti-malware and our method on the original update attack samples and on the morphed ones.

We consider only the samples identified in the right family (column “ident” in Table 3). We also report the samples detected as malicious but not identified in the right family and the samples not recognized as malware (column “unident” in Table 3).

With regard to Table 3, we notice that ESET-NOD32 shows the best performance in family identification for Plankton samples, while AhnLab exhibits the best family detection ratio for AnserverBot samples. With morphed samples, instead, anti-malware performance decreases dramatically: ESET-NOD32 is able to identify only 197 samples belonging to the Plankton family, while AhnLab is able to identify only 138 AnserverBot samples. The best anti-malware for BaseBridge samples is BitDefender, able to correctly classify 302 original samples and 89 morphed samples. The DroidKungFuUpdate original sample is identified by the AhnLab, Avira and ESET-NOD32 anti-malware, while the morphed sample is identified only by the Avira and ESET-NOD32 anti-malware.

Due to the novelty of the problem, anti-malware is not yet specialized in family identification; this is why most anti-malware products are unable to detect families. Another problem is that current anti-malware is not able to detect malware when the signature mutates: its performance decreases dramatically with morphed samples. In contrast, the detection performed by our method is barely affected by the code transformations, so it is independent of the signature.

Table 4 shows the results obtained using our method: we obtain an accuracy equal to 1 for the Plankton, AnserverBot and DroidKungFuUpdate families, while the accuracy obtained for the BaseBridge family is equal to 0.99.

⁶ https://www.virustotal.com/

Table 4: Performance Evaluation

Formula             # Samples   TP      FP   FN   TN      PR   RC     Fm     Acc
Plankton            1,168       1,168   0    0    1,413   1    1      1      1
BaseBridge          637         629     0    8    1,944   1    0.98   0.98   0.99
AnserverBot         374         374     0    0    2,207   1    1      1      1
DroidKungFuUpdate   2           2       0    0    2,579   1    1      1      1

We consider for each family the sum of original and morphed samples: the detail about the number of original and morphed samples is shown in Table 1.

4. RELATED WORK

In recent years, formal methods have been applied to detect malicious behaviors but, to the best of the authors’ knowledge, they have never been applied specifically to identifying the update attack in Android malware. The authors of [22] introduce the specification language CTPL (Computation Tree Predicate Logic), which extends the well-known logic CTL, and provide an efficient model checking algorithm. Moreover, they confirm the malicious behavior of only thirteen Windows malware variants, using as dataset a set of worms dating from 2002-2004.

Song et al. [23] present an approach to model Microsoft Windows XP binary programs as a PushDown System (PDS). They evaluate 200 malware variants (generated by the NGVCK and VCL32 engines) and 8 benign programs. The tool PoMMaDe [24] is able to detect 600 real malware samples and 200 malware samples generated by two malware generators (NGVCK and VCL32), and proves the reliability of benign programs: a Microsoft Windows binary program is modeled as a PDS, which makes it possible to track the program's stack. Song et al. [25] model mobile applications using a PDS in order to discover private data leaks. Their method works at the Smali code level, i.e., on machine-level instructions, each consisting of an opcode and its parameters. Jacob and colleagues [26] provide a basis for a malware model founded on the Join-Calculus: the process-based model supports the fundamental notion of self-replication, but also interactions, concurrency and non-termination, to cover evolved malware. They consider system call sequences to build the model.

Table 3: Top 10 Signature-Based Antimalware Evaluation Against Our Method.

| Samples | Antimalware | Plankton %ident. | Plankton #ident. | Plankton #unident. | AnserverBot %ident. | AnserverBot #ident. | AnserverBot #unident. | BaseBridge %ident. | BaseBridge #ident. | BaseBridge #unident. | DroidKungFuUpdate %ident. | DroidKungFuUpdate #ident. | DroidKungFuUpdate #unident. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Original | AhnLab | 8.16% | 51 | 574 | 93.05% | 174 | 13 | 29.09% | 96 | 234 | 100% | 1 | 0 |
| Original | Alibaba | 0% | 0 | 625 | 0% | 0 | 187 | 0% | 0 | 330 | 0% | 0 | 1 |
| Original | Antiy | 0% | 0 | 625 | 6.42% | 12 | 175 | 0% | 0 | 330 | 0% | 0 | 1 |
| Original | Avast | 2.8% | 13 | 612 | 5.35% | 10 | 177 | 33.03% | 109 | 221 | % | 0 | 1 |
| Original | AVG | 4% | 25 | 600 | 0% | 0 | 187 | 0% | 0 | 330 | 0% | 0 | 1 |
| Original | Avira | 79.68% | 498 | 127 | 0% | 0 | 187 | 14.55% | 48 | 282 | 100% | 1 | 0 |
| Original | Baidu | 59.2% | 370 | 255 | 82.89% | 155 | 32 | 32.12% | 106 | 224 | 0% | 0 | 1 |
| Original | BitDefender | 73.36% | 496 | 129 | 0% | 0 | 187 | 91.52% | 302 | 28 | 0% | 0 | 1 |
| Original | ESET-NOD32 | 96% | 600 | 25 | 0% | 0 | 187 | 36.36% | 120 | 210 | 100% | 1 | 0 |
| Original | GData | 79.2% | 495 | 130 | 0% | 0 | 187 | 91.52% | 302 | 28 | 0% | 0 | 1 |
| Original | Our Method | 100% | 625 | 0 | 100% | 330 | 0 | 98.79% | 326 | 4 | 100% | 1 | 0 |
| Morphed | AhnLab | 0.74% | 4 | 539 | 73.8% | 138 | 49 | 11.08% | 34 | 273 | 0% | 0 | 1 |
| Morphed | Alibaba | 0% | 0 | 543 | 0% | 0 | 187 | 0% | 0 | 307 | 0% | 0 | 1 |
| Morphed | Antiy | 0% | 0 | 543 | 1.6% | 3 | 184 | 0% | 0 | 307 | 0% | 0 | 1 |
| Morphed | Avast | 1.29% | 7 | 536 | 1.07% | 2 | 185 | 15.31% | 47 | 260 | 0% | 0 | 1 |
| Morphed | AVG | 2.58% | 14 | 529 | 0% | 0 | 187 | 0% | 0 | 307 | 0% | 0 | 1 |
| Morphed | Avira | 34.62% | 188 | 355 | 0% | 0 | 187 | 4.56% | 14 | 293 | 100% | 1 | 0 |
| Morphed | Baidu | 27.26% | 148 | 395 | 68.45% | 128 | 59 | 7.49% | 23 | 284 | 0% | 0 | 1 |
| Morphed | BitDefender | 26.89% | 146 | 397 | 0% | 0 | 187 | 28.99% | 89 | 218 | 0% | 0 | 1 |
| Morphed | ESET-NOD32 | 36.28% | 197 | 346 | 0% | 0 | 187 | 6.84% | 21 | 286 | 100% | 1 | 0 |
| Morphed | GData | 38.31% | 208 | 335 | 0% | 0 | 187 | 37.13% | 114 | 193 | % | 0 | 1 |
| Morphed | Our Method | 100% | 543 | 0 | 100% | 187 | 0 | 98.7% | 303 | 4 | 100% | 1 | 0 |

Recently, researchers explored the possibility of identifying the malicious payload in repackaged Android applications using a model checking based approach [27]. Starting from the definition of the payload behavior, they formulate logic rules and test them on a real-world dataset. In their preliminary evaluation, the authors analyse only the DroidKungFu and Opfake families.

Beyond formal methods, several efforts have been made to identify malware on mobile devices, but only a few works specifically concern the identification of the update attack.

Poeplau et al. [7] propose a technique to detect unsafe dynamic code loading performed by Android apps. The paper does not focus on the update attack, but on the possibility that a goodware app loads external components which could contain a threat.

Grace et al. [28] find capability leaks, i.e., coding bugs in which permissions requested by system applications are not properly protected against illicit use by third-party applications. The same authors developed a technique for detecting some of the methods for loading external code [29]. Their solution does not cover all the existing techniques for code loading, nor do the authors propose a protection scheme to prevent this kind of attack.

Xing et al. [3] face a different problem, even if it has a lot in common with the update attack: the Pileup. This attack occurs when the user upgrades the operating system on the device. In some cases, the (malicious) application can request privileges or attributes that are valid only on the new operating system and do not exist in the old one. The malicious app can silently acquire them while the old operating system runs and exploit them once the new one is installed.

To the best of the authors' knowledge, the following two papers are the only research works in the current literature that explore the possibility of identifying update attacks on mobile platforms.

In [30] the authors carry out an analysis of 30,000 applications. Using static analysis, they single out potentially dangerous update attack applications by looking for the “DexClassLoader;->loadClass” Smali invocation, which represents the most commonly employed method to load a class dynamically. The next step is dynamic analysis: the authors execute the applications in a sandbox for 10 minutes. Using time-based triggering, the number of downloaded .apk files increases by 92% for the Malgenome dataset and by 53% for the Drebin dataset. A relatively small increase in the number of downloaded .dex files is also observed (6% for the Malgenome dataset and 28% for the Drebin dataset).
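The static step described for [30] amounts to searching disassembled Smali code for the “DexClassLoader;->loadClass” invocation. The following Python sketch shows one way such a scan could look; it is not the implementation used in [30]. The function names `find_dynamic_loading` and `scan_app`, the directory layout (one `.smali` file per class, as typically produced by a disassembler) and the `SAMPLE` snippet are assumptions made for illustration.

```python
from pathlib import Path
from typing import Dict, List

# Marker string quoted in the text above.
MARKER = "DexClassLoader;->loadClass"


def find_dynamic_loading(smali_text: str) -> List[int]:
    """Return the 1-based line numbers that contain the loadClass invocation."""
    return [number
            for number, line in enumerate(smali_text.splitlines(), start=1)
            if MARKER in line]


def scan_app(smali_root: Path) -> Dict[str, List[int]]:
    """Scan every .smali file under smali_root and report the matching lines."""
    hits: Dict[str, List[int]] = {}
    for path in sorted(smali_root.rglob("*.smali")):
        text = path.read_text(encoding="utf-8", errors="replace")
        lines = find_dynamic_loading(text)
        if lines:
            hits[str(path)] = lines
    return hits


# Hypothetical Smali fragment used only to exercise the scanner.
SAMPLE = """\
.method public load()V
    const-string v1, "com.example.Payload"
    invoke-virtual {v0, v1}, Ldalvik/system/DexClassLoader;->loadClass(Ljava/lang/String;)Ljava/lang/Class;
    return-void
.end method
"""

print(find_dynamic_loading(SAMPLE))
```

A plain substring match of this kind only flags candidate applications; as in [30], a further analysis step is needed to decide whether the dynamically loaded code is actually malicious.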

In [4] the authors present a network-based behavioural analysis for identifying update attacks. For each application, they build a model depicting its specific traffic pattern, learned locally on the device. They evaluate the proposed system with 5 real and 10 self-written Android Trojan malware samples, obtaining two false alerts during the two evaluation days. This technique can be evaded by properly altering the profile of network communication.

With respect to these two papers, our solution detects the update attack directly rather than through indirect analysis, and it does not depend on the observation window. Moreover, this is the first paper that discusses the use of formal methods for detecting update attacks in Android apps.

5. CONCLUSIONS

Smartphones are spreading into a plethora of everyday activities. This is a very appealing scenario for malware writers, who have developed different techniques to inject malicious payloads into trusted applications, evading signature-based anti-malware. The most sophisticated technique able to evade current anti-malware technologies is the update attack, which embeds the malicious payload into the trusted application at run-time.

In this paper we investigate the effectiveness of model checking in discriminating the update attack families from trusted samples and zero-day attacks. Our results are encouraging: we obtain an accuracy very close to 1 in recognizing update attack families, i.e., the Plankton, AnserverBot, DroidKungFuUpdate and BaseBridge families.

6. REFERENCES

[1] T. Wang, K. Lu, L. Lu, S. Chung, and W. Lee, “Jekyll on ios: When benign apps become evil,” in Usenix Security, vol. 13, 2013.

[2] G. Canfora, F. Mercaldo, G. Moriano, and C. A. Visaggio, “Composition-malware: building android malware at run time,” in Availability, Reliability and Security (ARES), 2015 10th International Conference on, pp. 318–326, IEEE, 2015.

[3] L. Xing, X. Pan, R. Wang, K. Yuan, and X. Wang, “Upgrading your android, elevating my malware: Privilege escalation through mobile os updating,” in Security and Privacy (SP), 2014 IEEE Symposium on, pp. 393–408, IEEE, 2014.

[4] L. Tenenboim-Chekina, O. Barad, A. Shabtai, D. Mimran, L. Rokach, B. Shapira, and Y. Elovici, “Detecting application update attack on mobile devices through network features,” in Computer Communications Workshops (INFOCOM WKSHPS), 2013 IEEE Conference on, pp. 91–92, IEEE, 2013.

[5] J. Oberheide, “Android hax,” Proceedings of SummerCon (June 2010), 2010.

[6] A. Bellissimo, J. Burgess, and K. Fu, “Secure software updates: Disappointments and new challenges,” in HotSec, 2006.

[7] S. Poeplau, Y. Fratantonio, A. Bianchi, C. Kruegel, and G. Vigna, “Execute this! analyzing unsafe and malicious dynamic code loading in android applications,” in Proceedings of the 20th Annual Network & Distributed System Security Symposium (NDSS), 2014.

[8] C. Stirling, “An introduction to modal and temporal logics for ccs,” in Concurrency: Theory, Language, and Architecture (A. Yonezawa and T. Ito, eds.), LNCS, pp. 2–20, Springer, 1989.

[9] R. Milner, Communication and concurrency. PHI Series in computer science, Prentice Hall, 1989.

[10] R. Cleaveland and S. Sims, “The ncsu concurrency workbench,” in CAV (R. Alur and T. A. Henzinger, eds.), vol. 1102 of Lecture Notes in Computer Science, Springer, 1996.

[11] J. Woodcock, P. G. Larsen, J. Bicarregui, and J. S. Fitzgerald, “Formal methods: Practice and experience,” ACM Comput. Surv., vol. 41, no. 4, 2009.

[12] A. Santone, V. Intilangelo, and D. Raucci, “Application of equivalence checking in a loan origination process in banking industry,” in Enabling Technologies: Infrastructure for Collaborative Enterprises (WETICE), 2013 IEEE 22nd International Workshop on, pp. 292–297, 2013.

[13] M. Ceccarelli, L. Cerulo, G. De Ruvo, V. Nardone, and A. Santone, “Infer gene regulatory networks from time series data with probabilistic model checking,” in Formal Methods in Software Engineering (FormaliSE), 2015 IEEE/ACM 3rd FME Workshop on, pp. 26–32, IEEE, 2015.

[14] A. Filieri and G. Tamburrelli, “Probabilistic verification at runtime for self-adaptive systems,” in Assurances for Self-Adaptive Systems - Principles, Models, and Techniques, pp. 30–59, 2013.

[15] N. D. Francesco, A. Santone, and L. Tesei, “Abstract interpretation and model checking for checking secure information flow in concurrent systems,” Fundam. Inform., vol. 54, no. 2-3, pp. 195–211, 2003.

[16] A. Santone, G. Vaglini, and M. L. Villani, “Incremental construction of systems: An efficient characterization of the lacking sub-system,” Sci. Comput. Program., vol. 78, no. 9, pp. 1346–1367, 2013.

[17] A. Santone and G. Vaglini, “Abstract reduction in directed model checking CCS processes,” Acta Inf., vol. 49, no. 5, pp. 313–341, 2012.

[18] Y. Zhou and X. Jiang, “Dissecting android malware: Characterization and evolution,” in Proceedings of 33rd IEEE Symposium on Security and Privacy (Oakland 2012), IEEE, 2012.

[19] D. Arp, M. Spreitzenbarth, M. Huebner, H. Gascon, and K. Rieck, “Drebin: Efficient and explainable detection of android malware in your pocket,” in Proceedings of 21st Annual Network and Distributed System Security Symposium (NDSS), IEEE, 2014.

[20] M. Spreitzenbarth, F. Echtler, T. Schreck, F. C. Freiling, and J. Hoffmann, “Mobilesandbox: Looking deeper into android applications,” in 28th International ACM Symposium on Applied Computing (SAC), ACM, 2013.

[21] G. Canfora, A. Di Sorbo, F. Mercaldo, and C. Visaggio, “Obfuscation techniques against signature-based detection: a case study,” in Proceedings of Workshop on Mobile System Technologies, IEEE, 2015.

[22] J. Kinder, S. Katzenbeisser, C. Schallhart, and H. Veith, “Detecting malicious code by model checking,” Springer, 2005.

[23] F. Song and T. Touili, “Efficient malware detection using model-checking,” Springer, 2001.

[24] F. Song and T. Touili, “Pommade: Pushdown model-checking for malware detection,” in Proceedings of the 2013 9th Joint Meeting on Foundations of Software Engineering, ACM, 2013.

[25] F. Song and T. Touili, “Model-checking for android malware detection,” Springer, 2014.

[26] G. Jacob, E. Filiol, and H. Debar, “Formalization of viruses and malware through process algebras,” in International Conference on Availability, Reliability and Security (ARES 2010), IEEE, 2010.

[27] P. Battista, F. Mercaldo, V. Nardone, A. Santone, and C. A. Visaggio, “Identification of android malware families with model checking,” in International Conference on Information Systems Security and Privacy, SCITEPRESS, 2016.

[28] M. C. Grace, Y. Zhou, Z. Wang, and X. Jiang, “Systematic detection of capability leaks in stock android smartphones,” in NDSS, 2012.

[29] M. Grace, Y. Zhou, Q. Zhang, S. Zou, and X. Jiang, “Riskranker: scalable and accurate zero-day android malware detection,” in Proceedings of the 10th international conference on Mobile systems, applications, and services, pp. 281–294, ACM, 2012.

[30] A. A. Ilhan and S. Sevil, “Do you want to install an update of this application? A rigorous analysis of updated android applications,” in Cyber Security and Cloud Computing (CSCloud), 2015 IEEE 2nd International Conference on, pp. 181–186, IEEE, 2015.