towards automatic generationof security1c d a · natural4language4generation once!“a#gui#...

Post on 11-Aug-2020

3 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

TOWARDS AUTOMATIC GENERATION OFSECURITY-­‐CENTRIC DESCRIPTIONS FOR

ANDROID APPSMu  Zhang  (NEC  Labs)Yue  Duan (Syracuse  Univ.)Qian  Feng  (Syracuse  Univ.)Heng Yin  (Syracuse  Univ.)

What  an  app  claims  to  do What  the  app  actually  doesvs.

Permissions:  1) Hard  to  read.  

Felt  et  al.  (SOUPS’12)

2) Insufficient  to  tell  “HOW”

Motivation:  Limitation  of  App  Descriptions

Contacts

Phone

Internet

Device  ID

Textual  Desc.:Not  really  about  security.WHYPER  (Security’13)AutoCog (CCS’14)

Features

Pricing

Service

How-­‐to

Send  “Contacts”  over  “Internet”

DESCRIBEME:  Automatically  Deriving  Textual  Descriptions  from  Android  Program  Code

Existing  Work:  Automated  Java  Program  Summarization

• In  Software  Engineering  Context– Java  Methods  (ASE’10)– Method  Parameters  (ICPC’11)  – Classes (ICPC’13)  – Conditional  Statements  (ASE’10)  – Algorithmic    Structure  (ICSE’11)

• We  are  dealing  with  a  DIFFERENT problemExisting  Work DESCRIBEME

Purpose Review  legacy  code   Understand  security  risks

Audience Experienced  developers   Average  users

Desc.  Level Intra-­‐procedural,  structure-­‐based Whole-­‐program,   semantic-­‐level  

Security-­‐awareness1

Conciseness2

Human-­‐understandability3

Challenges  &  Requirements

Approach  Overview

Security-­‐awareness1 Conciseness2

Human-­‐understandability3Human-­‐

understandability3

Conciseness2

Human-­‐understandability3

Behavior  Graph

API  Node

API  Prototype

Context

Constant  set

Constant  setStatic  Analysis:  22K  LOC

Data  Dependency

Condition

Security-­‐awareness1

Condition  Analysis

• Extract  only  user-­‐aware conditions  – User  Interface– Device  Status– Natural  Environment

• Present  simple  logic  to  users– Equation/Inequation

Our  condition  analysis  is  focused  only  on  the  conditions  that  users  can  observe  and  evaluate.

Human-­‐understandability3

Subgraph  Mining

Conciseness2 Human-­‐understandability3

Graph  Compression:  Replace  the  subgraphs  with  single  nodes

Discovered  109  patterns  fundamentally  due  to  design  patterns  in  Android  apps.

Natural  Language  GenerationOnce  “a  GUI  component”,  “be”,  “clicked”

depending  on  if  “the  user”,  “select”,  “the  Button  ``Confirm’’  “

Once  a  GUI  component  is  

clicked

depending  on  if  the  user  selects  the  Button  ``Confirm’’

Description:  Once  a  GUI  component  is  clicked,  the  app  retrieves  your  phone  number,  and  encodes  the  data  into  format  “100/app_id=an1005/ani=%s/dest=%s/phone_number=%s/company=%s/”,  and  sends  data  to  network,  depending  on  if  the  user  selects  the  Button  “Confirm”.  

AggregationConciseness2

EVALUATION:  Correctness

Question  1: Is  generated  description  correct?

Total  # Correct Missing Desc. False  Statement

65 55 6 4

1. Points-­‐to  Analysis2. Exception  handling3. Reflection

Run  DESCRIBEME over  DroidBench

EVALUATION:  Security-­‐Awareness

Question  2: Developer’s  descriptions  cannot  faithfully  reflect  the  usage  of  permissions.  Can  we  do  better?

EVALUATION:  Improvement  of  Conciseness

Question  3:  Is  subgraph  mining  effective?

EVALUATION:  Readability

Question  4: Can  average  users  read  the  machine  generated  descriptions?

EVALUATION:  Human-­‐Understandability

Question  5:  Can  our  descriptions  help  users  avoid  risks?

App  Download  Rate w/  old  desc. w/  new  desc.

Malware 63.4% 24.7%

Privacy-­‐breaching 80.0% 28.2%

Clean 71.1% 59.3%

Conclusion

• We  propose  a  novel  technique  that  automatically  describes  security-­‐related  app  behaviors  to  the  end  users  in  natural  language.

• We  implement  DESCRIBEMEwhich  combines  program  analysis,  subgraph  mining  and  natural  language  generation to  create  security-­‐centric,  concise  and  human-­‐readable  descriptions.

Related  Work[1]  Sridhara et  al.,  Towards  Automatically  Generating  Summary  Comments  for  Java  Methods,  in  ASE’10.[2]  Buse et  al.,  Automatically  Documenting  Program  Changes,  in  ASE’10.[3]  Sridhara et  al.,  Automatically  Detecting  and  Describing  High  Level  Actions  Within  Methods,  in  ICSE’11.[4]  Sridhara et  al.,  Generating  Parameter  Comments  and  Integrating  with  Method  Summaries,  in  ICPC’11.[5]  Moreno  et  al.,  Automatic  Generation  of  Natural  Language  Summaries  for  Java  Classes,  in  ICPC’13.[6]  Pandita et  al.,  WHYPER:  Towards  Automating  Risk  Assessment  of  Mobile  Applications,  in  USENIX  Security’13[7]  Qu  et  al.,  AutoCog:  Measuring  the  Description-­‐to-­‐permission  Fidelity  in  Android  Applications

THANK YOU!

UI-related Triggering Conditions

• UI  Analysis:  – to  correlate  what  the  user  sees  to  what  the  app  does

19

<public  type=“id”  name=“send”

id=“0x7f040003”  />

res/values/public.xml

<string  name=“send_binarysms”>Send  binary  sms (to  port  8091)</string>

res/values/strings.xml<CheckBox android:id=

“@+id/send”  android:text=“@string/send_binarysms”/>

res/layout/main.xml

<id=“0x7f040003”,id  name  =  “send”>

<string  name=“send_binarysms”,  text=“Send  binary  sms (to  port  8091)”>

<id  name  =  “send”,  type=“CheckBox”  ,  string  name=“send_binarysms”>

<id=“0x7f040003”,type  =  “CheckBox”,

text =  “Send  binary  sms (to  port  8091)”>

Send  binary  sms (to  port  8091)

Subgraph  Mining

SmsManager.getDefault()

SmsManager.sendTextMessage()

1.  Singleton  Retrieval

divideMessage()

sendMultipartTextMessage()

2.  Workflow

getLastKnownLocation()

getLongitude()

3.  Hierarchical  Data

getLatitude()

Description  Model

• 3-­‐tuple  for  APIs– createFromPdu():  {“the  app”,  “retrieve”,  “incoming  SMS  message”}

• Manually  modeling  306  APIs  and  103  patterns• Guideline  for  Word  Selection– Straightforward– Distinguishable– Counterexamples:  

“Blow  into  the  mic  to  extinguish  the  flame  like  a  real  candle”“You  can  now  turn  recordings  into  ringtones”

Human-­‐understandability3

top related