Towards Automatic Generation of Security-Centric Descriptions for Android Apps
Mu Zhang (NEC Labs), Yue Duan (Syracuse Univ.), Qian Feng (Syracuse Univ.), Heng Yin (Syracuse Univ.)


Page 1: Title

TOWARDS AUTOMATIC GENERATION OF SECURITY-CENTRIC DESCRIPTIONS FOR ANDROID APPS

Mu Zhang (NEC Labs), Yue Duan (Syracuse Univ.), Qian Feng (Syracuse Univ.), Heng Yin (Syracuse Univ.)

Page 2: Motivation: Limitation of App Descriptions

What an app claims to do vs. what the app actually does

• Permissions (Contacts, Phone, Internet, Device ID)
– 1) Hard to read. Felt et al. (SOUPS’12)
– 2) Insufficient to tell “HOW”: Send “Contacts” over “Internet”

• Textual Desc. (Features, Pricing, Service, How-to)
– Not really about security. WHYPER (Security’13), AutoCog (CCS’14)

Page 3: DESCRIBEME

DESCRIBEME: Automatically Deriving Textual Descriptions from Android Program Code

Page 4: Existing Work: Automated Java Program Summarization

• In Software Engineering Context
– Java Methods (ASE’10)
– Method Parameters (ICPC’11)
– Classes (ICPC’13)
– Conditional Statements (ASE’10)
– Algorithmic Structure (ICSE’11)

• We are dealing with a DIFFERENT problem

            | Existing Work                      | DESCRIBEME
Purpose     | Review legacy code                 | Understand security risks
Audience    | Experienced developers             | Average users
Desc. Level | Intra-procedural, structure-based  | Whole-program, semantic-level

Page 5: Challenges & Requirements

1. Security-awareness
2. Conciseness
3. Human-understandability

Page 6: Approach Overview

[Figure: approach overview diagram, annotated with the requirements it addresses: (1) Security-awareness, (2) Conciseness, (3) Human-understandability]

Page 7: Behavior Graph

[Figure: behavior graph. Labels: API Node, API Prototype, Context, Constant set, Data Dependency, Condition]

Static Analysis: 22K LOC

Requirement addressed: (1) Security-awareness
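The figure labels above suggest a graph whose nodes are API calls and whose edges are data dependencies. The following is a minimal sketch of such a structure, not the authors' implementation. The class and field names, the choice to attach conditions to nodes, and the three API names (`TelephonyManager.getLine1Number()`, `String.format()`, `OutputStream.write()`) are assumptions made for illustration; only the format string and the "Confirm" condition are taken from the example description on the Natural Language Generation slide.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ApiNode:
    prototype: str                     # API prototype, e.g. a method signature
    context: str                       # where the call happens (assumed: an entry point)
    constants: frozenset = frozenset() # constant parameter values seen at the call
    conditions: tuple = ()             # conditions guarding the call (assumed placement)


@dataclass
class BehaviorGraph:
    nodes: list = field(default_factory=list)
    edges: list = field(default_factory=list)  # (src, dst) data dependencies

    def add_node(self, node):
        if node not in self.nodes:
            self.nodes.append(node)
        return node

    def add_dependency(self, src, dst):
        """Record that data produced by src flows into dst."""
        self.add_node(src)
        self.add_node(dst)
        self.edges.append((src, dst))

    def successors(self, node):
        return [dst for src, dst in self.edges if src == node]


# Hypothetical graph for "retrieve phone number, encode it, send it to network".
GET_NUMBER = ApiNode("TelephonyManager.getLine1Number()", "onClick")
ENCODE = ApiNode(
    "String.format()",
    "onClick",
    frozenset({"100/app_id=an1005/ani=%s/dest=%s/phone_number=%s/company=%s/"}),
)
SEND = ApiNode(
    "OutputStream.write()",
    "onClick",
    conditions=('the user selects the Button "Confirm"',),
)

GRAPH = BehaviorGraph()
GRAPH.add_dependency(GET_NUMBER, ENCODE)
GRAPH.add_dependency(ENCODE, SEND)
```

Keeping nodes immutable (`frozen=True`) lets them be compared and deduplicated by value, which is convenient when the same API call is reached along several paths.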

Page 8: Condition Analysis

• Extract only user-aware conditions
– User Interface
– Device Status
– Natural Environment

• Present simple logic to users
– Equation/Inequation

Our condition analysis is focused only on the conditions that users can observe and evaluate.

Requirement addressed: (3) Human-understandability
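As a rough illustration of the filtering idea on this slide, the sketch below keeps only conditions whose origin a user can observe and renders each as a simple equality or inequality. The `Origin` enum, the `Condition` fields, and the sample conditions (including the hypothetical `retry_count` and screen-lock checks) are assumptions, not part of the original tool.

```python
from dataclasses import dataclass
from enum import Enum


class Origin(Enum):
    USER_INTERFACE = "user interface"
    DEVICE_STATUS = "device status"
    NATURAL_ENVIRONMENT = "natural environment"
    INTERNAL = "internal program state"  # not observable by the user


# The three user-aware categories named on the slide.
USER_AWARE = {Origin.USER_INTERFACE, Origin.DEVICE_STATUS, Origin.NATURAL_ENVIRONMENT}


@dataclass(frozen=True)
class Condition:
    origin: Origin
    subject: str
    equals: bool   # True for an equation, False for an inequation
    value: str


def user_aware(conditions):
    """Drop conditions the user cannot observe or evaluate."""
    return [c for c in conditions if c.origin in USER_AWARE]


def render(condition):
    """Present the condition as simple logic in plain words."""
    relation = "is" if condition.equals else "is not"
    return f"{condition.subject} {relation} {condition.value}"


# Hypothetical conditions collected along one program path.
CONDITIONS = [
    Condition(Origin.USER_INTERFACE, 'the Button "Confirm"', True, "selected"),
    Condition(Origin.DEVICE_STATUS, "the screen", False, "locked"),
    Condition(Origin.INTERNAL, "retry_count", True, "3"),
]

VISIBLE = [render(c) for c in user_aware(CONDITIONS)]
```

Here the internal `retry_count` check is filtered out, since an end user has no way to evaluate it.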

Page 9: Subgraph Mining

Graph Compression: Replace the subgraphs with single nodes

Discovered 109 patterns fundamentally due to design patterns in Android apps.

Requirements addressed: (2) Conciseness, (3) Human-understandability

Page 10: Natural Language Generation

Tuples:
– Once “a GUI component”, “be”, “clicked”
– depending on if “the user”, “select”, “the Button ``Confirm’’”

Realized clauses:
– Once a GUI component is clicked
– depending on if the user selects the Button ``Confirm’’

Aggregation:
Description: Once a GUI component is clicked, the app retrieves your phone number, and encodes the data into format “100/app_id=an1005/ani=%s/dest=%s/phone_number=%s/company=%s/”, and sends data to network, depending on if the user selects the Button “Confirm”.

Requirement addressed: (2) Conciseness
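The two steps on this slide, realizing each tuple as a clause and then aggregating clauses into one sentence, can be sketched as below. This is a simplified illustration rather than the generator used by DESCRIBEME: the conjugation rules cover only third-person singular present tense, the function names are made up, and aggregation assumes all actions share the same subject. The encoding step from the slide's description is left out to keep the example short.

```python
IRREGULAR = {"be": "is", "have": "has"}


def conjugate(verb):
    """Third-person singular present tense for a base-form verb (simplified)."""
    if verb in IRREGULAR:
        return IRREGULAR[verb]
    if verb.endswith(("s", "sh", "ch", "x", "z", "o")):
        return verb + "es"
    if len(verb) > 1 and verb.endswith("y") and verb[-2] not in "aeiou":
        return verb[:-1] + "ies"
    return verb + "s"


def realize(subject, verb, obj):
    """Turn a (subject, verb, object) tuple into a clause."""
    return f"{subject} {conjugate(verb)} {obj}"


def aggregate(trigger, actions, condition):
    """Merge clauses into one sentence, stating the shared subject only once."""
    first, *rest = actions
    clauses = [realize(*first)]
    clauses += [f"and {conjugate(verb)} {obj}" for _, verb, obj in rest]
    return (
        f"Once {realize(*trigger)}, {', '.join(clauses)}, "
        f"depending on if {realize(*condition)}."
    )


DESCRIPTION = aggregate(
    ("a GUI component", "be", "clicked"),
    [
        ("the app", "retrieve", "your phone number"),
        ("the app", "send", "data to network"),
    ],
    ("the user", "select", 'the Button "Confirm"'),
)
```

Dropping the repeated subject ("the app") in later clauses is what keeps the aggregated sentence shorter than the sum of its parts.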

Page 11: EVALUATION: Correctness

Question 1: Is generated description correct?

Run DESCRIBEME over DroidBench

Total # | Correct | Missing Desc. | False Statement
65      | 55      | 6             | 4

1. Points-to Analysis
2. Exception handling
3. Reflection

Page 12: EVALUATION: Security-Awareness

Question 2: Developer’s descriptions cannot faithfully reflect the usage of permissions. Can we do better?

Page 13: EVALUATION: Improvement of Conciseness

Question 3: Is subgraph mining effective?

Page 14: EVALUATION: Readability

Question 4: Can average users read the machine generated descriptions?

Page 15: EVALUATION: Human-Understandability

Question 5: Can our descriptions help users avoid risks?

App Download Rate | w/ old desc. | w/ new desc.
Malware           | 63.4%        | 24.7%
Privacy-breaching | 80.0%        | 28.2%
Clean             | 71.1%        | 59.3%
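To make the comparison in the table easier to read, the short sketch below computes the drop in download rate, in percentage points, for each app category. The figures are copied from the table; the variable and function names are illustrative.

```python
# (old description, new description) download rates in percent, from the table.
RATES = {
    "Malware": (63.4, 24.7),
    "Privacy-breaching": (80.0, 28.2),
    "Clean": (71.1, 59.3),
}


def drop_in_points(old, new):
    """Difference between the two rates, in percentage points."""
    return round(old - new, 1)


DROPS = {category: drop_in_points(old, new) for category, (old, new) in RATES.items()}

for category, points in DROPS.items():
    print(f"{category}: {points} percentage points lower with new descriptions")
```

The drop for clean apps is much smaller than for malware and privacy-breaching apps.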

Page 16: Conclusion

• We propose a novel technique that automatically describes security-related app behaviors to the end users in natural language.

• We implement DESCRIBEME, which combines program analysis, subgraph mining and natural language generation to create security-centric, concise and human-readable descriptions.

Page 17: Related Work

[1] Sridhara et al., Towards Automatically Generating Summary Comments for Java Methods, in ASE’10.
[2] Buse et al., Automatically Documenting Program Changes, in ASE’10.
[3] Sridhara et al., Automatically Detecting and Describing High Level Actions Within Methods, in ICSE’11.
[4] Sridhara et al., Generating Parameter Comments and Integrating with Method Summaries, in ICPC’11.
[5] Moreno et al., Automatic Generation of Natural Language Summaries for Java Classes, in ICPC’13.
[6] Pandita et al., WHYPER: Towards Automating Risk Assessment of Mobile Applications, in USENIX Security’13.
[7] Qu et al., AutoCog: Measuring the Description-to-permission Fidelity in Android Applications, in CCS’14.

Page 18: THANK YOU!

Page 19: UI-related Triggering Conditions

• UI Analysis:
– to correlate what the user sees to what the app does

res/values/public.xml
<public type="id" name="send" id="0x7f040003" />
→ <id="0x7f040003", id name="send">

res/values/strings.xml
<string name="send_binarysms">Send binary sms (to port 8091)</string>
→ <string name="send_binarysms", text="Send binary sms (to port 8091)">

res/layout/main.xml
<CheckBox android:id="@+id/send" android:text="@string/send_binarysms"/>
→ <id name="send", type="CheckBox", string name="send_binarysms">

Combined:
<id="0x7f040003", type="CheckBox", text="Send binary sms (to port 8091)">

What the user sees: a CheckBox labeled "Send binary sms (to port 8091)"
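The join across the three resource files can be sketched with Python's standard `xml.etree.ElementTree` module. This is an illustration of the correlation step only, under stated assumptions: the snippets from the slide are wrapped in `<resources>` and `<LinearLayout>` root elements (with the Android namespace declared) so that they parse as standalone XML, and the function name `correlate_widgets` is made up for this example.

```python
import xml.etree.ElementTree as ET

ANDROID_NS = "http://schemas.android.com/apk/res/android"

# Snippets from the slide, wrapped in root elements so they parse on their own.
PUBLIC_XML = '<resources><public type="id" name="send" id="0x7f040003" /></resources>'
STRINGS_XML = (
    '<resources><string name="send_binarysms">'
    "Send binary sms (to port 8091)</string></resources>"
)
LAYOUT_XML = (
    f'<LinearLayout xmlns:android="{ANDROID_NS}">'
    '<CheckBox android:id="@+id/send" android:text="@string/send_binarysms"/>'
    "</LinearLayout>"
)


def correlate_widgets(public_xml, strings_xml, layout_xml):
    """Map each layout widget to its numeric id, widget type, and visible text."""
    # public.xml: id name -> numeric resource id
    ids = {
        e.get("name"): e.get("id")
        for e in ET.fromstring(public_xml).iter("public")
        if e.get("type") == "id"
    }
    # strings.xml: string name -> displayed text
    texts = {e.get("name"): (e.text or "") for e in ET.fromstring(strings_xml).iter("string")}

    widgets = []
    for elem in ET.fromstring(layout_xml).iter():
        raw_id = elem.get(f"{{{ANDROID_NS}}}id")
        if raw_id is None:
            continue  # element has no id, so code cannot refer to it by id
        id_name = raw_id.rpartition("/")[2]           # "@+id/send" -> "send"
        raw_text = elem.get(f"{{{ANDROID_NS}}}text", "")
        if raw_text.startswith("@string/"):
            text = texts.get(raw_text[len("@string/"):], "")
        else:
            text = raw_text                            # literal text in the layout
        widgets.append({"id": ids.get(id_name), "type": elem.tag, "text": text})
    return widgets


WIDGETS = correlate_widgets(PUBLIC_XML, STRINGS_XML, LAYOUT_XML)
```

With the numeric id as the key, a constant such as `0x7f040003` found in program code can be tied back to the label the user actually sees.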

Page 20: Subgraph Mining

1. Singleton Retrieval
– SmsManager.getDefault()
– SmsManager.sendTextMessage()

2. Workflow
– divideMessage()
– sendMultipartTextMessage()

3. Hierarchical Data
– getLastKnownLocation()
– getLongitude()
– getLatitude()
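Graph compression with a mined pattern such as "Singleton Retrieval" above can be sketched as follows. This is a deliberately simplified version: nodes are identified by API name (so each API is assumed to appear at most once in the graph), the pattern is given as a set of edges, and the upstream `TelephonyManager.getDeviceId()` node and the replacement label are hypothetical.

```python
def compress(edges, pattern_edges, label):
    """Replace one occurrence of a pattern subgraph with a single node.

    edges and pattern_edges are sets of (src, dst) pairs of node names.
    If the pattern is not fully present, the graph is returned unchanged.
    """
    if not pattern_edges <= edges:
        return set(edges)

    pattern_nodes = {node for edge in pattern_edges for node in edge}
    compressed = set()
    for src, dst in edges:
        new_src = label if src in pattern_nodes else src
        new_dst = label if dst in pattern_nodes else dst
        if new_src != new_dst:  # drop edges that were internal to the pattern
            compressed.add((new_src, new_dst))
    return compressed


# Pattern 1 from the slide: singleton retrieval followed by its use.
SINGLETON_SMS = {("SmsManager.getDefault()", "SmsManager.sendTextMessage()")}

# Hypothetical graph: a device ID flows into the SMS that is sent.
EDGES = {
    ("SmsManager.getDefault()", "SmsManager.sendTextMessage()"),
    ("TelephonyManager.getDeviceId()", "SmsManager.sendTextMessage()"),
}

COMPRESSED = compress(EDGES, SINGLETON_SMS, "send a text message")
```

After compression the graph has one fewer node, and the remaining edge still records that the device ID reaches the SMS-sending behavior.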

Page 21: Description Model

• 3-tuple for APIs
– createFromPdu(): {“the app”, “retrieve”, “incoming SMS message”}

• Manually modeling 306 APIs and 103 patterns

• Guideline for Word Selection
– Straightforward
– Distinguishable
– Counterexamples:
  “Blow into the mic to extinguish the flame like a real candle”
  “You can now turn recordings into ringtones”

Requirement addressed: (3) Human-understandability
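A description model of this kind can be thought of as a lookup table from API to 3-tuple. The sketch below holds only the one entry shown on the slide; keying the table by bare method name and the `lookup` helper that strips the class and parameter list are assumptions made for this example.

```python
# Manually written model: API -> (subject, verb, object).
# Only the createFromPdu() entry comes from the slide.
DESCRIPTION_MODEL = {
    "createFromPdu()": ("the app", "retrieve", "incoming SMS message"),
}


def lookup(signature):
    """Return the 3-tuple for a full method signature, or None if unmodeled."""
    # "pkg.Class.method(args)" -> "method"
    method = signature.split("(", 1)[0].rsplit(".", 1)[-1]
    return DESCRIPTION_MODEL.get(method + "()")


MODELED = lookup("android.telephony.SmsMessage.createFromPdu(byte[])")
UNMODELED = lookup("java.lang.String.format(String, Object[])")
```

Returning `None` for APIs outside the model makes it explicit which calls have no description yet.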