
Page 1: Protecting  Private Web  Content from Embedded Scripts

Protecting Private Web Content from Embedded Scripts

Yuchen Zhou, David Evans

http://www.cs.virginia.edu/DOMinator


Third-Party Scripts

Wall Street Journal, What They Know

49.95% of the responsive top 1 million sites used Google Analytics as of Aug 2010. [Wikipedia]

dictionary.com: 234 third-party scripts


All or nothing trust model

Dilemma of current web technologies

[Figure labels: Host content; Private content; iframe; SOP (same-origin policy)]


Threat Model

Content provider embeds third-party scripts directly in its webpages.

Adversary controls those scripts and may use any means to get confidential information:

- DOM APIs
- JavaScript variables/functions

High-level goal: add policies to host pages that restrict third-party scripts' privileges and prevent them from stealing private information.

Related Work

[Figure: chart of related systems. Labels: Browser Modification, No Browser Modification, Source Code rewriting, External Library; axis from Simple policies to Complex Policies.]

- AdJail [Louw et al., USENIX SEC '10]
- Caja [Miller et al., Tech report '08]
- MashupOS [Wang et al., SIGOPS '07]
- AdSafe [Douglas Crockford et al.]
- OMASH [Steven Crites et al., ACM CCS '08]
- JCShadow [Patil et al., ICDCS '11]
- BEEP [Jim et al., CCS '07]
- ESCUDO [Jayaraman et al., ICDCS '10]
- CSP

Two goals:
- Expressive and powerful policies
- Keep the developers' job simple

Overview

- Policy
- Enforcement
- Automated Policy Generation: Server-Provided, Automated Learner

Adversary techniques

- JavaScript DOM APIs: document.getElementById(...).innerHTML, document.cookie
- Access host script variables
- Call host script functions

Isolation Policies

JavaScript Execution Isolation

Isolation policy: 'worldID' attribute:

- Scripts with the same worldID execute in the same world (context).
- Scripts without a worldID are the most privileged (host scripts).

One-way access policy: 'sharedLibId' and 'useLibId' attributes:

- Scripts can share their global objects by specifying the 'sharedLibId' attribute.
- Scripts can use resources in a different world by specifying the 'useLibId' attribute.

<script worldID="string">

<script sharedLibId="string">

<script useLibId="string">
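The three attributes can be pictured with a toy model in plain JavaScript. This is only a sketch of the intended semantics, not the Chromium implementation: `createPage`, `runScript`, `globalsOf`, and `MAIN_WORLD` are hypothetical names, and each world is reduced to a plain object standing in for its global object.

```javascript
// Toy model of worldID / sharedLibId / useLibId; not the Chromium implementation.
// createPage, runScript, globalsOf, and MAIN_WORLD are hypothetical names.
const MAIN_WORLD = "main";

function createPage() {
  // One global object per world, plus a registry of shared libraries.
  return { worlds: new Map([[MAIN_WORLD, {}]]), libs: new Map() };
}

function globalsOf(page, worldID) {
  if (!page.worlds.has(worldID)) page.worlds.set(worldID, {});
  return page.worlds.get(worldID);
}

// attrs mirrors the <script> attributes; body stands in for the script code
// and receives only the global object of its own world.
function runScript(page, attrs, body) {
  const worldID = attrs.worldID || MAIN_WORLD; // no worldID: host script
  const globals = globalsOf(page, worldID);
  if (attrs.sharedLibId) {
    // Share this world's globals: visible to the host under the given name.
    page.libs.set(attrs.sharedLibId, globals);
    globalsOf(page, MAIN_WORLD)[attrs.sharedLibId] = globals;
  }
  if (attrs.useLibId && page.libs.has(attrs.useLibId)) {
    // Opt in to a library that another world has shared.
    globals[attrs.useLibId] = page.libs.get(attrs.useLibId);
  }
  return body(globals);
}

const page = createPage();
runScript(page, {}, (g) => { g.secretToken = "s3cret"; });
runScript(page, { worldID: "Analytics", sharedLibId: "GA" }, (g) => { g._gaq = []; });

// The host reaches the shared library by name ...
const hostSeesLib = runScript(page, {}, (g) => Array.isArray(g.GA._gaq));
// ... but the third-party world sees no host variables (one-way access).
const guestSeesSecret = runScript(page, { worldID: "Analytics" }, (g) => "secretToken" in g);
// A second third-party world can opt in to the shared library.
const adsSeesLib = runScript(page, { worldID: "Ads", useLibId: "GA" }, (g) => Array.isArray(g.GA._gaq));
console.log(hostSeesLib, guestSeesSecret, adsSeesLib);
```

In this model the host can read the analytics world through the shared name, while the analytics world has no path back to host variables.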


DOM APIs Access Control

DOM node access control list:

A script whose worldID does not appear in a DOM node's access control list cannot perform the corresponding actions on that node.

- RACL: the listed worlds may read the content/attributes of this node.
- WACL: the listed worlds may modify the content/attributes of this node.

<div RACL="worldID1; worldID2, etc.">

<div WACL="worldID1; worldID2, etc.">
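A minimal sketch of the access decision follows. It assumes that a node without an ACL attribute is accessible to host scripts only, and that both ';' and ',' separate world IDs (the slides use both). `parseAcl` and `mayAccess` are hypothetical helpers, and nodes are plain objects rather than real DOM elements.

```javascript
// Hypothetical helpers; only the attribute names RACL and WACL come from the slides.
function parseAcl(value) {
  // Accept both ';' and ',' as separators, since the slides use both.
  if (!value) return new Set();
  return new Set(value.split(/[;,]/).map((id) => id.trim()).filter(Boolean));
}

// node: plain object holding the attributes of a DOM element.
// worldID: null for a host script, otherwise the script's worldID string.
function mayAccess(node, action, worldID) {
  if (worldID === null) return true; // host scripts are the most privileged
  const acl = action === "read" ? node.RACL : node.WACL;
  return parseAcl(acl).has(worldID);
}

const publicDiv = { id: "public", RACL: "3rd-p", WACL: "3rd-p" };
const secretDiv = { id: "secret" }; // no ACL: only the host may touch it

console.log(mayAccess(publicDiv, "read", "3rd-p"));
console.log(mayAccess(secretDiv, "read", "3rd-p"));
console.log(mayAccess(secretDiv, "write", null));
```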

Annotated Page Example

Original HTML:

<html>
<body>
<div id='public'>Hello, world!</div>
<div id='secret'>This is a secret</div>
<script src='third-party.js'></script>
</body>
</html>

Annotated HTML:

<html>
<body>
<div id='public' RACL='3rd-p' WACL='3rd-p'>Hello, world!</div>
<div id='secret'>This is a secret</div>
<script src='third-party.js' worldID='3rd-p'></script>
</body>
</html>


Enforcement Overview

[Figure: Chromium architecture. Components: HTML response, HTML parser, ScriptController, script nodes (worldID), V8 JavaScript engine (World1, World2, World3), V8/WebKit bindings (policy checking, taint tracking, callback function), WebKit DOM implementation, DOM nodes (ACL), WebKit renderer. Steps are numbered 1 to 6.]

Isolated World

Adam Barth et al. [NDSS 2010]
- Goal: separate extension execution contexts
- Already built into Chromium's trunk code

DOM-JS 1-to-1 mapping

DOM-JS 1-to-n mapping

Our Goal: Apply this to page script isolation.

Dynamic Scripts

An eval() example:

<div id="secret">A secret</div>
<script worldID="untrusted1">
…
eval("alert(document.getElementById('secret').innerHTML)");
…
</script>

Original flow: V8 sees eval; compile string; wait for eval to finish; execute compiled code; done.

Modified flow adds three steps: record current worldID; enter evaluator's world; enter main world.
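The sketch below illustrates why the evaluator's world might be recorded. It assumes the concern is that dynamically compiled code could otherwise execute with main-world privileges; the slides do not spell this out. Everything here is hypothetical: `engine`, `runInWorld`, `readSecret`, and the two eval stand-ins are illustrative names, and a function stands in for the compiled string.

```javascript
// Toy model; engine, runInWorld, readSecret, and both eval stand-ins are hypothetical.
const engine = { currentWorld: "main" };

function runInWorld(worldID, fn) {
  const previous = engine.currentWorld;
  engine.currentWorld = worldID;
  try {
    return fn();
  } finally {
    engine.currentWorld = previous; // always return to the caller's world
  }
}

// Stand-in for a RACL check: only the main world may read the secret node.
function readSecret() {
  return engine.currentWorld === "main" ? "A secret" : null;
}

// Assumed unmodified behavior: compiled code runs in the main world.
function naiveEval(compiled) {
  return runInWorld("main", compiled);
}

// Modified behavior: record the evaluator's worldID and execute there.
function isolatedEval(compiled) {
  const evaluatorWorld = engine.currentWorld;  // record current worldID
  return runInWorld(evaluatorWorld, compiled); // enter evaluator's world
}

const leaked = runInWorld("untrusted1", () => naiveEval(readSecret));
const blocked = runInWorld("untrusted1", () => isolatedEval(readSecret));
console.log(leaked, blocked, engine.currentWorld);
```

Under these assumptions the naive path hands the secret to the untrusted world, while the recorded-world path keeps the evaluated code subject to the same checks as the script that called eval.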


Node ACL Enforcement

RACL enforcement:
- Hide the handle of the node.

WACL enforcement:
- Add mediation to the corresponding DOM APIs.

| Subject | Policy | Semantics |
|---|---|---|
| DOM node | <div RACL='d1;d2…'> | Worlds that may access this node |
| DOM node | <div WACL='d1;d2…'> | Worlds that may modify this node |
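Both enforcement styles can be sketched on a toy document object. This is an illustration only: the real enforcement lives in the browser's bindings, and `createDocument`, `allows`, and the explicit `worldID` parameter are hypothetical stand-ins for state the browser tracks internally.

```javascript
// Toy DOM with mediated accessors; all names here are hypothetical.
function allows(acl, worldID) {
  if (worldID === null) return true; // host script
  return (acl || "").split(/[;,]/).map((s) => s.trim()).includes(worldID);
}

function createDocument(nodes) {
  const byId = new Map(nodes.map((n) => [n.id, n]));
  return {
    // RACL enforcement: a world without read access never gets the handle.
    getElementById(worldID, id) {
      const node = byId.get(id);
      return node && allows(node.RACL, worldID) ? node : null;
    },
    // WACL enforcement: mediate the mutating DOM API before it runs.
    setAttribute(worldID, node, name, value) {
      if (!allows(node.WACL, worldID)) {
        throw new Error("world " + worldID + " may not modify #" + node.id);
      }
      node.attrs[name] = value;
    },
  };
}

const doc = createDocument([
  { id: "public", RACL: "d1;d2", WACL: "d1", attrs: {} },
  { id: "secret", attrs: {} },
]);

const hidden = doc.getElementById("d1", "secret");  // handle hidden from d1
const visible = doc.getElementById("d2", "public"); // d2 may read ...
let writeDenied = false;
try {
  doc.setAttribute("d2", visible, "title", "x");     // ... but not modify
} catch (err) {
  writeDenied = true;
}
doc.setAttribute("d1", visible, "title", "ok");
console.log(hidden, writeDenied, visible.attrs.title);
```

Hiding the handle covers reads because a script that never obtains the node cannot inspect it, whereas writes are checked at each mutating call.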


Hiding parts of DOM

DOM Element ACL policy

[Figure labels: DOM nodes A, B, C; V8 third-party scripts in World 1 and World 2; write mediation; DOM API implementation.]

V8 Callback Table:
- innerHTML: innerHTML_getter
- setAttribute: setAttribute
- removeChild: removeChild


Server-Provided Policy

Developers' manual effort:
- Requires significant effort
- Easy to forget
- Almost impossible for high-profile/dynamic sites

Web framework assisted:
- Declare policy once, automate the rest

GuardRails Integration

GuardRails is an extension for the Ruby on Rails framework that makes it easy for developers to define security policies by writing annotations.

GuardRails provides a taint-tracking system with character-level precision to trace sensitive data flows.

# @ :read_worlds, :name, ["World1"]
class Cart
...

Name: <span RACL="World1">SomeProduct</span><br/>
Description: <span RACL="World1, World2" WACL="World1, World2">Accessories for <b RACL="World1">Some Other Product</b></span>

Jonathan Burket, Patrick Mutchler, Michael Weaver, Muzzammil Zaveri, and David Evans. June 2011. GuardRails: A Data-Centric Web Application Security Framework. In 2nd USENIX Conference on Web Application Development (WebApps'11).


Policy learner workflow

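[Figure: the client browser sends a request with its cookie to a proxy. The proxy sends the server one request with the cookie and one without, receives Response 1 and Response 2, diffs and annotates them, and returns the annotated response to the browser.]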
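The "Diff & Annotate" step might look roughly like the sketch below. It assumes that content also served without the cookie is public and that public nodes receive both RACL and WACL for the third-party world, as in the annotated page example. `annotate` is a hypothetical function, and responses are modeled as arrays of `{ id, html }` objects instead of parsed HTML.

```javascript
// Hypothetical sketch of "Diff & Annotate"; responses are modeled as arrays of
// { id, html } nodes rather than parsed HTML.
function annotate(withCookie, withoutCookie, worldID) {
  // Content that is also served without the cookie is assumed to be public.
  const publicHtml = new Set(withoutCookie.map((node) => node.html));
  return withCookie.map((node) =>
    publicHtml.has(node.html)
      ? { ...node, RACL: worldID, WACL: worldID } // public: open to the third party
      : { ...node }                               // private: no ACL, host only
  );
}

const loggedIn = [
  { id: "banner", html: "<div>Hello, world!</div>" },
  { id: "inbox", html: "<div>3 new messages for alice</div>" },
];
const anonymous = [
  { id: "banner", html: "<div>Hello, world!</div>" },
  { id: "login", html: "<div>Please log in</div>" },
];

const annotated = annotate(loggedIn, anonymous, "3rd-p");
console.log(annotated.map((n) => n.id + ":" + (n.RACL || "private")));
```

Matching on exact content is the simplest possible diff; a real learner would need to compare DOM structure and tolerate content that changes between any two requests.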

Limitations of Automated Learning

Side effects of sending two requests:

- Double the traffic and significantly higher latency.
- Extra requests may cause undesired server state changes.


Experiments

Security

Compatibility

Policy inference

Compatibility Experiments

• Isolating the execution context of third-party scripts could cause problems on real-world websites.

• We tried the modified browser on 60 sites.

• We used our automatic policy learner to derive the policies for each site.
• We manually corrected third-party script identification errors made by the policy learner.

[Figure: the 60 sites were drawn from the Alexa.com top 10K sites, 20 from each of three rank ranges; axis marks at 1, 50, 300, 1K, and 10K.]

Compatibility Results

[Figure: pie chart of compatibility test outcomes; slices labeled 4, 4, 3, 3, and 46 = 23 + 23.]

- N/A: no third-party scripts, or SSL sites.
- Parser error: the Hpricot parser crashes on these pages.
- Malfunction: significant undesired JS errors detected.
- Policy violation: third-party scripts try to grab a handle to a private node.
- No error: no JavaScript console errors that we know of, after trivial tweaks for 23 (50%) sites. (Our policy learner is not perfect.)

Policy Learner Result

| Alexa ranking | 1-20 | 50-300 | 1000+ |
|---|---|---|---|
| Sample size | 13 | 11 | 18 |
| % of public nodes before login | 71.6% | 95.7% | 98% |
| % of public nodes after login | 52.6% | 78.4% | 83% |
| % of nodes switched from public to private after login | 26.6% | 18% | 15.3% |
| # of third-party scripts embedded | 0.84 | 2.61 | 2.2 |
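The "switched" row appears to be the relative drop in public nodes rather than the difference in percentage points. This reading is an assumption (the slides do not define the metric); the symbols below are introduced only for this check, which uses the rounded figures from the table:

```latex
\[
\text{switched} \approx \frac{p_{\text{before}} - p_{\text{after}}}{p_{\text{before}}},
\qquad
\frac{71.6 - 52.6}{71.6} \approx 26.5\%,
\quad
\frac{95.7 - 78.4}{95.7} \approx 18.1\%,
\quad
\frac{98 - 83}{98} \approx 15.3\%
\]
```

These values sit within rounding distance of the reported 26.6%, 18%, and 15.3%.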


Conclusion

Current web technologies don’t match site trust models.

We have made one step toward easier deployment of web security mechanisms.

- Automatic policy learning is hard without server side assistance.

- We hope the idea of integrating server framework extensions with client-side protection could be widely adopted.

Thank you! Questions?

Backup slides

One-way object access

• Host is entirely separated from third-party scripts.

<script type="text/javascript">
  var _gaq = _gaq || [];
  _gaq.push(['_setAccount', 'UA-XXXXX-X']);
  _gaq.push(['_trackPageview']);
  _gaq.push(['_addTrans',
    '1234',           // order ID - required
    'Acme Clothing',  // affiliation or store name
    '11.99',          // total - required
    '1.29',           // tax
    '5',              // shipping
    'San Jose',       // city
    'California',     // state or province
    'USA'             // country
  ]);
</script>

One-way object access

• In JavaScript, the window object is the global object from which all other objects are reachable.

• Two new attributes for script tags:

<script sharedLibId='string'>

<script useLibId='string'>

• The window object of a script with sharedLibId is injected into the main world as a custom object.

• Third-party scripts may use another party's script by adding a useLibId string.

One-way object access

• Host is entirely separated from third-party scripts.

<script src="GA.js" worldID="Analytics" sharedLibId="GA"></script>
<script type="text/javascript">
  var _gaq = GA._gaq || [];
  GA._gaq.push(['_setAccount', 'UA-XXXXX-X']);
  GA._gaq.push(['_trackPageview']);
  GA._gaq.push(['_addTrans',
    '1234',           // order ID - required
    'Acme Clothing',  // affiliation or store name
    '11.99',          // total - required
    '1.29',           // tax
    '5',              // shipping
    'San Jose',       // city
    'California',     // state or province
    'USA'             // country
  ]);
</script>


Node tainting

JavaScript

Immutable policy attributes

• All of the policy attributes above are made immutable to prevent malicious scripts from changing them.

Event handler

Normal event handling process:
- Register event handler
- ... and the event is triggered
- Event handler code executed

Modified event handling process:
- Register event handler
- Record host's worldID in a wrapper (event handler plus worldID)
- ... and the event is triggered
- Enter correct world
- Event handler code executed
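The modified process can be sketched as a wrapper that stores the worldID next to the handler. The sketch assumes the recorded worldID is that of the script registering the handler; `runtime`, `enterWorld`, `createTarget`, `addHandler`, and `dispatch` are hypothetical names, not browser APIs.

```javascript
// Toy model of the modified event handling; all names are hypothetical.
const runtime = { currentWorld: "main" };

function enterWorld(worldID, fn, arg) {
  const previous = runtime.currentWorld;
  runtime.currentWorld = worldID;
  try {
    return fn(arg);
  } finally {
    runtime.currentWorld = previous;
  }
}

function createTarget() {
  return { handlers: new Map() };
}

// Registration wraps the handler together with the registering worldID.
function addHandler(target, type, handler) {
  const wrapper = { worldID: runtime.currentWorld, handler };
  if (!target.handlers.has(type)) target.handlers.set(type, []);
  target.handlers.get(type).push(wrapper);
}

// Dispatch enters the recorded world before running each handler.
function dispatch(target, type, event) {
  const wrappers = target.handlers.get(type) || [];
  return wrappers.map((w) => enterWorld(w.worldID, w.handler, event));
}

const button = createTarget();
enterWorld("3rd-p", () => addHandler(button, "click", () => runtime.currentWorld));
addHandler(button, "click", () => runtime.currentWorld);

// The event fires while the main world is active, yet each handler runs in
// the world that registered it.
const worldsSeen = dispatch(button, "click", { type: "click" });
console.log(worldsSeen);
```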

Experiment Result - Security

• We constructed test cases according to the W3C standard for each defense mechanism we implemented. Example test cases include:

| Attack Type | Examples |
|---|---|
| Directly calling DOM API to get node handlers | document.getElementById(), nextSibling(), window.nodeID |
| Directly calling DOM API to modify nodes | nodeHandler.setAttribute(), innerHTML=, style=, nodeHandler.removeChild() |
| Probing host context for private variables/functions | referring to host variables, calling host functions, explicitly calling event handlers |
| Accessing special properties | document.cookie, open(), document.location |

Third-party scripts identification

Definition: any script that comes from an external domain. Inline scripts are considered trusted.

Host: Engadget.com
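A rough sketch of this definition, using the standard `URL` class, is shown below. `isThirdParty` and the `trustedHosts` parameter are hypothetical, and comparing host names by suffix is a simplification: a faithful notion of "external domain" would need a public-suffix list.

```javascript
// Hypothetical classifier; isThirdParty and the trusted list are illustrative.
function isThirdParty(scriptSrc, pageUrl, trustedHosts = []) {
  if (!scriptSrc) return false; // inline scripts are considered trusted
  const pageHost = new URL(pageUrl).hostname.replace(/^www\./, "");
  const scriptHost = new URL(scriptSrc, pageUrl).hostname;
  const sameSite = (host, site) => host === site || host.endsWith("." + site);
  if (sameSite(scriptHost, pageHost)) return false;
  // Whitelisted CDNs and common libraries count as first-party.
  return !trustedHosts.some((site) => sameSite(scriptHost, site));
}

const hostPage = "http://www.engadget.com/";
console.log(isThirdParty(null, hostPage));
console.log(isThirdParty("/js/site.js", hostPage));
console.log(isThirdParty("http://www.google-analytics.com/ga.js", hostPage));
console.log(isThirdParty("http://cdn.example.net/lib.js", hostPage, ["example.net"]));
```

The optional whitelist mirrors the kind of heuristic needed for CDNs and common libraries, which are served from other domains yet trusted by the site.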

Policy Learner Result

• Identifying third-party scripts
  • False positives
    • Content Delivery Networks (CDNs), mostly seen in top websites
    • JavaScript libraries (e.g., jQuery)
  • False negatives
    • Code snippets that assist a bigger script (e.g., Google Analytics)
    • Third-party scripts copied to the local server (rare cases)
  • Added heuristics
    • Add a whitelist for specific websites' CDNs and common JS libraries
    • Search for specific patterns in code snippets and mark them as third-party scripts
• Private node identification

Policy Violations

• Washingtonpost.com (fb)
• Imtalk.org (addthis)
• Mysql.com (some script, grab the 'logout' button)

Example Results – Sites Ranked 50-100

| Site | PublicNologin | PublicLogin | 3rd-p scripts; Compatibility Issues; Trusted Domain (in slide order) |
|---|---|---|---|
| Twitpic | 87/109 (83%) | 150/193 (77%) | Crowdscience, Scorecardresearch, Quantserve, Fmpub, gstatic; Guest variable inline access; Googleapis.com, twitter |
| washingtonpost | 1721/1722 (99%) | 1783/1975 (90%) | Facebook; Guest variable inline access, Policy violation |
| Digg | 934/967 (97%) | 652/1000 (65%) | Diggstatic.com, scorecardresearch; Facebook |
| Expedia | 748/814 (92%) | 746/814 (92%) | Intentmedia |
| Vimeo | 400/413 (97%) | 202/431 (47%) | Google Analytics, Quantserve; Vimeocdn.com |
| Statcounter | 457/457 (100%) | 137/190 (72%) | Doubleverify |
| Bit.ly | 102/105 (97%) | 86/121 (71%) | Twitter, Google Analytics; Guest variable inline access |
| Indeed.com | 126/128 (98%) | 120/129 (93%) | Jobsearch, Google Analytics, scorecardresearch; Policy violation |
| Yelp.com | 782/794 (98%) | 733/848 (86%) | Google Analytics; Yelpcdn.com |

References

• [1] Google Analytics market share. http://metricmail.tumblr.com/post/904126172/google-analytics-market-share

• [2] What they know. http://blogs.wsj.com/wtk/

• [3] Adam Barth, Adrienne Porter Felt, Prateek Saxena, and Aaron Boodman. Protecting Browsers from Extension Vulnerabilities. In 17th Network and Distributed System Security Symposium, 2010.