jigsaw: efﬁcient, low-effort mashup isolation · jigsaw: efﬁcient, low-effort mashup isolation...

Jigsaw: Efficient, Low-effort Mashup Isolation

James MickensMicrosoft Research

[email protected]

Matthew FinifterUniversity of California, Berkeley

[email protected]

Abstract

A web application often includes content from a va-riety of origins. Securing such a mashup applicationis challenging because origins often distrust each otherand wish to expose narrow interfaces to their privatecode and data. Jigsaw is a new framework for isolat-ing these mashup components. Jigsaw is an extension ofthe JavaScript language that can be run inside standardbrowsers using a Jigsaw-to-JavaScript compiler. Un-like prior isolation schemes that require developers tospecify complex, error-prone policies, Jigsaw leveragesthe well-understood public/private keywords from tradi-tional object-oriented languages, making it easy for a do-main to tag internal data as externally visible. Jigsawprovides strong iframe-like isolation, but unlike previ-ous approaches that use actual iframes as isolation con-tainers, Jigsaw allows mutually distrusting code to runinside the same frame; this allows scripts to share stateusing synchronous method calls instead of asynchronousmessage passing. Jigsaw also introduces a novel encap-sulation mechanism called surrogates. Surrogates allowdomains to safely exchange objects by reference insteadof by value. This improves sharing efficiency by elimi-nating cross-origin marshaling overhead.

1 Introduction

Unlike traditional desktop applications, web applicationsare often mashups: applications that contain code fromdifferent principals. These principals often have asym-metrical trust relationships with each other. For example,a page that generates localized news may receive datafrom a news feed component and a map component; theintegrating page may want to isolate both componentsfrom each other, and present them with an extremely nar-row interface to the integrator’s state. As another exam-ple, a social networking page might embed a third-partyapplication and an advertisement. The integrating page

may expose no interface to the advertisement. However,if the developer of the third-party application has signeda terms-of-use agreement, the integrating page may ex-pose a relatively permissive interface to its local state.

Given the wide range of trust relationships that existbetween web principals, it is challenging for developersto create secure mashups. Principals often want to sharewith each another, but in explicit and controlled ways.Unfortunately, JavaScript (the most popular client-sidescripting language) was not designed with mashup secu-rity in mind. JavaScript is an extremely permissive lan-guage with powerful reflection abilities but only crudesupport for encapsulation.

1.1 Previous Approaches

Given the increasing popularity of web services (andthe deficiencies of JavaScript’s isolation mechanisms),a variety of mashup isolation frameworks have emergedfrom academia and industry. Unfortunately, these frame-works are overly complex and present developers withan unwieldy programming model. Many of these ap-proaches [2, 11] force developers to use asynchronous,pass-by-value channels for cross-principal communi-cation. Asynchronous control flows can be difficultfor developers to write and understand, and automatedtools that convert synchronous control flows into asyn-chronous ones can introduce subtle data races (§2.1).Additionally, marshaling data over pass-by-value chan-nels like postMessage() can introduce high serial-ization overheads (§4.2).

Prior mashup frameworks also present developers withcomplex, overly expansive APIs for policy specification.For example, object views [11] require developers to de-fine policy code that runs during each property access ona shared object. Understanding how these filters com-pose across large object graphs can be difficult. Sim-ilarly, ConScript [12] policy files consist of arbitraryJavaScript code. This allows Conscript policies to be

extremely general, but as we demonstrate, such expres-sive power is often unnecessary. In many cases, securemashups require only two policy primitives: a simpleyes-no mechanism for marking JavaScript state as exter-nally sharable, and a simple grammar based on CSS andregular expressions that constrains how untrusted codecan access browser resources like persistent storage andthe visual display.

1.2 Our Solution: Jigsaw

In this paper, we introduce Jigsaw, a new framework formashup isolation. Jigsaw allows JavaScript code frommutually distrusting origins to selectively expose privatestate. Jigsaw’s design was driven by four goals:

Isolation by default: In Jigsaw, an integrating scriptincludes guest code which may hail from a differentorigin. The integrator has access to browser resources,and the integrator can provide its guests with access tosome portion of those resources. However, by default, aguest cannot generate network traffic, update the visualdisplay, receive GUI events, or access local storage.Similarly, each principal’s JavaScript namespace is hid-den from external code by default, and can be accessedonly via public interfaces that are explicitly defined bythe owning principal.

Efficient, synchronous sharing: Jigsaw eschewsasynchronous, pass-by-value sharing in favor of syn-chronous, pass-by-reference sharing. Inspired bytraditional object-oriented languages like Java and C++,Jigsaw code uses the public and private keywordsto indicate which data can be accessed by externaldomains. When an object is shared outside its localdomain, Jigsaw automatically wraps the object in asurrogate object that enforces public/private seman-tics. By inspecting surrogates as they cross isolationboundaries, Jigsaw can “unwrap” surrogates whenthey return to their home domain, ensuring that eachdomain accesses the raw version of a locally createdobject. Jigsaw also ensures that only one surrogateis created for each raw object. This guarantees thatreference-comparison == operations work as expectedfor surrogates. Using surrogates, Jigsaw can placemutually distrusting principals inside the same iframewhile providing iframe-style isolation and pass-by-reference semantics. Since principals reside withinthe same iframe, no postMessage() barrier mustbe crossed, which allows for true synchronous interfaces.

Simplicity: Using deny-by-default policies for browserresources like network access, and using the publicand private modifiers to govern access to JavaScript

namespaces, Jigsaw can express many popular typesof mashups—most require only a few lines of policycode and the explicit definition of a few public interfacemethods. In designing Jigsaw, we consciously avoidedmore complex isolation schemes like information flowcontrol, object views [11], and ConScript [12]. Whilethese schemes are more expressive than Jigsaw, theirinterfaces are unnecessarily complex for many of themashup patterns that are found in the wild.

Fail-safe legacy code: In most cases, regular JavaScriptcode that has not been adapted for Jigsaw will workas expected when used within a single domain. In allcases, unmodified legacy code will fail safely (i.e.,leak no data) when accessed by external domains.Jigsaw prohibits some JavaScript features like dynamicprototype manipulation, but these features are rarelyused by benevolent programs (and are potentiallyexploitable by attackers) [1, 11]. Jigsaw makes fewchanges to the core JavaScript language, and we believethat these changes will be understandable by the averageprogrammer, since the changes make JavaScript’s objectmodel behave more like that of a traditional, class-basedOO language like C#. Jigsaw preserves many of thelanguage features that make JavaScript an easy-to-usescripting language. For example, Jigsaw preservesclosures, first-class function objects, object literals, anevent-driven programming model, and pass-by-referencesemantics for all objects, not just those that are sharedwithin the same isolation domain.

2 Design

In Jigsaw, domains are entities that provide web content.In the context of the same-origin policy, a Jigsaw do-main corresponds to an origin, and we use the terms “do-main” and “origin” interchangeably. A principal is aninstance of web content provided by a particular domain.A principal may contain HTML, CSS, and JavaScript.Note that some principals might contain only JavaScript,e.g., a cryptographic library might only define JavaScriptfunctions to be invoked by external parties.

A user visits top-level web sites; each of these sitescan be an integrator principal. An integrator may includeanother principal Pi by explicitly downloading contentfrom Pi’s origin. In turn, Pi may include another princi-pal Pj . Figure 1 depicts the relationship between the userand this hierarchy of client-side principals. When thereis an edge from Pi to Pj , we refer to Pi as the includingprincipal or parent, and Pj as the included principal orchild.

User

Integrator

P1 P2P0

P3

Visits

Includes

Includes

Figure 1: An example of a principal hierarchy. The user visitsan integrator site that includes other principals. Each of thoseprincipals may include additional principals.

2.1 Boxes

Jigsaw places each principal in an isolation container thatwe call a box. Each box is associated with the followingresources:• A JavaScript namespace containing application-

defined objects and functions.• A DOM tree, which is a browser-defined data struc-

ture representing the HTML and CSS content be-longing to the principal.• An event loop, which captures mouse and keyboard

activity intended for that box.• A rectangular visual region with a width, a height,

a location within the larger browser viewport, and az-axis value.• A network connection, which allows the principal

to issue HTTP fetches for data.• A local storage area, which stores cookies and im-

plements the DOM storage abstraction.In some ways, a Jigsaw box resembles a traditionaliframe. Both provide a principal with a local JavaScriptnamespace, DOM tree, event loop, and visual field.However, Jigsaw boxes differ from iframes in three im-portant ways.

First, if two iframes share the same origin, theycan directly access each other’s JavaScript names-paces via frame references like window.parent andwindow.parent.frames. A Jigsaw box does notallow such unfettered cross-principal access. By default,two boxes have no way to communicate with each other,even if they belong to the same origin. This enables fault

//Download a script and place it inside a//new box. The return value of the script//is its principal object.var p = Jigsaw.createBox(’http://x.com/box.js’,

{network: /(x|y)\.com/,storage: true, dom: null});

//Call a public method defined by the box.//The pass-by-reference of localData and//result is made safe by the use of//surrogates.var result = p.f(localData);

Figure 2: An integrator creating and interacting with a box.The box can exchange network traffic with servers from x.comand y.com; it can also access its origin’s DOM storage, but itcannot access the integrator’s DOM. Surrogates are explainedin Section 2.6.2.

isolation and privilege separation for different principalsthat originate from the same domain. To allow cross-box communication, each principal must explicitly de-fine public functions on a principal object. By exchang-ing principal objects with each other, boxes define the setof external domains with which they communicate, andthe operations that these domains may invoke on privatestate.

A second difference between iframes and Jigsawboxes is that boxes use nesting relationships to moretightly constrain the resources of children. A page’s top-most Jigsaw box is given a full visual field that is equiv-alent to the entire browser viewport. The root box is alsogiven the maximal network permissions allowed by thesame-origin policy. By default, descendant boxes lackaccess to non-computational browser resources. For ex-ample, child boxes cannot issue network requests, andthey cannot access the visual field (and thus they can-not receive GUI events from the user). A parent boxmay delegate a region of its visual field to a child. Sim-ilarly, a parent can grant a child box a portion of its net-work permissions. In both cases, parent-child delegationhas monotonically increasing strictness, i.e., a parent cannever give a child a larger visual field or more network-ing permissions than the parent has. Figure 2 shows anexample of how an integrator creates a new box. Notethat DOM storage ACLs are unique because the browsergives each origin a local storage area. Thus, an integra-tor can grant a child access to the child domain’s localstorage, or completely prohibit such accesses. However,the integrator cannot directly expose its own local stor-age to a child domain (although it can define a methodon its principal object that mediates access to its DOMstorage).

The final difference between iframes and boxes in-volves communication channels. Applications in dif-ferent iframes communicate with the asynchronous

egate a visual field, Jigsaw sets the box’s DOM tree tonull, and prevents the child principal from adding DOMnodes to it. Otherwise, the child may update its visualfield and receive GUI events for that field by modifyingthe DOM tree in the standard way. Jigsaw ensures thatthe visual updates respect the constraints defined by theparent.

A visual field consists of a width, a height, a locationwithin the parent’s visual field, and a z-order. Parentsspecify these parameters using CSS-style syntax. Thus, achild can have a static visual geometry, or one that flowsin dynamic ways, e.g., to occupy a percentage of the par-ent’s visual field, regardless of how the parent is resized.

Visual parameters are associated with each principalobject (e.g., principal.height). A parent can dy-namically change a child’s visual field by writing to thesefields. If a child wants to change its visual field, it canalso try to write to these fields. However, the changesmust be validated by the parent. A child write to a vi-sual field parameter fires a special Jigsaw event in theparent called childVisualFieldRequest. If theparent has registered a handler for this event, the handlerinspects the changes requested by the child. If the parentapproves the change, the handler returns true, otherwiseit returns false. Jigsaw will only implement the change ifthe parent has defined such a handler, the handler returnstrue, and the change would not place the child’s visualfield outside the one owned by the parent.

Similar to the Gazelle browser [24], Jigsaw requiresvisually overlapping boxes to be opaque with respect toeach other. In other words, boxes cannot request trans-parent blending of their overlapping visual region—thebox with the higher z-order occludes all others in thestack. This prevents a malicious box from making it-self transparent, creating a child box containing a victimpage, and then collecting GUI events that the user in-tended to send to the victim box.

Even with these protections, a principal must still trustits ancestors in the principal hierarchy. This is because amalicious parent can virtualize a child’s runtime (§3) insubversive ways, or not create a child at all. Jigsaw canonly guarantee that parents are protected from descen-dants, and that sibling principal hierarchies are protectedfrom each other if the shared parent is non-malicious.

2.4 Network Access

A principal uses HTTP requests to communicate withremote servers. The principal’s parent controls the setof servers that are actually reachable. Like visual fieldpermissions, network privileges nest in a monotonicallyrestrictive way. The most expansive privilege is “*”,which means that a principal can fetch any resource thatis allowed by the same-origin policy. Parents can also

restrict children to a limited set of accessible domains.Parents can specify a group of related domains using astraightforward wildcard syntax, e.g., *.foo.com orcache.*.bar.com. As a syntactic shortcut, a parentcan specify the “self” domain to indicate that the childcan communicate with servers from the child’s origin.Similarly, the “parent” token resolves to the parent’s ori-gin.

2.5 Local Storage

In HTML5, the DOM storage abstraction allows eachorigin to maintain a client-side key/value database. Eachdatabase can be accessed only by JavaScript code fromthe associated origin. Jigsaw partitions DOM storage inthe same way. If principals from different origins want toexchange data from their respective DOM storage areas,they must do so via public interface methods.

2.6 The JavaScript Namespace

In traditional JavaScript, objects are dictionaries thatmap property strings to values. Using JavaScript’sextremely permissive reflection interface, a programcan dynamically enumerate an object’s properties andread, write, or delete those properties. Unlike standardclass-based languages like Java and C#, JavaScript usesprototype-based inheritance. A prototype object is anexemplar which defines the property names and defaultproperty values for other instances of that object. By set-ting an object’s proto property to the exemplar, theobject becomes an instance of the prototype’s class. Bysetting the proto fields of prototype objects, onecreates inheritance hierarchies. By default, an object’sproperty list is dynamic, so an instance of a particularprototype can dynamically gain additional properties thatare not defined by the prototype. An object’s protofield is just another property, meaning that an object’sclass can dynamically change as well.

These default reflection semantics are obviously un-suited for cross-domain encapsulation. JavaScript doesallow a limited form of data hiding using closures, whichare functions that can access a hidden, non-reflectablenamespace. Unfortunately, closures are an imperfectsubstrate for cross-domain sharing. They cannot beshared across iframes, and within an iframe, each closurehas unfettered access to the DOM tree, event loop, andnetwork resources that belong to the enclosing frame.Furthermore, closures are clumsy to program and main-tain, since the hidden closure variables are implicitly ob-scured via lexical scoping instead of explicitly markedvia a special keyword. Jigsaw provides simpler, strongerencapsulation using boxes and the public/ privatekeywords.

rogate will never call createSurrogate() for thatproperty. As we show in Section 4.2, this lazy evaluationis beneficial when boxes share enormous object graphsbut only touch a fraction of the objects. Using lazy eval-uation, Jigsaw never devotes computational resources toprotect objects that are shared but never accessed.

For each public method belonging to obj, the sur-rogate defines a wrapper function whose this pointeris bound to obj. Thus, even if external code assignsthe surrogate method to be a property on another object,the method will always treat obj as its this pointer.This prevents attacks in which a malicious box subvertsa method’s intended semantics by supplying an inappro-priate this object.

When a surrogate function is invoked, it callscreateSurrogate() on all of its arguments beforepassing those arguments to the underlying function. Thesurrogate function also calls createSurrogate()on the underlying function’s return value, and returnsthat surrogate object to the original caller of the surro-gate function.

In summary, surrogates automatically protect cross-box data exchanges. Since principal objects are surro-gates, and boxes can be accessed only via their principalobjects, Jigsaw ensures that all cross-box interactions re-spect public/private semantics.

2.6.3 Predefined JavaScript Objects

The browser predefines a set of JavaScript objects thatlive in each box’s JavaScript namespace. The most im-portant predefined object is the DOM tree; others pro-vide support for regular expressions, mathematical func-tions, and so on. Jigsaw virtualizes the DOM tree ineach box (§3), redirecting the box’s DOM operations toa Jigsaw-controlled data structure that performs securitychecks before reflecting operations into the real DOM.Jigsaw also ensures that constructor functions for glob-ally shared built-in objects like regular expressions areprivate and immutable. These safeguards prevent a ma-licious box from arbitrarily manipulating the visual dis-play, or redefining constructor functions that are used byall boxes.

2.6.4 Cross-box Events

Box X may wish to register one of its functions as anevent handler in a different box Y . To do so, X sim-ply passes the handler to Y via Y ’s public interface; Ycan then register the handler with the browser’s event en-gine in the standard way. When the relevant event in Yoccurs, the browser executes the handler like any other.However, the browser passes a scrubbed event object tothe handler. This event does not contain references to Y ’sprivate-by-default JavaScript namespace. This prevents

information leakage via foreign event handlers. Like allforeign methods, the handler executes in the JavaScriptcontext of the box that created it.

2.7 Client-side Communication Privileges

A parent can restrict the principal objectsthat are visible to a child. By default, achild can reference its parent’s principal ob-ject (Jigsaw.getParentPrincipal()),the principal objects of its own children (theJigsaw.principals array), and the princi-pal objects of other boxes from its own domain(Jigsaw.getSameDomainPrincipals()). Jig-saw associates each principal with a unique id, anda parent can restrict a child’s access to a subset ofprincipals ids.

2.8 Dropping Privileges

Jigsaw allows a box to voluntarily restrict the networkingprivileges that it received from its parent. A box can alsorelinquish the right to a visual field, or abandon the abil-ity to write to DOM storage. Privilege can drop only in amonotonically decreasing fashion. For example, if a par-ent gives a child unrestricted network access, the childcannot restrict its privileges to only foo.com and thenunrestrict itself.

2.9 Summary

Jigsaw provides a robust encapsulation technique forcross-principal sharing. All data is accessible by refer-ence, but all data is implicitly hidden from external par-ties unless it is explicitly declared as public by theowning principal. Parents define resource permissionsfor the execution contexts of their children. Such per-missions become monotonically stricter as the principalnesting depth increases.

Jigsaw does enforce some restrictions on the stan-dard JavaScript language. In particular, a surrogatenever exposes the prototype object or constructor func-tion for the underlying object. Jigsaw also preventsbox code from tampering with the prototypes of globalbuilt-in objects like Array. While this prevents boxesfrom changing externally defined prototype chains, well-written JavaScript code rarely uses such tricks, and al-lowing such behavior allows malicious boxes to launchprototype poisoning attacks [1, 11] against other boxes.Despite these restrictions, Jigsaw preserves many fea-tures of the standard JavaScript language. For exam-ple, Jigsaw supports closures, first-class function objects,object literals, an event-driven programming model, andpass-by-reference semantics for all objects, not just thoseshared within the same domain.

3 Implementation

Our Jigsaw implementation consists of a Jigsaw-to-JavaScript compiler and a client-side JavaScript library.The compiler parses Jigsaw code using an ANTLRtoolchain [20]. A custom C# program adds static se-curity checks to the resulting ASTs, and then translatesthe modified ASTs to JavaScript code that a browsercan execute. The emitted JavaScript code also containsthe client-side Jigsaw library, which implements runtimesecurity checks and defines box management interfaceslike Jigsaw.createBox().

The rewriter modifies every object creation so thateach object receives a unique integer id. This al-lows the Jigsaw library to maintain a mapping fromraw objects to the associated surrogates, ensuring thatcreateSurrogate() makes at most one surrogatefor each raw object.1

The rewriter also tags each object with the id ofits creating box (the Jigsaw library defines an internalgetCurrentBoxId() function to which the rewritercan insert a call). This tag allows the Jigsaw library todetermine whether a surrogate is being passed to its orig-inal box—if so, Jigsaw “unwraps” the surrogate, return-ing the backing object instead of the surrogate (§2.6.2).To allow createSurrogate() to determine the cur-rently executing box context, the rewriter modifies allfunction definitions such that on entry, a function pushesits box id onto a stack, and on exit, a function pops thestack.

The rewriter translates public and private prop-erty declarations into operations on a per-object map ofvisibility metadata. The rewriter assigns such a map toeach object at object creation time. Using property de-scriptors [19], Jigsaw ensures that per-object metadatais immutable and cannot be modified by a malicious orbuggy box.

The Jigsaw library is responsible for creating newboxes. To do so, the library uses an eval() statement todynamically load the (rewritten) box code. However, theeval() call is invoked within the context of a specialJigsaw function that defines aliasing local variables forstandard global properties like window, document,and so on. The Jigsaw-defined aliases implement a virtu-alized browser environment that forces a box’s commu-nication with the outside world to go through Jigsaw’ssecurity validation layer. For example, Jigsaw’s virtualXMLHttpRequest object ensures that a box’s AJAXrequests satisfy the security policies defined by the box’sparent. Similarly, the virtual DOM tree is backed by abranch of the real DOM tree, but virtual operations are

1To prevent this data structure from hindering garbage collection,Jigsaw requires weak maps, whose design is being finalized for thenext version of JavaScript [16].

not reflected into the real tree unless they satisfy the par-ent box’s security policy. For dangerous functions likeeval(), and for sensitive internal Jigsaw functions, Jig-saw creates null virtualizations that do nothing or throwexceptions on access. As with all things JavaScript, thereare various subtleties in the virtualization process that weelide due to space constraints.

Our current Jigsaw prototype implements the bulk ofthe design from Section 2. The primary exception is fullDOM tree virtualization. This is still a work-in-progressdue to the complexity of the DOM interface.

4 Evaluation

Jigsaw’s goal is to provide an efficient, developer-friendly isolation framework. In this section, we describeour experiences with porting preexisting JavaScript li-braries to Jigsaw; we then evaluate the performance ofthe modified libraries. We show that porting legacycode to Jigsaw is straightforward, that Jigsaw’s pass-by-reference surrogates are much more efficient than pass-by-value marshaling, and that Jigsaw’s dynamic secu-rity checks are similar in performance to those of otherrewriting-based systems like Caja [15].

4.1 Porting Effort

We found that many preexisting JavaScript libraries al-ready had an implicit notion of “public” and “private”;thus, porting these libraries to Jigsaw primarily con-sisted of making implicit visibility settings explicit viathe public and private keywords. For example, atinitialization time, many libraries add a single new ob-ject to the global JavaScript namespace, and use that ob-ject’s properties as the high-level interface to the librarycode. This object serves as a de facto principal object (al-though it has none of the security protections afforded byJigsaw). To port libraries like this to Jigsaw, we first de-clared the de facto gateway object to be the Jigsaw prin-cipal object for the library. We marked that object’s prop-erties with the public keyword. We then identified thepublic properties of other library objects with the helpof an instrumented version of the Jigsaw runtime. Foreach surrogate that crossed a box boundary, the instru-mented runtime logged the public and private propertiesfor the surrogate’s backing object; the runtime also mod-ified each surrogate so that all foreign box accesses toprivate properties on the backing object threw an imme-diate exception instead of returning undefined.

The surrogate log and the fail-stop exceptions on pri-vate property accesses made it easy to identify legacycode properties that needed to be marked as public. Port-ing was also simplified because we did not have to worryabout asynchronous control flows as in PostMash [2].

0

5

10

15

Depth: 4, ObjBr: 2,FuncBr: 2



Tim

e t

o P

ass

Ob

ject

G

rap

h (

ms)

Serialize by value

Surrogates

155 nodes 315 nodes

635 nodes

Figure 7: When mutually distrusting domains must shareobjects with each other, Jigsaw’s pass-by-reference surro-gate mechanism is much faster than pass-by-value marshaling(Depth: depth of shared object tree, ObjBr: Number of ob-ject properties per object, FuncBr: Number of function prop-erties per object).

We also did not need to explicitly insert object saniti-zation calls—in contrast to Caja [15], which requires de-velopers to invoke a tame() sanitization function wher-ever an object crosses a trust boundary, Jigsaw automat-ically creates surrogates when “raw” objects travel be-tween boxes. We provide a more detailed comparison ofJigsaw, PostMash, and Caja in Section 5.

4.2 Performance

In this section, we use microbenchmarks and Jigsaw-modified applications to evaluate Jigsaw’s performanceoverheads. All of the results were generated on aWindows 7 PC with 4 GB of RAM and a dual-core 3.2GHz processor. All web pages were executed in theFirefox 8.0.1 browser.

Sharing overheads: In mashup frameworks that usepostMessage(), isolation containers share data byasynchronously exchanging immutable strings. If con-tainers wish to share more complex objects, each end-point must implement a marshaling protocol that serial-izes objects on sends and deserializes objects on receives.The marshaling costs can be high, particularly if con-tainers wish to share functions, since the recipient has todynamically compile each shared function’s source codeusing eval().

When a sender passes an object to a recipient, thesender actually shares an object graph that includes allof the objects that are recursively reachable from theroot. Figure 7 depicts the sharing cost for synthetic ob-ject graphs of various sizes. In these experiments, thegraphs were trees. The “Depth” metric corresponds tothe height of the tree. The “ObjBr” (object branch) met-ric indicates how many of an object’s properties refer-enced other objects; similarly, the “FuncBr” (functionbranch) metric indicates how many properties pointed tofunctions. Functions had no child objects or functions,i.e., they were leaf nodes in the object tree.

0

5

10

15

JSON-RPC DOM-SQL AES encrypt Mousemove

Slo

wd

ow

n f

acto

r

Figure 8: Jigsaw’s extra security tests add between 0x–12x per-formance overhead (the dotted line indicates a slowdown fac-tor of 1, i.e., no slowdown). Jigsaw’s performance overheadsare similar to those [10, 22] of other rewriting-based mashupframeworks like Caja [15].

As Figure 7 demonstrates, passing an object graph be-tween isolation containers using surrogates has almostzero overhead. When the root of an object graph ispassed across a box boundary, Jigsaw must create anew surrogate object for it; however, the only cost isan object creation (for the surrogate object itself) anda function call for each public property on the back-ing object (to create the getter/setter property descrip-tors (§2.6.2)). In contrast, marshaling pass-by-value datais much more expensive, since the entire object graphmust be traversed, serialized, and then deserialized, withcostly eval() operations on the receiver-side to recre-ate shared functions. Note that the results in Figure 7 donot include postMessage() overhead, i.e., the senderand the receiver were in the same frame. Thus, these re-sults are a conservative estimate of the marshaling costsin a postMessage() system.

In Jigsaw, passing large object trees across isolationbarriers is efficient because surrogates are lazily createdwhen a box actually tries to access a foreign object.Thus, at sharing time, initially only one surrogatemust be created for the root of the object graph. Inpass-by-value systems, once an object graph has beenrecreated by the receiver, property accesses on the graphare just as cheap as regular property access on locallycreated objects. In contrast, accessing a property througha Jigsaw surrogate introduces the overhead of invoking agetter or setter. Such accesses are roughly 30 times moreexpensive than a regular property access. However, webelieve that this cost is acceptable for three reasons.First, it is only paid upon accessing external objects;it is not paid for objects that are never accessed byexternal domains, nor is it incurred when a box accessesits locally declared objects. Second, since Jigsaw’sinitial sharing cost is O(1) in the size of the objectgraph to share, Jigsaw reduces the initial sharing costsof non-trivial graphs by multiple orders of magnitudecompared to a pass-by-value solution. Modern webapplications already have object graphs containing

hundreds of thousands of objects [14], so for complexmashups exchanging complex object graphs, the initialsharing cost of pass-by-value will be unattractive due tothe computational latencies. Third, unlike pass-by-valuesystems, surrogates allow mashups to communicatesynchronously using pass-by-reference semantics. Thismakes developing mashups much easier, and makesJigsaw’s property access penalties easier to bear.

End-to-end performance: In addition to performingchecks during property accesses, Jigsaw must perform avariety of bookkeeping tasks, e.g., assigning box ids andobject ids to newly created objects, and interposing onaccesses to virtualized browser resources to ensure thatboxes adhere to the security policies established by theirparents. Figure 8 shows the end-to-end performanceslowdown for several Jigsaw-enabled applications. Theslowdown is normalized with respect to the performanceof the baseline, non-Jigsaw-enabled applications. Thedotted line indicates a slowdown of 1, i.e., a situation inwhich Jigsaw adds no performance overhead. We exam-ined the following applications:• JSON-RPC [9] is a JavaScript library that layers

an RPC protocol atop an AJAX connection. Wedefined the performance of a JSON-RPC sessionas the integrator-perceived completion rate of nullRPCs that performed no actions at a localhost RPCserver. Using a localhost server instead of a remoteone eliminated the impact of network delay and al-lowed us to focus on Jigsaw’s CPU overhead.• The DOM-SQL [3] library provides a SQL interface

to DOM storage. We defined the performance ofDOM-SQL as the number of rows that the integratorcould insert into a table per second. Each insertioncaused a synchronous write to DOM storage.• The AES encryption function belongs to the Stan-

ford JavaScript crypto library [21]. Performancewas defined as the throughput with which the inte-grator could feed plaintext to the library and receiveciphertext.• Mousemove is a simple benchmark library which

registers a handler for mouse movement in the in-tegrator’s DOM. A human user moves the mouseback and forth as quickly as possible, and the li-brary increments a counter every time that its han-dler fires. Performance was defined as the numberof times that the handler fired.

In the Jigsaw version of each application, the library codewas placed in a separate box from the integrator, and allintegrator-integratee communication took place througha principal object or a virtualized DOM resource.

Figure 8 shows that Jigsaw’s security checks cause a 0-12x slowdown. The slowdown is application-dependent.For example, in the Mousemove test, the rate at which

the browser fired mouse handlers was slow enough thatJigsaw’s security overhead was hidden. The JSON-RPCtest was similar, since the rate at which AJAX callbacksfired was also slow enough to hide Jigsaw’s overhead. Incontrast, in the encryption test and the DOM-SQL test,the applications were rarely blocked on external browseractivity. Instead, these programs executed many small,application-defined functions. This incurred a lot of Jig-saw bookkeeping overhead, since Jigsaw had to updateits internal call stack for each function invocation andreturn (§3). Nevertheless, Jigsaw’s performance over-heads were similar to those of other rewriting systemslike Caja [10, 22].

5 Related Work

There are many preexisting frameworks for securingmashup applications. As we describe in more detail be-low, these systems present very different programmingmodels to developers. At a high level, Jigsaw differsfrom all of these systems due to its focus on providingsimple, efficient isolation mechanisms. Jigsaw is simplebecause it defines a concise ACL language for browserresources, a straightforward public/private distinc-tion for JavaScript properties, and an automatic surro-gate mechanism that transparently protects cross-domaindata exchanges while preserving synchronous functioncall semantics. Jigsaw is efficient because its pass-by-reference surrogates avoid the marshaling overhead thatafflicts pass-by-value systems.

ADsafe, FBJS, and Dojo Secure: ADsafe [6] andDojo Secure [25] use a language subsetting approach,forcing guest code to be written in a restricted portionof the larger JavaScript language. In contrast, Jigsaw al-lows guest code to be written in a larger, more expres-sive subset. This makes it easier for developers to portlegacy applications to Jigsaw (and write new Jigsaw ap-plications from scratch). However, Jigsaw does pay aperformance penalty due to the dynamic security checksthat are required to secure the larger language subset.

FBJS [8] uses rewriting to prepend guest codeidentifiers with a unique random prefix. This ef-fectively isolates the guest code from the integrator.FBJS allows guest code to interact with its parentthrough a restricted, virtualized API, e.g., through callsto a VirtDOMnode.getParentNode() methodinstead of through direct accesses to the parent’sDOMnode.parentNode property.

Broadly speaking, FBJS, ADsafe, and Dojo Securepresent a similar architectural model: strict guest isola-tion with a narrow, predefined interface between isola-tion containers. In contrast, Jigsaw allows principals todefine their own public interfaces.

Caja: Like Jigsaw, Caja [15] is a rewriting system thatplaces scripts inside virtualized execution environments.Caja defines a tame(obj) function that makes objsafe to pass to untrusted JavaScript contexts. tame()performs many of the security checks described in Sec-tion 2.6.2. For example, it prevents dynamic proto-type manipulation, and it ensures that methods cannotbe called with arbitrary this references.

Jigsaw differs from Caja in several important ways. InJigsaw, objects and their properties are invisible to ex-ternal domains by default. Developers use the publickeyword to mark an interface as externally visible; visi-bility annotations are defined as part of the interface dec-laration. In contrast, Caja’s visibility metadata is man-aged at interface sharing time instead of interface dec-laration time. In Caja, programmers must remember toinvoke tame(obj) before obj is passed across an iso-lation barrier. This makes a program’s security proper-ties more difficult to understand, since developers can nolonger reason about how an object can be accessed with-out inspecting all of the places at which the object crossesan isolation boundary. In contrast, Jigsaw’s declaration-time visibility modifiers provide clearer, more central-ized indications of object access rights. Jigsaw’s sur-rogate mechanism also provides automatic “taming” asobjects flow between boxes. This eliminates the devel-oper burden of having to manually tame objects at shar-ing time. It also guarantees that taming takes place allof the time, regardless of whether the developer remem-bered to tame an object.

Also note that Caja’s primary goal is to make it easyfor an integrating page to incorporate untrusted scripts—guest-to-guest interactions are of secondary importance.Thus, host-to-guest communication is straightforward,but guest-to-guest interactions must be mediated by thehost. This requires the integrator to define and managea shared communication infrastructure. In contrast, Jig-saw’s goal is to make it easy for arbitrary execution con-texts to communicate through restricted interfaces. Us-ing principal objects, any two contexts in Jigsaw can ex-change information. Furthermore, the integrator is nolonger responsible for managing cross-script communi-cation. Instead, each script implements its own cross-boxprotocols.

Secure ECMAScript (SES): Secure ECMAScript(SES) [17] uses newly standardized features of EC-MAScript 5 [7] to provide Caja-like isolation withoutrequiring Caja-like rewriting and runtime virtualization.By pushing dynamic security checks into the JavaScriptengine, SES can potentially offer dramatic reductions inthe costs of these checks. SES is still being formulated,but once ECMAScript 5 becomes widespread, Jigsawcan use SES techniques to implement its security abstrac-tions.

Pass-by-value systems: Systems like PostMash [2]use iframes as isolation containers, and implement cross-domain communication using the asynchronous, pass-by-value postMessage() call. Such isolation frame-works have several drawbacks. First, there is signif-icant marshaling overhead if domains share non-trivialobject graphs (§4.2). To avoid this penalty, domains cankeep local copies of object graphs and synchronize viewsacross iframes. However, this approach still requires fre-quent exchanges of synchronization messages. In Post-Mash, this message traffic induced a 60% performancedecrease in a Google Maps mashup [2].

A second drawback of these systems is that they relyon an asynchronous channel for cross-domain commu-nication. Asynchronous communication is an awkwardfit for many mashup applications; for example, it is ill-suited for the integration of computationally intensivemodules that implement databases, cryptographic op-erations, input sanitizers, image manipulation routines,and so on. Rewriters can transform asynchronous oper-ations into quasi-synchronous ones using continuation-passing [11]. However, many programmers find it diffi-cult to reason about continuations. Furthermore, contin-uations can introduce subtle race conditions that are notpresent in truly synchronous environments (§2.1).

Browsers give each iframe a separate execution thread.Thus, iframe isolation does have the advantage that a par-ent can make forward progress if a child is hung or com-putationally intensive. In Jigsaw, each box resides withinthe same iframe, so a misbehaving child can intentionallyor inadvertently perform a denial-of-service attack on itsparent. We do not view this as a major disadvantage ofJigsaw, since modern browsers allow users to terminateunresponsive scripts via a pop-up warning dialog.

Object views: Object views [11] let developers spec-ify policy code that controls how objects are sharedacross isolation boundaries. Policy code is written inthe full JavaScript language and is attached to view ob-jects that mediate external access to private backing ob-jects. This security model is very expressive, but webelieve that it is unnecessarily complex (and thereforeerror-prone). Jigsaw’s public and privatemodifierspresent a more intuitive programming model, allowingdevelopers to express simple “yes-no” disclosure poli-cies. In contrast, when a developer writes an object viewpolicy, she must reason about the execution context thatinitiates an access request, and how context-specific fac-tors should influence data exposure. Jigsaw’s visibilityidentifiers act as explicit, declaration-time indications ofvisibility policy.

IFC: Information flow control (IFC) systems likeJif [13] assign security labels to variables, allowing de-velopers to precisely specify the data that principals

should be allowed to read or write. We eschewed anIFC mashup architecture for two reasons. First, a well-known problem with IFC is that programmers are loathto generate the required security annotations. In contrast,simple visibility modifiers like public and privatehave not triggered a similar level of complaint. IFC-stylelabels are also ill-suited for governing access to browserresources. For example, it is difficult to use labels to ex-press policies like “give a principal update rights to theleftmost 30% of the visual display.” Jigsaw can easilyexpress such a policy using a simple CSS-style rule.

ConScript: ConScript [12] uses a modified browserengine to enforce security. Integrators restrict the behav-ior of guests by attaching policy code to key executionpoints, e.g., the invocation of a function or the loadingof a new script. Like object view policies, ConScriptpolicies are written in arbitrary JavaScript and can be ex-tremely expressive. However, as mentioned above, Jig-saw strives to provide simple, developer-friendly secu-rity policies, and we have found that in practice, Jigsaw’ssimpler policies are sufficient to express many kinds ofmashup architectures.

OMash: Like Jigsaw, OMash [5] allows each princi-pal to define a public set of functions that other principalscan invoke. However, OMash does not have a private-by-default visibility policy, nor does it wrap objects inproxies before handing them to external domains. Thus,public OMash functions expose an ostensibly narrow in-terface, but their return values can expose sensitive data.For example, the caller of a public OMash function canperform arbitrary JavaScript reflection on the propertiesof the returned object (and any other objects reachablefrom that root). If the caller modifies any of this data, themodifications will be visible in the data’s source domain.

MashupOS: MashupOS [23] provides a new set ofisolation abstractions for web browsers. In MashupOS,a service instance is a browser-side analogue of a tradi-tional OS process. Each instance gets a partitioned set ofhardware resources like CPU and memory, and commu-nicates with other instances using asynchronous, pass-by-value messages. Jigsaw eschews such a communi-cation style in favor of synchronous, pass-by-referencemessaging. This necessitates a mechanism like surro-gates (§2.6.2) for securely exchanging objects across iso-lation boundaries.

CommonJS Modules: CommonJS [4] defines a mod-ule system for JavaScript. CommonJS gives each librarya protected namespace and the ability to define externalinterfaces. However, CommonJS namespaces are imple-mented using closures. Thus, unlike Jigsaw boxes, Com-monJS namespaces do not protect against attacks likeprototype poisoning [1, 11]. CommonJS also does notprovide strong notions of public and private data. Thus,

as in OMash, the return values from public functions caninadvertently leak private module data.

6 Conclusion

Jigsaw is a new mashup framework for web applica-tions. It allows mutually distrusting content providers todefine narrow public interfaces for their private client-side state. Jigsaw strives to be developer-friendly, soit eschews the complicated security policies of priormashup frameworks; instead, Jigsaw uses the publicand private keywords to mark data as externally vis-ible or domain-private. Jigsaw’s security semantics arethus easily understandable to programmers who are fa-miliar with popular languages like Java that also usepublic/private distinctions.

Prior mashup frameworks often isolate state usingiframes or iframe-like abstractions. These isolationcontainers force domains to communicate using asyn-chronous pass-by-value channels. In contrast, Jigsaw’snovel surrogate mechanism allows domains to pass ob-jects by reference using synchronous function calls. Thismakes it easier for developers to reason about cross-origin sharing, since accessing a locally defined objector function looks no different than accessing an objector function that has been shared by an external domain.Pass-by-reference surrogates are also more efficient thanpass-by-value approaches because surrogates do not in-cur marshaling overhead when they travel between do-mains.

Our evaluation shows that existing web applicationsare easily ported to the Jigsaw framework. Our evalua-tion also demonstrates that Jigsaw has similar or betterperformance than prior mashup schemes.

Acknowledgments

We would like to thank David Wagner and the anony-mous reviewers for their comments on earlier versionsof this paper. This material is based upon work partiallysupported by a NSF Graduate Research Fellowship. Anyopinions, findings, conclusions, or recommendations ex-pressed here are those of the authors and do not neces-sarily reflect the views of the NSF.

References

[1] ADIDA, B., BARTH, A., AND JACKSON, C. Rootkits forJavaScript Environments. In Proceedings of the 3rd USENIXWorkshop on Offensive Technologies (WOOT) (2009), USENIXAssociation.

[2] BARTH, A., JACKSON, C., AND LI, W. Attacks on JavaScriptMashup Communication. In Web 2.0 Security and Privacy(2009).

[3] BOERE, P. dom-storage-query-language: A SQL inspired in-terface for DOM Storage. http://code.google.com/p/dom-storage-query-language/.

[4] COMMONJS. Modules/1.1.1 Specification, February 2012.http://wiki.commonjs.org/wiki/Modules/1.1.1.

[5] CRITES, S., HSU, F., AND CHEN, H. OMash: Enabling SecureWeb Mashups via Object Abstractions. In Proceedings of the15th ACM Conference on Computer and Communications Secu-rity (2008), ACM, pp. 99–108.

[6] CROCKFORD, D. ADsafe. http://www.adsafe.org.

[7] ECMA INTERNATIONAL. Standard ECMA-262: EC-MAScript Language Specification, 5.1 Edition, June2011. http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-262.pdf.

[8] FBJS (FACEBOOK JAVASCRIPT). http://developers.facebook.com/docs/fbjs.

[9] GHERARDI, G. jsonrpcjs. https://github.com/gimmi/jsonrpcjs.

[10] GOOGLE. google-caja: Performance of cajoled code.http://code.google.com/p/google-caja/wiki/Performance.

[11] MEYEROVICH, L., FELT, A., AND MILLER, M. Object Views:Fine-grained Sharing in Browsers. In Proceedings of the 19thInternational Conference on World Wide Web (2010), ACM,pp. 721–730.

[12] MEYEROVICH, L., AND LIVSHITS, B. ConScript: Specifyingand Enforcing Fine-grained Security Policies for JavaScript in theBrowser. In Proceedings of the IEEE Symposium on Security andPrivacy (2010), IEEE, pp. 481–496.

[13] MEYERS, A. C., ZHENG, L., ZDANCEWIC, S., CHONG, S.,AND NYSTROM, N. Jif: Java Information Flow, July 2001.http://www.cs.cornell.edu/jif.

[14] MICKENS, J., ELSON, J., HOWELL, J., AND LORCH, J. Crom:Faster Web Browsing Using Speculative Execution. In Proceed-ings of NSDI (2010).

[15] MILLER, M., SAMUEL, M., LAURIE, B., AWAD, I., AND STAY,M. Caja: Safe active content in sanitized JavaScript. Googlewhite paper. http://google-caja.googlecode.com/files/caja-spec-2008-06-07.pdf.

[16] MILLER, M. S. ES Wiki: harmony:weak maps, May2011. http://wiki.ecmascript.org/doku.php?id=harmony:weak_maps.

[17] MILLER, M. S. SES (Secure EcmaScript), May 2011.https://code.google.com/p/es-lab/wiki/SecureEcmaScript.

[18] MIX, N. Narrative JavaScript. http://www.neilmix.com/narrativejs/doc/.

[19] MOZILLA DEVELOPER NETWORK. Object.defineProperty().https://developer.mozilla.org/en/JavaScript/Reference/Global_Objects/Object/defineProperty.

[20] PARR, T. The Definitive ANTLR Reference. Pragmatic Book-shelf, Raleigh, North Carolina, 2007.

[21] STARK, E., HAMBURG, M., AND BONEH, D. Symmetric Cryp-tography in JavaScript. In Proceedings of the Annual Com-puter Security Applications Conference (ACSAC) (2009), IEEE,pp. 373–381.

[22] SYNODINOS, D. ECMAScript 5, Caja and RetrofittingSecurity: An Interview with Mark S. Miller, Febru-ary 25 2011. http://www.infoq.com/interviews/ecmascript-5-caja-retrofitting-security.

[23] WANG, H., FAN, X., HOWELL, J., AND JACKSON, C. Pro-tection and Communication Abstractions for Web Browsers inMashupOS. In Proceedings of SOSP (October 2007).

[24] WANG, H., GRIER, C., MOSHCHUK, A., KING, S., CHOUD-HURY, P., AND VENTER, H. The Multi-Principal OS Con-struction of the Gazelle Web Browser. In Proceedings of the18th USENIX Security Symposium (2009), USENIX Association,pp. 417–432.

[25] ZYP, K. Secure Mashups with dojox.secure, August2008. http://www.sitepen.com/blog/2008/08/01/secure-mashups-with-dojoxsecure/.

jigsaw: efﬁcient, low-effort mashup isolation · jigsaw: efﬁcient, low-effort mashup isolation...

Documents