
Post on 19-Dec-2015


Extensible Semantics for XrML

Vicky Weissman

Joint work with Joe Halpern

The big picture

A policy says that under certain conditions an action, such as downloading a file, is permitted or forbidden.

Digital content providers want to write policies about how their works may be accessed, and to have their policies enforced.

Diverse apps – same need

Because we can’t regulate access to online content with precision:

Digital libraries can’t put certain content online; it might violate IP laws.

The Greek Orthodox Archdiocese of America is wary of defamation.

Cultural traditions aren’t respected. (Australian Aboriginal communities often restrict access to a clan or gender.)

XrML to the rescue

XrML is a language for writing policies.

Syntax is XML-based.

Semantics is given in 2 ways:

1. An English interpretation of the syntax.

2. An English description of an algorithm that says whether a set of XrML policies implies a permission.

Bottom line: write policies in XrML, enforce using the algorithm.

Industry likes XrML

XrML is endorsed by Adobe, Hewlett-Packard, Microsoft, Xerox, Barnesandnoble.com, MPEG International Standards Committee…

Microsoft and others plan to make XrML-compliant products.

Will tomorrow’s OS, DVD player, … enforce XrML policies?

Improving XrML

The MPEG International Standards Committee generalized some of the concepts and extended the language (slightly).

They released their version in March 2003. We refer to the 2003 version as XrML.

They released another version in 2004, which we discuss later in this talk.

XrML Shortcomings

No formal semantics.

Policies can be ambiguous.

The interpretation of the syntax doesn’t quite match the algorithm.

The algorithm’s behavior on some (realistic) input is unintuitive and unintended by the language designers. E.g., if Alice is a student and every student may eat lunch, may Alice? The algorithm says no.

Improving XrML (cont.)

Fix the algorithm to match the developers’ intent.

Translate XrML policies to formulas in modal first-order logic. Formal semantics is given in a fairly standard way. This lets us compare XrML with languages in the CS literature, and borrow complexity results, extensions,…

Prove our translation matches the algorithm: the algorithm says policies imply a permission iff the translated policies imply the translated permission.

XrML syntax

XrML is an XML-based language.

XrML policies are verbose.

We present a syntax that is more concise and easy to map to XrML syntax.

Syntax

Principals: agents (e.g., Alice, Bob).

Resources: digital content (e.g., a movie, an article).

Rights: actions (e.g., play, edit).

Properties: describe a principal (e.g., adult, trusted).

Syntax (cont.)

Statements: Stmt ::= Pr(p) | Perm(p, r, s) | true

Pr(p) means principal p has property Pr.

Perm(p, r, s) means p is permitted to exercise right r over resource s.


Syntax (cont.)

grant ::= ∀x1 … ∀xn (Stmt ∧ … ∧ Stmt → Stmt)

The conjunction to the left of → is the condition; the statement to the right is the conclusion. If the condition holds, then the conclusion holds.

In our fragment, grants are closed (no free variables).

license ::= (grant, principal)

(g, p) means p issues/says g.

Recall: Stmt ::= Pr(p) | Perm(p, r, s) | true

Examples

Can write:

`Joe is a professor’ as true → Prof(Joe), and

Vicky says `Every professor who gives a talk may have a cookie’ as (∀x (Prof(x) ∧ GivesTalk(x) → Perm(x, eat, cookie)), Vicky).
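The concise syntax maps naturally onto a few record types. The following is a minimal Python sketch, not part of XrML or of the paper; the class names Stmt, Grant, and License and their field names are assumptions made for illustration. It encodes the two example policies from this slide.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Stmt:
    """A statement: Pr(p), Perm(p, r, s), or the constant true."""
    pred: str                    # e.g. "Prof", "Perm", or "true"
    args: Tuple[str, ...] = ()   # principals, rights, resources, or variables


@dataclass(frozen=True)
class Grant:
    """forall variables (condition[0] and ... and condition[n] -> conclusion)."""
    variables: Tuple[str, ...]
    condition: Tuple[Stmt, ...]  # conjunction of statements
    conclusion: Stmt


@dataclass(frozen=True)
class License:
    """(g, p): principal p issues (says) grant g."""
    grant: Grant
    issuer: str


TRUE = Stmt("true")

# `Joe is a professor' as: true -> Prof(Joe)
joe_is_prof = Grant((), (TRUE,), Stmt("Prof", ("Joe",)))

# Vicky says `Every professor who gives a talk may have a cookie'
cookie_grant = Grant(
    ("x",),
    (Stmt("Prof", ("x",)), Stmt("GivesTalk", ("x",))),
    Stmt("Perm", ("x", "eat", "cookie")),
)
vickys_license = License(cookie_grant, "Vicky")
```

Frozen dataclasses are hashable, so grants and licenses can be placed directly into the sets L and G that the algorithm takes as input.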


Principals – in some detail

The set of principals is the set of primitive principals (e.g., Alice, Bob) closed under union. E.g., Alice ∪ Bob is a principal, often written as {Alice, Bob}.

According to the XrML doc, {p1, …, pn} represents p1, …, pn “acting together as one holistic identified entity”. But what does this mean?


Groups/members relationship

Suppose that Alice has property PrA and group {Alice, …} has property Prg. What should we infer?

Option 1: nothing.

Option 2: {Alice, …} has property PrA.

Option 3: Alice has property Prg.

XrML chooses each of these options (at different points in the specification).

No formal semantics → language is inconsistent!

Our fix

Since XrML is inconsistent, we do not assume that a group has the properties of its members or vice versa.

But can easily write policies to force either relationship (or both).

The syntax given here is a fragment of XrML. (See paper for details.)

XrML Algorithm

Query(s, L, G) algorithm:

s is a closed statement.

L is a set of licenses.

G is a set of grants that implicitly hold.

Returns true if s “follows” from L and G.

Query calls Auth and Holds.

Auth Algorithm

Recall: s is a closed statement, L is a set of licenses, and G is a set of grants that implicitly hold.

A condition is a conjunction of statements.

Auth(s, L, G) returns a set D of closed conditions; s “follows” from L and G if a condition in D “holds”.

Holds Algorithm

Holds(d, L) algorithm:

d is a closed condition.

L is a set of licenses.

Returns true if d “follows” from L.

Query(s, L, G) overview

Query(s, L, G):
  Set D to the output of Auth(s, L, G).
  Return ⋁_{d∈D} Holds(d, L).

s is a closed statement, L is a set of licenses, and G is a set of grants that hold implicitly.

s “follows” from L and G if a condition in the output of Auth(s, L, G) “holds”.

Holds(d, L) returns true if d “follows” from L.


Problem

Let g = true → Student(Alice) and g’ = ∀x (Student(x) → Perm(x, eat, lunch)).

May Alice eat lunch? Query(Perm(Alice, eat, lunch), ∅, {g, g’})

Query calls Auth(Perm(Alice, eat, lunch), ∅, {g, g’}).

Auth returns {Student(Alice)}.

Query calls Holds(Student(Alice), ∅). (Lost g!)

Holds returns false; so, Query returns false.

The fix

To correct the problem, pass G to Holds and modify Holds to use the new info.

The bug is easy to find and easy to fix, but still made it into the released March 2003 version of the spec.
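The lost-grant bug and its fix can be reproduced in a few lines. This is a toy Python sketch under strong assumptions, not the algorithm from the XrML specification: licenses are ignored, statements are plain tuples, and holds only recognizes conclusions of unconditional, variable-free grants. The names match, auth, holds, query_2003, and query_fixed are hypothetical.

```python
# Statements are tuples such as ("Student", "Alice").
# A grant is (variables, condition, conclusion); the condition is a tuple of
# statements, and the empty tuple stands for the condition "true".

def match(pattern, stmt, variables):
    """Return a binding that makes pattern equal to the closed stmt, or None."""
    if len(pattern) != len(stmt):
        return None
    binding = {}
    for p, s in zip(pattern, stmt):
        if p in variables:
            if binding.setdefault(p, s) != s:
                return None
        elif p != s:
            return None
    return binding


def auth(s, licenses, grants):
    """Conditions under which s follows from one grant in G (toy: L unused)."""
    conditions = set()
    for variables, condition, conclusion in grants:
        binding = match(conclusion, s, variables)
        if binding is not None:
            conditions.add(tuple(
                tuple(binding.get(term, term) for term in c) for c in condition
            ))
    return conditions


def holds(d, licenses, grants=()):
    """Toy check: every conjunct is the conclusion of an unconditional grant."""
    facts = {concl for variables, cond, concl in grants
             if not variables and not cond}
    return all(c in facts for c in d)


def query_2003(s, licenses, grants):
    # G is not passed on to holds, so g is lost.
    return any(holds(d, licenses) for d in auth(s, licenses, grants))


def query_fixed(s, licenses, grants):
    # The fix: pass G to holds as well.
    return any(holds(d, licenses, grants) for d in auth(s, licenses, grants))


g = ((), (), ("Student", "Alice"))
g_prime = (("x",), (("Student", "x"),), ("Perm", "x", "eat", "lunch"))
lunch = ("Perm", "Alice", "eat", "lunch")
```

With these definitions, query_2003(lunch, set(), [g, g_prime]) evaluates to False while query_fixed on the same input evaluates to True, mirroring the slide.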


Another bug

Query(s, ∅, {∀x (Perm(p, issue, x) → s)})

Query calls Auth on the same input.

Auth returns {Perm(p, issue, g) | g is a grant}.

Recall: Auth output is a set of closed conditions. Query calls Holds on each returned condition.

The set of grants is infinite: g_0 = true → Student(Alice), g_i = true → Perm(Bob, issue, g_{i-1}), i = 1, …

D is an infinite set; so, Query doesn’t terminate.
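To see why the set of grants is infinite, the family g_0, g_1, … can be enumerated lazily. The sketch below is illustrative only; the function name grant_family and the string rendering of grants (with square brackets around a nested grant) are assumptions, not XrML notation.

```python
from itertools import count, islice


def grant_family():
    """Yield g_0, g_1, ... where g_i = true -> Perm(Bob, issue, g_{i-1})."""
    g = "true -> Student(Alice)"
    yield g
    for _ in count(1):
        g = f"true -> Perm(Bob, issue, [{g}])"
        yield g


first_three = list(islice(grant_family(), 3))
```

Each grant strictly contains the previous one, so no two are equal and the generator never runs out. A procedure that must call Holds once per grant therefore cannot finish.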

Our fix

Restrict the grants in the language.

If a grant g has a condition d, d mentions a resource variable x, and x is free in d, then x is free in g’s conclusion.

Easy to prove that if the restriction is met, then Auth always returns a finite set.

Can make an empirical argument for why this restriction is okay.

But that’s not all…

In this small fragment of XrML, there are 2 other bugs. See paper for details.

The translation

The translation is fairly straightforward. Two points worth noting:

Query assumes that a grant g holds if it’s issued by some trusted principal (i.e., a principal who is permitted to issue g).

Holds(d, L, G) returns true iff d logically follows from L and G. So, a condition d holds iff d logically follows from L and G.

The translation depends on L and G.

Correctness

Thm: the fixed Query(s, L, G) returns true iff

⋀_{l∈L} l^{L,G} ∧ ⋀_{g∈G} g^{L,G} → s^{L,G}

is true in every model that satisfies the union properties (p1 ∪ p2 = p2 ∪ p1, …).

t^{L,G} is the translation of t wrt L and G.

Complexity

The XrML algorithm runs in exponential time.

The XrML document says that the language implementers are responsible for optimizations.

But using the translation, we can prove that…

Complexity

Determining whether a set of XrML grants implies a statement is NP-hard.

This is because the language supports sets of primitive principals.

If we remove ∪ from the language… XrML translates (essentially) to Datalog, which is a well-known tractable language.

Given the translation, finding a tractable, fairly expressive fragment is easy.

Accomplishments

We have proposed the first formal semantics for XrML and, in the process, found significant problems with the spec.

But the importance of this work hinges on whether the results actually lead to a better language.


Impact

When we found bugs, we told Xin Wang and Thomas DeMartini from the MPEG standards committee.

In March 2004, the committee released a new version, which is an ISO standard.

They addressed all of our concerns!

Key Change

The algorithm is replaced by a formal description of when a permission follows from L and G.

A permission p follows from L and G iff there’s a tree of finite depth such that the leaves are statements that implicitly hold, the non-leaves are implied by a single grant and their children, and the root is p.

Example

Alice is a good student and all good students may play. May Alice play?

Yes.

Leaves (known facts): Alice is good; Alice is a student.

Root: Alice may play (follows from its children and the grant `all good students may play’).
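For variable-free rules, a statement has such a finite-depth tree exactly when naive forward chaining from the known facts eventually derives it. The Python sketch below illustrates this on the example; it is not the procedure defined in the standard. The function name derive, the string encoding of statements, and the instantiation of the grant at Alice are assumptions.

```python
def derive(facts, rules):
    """Naive forward chaining over ground rules (premises, conclusion)."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                changed = True
    return known


# Leaves of the tree: statements that implicitly hold.
facts = {"Good(Alice)", "Student(Alice)"}

# The grant `all good students may play', instantiated for Alice.
rules = [(("Good(Alice)", "Student(Alice)"), "Perm(Alice, play)")]

may_alice_play = "Perm(Alice, play)" in derive(facts, rules)
```

Each pass of the loop corresponds to adding one more level of non-leaf nodes, so the number of passes needed bounds the depth of the tree.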

Observations

The standard has formal semantics! That’s what the formal description is.

Also, some corrections are no longer necessary. We no longer need to restrict the class of grants so that the algorithm terminates, or to modify the algorithm so that complete information is passed from one routine to the next.

But, wait a minute

At the end of the day, we want to implement XrML. How do we do this without an algorithm?

Tweak our translation to match the standard, then use Datalog techniques to solve the validity problem (assuming no union op).

First-order semantics is still useful, even though the standard has its own.

Another change

The standard assumes that a group has the properties of its members. Recall: the 2003 version is inconsistent.

Why build this relationship into the language?

Hidden assumption

The XrML designers assume that (essentially) all facts about the world come from individuals presenting certificates.

Examples:

Alice is a student if she presents her student ID.

A principal is the group {Alice, Bob} if the principal presents both Alice’s and Bob’s IDs.

A consequence

In certificate passing systems it is rare that an individual loses a right by presenting a certificate. {Alice, Bob} gets some rights by presenting Alice’s ID and does not lose these rights by presenting Bob’s.

The assumption guarantees this.

Assuming a certificate passing system leads to other design decisions….

Negation is unnecessary

Most policies for certificate passing systems are negation-free.

Certificates usually give the properties that their holders have. E.g., students are given student IDs, instead of everyone else being given `not a student’ IDs.

It’s silly to have a policy that forbids an action if certain credentials are presented. No one will present them when wanting to do the forbidden task.

Problem

Certificate passing systems are not appropriate for all applications.

Many policies need negation. E.g., `if a child is not grounded, then she may play outside’ and `smoking is not permitted on the airplane.’

These policies cannot be captured explicitly in the standard.

A partial solution

Assume that an action is forbidden unless it’s explicitly permitted.

Problem: Can’t distinguish forbidden actions from unregulated ones. As a result, policy sets can’t be merged.

E.g., a university’s policies talk about who’s permitted to get tenure. The policies for Alice’s new outreach program don’t. Alice’s policies contradict the university’s.
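The merging problem shows up as soon as the `forbidden unless permitted’ reading is applied to two policy sets separately. The Python sketch below is only an illustration; the function name forbidden, the principals Carol and Dave, and the action names are hypothetical.

```python
def forbidden(action, permitted):
    """Closed-world reading: whatever is not explicitly permitted is forbidden."""
    return action not in permitted


# Hypothetical policy sets, each listing explicitly permitted (principal, action) pairs.
university = {("Carol", "get_tenure")}
outreach = {("Dave", "attend_workshop")}

tenure = ("Carol", "get_tenure")
university_says_forbidden = forbidden(tenure, university)
outreach_says_forbidden = forbidden(tenure, outreach)
```

The university’s policies permit the action, while the outreach program’s policies, which never mention tenure, forbid it under this reading. The two sets disagree even though the second was only meant to leave tenure unregulated.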

We may want functions too.

Functions often occur naturally when translating policies from English to first-order logic.

E.g., `Classified information may be copied from one secure server to another’: ∀x1, x2, x3, x4 (Classified(x1) ∧ Secure(x2) ∧ Secure(x3) → Permitted(x4, copySrcDst(x2, x3), x1))


Extending the standard

We want the standard to support negations and functions.

Extending the syntax is easy.

It’s not clear how to modify the semantics given in the standard.

But, extending the translation is easy. Policies simply translate to a fragment of modal first-order logic that includes functions and negation.

Complexity

The validity problem for first-order formulas with functions and negation is undecidable.

Idea: Restrict functions and negation so that policies translate to a tractable fragment.

Datalog

There are tractable fragments of Datalog that support some functions and negation, but the use of functions is severely restricted, and negation cannot appear in the conclusion of Datalog rules, so there is no support for policies that forbid actions.

Lithium

The Lithium language [Halpern/Weissman] is a tractable fragment of first-order logic that supports unlimited use of functions and allows a fair amount of negation in both the premises and conclusions of statements.

Expressivity

Experimental results suggest that Lithium is sufficiently expressive for many applications.

We collected a number of policies from libraries and government databases, and have written them in Lithium.

Summary

We proposed the first formal semantics for XrML.

This helped the language developers to find several bugs, which they fixed.

They have since given XrML formal semantics; however, our approach seems better suited to extending and analyzing the language using known techniques.

Summary

Industry wants to implement XrML but … XrML has no formal semantics and needs them!

We give formal semantics to a representative fragment of XrML.

Even a small fragment is intractable.

We can leverage results in the CS literature to find fairly expressive, tractable options.

Next step: Add negation to XrML. This is critical for merging policies.

Two minor bugs in paper (don’t affect results); corrected version online.

talk ends on preceding slide

Sample XrML policy

Consider the policy `anyone may play the movie `Big Hit’ for $2 (per use)’.

We could write this policy in XrML as…

<license>
  <grant>
    <forAll varName="anyone" />
    <!-- This is saying that anyone can use this grant. -->
    <principal varRef="anyone" />
    <!-- The right to play the movie is granted. -->
    <cx:play />
    <!-- This is the movie that we are giving access to. -->
    <cx:digitalWork>
      <cx:title>Big Hit</cx:title>
    </cx:digitalWork>
    <!-- $2.00 each -->
    <sx:fee>
      <sx:paymentPerUse>
        <sx:rate currency="USD">2.00</sx:rate>
      </sx:paymentPerUse>
    </sx:fee>
  </grant>
</license>

The translation

We now translate XrML licenses and grants to “equivalent” formulas in modal first-order logic.

The translation relies on which licenses have been issued and which grants implicitly hold.

Let s^{L,G} be the translation of any string s wrt the input parameters L and G.

Translation (cont.)

Except for licenses and grants, translation is easy.

We assume a constant c_g for each grant g.

Perm(p, issue, g)^{L,G} = Perm(p, issue, c_g)

(d1 ∧ d2)^{L,G} = d1^{L,G} ∧ d2^{L,G}, Pr(p)^{L,G} = Pr(p), and true^{L,G} = true

Translating licenses

Recall: (g, p) means p said g.

According to Query, if p may issue g, then (g, p) means that g holds; otherwise, (g, p) is meaningless.

Option 1: (g, p)^{L,G} = Said(p, c_g); restrict to models satisfying the axiom scheme Said(p, c_g) ∧ Perm(p, issue, c_g) → g^{L,G}.

Option 2: (g, p)^{L,G} = Perm(p, issue, c_g) → g^{L,G}.

Translating grants

(∀x1 … ∀xn (d → e))^{L,G} = ∀x1 … ∀xn (Holds(d, L, G) → e^{L,G})

Holds(d, L, G) returns true iff d is a logical consequence of L and G.

Define a modal operator Val, where Val(φ) is true in a model m iff φ is true in all models.

Holds(d, L, G) = Val(⋀_{l∈L} l^{L,G} ∧ ⋀_{g∈G} g^{L,G} → d^{L,G})