Modeling Regular Replacement for String Constraints Solving. Xiang Fu (Hofstra University) and Chung-Chih Li (Illinois State University). NFM 2010, 04/13/2010.


Page 1: Modeling Regular Replacement for String Constraints Solving

Xiang Fu, Hofstra University

Chung-Chih Li, Illinois State University

NFM 2010, 04/13/2010

Page 2: Background

[Figure: a Hacker and a Server, with the labels "malicious scripts" and "Cool page!"]

Problem? Lack of sufficient sanitization of text inputs

Page 3: One Typical Error

<?php
$msg = $_POST["msg"];
// Strip <script ...>...</script> blocks, case-insensitively.
$sanitized = preg_replace(
    "/\<script.*?\>.*?\<\/script.*?\>/i",
    "",
    $msg);
savetodb($sanitized);
?>

Attacker's Input: <<script></script>script>alert('a')</script>

Output after replacement: <script>alert('a')</script>

Reluctant Kleene Star: .*?
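The bypass can be reproduced outside PHP. The following is a minimal sketch in Python, assuming that `re.sub` behaves like a single `preg_replace` pass for this pattern; the greedy variant and the names `RELUCTANT`, `GREEDY`, and `sanitize` are illustrative additions, not part of the slide.

```python
import re

# Same pattern as on the slide, with reluctant (.*?) quantifiers.
RELUCTANT = re.compile(r"<script.*?>.*?</script.*?>", re.IGNORECASE)
# Greedy variant, shown only for comparison (not from the slide).
GREEDY = re.compile(r"<script.*>.*</script.*>", re.IGNORECASE)

def sanitize(msg, pattern=RELUCTANT):
    """Remove every match in a single left-to-right pass."""
    return pattern.sub("", msg)

attack = "<<script></script>script>alert('a')</script>"
print(sanitize(attack))          # <script>alert('a')</script>
print(sanitize(attack, GREEDY))  # <
```

The reluctant pattern removes only the inner `<script></script>`, and the surrounding characters then join into a fresh script tag. The exact matching semantics therefore decides whether the filter is safe.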

Page 4: Bigger Picture

Objective: Automatic Discovery of Vulnerabilities

[Figure: tool chain with the components Bytecode, Symbolic Execution, Attack Pattern, String Constraint Solver (SUSHI), and Test Replayer]

Page 5: Our Contribution

Atomic Replacement Constraints
Consider Two Semantics: Greedy, Reluctant
Modeling Using Finite State Transducer (FST)
Compact Representation of FST
Security Analysis

Page 6: Finite State Transducer

Accepts Regular Relation
Union, Concat, Composition
Intersection, Complement
Used for Modeling Rewriting Rules [Kaplan94, Karttunen96]

[Figure: FST A with states 1, 2, 3, 4 and transitions labeled ε:1, a:2, b:3]

(ab, 123) ∈ L(A)
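To make the relation (ab, 123) concrete, here is a small sketch of how an FST pairs input words with output words. It assumes the figure's transducer is the linear chain 1 → 2 → 3 → 4 with the three labeled transitions in that order; the chain shape, the final state, and the function name `transduce` are assumptions for illustration.

```python
from collections import deque

# Transitions as (source, input symbol, output string, target); "" stands for ε.
# The linear shape 1 -> 2 -> 3 -> 4 is an assumption about the figure.
TRANSITIONS = [
    (1, "", "1", 2),
    (2, "a", "2", 3),
    (3, "b", "3", 4),
]
START, FINAL = 1, {4}

def transduce(word):
    """Return every output string that the FST pairs with `word`.

    Assumes there is no cycle of ε-input transitions, so the search ends.
    """
    outputs = set()
    queue = deque([(START, 0, "")])  # (state, input position, output so far)
    while queue:
        state, pos, out = queue.popleft()
        if pos == len(word) and state in FINAL:
            outputs.add(out)
        for src, sym, emitted, dst in TRANSITIONS:
            if src != state:
                continue
            if sym == "":
                # ε-input move: emit output without reading input.
                queue.append((dst, pos, out + emitted))
            elif pos < len(word) and word[pos] == sym:
                queue.append((dst, pos + 1, out + emitted))
    return outputs

print(transduce("ab"))  # {'123'}
print(transduce("a"))   # set()
```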

Page 7: Hierarchical FST & Modeling Declarative Semantics

Goal: S_{r→ω}, where r is the Regular Search Pattern and ω is the Replacement

[Figure: FST with states 1, 2, 3, 4 and transitions labeled Id(Σ* - Σ* r Σ*), r : ω, ε:ε, Id(Σ* - Σ* r Σ*)]

Id: Identical Relation
Σ* - Σ* r Σ*: Any String not Containing pattern r

Example: aaa_{a+→b} = {b, bb, bbb}
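The declarative semantics can be checked by brute force on short strings. The sketch below is not the FST construction from the slide. It enumerates every way to split the input into segments that fully match r, replaced by ω, and segments that contain no match of r, which is copied unchanged. It assumes r never matches the empty string, and the name `declarative_replace` is hypothetical.

```python
import re
from functools import lru_cache

def declarative_replace(s, pattern, omega):
    """All results of replacing matches of `pattern` in `s` by `omega`.

    `s` is split as u0 m1 u1 ... mk uk, where each mi fully matches
    `pattern` and no ui contains a match. Assumes `pattern` never
    matches the empty string.
    """
    rx = re.compile(pattern)

    @lru_cache(maxsize=None)
    def results_from(i):
        found = set()
        for j in range(i, len(s) + 1):
            u = s[i:j]
            if rx.search(u):
                break  # u contains a match, and so does any longer u
            if j == len(s):
                found.add(u)
            for k in range(j + 1, len(s) + 1):
                if rx.fullmatch(s, j, k):
                    found.update(u + omega + rest for rest in results_from(k))
        return frozenset(found)

    return set(results_from(0))

print(sorted(declarative_replace("aaa", "a+", "b")))  # ['b', 'bb', 'bbb']
```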

Page 8: Modeling Reluctant Semantics

2 Steps
Mark the beginning of pattern
Do the replacement

Goal: S^-_{r→ω}

Example: aaa^-_{a+→b} = {bbb}

Key: Left-Most Matching
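As a reference point for the reluctant semantics, the following sketch scans left to right and, at each position, replaces the shortest match that starts there. It works on plain strings rather than transducers, and the function name `reluctant_replace` is an assumption for illustration.

```python
import re

def reluctant_replace(s, pattern, omega):
    """Left-most, shortest-match replacement of `pattern` by `omega`."""
    rx = re.compile(pattern)
    out, i = [], 0
    while i < len(s):
        # Shortest non-empty match starting exactly at position i, if any.
        end = next((j for j in range(i + 1, len(s) + 1)
                    if rx.fullmatch(s, i, j)), None)
        if end is None:
            out.append(s[i])   # no match starts here: copy one symbol
            i += 1
        else:
            out.append(omega)  # replace the shortest match and skip past it
            i = end
    return "".join(out)

print(reluctant_replace("aaa", "a+", "b"))  # bbb
```

Each single `a` is already a complete match of `a+`, so every symbol is replaced separately, which agrees with the example {bbb}.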

Page 9

Input Word: a a b b c d a b c a b d

Search Pattern: a+b+c → x

[Figure: FST with states f1, s1, s2 and transitions labeled #:ε, reluc(r)#' : ω, ε:ε, Id(Σ)]

Begin Marker: # a # a b b c d # a b c a b d

Output: x d x a b d
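The two steps on this slide can be imitated on plain strings. The sketch below first inserts a begin marker before every position where a match of the search pattern can start, then replaces the shortest match at each marker it reaches and drops markers swallowed by a match. It assumes `#` does not occur in the input alphabet; the function names are hypothetical and this is not the transducer construction itself.

```python
import re

MARK = "#"  # assumed not to occur in the input alphabet

def mark_begins(s, pattern):
    """Step 1: put MARK before every position where a match can start."""
    rx = re.compile(pattern)
    return "".join((MARK if rx.match(s, i) else "") + ch
                   for i, ch in enumerate(s))

def replace_marked(marked, pattern, omega):
    """Step 2: at each marker reached, replace the shortest match by omega.

    Expects the output of mark_begins, so a match exists at every marker.
    """
    rx = re.compile(pattern)
    out, i = [], 0
    while i < len(marked):
        if marked[i] != MARK:
            out.append(marked[i])
            i += 1
            continue
        text, j = "", i + 1
        while j < len(marked):
            ch = marked[j]
            j += 1
            if ch == MARK:
                continue  # markers inside a match are dropped
            text += ch
            if rx.fullmatch(text):
                break
        out.append(omega)
        i = j
    return "".join(out)

marked = mark_begins("aabbcdabcabd", "a+b+c")
print(marked)                                # #a#abbcd#abcabd
print(replace_marked(marked, "a+b+c", "x"))  # xdxabd
```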

Page 10: The Challenge: Begin Marker

Input Word: a a b b c d a b c a b d

Search Pattern: a+b+c → x

[Figure: begin markers # placed in the input word]

Look-ahead Capability?
Non-determinism

3 Steps:
(1) End marker
(2) Generic end marker
(3) Begin marker

Page 11: Preliminary End Marker

Search Pattern: a+b+c → x

Reversed Pattern: cb+a+

Idea: Start with End Marker for Reverse of Search Pattern

[Figure: FST A1 with states 1 to 5 and transitions labeled c:c, b:b, b:b, a:a, a:a, ε:$]

Problem: Input tape accepts cb+a+ only!

Page 12: Generic End Marker

Pattern: cb+a+

[Figure: FST A2 with states 1 to 5, annotated 1; 2,1; 3,1; 4,1; 5,1, and transitions labeled a:a, b:b, c:c, ε:$]

Input Word: c c b a a

Output Word: c c b a $ a $

Deterministic!
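The input/output pair on this slide can be reproduced with a direct check: a `$` is emitted after every position at which some substring matching the pattern ends. The sketch below uses quadratic regex checks rather than the deterministic transducer A2, and the name `end_mark` is an assumption.

```python
import re

def end_mark(s, pattern, mark="$"):
    """Insert `mark` after every position where some match of `pattern` ends."""
    rx = re.compile(pattern)
    out = []
    for j in range(1, len(s) + 1):
        out.append(s[j - 1])
        # Does any substring s[i:j] fully match the pattern?
        if any(rx.fullmatch(s, i, j) for i in range(j)):
            out.append(mark)
    return "".join(out)

print(end_mark("ccbaa", "cb+a+"))  # ccba$a$
```

Unlike the preliminary end marker, this accepts any input word, not only words that belong to cb+a+.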

Page 13: Finally, the Begin Marker

Search Pattern: a+b+c → x

[Figure: FST A3 with states 0 to 5, states 1 to 5 annotated 1; 2,1; 3,1; 4,1; 5,1, and transitions labeled a:a, b:b, c:c, ε:#, ε:ε]
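A string-level analogue of this last step, assuming (as the "reverse of search pattern" idea suggests) that begin markers are obtained by end-marking the reversed word with the reversed pattern and reversing the result. The reversed pattern is passed in by hand because reversing an arbitrary regular expression is not attempted here; both function names are hypothetical.

```python
import re

def end_mark(s, pattern, mark):
    """Insert `mark` after every position where some match of `pattern` ends."""
    rx = re.compile(pattern)
    out = []
    for j in range(1, len(s) + 1):
        out.append(s[j - 1])
        if any(rx.fullmatch(s, i, j) for i in range(j)):
            out.append(mark)
    return "".join(out)

def begin_mark(s, reversed_pattern, mark="#"):
    """Mark match beginnings by end-marking the reversed word, then reversing."""
    return end_mark(s[::-1], reversed_pattern, mark)[::-1]

# Search pattern a+b+c, so the reversed pattern is cb+a+.
print(begin_mark("aabbcdabcabd", "cb+a+"))  # #a#abbcd#abcabd
```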

Page 14

Input Word: a a b b c d a b c a b d

Search Pattern: a+b+c → x

[Figure: FST with states f1, s1, s2 and transitions labeled #:ε, reluc(r)#' : ω, ε:ε, Id(Σ)]

Begin Marker: # a # a b b c d # a b c a b d

Output: x d x a b d

Page 15: Greedy Semantics

Goal: S_{r→ω} under greedy semantics

Example: aaa_{a+→b} = {b} (greedy)

Challenge: Look-ahead longest match
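The difference between the two semantics shows up directly in Python's regex quantifiers, which serve here only as a stand-in for the FST models. For this simple pattern, `a+` takes the longest match and `a+?` the shortest.

```python
import re

greedy = re.sub(r"a+", "b", "aaa")      # longest match: the whole run
reluctant = re.sub(r"a+?", "b", "aaa")  # shortest match: one symbol at a time

print(greedy)     # b
print(reluctant)  # bbb
```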

Page 16

Search Pattern: a+ → x

Input Word: aabab

Step 1: Begin Marker
Step 2: ND End Marker
Step 3: Pairing Markers
Step 4: Checking Match
Step 5: Check Longest
Step 6: Replacement

[Figure: intermediate marked words between the steps: #a#ab#ab, #a#a$b#ab, #a$#a$b#a$b, #a$#a$b#a$b, #a#a$b#a$b, #aa$b#a$b, #a#ab#a$b, #aaba$b]

Output: xbxb
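The six steps can be loosely mirrored with position sets instead of marker symbols. In the sketch below, the mapping of each line to a step is an interpretation of the slide, not the authors' transducer pipeline: all candidate end positions are computed at once, where the FST guesses them nondeterministically. The name `greedy_replace` is hypothetical.

```python
import re

def greedy_replace(s, pattern, omega):
    """Left-most, longest-match replacement, organised after the six steps."""
    rx = re.compile(pattern)
    n = len(s)
    # Step 1: positions where a match can begin.
    begins = [i for i in range(n) if rx.match(s, i)]
    # Step 2: positions where some match can end.
    ends = [j for j in range(1, n + 1)
            if any(rx.fullmatch(s, i, j) for i in range(j))]
    # Step 3: pair every begin with every later end.
    pairs = [(i, j) for i in begins for j in ends if i < j]
    # Step 4: keep only pairs that really delimit a match.
    matches = [(i, j) for i, j in pairs if rx.fullmatch(s, i, j)]
    # Step 5: for each begin, keep the longest match.
    longest = {}
    for i, j in matches:
        longest[i] = max(j, longest.get(i, 0))
    # Step 6: scan left to right and replace.
    out, pos = [], 0
    while pos < n:
        if pos in longest:
            out.append(omega)
            pos = longest[pos]
        else:
            out.append(s[pos])
            pos += 1
    return "".join(out)

print(greedy_replace("aabab", "a+", "x"))  # xbxb
```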

Page 17: Applications

Solve String Constraints

Login Servlet

[Equation garbled in extraction. Recoverable parts: a query of the form SELECT ... WHERE ... over uname and pwd joined by AND, two inputs x and y, each with a single-quote replacement and a [0,16] length restriction, and an attack pattern containing [^'], OR, and .*]

Input: user name, after filtering single quote and length restriction
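Why quote filtering combined with a length restriction can still be attacked is easiest to see on a concrete value. The sketch below assumes the servlet first doubles single quotes and then truncates to 16 characters; that order, the table name `users`, the helper names, and the sample password payload are all assumptions for illustration, since the original equation did not survive extraction.

```python
def sanitize(value, limit=16):
    """Escape single quotes by doubling them, then keep at most `limit` chars.

    The order (escape first, truncate second) is an assumption.
    """
    return value.replace("'", "''")[:limit]

def build_query(uname, pwd):
    # Hypothetical query text; only the uname/pwd shape follows the slide.
    return ("SELECT uname FROM users WHERE uname='" + sanitize(uname)
            + "' AND pwd='" + sanitize(pwd) + "'")

uname = "a" * 15 + "'"   # 16 characters, the last one a quote
print(sanitize(uname))   # aaaaaaaaaaaaaaa'
print(build_query(uname, " OR 1=1 --"))
```

Truncation cuts the doubled quote in half, so the sanitized user name ends in a lone quote. In the assembled query that quote pairs with the closing delimiter, and the text supplied as the password is no longer confined to a string literal.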

Page 18: Solving Atomic Constraint

Goal: find S such that S_{r→ω} matches P

[Figure: A1 composed with Id(P), then Project to Input Tape, giving the Solution]
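On a bounded search space, the effect of "compose with Id(P), then project to the input tape" can be imitated by brute force: enumerate candidate inputs and keep those whose replaced form matches P. The sketch below is such a bounded stand-in, not the FST algorithm; the alphabet, the length bound, and the name `solve_atomic` are assumptions.

```python
import re
from itertools import product

def solve_atomic(pattern, omega, target, alphabet="ab", max_len=4):
    """Bounded search for inputs S whose replaced form fully matches `target`."""
    rx, goal = re.compile(pattern), re.compile(target)
    solutions = []
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            s = "".join(chars)
            # Apply the replacement, then test membership in the target language.
            if goal.fullmatch(rx.sub(omega, s)):
                solutions.append(s)
    return solutions

# Which inputs still contain "ab" after every "ab" has been deleted once?
print(solve_atomic("ab", "", "ab"))  # ['aabb']
```

The only solution up to length 4 nests one occurrence of the pattern inside another, the same shape as the script-tag attack earlier in the deck.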

Page 19: SUSHI Constraint Solver

Solves Simple Linear String Constraints (SISE)
Relies on:
  dk.brics.automaton for FSA operations
  Self-made Java package for FST operations
Supports 16-bit Unicode
Compact Transition Representation

[Figure: transition types Type I, Type II, Type III, with the pairs (I,I), (II,I), (III,II)]

Page 20: Efficiency of Solver

Benchmark Equations

[Benchmark equations 1 to 4: garbled in extraction, not recoverable]

Login Servlet: 1.4 Seconds on 2 GHz PC

Flex SDK XSS Attack
Equation Size: 565
74 Seconds
Shorter than Security Track #1022748

Page 21: Related Work

Forward String Analysis
  Christensen & Møller [SAS’03]
  Wasserman & Su [PLDI’07, ICSE’08]
  Bjørner & Tillmann [TACAS’09]

Backward String Analysis
  Kiezun & Ganesh [ISSTA’09]
  Yu & Bultan [SPIN’08, ASE’09]
  Fu [COMPSAC’07, TAVWEB’08]

Natural Language Processing
  Kaplan and Kay [CL’1994]

Our Contribution: Precise Modeling of Various Regular Substitution Semantics

Page 22: Limitations

SISE String Constraints
  All Variables Appear on LHS (Once)
  No Easy Solution for Equation System Yet
  No string length

Future Directions
  Encoding string length in automata
  Finite model on bit-vector

Page 23: Questions?