regular functions rajeev alur university of pennsylvania 1

47
Regular Functions Rajeev Alur University of Pennsylvania 1

Upload: john-hamilton

Post on 17-Jan-2016

226 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Regular Functions Rajeev Alur University of Pennsylvania 1

Regular Functions

Rajeev Alur University of Pennsylvania

1

Page 2: Regular Functions Rajeev Alur University of Pennsylvania 1

Language Classes

What if we consider functions?From strings to natural numbersFrom strings to strings

--- Recursive

--- NP--- P

--- Linear-time--- Regular No essential change for

Recursive, NP, P, linear-time…

2

Page 3: Regular Functions Rajeev Alur University of Pennsylvania 1

Regular Languages

NaturalIntuitive operational model of finite-state automata

RobustAlternative characterizations and closure properties

AnalyzableDecidable questions: emptiness, equivalence…

ApplicationsAlgorithmic verification, text processing …

What is the analog of regularity for defining functions?

3

Page 4: Regular Functions Rajeev Alur University of Pennsylvania 1

Finite Automata with Cost Labels

C: Buy CoffeeS: Fill out a surveyM: End-of-month

C / 2 C / 1

S

MM

Maps a string over {C,S,M} to a cost value:Cost of a coffee is 2, but reduces to 1 after filling out

a survey until the end of the monthOutput is computed by implicitly adding up transition costs

Intuitive, analyzable, and many applications/extensionsBut expressiveness not theoretically robust

S

4

Page 5: Regular Functions Rajeev Alur University of Pennsylvania 1

Finite Automata with Cost Registers

C / x:=x+2 C / x:=x+1S

MM

Cost Register Automata:Finite control + Finite number of registersRegisters updated explicitly on transitionsRegisters are write-only (no tests allowed)Each (final) state associated with output register

xx:=0

x

S

5

Page 6: Regular Functions Rajeev Alur University of Pennsylvania 1

CRA Example

C / x:=x+2 C / x:=x+1S

M / x:=0M / x:=0

At any time, x = cost of coffees during the current month

Cost register x reset to 0 at each end-of-month

xx:=0

x

S

6

Page 7: Regular Functions Rajeev Alur University of Pennsylvania 1

CRA Example

C / x:=x+2

C / x:=x+1S / x:=y

M / y:=x M / y:=x

Filling out a survey gives discount for all coffees during that month

xx,y:=0

x

y:=y+1

S

7

Page 8: Regular Functions Rajeev Alur University of Pennsylvania 1

CRA Example

C / y:=y+1

M / x:=min(x,y); y:=0

Output = minimum number of coffees consumed during a month

Updates use two operations: increment and min

min(x,y)

y:=0

x:=Infty

8

Page 9: Regular Functions Rajeev Alur University of Pennsylvania 1

Talk Outline

Definition of Regular Functions

Additive Regular Functions

String Transducers

Regular Functions over a Semiring

Conclusions + Open Problems

9

Page 10: Regular Functions Rajeev Alur University of Pennsylvania 1

Cost Model

Cost Grammar G to define set of terms:Inc: t := c | (t+c)Plus: t := c | (t+t)Min-Inc: t := c | (t+c) | min(t,t)Inc-Scale: t := c | (t+c) | (t*d)

Interpretation [] for operations:Set D of cost valuesMapping operators to functions over D

Example interpretations for the Plus grammar:Set N of natural numbers with additionSet G* of strings with concatenation

10

Page 11: Regular Functions Rajeev Alur University of Pennsylvania 1

Regular Function

Definition parameterized by the cost model C=(D,G,[])

A (partial) function f:S*->D is regular w.r.t. the cost model C if there exists a string-to-tree transformation g such that

(1) for all strings w, f(w)=[g(w)](2) g is a regular string-to-tree transformation

11

Page 12: Regular Functions Rajeev Alur University of Pennsylvania 1

Regular String-to-tree Transformations

Definition based on MSO (Monadic Second Order Logic) –definable graph-to-graph transformations (Courcelle)

Studied in context of syntax-directed program transformations, attribute grammars, and XML transformations

Operational model: Macro Tree Transducers (Engelfriet et al)

Recent proposals:Streaming String Transducers (POPL 2011)Streaming Tree Transducers (ICALP 2012)

12

Page 13: Regular Functions Rajeev Alur University of Pennsylvania 1

MSO-definable String-to-tree Transformations

MSO over stringsF := a(x) | X(x) | x=y+1 | ~ F | F & F | Exists x. F | Exists X. F

MSO-transduction from strings to trees: 1. Number k of copies

For each position x in input, output-tree has nodes x1, …xk

2. For each symbol a and copy c, MSO-formula Fa,c(x)

Output-node xc is labeled with a if Fa,c(x) holds for unique a

3. For copies c and d, MSO-formula Fc,d(x,y)

Output-tree has edge from node xc to node xd if Fc,d(x,y) holds

13

Page 14: Regular Functions Rajeev Alur University of Pennsylvania 1

Example Regular Function

Cost grammar Min-Inc: t := c | (t+c) | min(t,t)Interpretation: Natural numbers with usual meaning of + and minS={C,M}f(w) = Minimum number of C symbols between successive M’s

Infty 0 1 1 0 1 1 1

+ + + + +

min min

Input w= C C M C C C M

Tree:

Value = 2

14

Page 15: Regular Functions Rajeev Alur University of Pennsylvania 1

Properties of Regular Functions

Known properties of regular string-to-tree transformations imply:

If f and g are regular w.r.t. a cost model C, and L is a regular language, then “if L then f else g” is regular w.r.t. C

Reversal: define Rev(f)(w) = f(reverse(w)).If f is regular w.r.t. a cost model C, then so is Rev(f)

Costs grow linearly with the size of the input string:Term corresponding to a string w is O(|w|)

15

Page 16: Regular Functions Rajeev Alur University of Pennsylvania 1

Talk Outline

Additive Regular Functions

String Transducers

Regular Functions over a Semiring

Conclusions + Open Problems

16

Page 17: Regular Functions Rajeev Alur University of Pennsylvania 1

Regular Functions over Commutative Monoid

Cost model: D with binary function +Interpretation for + is commutative, associative, with identity 0

Cost grammar G(+): t := c | (t+t)

Cost grammar G(+c): t := c | (t+c)

Thm: Regularity w.r.t. G(+) coincides with regularity w.r.t. G(+c)

Proof intuition: Show that rewriting terms such as (2+3)+(1+5) to (((2+3)+1)+5) is a regular tree-to-tree transformation, and use closure properties of tree transducers

17

Page 18: Regular Functions Rajeev Alur University of Pennsylvania 1

Additive Cost Register Automata

Additive Cost Register Automata:DFA + Finite number of registersEach register is initially 0Registers updated using assignments x := y + cEach final state labeled with output term x + c

Given commutative monoid (D,+,0), an ACRA defines a partial function from S* to D

C / x:=x+2, y:=y+1 C / x:=x+1

S / x:=y

M / y:=x M / y:=x

xx,y:=0

x

S

18

Page 19: Regular Functions Rajeev Alur University of Pennsylvania 1

Regular Functions and ACRAs

Thm: Given a commutative monoid (D,+,0), a function f:S*->D is definable using an ACRA iff it is regular w.r.t. grammar G(+).

Establishes ACRA as an intuitive, deterministic operational model to define this class of regular functions

Proof relies on the model of SSTT (Streaming string-to-tree transducers) that can define all regular string-to-tree transformations

19

Page 20: Regular Functions Rajeev Alur University of Pennsylvania 1

Single-Valued Weighted Automata

Weighted Automata:Nondeterministic automata with edges labeled with costs

Single-valued:Each string has at most one accepting path

Cost of a string:Sum of costs of transitions along the accepting path

Example: When you fill out a survey, each coffee during that month gets the discounted cost.

Locally nondeterministic, but globally single-valued Thm: ACRAs and single-valued weighted automata define

the same class of functions

20

Page 21: Regular Functions Rajeev Alur University of Pennsylvania 1

Decision Problems for ACRAs

Min-Cost: Given an ACRA M, find min {M(w) | w in S*}Solvable in Polynomial-timeShortest path in a graph with vertices (state, register)

Equivalence: Do two ACRAs define the same functionSolvable in Polynomial-timeBased on propagation of linear equalities in program graphs

Register Minimization: Given an ACRA M with k registers, is there an equivalent ACRA with < k registers?

Algorithm polynomial in states, and exponential in k

21

Page 22: Regular Functions Rajeev Alur University of Pennsylvania 1

Towards a Theory of Additive Regular Functions

Goal: Machine-independent characterization of regularitySimilar to Myhill-Nerode theorem for regular languagesRegisters should compute necessary auxiliary functions

Example: S = {C,S}f(w)= if w contains S then |w| else 2|w|f1(Ci)=i and f2(Ci)=2i are necessary and sufficient

Thm: Register complexity of a function is at least k iff there exist strings s0, … sm, loop-strings t1,…tm, and suffixes w1,…wm, and k distinct vectors c1,…ck such that for all numbers x1,…xm, f(s0 t1

x1 s1 t2x2 … sm wi) = Sj cij xj + di

22

Page 23: Regular Functions Rajeev Alur University of Pennsylvania 1

Talk Outline

String Transducers

Regular Functions over a Semiring

Conclusions + Open Problems

23

Page 24: Regular Functions Rajeev Alur University of Pennsylvania 1

Regular Functions for Non-Commutative Monoid

Cost model: G* with binary function concatenation

Interpretation for . is non-commutative, associative, identity e

Cost grammar G(.): t := s | (t . t) s is a string

Cost grammar G(.s): t := s | (t . s) | (s . t)

Thm: Regular functions w.r.t G(.) is a strict superset of regular functions w.r.t. G(.s)

Classical model of Sequential Transducers captures only a subset of regular functions w.r.t. G(.s) 24

Page 25: Regular Functions Rajeev Alur University of Pennsylvania 1

Streaming String Transducer: Delete

Finite state control + register x ranging over output strings

String variables explicitly updated at each step

Delete all S symbols

output x

S / x := x

x := e

C / x := x C

25

Page 26: Regular Functions Rajeev Alur University of Pennsylvania 1

Streaming String Transducer: Reverse

Symbols may be added to string variables at both ends

output x

S / x := S x

x := e

C / x := C x

26

Page 27: Regular Functions Rajeev Alur University of Pennsylvania 1

Streaming String Transducer: Regular Look Ahead

If input ends with C, then delete all S symbols, else reverse

output x

S / x:= Sx

x,y := eoutput y

C / x:=Cx; y:=yC

C / x:=Cx; y:=yC

S / x:=Sx

Register x equals reverse of the input so farRegister y equals input so far with all S’s

deleted27

Page 28: Regular Functions Rajeev Alur University of Pennsylvania 1

Streaming String Transducer: Concatenation

Registers can be concatenated

Example: Swap substring before first M with substring following last M

M M

M M

Key restriction: a variable can appear at most once on RHS [x,y] := [xy, e] allowed [x,y] := [xy, y] not allowed

28

Page 29: Regular Functions Rajeev Alur University of Pennsylvania 1

SST Properties

At each step, one input symbol is processed, and at most a constant number of output symbols are newly created

Output is bounded: Length of output = O(length of input)

SST transduction can be computed in linear time

Finite-state control: Registers not examined

SST cannot implement merge f(u1u2….uk # v1v2…vk) = u1v1u2v2….ukvk

Multiple registers are essential For f(w)=wk, k variables are necessary and sufficient

29

Page 30: Regular Functions Rajeev Alur University of Pennsylvania 1

Decision Problem: Type Checking

Pre/Post condition assertion: { L } T { L’ }Given a regular language L of input strings (pre-condition), an SST T, and a regular language L’ of output strings (post-condition), verify that for every w in L, T(w) is in L’

Thm: Type checking is solvable in polynomial-timeKey construction: Summarization

30

Page 31: Regular Functions Rajeev Alur University of Pennsylvania 1

Decision Problem: Equivalence

Functional Equivalence:Given SSTs T1 and T2 over same input/output alphabets, check whether they define the same transductions.

Thm: Equivalence is solvable in PSPACE(polynomial in states, but exponential in number of string variables)

No lower bound known

31

Page 32: Regular Functions Rajeev Alur University of Pennsylvania 1

Expressiveness

Thm: A function f:S*->G* is definable by an SST iff it is regular w.r.t. the cost model (G*, . , e)

1. SST definable transduction is MSO definable 2. MSO definable transduction can be captured by a two-way transducer (Engelfriet/Hoogeboom 2001) 3. SST can simulate a two-way transducer

Evidence of robustness of class of regular transductions

Closure properties1. Sequential composition: f1(f2(w))

2. Regular conditional choice: if w in L then f1(w) else f2(w)

32

Page 33: Regular Functions Rajeev Alur University of Pennsylvania 1

SST Applications

Equivalent class of single pass list processing programs with solvable program analysis problems (POPL 2011)

Algorithmic verification of retransmission protocols (network components as regular transformers over bit sequences; FORTE 2013)

OpportunitiesBEK: Transducer-based tool for analyzing string sanitizersFlashFill: Learning string transformations from examples

33

Page 34: Regular Functions Rajeev Alur University of Pennsylvania 1

function delete input ref curr; input data v; output ref result; output bool flag := 0; local ref prev; while (curr != nil) & (curr.data = v) { curr := curr.next; flag := 1; } result := curr; prev:= curr; if (curr != nil) then { curr := curr.next; prev.next := nil; while (curr != nil) { if (curr.data = v) then { curr := curr.next; flag := 1; } else { prev.next := curr; prev := curr; curr := curr.next; prev.next := nil; } }

Decidable Analysis: 1. Assertion checks 2. Pre/post condition 3. Full functional correctness

Algorithmic Verification of List-processing Programs

head tail

3 8 2

34

Page 35: Regular Functions Rajeev Alur University of Pennsylvania 1

Talk Outline

Regular Functions over a Semiring

Conclusions + Open Problems

35

Page 36: Regular Functions Rajeev Alur University of Pennsylvania 1

Regular Functions over Semiring

Cost Domain: Natural numbers + Infty

Operation Min: Commutative monoid with identity Infty

Operation +: Monoid with identity 0

Rules: a + Infty = Infty + a = Inftya+min(b,c) = min (a+b, a+c); min(b,c)+a = min(b+a,c+a)

Cost grammar MinInc: t := c | min(t,t) | (t+c)

Goal: Understand class of regular functions w.r.t. MinInc36

Page 37: Regular Functions Rajeev Alur University of Pennsylvania 1

Weighted Automata

Weighted Automata:Nondeterministic automata with edges labeled with costs

Interpreted over the semiring cost model: cost of string w = min of costs of all accepting paths over wcost of a path = sum of costs of all edges in a path

Widely studied (Handbook of Weighted Automata, Droste et al)

Minimum cost problem solvableEquivalence undecidable over (N, min, +)Not determinizableNatural model in many applicationsRecent interest in CAV community for quantitative

analysis37

Page 38: Regular Functions Rajeev Alur University of Pennsylvania 1

CRA over Min-Inc Semiring

C / y:=y+1

M / x:=min(x,y); y:=0

Output = Minimum number of coffees consumed during a month

min(x,y)y:=0

x:=Infty

38

Page 39: Regular Functions Rajeev Alur University of Pennsylvania 1

CRA(min,+c) = Weighted Automata

From WA to CRA(min,+c):Generalizes subset construction for determinizationFor every state q of WA, CRA maintains a register xq

xq = min of costs of all paths to q on input read so far

Update on a: xq := min { xp + c | p –(a,c)-> q is edge in WA}

From CRA(min,+c) to WA:State of WA = (state q of CRA, register x)min simulated by nondeterminismTo simulate p – (a, x:=min(y,z)) -> q in CRA,

add a-labeled edges from (p,y) and (p,z) to (q,x)Distributivity of + over min critical

39

Page 40: Regular Functions Rajeev Alur University of Pennsylvania 1

CRA(min,+c) > Min-Plus Regular Functions

Thm: The class of regular functions w.r.t. Min-Inc semiring is a strict subset of weighted automata

Above function is not regular: cost term is quadratic in input

C/1 S/1

M

S,M C,M

Input w: w1 M w2 M … M wn

Each wi in {C,S}*

ci = Number of C’s in wi

si = Number of S’s in wi

Cost(w) = minj { c1+…+cj+sj+1+…+sn}

40

Page 41: Regular Functions Rajeev Alur University of Pennsylvania 1

Machine Model for Semiring Regular Functions

Updates to registers must be copylessEach register appears at most once in a right-hand-sideUpdate [x,y] := [min(x,y),y] not allowedNecessary to maintain “linear” growth

Need ability to simulate substitutionRegister x carries two values c and dStands for the parameterized expression min(c, ?)+dBesides min and inc, can substitute ? with a value

Resulting model coincides with regular functions over semiring

Open: Decidability of equivalence over (N, min , +c)41

Page 42: Regular Functions Rajeev Alur University of Pennsylvania 1

Talk Outline

Conclusions + Open Problems

42

Page 43: Regular Functions Rajeev Alur University of Pennsylvania 1

Discounted Cost Regular Functions

Basic element: (cost c, discount d) Discounted sum: (c1,d1)*(c2,d2) = (c1+d1c2, d1d2) Example of non-commutative monoid Classical Model: Future discounting

Cost of a path: (c1,d1) * (c2,d2) * … * (cn,dn)

Polynomial-time algorithm for “generalized” shortest path Past discounting

Cost of a path: (cn,dn) * (cn-1,dn-1) * … * (c1,d1)

Same PTIME algorithm works for shortest paths Prioritized double discounting

Cost = (c1,d1) * … * (cn, dn) * (c’1,d’1) * … * (c’n,d’n)

Shortest path: NExpTime algorithm Open: Shortest path for Discounted Cost Register Automata

43

Page 44: Regular Functions Rajeev Alur University of Pennsylvania 1

Open Problems and Challenges

Complexity of equivalence of SSTs and STTsLarge gap between lower and upper bounds

Machine-independent characterization of regularitySupport functions needed to compute a function

Decidability of min-cost for discounted cost automata

Simpler/cleaner proofs of equivalence of machine models and MSO-definable transformations

44

Is equivalence of regular functions over (N, min, +c) decidable?

Page 45: Regular Functions Rajeev Alur University of Pennsylvania 1

Unexplored Directions

Probabilistic modelsMarkov chains / MDPs with regular rewards

Regular costs for infinite executionsInfinitary operators: Lim-average, Discounted-sumStarting point: Infinite-String-to-Tree Transducers

Regular costs for trees

Combinations of other operationsRegular functions over G(+,min): t := c | (t+t) | min(t,t)

45

Page 46: Regular Functions Rajeev Alur University of Pennsylvania 1

Conclusions

Cost Register AutomataWrite-only machines with multiple registers to store outputs

Regular FunctionsDefinition parameterized by allowed operationsBased on MSO-definable graph transformations / transducersAppealing closure properties

1. Cost at each step depends on a global regular property2. Order in which costs are combined is regular

Emerging theory Some results, new connectionsMany open problems and unexplored directions

46

Page 47: Regular Functions Rajeev Alur University of Pennsylvania 1

Acknowledgements and References

Streaming String Transducers(with P. Cerny; POPL’11, FSTTCS’10)

Transducers over Infinite Strings(with E. Filiot, A. Trivedi; LICS’12)

Streaming Tree Transducers(with L. D’Antoni; ICALP’12)

Regular Functions and Cost Register Automata(with L. D’Antoni, J. Deshmukh, M. Raghothaman, Y. Yuan; LICS’13)

Decision problems for Additive Cost Regular Functions(with M. Raghothaman; ICALP’13)

Infinite-String to Infinite-Term Regular Transformations(with A. Durand, A. Trivedi; LICS’13: Next session)

Min-cost problems for Discounted Sum Regular Functions(with S. Kannan, K. Tian, Y. Yuan; LATA’13) 47