automatically inferring quantified loop invariants by ...soonhok/talks/20101201.pdf · 1/12/2010...

57
Automatically Inferring Quantified Loop Invariants by Algorithmic Learning from Simple Templates Soonho Kong 1 Yungbum Jung 1 Cristina David 2 Bow-Yaw Wang 3 Kwangkeun Yi 1 1 Seoul National University 2 National University of Singapore 3 INRIA, Tsinghua University, and Academia Sinica APLAS’10@Shanghai

Upload: others

Post on 21-Jul-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

Automatically Inferring Quantified Loop Invariants by Algorithmic Learning from Simple Templates

Soonho Kong1 Yungbum Jung1 Cristina David2 Bow-Yaw Wang3 Kwangkeun Yi1

1 Seoul National University2 National University of Singapore

3 INRIA, Tsinghua University, and Academia Sinica

APLAS’10@Shanghai

Page 2: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

Overview

Automated Technique

Output

InputAnnotated Loop

(Pre-/Post-conditions)

InputAtomic Propositions

A Loop Invariant(Quantified Formulae)

Page 3: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

Automated Technique

Overview

InputAnnotated Loop

(Pre-/Post-conditions)

InputAtomic Propositions

Algorithmic Learning

Query

Answer

Query

Answer. .

.

Output

A Loop Invariant(Quantified Formulae)

Page 4: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

Automated Technique

Overview

InputAnnotated Loop

(Pre-/Post-conditions)

InputAtomic Propositions

Decision Procedures

Algorithmic Learning

Query

Answer

Query

Answer. .

.

Output

(SMT solvers) A Loop Invariant(Quantified Formulae)

Page 5: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

Automated Technique

Overview

InputAnnotated Loop

(Pre-/Post-conditions)

DecisionProcedures

Algorithmic Learning

Query

Answer

Query

Answer. .

.

BooleanFormulae

QuantifiedFormulae

InputAtomic Propositions

Output

(SMT solvers) A Loop Invariant(Quantified Formulae)

Page 6: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

Automated Technique

Overview

InputAnnotated Loop

(Pre-/Post-conditions)

DecisionProcedures

Algorithmic Learning

Query

Answer

Query

Answer. .

.

Predicate Abstraction Boolean

FormulaeQuantifiedFormulae

InputAtomic Propositions

Output

A Loop Invariant(Quantified Formulae)

∀k.k < i ⇒ a[k] ≤ a[m]

Templates

Templates Input

∀k.[ ] (SMT solvers)

Page 7: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

(a) rm pkey{ i = 0 ∧ key �= 0 ∧ ¬ret ∧ ¬break}1 while(i < n ∧ ¬break) do2 if(pkeys[i] = key) then3 pkeyrefs[i]:=pkeyrefs[i] − 1;4 if(pkeyrefs[i] = 0) then5 pkeys[i]:=0; ret:=true;6 break :=true;7 else i:=i + 1;8 done{(¬ret ∧ ¬break) ⇒ (∀k.k < n ⇒ pkeys[k ] �= key)

∧(¬ret ∧ break) ⇒ (pkeys[i] = key ∧ pkeyrefs[i] �= 0)∧ ret ⇒ (pkeyrefs[i] = 0 ∧ pkeys[i] = 0) }

(c) devres{ i = 0 ∧ ¬ret }1 while i < n ∧ ¬ret do2 if tbl[i] = addr then3 tbl[i]:=0; ret:=true4 else5 i:=i + 16 end{(¬ret ⇒ ∀k. k < n ⇒ tbl[k] �= addr)

∧(ret ⇒ tbl[i] = 0) }

(b) selection sort{ i = 0 }

1 while i < n − 1 do2 min:=i;3 j :=i + 1;4 while j < n do5 if a[j] < a[min] then6 min:=j;7 j:=j + 1;8 done9 if i �=min then

10 tmp:=a[i];11 a[i]:=a[min];12 a[min]:=tmp;13 i:=i + 1;14 done

{(i ≥ n − 1)∧ (∀k1.k1 < n ⇒

(∃k2.k2 < n ∧ a[k1] =�a[k2]))}

Fig. 5. Benchmark Examples: (a) rm pkey from Linux InfiniBand driver, (b)

selection sort from [20], and (c) devres from Linux library.

Observe that our algorithm is able to infer an arbitrary quantifier-free formula

(over a fixed set of atomic propositions) to fill the hole in the given template. A

simple template such as ∀k.[] suffices to serve as a hint in our approach.

selection sort from [20] Consider the selection sort algorithm in Figure 5(b).

Let �a[] denote the content of the array a[] before the algorithm is executed. The

postcondition states that the contents of array a[] come from its old contents.

In this test case, we apply our invariant generation algorithm to compute an

invariant to establish the postcondition of the outer loop. For computing the

invariant of the outer loop, we make use of the inner loop’s specification.

We use the following set of atomic propositions: {k1 ≥ 0, k1 < i, k1 = i,k2 < n, k2 = n, a[k1] = �a[k2], i < n−1, i = min}. Using the template ∀k1.∃k2.[],our algorithm infers following invariants in different runs:

∀k1.(∃k2.[(k2 < n ∧ a[k1] = �a[k2]) ∨ k1 ≥ i]); and∀k1.(∃k2.[(k1 ≥ i ∨min = i ∨ k2 < n) ∧ (k1 ≥ i ∨ (min �= i ∧ a[k1] =� a[k2]))]).

Note that all membership queries are resolved randomly due to the alternation of

quantifiers in array theory. Still a simple random walk suffices to find invariants in

this example. Moreover, templates allow us to infer not only universally quantified

invariants but also first-order invariants with alternating quantifications. Inferring

arbitrary quantifier-free formulae over a fixed set of atomic propositions again

greatly simplifies the form of templates used in this example.

13

From Linux InfiniBand Driver

Templates 8 atomic propositions∀k.[ ]

Page 8: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

(a) rm pkey{ i = 0 ∧ key �= 0 ∧ ¬ret ∧ ¬break}1 while(i < n ∧ ¬break) do2 if(pkeys[i] = key) then3 pkeyrefs[i]:=pkeyrefs[i] − 1;4 if(pkeyrefs[i] = 0) then5 pkeys[i]:=0; ret:=true;6 break :=true;7 else i:=i + 1;8 done{(¬ret ∧ ¬break) ⇒ (∀k.k < n ⇒ pkeys[k ] �= key)

∧(¬ret ∧ break) ⇒ (pkeys[i] = key ∧ pkeyrefs[i] �= 0)∧ ret ⇒ (pkeyrefs[i] = 0 ∧ pkeys[i] = 0) }

(c) devres{ i = 0 ∧ ¬ret }1 while i < n ∧ ¬ret do2 if tbl[i] = addr then3 tbl[i]:=0; ret:=true4 else5 i:=i + 16 end{(¬ret ⇒ ∀k. k < n ⇒ tbl[k] �= addr)

∧(ret ⇒ tbl[i] = 0) }

(b) selection sort{ i = 0 }

1 while i < n − 1 do2 min:=i;3 j :=i + 1;4 while j < n do5 if a[j] < a[min] then6 min:=j;7 j:=j + 1;8 done9 if i �=min then

10 tmp:=a[i];11 a[i]:=a[min];12 a[min]:=tmp;13 i:=i + 1;14 done

{(i ≥ n − 1)∧ (∀k1.k1 < n ⇒

(∃k2.k2 < n ∧ a[k1] =�a[k2]))}

Fig. 5. Benchmark Examples: (a) rm pkey from Linux InfiniBand driver, (b)

selection sort from [20], and (c) devres from Linux library.

Observe that our algorithm is able to infer an arbitrary quantifier-free formula

(over a fixed set of atomic propositions) to fill the hole in the given template. A

simple template such as ∀k.[] suffices to serve as a hint in our approach.

selection sort from [20] Consider the selection sort algorithm in Figure 5(b).

Let �a[] denote the content of the array a[] before the algorithm is executed. The

postcondition states that the contents of array a[] come from its old contents.

In this test case, we apply our invariant generation algorithm to compute an

invariant to establish the postcondition of the outer loop. For computing the

invariant of the outer loop, we make use of the inner loop’s specification.

We use the following set of atomic propositions: {k1 ≥ 0, k1 < i, k1 = i,k2 < n, k2 = n, a[k1] = �a[k2], i < n−1, i = min}. Using the template ∀k1.∃k2.[],our algorithm infers following invariants in different runs:

∀k1.(∃k2.[(k2 < n ∧ a[k1] = �a[k2]) ∨ k1 ≥ i]); and∀k1.(∃k2.[(k1 ≥ i ∨min = i ∨ k2 < n) ∧ (k1 ≥ i ∨ (min �= i ∧ a[k1] =� a[k2]))]).

Note that all membership queries are resolved randomly due to the alternation of

quantifiers in array theory. Still a simple random walk suffices to find invariants in

this example. Moreover, templates allow us to infer not only universally quantified

invariants but also first-order invariants with alternating quantifications. Inferring

arbitrary quantifier-free formulae over a fixed set of atomic propositions again

greatly simplifies the form of templates used in this example.

13

rm pkey from Linux InfiniBand Driver Figure 5(a) is a while statement

extractedfrom Linux InfiniBand

driver.4 The conjuncts

in the postcondition

represent(1) if the loop terminates without break, all

elements of pkeys are

not equalto key (line 2); (2) if t

he loop terminates with break but ret is false

,

then pkeys[i] is equal to key (line 2) but pkeyrefs[i] is not equal to zero (line

4); (3) if ret is true after the loop, thenboth pkeyrefs[i ] (lin

e 4) and pkeys[i ]

(line 5) are equal to zero. From the postcondition, we guess that

an invariant

can be universally quantified

with k. Using the simple template ∀k.[] and the

set of atomic propositio

ns {ret , break , i <n, k < i , pkeys [i ] = 0, pkeys [i ] = key ,

pkeyrefs[i ] =0, pkeyrefs[k]

= key}, our algorithmfinds following

quantified

invariantsin different runs:

(∀k.(k < i) ⇒ pkeys[k ] �= key) ∧ (ret ⇒ pkeyrefs[i ] = 0 ∧ pkeys[i ] = 0)

∧ (¬ret ∧ break ⇒ pkeys[i ] = key ∧ pkeyrefs[i ] �= 0); and

(∀k.(¬ret ∨ ¬break ∨ (pkeyrefs[i] =0 ∧ pkeys[i] = 0)) ∧ (pkeys[k] �= key ∨ k ≥ i)

∧(¬ret ∨ (pkeyrefs[i] =0 ∧ pkeys[i] = 0 ∧ i < n ∧ break))

∧ (¬break ∨ pkeyrefs[i] �= 0 ∨ ret) ∧ (¬break ∨ pkeys[i] = key ∨ ret)).

In spite of undecidabilit

y of first-order theories

in Yices and random answers,

each of the 3000(= 6×500) runs

in our experiments infers

an invariant successfully

.

Moreover, several qua

ntified invariantsare found in each case among 500 runs.

This suggests that invariantsare abundant

. Note that the templates in the

test cases selection sort and tracepoint2have alternatin

g quantification.

Satisfiability of alternat

ing quantifiedformulae is in general un

decidable.That is

why both cases havesubstantia

lly more restarts than the others. In

terestingly,

our algorithm is able to generate a verifiable

invariantin each run. Our simple

randomized mechanismproves to

be effective even for most difficult cases.

7 Related Work

Comparing with the work [15] of generating quantifier-

free invariants,we develop

the followingtechnical e

xtensions.First, we integrate potential

counterexamples

in resolving equivalence query algorithm

(line 6 - 7 in Algorithm1, and line 3 in

Algorithm2) instead

of restarting. Due to the undecidab

ility of satisfiability of

quantifiedformulae, SMT solvers oft

en give potential counte

rexamples. We exploit

potentialcounterex

amples to enhance our algorithm. Second,

a new condition

(Definition 1) to answer positively in resolving

membership queries isproposed.

Without this conditio

n, we can answer negatively to membership queries.

In contrast toprevious t

emplate-based approache

s [20,9], our template is more

general asit allows a

rbitrary hole-fillingquantifier-

free formulae. Thetemplates

in [20] can only be filled with formulae over conjunctions of

predicatesfrom a

given set. Anydisjunctio

n must be explicitlyspecified

as part of a template.

In [9], the authors consider inva

riants of the form E ∧

�nj=1 ∀Uj(Fj ⇒ ej), where

E,Fj and ej must be quantifierfree finite conjuction

s of atomic facts.

4 The source code can be found in function rm pkey of drivers/infiniband/hw/

ipath/ipathmad.c in Linux 2.6.28

14

Find this invariant in 3 seconds

Page 9: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

(a) rm pkey{ i = 0 ∧ key �= 0 ∧ ¬ret ∧ ¬break}1 while(i < n ∧ ¬break) do2 if(pkeys[i] = key) then3 pkeyrefs[i]:=pkeyrefs[i] − 1;4 if(pkeyrefs[i] = 0) then5 pkeys[i]:=0; ret:=true;6 break :=true;7 else i:=i + 1;8 done{(¬ret ∧ ¬break) ⇒ (∀k.k < n ⇒ pkeys[k ] �= key)

∧(¬ret ∧ break) ⇒ (pkeys[i] = key ∧ pkeyrefs[i] �= 0)∧ ret ⇒ (pkeyrefs[i] = 0 ∧ pkeys[i] = 0) }

(c) devres{ i = 0 ∧ ¬ret }1 while i < n ∧ ¬ret do2 if tbl[i] = addr then3 tbl[i]:=0; ret:=true4 else5 i:=i + 16 end{(¬ret ⇒ ∀k. k < n ⇒ tbl[k] �= addr)

∧(ret ⇒ tbl[i] = 0) }

(b) selection sort{ i = 0 }

1 while i < n − 1 do2 min:=i;3 j :=i + 1;4 while j < n do5 if a[j] < a[min] then6 min:=j;7 j:=j + 1;8 done9 if i �=min then

10 tmp:=a[i];11 a[i]:=a[min];12 a[min]:=tmp;13 i:=i + 1;14 done

{(i ≥ n − 1)∧ (∀k1.k1 < n ⇒

(∃k2.k2 < n ∧ a[k1] =�a[k2]))}

Fig. 5. Benchmark Examples: (a) rm pkey from Linux InfiniBand driver, (b)

selection sort from [20], and (c) devres from Linux library.

Observe that our algorithm is able to infer an arbitrary quantifier-free formula

(over a fixed set of atomic propositions) to fill the hole in the given template. A

simple template such as ∀k.[] suffices to serve as a hint in our approach.

selection sort from [20] Consider the selection sort algorithm in Figure 5(b).

Let �a[] denote the content of the array a[] before the algorithm is executed. The

postcondition states that the contents of array a[] come from its old contents.

In this test case, we apply our invariant generation algorithm to compute an

invariant to establish the postcondition of the outer loop. For computing the

invariant of the outer loop, we make use of the inner loop’s specification.

We use the following set of atomic propositions: {k1 ≥ 0, k1 < i, k1 = i,k2 < n, k2 = n, a[k1] = �a[k2], i < n−1, i = min}. Using the template ∀k1.∃k2.[],our algorithm infers following invariants in different runs:

∀k1.(∃k2.[(k2 < n ∧ a[k1] = �a[k2]) ∨ k1 ≥ i]); and∀k1.(∃k2.[(k1 ≥ i ∨min = i ∨ k2 < n) ∧ (k1 ≥ i ∨ (min �= i ∧ a[k1] =� a[k2]))]).

Note that all membership queries are resolved randomly due to the alternation of

quantifiers in array theory. Still a simple random walk suffices to find invariants in

this example. Moreover, templates allow us to infer not only universally quantified

invariants but also first-order invariants with alternating quantifications. Inferring

arbitrary quantifier-free formulae over a fixed set of atomic propositions again

greatly simplifies the form of templates used in this example.

13

rm pkey from Linux InfiniBand Driver Figure 5(a) is a while statement

extractedfrom Linux InfiniBand

driver.4 The conjuncts

in the postcondition

represent(1) if the loop terminates without break, all

elements of pkeys are

not equalto key (line 2); (2) if t

he loop terminates with break but ret is false

,

then pkeys[i] is equal to key (line 2) but pkeyrefs[i] is not equal to zero (line

4); (3) if ret is true after the loop, thenboth pkeyrefs[i ] (lin

e 4) and pkeys[i ]

(line 5) are equal to zero. From the postcondition, we guess that

an invariant

can be universally quantified

with k. Using the simple template ∀k.[] and the

set of atomic propositio

ns {ret , break , i <n, k < i , pkeys [i ] = 0, pkeys [i ] = key ,

pkeyrefs[i ] =0, pkeyrefs[k]

= key}, our algorithmfinds following

quantified

invariantsin different runs:

(∀k.(k < i) ⇒ pkeys[k ] �= key) ∧ (ret ⇒ pkeyrefs[i ] = 0 ∧ pkeys[i ] = 0)

∧ (¬ret ∧ break ⇒ pkeys[i ] = key ∧ pkeyrefs[i ] �= 0); and

(∀k.(¬ret ∨ ¬break ∨ (pkeyrefs[i] =0 ∧ pkeys[i] = 0)) ∧ (pkeys[k] �= key ∨ k ≥ i)

∧(¬ret ∨ (pkeyrefs[i] =0 ∧ pkeys[i] = 0 ∧ i < n ∧ break))

∧ (¬break ∨ pkeyrefs[i] �= 0 ∨ ret) ∧ (¬break ∨ pkeys[i] = key ∨ ret)).

In spite of undecidabilit

y of first-order theories

in Yices and random answers,

each of the 3000(= 6×500) runs

in our experiments infers

an invariant successfully

.

Moreover, several qua

ntified invariantsare found in each case among 500 runs.

This suggests that invariantsare abundant

. Note that the templates in the

test cases selection sort and tracepoint2have alternatin

g quantification.

Satisfiability of alternat

ing quantifiedformulae is in general un

decidable.That is

why both cases havesubstantia

lly more restarts than the others. In

terestingly,

our algorithm is able to generate a verifiable

invariantin each run. Our simple

randomized mechanismproves to

be effective even for most difficult cases.

7 Related Work

Comparing with the work [15] of generating quantifier-

free invariants,we develop

the followingtechnical e

xtensions.First, we integrate potential

counterexamples

in resolving equivalence query algorithm

(line 6 - 7 in Algorithm1, and line 3 in

Algorithm2) instead

of restarting. Due to the undecidab

ility of satisfiability of

quantifiedformulae, SMT solvers oft

en give potential counte

rexamples. We exploit

potentialcounterex

amples to enhance our algorithm. Second,

a new condition

(Definition 1) to answer positively in resolving

membership queries isproposed.

Without this conditio

n, we can answer negatively to membership queries.

In contrast toprevious t

emplate-based approache

s [20,9], our template is more

general asit allows a

rbitrary hole-fillingquantifier-

free formulae. Thetemplates

in [20] can only be filled with formulae over conjunctions of

predicatesfrom a

given set. Anydisjunctio

n must be explicitlyspecified

as part of a template.

In [9], the authors consider inva

riants of the form E ∧

�nj=1 ∀Uj(Fj ⇒ ej), where

E,Fj and ej must be quantifierfree finite conjuction

s of atomic facts.

4 The source code can be found in function rm pkey of drivers/infiniband/hw/

ipath/ipathmad.c in Linux 2.6.28

14

Find this invariant in 3 seconds

rm pkey from Linux InfiniBand Driver Figure 5(a) is a while statement

extractedfrom Linux InfiniBand

driver.4 The conjuncts

in the postcondition

represent(1) if the loop terminates without break, all

elements of pkeys are

not equalto key (line 2); (2) if t

he loop terminates with break but ret is false

,

then pkeys[i] is equal to key (line 2) but pkeyrefs[i] is not equal to zero (line

4); (3) if ret is true after the loop, thenboth pkeyrefs[i ] (lin

e 4) and pkeys[i ]

(line 5) are equal to zero. From the postcondition, we guess that

an invariant

can be universally quantified

with k. Using the simple template ∀k.[] and the

set of atomic propositio

ns {ret , break , i <n, k < i , pkeys [i ] = 0, pkeys [i ] = key ,

pkeyrefs[i ] =0, pkeyrefs[k]

= key}, our algorithmfinds following

quantified

invariantsin different runs:

(∀k.(k < i) ⇒ pkeys[k ] �= key) ∧ (ret ⇒ pkeyrefs[i ] = 0 ∧ pkeys[i ] = 0)

∧ (¬ret ∧ break ⇒ pkeys[i ] = key ∧ pkeyrefs[i ] �= 0); and

(∀k.(¬ret ∨ ¬break ∨ (pkeyrefs[i] =0 ∧ pkeys[i] = 0)) ∧ (pkeys[k] �= key ∨ k ≥ i)

∧(¬ret ∨ (pkeyrefs[i] =0 ∧ pkeys[i] = 0 ∧ i < n ∧ break))

∧ (¬break ∨ pkeyrefs[i] �= 0 ∨ ret) ∧ (¬break ∨ pkeys[i] = key ∨ ret)).

In spite of undecidabilit

y of first-order theories

in Yices and random answers,

each of the 3000(= 6×500) runs

in our experiments infers

an invariant successfully

.

Moreover, several qua

ntified invariantsare found in each case among 500 runs.

This suggests that invariantsare abundant

. Note that the templates in the

test cases selection sort and tracepoint2have alternatin

g quantification.

Satisfiability of alternat

ing quantifiedformulae is in general un

decidable.That is

why both cases havesubstantia

lly more restarts than the others. In

terestingly,

our algorithm is able to generate a verifiable

invariantin each run. Our simple

randomized mechanismproves to

be effective even for most difficult cases.

7 Related Work

Comparing with the work [15] of generating quantifier-

free invariants,we develop

the followingtechnical e

xtensions.First, we integrate potential

counterexamples

in resolving equivalence query algorithm

(line 6 - 7 in Algorithm1, and line 3 in

Algorithm2) instead

of restarting. Due to the undecidab

ility of satisfiability of

quantifiedformulae, SMT solvers oft

en give potential counte

rexamples. We exploit

potentialcounterex

amples to enhance our algorithm. Second,

a new condition

(Definition 1) to answer positively in resolving

membership queries isproposed.

Without this conditio

n, we can answer negatively to membership queries.

In contrast toprevious t

emplate-based approache

s [20,9], our template is more

general asit allows a

rbitrary hole-fillingquantifier-

free formulae. Thetemplates

in [20] can only be filled with formulae over conjunctions of

predicatesfrom a

given set. Anydisjunctio

n must be explicitlyspecified

as part of a template.

In [9], the authors consider inva

riants of the form E ∧

�nj=1 ∀Uj(Fj ⇒ ej), where

E,Fj and ej must be quantifierfree finite conjuction

s of atomic facts.

4 The source code can be found in function rm pkey of drivers/infiniband/hw/

ipath/ipathmad.c in Linux 2.6.28

14

Page 10: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

Algorithmic Learning:CDNF Algorithm

Page 11: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

CDNF Algorithm

Query

CDNF Algorithm

Answer

...

Query

Answer

Teacher

BooleanFormula

λ

Actively learning a Boolean formula from membership and equivalence queries

(polynomial # of queries in and # of variables)

Teacher is required

Result

Bshouty, N.H.: Exact learning boolean functions via the monotone theory. Information and Computation 123 (1995) 146–153

Deriving Invariants by Algorithmic Learning,Decision Procedures, and Predicate Abstraction1

Yungbum Jung† and Soonho Kong† and Bow-Yaw Wang‡ and Kwangkeun Yi†

† School of Computer Science and Engineering, Seoul National University

{dreameye,soon,kwang}@ropas.snu.ac.kr‡ Institute of Information Science, Academia Sinica

[email protected]

We present a novel technique for finding loop invariants in propositional formulae by com-

bining algorithmic learning, decision procedures, and predicate abstraction. Given invariant

approximations derived from pre- and post-conditions, our new technique exploits the flexibil-

ity in invariants by a simple randomized mechanism.

Algorithmic learning has been applied to assumption generation in compositional reasoning.

In contrast to traditional techniques, the learning approach does not derive assumptions in an

off-line manner. It instead finds assumptions by interacting with a model checker progressively.

Since assumptions in compositional reasoning are generally not unique, algorithmic learning can

exploit the flexibility in assumptions to attain preferable solutions. Applications in verifying

concurrent systems have been reported.

Finding loop invariants follows a similar pattern. Invariants are often not unique. Indeed,

programmers derive invariants incrementally. They usually have their guesses of invariants in

mind, and gradually refine their guesses by observing program behavior more. Since in practice

there are many invariants for given pre- and post-conditions, programmers have more freedom in

deriving invariants. Yet traditional invariant generation techniques do not exploit the flexibility.

They have a similar impediment to traditional assumption generation.

We report our first findings in applying algorithmic learning to invariant generation. We

show that the three technologies (algorithmic learning, decision procedures, and predicate ab-

straction) can be arranged in concert to derive loop invariants in propositional (or, quantifier-

free) formulae. The new technique is able to generate invariants for some Linux device drivers

and SPEC2000 benchmarks without any help from static or dynamic analyses.

For a while loop, an exact learning algorithm for Boolean formulae searches for invariants

by asking queries. Queries can be resolved (not always, see below) by decision procedures

automatically. Recall that the learning algorithm generates only Boolean formulae but deci-

sion procedures work in propositional formulae. We thus perform predicate abstraction and

concretization to integrate the two components.

In reality, information about loop invariant is incomplete. Queries may not be resolvable

due to insufficient information. One striking feature of our learning approach is to exploit the

flexibility in invariants. When query resolution requires information unavailable to decision

procedures, we simply give a random answer. We surely could use static analysis to compute

soundly approximated information other than random answers. Yet there are so many invariants

for the given pre- and post-conditions. A little bit of incorrect information does not prevent

algorithmic learning from inferring correct invariants. Indeed, the learning algorithm is able to

derive invariants in our experiments by coin tossing.

The technique can be seen as a framework for invariant generation. Static analyzers can

contribute by providing information to algorithmic learning. Ours is hence orthogonal to existing

techniques.

1This work is to be presented at VMCAI’10

Deriving Invariants by Algorithmic Learning,Decision Procedures, and Predicate Abstraction1

Yungbum Jung† and Soonho Kong† and Bow-Yaw Wang‡ and Kwangkeun Yi†

† School of Computer Science and Engineering, Seoul National University

{dreameye,soon,kwang}@ropas.snu.ac.kr‡ Institute of Information Science, Academia Sinica

[email protected]

We present a novel technique for finding loop invariants in propositional formulae by com-

bining algorithmic learning, decision procedures, and predicate abstraction. Given invariant

approximations derived from pre- and post-conditions, our new technique exploits the flexibil-

ity in invariants by a simple randomized mechanism.

Algorithmic learning has been applied to assumption generation in compositional reasoning.

In contrast to traditional techniques, the learning approach does not derive assumptions in an

off-line manner. It instead finds assumptions by interacting with a model checker progressively.

Since assumptions in compositional reasoning are generally not unique, algorithmic learning can

exploit the flexibility in assumptions to attain preferable solutions. Applications in verifying

concurrent systems have been reported.

Finding loop invariants follows a similar pattern. Invariants are often not unique. Indeed,

programmers derive invariants incrementally. They usually have their guesses of invariants in

mind, and gradually refine their guesses by observing program behavior more. Since in practice

there are many invariants for given pre- and post-conditions, programmers have more freedom in

deriving invariants. Yet traditional invariant generation techniques do not exploit the flexibility.

They have a similar impediment to traditional assumption generation.

We report our first findings in applying algorithmic learning to invariant generation. We

show that the three technologies (algorithmic learning, decision procedures, and predicate ab-

straction) can be arranged in concert to derive loop invariants in propositional (or, quantifier-

free) formulae. The new technique is able to generate invariants for some Linux device drivers

and SPEC2000 benchmarks without any help from static or dynamic analyses.

For a while loop, an exact learning algorithm for Boolean formulae searches for invariants

by asking queries. Queries can be resolved (not always, see below) by decision procedures

automatically. Recall that the learning algorithm generates only Boolean formulae but deci-

sion procedures work in propositional formulae. We thus perform predicate abstraction and

concretization to integrate the two components.

In reality, information about loop invariant is incomplete. Queries may not be resolvable

due to insufficient information. One striking feature of our learning approach is to exploit the

flexibility in invariants. When query resolution requires information unavailable to decision

procedures, we simply give a random answer. We surely could use static analysis to compute

soundly approximated information other than random answers. Yet there are so many invariants

for the given pre- and post-conditions. A little bit of incorrect information does not prevent

algorithmic learning from inferring correct invariants. Indeed, the learning algorithm is able to

derive invariants in our experiments by coin tossing.

The technique can be seen as a framework for invariant generation. Static analyzers can

contribute by providing information to algorithmic learning. Ours is hence orthogonal to existing

techniques.

1This work is to be presented at VMCAI’10

|λ|

CDNF Algorithm

Query

CDNF Algorithm

Answer

...

Query

Answer

Teacher

BooleanFormula

λ

Actively learning a Boolean formula from membership and equivalence queries

(polynomial # of queries in and # of variables)

Teacher is required

Result

Bshouty, N.H.: Exact learning boolean functions via the monotone theory. Information and Computation 123 (1995) 146–153

Deriving Invariants by Algorithmic Learning,Decision Procedures, and Predicate Abstraction1

Yungbum Jung† and Soonho Kong† and Bow-Yaw Wang‡ and Kwangkeun Yi†

† School of Computer Science and Engineering, Seoul National University

{dreameye,soon,kwang}@ropas.snu.ac.kr‡ Institute of Information Science, Academia Sinica

[email protected]

We present a novel technique for finding loop invariants in propositional formulae by com-

bining algorithmic learning, decision procedures, and predicate abstraction. Given invariant

approximations derived from pre- and post-conditions, our new technique exploits the flexibil-

ity in invariants by a simple randomized mechanism.

Algorithmic learning has been applied to assumption generation in compositional reasoning.

In contrast to traditional techniques, the learning approach does not derive assumptions in an

off-line manner. It instead finds assumptions by interacting with a model checker progressively.

Since assumptions in compositional reasoning are generally not unique, algorithmic learning can

exploit the flexibility in assumptions to attain preferable solutions. Applications in verifying

concurrent systems have been reported.

Finding loop invariants follows a similar pattern. Invariants are often not unique. Indeed,

programmers derive invariants incrementally. They usually have their guesses of invariants in

mind, and gradually refine their guesses by observing program behavior more. Since in practice

there are many invariants for given pre- and post-conditions, programmers have more freedom in

deriving invariants. Yet traditional invariant generation techniques do not exploit the flexibility.

They have a similar impediment to traditional assumption generation.

We report our first findings in applying algorithmic learning to invariant generation. We

show that the three technologies (algorithmic learning, decision procedures, and predicate ab-

straction) can be arranged in concert to derive loop invariants in propositional (or, quantifier-

free) formulae. The new technique is able to generate invariants for some Linux device drivers

and SPEC2000 benchmarks without any help from static or dynamic analyses.

For a while loop, an exact learning algorithm for Boolean formulae searches for invariants

by asking queries. Queries can be resolved (not always, see below) by decision procedures

automatically. Recall that the learning algorithm generates only Boolean formulae but deci-

sion procedures work in propositional formulae. We thus perform predicate abstraction and

concretization to integrate the two components.

In reality, information about loop invariant is incomplete. Queries may not be resolvable

due to insufficient information. One striking feature of our learning approach is to exploit the

flexibility in invariants. When query resolution requires information unavailable to decision

procedures, we simply give a random answer. We surely could use static analysis to compute

soundly approximated information other than random answers. Yet there are so many invariants

for the given pre- and post-conditions. A little bit of incorrect information does not prevent

algorithmic learning from inferring correct invariants. Indeed, the learning algorithm is able to

derive invariants in our experiments by coin tossing.

The technique can be seen as a framework for invariant generation. Static analyzers can

contribute by providing information to algorithmic learning. Ours is hence orthogonal to existing

techniques.

1This work is to be presented at VMCAI’10

Deriving Invariants by Algorithmic Learning,Decision Procedures, and Predicate Abstraction1

Yungbum Jung† and Soonho Kong† and Bow-Yaw Wang‡ and Kwangkeun Yi†

† School of Computer Science and Engineering, Seoul National University

{dreameye,soon,kwang}@ropas.snu.ac.kr‡ Institute of Information Science, Academia Sinica

[email protected]

We present a novel technique for finding loop invariants in propositional formulae by com-

bining algorithmic learning, decision procedures, and predicate abstraction. Given invariant

approximations derived from pre- and post-conditions, our new technique exploits the flexibil-

ity in invariants by a simple randomized mechanism.

Algorithmic learning has been applied to assumption generation in compositional reasoning.

In contrast to traditional techniques, the learning approach does not derive assumptions in an

off-line manner. It instead finds assumptions by interacting with a model checker progressively.

Since assumptions in compositional reasoning are generally not unique, algorithmic learning can

exploit the flexibility in assumptions to attain preferable solutions. Applications in verifying

concurrent systems have been reported.

Finding loop invariants follows a similar pattern. Invariants are often not unique. Indeed,

programmers derive invariants incrementally. They usually have their guesses of invariants in

mind, and gradually refine their guesses by observing program behavior more. Since in practice

there are many invariants for given pre- and post-conditions, programmers have more freedom in

deriving invariants. Yet traditional invariant generation techniques do not exploit the flexibility.

They have a similar impediment to traditional assumption generation.

We report our first findings in applying algorithmic learning to invariant generation. We

show that the three technologies (algorithmic learning, decision procedures, and predicate ab-

straction) can be arranged in concert to derive loop invariants in propositional (or, quantifier-

free) formulae. The new technique is able to generate invariants for some Linux device drivers

and SPEC2000 benchmarks without any help from static or dynamic analyses.

For a while loop, an exact learning algorithm for Boolean formulae searches for invariants

by asking queries. Queries can be resolved (not always, see below) by decision procedures

automatically. Recall that the learning algorithm generates only Boolean formulae but deci-

sion procedures work in propositional formulae. We thus perform predicate abstraction and

concretization to integrate the two components.

In reality, information about loop invariant is incomplete. Queries may not be resolvable

due to insufficient information. One striking feature of our learning approach is to exploit the

flexibility in invariants. When query resolution requires information unavailable to decision

procedures, we simply give a random answer. We surely could use static analysis to compute

soundly approximated information other than random answers. Yet there are so many invariants

for the given pre- and post-conditions. A little bit of incorrect information does not prevent

algorithmic learning from inferring correct invariants. Indeed, the learning algorithm is able to

derive invariants in our experiments by coin tossing.

The technique can be seen as a framework for invariant generation. Static analyzers can

contribute by providing information to algorithmic learning. Ours is hence orthogonal to existing

techniques.

1This work is to be presented at VMCAI’10

|λ|

Page 12: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

Membership QueryMembership Query asks whetherBoolean assignment satisfies the Boolean formula µ λ

MEM (µ)

MEM (µ) = Yes if µ |= λ

MEM (µ) = No if µ �|= λ

Example: XOR function

MEM({b1 = T, b2 = F}) = Yes

MEM({b1 = T, b2 = T}) = No

T ⊕ F = T

T ⊕ T = F

λ = b1 ⊕ b2

Membership QueryMembership Query asks whetherBoolean assignment satisfies the Boolean formula µ λ

MEM (µ)

MEM (µ) = Yes if µ |= λ

MEM (µ) = No if µ �|= λ

Example: XOR function

MEM({b1 = T, b2 = F}) = Yes

MEM({b1 = T, b2 = T}) = No

T ⊕ F = T

T ⊕ T = F

λ = b1 ⊕ b2

Yes

No

Page 13: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

Equivalence QueryEquivalence Query asks whether the guessed Boolean formula is equivalent to ..

EQ(β)λβ

if β ≡ λ

YesEQ(b1 ∧ ¬b2 ∨ ¬b1 ∧ b2) = Yes

Example: XOR functionλ = b1 ⊕ b2

EQ((b1 ∧ ¬b2) ∨ (¬b1 ∧ b2)) = Yes

Page 14: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

Equivalence Query

if β ≡ λ

Yes

if β �≡ λ ∧ (µ |= β ⊕ λ)

No with µ

β λ

µOtherwise, the teacher needs to provide a truth assignment as a counterexample .

Example: XOR function

Example: XOR function

λ =

λ =

b1 ⊕ b2

b1 ⊕ b2

T ⊕ T = FT ∨ T = T

∵EQ(b1 ∨ b2) = No with {b1 = T, b2 = T}

Equivalence Query asks whether the guessed Boolean formula is equivalent to ..

EQ(β)λβ

EQ(b1 ∧ ¬b2 ∨ ¬b1 ∧ b2) = YesEQ((b1 ∧ ¬b2) ∨ (¬b1 ∧ b2)) = Yes

Page 15: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

Goal

Query

CDNF Algorithm

Answer

...Query

Answer

Teacherfor

an Invariant

A Loop Invariant

λResult

Implement a Teacherto guide CDNF algorithm infer an InvariantImplementing a Teacher for guiding CDNF

algorithm to find a quantified invariant

Goal

Page 16: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

Automated Technique Input

Annotated Loop(Pre-/Post-conditions)

DecisionProcedures

Algorithmic Learning

Query

Answer

Query

Answer. .

.

Predicate Abstraction Boolean

FormulaeQuantifiedFormulae

InputAtomic Propositions

Output

A Loop Invariant(Quantified Formulae)

∀i.[i < n ⇒ A[i] = 0]

First Issue

Templates

Templates Input

∀k.[ ] (SMT solvers)

Page 17: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

Automated Technique Input

Annotated Loop(Pre-/Post-conditions)

DecisionProcedures

Algorithmic Learning

Query

Answer

Query

Answer. .

.

Predicate Abstraction Boolean

FormulaeQuantifiedFormulae

InputAtomic Propositions

Output

A Loop Invariant(Quantified Formulae)

∀i.[i < n ⇒ A[i] = 0]

First Issue

Templates

Templates Input

∀k.[ ] (SMT solvers)

Page 18: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

Relating Domains

We want to find a Quantified invariantwhile the CDNF algorithm finds a Boolean formula.

Problem:

BooleanFormula

QuantifiedFormula

PropositionalFormula

VMCAI’10APLAS’10

Templates

∀k.k < i ⇒ a[k] ≤ a[m] k < i ⇒ a[k] ≤ a[m] ¬bk<i ∨ ba[k]≤a[m]

∀k.[ ]Predicate Abstraction

Page 19: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

Quantified Formula Boolean Formula

Teacher CDNF Algorithm

Answering Queries

∀k.k < i ⇒ a[k] ≤ a[m] ¬bk<i ∨ ba[k]≤a[m]

Page 20: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

Quantified Formula Boolean Formula

Teacher CDNF Algorithm

Answering Queries

YesYes

If teacher says “Yes” then it should really mean “Yes”

∀k.k < i ⇒ a[k] ≤ a[m] ¬bk<i ∨ ba[k]≤a[m]

Page 21: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

Quantified Formula Boolean Formula

Teacher CDNF Algorithm

Answering Queries

YesYes

If teacher says “Yes” then it should really mean “Yes”

No No

If teacher says “No” then it should really mean “No”

∀k.k < i ⇒ a[k] ≤ a[m] ¬bk<i ∨ ba[k]≤a[m]

Page 22: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

Answering Queries

if t[θ1] ⇒ t[θ2] then θ1 ⇒ θ2

If teacher says “Yes” then it should really mean “Yes”

If teacher says “No” then it should really mean “No”if t[θ1] �⇒ t[θ2] then θ1 �⇒ θ2

Page 23: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

Answering Queries

if t[θ1] ⇒ t[θ2] then θ1 ⇒ θ2

If teacher says “Yes” then it should really mean “Yes”

If teacher says “No” then it should really mean “No”if t[θ1] �⇒ t[θ2] then θ1 �⇒ θ2

Page 24: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

Answering Queries

if t[θ1] ⇒ t[θ2] then θ1 ⇒ θ2

If teacher says “Yes” then it should really mean “Yes”

If teacher says “No” then it should really mean “No”if t[θ1] �⇒ t[θ2] then θ1 �⇒ θ2

A Counter Example

if ∀i.i < 10 ⇒ ∀i.i < 1 then i < 10 �⇒ i < 1i < 10 ⇒ i < 1

Page 25: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

Answering Queries

if t[θ1] ⇒ t[θ2] then θ1 ⇒ θ2

If teacher says “Yes” then it should really mean “Yes”

If teacher says “No” then it should really mean “No”if t[θ1] �⇒ t[θ2] then θ1 �⇒ θ2

A Counter Example

if ∀i.i < 10 ⇒ ∀i.i < 1 then i < 10 �⇒ i < 1i < 10 ⇒ i < 1

Page 26: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

Answering Queries

if t[θ1] ⇒ t[θ2] then θ1 ⇒ θ2

If teacher says “Yes” then it should really mean “Yes”

If teacher says “No” then it should really mean “No”if t[θ1] �⇒ t[θ2] then θ1 �⇒ θ2

A Counter Example

if ∀i.i < 10 ⇒ ∀i.i < 1 then i < 10 �⇒ i < 1i < 10 ⇒ i < 1

Page 27: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

Answering Queries

if t[θ1] ⇒ t[θ2] then θ1 ⇒ θ2

If teacher says “Yes” then it should really mean “Yes”

If teacher says “No” then it should really mean “No”if t[θ1] �⇒ t[θ2] then θ1 �⇒ θ2

Well-formedness condition

A Counter Example

if ∀i.i < 10 ⇒ ∀i.i < 1 then i < 10 �⇒ i < 1i < 10 ⇒ i < 1

Page 28: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

Second Issue

The teacher is asked to answer questions about invariants without knowing invariants.

Problem:

Page 29: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

Second Issue

The teacher is asked to answer questions about invariants without knowing invariants.

Problem:

We use approximations and random answers

Solution:

Page 30: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

Invariant PropertiesFor the annotated loop

An Invariant I must satisfy all the following conditions:

(A) ( holds when entering the loop)

(B) ( holds at each iteration)

(C) ( gives after leaving the loop)

Let θ ∈ PropA be a quantifier-free formula. We write t[θ] to denote the first-order

formula obtained by replacing the hole in t[] with θ. Observe that any first-order

formula can be transformed into the prenex normal form; it can be expressed in

the form of a proper template.

A precondition Pre(ρ, S) for ρ ∈ Pred with respect to a statement S is a

first-order formula that guarantees ρ after the execution of the statement S. Let{δ} while κ do S {�} be an annotated loop and t[] ∈ τ be a template. The

invariant generation problem with template t[] is to compute a first-order formula

t[θ] such that (1) δ ⇒ t[θ]; (2) ¬κ ∧ t[θ] ⇒ �; and (3) κ ∧ t[θ] ⇒ Pre(t[θ], S).Observe that the condition (2) is equivalent to t[θ] ⇒ �∨κ. We have δ ⇒ t[θ] andt[θ] ⇒ �∨κ for any invariant t[θ]. δ and �∨κ are subsequently called the strongest

under-approximation and weakest over-approximation to invariants respectively.

A valuation ν is an assignment of natural numbers to integer variables and

truth values to Boolean variables. If A is a set of atomic propositions and Var(A)

is the set of variables occurred in A, ValVar(A) denotes the set of valuations for

Var(A). A valuation ν is a model of a first-order formula ρ (written ν |= ρ) if ρevaluates to T under ν. Let B be a set of Boolean variables. We write BoolB for

the class of Boolean formulae over Boolean variables B. A Boolean valuation µ is

an assignment of truth values to Boolean variables. The set of Boolean valuations

for B is denoted by ValB. A Boolean valuation µ is a Boolean model of the

Boolean formula β (written µ |= β) if β evaluates to T under µ.Given a first-order formula ρ, a satisfiability modulo theories (SMT) solver [6,16]

returns a model of ν if it exists. In general, SMT solver is incomplete over quan-

tified formulae and may return a potential model (written SMT (ρ)!→ ν). It

returns UNSAT (written SMT (ρ) → UNSAT ) if the solver proves the formula

unsatisfiable. Note that an SMT solver can only err when it returns a (potential)

model. If UNSAT is returned, the input formula is certainly unsatisfiable.

CDNF Learning Algorithm [3] The CDNF (Conjunctive Disjunctive Normal

Form) algorithm is an exact algorithm that computes a representation for any

target λ ∈ BoolB by asking a teacher queries. The teacher is required to resolve

two types of queries:

– Membership query MEM (µ) where µ ∈ ValB . If the valuation µ is a Boolean

model of the target Boolean formula λ, the teacher answers YES . Otherwise,

the teacher answers NO ;

– Equivalence query EQ(β) where β ∈ BoolB . If the target Boolean formula λis equivalent to β, the teacher answers YES . Otherwise, the teacher gives a

counterexample. A counterexample is a valuation µ ∈ ValB such that β and

λ evaluate to different truth values under µ.

For a Boolean formula λ ∈ BoolB, define |λ|CNF and |λ|DNF to be the sizes of

minimal Boolean formulae equivalent to λ in conjunctive and disjunctive normal

forms respectively. The CDNF algorithm infers any target Boolean formula

λ ∈ BoolB with a polynomial number of queries in |λ|CNF , |λ|DNF , and |B| [3].

5

δ ⇒ I I

I

II ∧ ¬κ ⇒ �I ∧ κ ⇒ Pre(I, S)

Page 31: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

Invariant PropertiesFor the annotated loop

An Invariant I must satisfy all the following conditions:

(A) ( holds when entering the loop)

(B) ( holds at each iteration)

(C) ( gives after leaving the loop)

Observation #1In equivalence query we can say “YES” by checking these conditions.

Let θ ∈ PropA be a quantifier-free formula. We write t[θ] to denote the first-order

formula obtained by replacing the hole in t[] with θ. Observe that any first-order

formula can be transformed into the prenex normal form; it can be expressed in

the form of a proper template.

A precondition Pre(ρ, S) for ρ ∈ Pred with respect to a statement S is a

first-order formula that guarantees ρ after the execution of the statement S. Let{δ} while κ do S {�} be an annotated loop and t[] ∈ τ be a template. The

invariant generation problem with template t[] is to compute a first-order formula

t[θ] such that (1) δ ⇒ t[θ]; (2) ¬κ ∧ t[θ] ⇒ �; and (3) κ ∧ t[θ] ⇒ Pre(t[θ], S).Observe that the condition (2) is equivalent to t[θ] ⇒ �∨κ. We have δ ⇒ t[θ] andt[θ] ⇒ �∨κ for any invariant t[θ]. δ and �∨κ are subsequently called the strongest

under-approximation and weakest over-approximation to invariants respectively.

A valuation ν is an assignment of natural numbers to integer variables and

truth values to Boolean variables. If A is a set of atomic propositions and Var(A)

is the set of variables occurred in A, ValVar(A) denotes the set of valuations for

Var(A). A valuation ν is a model of a first-order formula ρ (written ν |= ρ) if ρevaluates to T under ν. Let B be a set of Boolean variables. We write BoolB for

the class of Boolean formulae over Boolean variables B. A Boolean valuation µ is

an assignment of truth values to Boolean variables. The set of Boolean valuations

for B is denoted by ValB. A Boolean valuation µ is a Boolean model of the

Boolean formula β (written µ |= β) if β evaluates to T under µ.Given a first-order formula ρ, a satisfiability modulo theories (SMT) solver [6,16]

returns a model of ν if it exists. In general, SMT solver is incomplete over quan-

tified formulae and may return a potential model (written SMT (ρ)!→ ν). It

returns UNSAT (written SMT (ρ) → UNSAT ) if the solver proves the formula

unsatisfiable. Note that an SMT solver can only err when it returns a (potential)

model. If UNSAT is returned, the input formula is certainly unsatisfiable.

CDNF Learning Algorithm [3] The CDNF (Conjunctive Disjunctive Normal

Form) algorithm is an exact algorithm that computes a representation for any

target λ ∈ BoolB by asking a teacher queries. The teacher is required to resolve

two types of queries:

– Membership query MEM (µ) where µ ∈ ValB . If the valuation µ is a Boolean

model of the target Boolean formula λ, the teacher answers YES . Otherwise,

the teacher answers NO ;

– Equivalence query EQ(β) where β ∈ BoolB . If the target Boolean formula λis equivalent to β, the teacher answers YES . Otherwise, the teacher gives a

counterexample. A counterexample is a valuation µ ∈ ValB such that β and

λ evaluate to different truth values under µ.

For a Boolean formula λ ∈ BoolB, define |λ|CNF and |λ|DNF to be the sizes of

minimal Boolean formulae equivalent to λ in conjunctive and disjunctive normal

forms respectively. The CDNF algorithm infers any target Boolean formula

λ ∈ BoolB with a polynomial number of queries in |λ|CNF , |λ|DNF , and |B| [3].

5

δ ⇒ I I

I

II ∧ ¬κ ⇒ �I ∧ κ ⇒ Pre(I, S)

Page 32: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

Invariant PropertiesFor the annotated loop

An Invariant I must satisfy all the following conditions:

(A) ( holds when entering the loop)

(B) ( holds at each iteration)

(C) ( gives after leaving the loop)

Observation #1In equivalence query we can say “YES” by checking these conditions.

Let θ ∈ PropA be a quantifier-free formula. We write t[θ] to denote the first-order

formula obtained by replacing the hole in t[] with θ. Observe that any first-order

formula can be transformed into the prenex normal form; it can be expressed in

the form of a proper template.

A precondition Pre(ρ, S) for ρ ∈ Pred with respect to a statement S is a

first-order formula that guarantees ρ after the execution of the statement S. Let{δ} while κ do S {�} be an annotated loop and t[] ∈ τ be a template. The

invariant generation problem with template t[] is to compute a first-order formula

t[θ] such that (1) δ ⇒ t[θ]; (2) ¬κ ∧ t[θ] ⇒ �; and (3) κ ∧ t[θ] ⇒ Pre(t[θ], S).Observe that the condition (2) is equivalent to t[θ] ⇒ �∨κ. We have δ ⇒ t[θ] andt[θ] ⇒ �∨κ for any invariant t[θ]. δ and �∨κ are subsequently called the strongest

under-approximation and weakest over-approximation to invariants respectively.

A valuation ν is an assignment of natural numbers to integer variables and

truth values to Boolean variables. If A is a set of atomic propositions and Var(A)

is the set of variables occurred in A, ValVar(A) denotes the set of valuations for

Var(A). A valuation ν is a model of a first-order formula ρ (written ν |= ρ) if ρevaluates to T under ν. Let B be a set of Boolean variables. We write BoolB for

the class of Boolean formulae over Boolean variables B. A Boolean valuation µ is

an assignment of truth values to Boolean variables. The set of Boolean valuations

for B is denoted by ValB. A Boolean valuation µ is a Boolean model of the

Boolean formula β (written µ |= β) if β evaluates to T under µ.Given a first-order formula ρ, a satisfiability modulo theories (SMT) solver [6,16]

returns a model of ν if it exists. In general, SMT solver is incomplete over quan-

tified formulae and may return a potential model (written SMT (ρ)!→ ν). It

returns UNSAT (written SMT (ρ) → UNSAT ) if the solver proves the formula

unsatisfiable. Note that an SMT solver can only err when it returns a (potential)

model. If UNSAT is returned, the input formula is certainly unsatisfiable.

CDNF Learning Algorithm [3] The CDNF (Conjunctive Disjunctive Normal

Form) algorithm is an exact algorithm that computes a representation for any

target λ ∈ BoolB by asking a teacher queries. The teacher is required to resolve

two types of queries:

– Membership query MEM (µ) where µ ∈ ValB . If the valuation µ is a Boolean

model of the target Boolean formula λ, the teacher answers YES . Otherwise,

the teacher answers NO ;

– Equivalence query EQ(β) where β ∈ BoolB . If the target Boolean formula λis equivalent to β, the teacher answers YES . Otherwise, the teacher gives a

counterexample. A counterexample is a valuation µ ∈ ValB such that β and

λ evaluate to different truth values under µ.

For a Boolean formula λ ∈ BoolB, define |λ|CNF and |λ|DNF to be the sizes of

minimal Boolean formulae equivalent to λ in conjunctive and disjunctive normal

forms respectively. The CDNF algorithm infers any target Boolean formula

λ ∈ BoolB with a polynomial number of queries in |λ|CNF , |λ|DNF , and |B| [3].

5

δ ⇒ I I

I

II ∧ ¬κ ⇒ �I ∧ κ ⇒ Pre(I, S)

strongest under-approximation

of an invariant

weakest over-approximation

of an invariant

δ ⇒ I ⇒ κ ∨ �Observation #2

Page 33: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

1. “YES”, if the guess satisfies invariant conditions.

Equivalence Query Resolution

(A) ( holds when entering the loop)

(B) ( holds at each iteration)

(C) ( gives after leaving the loop)

δ ⇒ I I

I

II ∧ ¬κ ⇒ �I ∧ κ ⇒ Pre(I, S)

Page 34: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

1. “YES”, if the guess satisfies invariant conditions.

2. Otherwise, we need a counter example to answer “No”.

No

Guess

No, with found a counterexample

Equivalence Query Resolution

Under ApproximationOver Approximation

Case 1

Page 35: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

No, with found a counterexample

Equivalence Query Resolution

1. “YES”, if the guess satisfies invariant conditions.

2. Otherwise, we need a counter example to answer “No”.

Under ApproximationOver Approximation

Guess

No

Case 2

Page 36: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

No, with found a counterexample

Equivalence Query Resolution

1. “YES”, if the guess satisfies invariant conditions.

2. Otherwise, we need a counter example to answer “No”.

Under ApproximationOver Approximation

Guess

No

Case 1 & 2

No

Page 37: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

No, with found a counterexample

Equivalence Query Resolution

1. “YES”, if the guess satisfies invariant conditions.

2. Otherwise, we need a counter example to answer “No”.

Under ApproximationOver Approximation

Guess

No

Case 1 & 2

No

Guess

Case 3

Cannot find a counterexample.

Page 38: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

No, with found a counterexample

Equivalence Query Resolution

1. “YES”, if the guess satisfies invariant conditions.

2. Otherwise, we need a counter example to answer “No”.

Guess

No

Case 1 & 2

No

Guess

Case 3

A random counterexample.

NoNo

Case 3

?

γ∗(µ)

Unknown

Random Answer!Answer Yes / No

Yes

No

Case 1 & 2

1. “NO”, if is unsatisfiable.

2. Use approximations to answer the query.

γ∗(µ)

Membership Query Resolution: MEM (µ)

Page 39: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

1. “NO”, if the model is unsatisfiable.

Membership Query Resolution

i = 0 ∧ i = 1

Page 40: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

1. “NO”, if the model is unsatisfiable.

Membership Query Resolution

Answer Yes

Yes

Under ApproximationOver Approximation

2. Use approximations to answer the query.

model

Case 1

Page 41: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

1. “NO”, if the model is unsatisfiable.

Membership Query Resolution

Yes

Under ApproximationOver Approximation

2. Use approximations to answer the query.

model

No

Case 2

Answer/No

Page 42: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

1. “NO”, if the model is unsatisfiable.

Membership Query Resolution

Yes

Under ApproximationOver Approximation

2. Use approximations to answer the query.

No

Case 1 & 2

Answer/Yes or No

Yes

Page 43: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

1. “NO”, if the model is unsatisfiable.

Membership Query Resolution

Yes

Under ApproximationOver Approximation

2. Use approximations to answer the query.

model

No

Case 1 & 2

Answer/Yes or No

Case 3

? Unknown

Yes

Page 44: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

1. “NO”, if the model is unsatisfiable.

Membership Query Resolution

Yes

2. Use approximations to answer the query.

model

No

Case 1 & 2

Answer/Yes or No

Case 3

? Unknown

Yes

Answer/Yes or No randomly

Case 3

?

γ∗(µ)

Unknown

Random Answer!Answer Yes / No

Yes

No

Case 1 & 2

1. “NO”, if is unsatisfiable.

2. Use approximations to answer the query.

γ∗(µ)

Membership Query Resolution: MEM (µ)

Page 45: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

Effect of Random Answer

Under Approximation

Over Approximation

? Unknown

Membership Query

model

Page 46: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

Effect of Random Answer

Under Approximation

Over Approximation

Invariants

? Unknown

Both of the random answers can lead to an invariant.

Membership Query

model

I1

I2

Page 47: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

Effect of Random Answer

Under Approximation

Over Approximation

Invariants

Yes

Membership Query

model

I1

“Yes” leads to I1

Page 48: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

Effect of Random Answer

Under Approximation

Over Approximation

Invariants

No

Membership Query

model

“No” leads to

I2

I2

Page 49: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

Random Algorithm

• Random Membership and Equivalence query resolution causes conflict!

• Then we simply restart the whole algorithm Case 3

?

γ∗(µ)

Unknown

Random Answer!Answer Yes / No

Yes

No

Case 1 & 2

1. “NO”, if is unsatisfiable.

2. Use approximations to answer the query.

γ∗(µ)

Membership Query Resolution: MEM (µ)

Page 50: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

Random Algorithm

• Random Membership and Equivalence query resolution causes conflict!

• Then we simply restart the whole algorithm Case 3

?

γ∗(µ)

Unknown

Random Answer!Answer Yes / No

Yes

No

Case 1 & 2

1. “NO”, if is unsatisfiable.

2. Use approximations to answer the query.

γ∗(µ)

Membership Query Resolution: MEM (µ)

Memoization could not save the time :(Because the search space is huge

Page 51: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

Random Algorithm

• Random Membership and Equivalence query resolution causes conflict!

• Then we simply restart the whole algorithm Case 3

?

γ∗(µ)

Unknown

Random Answer!Answer Yes / No

Yes

No

Case 1 & 2

1. “NO”, if is unsatisfiable.

2. Use approximations to answer the query.

γ∗(µ)

Membership Query Resolution: MEM (µ)

Memoization could not save the time :(Because the search space is huge

22n

where is #atomic propositionsn

Page 52: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

We always verify the conditions before say “Yes”.

SMT solvers are not complete but sound

for quantified formulae.

It’s still Sound

(A) ( holds when entering the loop)

(B) ( holds at each iteration)

(C) ( gives after leaving the loop)

δ ⇒ I I

I

II ∧ ¬κ ⇒ �I ∧ κ ⇒ Pre(I, S)

Why?When resolving equivalence query

1. “YES”, if the guess satisfies invariant conditions.

Page 53: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

Experiment Results

Program Template AP MEM EQ MEM EQ ITER Time

max 7 5,968 1,742 65% 26% 269 5.7s

selection_sort 6 9,630 5,832 100% 4% 1,672 9.6s

devres 7 2,084 1,214 91% 21% 310 0.9s

rm_pkey 8 2,204 919 67% 20% 107 2.5s

tracepoint1 4 246 195 61% 25% 31 0.3s

tracepoint2 7 33,963 13,063 69% 5% 2,088 157.6s

∀k.[ ]

∀k1.∃k2.[ ]

∀k.[ ]

∀k.[ ]

∃k.[ ]

∀k1.∃k2.[ ]

Total� �� � Total� �� �Total RandomAverage of 500 runs

Page 54: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

Experiment Results

Program Template AP MEM EQ MEM EQ ITER Time

max 7 5,968 1,742 65% 26% 269 5.7s

selection_sort 6 9,630 5,832 100% 4% 1,672 9.6s

devres 7 2,084 1,214 91% 21% 310 0.9s

rm_pkey 8 2,204 919 67% 20% 107 2.5s

tracepoint1 4 246 195 61% 25% 31 0.3s

tracepoint2 7 33,963 13,063 69% 5% 2,088 157.6s

∀k.[ ]

∀k1.∃k2.[ ]

∀k.[ ]

∀k.[ ]

∃k.[ ]

∀k1.∃k2.[ ]

Total� �� � Total� �� �Total Random

1. Simple templates are enough

– Membership Query: After a few equivalence queries, a membership query

asks whether�{i ≥ n, m = 0, i = 0, k ≥ n, a[k] ≤ a[m], a[m] ≥ a[i]} is a

part of an invariant. The teacher replies YES since the query is included in

the precondition and therefore should also be included in an invariant.

– Membership Query: The membership query MEM (�{i < n, m = 0, i �=

0, k < n, a[k] > a[m], k < i, a[m] ≥ a[i]}) is not resolvable because the

template is not well-formed (Definition 1) by the given membership query.

In this case, the teacher gives a random answer (YES or NO). Interestingly,

each answer leads to a different invariant for this query. If the answer is YES ,

we find an invariant ∀k.(i < n∧k ≥ i)∨ (a[k] ≤ a[m])∨ (k ≥ n); if the answeris NO , we find another invariant ∀k.(i < n ∧ k ≥ i) ∨ (a[k] ≤ a[m]) ∨ (k ≥n ∧ k ≥ i). This shows how our approach exploits a multitude of invariants

for the annotated loop.

1.2 Organization

We organize this paper as follows. After preliminaries in Section 2, we present

problems and solutions in Section 3. Our abstraction is briefly described in

Section 4. The details of our technique are described in Section 5. We report

experiments in Section 6, discuss related work in Section 7, then conclude in

Section 8.

2 Preliminaries

The abstract syntax of our simple imperative language is given below:

Stmt�= nop | Stmt; Stmt | x := Exp | b := Prop | a[Exp] := Exp |

a[Exp] := nondet | x := nondet | b := nondet |if Prop then Stmt else Stmt | { Pred } while Prop do Stmt { Pred }

Exp�= n | x | a[Exp] | Exp+ Exp | Exp− Exp

Prop�= F | b | ¬Prop | Prop ∧ Prop | Exp < Exp | Exp = Exp

Pred�= Prop | ∀x.Pred | ∃x.Pred | Pred ∧ Pred | ¬Pred

The language has two basic types: Booleans and natural numbers. A term in Expis a natural number; a term in Prop is a quantifier-free formula and of Boolean

type; a term in Pred is a first-order formula. The keyword nondet is used for

unknown values from user’s input or complex structures (e.g, pointer operations,

function calls, etc.). In an annotated loop {δ} while κ do S {�}, κ ∈ Prop is

its guard, and δ, � ∈ Pred are its precondition and postcondition respectively.

Quantifier-free formulae of the forms b, π0 < π1, and π0 = π1 are called atomic

propositions. If A is a set of atomic propositions, then PropA and PredA denote

the set of quantifier-free and first-order formulae generated from A, respectively.

A template t[] ∈ τ is a finite sequence of quantifiers followed by a hole to be

filled with a quantifier-free formula in PropA.

τ�= [] | ∀I.τ | ∃I.τ.

4

Average of 500 runs

Page 55: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

Experiment Results

2. Random algorithm works well

Program Template AP MEM EQ MEM EQ ITER Time

max 7 5,968 1,742 65% 26% 269 5.7s

selection_sort 6 9,630 5,832 100% 4% 1,672 9.6s

devres 7 2,084 1,214 91% 21% 310 0.9s

rm_pkey 8 2,204 919 67% 20% 107 2.5s

tracepoint1 4 246 195 61% 25% 31 0.3s

tracepoint2 7 33,963 13,063 69% 5% 2,088 157.6s

∀k.[ ]

∀k1.∃k2.[ ]

∀k.[ ]

∀k.[ ]

∃k.[ ]

∀k1.∃k2.[ ]

Total� �� � Total� �� �Total Random

1. Simple templates are enough

– Membership Query: After a few equivalence queries, a membership query

asks whether�{i ≥ n, m = 0, i = 0, k ≥ n, a[k] ≤ a[m], a[m] ≥ a[i]} is a

part of an invariant. The teacher replies YES since the query is included in

the precondition and therefore should also be included in an invariant.

– Membership Query: The membership query MEM (�{i < n, m = 0, i �=

0, k < n, a[k] > a[m], k < i, a[m] ≥ a[i]}) is not resolvable because the

template is not well-formed (Definition 1) by the given membership query.

In this case, the teacher gives a random answer (YES or NO). Interestingly,

each answer leads to a different invariant for this query. If the answer is YES ,

we find an invariant ∀k.(i < n∧k ≥ i)∨ (a[k] ≤ a[m])∨ (k ≥ n); if the answeris NO , we find another invariant ∀k.(i < n ∧ k ≥ i) ∨ (a[k] ≤ a[m]) ∨ (k ≥n ∧ k ≥ i). This shows how our approach exploits a multitude of invariants

for the annotated loop.

1.2 Organization

We organize this paper as follows. After preliminaries in Section 2, we present

problems and solutions in Section 3. Our abstraction is briefly described in

Section 4. The details of our technique are described in Section 5. We report

experiments in Section 6, discuss related work in Section 7, then conclude in

Section 8.

2 Preliminaries

The abstract syntax of our simple imperative language is given below:

Stmt�= nop | Stmt; Stmt | x := Exp | b := Prop | a[Exp] := Exp |

a[Exp] := nondet | x := nondet | b := nondet |if Prop then Stmt else Stmt | { Pred } while Prop do Stmt { Pred }

Exp�= n | x | a[Exp] | Exp+ Exp | Exp− Exp

Prop�= F | b | ¬Prop | Prop ∧ Prop | Exp < Exp | Exp = Exp

Pred�= Prop | ∀x.Pred | ∃x.Pred | Pred ∧ Pred | ¬Pred

The language has two basic types: Booleans and natural numbers. A term in Expis a natural number; a term in Prop is a quantifier-free formula and of Boolean

type; a term in Pred is a first-order formula. The keyword nondet is used for

unknown values from user’s input or complex structures (e.g, pointer operations,

function calls, etc.). In an annotated loop {δ} while κ do S {�}, κ ∈ Prop is

its guard, and δ, � ∈ Pred are its precondition and postcondition respectively.

Quantifier-free formulae of the forms b, π0 < π1, and π0 = π1 are called atomic

propositions. If A is a set of atomic propositions, then PropA and PredA denote

the set of quantifier-free and first-order formulae generated from A, respectively.

A template t[] ∈ τ is a finite sequence of quantifiers followed by a hole to be

filled with a quantifier-free formula in PropA.

τ�= [] | ∀I.τ | ∃I.τ.

4

Average of 500 runs

Page 56: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

Conclusion

• Algorithmic Learning + Decision Procedures + Predicate Abstraction + Simple Template=> Quantified Invariant Generation Technique

• Exploits the flexibility in invariants by randomized mechanism.

• Static/Dynamic Analysis can help with tighter approximations on invariants.

• Apply the CDNF algorithm to your own problems.

Page 57: Automatically Inferring Quantified Loop Invariants by ...soonhok/talks/20101201.pdf · 1/12/2010  · selection sort from [20], and (c) devres from Linux library. Observe that our

Conclusion

• Algorithmic Learning + Decision Procedures + Predicate Abstraction + Simple Template=> Quantified Invariant Generation Technique

• Exploits the flexibility in invariants by randomized mechanism.

• Static/Dynamic Analysis can help with tighter approximations on invariants.

• Apply the CDNF algorithm to your own problems.

Thanks!