

Certification of Computational Results

Greg Bronevetsky

Background

• Technique proposed by
   • Gregory F. Sullivan
   • Dwight S. Wilson
   • Gerald B. Masson

• All from Johns Hopkins CS Department.

Overview

• Trying to do fault detection without the severe overhead of replication.

• Certification Trails are a manual approach in which the programmer provides additional code so that the program can check itself.

• A program generates a certification trail that details its work.

• A checker program can use this trail to verify that the output is correct in asymptotically less time.

• Several examples provided. No automation.

Roadmap

• We will cover some algorithms to which the Certification Trails technique has been applied:
   • Sorting
   • Convex Hull
   • Heap Data Structures

• The addition of Certification Trails and the creation of the Checker is done manually by the programmer in all cases.

Trail for Sorting

• In order to verify the output of a sorting algorithm we must check that:
   • The sorted items are a permutation of the original input items.
   • The sorted items appear in a non-decreasing order in the sorter's output.

• Thus, the trail should contain all the items in their original order, each labeled with its location in the sorted list.

Sorting Checker

• A Sorting Checker must:
   • Use the labels to place all elements into their sorted spots and verify that this results in a non-decreasing order.
   • Verify that no two elements are placed in the same location in the ordered list.

• The Sorter takes O(n²) or O(n log n) time.

• The Checker takes O(n) time.

• Checker is asymptotically faster than Sorter.
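
The sorting trail and its checker can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names `sort_with_trail` and `check_sort` are hypothetical, the trail is taken to be the list of labels (the claimed sorted position of each input item, in input order), and positions are 0-based.

```python
def sort_with_trail(items):
    """Sorter: returns the sorted output plus a trail of labels.

    labels[i] is the position of items[i] in the sorted output.
    """
    order = sorted(range(len(items)), key=lambda i: items[i])
    labels = [0] * len(items)
    for pos, i in enumerate(order):
        labels[i] = pos
    return [items[i] for i in order], labels


def check_sort(original, labels, claimed_output):
    """Checker: one linear pass over the trail, no sorting."""
    n = len(original)
    if len(labels) != n or len(claimed_output) != n:
        return False
    placed = [False] * n
    rebuilt = [None] * n
    for item, pos in zip(original, labels):
        # Each label must be a valid slot that nobody else has claimed.
        if not isinstance(pos, int) or not 0 <= pos < n or placed[pos]:
            return False
        placed[pos] = True
        rebuilt[pos] = item
    # The placed items must be in non-decreasing order.
    for i in range(1, n):
        if rebuilt[i - 1] > rebuilt[i]:
            return False
    # The sorter's output must match what the trail describes.
    return rebuilt == list(claimed_output)


data = [3, 1, 2]
out, trail = sort_with_trail(data)
print(out, trail, check_sort(data, trail, out))
```

Because every slot is filled exactly once from the original items, a passing check implies the output is a permutation of the input, which is the first condition the checker has to establish.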

Convex Hull Problem

Given a set of points on a 2D plane, find a subset of points that forms a convex hull around all the points.

Convex Hull: Step 1

P1 is the point with the least x-coordinate.

[Figure: points P1 through P8 in the plane]

Points sorted in order of increasing slope relative to P1.

Convex Hull: Invariant

[Figure: points P1 through P8 in the plane]

All the points not on the Hull are inside a triangle formed by P1 and two successive points on the Hull.

Convex Hull: Invariant

[Figure: points P1 through P8, with an angle marked ≥ 180º]

We know that P3 is not a Hull point because the clockwise angle between lines P2P3 and P3P4 is ≥ 180º.

Convex Hull: Invariant

[Figure: points P1 through P8, with an angle marked < 180º]

Note that if the clockwise angle between lines P2P3 and P3P4 is < 180º, then P3 is a Hull point.

Convex Hull Algorithm

• Add P1, P2 and P3 to the Hull. (Note: P1, P2 and Pn must be on the Hull.)

• For Pk = P4 to Pn
   • ... trying to add Pk to the Hull ...
   • Let QA and QB be the two points most recently added to the Hull:
   • While the angle formed by QA, QB and Pk ≥ 180º
      • remove QB from the Hull since it is inside the triangle: P1, QA, Pk.
   • Add Pk to the Hull.
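
The loop above can be sketched in Python as follows. This is a minimal illustration under assumptions, not the original implementation: the names `cross` and `convex_hull` are hypothetical, the input is taken to be at least three distinct points with no three collinear, and the "angle ≥ 180º" test is expressed as a non-positive cross product (a right turn or straight line when the points are walked counter-clockwise).

```python
import math


def cross(o, a, b):
    """Cross product of vectors o->a and o->b; > 0 means a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    # P1: least x-coordinate (ties broken by least y).
    p1 = min(points)
    # Remaining points in order of increasing slope relative to P1.
    rest = sorted(
        (p for p in points if p != p1),
        key=lambda p: (math.atan2(p[1] - p1[1], p[0] - p1[0]),
                       (p[0] - p1[0]) ** 2 + (p[1] - p1[1]) ** 2),
    )
    hull = [p1]
    for pk in rest:
        # QA = hull[-2], QB = hull[-1]. The len >= 3 guard keeps P1 and P2,
        # which must be on the Hull, from ever being removed.
        while len(hull) >= 3 and cross(hull[-2], hull[-1], pk) <= 0:
            hull.pop()  # QB lies inside the triangle P1, QA, Pk
        hull.append(pk)
    return hull


print(convex_hull([(4, 4), (3, 1), (0, 0), (0, 4), (4, 0)]))
```

The sort dominates the cost; each point is appended once and removed at most once, so the loop itself is linear.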

Trail for Convex Hull

• Augment Program to
   • Output {q1, q2, ..., qm} = the indexes of the points on the hull.
   • Output a proof of correctness for {x1, x2, ..., xr} = all points not on the Hull in the form of the triangle that contains it.

Point not on Convex Hull    3 Surrounding Points
P3                          P1, P2, P4
P7                          P1, P6, P8
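
One way to produce such a trail is to record a triangle every time the hull-building loop discards a point. The sketch below assumes the points are already sorted (index 0 is P1, the rest in slope order) and are in general position; `hull_with_trail` and its return shape are hypothetical names chosen for illustration, and indexes are 0-based rather than the 1-based labels on the slides.

```python
def cross(o, a, b):
    """Cross product of vectors o->a and o->b; > 0 means a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def hull_with_trail(pts):
    """pts[0] is P1; pts[1:] are sorted by slope relative to P1.

    Returns (hull, proofs): hull lists the indexes of the hull points,
    proofs maps each discarded index to the triangle (P1, QA, Pk)
    that was used to discard it.
    """
    hull = [0]
    proofs = {}
    for k in range(1, len(pts)):
        # The len >= 3 guard keeps P1 and P2 on the hull.
        while len(hull) >= 3 and cross(pts[hull[-2]], pts[hull[-1]], pts[k]) <= 0:
            qb = hull.pop()
            proofs[qb] = (0, hull[-1], k)
        hull.append(k)
    return hull, proofs


pts = [(0, 0), (4, 0), (3, 1), (4, 4), (0, 4)]
print(hull_with_trail(pts))
```

For this input the point at index 2 is discarded with the triangle (0, 1, 3), which has the same shape as the first row of the table above.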

Convex Hull Checker

Checker must check that:

• There is a 1-1 correspondence between input points and {q1, q2, ..., qm} ∪ {x1, x2, ..., xr}.

• All points in the triangle proofs correspond to input points.

• Each point in the triangle proofs actually lies in the given triangle.

• Every triple of supposed Hull points forms a convex angle.

• There is a unique locally maximal point on the hull.
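
The five checks can be sketched as a single linear-time function. Everything here is an assumption made for illustration: the name `check_hull`, the trail layout (a list of hull indexes in counter-clockwise order plus a dict from each non-hull index to three triangle indexes), and the reading of "locally maximal" as "higher than both hull neighbours, comparing (y, x) to break ties".

```python
def cross(o, a, b):
    """Cross product of vectors o->a and o->b; > 0 means a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def in_triangle(p, a, b, c):
    """True if p is inside or on the boundary of triangle a, b, c."""
    d = (cross(a, b, p), cross(b, c, p), cross(c, a, p))
    return not (any(x < 0 for x in d) and any(x > 0 for x in d))


def check_hull(points, hull, proofs):
    n, m = len(points), len(hull)
    # Check 1: every input index appears exactly once in hull or proofs.
    seen = [False] * n
    for i in list(hull) + list(proofs):
        if not (isinstance(i, int) and 0 <= i < n) or seen[i]:
            return False
        seen[i] = True
    if not all(seen) or m < 3:
        return False
    for x, tri in proofs.items():
        # Check 2: triangle vertices are input points.
        if len(tri) != 3 or any(not (isinstance(t, int) and 0 <= t < n) for t in tri):
            return False
        # Check 3: the discarded point really lies in its triangle.
        if not in_triangle(points[x], *(points[t] for t in tri)):
            return False
    # Check 4: every consecutive triple of hull points turns left.
    for i in range(m):
        if cross(points[hull[i - 1]], points[hull[i]], points[hull[(i + 1) % m]]) <= 0:
            return False
    # Check 5: exactly one locally maximal hull vertex.
    def height(i):
        return (points[i][1], points[i][0])
    peaks = sum(
        1 for i in range(m)
        if height(hull[i]) > height(hull[i - 1])
        and height(hull[i]) > height(hull[(i + 1) % m])
    )
    return peaks == 1


pts = [(0, 0), (4, 0), (3, 1), (4, 4), (0, 4)]
print(check_hull(pts, [0, 1, 3, 4], {2: (0, 1, 3)}))
```

Each check is one pass over the hull or the proofs, so no sorting is needed in the checker.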

Asymptotic Runtimes

• Original Convex Hull Algorithm takes O(n log n) time to sort, and the Hull construction loop takes only O(n) time. O(n log n) time total.

• Convex Hull Checker runs through the set of points once for each check. O(n) time total.

• Checker is asymptotically faster than Original.

Certification Trails for Data Structures

• Let's have a data structure for storing (key, val) pairs, ordered lexicographically by value and then by key: (key, val) < (key', val') iff val < val' or (val = val' and key < key')

• Operations:
   • member(key): returns whether key is mapped to some val.
   • insert(key, val): inserts a pair (key, val) into the data structure.
   • delete(key): deletes the pair that contains key.

Data Structure Specs

• Data Structure Operations:
   • changekey(key, newval): executed when the pair (key, oldval) exists in the data structure. Removes this pair and inserts the pair (key, newval).
   • deletemin(): deletes the smallest pair (according to the ordering). Returns “empty” if the data structure contains no pairs.
   • predecessor(key): returns the key of the pair that immediately precedes key's pair or “smallest” if there is no such pair.
   • empty(): returns whether the data structure is empty.

Data Structure Implementation

• Such a Data Structure can be implemented via an AVL tree, a red-black tree or a B-tree.

• Most operations will take O(log n) time.

• We can augment implementations to generate a certification trail:
   • insert(key, val): output the key of the predecessor of the newly inserted pair (key, val). If there is no predecessor, output “smallest”.
   • changekey(key, newval): output the key of the predecessor of the new pair (key, newval). If there is no predecessor, output “smallest”.
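
The trail-generating side can be sketched as below. To stay short, the sketch swaps in a sorted Python list with the standard `bisect` module in place of a balanced tree, so insertion here is O(n) rather than O(log n); the class name `TrailingStore` and its attribute names are hypothetical, and keys are assumed not to be inserted twice.

```python
import bisect

SMALLEST = "smallest"


class TrailingStore:
    """Ordered (key, val) store that appends to a certification trail."""

    def __init__(self):
        self.pairs = []    # sorted list of (val, key): value first, then key
        self.val_of = {}   # key -> val
        self.trail = []    # one entry per insert or changekey

    def _predecessor_key(self, key):
        i = bisect.bisect_left(self.pairs, (self.val_of[key], key))
        return self.pairs[i - 1][1] if i > 0 else SMALLEST

    def insert(self, key, val):
        bisect.insort(self.pairs, (val, key))
        self.val_of[key] = val
        # Trail entry: key of the pair just before the new pair.
        self.trail.append(self._predecessor_key(key))

    def delete(self, key):
        val = self.val_of.pop(key)
        self.pairs.pop(bisect.bisect_left(self.pairs, (val, key)))

    def changekey(self, key, newval):
        self.delete(key)
        self.insert(key, newval)  # records the predecessor of the new pair


s = TrailingStore()
s.insert(3, 50)
s.insert(5, 62)
s.insert(1, 55)
s.changekey(3, 70)
print(s.trail)
```

Only `insert` and `changekey` write to the trail; the other operations need no hints because a checker can answer them from its own state.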

Data Structure Checker

• A Checker for any program using the above data structure can use the certification trail to implement a much faster data structure.

• All operations can be done in O(1) time.

• Resulting program will be faster than original program. Maybe asymptotically faster.

Optimized Data Structure

• A doubly linked list of (key, val) pairs, sorted according to the pair ordering relation.

• An array indexed by keys, containing pointers to (key, val) pairs corresponding to the indexes.

• The first pair (with key=0) contains value=sm, which is defined to be smaller than any other possible value.

Optimized Data Structure

• Optimized data structure operations:
   • insert(key, val):
      • Read from trail prec_key = the key of the pair preceding the new (key, val) pair.
         • Check that it is a valid index.
      • Look at the pair pointed to by array[prec_key].
         • Verify that it is ≠ null.
      • Place the (key, val) pair at index key, following the (prec_key, prec_val) pair.
         • Check that before the insert() array[key] was = null.
         • Ensure that (key, val) is greater than its predecessor and less than its successor.

Optimized Insert Example

[Figure: result of the call insert(5, 62)]

Optimized Data Structure

• Optimized data structure operations:
   • delete(key): Remove the pair pointed to by array[key].
      • Ensure that array[key] ≠ null.
   • changekey(key, newval): Call delete(key), followed by insert(key, newval). These calls will check all necessary conditions.
   • deletemin(): Look at the pair that follows the pair (0,sm) (pointed to by array[0]).
      • If no such pair, return “empty”.
      • Else, if there exists pair (key, val), then remove it and set array[key] to null.
   • empty(): Return whether there is no pair following the pair (0,sm).

Optimized Data Structure

• Optimized data structure operations:
   • member(key): return whether array[key] ≠ null.
   • predecessor(key):
      • Look at the pair pointed to by array[key].
      • Follow its backward link to its predecessor pair.
      • If the predecessor pair is (0,sm) then return “smallest”.
      • Else, return the key field of that pair.

• Note that all the operations can be done in O(1) time.
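
A compact sketch of the optimized structure is shown below. It is an illustration under assumptions rather than the original code: `CheckedStore`, `CheckError` and `Node` are hypothetical names, real keys are assumed to be integers from 1 to `max_key` (key 0 is reserved for the (0, sm) pair), a “smallest” trail entry is mapped to index 0, `float("-inf")` stands in for sm, and `deletemin()` returning the removed pair is a choice made here.

```python
SM = float("-inf")  # stand-in for "sm", smaller than any real value


class CheckError(Exception):
    """Raised when the trail or the program state fails a check."""


class Node:
    def __init__(self, key, val):
        self.key, self.val, self.prev, self.next = key, val, None, None


class CheckedStore:
    def __init__(self, max_key, trail):
        self.array = [None] * (max_key + 1)
        self.array[0] = Node(0, SM)  # sentinel pair (0, sm)
        self.trail = iter(trail)

    def _require(self, ok, msg):
        if not ok:
            raise CheckError(msg)

    def insert(self, key, val):
        entry = next(self.trail, None)
        prec_key = 0 if entry == "smallest" else entry
        size = len(self.array)
        self._require(isinstance(prec_key, int) and 0 <= prec_key < size, "bad trail index")
        self._require(isinstance(key, int) and 0 < key < size, "bad key")
        prec = self.array[prec_key]
        self._require(prec is not None, "predecessor not present")
        self._require(self.array[key] is None, "key already present")
        succ = prec.next
        # The new pair must fit strictly between its two neighbours.
        self._require((prec.val, prec.key) < (val, key), "not above predecessor")
        self._require(succ is None or (val, key) < (succ.val, succ.key), "not below successor")
        node = Node(key, val)
        node.prev, node.next = prec, succ
        prec.next = node
        if succ is not None:
            succ.prev = node
        self.array[key] = node

    def delete(self, key):
        self._require(isinstance(key, int) and 0 < key < len(self.array), "bad key")
        node = self.array[key]
        self._require(node is not None, "key not present")
        node.prev.next = node.next
        if node.next is not None:
            node.next.prev = node.prev
        self.array[key] = None

    def changekey(self, key, newval):
        self.delete(key)
        self.insert(key, newval)

    def deletemin(self):
        first = self.array[0].next
        if first is None:
            return "empty"
        self.delete(first.key)
        return (first.key, first.val)

    def member(self, key):
        return self.array[key] is not None

    def predecessor(self, key):
        prev = self.array[key].prev
        return "smallest" if prev.key == 0 else prev.key

    def empty(self):
        return self.array[0].next is None


c = CheckedStore(8, ["smallest", 3, 3, 5])
c.insert(3, 50)
c.insert(5, 62)
c.insert(1, 55)
c.changekey(3, 70)
print(c.predecessor(3), c.deletemin(), c.empty())
```

Every operation touches a constant number of nodes, and a wrong trail entry surfaces as a `CheckError` at the ordering checks instead of silently corrupting the list.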

Shortest Path

• A Shortest Path algorithm was implemented using the above data structure.

• The original program used the original data structure that produced a certification trail.

• The checker version was identical to the original except that its data structure was the optimized version that used the trail.

• Original runtime = O(m log n)

• Checker runtime = O(m)

• (m = number of edges, n = number of nodes)

Performance: Sort

• Basic Algorithm – Sorting algorithm with no certification trails.

• 1st Execution – Sorter that produces certification trail.

• 2nd Execution – Checking algorithm that uses the trail.

• Speedup – factor of improvement of 2nd Execution vs Basic.

• % Savings – time saved by running the 1st and 2nd Executions instead of running Basic twice.

Size       Basic        1st Execution        2nd Execution    Speedup    % Savings
           Algorithm    (Generates Trail)    (Uses Trail)
10000      0.28         0.30                 0.04             7.00       39.29%
50000      1.80         1.90                 0.19             9.47       41.94%
100000     3.96         4.08                 0.41             9.66       43.31%
500000     23.95        24.69                2.14             11.19      43.99%
1000000    50.23        51.57                4.38             11.47      44.31%
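
The Speedup and % Savings columns appear to follow from the three timing columns. The formulas below are inferred from the table values, since the slides do not state them explicitly, and the function names are hypothetical.

```python
def speedup(basic, second):
    """How many times faster the checker (2nd Execution) is than Basic."""
    return basic / second


def pct_savings(basic, first, second):
    """Percent of time saved by 1st + 2nd Execution versus running Basic twice."""
    return 100.0 * (1.0 - (first + second) / (2.0 * basic))


# First row of the sort table: Basic 0.28, 1st Execution 0.30, 2nd Execution 0.04.
print(round(speedup(0.28, 0.04), 2))            # 7.0
print(round(pct_savings(0.28, 0.30, 0.04), 2))  # 39.29
```

Both printed values match the first row, and the same formulas reproduce the last row (11.47 and 44.31%).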

Performance: Convex Hull

Size       Basic        1st Execution        2nd Execution    Speedup    % Savings
           Algorithm    (Generates Trail)    (Uses Trail)
5000       0.61         0.62                 0.07             8.73       43.62%
10000      1.33         1.34                 0.14             9.56       44.54%
25000      3.68         3.68                 0.36             10.22      45.12%
50000      7.68         7.74                 0.71             10.75      44.94%
100000     16.23        16.30                1.43             11.35      45.39%
200000     33.93        34.37                2.84             11.94      45.16%

Performance: Shortest Path

Size          Basic        1st Execution        2nd Execution    Speedup    % Savings
(n,m)         Algorithm    (Generates Trail)    (Uses Trail)
100,1000      0.04         0.05                 0.02             2.00       12.50%
250,2500      0.15         0.16                 0.06             2.50       26.67%
500,5000      0.31         0.33                 0.11             2.82       29.03%
100,10000     0.70         0.76                 0.23             3.04       29.29%
2000,20000    1.58         1.67                 0.45             3.51       32.91%
2500,25000    2.06         2.15                 0.55             3.75       34.47%

Summary of Experiments

• The overhead of generating a certification trail is about 2%.

• The checker run is much faster than the original. The checker could therefore be run on much slower hardware or be written in a formally verified language.

Application to Byzantine Failures

• Current technique is completely manual. No known way to automatically convert a program to generate a trail.

• We may develop libraries that use the Certification Trails technique, allowing us to catch errors in a large fraction of a program.

• Door open to Failure Recovery: when an error is detected the checker goes back to using original code to redo the work.