
Page 1: 07-05-Hart EXACT.ppt

Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000.

EXACT

The EXperimental Algorithmics Computational Toolkit

William E. Hart, Jonathan W. Berry,
Robert Heaphy, Cynthia A. Phillips

Discrete Algorithms and Math, 1415
Sandia National Laboratories

Page 2: 07-05-Hart EXACT.ppt

Slide 2

Overview

GOAL: Provide a software framework for defining and analyzing computational experiments

• Managing computational experiments
– Systematic control is needed for large-scale experimentation
– Design of experiments to limit the cost of experimentation
– Archiving experimental results in a standard manner
– Integration of statistical analysis capabilities

• Applications
– Experimental evaluation of heuristics
– Comparisons between algorithms
– Robust (user) parameter settings (over many problem domains)

Page 3: 07-05-Hart EXACT.ppt

Slide 3

Overview (Motivation Continued)

• Software testing
– Automation of tests
– Flexible notion of what a “test” means
– Integration with diagnostic tools (e.g. valgrind, lcov)
– Distributed test management and test summary

Observation: testing of large, complex software begins to look like a computational experiment.

Example: integer programming solver
– Lots of algorithmic parameters
– Lots of hard test problems
– Costly tests

Page 4: 07-05-Hart EXACT.ppt

Slide 4

Related Work

• ExpLab
– Interactive scripts for setting up and performing computational experiments, including tools to archive data

• Research Assistant
– Strong focus on archiving of data/environment/software to ensure reproducibility

• Condor
– Distributed execution framework

• Software Testing Frameworks
– There are many of these…
– Focus: execution of codes and evaluation of the final “result”

Page 5: 07-05-Hart EXACT.ppt

Slide 5

EXACT’s Niche

Strengths:
– Integration of DOE tools
– Experiment automation
– Self-contained, portable tool
– Very generic application interface
– Support for generic “Analysis” modules
– Simple specification of parallel tests

Domain of Application:
– Experiments to test theoretical results
– “Horse race” experiments
– Benchmarking
– Software testing

Page 6: 07-05-Hart EXACT.ppt

Slide 6

EXACT Data/Execution (1)

exact hashfn.study.xml

(Diagram: an external DOE code generates the factors_file.)

The EXACT script can generate an experimental design with an external code. By default, EXACT uses a full factorial design. This process generates the set of experimental treatments that will be executed in this experiment.
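A full factorial design simply enumerates every combination of factor levels. A minimal Python sketch of the idea (illustrative only, not EXACT's implementation; the "collisions" levels here are assumptions):

import itertools

# Factor names/levels follow the hashfn example from this talk;
# the "chaining" level is an assumption for illustration.
factors = {
    "hashfn": ["Jenkins", "FNV"],
    "collisions": ["chaining", "linear-probing"],
}

names = list(factors)
treatments = [dict(zip(names, combo))
              for combo in itertools.product(*(factors[n] for n in names))]

for i, t in enumerate(treatments, start=1):
    print(f"treatment {i}: {t}")   # each treatment is one experimental run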

Page 7: 07-05-Hart EXACT.ppt

Slide 7

EXACT Data/Execution (2)

exact hashfn.study.xml

hashfn_script test.in test.out test.log

The EXACT script launches a user-defined script to execute each experimental treatment. Measurements are extracted from the *.log file to generate a *.out file.

Repetitions with random number seeds can also be specified.
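The user script's contract is simple: read the treatment parameters from test.in, run the code while capturing raw output in test.log, and distill measurements into test.out. A hypothetical Python stand-in for such a script (the solver name "./hashfn" and the "LoadFactor = value" log format are assumptions, not EXACT's specification):

import re
import subprocess
import sys

# Hypothetical version of hashfn_script, invoked as:
#   hashfn_script test.in test.out test.log
infile, outfile, logfile = sys.argv[1], sys.argv[2], sys.argv[3]

# Run the code under test, capturing everything it prints in the *.log file.
with open(logfile, "w") as log:
    subprocess.run(["./hashfn", infile], stdout=log, stderr=subprocess.STDOUT)

# Scrape measurements from the log into the *.out measurement file.
with open(logfile) as log, open(outfile, "w") as out:
    for line in log:
        m = re.match(r"LoadFactor\s*=\s*(\S+)", line)
        if m:
            out.write(f'"LoadFactor" numeric/double {m.group(1)}\n')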

Page 8: 07-05-Hart EXACT.ppt

Slide 8

EXACT Data/Execution (3)

exact hashfn.study.xml

(Diagram: the hashfn study contains experiments exp1 and exp2.)

Output files are combined to generate a *.results.xml file of experimental measurements. One or more results files can be analyzed to generate a *.analysis.xml file.
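Conceptually this step is a simple aggregation: each treatment's measurement file becomes one element of a results document. A toy Python sketch (the element and attribute names are invented for illustration; EXACT's actual *.results.xml schema may differ):

import glob
import shlex
import xml.etree.ElementTree as ET

# Invented schema: one <experiment> element per *.out file found.
root = ET.Element("results", study="hashfn")
for path in sorted(glob.glob("exp*/test.out")):
    exp = ET.SubElement(root, "experiment", source=path)
    with open(path) as f:
        for line in f:
            if not line.strip():
                continue
            # shlex handles the quoted measurement names and values.
            name, mtype, *rest = shlex.split(line)
            ET.SubElement(exp, "measurement",
                          name=name, type=mtype).text = " ".join(rest)

ET.ElementTree(root).write("hashfn.results.xml")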

Page 9: 07-05-Hart EXACT.ppt

Slide 9

XML Description

<experimental-study name="example1">
  <tags>
    <tag> example </tag>
  </tags>

  <experiment name="ht">
    <factors>
      <factor name="hashfn">
        <level> Jenkins </level>
        <level> FNV </level>
      </factor>
    </factors>

Page 10: 07-05-Hart EXACT.ppt

Slide 10

XML example continued

    <controls>
      <executable> hash_script </executable>
    </controls>
  </experiment>

  <analysis name="LoadFactorUB" type="validation">
    <data experiment="ht"/>
    <options>
      _measurement = LoadFactor
      _value = 0.75
    </options>
  </analysis>

</experimental-study>
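The validation analysis above asserts an upper bound on a measurement: every LoadFactor value observed in experiment ht must stay at or below 0.75. In spirit, the check reduces to something like this illustrative Python sketch (not EXACT's implementation):

# Illustrative check corresponding to the LoadFactorUB analysis above.
def validate(results, measurement="LoadFactor", value=0.75):
    # results: one dict of measurement values per experimental trial
    failures = [r for r in results if r.get(measurement, 0.0) > value]
    return (len(failures) == 0), failures

ok, bad = validate([{"LoadFactor": 0.71}, {"LoadFactor": 0.74}])
print("PASS" if ok else f"FAIL: {bad}")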

Page 11: 07-05-Hart EXACT.ppt

Slide 11

EXACT Input File

_exact_debug 0
_experiment_name example1.ht
_test_name 3
_num_trials 1

Seed $PSEUDORANDOM_SEED

_factor_1_name hashfn
_factor_1_level level_1
_factor_1_value Jenkins

_factor_2_name collisions
_factor_2_level level_2
_factor_2_value linear-probing
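A user script sees each treatment as this flat list of key/value pairs. A small hypothetical parser (not part of EXACT) that groups the _factor_<i>_* keys into per-factor settings:

import re

def parse_exact_input(path):
    # Split each nonblank line into "key value"; group _factor_<i>_* keys
    # into one dictionary per factor, everything else into global options.
    options, factors = {}, {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            key, _, value = line.partition(" ")
            m = re.match(r"_factor_(\d+)_(name|level|value)$", key)
            if m:
                factors.setdefault(int(m.group(1)), {})[m.group(2)] = value
            else:
                options[key] = value
    return options, factors

opts, facs = parse_exact_input("test.in")
print(opts.get("_experiment_name"), facs)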

Page 12: 07-05-Hart EXACT.ppt

Slide 12

EXACT Measurement File

“Number of Evaluations” numeric/integer 110

“Best Value” numeric/double 0.0001231

“Termination Condition” text/string “Max Evals Limit”

exit_status numeric/integer 0

Page 13: 07-05-Hart EXACT.ppt

Slide 13

XML Specification with Experimental Options

<factors>
  <factor name="search">
    <level> </level>
    <level>initialDive=true</level>
    <level>initialDive=true integralityDive=true</level>
  </factor>
  <factor name="problem">
    <level>_data=bm23 _optimum=34 _opttol=1e-8</level>
    <level>_data=p0033 _optimum=3089 _opttol=1e-6</level>
  </factor>
</factors>

Page 14: 07-05-Hart EXACT.ppt

Slide 14

The FAST Project

Overview:
– FAST supports a generic mechanism for nightly testing and data gathering
– Supports general-purpose clients and servers for nightly testing and code evaluations
– Uses CVS commits to work around restrictive firewalls

Impact:
– General framework for coordinating nightly builds
– Supports “code checks” – analyses that assist in SW management
  • Bugzilla summaries, commit activity, copyright documentation, analysis of subversion externals

Page 15: 07-05-Hart EXACT.ppt

Slide 15

EXACT/FAST Impact

• Software testing
– Being used to manage computational tests for several code projects: DAKOTA, Acro, SPOT, Zoltan, …

• Interactive experimentation
– Being used for computational experiments in ongoing research

• Bug diagnosis
– Has found “bugs” not reported by previous testing techniques in Acro and Zoltan
– The nightly archive has been useful for archaeological debugging

Page 16: 07-05-Hart EXACT.ppt

Slide 16

EXAMPLE: Acro Software Quality Management

Support for SQA activities is critical to successful software development.

Software stability is necessary for application impact:
– Acro solvers are integrated into frameworks like DAKOTA
– The PICO solver application is being deployed to the EPA

FAST/EXACT:
– Integration of distributed testing results
– Configuration and build statistics
– General-purpose computer experiments
  • Computational experimental design
  • Validation, benchmarking, performance comparisons, etc.

Page 17: 07-05-Hart EXACT.ppt

Slide 17

Acro SQA

Nightly testing:
– Application interface tests
  • DAKOTA, AutoDock, PDock
– Per-project build tests
  • Different configuration setups (e.g. with/without MPI)
– Solver validation tests
– Portability tests – builds on 10+ different systems
– Target platforms
  • Linux, Solaris, Darwin, IRIX, OSF, AIX, Cygwin, RedStorm, Windows

Nightly email summaries with links to a detailed web-page summary.

Nightly summary of code stats: commits, documentation info, copyright blurbs, etc.

Page 18: 07-05-Hart EXACT.ppt

Slide 18

Current/Future Directions

• Experimental Design
– More DOE tools and DOE analysis

• Experimental Control
– Randomization of experiments, blocking, etc.

• Statistical analysis tools
– Interface with R

• GUI Interface
– Set up the XML input and experimental script automatically

• Documentation (esp. an EXACT tutorial)

Page 19: 07-05-Hart EXACT.ppt

Slide 19

More Information (and downloads)

Released under the GNU Lesser General Public License:

http://software.sandia.gov/Acro/EXACT

http://software.sandia.gov/Acro/FAST

Please send questions/comments to me:

Bill Hart
[email protected]

Page 20: 07-05-Hart EXACT.ppt

Slide 20

Thank You!

Page 21: 07-05-Hart EXACT.ppt

Slide 21

EXACT Capabilities

Types of Experimental Designs:

– Full factorial
– Orthogonal and nearly orthogonal arrays (Dr. Xu, UCLA)

Types of Analyses:

– Validation of experimental measurements
– Baseline comparisons between experiments
– Comparison of relative performance
– Graphical code coverage summary (using lcov)
– Statistical analysis tools (from R) are being integrated