Using R for the design and analysis of computer experiments with the Nimrod toolkit 1
Using R for the design and analysis of computer experiments with the Nimrod toolkit
Neil Diamond¹, David Abramson², Tom Peachey²
1. Department of Econometrics and Business Statistics
2. Caulfield School of Information Technology
Computer Experiments
The design and analysis of computer experiments to explore the behavior of complex systems is becoming increasingly important in science and engineering.
At least two books on the topic:
The Design and Analysis of Computer Experiments. T. J. Santner, B. J. Williams, W. I. Notz. (2003), Springer: New York.
Design and Modeling for Computer Experiments. K-T. Fang, R. Li, A. Sudjianto. (2006), Chapman & Hall/CRC: London.
Some R packages; more on that later.
Nimrod
Developed by computer scientists at Monash University’s eScience and Grid Engineering Laboratory.
Automates the formulation, running, and collation of the individual experiments.
Includes a distributed scheduling component that can manage the scheduling of individual jobs.
Nimrod Set of Tools
Nimrod contains tools to
perform a complete parameter sweep across all possible combinations (Nimrod/G),
search using non-linear optimization algorithms (Nimrod/O),
or use fractional factorial design techniques (Nimrod/E).
These can be run stand-alone or accessed via the Nimrod portal.
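To make the difference between a complete sweep and a fractional factorial concrete, here is a minimal base R sketch. It does not use Nimrod's own plan-file syntax; the parameter names and levels are hypothetical, chosen only for illustration.

```r
# Hypothetical parameter levels for a simulation with three inputs.
levels <- list(
  temperature = c(280, 300, 320),
  pressure    = c(1, 2, 5, 10),
  catalyst    = c("A", "B")
)

# Complete parameter sweep: every combination of every level.
full_sweep <- expand.grid(levels, stringsAsFactors = FALSE)

# The run count is the product of the level counts, so it grows
# multiplicatively with each added parameter.
stopifnot(nrow(full_sweep) == prod(lengths(levels)))

# Two-level half fraction 2^(3-1): vary A and B fully and set C = A * B,
# which halves the number of runs compared with the full 2^3 design.
half <- expand.grid(A = c(-1, 1), B = c(-1, 1))
half$C <- half$A * half$B
print(half)
```

The half fraction confounds C with the A-by-B interaction, which is the usual price of running fewer experiments.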
Nimrod Applications
Nimrod has been used in an extensive range of applications:
Air Pollution Studies
Laser Physics
Ecology
Quantum Chemistry
CAD Digital Simulation
Antenna Design
Cardiac Modelling
Workflow Engines
A number of workflow engines provide scientists with an environment in which they can manage data, the workflows of the various analytical steps in their investigation, and summaries of findings.
Although existing workflow systems can specify arbitrary parallel programs, they are typically not effective with large and variable parallelism.
Similarly, Nimrod was not designed to execute arbitrary workflows.
Thus, it is difficult to run sweeps over workflows, or workflows containing sweeps.
Nimrod/K
To overcome these problems, a new tool (Nimrod/K) is being developed, based on the Kepler workflow engine (Kepler Core, 2009).
It leverages a number of the techniques developed in the earlier Nimrod tools for distributing tasks to the Grid.
Kepler allows the user to specify R expressions and access R objects as part of the scientific workflow.
Example Workflow
Statistical Approach to Computer Experiments
Unlike physical experiments, repeated runs of a computer experiment give the same results.
Model the output as the realisation of a stochastic process with a correlation structure that depends on the distance to other points in the experiment.
Allows estimates of untried experiments.
Gives an estimate of the uncertainty.
Computer Experiments: Designs
Simplest method: Latin hypercubes.
Other more sophisticated methods include orthogonal arrays and scrambled nets.
Various space-filling designs.
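As an illustration of the simplest design, the following base R sketch draws a Latin hypercube sample on the unit cube. The function name `latin_hypercube` is made up for this example; the `lhs` package offers `randomLHS()` as a packaged alternative.

```r
# Minimal Latin hypercube sample on [0, 1]^d, using base R only.
latin_hypercube <- function(n, d) {
  design <- matrix(NA_real_, nrow = n, ncol = d)
  for (j in seq_len(d)) {
    # Randomly order the n strata, then place one point uniformly
    # inside each stratum ((k - 1) / n, k / n).
    design[, j] <- (sample(n) - runif(n)) / n
  }
  design
}

set.seed(1)
X <- latin_hypercube(n = 10, d = 3)

# Defining property: each column has exactly one point per stratum.
strata <- apply(X, 2, function(col) sort(ceiling(col * 10)))
stopifnot(all(strata == 1:10))
```

Each input is therefore covered evenly across its range with only n runs, whatever the number of inputs.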
Computer Experiments: Model
Response = Linear Model + Departure
$$y(x) = \beta + z(x)$$
$$E(z(x)) = 0$$
$$\mathrm{Cov}(z(t), z(u)) = \sigma_z^2 \prod_{j=1}^{d} R_j(t_j, u_j)$$
$$R_j(t_j, u_j) = \exp\left[-\theta_j (t_j - u_j)^{p_j}\right]$$
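The product correlation above can be sketched in base R as follows. The function names, the example design, and the parameter values are assumptions for illustration. The code takes the absolute difference between inputs, which agrees with the slide's form when the exponent is 2 and keeps the expression defined for other exponents.

```r
# Product correlation between two input vectors t and u.
# theta (> 0) and p hold one value per input dimension.
corr <- function(t, u, theta, p) {
  # A product of exponentials equals the exponential of the sum.
  exp(-sum(theta * abs(t - u)^p))
}

# Correlation matrix R_D for the rows of a design matrix X.
corr_matrix <- function(X, theta, p) {
  n <- nrow(X)
  R <- diag(n)                      # each point has correlation 1 with itself
  for (i in seq_len(n - 1)) {
    for (k in (i + 1):n) {
      R[i, k] <- R[k, i] <- corr(X[i, ], X[k, ], theta, p)
    }
  }
  R
}

# Illustrative three-point design in two dimensions.
X <- rbind(c(0.1, 0.2), c(0.4, 0.9), c(0.8, 0.5))
R_D <- corr_matrix(X, theta = c(2, 5), p = c(2, 2))
print(round(R_D, 4))
```

Larger values of theta make the correlation fall off faster with distance in that input, so the output is treated as more sensitive to it.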
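MLE of $\theta$, $p$, $\beta$, and $\sigma_z^2$
Reduces to numerically optimising
$$-\frac{1}{2}\left(n \ln \hat{\sigma}_z^2 + \ln \det R_D\right)$$
over $\theta$ and $p$, where
$R_D$ = matrix of correlations for the design points
$$\hat{\beta} = (1^T R_D^{-1} 1)^{-1} 1^T R_D^{-1} y$$
$$\hat{\sigma}_z^2 = \frac{1}{n} (y - 1\hat{\beta})^T R_D^{-1} (y - 1\hat{\beta})$$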
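A minimal base R sketch of this optimisation is given below, under several simplifying assumptions: the exponents are fixed at 2, the search is over log(theta) with `optim()`, a small diagonal jitter is added for numerical stability, and the test function stands in for a real simulator. None of these choices come from the slides.

```r
# Gaussian (p_j = 2) correlation matrix for the rows of X.
gauss_corr_matrix <- function(X, theta) {
  R <- matrix(1, nrow(X), nrow(X))
  for (j in seq_len(ncol(X))) {
    R <- R * exp(-theta[j] * outer(X[, j], X[, j], "-")^2)
  }
  R
}

# Closed-form estimates of beta and sigma^2 for a given theta, plus the
# concentrated log-likelihood -(n * log(sigma2) + log(det(R))) / 2.
profile_fit <- function(theta, X, y) {
  n <- length(y)
  R <- gauss_corr_matrix(X, theta) + diag(1e-8, n)  # jitter: an assumption
  one <- rep(1, n)
  beta_hat <- sum(solve(R, y)) / sum(solve(R, one))
  resid <- y - beta_hat
  sigma2_hat <- sum(resid * solve(R, resid)) / n
  logdet <- as.numeric(determinant(R, logarithm = TRUE)$modulus)
  list(beta = beta_hat, sigma2 = sigma2_hat,
       loglik = -0.5 * (n * log(sigma2_hat) + logdet))
}

set.seed(2)
X <- matrix(runif(20), ncol = 2)        # 10 illustrative design points
y <- sin(2 * pi * X[, 1]) + X[, 2]^2    # stand-in for simulator output

# Maximise the log-likelihood by minimising its negative over log(theta).
opt <- optim(par = c(0, 0),
             fn = function(lt) -profile_fit(exp(lt), X, y)$loglik,
             method = "L-BFGS-B", lower = -5, upper = 5)
theta_hat <- exp(opt$par)
fit <- profile_fit(theta_hat, X, y)
print(theta_hat)
```

Because beta and sigma^2 have closed forms given the correlation parameters, the numerical search only has to cover theta (and, in the general case, p).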
Best Linear Unbiased Predictor for an untried x
$$\hat{y}(x) = \hat{\beta} + r^T(x) R_D^{-1} (y - 1\hat{\beta})$$
where
$$r(x) = [R(x_1, x), R(x_2, x), \ldots, R(x_n, x)]^T$$
Design points: $[x_1, x_2, \ldots, x_n]$. Untried input: $x$.
Interpolates the data points.
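The predictor and its interpolation property can be checked with a short base R sketch. The function names, the one-dimensional design, the stand-in simulator, and the fixed value of theta are all assumptions for illustration; the exponent is taken as 2.

```r
# Correlations between the rows of A and the rows of B (p_j = 2 assumed).
gauss_corr <- function(A, B, theta) {
  R <- matrix(1, nrow(A), nrow(B))
  for (j in seq_len(ncol(A))) {
    R <- R * exp(-theta[j] * outer(A[, j], B[, j], "-")^2)
  }
  R
}

# Best linear unbiased predictor at the rows of x_new.
blup <- function(x_new, X, y, theta) {
  one <- rep(1, nrow(X))
  R_D <- gauss_corr(X, X, theta)
  beta_hat <- sum(solve(R_D, y)) / sum(solve(R_D, one))
  r <- gauss_corr(X, x_new, theta)     # column i holds r(x) for new point i
  as.vector(beta_hat + t(r) %*% solve(R_D, y - beta_hat))
}

X <- cbind(c(0, 0.25, 0.5, 0.75, 1))   # one-dimensional design
y <- sin(2 * pi * X[, 1])              # stand-in for simulator output
theta <- 10                            # assumed here, not estimated

# Predicting at the design points reproduces the observed outputs.
stopifnot(isTRUE(all.equal(blup(X, X, y, theta), y)))

# Predictions at two untried inputs.
print(blup(cbind(c(0.1, 0.6)), X, y, theta))
```

At a design point, r(x) is a column of R_D, so the correction term returns exactly the observed residual and the prediction equals the data.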
Implementations in R
BACCO: Emulator, Approximator, Calibrator
mlegp: an R package for Gaussian process modeling and sensitivity analysis
Certainly others . . .
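A hedged sketch of how a fit with mlegp might look is shown below. It assumes the `lhs` and `mlegp` packages are installed and is skipped otherwise; the test function is a hypothetical stand-in for a simulator and is not taken from the talk.

```r
pred <- NULL  # stays NULL if the packages are not available

if (requireNamespace("lhs", quietly = TRUE) &&
    requireNamespace("mlegp", quietly = TRUE)) {
  set.seed(3)
  X <- lhs::randomLHS(20, 2)               # 20-run design on [0, 1]^2
  y <- sin(2 * pi * X[, 1]) + X[, 2]^2     # hypothetical simulator output

  # Maximum likelihood fit of a Gaussian process to the runs.
  fit <- mlegp::mlegp(X, y, verbose = 0)

  # Prediction and standard error at one untried input.
  x_new <- matrix(c(0.5, 0.5), nrow = 1)
  pred <- predict(fit, newData = x_new, se.fit = TRUE)
  print(pred$fit)
  print(pred$se.fit)
}
```

The standard error gives the uncertainty estimate that the statistical approach promises for untried inputs.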
Example Workflow
Latin Hypercube Actor
Latin Hypercube Design
Nimrod/K Actor
Nimrod takes the experimental design and controls the running of the experiments and collation of results.
Passes the results on to the mlegp actor, which fits the Gaussian process model to the data.
mlegp predictions Actor
Takes the fitted model and predicts at a grid of untried inputs.
Inputs are the granularity of the grid, and which inputs are primary and which are conditioning.
Uses Lattice graphics to produce a visualisation of the surface.
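What such an actor might do internally can be sketched as follows. This is not the actor's actual code: `predict_yhat` is a hypothetical stand-in for the fitted model's predictor, and the input ranges and granularity are assumed values. The plot is drawn only if the `lattice` package is available.

```r
# Hypothetical stand-in for predictions from a fitted model.
predict_yhat <- function(grid) {
  with(grid, 5 + 3 * sin(x1 / 4) * cos(x2 / 4) + x3 / 10)
}

granularity <- 15                               # grid points per primary input
grid <- expand.grid(
  x1 = seq(0, 25, length.out = granularity),    # primary input
  x2 = seq(0, 25, length.out = granularity),    # primary input
  x3 = seq(0, 25, length.out = 4)               # conditioning input, 4 levels
)
grid$yhat <- predict_yhat(grid)

if (requireNamespace("lattice", quietly = TRUE)) {
  # One surface over the primary inputs per level of the conditioning input.
  surface <- lattice::wireframe(yhat ~ x1 * x2 | factor(round(x3, 2)),
                                data = grid, drape = TRUE)
  print(surface)
}
```

The primary inputs form the axes of each surface, while each level of the conditioning input gets its own panel.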
Visualisation
[Figure: four surface plots of yhat against x1 and x2, one panel for each of x3 = 0, 8.33, 16.67, and 25; scale ticks run from 0 to 10.]
Key Message
Computer Experiments are very important.
R has many tools both to design and to analyse computer experiments.
Nimrod tools are convenient for managing the execution of the computer experiments.
Using Nimrod/K takes advantage of the Kepler workflow engine.
Kepler and R are integrated, making it easy to use existing R packages for computer experiments and extending their usefulness.
MeSsAGE Lab
Monash eScience and Grid Engineering Laboratory
http://messagelab.monash.edu.au/