ABC model choice: beyond summary selection. Christian P. Robert, Université Paris-Dauphine, Paris & University of Warwick, Coventry. JSM 2014, Boston, August 6, 2014


DESCRIPTION

JSM 2014, Model choice session

TRANSCRIPT

Page 1: Boston talk

ABC model choice: beyond summary selection

Christian P. Robert, Université Paris-Dauphine, Paris & University of Warwick, Coventry

JSM 2014, Boston

August 6, 2014

Page 2: Boston talk

Outline

Joint work with J.-M. Cornuet, A. Estoup, J.-M. Marin, and P. Pudlo

Intractable likelihoods

ABC methods

ABC model choice via random forests

Conclusion

Page 3: Boston talk

Intractable likelihood

Case of a well-defined statistical model where the likelihood function

ℓ(θ|y) = f(y₁, …, yₙ|θ)

- is (really!) not available in closed form

- cannot (easily!) be either completed or demarginalised

- cannot be (at all!) estimated by an unbiased estimator

- examples: latent variable models of high dimension, including combinatorial structures (trees, graphs), missing normalising constant f(x|θ) = g(y, θ)/Z(θ) (e.g. Markov random fields, exponential graphs, …)

© Prohibits direct implementation of a generic MCMC algorithm like Metropolis–Hastings, which gets stuck exploring missing structures


Page 5: Boston talk

Getting approximative

Case of a well-defined statistical model where the likelihood function

ℓ(θ|y) = f(y₁, …, yₙ|θ)

is out of reach

Empirical A to the original B problem

- Degrading the data precision down to tolerance level ε

- Replacing the likelihood with a non-parametric approximation

- Summarising/replacing the data with insufficient statistics


Page 8: Boston talk

Approximate Bayesian computation

Intractable likelihoods

ABC methods
  abc of ABC
  Summary statistic
  ABC for model choice

ABC model choice via random forests

Conclusion

Page 9: Boston talk

Genetic background of ABC

ABC is a recent computational technique that only requires being able to sample from the likelihood f(·|θ)

This technique stemmed from population genetics models, about 15 years ago, and population geneticists still significantly contribute to methodological developments of ABC.

[Griffiths et al., 1997; Tavaré et al., 1999]

Page 10: Boston talk

ABC methodology

Bayesian setting: the target is π(θ)f(x|θ). When the likelihood f(x|θ) is not in closed form, likelihood-free rejection technique:

Foundation

For an observation y ∼ f(y|θ), under the prior π(θ), if one keeps jointly simulating

θ′ ∼ π(θ), z ∼ f(z|θ′),

until the auxiliary variable z is equal to the observed value, z = y, then the selected

θ′ ∼ π(θ|y)

[Rubin, 1984; Diggle & Gratton, 1984; Tavaré et al., 1997]


Page 13: Boston talk

A as A...pproximative

When y is a continuous random variable, strict equality z = y is replaced with a tolerance zone

ρ(y, z) ≤ ε

where ρ is a distance. Output distributed from

π(θ) Pθ{ρ(y, z) < ε} ∝ π(θ | ρ(y, z) < ε)   [by definition]

[Pritchard et al., 1999]


Page 15: Boston talk

ABC recap

Likelihood-free rejection sampling [Tavaré et al., 1997, Genetics]

1) Set i = 1,

2) Generate θ′ from the prior distribution,

3) Generate z′ from the likelihood f(·|θ′),

4) If ρ(η(z′), η(y)) ≤ ε, set (θi, zi) = (θ′, z′) and i = i + 1,

5) If i ≤ N, return to 2).

Keep the θ's such that the distance between the corresponding simulated dataset and the observed dataset is small enough (a code sketch follows the tuning parameters below).

Tuning parameters

- ε > 0: tolerance level,

- η(z): function that summarizes datasets,

- ρ(η, η′): distance between vectors of summary statistics,

- N: size of the output
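A minimal R sketch of the rejection sampler above, assuming toy stand-ins: f(·|θ) is taken to be an N(θ, 1) sample, η is the sample mean, ρ the absolute difference and the prior an N(0, 10²); all of these are illustrative placeholders to be replaced by the actual simulator, summaries and prior.

```r
## Likelihood-free rejection sampler (sketch). Toy defaults: replace
## rprior, rlik, eta and rho with the actual model components.
abc_rejection <- function(y, N = 1000, eps = 0.1,
                          rprior = function() rnorm(1, 0, 10),
                          rlik   = function(theta, n) rnorm(n, theta, 1),
                          eta    = mean,
                          rho    = function(a, b) abs(a - b)) {
  n <- length(y)
  eta_y <- eta(y)
  theta_kept <- numeric(N)
  i <- 1
  while (i <= N) {
    theta_prop <- rprior()                 # 2) draw theta' from the prior
    z <- rlik(theta_prop, n)               # 3) draw z' from the likelihood
    if (rho(eta(z), eta_y) <= eps) {       # 4) accept if summaries are close enough
      theta_kept[i] <- theta_prop
      i <- i + 1
    }
  }
  theta_kept                               # 5) stop once N draws are kept
}

## usage: y <- rnorm(50, 2, 1); post <- abc_rejection(y)
```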

Page 16: Boston talk

Which summary?

Fundamental difficulty of the choice of the summary statistic when there are no non-trivial sufficient statistics [except when done by the experimenters in the field]

- Loss of statistical information balanced against gain in data roughening

- Approximation error and information loss remain unknown

- Choice of statistics induces choice of distance function towards standardisation

- may be imposed for external/practical reasons (e.g., DIYABC)

- may gather several non-Bayesian point estimates

- can learn about efficient combination


Page 19: Boston talk

attempts at summaries

How to choose the set of summary statistics?

- Joyce and Marjoram (2008, SAGMB)

- Nunes and Balding (2010, SAGMB)

- Fearnhead and Prangle (2012, JRSS B)

- Ratmann et al. (2012, PLoS Comput. Biol.)

- Blum et al. (2013, Statistical Science)

- EP-ABC of Barthelmé & Chopin (2013, JASA)

Page 20: Boston talk

ABC for model choice

Intractable likelihoods

ABC methods
  abc of ABC
  Summary statistic
  ABC for model choice

ABC model choice via random forests

Conclusion

Page 21: Boston talk

ABC model choice

A) Generate a large set of (m, θ, z) from the Bayesian predictive, π(m)πm(θ)fm(z|θ)

B) Keep particles (m, θ, z) such that ρ(η(y), η(z)) ≤ ε

C) For each m, return pm = proportion of m among the remaining particles

If ε is tuned so that the final number of particles is k, then pm is a k-nearest-neighbour estimate of P(M = m | η(y)) (sketched in code below). Approximating posterior probabilities of models is a regression problem where

- the response is 1{M = m},

- the covariates are the summary statistics η(z),

- the loss is, e.g., L² (conditional expectation)

The preferred method for approximating posterior probabilities in DIYABC is local polychotomous logistic regression
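A minimal R sketch of steps A)-C), assuming a reference table `ref` with a column `m` holding the model index and the remaining columns holding the summary statistics (illustrative names); the MAD standardisation of the summaries is one possible scaling choice, not prescribed by the slide.

```r
## k-nearest-neighbour estimate of p(M = m | eta(y)) from a reference table.
## 'ref': data frame with factor column 'm' plus summary columns.
## 'eta_y': numeric vector of observed summaries, in the same column order.
abc_model_probs <- function(ref, eta_y, k = 500) {
  S <- as.matrix(ref[, setdiff(names(ref), "m")])
  # standardise each summary by its MAD before computing Euclidean distances
  scale_mad <- apply(S, 2, mad)
  scale_mad[scale_mad == 0] <- 1
  d <- sqrt(rowSums(sweep(sweep(S, 2, eta_y, "-"), 2, scale_mad, "/")^2))
  kept <- ref$m[order(d)[seq_len(k)]]   # epsilon tuned so that exactly k particles remain
  prop.table(table(kept))               # p_m = proportion of each model index among them
}
```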

Page 22: Boston talk

Machine learning perspective

ABC model choice

A) Generate a large set of (m, θ, z)'s from the Bayesian predictive, π(m)πm(θ)fm(z|θ)

B) Use machine learning techniques to infer on π(m | η(y))

In this perspective:

- the (iid) “data set” is the reference table simulated during stage A)

- the observed y becomes a new data point

Note that:

- predicting m is a classification problem ⇐⇒ select the best model based on a maximum a posteriori rule

- computing π(m | η(y)) is a regression problem ⇐⇒ confidence in each model

Classification is much simpler than regression (e.g., in the dimension of the objects we try to learn)

Page 23: Boston talk

Warning

the loss of information induced by using non-sufficient summary statistics is a real problem

Fundamental discrepancy between the genuine Bayes factors/posterior probabilities and the Bayes factors based on summary statistics. See, e.g.,

- Didelot et al. (2011, Bayesian Analysis)

- Robert, Cornuet, Marin and Pillai (2011, PNAS)

- Marin, Pillai, Robert, Rousseau (2014, JRSS B)

- ...

We suggest using instead a machine learning approach that can deal with a large number of correlated summary statistics:

E.g., random forests

Page 24: Boston talk

ABC model choice via random forests

Intractable likelihoods

ABC methods

ABC model choice via random forests
  Random forests
  ABC with random forests
  Illustrations

Conclusion

Page 25: Boston talk

Leaning towards machine learning

Main notions:

- ABC-MC seen as learning about which model is most appropriate from a huge (reference) table

- exploiting a large number of summary statistics is not an issue for machine learning methods intended to estimate efficient combinations

- abandoning (temporarily?) the idea of estimating posterior probabilities of the models, poorly approximated by machine learning methods, and replacing those by posterior predictive expected losses

Page 26: Boston talk

Random forests

Technique that stemmed from Leo Breiman's bagging (or bootstrap aggregating) machine learning algorithm for both classification and regression

[Breiman, 1996]

Improved classification performance by averaging over classification schemes of randomly generated training sets, creating a “forest” of (CART) decision trees, inspired by Amit and Geman (1997) ensemble learning

[Breiman, 2001]

Page 27: Boston talk

Growing the forest

Breiman's solution for inducing random features in the trees of the forest:

- bootstrap resampling of the dataset, and

- random sub-setting [of size √t] of the covariates driving the classification at every node of each tree

The covariate xτ that drives the node separation

xτ ≷ cτ

and the separation bound cτ are chosen by minimising the entropy or Gini index
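To make the splitting rule concrete, a toy R sketch that scans candidate bounds cτ for a single covariate and keeps the one minimising the size-weighted Gini index of the two children; real forest implementations do this internally over the randomly sub-setted covariates, so this is purely illustrative.

```r
## Toy node-splitting rule: for one covariate x and class labels y,
## choose the bound c minimising the size-weighted Gini index.
## Assumes x takes at least two distinct values.
gini <- function(y) {
  p <- table(y) / length(y)
  1 - sum(p^2)
}
best_split <- function(x, y) {
  cands <- sort(unique(x))
  cands <- (head(cands, -1) + tail(cands, -1)) / 2   # midpoints as candidate bounds
  scores <- sapply(cands, function(c) {
    left <- y[x <= c]; right <- y[x > c]
    (length(left) * gini(left) + length(right) * gini(right)) / length(y)
  })
  c(bound = cands[which.min(scores)], gini = min(scores))
}
```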

Page 28: Boston talk

Breiman and Cutler’s algorithm

Algorithm 1 Random forests

for t = 1 to T do    //* T is the number of trees *//
    Draw a bootstrap sample of size nboot ≠ n
    Grow an unpruned decision tree
    while leaves are not monochrome do
        Select ntry of the predictors at random
        Determine the best split from among those predictors
    end while
end for
Predict new data by aggregating the predictions of the T trees
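A hedged sketch of Algorithm 1 via the randomForest R package (Liaw and Wiener's port of Breiman and Cutler's code), mapping T to ntree, nboot to sampsize and ntry to mtry; the synthetic data frame `ref` stands in for an ABC reference table and is illustrative only.

```r
## Algorithm 1 via the randomForest package. 'ref' is an illustrative
## stand-in for an ABC reference table: factor 'm' (class) + p covariates.
library(randomForest)

p   <- 10
ref <- data.frame(m = factor(sample(1:3, 1000, replace = TRUE)),
                  matrix(rnorm(1000 * p), ncol = p))

fit <- randomForest(m ~ ., data = ref,
                    ntree    = 500,              # T: number of trees
                    sampsize = nrow(ref) %/% 2,  # nboot: bootstrap sample size (may differ from n)
                    mtry     = floor(sqrt(p)))   # ntry: predictors tried at each split
```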

Page 29: Boston talk

ABC with random forests

Idea: Starting with

- a possibly large collection of summary statistics (s1i, …, spi) (from scientific theory input to available statistical software, to machine-learning alternatives)

- an ABC reference table involving model index, parameter values and summary statistics for the associated simulated pseudo-data

run R randomForest to infer M from (s1i, …, spi)


At each step, O(√p) indices are sampled at random and the most discriminating statistic is selected, by minimising the entropy or Gini loss


The average of the trees is the resulting summary statistic, a highly non-linear predictor of the model index

Page 32: Boston talk

Outcome of ABC-RF

The random forest predicts a (MAP) model index from the observed dataset: the predictor provided by the forest is “sufficient” to select the most likely model but not to derive the associated posterior probability

- exploit the entire forest by computing how many trees lead to picking each of the models under comparison (see the sketch below), but the variability is too high to be trusted

- the frequency of trees associated with the majority model is no proper substitute for the true posterior probability

- and the usual ABC-MC approximation is equally highly variable and hard to assess
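For concreteness, a sketch of the vote-frequency computation mentioned in the first bullet, assuming the forest `fit` from the earlier sketch and a one-row data frame `eta_obs` of observed summaries (both illustrative names); as stressed above, these frequencies are not a trustworthy approximation of the posterior probabilities.

```r
## Fraction of trees voting for each model at the observed summaries.
## 'fit' is the forest from the earlier sketch; 'eta_obs' must have the
## same summary columns as the reference table.
eta_obs   <- ref[1, -1]                                          # placeholder observed summaries
map_model <- predict(fit, newdata = eta_obs, type = "response")  # forest (MAP) model index
votes     <- predict(fit, newdata = eta_obs, type = "vote")      # per-model vote fractions
## NB: 'votes' is NOT a reliable approximation of p(m | eta(y))
```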


Page 34: Boston talk

Posterior predictive expected losses

We suggest replacing the unstable approximation of

P(M = m | xo)

with xo the observed sample and m the model index, by the average of the selection errors across all models given the data xo,

P(M(X) ≠ M | xo)

where the pair (M, X) is generated from the predictive ∫ f(x|θ) π(θ, M | xo) dθ

and M(x) denotes the random forest (MAP) model predictor

Page 35: Boston talk

Posterior predictive expected losses

Arguments:

- Bayesian estimate of the posterior error

- integrates the error over the most likely part of the parameter space

- gives an averaged error rather than the posterior probability of the null hypothesis

- easily computed: given an ABC subsample of parameters from the reference table, simulate the pseudo-samples associated with those and derive the error frequency (see the sketch below)
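A minimal R sketch of that error-frequency computation, assuming `accepted` holds the ABC-retained pairs (a column `m` plus parameter columns), `fit` is the random forest predictor, and `simulate_summaries(m, theta)` is a hypothetical user-supplied simulator returning a one-row data frame of summaries for one pseudo-dataset (column names matching the reference table).

```r
## Posterior predictive expected loss P(M(X) != M | x_o), estimated as the
## misclassification frequency of the forest over pseudo-samples simulated
## from the ABC-retained (m, theta) pairs.
posterior_error <- function(accepted, fit, simulate_summaries) {
  par_cols <- setdiff(names(accepted), "m")
  preds <- vapply(seq_len(nrow(accepted)), function(i) {
    eta_z <- simulate_summaries(accepted$m[i], accepted[i, par_cols])  # one pseudo-dataset
    as.character(predict(fit, newdata = eta_z, type = "response"))     # forest MAP prediction
  }, character(1))
  mean(preds != as.character(accepted$m))   # error frequency across pseudo-samples
}
```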

Page 36: Boston talk

toy: MA(1) vs. MA(2)

Comparing an MA(1) and an MA(2) model:

x_t = ε_t − ϑ1 ε_{t−1} [− ϑ2 ε_{t−2}]

Earlier illustration using the first two autocorrelations as S(x) [Marin et al., Stat. & Comp., 2011]

Result #1: the values of p(m|x) [obtained by numerical integration] and p(m|S(x)) [obtained by mixing the ABC outcome and density estimation] differ greatly!
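To make the toy setting concrete, an R sketch simulating series under MA(1) or MA(2) and recording the first two autocorrelations as S(x); the series length and the uniform parameter range are illustrative and not taken from the talk.

```r
## Toy MA(1) vs MA(2) set-up: simulate a series under each model and keep
## the first two autocorrelations as S(x).
sim_ma <- function(theta, n = 100) {
  eps <- rnorm(n + 2)
  if (length(theta) == 1) theta <- c(theta, 0)      # MA(1) as a special case of MA(2)
  eps[3:(n + 2)] - theta[1] * eps[2:(n + 1)] - theta[2] * eps[1:n]
}
S <- function(x) acf(x, lag.max = 2, plot = FALSE)$acf[2:3]  # first two autocorrelations

## one reference-table row per simulation, e.g. under MA(1) with an
## illustrative uniform prior on (-1, 1):
theta1 <- runif(1, -1, 1)
c(m = 1, S(sim_ma(theta1)))
```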

Page 37: Boston talk

toy: MA(1) vs. MA(2)

[Figure: Difference between the posterior probability of MA(2) given either x or S(x); blue stands for data from MA(1), orange for data from MA(2)]

Page 38: Boston talk

toy: MA(1) vs. MA(2)

Comparing an MA(1) and an MA(2) model:

x_t = ε_t − ϑ1 ε_{t−1} [− ϑ2 ε_{t−2}]

Earlier illustration using two autocorrelations as S(x) [Marin et al., Stat. & Comp., 2011]

Result #2: the models are embedded, with simulations from MA(1) lying within those from MA(2), hence linear classification performs poorly

Page 39: Boston talk

toy: MA(1) vs. MA(2)

[Figure: Simulations of S(x) under MA(1) (blue) and MA(2) (orange)]

Page 40: Boston talk

toy: MA(1) vs. MA(2)

Comparing an MA(1) and an MA(2) model:

x_t = ε_t − ϑ1 ε_{t−1} [− ϑ2 ε_{t−2}]

Earlier illustration using two autocorrelations as S(x) [Marin et al., Stat. & Comp., 2011]

Result #3: on such a low-dimensional problem, random forests should come second to k-nn or kernel discriminant analyses

Page 41: Boston talk

toy: MA(1) vs. MA(2)

classification method                          prior error rate (in %)
LDA                                            27.43
Logistic regression                            28.34
SVM (library e1071)                            17.17
“naïve” Bayes (with G marg.)                   19.52
“naïve” Bayes (with NP marg.)                  18.25
ABC k-nn (k = 100)                             17.23
ABC k-nn (k = 50)                              16.97
Local logistic regression (k = 1000)           16.82
Random forest                                  17.04
Kernel discriminant analysis (KDA)             16.95
True MAP                                       12.36

Page 42: Boston talk

Evolution scenarios based on SNPs

Three scenarios for the evolution of three populations from their most common ancestor

Page 43: Boston talk

Evolution scenarios based on SNPs

Model 1 with 6 parameters:

- four effective sample sizes: N1 for population 1, N2 for population 2, N3 for population 3 and, finally, N4 for the native population;

- the time of divergence ta between populations 1 and 3;

- the time of divergence ts between populations 1 and 2;

- effective sample sizes with independent uniform priors on [100, 30000];

- vector of divergence times (ta, ts) with uniform prior on {(a, s) ∈ [10, 30000] ⊗ [10, 30000] | a < s}

Page 44: Boston talk

Evolution scenarios based on SNPs

Model 2 has the same parameters as model 1, but the divergence time ta corresponds to a divergence between populations 2 and 3; prior distributions are identical to those of model 1.

Model 3 has an extra seventh parameter, the admixture rate r. In that scenario, at time ta there is admixture between populations 1 and 2, from which population 3 emerges. The prior distribution on r is uniform on [0.05, 0.95]. In that case models 1 and 2 are not embedded in model 3. Prior distributions for the other parameters are the same as in model 1.

Page 45: Boston talk

Evolution scenarios based on SNPs

Set of 48 summary statistics: single-sample statistics

- proportion of loci with null gene diversity (= proportion of monomorphic loci)

- mean gene diversity across polymorphic loci [Nei, 1987]

- variance of gene diversity across polymorphic loci

- mean gene diversity across all loci

Page 46: Boston talk

Evolution scenarios based on SNPs

Set of 48 summary statistics: two-sample statistics

- proportion of loci with null FST distance between both samples [Weir and Cockerham, 1984]

- mean across loci of the non-null FST distances between both samples

- variance across loci of the non-null FST distances between both samples

- mean across loci of the FST distances between both samples

- proportion of loci with null Nei's distance between both samples [Nei, 1972]

- mean across loci of the non-null Nei's distances between both samples

- variance across loci of the non-null Nei's distances between both samples

- mean across loci of the Nei's distances between the two samples

Page 47: Boston talk

Evolution scenarios based on SNPs

Set of 48 summary statistics: three-sample statistics

- proportion of loci with a null admixture estimate

- mean across loci of the non-null admixture estimates

- variance across loci of the non-null admixture estimates

- mean across all loci of the admixture estimates

Page 48: Boston talk

Evolution scenarios based on SNPs

For a sample of 1000 SNPs measured on 25 biallelic individuals per population, learning the ABC reference table with 20,000 simulations, prior predictive error rates:

- “naïve Bayes” classifier: 33.3%

- raw LDA classifier: 23.27%

- ABC k-nn [Euclidean distance on summaries normalised by MAD]: 25.93%

- ABC k-nn [unnormalised Euclidean distance on LDA components]: 22.12%

- local logistic classifier based on LDA components, k = 500 neighbours: 22.61%

- local logistic classifier based on LDA components, k = 1000 neighbours: 22.46%

- local logistic classifier based on LDA components, k = 5000 neighbours: 22.43%

- random forest on summaries: 21.03%

- random forest on LDA components only: 23.1%

- random forest on summaries and LDA components: 19.03%

(Error rates computed on a prior sample of size 10⁴)


Page 54: Boston talk

Evolution scenarios based on SNPs

Posterior predictive error rates

favourable: 0.010 error – unfavourable: 0.104 error

Page 55: Boston talk

Instance of ecological questions [message in a beetle]

- How did the Asian ladybird beetle arrive in Europe?

- Why do they swarm right now?

- What are the routes of invasion?

- How to get rid of them?

[Lombaert et al., 2010, PLoS ONE] [photo: beetles in forests]

Page 56: Boston talk

Worldwide invasion routes of Harmonia axyridis

[Estoup et al., 2012, Molecular Ecology Res.]

Page 57: Boston talk

Asian Ladybirds [message in a beetle]

Comparing 10 scenarios of Asian beetle invasion [figure: beetle moves]

Page 58: Boston talk

Asian Ladybirds [message in a beetle]

Comparing 10 scenarios of Asian beetle invasion [figure: beetle moves]

classification method                      prior error* rate (in %)
raw LDA                                    38.94
“naïve” Bayes (with G margins)             54.02
k-nn (MAD-normalised summary statistics)   58.47
RF without LDA components                  38.84
RF with LDA components                     35.32

* estimated on pseudo-samples of 10⁴ items drawn from the prior

Page 59: Boston talk

Back to Asian Ladybirds [message in a beetle]

Comparing 10 scenarios of Asian beetle invasion beetle moves

Random forest allocation frequencies

scenario    1      2    3      4      5      6      7      8     9      10
frequency   0.168  0.1  0.008  0.066  0.296  0.016  0.092  0.04  0.014  0.2

Posterior predictive error based on 20,000 prior simulations and keeping 500 neighbours (or 100 neighbours and 10 pseudo-datasets per parameter):

0.3682

Page 60: Boston talk

Back to Asian Ladybirds [message in a beetle]

Comparing 10 scenarios of Asian beetle invasion

posterior predictive error 0.368

Page 61: Boston talk

Conclusion

Key ideas

- π(m | η(y)) ≠ π(m | y)

- Rather than approximating π(m | η(y)), focus on selecting the best model (classification vs. regression)

- Use a seasoned machine learning technique selecting from the ABC simulations: minimising the 0–1 loss mimics the MAP rule

- Assess confidence in the selection with posterior predictive expected losses

Consequences on ABC-PopGen

- Often, RF ≫ k-NN (less sensitive to high correlation in summaries)

- RF requires many fewer prior simulations

- RF automatically selects relevant summaries

- Hence RF can handle much more complex models


Page 63: Boston talk

Thank you!