data and complex models - university of bristolmadjl/data and complex models.pdf · models daniel...
TRANSCRIPT
![Page 1: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/1.jpg)
Data and Complex Models
Daniel LawsonUniversity of Bristol
work done at Biomathematics and Statistics Scotland
![Page 2: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/2.jpg)
Why bother with data?
Models are interesting in themselves – why bother with data?Adds legitimacy to the model
Concrete example of the models relevance
Gets the model looked at and used by other scientistsCitations!
<cynic>Applied journals produce a lot more paper volume... </cynic>
![Page 3: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/3.jpg)
Modelling conversation
Distinguished BiologistA. N. Other, Modeller
![Page 4: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/4.jpg)
Modelling conversation
I've got this great model that predicts really cool stuff is going to happen!
![Page 5: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/5.jpg)
Modelling conversation
Yes? But does it relate to reality?
I've got this great model that predicts really cool stuff is going to happen!
![Page 6: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/6.jpg)
Modelling conversation
Yes? But does it relate to reality?
Of course! I already said it was cool.
I've got this great model that predicts really cool stuff is going to happen!
![Page 7: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/7.jpg)
Modelling conversation
Yes? But does it relate to reality?
Of course! I already said it was cool.
I've got this great model that predicts really cool stuff is going to happen!
Really? Show me one example where it works in
practice.
![Page 8: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/8.jpg)
Modelling conversation
Yes? But does it relate to reality?
...
Of course! I already said it was cool.
I've got this great model that predicts really cool stuff is going to happen!
Really? Show me one example where it works in
practice.
![Page 9: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/9.jpg)
Talk Outline
MotivationModel testing overviewBayesian statistics overview
General approachMCMC method
Example 1: ODE model of gut bacteriaUses MCMC and stats methodology
Example 2: Ecological neutral modelTrue complex model with interesting predictionsUses Bayesian approach to do hypothesis testing
![Page 10: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/10.jpg)
Patterns in nature
Observe some interesting pattern in natureFrom Physics, Biology, Chemistry, etc
Create a modelReproduce the patternIs the model the real process?Many processes produce the same pattern!
![Page 11: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/11.jpg)
What causes the pattern?Best way to find out is with idealised experimental system
Test model assumptionsKnock out experiments, etc
Can’t always do this!Often can in traditional physicsComplex systems more difficult to study
Formal inference often neededMathematically interesting model is usually the limit of a more realistic & general model
Full model more suitable for testing
![Page 12: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/12.jpg)
Classical tests of models
Formal hypothesis testingGives clearest resultsHard to do in practiceHard to compare models
Information criterion (AIC, BIC)Informal “heuristic” for model fitCompare easily between related modelsCan be difficult to interpret for unrelated modelsPossible to “over fit” noise
Bayesian Model SelectionHardest to perform
![Page 13: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/13.jpg)
Information Criterion
AIC=2k−log P D∣M
Akaike Information Criterion:
Bayesian Information Criterion:
where k=number of parameters, n=number of datapoints
P(D|M) is the probability of the data given the model (the likelihood).
Both penalise parameters against model fit, but weight all parameters equally (not fair on complex models)
BIC=k log n−log P D∣M
![Page 14: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/14.jpg)
Example of pattern matching
Ecological neutral theory - all species are “just as good”Fixed N individuals equally likely to die or reproduce in a timestepMutations occur at rate p on reproduction, creating a new speciesWhat is the distribution of species sizes – the “Species Abundance Distribution”?
Does it match with data?
![Page 15: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/15.jpg)
Species Abundance Distribution
Is this evidence of neutral dynamics?Competing models just as good AICIs this even useful evidence?
Neutral prediction
Statistical curve fit
![Page 16: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/16.jpg)
Bayesian Parameter estimation
Posterior:
is the Likelihood of the data given the model is the Prior probability of the model parameters is the probability of the data – requires integration over all model parameters
Usually have to evaluate numerically
θ∣D=PD∣θ Pθ
P D
PD
Pθ
PD∣θ
θ
![Page 17: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/17.jpg)
(Insulting) Frequentist example of “Bayesian” Estimation
Have six dice labelled 1..6, dice x has x white faces and (6-x) black faces. One dice is taken at random and rolled. What is the probability the rolled dice was x given that the face observed is white?
Calculations:
Answer:
p x∣white=p white∣x p x
p white
p x=1/6
p white∣x =x /6
p white =∑x=1
6
p white∣x p x =21/36
p x∣white=x /21
![Page 18: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/18.jpg)
MCMC (Markov Chain Monte Carlo)Sample from the posterior probability distribution
Likelihood of parameters given some dataPrior : previous experimentsMetropolis-Hastings algorithm:
θ∝P D∣θ Pθ
min 1,π θ ' Q θ ' , θ π θ Qθ ,θ '
Select new parameters
Where is the proposal distribution.
Accept with probability
θ '~Q θθ '
θ '
Q
P D∣θ
P θ
![Page 19: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/19.jpg)
MCMC (2)
Build up Posterior Distribution over many iterations N
Parameter 1
Par
amet
er 2
Parameter 1P
aram
eter
2
Many iterations
Reject proposal: add to the parameter set.Accept: add .
Obtain estimator (can smooth)θ ~∑i=0
N
θ−θ i
N
4
21
23
5
14
θ '
θ
![Page 20: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/20.jpg)
MCMC (3)
If proposal distribution Q is irreducible and aperiodic:Guaranteed to obtain posterior as
But nothing said about finite N
Efficient (i.e. good at low N) if proposal distribution matches posterior distribution
And acceptance probability is not too low
N ∞
⇒ θ θ
![Page 21: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/21.jpg)
MCMC as a random walk in a potential
Random walk: probability of moving left q and right p with p+q=1:
MCMC: probability of moving left (Q symmetric):
MCMC in a potential:
q xp x−1
=exp−V x−1−V xT
q x =
min 1, x−1 x
min1,x−1x min1,
x1 x
V x T
=−log [ x [min1, x−1 x min1,
x1 x ]]
![Page 22: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/22.jpg)
Obtaining faster convergence
Methods includeReparameterisationReduction or simplification of parameter spaceDifficult in complex models
Annealing (“heating”)Decrease temperature with time to find high probability regionsOnly gets to equilibrium distribution, doesn’t explore it
Auxiliary variablesAugment parameter space to allow better mixingHard to apply in general case
![Page 23: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/23.jpg)
Reparameterisation
Parameter p1
Par
amet
er p
2
Parameter p1’
Par
amet
er p
2’
Linear transform(along principle
components)
Parameter p1
Par
amet
er p
2
Parameter r’
Par
amet
er Ө
Reparameterise
![Page 24: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/24.jpg)
Reparameterisation works if we can:Detector Calculate
the shape of the posterior
This is difficult in complex modelsIn practice, parameter region is often of too high dimensionWhy: random walk in D>3 doesn’t fill space!
Use of a “proper prior” guarantees convergenceBut is sometimes an unwarranted assumption
Convergence problems
Parameter p1
Par
amet
er p
2
Explored parameter region
![Page 25: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/25.jpg)
Example 1 – gut bacteria
Differential equation model of gut bacteria growthSeveral strains b
i competing for several
“substrate” resources sj
Interact via reaction products ak (short chain fatty
acids, or SCFA)Flow of all materials through the gut (modelled as several compartments)Detailed data from idealised experimental model available
![Page 26: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/26.jpg)
General form of equations
Change of bacteria = Bacterial growth – bacterial outflow
Growth rate is non-linear in density of bacteria and resource
Substrates and SCFA behave similarly
d B1
dt=B1∑
k
G B1, S k 1∑j
G B1,a j −k B1
G B1, x j =g ij B1t x j t
x j t K i k
Unsolvable non-linear model – equations not important!
![Page 27: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/27.jpg)
InferenceTry to establish qualitative and quantitative behaviourNot all parameters measurableSome time series data is availableLikelihood for differential equation model?Need a probabilistic model!Use stochastic measurement process
![Page 28: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/28.jpg)
Model behaviours
Competitive exclusion
Coexistence
Behaviour with single substrate:
NO CROSS FEEDING
CROSS FEEDING
Time
Bac
teria
Time
Bac
teria
![Page 29: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/29.jpg)
More model behaviours
Input driven extinction
Host competition exclusion
Time
Bac
teria
Time
Bac
teria
Time
Bac
teria
Time
Bac
teria
Periodic input
Host absorbs SCFA
![Page 30: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/30.jpg)
Likelihood from differential equation models
Normal distribution
Observations
Predictions
Errors
Observations
Predictions
Errors
Bacteria A
Bacteria B
SCFA 1
SCFA 2
log LD∣M =∑i
N yi , y xi , i
![Page 31: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/31.jpg)
Data in the gut model
Multiple experiments Ei under different conditions
Each leads to inconclusive inferencePosterior of E1 is summarised, used as prior for E2, etc. Some
parameter estimates never improve.
Combined approach may be better?No need to summariseHierarchical statistical model combines datasets
Dangers:Larger state space – might increase degrees of freedom!Model may be inconsistent between experiments?
![Page 32: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/32.jpg)
Hierarchical statistical model
i2i1 iE...
i=⟨i ⟩
i
i2'
i1'
iE'...
ij
Average parameter value
Variation between experiments
Known deviations
Experiment specificParameter values
Each parameter value hierarchically related to others
Allow for variation and predictable differences between experiments
Can in principle use full covariance matrix over all parameters
Can estimate if enough experiments available
i
![Page 33: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/33.jpg)
Example data
pH (low values are more acidic)
5.5
6.5
pH change
(SCFA 1)(SCFA 2)
Bacteria B Bacteria ABacteria C
![Page 34: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/34.jpg)
Results – prediction
Bacteria A
Bacteria B
Bacteria C
SCFA 1
SCFA 2
![Page 35: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/35.jpg)
Practical application
MCMC inference with hierarchical modelObtain full parameter distributionExplains experimental results in terms if fundamental
bacterial properties
Connect experiment and in-vivo behaviourExtended colonPeriodic food intake
Can predict for different scenarios e.g. effect of antibioticsDirect access to SCFA for health predictions
Care needed - model still too simple
![Page 36: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/36.jpg)
Example 2: Ecological Spatial Pattern model- Scottish Pine TreesRelate mathematically interesting neutral model
to real worldNeutral ecological model (i.e. no heritable
differences)Genetic differences observable through
chemistry (monoterpenes)Spatial data for monoterpenes – large
heterogeneity observedTheoretical models predict this – is the
prediction quantitatively correct?If not, monoterpenes are shown to be selected
![Page 37: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/37.jpg)
Modelled Forest Area
Observed plots
![Page 38: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/38.jpg)
Model ingredients
●Competition for space●Sexual reproduction of trees●Short-ranged seed dispersal●Longer ranged pollen dispersal●Pollen also arrives from outside modelled area●Monoterpenes determined genetically●Neutrality - parenting probability independent of monoterpenes
![Page 39: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/39.jpg)
Inference in model
Define “sufficient likelihood”: set of descriptors that capture all features
Spatial clustering of treesSpatial clustering of monoterpenesLong ranged correlations in monoterpenes
Model too slow for MCMCSample parameters using latin-hypercubeStatistically model the likelihoods to obtain an
approximation of the posterior distribution
![Page 40: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/40.jpg)
Hypothesis test
Obtain likelihood model for the local clustering of trees and monoterpenesThis is hard – requires approximation by variogram models
Obtain a sample from this LOCAL posteriorTest whether the data falls within the 95% confidence interval of the sample for the large scale clustering
![Page 41: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/41.jpg)
Point size proportional to LOCAL likelihood – point position random chosen by hypercube sampling
LOCALLikelihood surface estimated by statistical model of observed likelihoods
95% confidence interval
![Page 42: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/42.jpg)
Hypothesis test
Observed in data
p(D)<1%
Formal test for neutrality in the data
Model parameters sampled from local effect posterior
Consider log likelihood gain from using site number as an explicit factor
Its significantly more important in real data than model
Therefore real data is not explained by a neutral model
PosteriorPrediction
![Page 43: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/43.jpg)
Conclusions
Inference is possible for complex modelsBayesian formalism is the most appropriateMCMC allows sampling from a likelihood, if we can write one downCan formally test whether a model is incorrectLots of scope to mix statistics with complex systems problems!
![Page 44: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/44.jpg)
Conclusions
Inference is possible for complex modelsBayesian formalism is the most appropriateMCMC allows sampling from a likelihood, if we can write one downCan formally test whether a model is incorrectLots of scope to mix statistics with complex systems problems!Thank you for listening!
![Page 45: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/45.jpg)
Conclusions
Inference is possible for complex modelsBayesian formalism is the most appropriateMCMC allows sampling from a likelihood, if we can write one downCan formally test whether a model is incorrectLots of scope to mix statistics with complex systems problems!Thank you for listening!Questions and discussion - might this work on any of your problems? Pre-prints available... just ask
Bristol has a complex systems group with heavy stats involvement...
![Page 46: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/46.jpg)
Extra slides follow
![Page 47: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/47.jpg)
Hypothesis testing
Consider P(D;C|M) < P0
P0 is the threshold for the test, C is a condition to test on the data D
Requires careful formulation:Probability of exactly the data is often infinitesimalConsider probability of exceeding some thresholdUsually only possible for simple cases
Example: is a coin biased?Observe 10 throws, 8 of which are heads
P h≥8∣p h=0.5=∑x=8
10
nx 0.5n=0.054
![Page 48: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/48.jpg)
MCMC for process models
More restrictive prior, or increasing data:Reduces parameter region sizeReduces degrees of freedom in posteriorReduces total variance
# of effective degrees of freedom(in max. likelihood distribution)
Var
ianc
e in
pos
terio
r
MCMC “practical”
MCMC “possible” MCMC “impossible”
3
Must manually identify undeterminable parameter combinations in this region
1
Very slow
Theoretically cannot converge
True variance
Samplevariance
0
![Page 49: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/49.jpg)
Inferred parameter value
(1) Hierarchical model
(3) Non-hierarchical model
(2) Less data -Not measuring substrate output
Simulation study with multiple experiments:Competition at (a) pH 5.5 & (b) 6.5Bacteria A growth experiment at (c) pH 5.5 & (d) 6.5 3 scenarios considered: only scenario (1) has converged MCMC chain!
![Page 50: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/50.jpg)
0.00 0.10 0.20
02
46
810
a.pinene
N = 442 Bandw idth = 0.005385
Norm
alis
ed R
aw
Data
0.00 0.10 0.20
02
46
810
sabinene
N = 442 Bandw idth = 0.00545
Norm
alis
ed R
aw
Data
0.00 0.05 0.10 0.15
010
20
30
40
g.terpinene
N = 442 Bandw idth = 0.002179
Norm
alis
ed R
aw
Data
0.1 0.2 0.3 0.4 0.5
01
23
45
a.pinene
N = 442 Bandw idth = 0.01002
Square
-roote
d D
ata
0.1 0.2 0.3 0.4 0.5
01
23
45
sabinene
N = 442 Bandw idth = 0.009851
Square
-roote
d D
ata
0.0 0.1 0.2 0.3 0.40
24
68
10
g.terpinene
N = 442 Bandw idth = 0.007758
Square
-roote
d D
ata
Monoterpenes are genetically controlled and heritableDistributions can be well approximated by a weighted binomial distributionL genes (values 0 or 1), each contributes differently to monoterpene count
![Page 51: Data and Complex Models - University of Bristolmadjl/Data and Complex Models.pdf · Models Daniel Lawson University of Bristol work done at Biomathematics and Statistics Scotland](https://reader033.vdocuments.site/reader033/viewer/2022060506/5f1ee81287ec682f730f7d32/html5/thumbnails/51.jpg)
Variogram models