recent developments in imprecise probabilities and ... · gdcooma/presentations/recipgm.pdf
TRANSCRIPT
Recent developments in imprecise probabilities and probabilistic graphical models
Gert de Cooman
Ghent University, SYSTeMS
[email protected]
http://users.UGent.be/~gdcooma
gertekoo.wordpress.com
ECAI 2012, 31 August 2012
What would I like to achieve and convey?
IMPRECISE PROBABILITIES
PROBABILISTIC GRAPHICAL MODELS
IMPRECISE PROBABILITY MODELS
Credal sets
Mass functions and expectations

Assume we are uncertain about:
- the value of a variable X
- in a finite set of possible values X.

This is usually modelled by a probability mass function p on X:

p(x) ≥ 0 and ∑x∈X p(x) = 1.

With p we can associate a prevision/expectation operator Pp:

Pp(f) := ∑x∈X p(x) f(x) where f: X → R.

If A ⊆ X is an event, then its probability is given by

Pp(A) = ∑x∈A p(x) = Pp(IA).
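The prevision operator is just a finite weighted sum, so it can be sketched in a few lines of Python (the space X, mass function p and gamble f below are illustrative toy choices, not from the slides):

```python
# Toy sketch of the prevision/expectation operator Pp and the derived
# probability Pp(A) = Pp(IA); all numbers here are illustrative.

def prevision(p, f):
    """Pp(f) = sum over x in X of p(x) * f(x)."""
    return sum(p[x] * f[x] for x in p)

def probability(p, A):
    """Pp(A) = Pp(IA), the prevision of the indicator of the event A."""
    indicator = {x: 1.0 if x in A else 0.0 for x in p}
    return prevision(p, indicator)

p = {'a': 0.5, 'b': 0.3, 'c': 0.2}   # a mass function on X = {a, b, c}
f = {'a': 1.0, 'b': 2.0, 'c': 4.0}   # a gamble f: X -> R

print(round(prevision(p, f), 6))            # 0.5*1 + 0.3*2 + 0.2*4 = 1.9
print(round(probability(p, {'b', 'c'}), 6)) # 0.3 + 0.2 = 0.5
```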
The simplex of all probability mass functions
Consider the simplex ΣX of all mass functions on X:

ΣX := {p ∈ RX+ : ∑x∈X p(x) = 1}.

[Figure: two simplex diagrams ΣX with vertices a = (1,0,0), b = (0,1,0) and c = (0,0,1); the second marks the uniform mass function pu at the centre.]
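Membership of ΣX is easy to test numerically; a toy sketch (the tolerance and the example mass functions are my own choices):

```python
# Check that p lies in the simplex Sigma_X: nonnegative and summing to one.
# The tolerance guards against floating-point rounding; values are illustrative.

def in_simplex(p, tol=1e-9):
    return all(v >= -tol for v in p.values()) and abs(sum(p.values()) - 1.0) <= tol

p_u = {'a': 1/3, 'b': 1/3, 'c': 1/3}      # the uniform mass function pu
print(in_simplex(p_u))                     # True
print(in_simplex({'a': 0.7, 'b': 0.7}))   # False: the masses sum to 1.4
```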
Credal sets
Definition
A credal set M is a convex closed subset of ΣX.

[Figure: four examples of credal sets M drawn inside the simplex with vertices a, b and c.]

It is completely characterised by its set of extreme points ext(M).
Conditioning and credal sets
Suppose we have two variables X1 in X1 and X2 in X2.
A credal set for (X1, X2) jointly is a convex closed set of joint mass functions p(x1, x2):

M ⊆ ΣX1×X2.

This gives rise to a conditional model by applying Bayes's Rule to each mass function:

M|x2 := {p(·|x2) : p ∈ M}.
Working with extreme points does the job too.
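A minimal sketch of this updating step, assuming the joint credal set is represented by a finite list of its extreme points (the two joint mass functions below are toy examples):

```python
# Apply Bayes's Rule to each extreme point of a joint credal set for (X1, X2)
# to obtain the conditional credal set M|x2; all numbers are illustrative.

def condition_on_x2(p, x2):
    """Return p(.|x2) from a joint mass function p over pairs (x1, x2)."""
    norm = sum(pr for (x1, b), pr in p.items() if b == x2)
    if norm == 0:
        raise ValueError("conditioning event has probability zero under p")
    return {x1: pr / norm for (x1, b), pr in p.items() if b == x2}

extremes = [   # extreme points of a toy credal set, X1 and X2 binary
    {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.4},
    {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25},
]

M_given_x2 = [condition_on_x2(p, 1) for p in extremes]
print(round(M_given_x2[0][1], 4))   # 0.4 / 0.7, i.e. 0.5714
print(M_given_x2[1])                # {0: 0.5, 1: 0.5}
```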
Independence and credal sets
Suppose we have two variables X1 in X1 and X2 in X2.
Marginal models are credal sets for X1 and X2 separately:

M1 ⊆ ΣX1 and M2 ⊆ ΣX2.

Their strong product is the joint credal set:

M1 ⊠ M2 := CCH({p1 · p2 : p1 ∈ M1 and p2 ∈ M2}).
This leads to a notion of strong independence.
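Since the convex closed hull of finitely many products is determined by the products of extreme points, a toy sketch needs only those products (the marginal extreme points below are illustrative):

```python
# Extreme points of the strong product: pointwise products p1 * p2 of the
# marginal extreme points (their closed convex hull is the strong product).
# The marginal credal sets here are toy examples.

def product(p1, p2):
    return {(x1, x2): p1[x1] * p2[x2] for x1 in p1 for x2 in p2}

M1_ext = [{'h': 0.4, 't': 0.6}, {'h': 0.7, 't': 0.3}]
M2_ext = [{'u': 0.5, 'd': 0.5}]

joint_ext = [product(p1, p2) for p1 in M1_ext for p2 in M2_ext]
print(len(joint_ext))             # 2 candidate extreme points
print(joint_ext[0][('h', 'u')])   # 0.4 * 0.5 = 0.2
```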
Lower previsions
Lower and upper previsions
[Figure: a credal set M in the simplex ΣX with vertices a, b and c, illustrating the lower prevision P(I{c}) = 1/4 and the upper prevision P(I{c}) = 4/7.]

Equivalent model
Consider the set L(X) = RX of all real-valued maps on X. We define two real functionals on L(X): for all f: X → R,

PM(f) = min{Pp(f) : p ∈ M}  (lower prevision/expectation)
PM(f) = max{Pp(f) : p ∈ M}  (upper prevision/expectation).

Observe that PM(f) = −PM(−f): the upper prevision is the conjugate of the lower one.
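For a credal set given by finitely many extreme points, the min and max are attained on those points, so the envelopes can be sketched directly (the credal set and gamble below are toy examples):

```python
# Lower and upper previsions as lower/upper envelopes of a credal set M,
# represented by its extreme points; numbers are illustrative.

def prevision(p, f):
    return sum(p[x] * f[x] for x in p)

def lower_prevision(M, f):
    return min(prevision(p, f) for p in M)

def upper_prevision(M, f):
    return max(prevision(p, f) for p in M)

M = [{'a': 0.5, 'b': 0.5}, {'a': 0.8, 'b': 0.2}]   # extreme points of M
f = {'a': 0.0, 'b': 1.0}                            # the indicator of {b}

print(lower_prevision(M, f), upper_prevision(M, f))   # 0.2 0.5
# Conjugacy from the slide: upper(f) == -lower(-f).
print(upper_prevision(M, f) == -lower_prevision(M, {x: -v for x, v in f.items()}))  # True
```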
Basic properties of lower previsions
Definition
We call a real functional P on L(X) a lower prevision if it satisfies the following properties, for all f and g in L(X) and all real λ ≥ 0:

1. P(f) ≥ min f  [boundedness];
2. P(f + g) ≥ P(f) + P(g)  [super-additivity];
3. P(λf) = λ P(f)  [non-negative homogeneity].

Theorem
A real functional P is a lower prevision if and only if it is the lower envelope of some credal set M.
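The three properties can be spot-checked numerically on the lower envelope of any toy credal set (the sets and gambles below are my own illustrative choices, not a proof):

```python
# Numerical spot-check of boundedness, super-additivity and non-negative
# homogeneity for the lower envelope of a toy credal set.

def prevision(p, f):
    return sum(p[x] * f[x] for x in p)

def low(M, f):
    return min(prevision(p, f) for p in M)

M = [{'a': 0.5, 'b': 0.5}, {'a': 0.9, 'b': 0.1}]
f = {'a': 1.0, 'b': 3.0}
g = {'a': 2.0, 'b': -1.0}
fg = {x: f[x] + g[x] for x in f}
lam = 2.0

print(low(M, f) >= min(f.values()))         # True: boundedness
print(low(M, fg) >= low(M, f) + low(M, g))  # True: super-additivity
print(abs(low(M, {x: lam * f[x] for x in f}) - lam * low(M, f)) < 1e-9)  # True: homogeneity
```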
Conditioning and lower previsions
Suppose we have two variables X1 in X1 and X2 in X2.
Consider for instance:
- a joint lower prevision P1,2 for (X1, X2) defined on L(X1 × X2);
- a conditional lower prevision P2(·|x1) for X2 conditional on X1 = x1, defined on L(X2), for all values x1 ∈ X1.

Coherence
These lower previsions P1,2 and P2(·|X1) must satisfy certain (joint) coherence criteria: compare with Bayes's Rule and de Finetti's coherence criteria for precise previsions.
See the web site of SIPTA (www.sipta.org) for pointers to more details.
Independence and lower previsions
Suppose we have two variables X1 in X1 and X2 in X2.
Definition (Epistemic irrelevance)
X1 is epistemically irrelevant to X2 when learning the value of X1 does not change our beliefs about X2:

P1,2(f(X2)) = P2(f(X2)|x1) for all f ∈ L(X2) and all x1 ∈ X1.

Important:
Epistemic irrelevance is not a symmetrical notion!
It is weaker than strong independence.
Epistemic independence (also weaker) is the symmetrised version.
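As a toy consistency check (illustrative numbers), conditioning a strong product on X1 returns the X2-marginal, so in that joint model X1 is epistemically irrelevant to X2:

```python
# Under the strong product of two marginal credal sets (toy extreme points),
# conditioning each product extreme p1*p2 on X1 = x1 cancels the factor p1(x1)
# and returns p2, so the conditional model for X2 is the marginal M2 again.

def lower(M, f):
    return min(sum(p[x] * f[x] for x in p) for p in M)

M1 = [{'h': 0.3, 't': 0.7}, {'h': 0.6, 't': 0.4}]
M2 = [{'u': 0.5, 'd': 0.5}, {'u': 0.8, 'd': 0.2}]

joint = [{(a, b): p1[a] * p2[b] for a in p1 for b in p2}
         for p1 in M1 for p2 in M2]

cond = []                      # condition every extreme point on X1 = 'h'
for p in joint:
    norm = sum(pr for (a, b), pr in p.items() if a == 'h')
    cond.append({b: pr / norm for (a, b), pr in p.items() if a == 'h'})

f = {'u': 1.0, 'd': 0.0}
print(abs(lower(cond, f) - lower(M2, f)) < 1e-9)   # True: beliefs about X2 unchanged
```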
Sets of desirable gambles
First steps: Peter Walley (2000)
@ARTICLE{walley2000,
  author  = {Walley, Peter},
  title   = {Towards a unified theory of imprecise probability},
  journal = {International Journal of Approximate Reasoning},
  year    = 2000,
  volume  = 24,
  pages   = {125--148}
}
First steps: Peter Williams (1977)
@ARTICLE{williams2007,
  author  = {Williams, Peter M.},
  title   = {Notes on conditional previsions},
  journal = {International Journal of Approximate Reasoning},
  year    = 2007,
  volume  = 44,
  pages   = {366--383}
}
Set of desirable gambles as a belief model
Gambles:
A gamble f: X → R is an uncertain reward whose value is f(X).

Set of desirable gambles:
D ⊆ L(X) is a set of gambles that a subject strictly prefers to zero.
Why work with sets of desirable gambles?
Working with sets of desirable gambles D:
- is simpler, more intuitive and more elegant
- is more general and expressive than (conditional) lower previsions
- gives a geometrical flavour to probabilistic inference
- includes precise probability as one special case
- includes classical propositional logic as another special case
- shows that probabilistic inference and Bayes's Rule are 'logical' inference
- avoids problems with conditioning on sets of probability zero
Most comprehensive approach so far: note on arXiv
Introduction to Imprecise Probabilities
@BOOK{troffaes2012,
  title     = {Introduction to Imprecise Probabilities},
  publisher = {Wiley},
  editor    = {Augustin, Thomas and Coolen, Frank P. A. and De Cooman, Gert and Troffaes, Matthias C. M.},
  note      = {Due end 2012}
}
IMPRECISE-PROBABILISTIC GRAPHICAL MODELS
Credal sets
Credal networks: the special case of a tree
Basic concept
Consider a directed tree T, with a variable Xt attached to each node t ∈ T.

[Figure: an example directed tree with nodes X1 through X11.]

Each variable Xt assumes values in a set Xt.
Credal trees: local uncertainty models
Local uncertainty model associated with each node t
For each possible value xm(t) ∈ Xm(t) of the mother variable Xm(t), we have a local conditional credal set Mt|Xm(t), which is a collection of credal sets

Mt|xm(t) ⊆ ΣXt for each xm(t) ∈ Xm(t).

[Figure: the mother node Xm(t) with its children Xs, ..., Xt, ..., Xs′.]
Interpretation of the graphical structure
The graphical structure is interpreted as follows:
Conditional on the mother variable, the non-parent non-descendants of each node variable are strongly independent of it and its descendants.

[Figure: the example tree with nodes X1 through X11.]
Lower previsions
Credal trees: local uncertainty models
Local uncertainty model associated with each node t
For each possible value xm(t) ∈ Xm(t) of the mother variable Xm(t), we have a conditional lower prevision/expectation

Qt(·|xm(t)): L(Xt) → R

where

Qt(f|xm(t)) = lower prevision of f(Xt), given that Xm(t) = xm(t).

The local model Qt(·|Xm(t)) is a conditional lower prevision operator.

[Figure: the mother node Xm(t) with its children Xs, ..., Xt, ..., Xs′.]
Interpretation of the graphical structure
The graphical structure is interpreted as follows:
Conditional on the mother variable, the non-parent non-descendants of each node variable are epistemically irrelevant to it and its descendants.

[Figure: the example tree with nodes X1 through X11.]
@ARTICLE{cooman2010,
  author  = {{d}e Cooman, Gert and Hermans, Filip and Antonucci, Alessandro and Zaffalon, Marco},
  title   = {Epistemic irrelevance in credal nets: the case of imprecise {M}arkov trees},
  journal = {International Journal of Approximate Reasoning},
  year    = 2010,
  volume  = 51,
  pages   = {1029--1052},
  doi     = {10.1016/j.ijar.2010.08.011}
}
MePICTIr for updating a credal tree
For a credal tree we can find the joint model from the local models recursively, from leaves to root.

Exact message passing algorithm
- credal tree treated as an expert system
- linear complexity in the number of nodes

Python code
- written by Filip Hermans
- testing and connection with strong independence in cooperation with Marco Zaffalon and Alessandro Antonucci

Current (toy) applications in HMMs
character recognition, air traffic trajectory tracking and identification, earthquake rate prediction
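The leaves-to-root recursion can be sketched on the simplest tree, a chain; the local models and their extreme points below are toy numbers, and this is my own illustrative sketch, not the MePICTIr code:

```python
# Backward (leaves-to-root) propagation of a lower prevision through a chain
# X1 -> X2 -> X3, each local model given by extreme mass functions (toy data).

def lower_exp(extremes, g):
    """Lower expectation of g under a credal set given by its extreme points."""
    return min(sum(p[x] * g[x] for x in p) for p in extremes)

Q1 = [{0: 0.5, 1: 0.5}, {0: 0.6, 1: 0.4}]   # root (marginal) model for X1
Q = {  # Q[k][x_prev]: extreme points of the local model for Xk given X(k-1)
    2: {0: [{0: 0.9, 1: 0.1}], 1: [{0: 0.2, 1: 0.8}]},
    3: {0: [{0: 0.7, 1: 0.3}], 1: [{0: 0.4, 1: 0.6}]},
}

f = {0: 0.0, 1: 1.0}   # gamble on X3: indicator of X3 = 1
g = f
for k in (3, 2):        # from the leaf back towards the root
    g = {x_prev: lower_exp(Q[k][x_prev], g) for x_prev in (0, 1)}

print(round(lower_exp(Q1, g), 6))   # lower probability of X3 = 1: 0.414
```

Each sweep costs time proportional to the number of nodes, matching the linear complexity claimed on the slide.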
@INPROCEEDINGS{cooman2011,
  author    = {De Bock, Jasper and {d}e Cooman, Gert},
  title     = {Imprecise probability trees: Bridging two theories of imprecise probability},
  booktitle = {ISIPTA '11 -- Proceedings of the 7th International Symposium on Imprecise Probability: Theories and Applications},
  year      = 2011,
  editor    = {Coolen, Frank P. A. and {d}e Cooman, Gert and Fetz, Thomas and Oberguggenberger, Michael},
  address   = {Innsbruck, Austria},
  publisher = {SIPTA}
}
An HMM is a special credal tree
[Figure: a hidden Markov model as a credal tree. State sequence: X1 → X2 → ... → Xk → ... → Xn, with local models Q1(·), Q2(·|X1), ..., Qk(·|Xk−1), ..., Qn(·|Xn−1). Output sequence: O1, O2, ..., Ok, ..., On, attached to the states by the local models S1(·|X1), S2(·|X2), ..., Sk(·|Xk), ..., Sn(·|Xn).]
Maximal state sequences
Classically (Viterbi):
Find the state sequence x̂1:n that maximises the posterior probability p(x1:n|o1:n) corresponding to a given observation sequence o1:n.

Maximality (under robust ordering):
Define a partial order > on state sequences:

x̂1:n > x1:n iff p(x̂1:n|o1:n) > p(x1:n|o1:n) for all compatible p(·|o1:n).

Find the state sequences x̂1:n that are maximal: undominated by any other state sequence.
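The robust partial order over a finite set of candidate posteriors can be sketched directly (the posteriors and sequences below are toy, hypothetical numbers):

```python
# Maximality under the robust ordering: sequence s dominates t iff
# p(s|o) > p(t|o) for every compatible posterior p(.|o); toy numbers.

posteriors = [   # hypothetical extreme points of the set of posteriors p(.|o)
    {'aa': 0.5, 'ab': 0.3, 'bb': 0.2},
    {'aa': 0.2, 'ab': 0.45, 'bb': 0.35},
]

def dominates(s, t):
    return all(p[s] > p[t] for p in posteriors)

sequences = ['aa', 'ab', 'bb']
maximal = [s for s in sequences
           if not any(dominates(t, s) for t in sequences if t != s)]
print(maximal)   # ['aa', 'ab']: 'bb' is dominated by 'ab' under every posterior
```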
ESTIHMM for finding all maximal state sequences
Exact backward-forward algorithm
- developed by Jasper De Bock
- finds all maximal state sequences that correspond to a given observation sequence
- quadratic complexity in the number of nodes [linear]
- cubic complexity in the number of states [quadratic]
- linear complexity in the number of maximal sequences [linear]

Python code
- written by Jasper De Bock

Current (toy) applications in HMMs
character recognition, finding gene islands
Sets of desirable gambles
@ARTICLE{moral2005,
  author  = {Moral, Serafín},
  title   = {Epistemic irrelevance on sets of desirable gambles},
  journal = {Annals of Mathematics and Artificial Intelligence},
  year    = 2005,
  volume  = 45,
  pages   = {197--214},
  doi     = {10.1007/s10472-005-9011-0}
}
Most comprehensive approach so far: note on arXiv