Learning Equivalence Classes of Bayesian-Network Structures
David M. Chickering
Presented by Dmitry Zinenko
Heuristic Search
We are looking for the best state in the search space.
  Naively: state = a particular DAG; search space = all possible DAGs over our variables
We move between related states using search operators.
  Naively: edge addition, removal, or reversal
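The naive search space described above can be sketched as a neighbour generator. This is only an illustration: the edge-set representation and the names `is_acyclic` and `neighbours` are assumptions, not something taken from the slides.

```python
from itertools import permutations

def is_acyclic(nodes, edges):
    """Repeatedly strip nodes with no incoming edge; fail if none exists."""
    remaining = set(nodes)
    while remaining:
        sources = [v for v in remaining
                   if not any(b == v and a in remaining for a, b in edges)]
        if not sources:
            return False          # every remaining node has a parent: cycle
        remaining -= set(sources)
    return True

def neighbours(nodes, edges):
    """Yield (operator, new_edges) for every DAG one edge change away."""
    edges = frozenset(edges)
    for a, b in permutations(nodes, 2):
        if (a, b) in edges:
            yield ("remove", a, b), edges - {(a, b)}
            flipped = (edges - {(a, b)}) | {(b, a)}
            if is_acyclic(nodes, flipped):
                yield ("reverse", a, b), flipped
        elif (b, a) not in edges:
            added = edges | {(a, b)}
            if is_acyclic(nodes, added):
                yield ("add", a, b), added
```

A greedy search would score every yielded state and move to the best one, which is exactly where the connectivity and density concerns come from.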
Heuristic Search Challenges
The search-space graph should be well connected
  To reach good states quickly
  To avoid local maxima
The search-space graph should not be too dense
  Scoring and transformations must stay computationally efficient
Equivalence
G1 and G2 are equivalent if the set of distributions that can be represented by them is identical
Equivalence is an equivalence relation!
[Figure: example graphs over X and Y]
Score Equivalence
If all we care about is the probability distribution, all we need is the equivalence class
The scoring function should give equal scores to structures from the same class
  Such a scoring function is called score equivalent
Why prefer one representation of the class to another?
Equivalence Classes Are Good For You
We are ultimately looking for a probability representation, not a particular DAG
Searching individual DAGs is bad:
  Some operators lead to a state in the same class, which hurts efficiency
  Bad state connectivity for greedy search
Theorem 1 (Verma & Pearl 1990)
Two DAGs are equivalent if and only if they have the same skeletons and the same v-structures
[Figure: example DAGs over X, Y, Z]
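Theorem 1 translates directly into a test. The sketch below assumes both DAGs are given as sets of (parent, child) pairs over the same nodes; the function names are illustrative.

```python
from itertools import combinations

def skeleton(edges):
    """Undirected version of the edge set."""
    return {frozenset(e) for e in edges}

def v_structures(edges):
    """All triples (a, c, b) with a -> c <- b and a, b non-adjacent."""
    skel = skeleton(edges)
    parents = {}
    for a, b in edges:
        parents.setdefault(b, set()).add(a)
    found = set()
    for child, pa in parents.items():
        for a, b in combinations(sorted(pa), 2):
            if frozenset((a, b)) not in skel:
                found.add((a, child, b))
    return found

def equivalent(edges1, edges2):
    """Same skeleton and same v-structures (Theorem 1)."""
    return (skeleton(edges1) == skeleton(edges2)
            and v_structures(edges1) == v_structures(edges2))
```

For three nodes, the two chains and the fork over X, Y, Z come out equivalent, while the collider X → Y ← Z forms a class of its own.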
Partially Directed Acyclic Graph
A directed edge is called compelled in G if it has the same direction in every G’ equivalent to G
Otherwise we call it reversible
Partially Directed Acyclic Graph (PDAG)
  Contains both directed and undirected edges
  Does not contain any directed cycles
Theorem 1 extends naturally to PDAGs
  A DAG is also a PDAG
CPDAG and Consistent Extension
The completed PDAG (CPDAG) for Class(G) contains
  directed edges for the compelled edges of G
  undirected edges for the reversible edges of G
G is a consistent extension of P if
  G has the same skeleton and v-structures as P
  Every directed edge in P has the same orientation in G
[Figure: example graphs over X, Y, Z, W]
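The definition of a consistent extension can be checked mechanically. In this sketch the PDAG is assumed to be given as a set of directed (parent, child) pairs plus a set of undirected pairs, and `dag` is assumed to be acyclic already; the names are hypothetical.

```python
from itertools import combinations

def v_structures(directed, skel):
    """Triples (a, c, b) with a -> c <- b and a, b non-adjacent in skel."""
    parents = {}
    for a, b in directed:
        parents.setdefault(b, set()).add(a)
    return {(a, c, b)
            for c, pa in parents.items()
            for a, b in combinations(sorted(pa), 2)
            if frozenset((a, b)) not in skel}

def is_consistent_extension(dag, directed, undirected):
    """dag: directed edges of G; directed/undirected: edges of the PDAG P."""
    dag, directed = set(dag), set(directed)
    skel_p = ({frozenset(e) for e in directed}
              | {frozenset(e) for e in undirected})
    skel_g = {frozenset(e) for e in dag}
    return (skel_g == skel_p                      # same skeleton
            and directed <= dag                   # orientations preserved
            and v_structures(dag, skel_g) == v_structures(directed, skel_p))
```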
CPDAGs And Equivalence
Every consistent extension of P is in Class(P)
If Pc is a completed PDAG, then every DAG G in Class(Pc) is a consistent extension of Pc
If P1 and P2 are completed PDAGs that admit a consistent extension, then P1 = P2 if and only if Class(P1) = Class(P2)
  A completed PDAG uniquely represents its equivalence class
DAG to CPDAG (Meek 1995)
Undirect all edges except those that are in the v-structures
Direct (mark as compelled) undirected edges that match particular patterns
[Figure: orientation patterns over X, Y, Z, W]
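The two steps above can be sketched as follows. The slide's pattern figure is not recoverable, so the three patterns in `forced` are an assumption: they are the standard orientation rules usually attributed to Meek. Nodes are assumed to be sortable (for example strings).

```python
from itertools import combinations

def dag_to_cpdag(edges):
    """Return (directed, undirected) edge sets of the CPDAG of a DAG."""
    edges = set(edges)
    nodes = {v for e in edges for v in e}

    def adjacent(a, b):
        return (a, b) in edges or (b, a) in edges

    # Step 1: only edges taking part in a v-structure stay directed.
    directed = set()
    for v in nodes:
        parents = sorted(a for a, b in edges if b == v)
        for a, b in combinations(parents, 2):
            if not adjacent(a, b):
                directed |= {(a, v), (b, v)}
    undirected = {frozenset(e) for e in edges if e not in directed}

    def forced(b, c):
        """True if a pattern forces the undirected b - c to become b -> c."""
        # Pattern 1: a -> b - c with a and c non-adjacent.
        if any(t == b and not adjacent(a, c) for a, t in directed):
            return True
        # Pattern 2: b -> w -> c together with b - c.
        if any(a == b and (w, c) in directed for a, w in directed):
            return True
        # Pattern 3: b - p -> c and b - q -> c with p, q non-adjacent.
        mids = sorted(p for p, t in directed
                      if t == c and frozenset((b, p)) in undirected)
        return any(not adjacent(p, q) for p, q in combinations(mids, 2))

    # Step 2: apply the patterns until nothing changes.
    changed = True
    while changed:
        changed = False
        for e in sorted(undirected, key=sorted):
            b, c = sorted(e)
            for src, dst in ((b, c), (c, b)):
                if forced(src, dst):
                    undirected.discard(e)
                    directed.add((src, dst))
                    changed = True
                    break
    return directed, undirected
```

On X → Z ← Y with Z → W, the v-structure stays directed and the first pattern then compels Z → W, so the whole graph is its own CPDAG.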
Constructing Consistent Extension (I)
“Theorem 26”: The undirected components of a CPDAG are chordal
  In a DAG, any chordless cycle of length > 3 must contain a v-structure!
Let {Ki} be the set of undirected components of a completed PDAG Pc, and let {Gi} be consistent extensions of {Ki}
A graph G that results from replacing each reversible edge in Ki with the corresponding directed edge from Gi is a consistent extension of Pc
Constructing Consistent Extension (II)
Use decreasing maximum cardinality search (dMCS) to direct the edges in each of the chordal components
  Property of dMCS: every path between any pair of non-adjacent nodes x, y contains a node numbered higher than x or y
The resulting graph is a consistent extension of Pc
Works only on completed PDAGs
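A minimal sketch of orienting one chordal component with maximum cardinality search. It numbers nodes in visiting order and directs every edge from the earlier to the later node; the slides number in decreasing order, so the direction convention here is an assumption, as are the function name and the alphabetical tie-breaking.

```python
def mcs_orient(nodes, undirected_edges):
    """Orient a chordal undirected graph without creating v-structures."""
    nbrs = {v: set() for v in nodes}
    for a, b in undirected_edges:
        nbrs[a].add(b)
        nbrs[b].add(a)

    order = []            # nodes in the order they are numbered
    numbered = set()
    while len(order) < len(nbrs):
        candidates = sorted(v for v in nbrs if v not in numbered)
        # Maximum cardinality: the node with most numbered neighbours wins.
        best = max(candidates, key=lambda v: len(nbrs[v] & numbered))
        order.append(best)
        numbered.add(best)

    rank = {v: i for i, v in enumerate(order)}
    # Direct each edge from the earlier-numbered to the later-numbered node.
    return {(a, b) if rank[a] < rank[b] else (b, a)
            for a, b in undirected_edges}
```

Because the order is total the result is acyclic, and on a chordal graph the earlier-numbered neighbours of each node form a clique, so no v-structure appears.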
PDAG-to-DAG (Dor & Tarsi 1992)
Select a node x in P such that
  x has no outgoing edges
  Every node joined to x by an undirected edge is adjacent to all other nodes adjacent to x
Direct all edges (x―y) toward x
  x becomes a sink
Remove x from P and repeat
Works on any PDAG
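The procedure can be sketched as below. The selection condition is the Dor and Tarsi test as I understand it (every undirected neighbour of x must be adjacent to everything else adjacent to x); the representation, the function name and the `None` return for "no consistent extension" are assumptions.

```python
def pdag_to_dag(nodes, directed, undirected):
    """Return the edges of a consistent extension, or None if none exists."""
    directed = set(directed)
    undirected = {frozenset(e) for e in undirected}
    result = set(directed)
    remaining = set(nodes)

    def adjacent(a, b):
        return (frozenset((a, b)) in undirected
                or (a, b) in directed or (b, a) in directed)

    while remaining:
        for x in sorted(remaining):
            if any(a == x for a, _ in directed):
                continue                      # x has an outgoing edge
            nbrs = {v for e in undirected if x in e for v in e if v != x}
            around = nbrs | {a for a, b in directed if b == x}
            if all(adjacent(y, z) for y in nbrs for z in around if y != z):
                result |= {(y, x) for y in nbrs}   # x becomes a sink
                directed = {e for e in directed if x not in e}
                undirected = {e for e in undirected if x not in e}
                remaining.remove(x)
                break
        else:
            return None                       # no node qualifies
    return result
```

An undirected four-cycle is a useful negative case: no node qualifies, so the sketch reports that no consistent extension exists.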
Applying the Operators
Operators
The set of operators should:
  Ensure global connectivity (completeness) and good connectivity in general
  Be easy to check for applicability (validity)
  Avoid redundancy
  Allow for efficient scoring
    Local scoring: local changes in G cause “local” changes in score(G)
Score Decomposability
A scoring function S is decomposable if it is a product (or sum) of factors s, each depending only on one node and its parents
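For example:
$$\log P(D \mid \theta, G) = \sum_{i=1}^{n} \log P(x_i \mid \theta, G) = \sum_{i=1}^{n} \log P(x_i \mid x_{\Pi_i}, \theta, G)$$
where $\Pi_i$ denotes the parents of $x_i$ in G.
[Figure: G1 is the chain Z → X → Y; G2 is G1 without the edge X → Y]
$$S(G_1) = \log P(Z \mid \theta_Z) + \log P(X \mid Z, \theta_{X|Z}) + \log P(Y \mid X, \theta_{Y|X})$$
$$S(G_2) = S(G_1) - \log P(Y \mid X, \theta_{Y|X}) + \log P(Y \mid \theta_Y)$$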
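A small sketch of why decomposability makes rescoring local. It assumes a maximum-likelihood log-likelihood score over discrete data, with the data as a list of dicts and the graph as a mapping from each node to its parent set; none of these choices come from the slides.

```python
import math
from collections import Counter

def local_score(data, node, parents):
    """Maximised log-likelihood contribution of `node` given `parents`."""
    parents = sorted(parents)
    joint = Counter((tuple(r[p] for p in parents), r[node]) for r in data)
    marginal = Counter(tuple(r[p] for p in parents) for r in data)
    return sum(n * math.log(n / marginal[pa]) for (pa, _), n in joint.items())

def score(data, graph):
    """Decomposable score: one local term per node."""
    return sum(local_score(data, v, pa) for v, pa in graph.items())

data = [
    {"Z": 0, "X": 0, "Y": 0},
    {"Z": 0, "X": 1, "Y": 1},
    {"Z": 1, "X": 1, "Y": 1},
    {"Z": 1, "X": 1, "Y": 0},
]
g1 = {"Z": set(), "X": {"Z"}, "Y": {"X"}}   # Z -> X -> Y
g2 = {"Z": set(), "X": {"Z"}, "Y": set()}   # edge X -> Y removed

# Only the term for Y changes when X -> Y is removed.
delta = local_score(data, "Y", set()) - local_score(data, "Y", {"X"})
```

Here `score(data, g2)` equals `score(data, g1) + delta` up to floating-point error, so removing one edge costs two local evaluations rather than a full rescoring.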
Used Operators
Operator Scoring
Chickering 1996a
  Apply the operator and score the consistent extension (DAG)
Drawbacks:
  Need to apply PDAG-to-DAG for every operator
  Local operators may cause non-local changes when applied to a CPDAG
  Cannot benefit from local scoring
Local Operator Scoring
InsertU Operator – “Theorem 34”
Let Pc be any completed PDAG for which nodes x and y are not adjacent.
If Pc admits a consistent extension after an edge is added between x and y, then the edge x―y is reversible if and only if x and y have exactly the same parents in the original PDAG
InsertU Operator – “Theorem 6”
The insertion of the undirected edge x―y in a CPDAG Pc is valid if and only if:
  x and y have the same parents in Pc
  every undirected path between x and y contains at least one of their common neighbors
Only if (+Theorem 34):
  Take the shortest undirected path from x to y in Pc that does not include any common neighbor of x and y
  It has length at least 3 and has no chord
  After adding x―y it becomes a chordless cycle of length at least 4, contradicting Theorem 26
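The two validity conditions can be checked with one set comparison and one breadth-first search. The sketch assumes x and y are non-adjacent and that the CPDAG is stored as two dicts, `parents` (directed parents) and `nbrs` (undirected neighbours); these names are hypothetical.

```python
from collections import deque

def insert_u_valid(parents, nbrs, x, y):
    """Validity test for inserting the undirected edge x - y."""
    if parents[x] != parents[y]:
        return False
    common = nbrs[x] & nbrs[y]
    # Search along undirected edges from x, never entering a common
    # neighbour. Reaching y means some path avoids all of them.
    seen = {x}
    queue = deque([x])
    while queue:
        v = queue.popleft()
        for w in nbrs[v] - common - seen:
            if w == y:
                return False
            seen.add(w)
            queue.append(w)
    return True
```

Blocking the common neighbours and testing reachability is equivalent to asking whether every undirected path passes through one of them.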
InsertU Operator – “Lemma 32”
Let Pc be any completed PDAG, and let x and y be any pair of nodes that are not adjacent.
There exists a consistent extension of Pc in which
  all the reversible edges adjacent to x are directed away from x
  all the reversible edges between y and the common neighbors of x and y are directed toward y
  all the other reversible edges adjacent to y are directed away from y
if and only if every undirected path between x and y passes through a common neighbor of x and y
InsertU Operator – Theorem 6: “If” proof outline
Use the consistent extension from Lemma 32 as G
Add a directed edge x→y to G to get G’ (the other direction is symmetric)
Show that G’ is a consistent extension of P’ (P with the addition of the undirected edge x―y)
  G’ is acyclic
  Same skeleton
  Same v-structures
InsertU Operator – Theorem 6: G’ is a DAG
Assume for contradiction that there is a directed path from y to x in G
All the reversible edges are directed away from x, so the last edge in that path, w→x, is compelled
Then w is a parent of x in P, and it must also be a parent of y
So G contains the cycle y→…→w→y, contradicting the acyclicity of G
[Figure: nodes X, Y, W]
InsertU Operator – “Lemma 24”
Let Pc be a completed PDAG, and let P’ denote a PDAG that results from adding a single edge between x and y to Pc
Consider any consistent extension G of Pc, and G’ that results by inserting a directed edge between x and y in G
Then any v-structure in G’ but not in P’, or any v-structure in P’ but not in G’ must include the edge between x and y
InsertU Operator – Theorem 6: G’ is a consistent extension of P’
By Lemma 24, any v-structure that differs between G’ and P’ must include the edge x―y
The v-structure must be in G’, because in P’ this edge is undirected
The other edge in the v-structure cannot be reversible in G’
  x does not have reversible parents
  y’s reversible parents are adjacent to x
But any compelled parent of x or y is a parent of both
Q.E.D.
Local Operator Evaluation
Since the only difference between G and G’ is the edge x→y, we can use score decomposability to compute the score of P’ in O(1) time
  $s(P') = s(P_c) + s(y, N_{x,y} \cup \{x\} \cup \Pi_y) - s(y, N_{x,y} \cup \Pi_y)$
  where $N_{x,y}$ is the set of common neighbors of x and y, and $\Pi_y$ is the set of parents of y
In general we do not need to transform the CPDAG to compute neighbor scores:
  Calculate scores for all the neighbor states (locally!)
  Check operator validity (efficiently!) starting from the highest score
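The score change for InsertU needs only two local evaluations. In the sketch below `local_score(node, parent_set)` is a hypothetical callback standing in for whatever decomposable score is used, and `parents` and `nbrs` are assumed dicts of directed parents and undirected neighbours.

```python
def insert_u_delta(local_score, parents, nbrs, x, y):
    """Score change s(P') - s(Pc) for inserting the undirected edge x - y."""
    # In the extension from Lemma 32, y's parents are its CPDAG parents
    # plus the common neighbours of x and y.
    base = (nbrs[x] & nbrs[y]) | parents[y]
    return local_score(y, base | {x}) - local_score(y, base)

def toy_score(node, parent_set):
    """Illustrative stand-in only: penalise large parent sets."""
    return -len(parent_set) ** 2

parents = {"x": {"p"}, "y": {"p"}, "n": set(), "m": set(), "p": set()}
nbrs = {"x": {"n", "m"}, "y": {"n"}, "n": {"x", "y"}, "m": {"x"}, "p": set()}
change = insert_u_delta(toy_score, parents, nbrs, "x", "y")
```

A search loop would compute this change for every candidate pair, sort the candidates by it, and run the validity test only on the most promising ones.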