Graph: representation and traversal CISC4080, Computer Algorithms
CIS, Fordham Univ.
Instructor: X. Zhang
Outline
• Breadth first search/traversal
• review
• Depth first search/traversal
• …
BFS(V, E, s)
for each u in V - {s}
    color[u] ← WHITE
    d[u] ← ∞
    pred[u] ← NIL
color[s] ← GRAY
d[s] ← 0
pred[s] ← NIL
Q ← empty
ENQUEUE(Q, s)
[Figure: example graph with nodes r, s, t, u (top row) and v, w, x, y (bottom row). After initialization d[s] = 0, every other node has d = ∞, and Q: s]
BFS(V, E, s) (cont’d)
while Q not empty
    u ← DEQUEUE(Q)
    for each v in Adj[u]
        if color[v] = WHITE
            color[v] ← GRAY
            d[v] ← d[u] + 1
            pred[v] ← u
            ENQUEUE(Q, v)
    color[u] ← BLACK
[Figure: BFS trace on the graph with nodes r, s, t, u (top row) and v, w, x, y (bottom row). Initially Q: s. After s is dequeued, w is discovered (d[w] = 1, Q: w), then r (d[r] = 1, Q: w, r)]
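The BFS pseudocode can be sketched in Python as follows. This is a minimal sketch, not the course's reference implementation: the dictionary-of-lists graph format and the edge set of the r..y example graph are assumptions, reconstructed so that s discovers w first and then r, as in the trace.

```python
from collections import deque
import math

WHITE, GRAY, BLACK = "white", "gray", "black"

def bfs(adj, s):
    """Breadth first search from s; returns (d, pred) as in the pseudocode."""
    color = {u: WHITE for u in adj}
    d = {u: math.inf for u in adj}
    pred = {u: None for u in adj}
    color[s] = GRAY
    d[s] = 0
    q = deque([s])                      # FIFO queue of gray nodes
    while q:
        u = q.popleft()                 # DEQUEUE
        for v in adj[u]:
            if color[v] == WHITE:       # v is discovered via u
                color[v] = GRAY
                d[v] = d[u] + 1
                pred[v] = u
                q.append(v)             # ENQUEUE
        color[u] = BLACK                # done exploring u
    return d, pred

# Assumed undirected example graph (each edge listed in both directions).
adj = {
    "r": ["s", "v"],
    "s": ["w", "r"],
    "t": ["w", "x", "u"],
    "u": ["t", "x", "y"],
    "v": ["r"],
    "w": ["s", "t", "x"],
    "x": ["w", "t", "u", "y"],
    "y": ["x", "u"],
}
d, pred = bfs(adj, "s")
print(d)
```

Under these assumptions d[w] = d[r] = 1, matching the first steps of the trace.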
Ideas
• Breadth first traversal:
• Use a FIFO queue to store all gray nodes
• Explore nodes in order of their discovery time: First-In (Discovered), First-Out (Explored)
• Go as wide as possible: discover all nodes one hop, two hops, …, k hops away
• Depth first traversal:
• Explore the most recently discovered node first, going as deep as possible
• Guess what data structure is used?
Ideas & Application
• Breadth first traversal:
• Go as wide as possible: discover all nodes one hop, two hops, …, k hops away
• Finds a shortest (hop count) path from s to every reachable node
• Depth first traversal:
• Explore the most recently discovered node first, go as deep as possible, and backtrack when stuck
• Used to detect cycles and for topological sorting. Similar to walking through a maze puzzle.
• Both color nodes and set pred[u] to the predecessor node
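To illustrate the shortest hop-count application, here is a small sketch that runs BFS and then follows pred pointers back from a target node. The function names `bfs_pred` and `hop_path` and the tiny graph are made up for this example.

```python
from collections import deque

def bfs_pred(adj, s):
    """BFS from s; returns pred for every node reachable from s."""
    pred = {s: None}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in pred:           # v is still "white"
                pred[v] = u
                q.append(v)
    return pred

def hop_path(pred, t):
    """Follow pred pointers from t back to the source; None if t is unreachable."""
    if t not in pred:
        return None
    path = []
    while t is not None:
        path.append(t)
        t = pred[t]
    return path[::-1]                   # reverse: source first, t last

# Hypothetical directed graph; f is not reachable from a.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "e": [], "f": []}
pred = bfs_pred(graph, "a")
print(hop_path(pred, "e"))              # a path with 3 hops
print(hop_path(pred, "f"))              # None
```

Because BFS discovers nodes in order of hop count, the path recovered this way uses the fewest edges possible.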
Depth-First Traversal
• Input: G = (V, E)
• Idea:
1. Start exploring from an arbitrarily chosen node
2. Explore a node by following one of its edges (for a directed edge, in the direction of the edge) to discover a neighboring node
3. Then explore the most recently discovered node
Depth-First Traversal
• Search “deeper” in the graph whenever possible
• Explore an edge of the most recently discovered node v to find a new node
1. Say we start from u
2. Explore edges of u to discover v (now v is the most recently discovered node)
3. Explore edges of v to discover y
4. Explore edges of y to discover x
[Figure: example directed graph with nodes u, v, w (top row) and x, y, z (bottom row)]
Depth-First Traversal: Backtrack
• After all neighbors of v have been explored, “backtrack” to the parent (predecessor) of v
5. Explore edges of x: both v and u are already discovered
6. x has no other (outgoing) edge, so backtrack to y (we discovered x via y, so y is x’s parent)
7. Back at y, explore edges of y: no “white” neighbors
8. Backtrack to v: no “white” neighbors
9. Backtrack to u: u has another edge, leading to x, which is already discovered
10. u has no parent (nowhere to backtrack). All nodes reachable from u have been explored
[Figure: example directed graph with nodes u, v, w (top row) and x, y, z (bottom row)]
Depth-First Traversal: visit all!
• Continue until all nodes reachable from the original source have been discovered
• If undiscovered nodes remain, choose one of them as a new source and repeat the search from that node
10. u has no parent (nowhere to backtrack)
11. Choose w (or z) to explore next // any white node
12. Follow edge (w, z) to discover z
13. z has no new neighbor, so backtrack to w
14. w has no other edge (it turns black) and no parent
15. All nodes have been discovered, done!
[Figure: example directed graph with nodes u, v, w (top row) and x, y, z (bottom row)]
DFS(G): summary
• Start exploration from any src node (arbitrarily selected)
• Search “deeper” in the graph whenever possible: keep following an edge of the most recently discovered node v to discover a new neighbor
• After all neighbors of v have been explored, “backtrack” to the parent/predecessor of v
• Continue until all nodes reachable from the original src have been discovered
• If undiscovered nodes remain, choose one of them as a new source and repeat the search from that node
DFS: data structure
• Use color to denote the state of each node
• white: not discovered, not explored
• gray: discovered, in the process of being explored
• black: discovered, and done exploring
• pred[u]: predecessor/parent node of node u
• the previous node on the path to u
• i.e., we discover u via pred[u]
DFS Data Structures
• d[u]: discovery time (when u turns gray)
• f[u]: finish time (when u turns black)
• 1 ≤ d[u] < f[u]
• During (d[u], f[u]), node u is gray
• Instead of a wall clock, we maintain a virtual clock: an integer initialized to 0 and incremented whenever something of interest happens, i.e., a node is discovered or finished
[Figure: time axis starting at 0 with d[u], f[u], d[v], f[v] marked; u is GRAY between d[u] and f[u]]
DFS(V, E)
for each u ∈ V
    color[u] ← WHITE
    pred[u] ← NIL
time ← 0
for each u ∈ V
    if color[u] = WHITE
        DFS-VISIT(u)   // DFS traversal from node u to discover all nodes reachable from u
[Figure: example directed graph with nodes u, v, w (top row) and x, y, z (bottom row)]
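DFS_VISIT(s): initialization
DFS_VISIT(s)
{
    // initialization
    color[s] = GRAY
    time++, d[s] = time   // s is discovered now
    pred[s] = NIL
    S = empty   // empty stack
    S.push(s)
[Figure: example graph with nodes r, s, t, u (top row) and v, w, x, y (bottom row)]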
DFS_VISIT(s): cont’d
// discover/explore all white nodes that are reachable from s, in depth first manner
    while S not empty
        u ← S.top()
        if there is a white node v in Adj[u]
            time++, d[v] = time
            color[v] = GRAY
            pred[v] = u   // discover v via u
            S.push(v)
        else   // done with u
            time++, f[u] = time
            color[u] = BLACK
            S.pop()   // pop u from stack; the new top element is the parent of u
}   // end of DFS_VISIT
Use stack S to backtrack: LIFO lets us go back to the parent/predecessor, then the parent’s parent, … (i.e., backtrack)
[Figure: example graph with nodes r, s, t, u (top row), v, w, x, y (bottom row), and a, b]
DFS(): tracing
1. All nodes are colored white
2. Pick a white node arbitrarily, say t, as src
3. DFS_VISIT(src = t)
4. Go back to step 2 until there are no more white nodes
* Label each node with (d[u], f[u], pred[u])
* Shade each node to show its color
[Figure: example graph with nodes r, s, t, u (top row), v, w, x, y (bottom row), and a, b]
Time: 0
Current node u:
New neighbor v:
Stack S:
DFS_VISIT(s)
{
    // initialization
    color[s] = GRAY
    time++; d[s] = time
    S = empty   // empty stack
    S.push(s)
    while S not empty
        u ← S.top()
        if there is a white node v in Adj[u]
            time++, d[v] = time
            color[v] = GRAY
            pred[v] = u   // discover v via u
            S.push(v)
        else   // done with u
            time++, f[u] = time
            color[u] = BLACK
            S.pop()   // pop u from stack
}   // end of DFS_VISIT
• Push a node u onto the stack: start to explore u’s neighbors, then the neighbors’ neighbors
• Pop a node: done with it, go back to its parent
• Like recursive calls! Calling a function => push onto the call stack; returning from a function call => pop the call stack
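The stack-based DFS_VISIT, together with the outer DFS loop, might look like this in Python. It is a sketch under assumptions: the edge set of the u..z graph is reconstructed from the traversal narrative (the figure is not available), and each stack entry keeps an iterator over the node's adjacency list so that the search for a white neighbor resumes where it left off instead of rescanning the list.

```python
WHITE, GRAY, BLACK = "white", "gray", "black"

def dfs_iterative(adj):
    """Stack-based DFS over all nodes; returns (d, f, pred)."""
    color = {u: WHITE for u in adj}
    pred = {u: None for u in adj}
    d, f = {}, {}
    time = 0
    for src in adj:                         # outer DFS loop: pick any white node
        if color[src] != WHITE:
            continue
        color[src] = GRAY
        time += 1
        d[src] = time
        stack = [(src, iter(adj[src]))]     # (node, remaining neighbors)
        while stack:
            u, neighbors = stack[-1]        # S.top()
            # Advance u's iterator to its next white neighbor, if any.
            v = next((w for w in neighbors if color[w] == WHITE), None)
            if v is not None:               # discover v via u
                time += 1
                d[v] = time
                color[v] = GRAY
                pred[v] = u
                stack.append((v, iter(adj[v])))
            else:                           # done with u: backtrack
                time += 1
                f[u] = time
                color[u] = BLACK
                stack.pop()
    return d, f, pred

# Assumed edges: u->v, u->x, v->y, y->x, x->v, w->y, w->z, z->z.
graph = {"u": ["v", "x"], "v": ["y"], "w": ["y", "z"],
         "x": ["v"], "y": ["x"], "z": ["z"]}
d, f, pred = dfs_iterative(graph)
print({n: (d[n], f[n]) for n in graph})
```

With this edge set and node order, the timestamps agree with the labels used in the later edge classification example (u 1/8, v 2/7, y 3/6, x 4/5, w 9/12, z 10/11).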
Recursive DFS-VISIT(u)
DFS-VISIT(u)
{
    color[u] ← GRAY
    time ← time + 1
    d[u] ← time
    for each v ∈ Adj[u]
        if color[v] = WHITE
            pred[v] ← u
            DFS-VISIT(v)
    // end of for loop
    color[u] ← BLACK   // done with u
    time ← time + 1
    f[u] ← time
}   // return means backtrack to caller

DFS(G = (V, E))
{
    time ← 0
    for each u ∈ V
        color[u] ← WHITE
        pred[u] ← NIL
    for each u ∈ V
        if color[u] = WHITE
            DFS-VISIT(u)
}
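A direct Python transcription of the recursive version is sketched below. The nested `visit` function and the example edge set (reconstructed from the traversal narrative) are assumptions for illustration. Python's default recursion limit makes this form unsuitable for very deep graphs.

```python
def dfs(adj):
    """Recursive DFS over all nodes; returns (d, f, pred)."""
    color = {u: "white" for u in adj}
    pred = {u: None for u in adj}
    d, f = {}, {}
    time = 0

    def visit(u):
        nonlocal time
        color[u] = "gray"
        time += 1
        d[u] = time                     # discovery time
        for v in adj[u]:
            if color[v] == "white":
                pred[v] = u
                visit(v)                # go deeper
        color[u] = "black"              # done with u
        time += 1
        f[u] = time                     # finish time; returning = backtracking

    for u in adj:
        if color[u] == "white":
            visit(u)
    return d, f, pred

# Assumed edges: u->v, u->x, v->y, y->x, x->v, w->y, w->z, z->z.
graph = {"u": ["v", "x"], "v": ["y"], "w": ["y", "z"],
         "x": ["v"], "y": ["x"], "z": ["z"]}
d, f, pred = dfs(graph)
print(d, f, pred)
```

The call stack plays the role of the explicit stack S: a call to `visit` is a push, and a return is a pop.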
Recursive DFS tracing
[Figure: example graph with nodes u, v, w (top row), x, y, z (bottom row), and a, b]
Time: 0
Trace DFS(G = (V, E)).
Assume a is picked first
Recursive DFS_VISIT(u = a)
Exercise
• Perform DFS
Analysis of DFS(V, E)
for each u ∈ V                 // Θ(|V|)
    color[u] ← WHITE
    pred[u] ← NIL
time ← 0
for each u ∈ V                 // Θ(|V|), without counting the time for DFS-VISIT
    if color[u] = WHITE
        DFS-VISIT(u)
Analysis of DFS-VISIT(u)
DFS-VISIT(u)
{
    color[u] ← GRAY
    time ← time + 1
    d[u] ← time
    for each v ∈ Adj[u]        // iterates |Adj[u]| times
        if color[v] = WHITE
            pred[v] ← u
            DFS-VISIT(v)
    color[u] ← BLACK
    time ← time + 1
    f[u] ← time
}
• DFS-VISIT is called exactly once for each vertex
• Total: Σ_{u ∈ V} |Adj[u]| + Θ(|V|) = Θ(|E|) + Θ(|V|) = Θ(|V| + |E|)
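The two counts in this analysis can be checked empirically. The sketch below instruments a recursive DFS with two counters; the function name `dfs_counts` and the example edge set are assumptions for illustration. For a directed graph stored as adjacency lists, the loop body runs once per edge.

```python
def dfs_counts(adj):
    """Run DFS and count DFS-VISIT calls and adjacency-list entries scanned."""
    visited = set()
    calls = 0
    scans = 0

    def visit(u):
        nonlocal calls, scans
        calls += 1                      # one call per vertex
        visited.add(u)
        for v in adj[u]:
            scans += 1                  # one scan per entry of Adj[u]
            if v not in visited:
                visit(v)

    for u in adj:
        if u not in visited:
            visit(u)
    return calls, scans

# Assumed example graph with |V| = 6 and |E| = 8.
graph = {"u": ["v", "x"], "v": ["y"], "w": ["y", "z"],
         "x": ["v"], "y": ["x"], "z": ["z"]}
calls, scans = dfs_counts(graph)
num_edges = sum(len(nbrs) for nbrs in graph.values())
print(calls, scans, num_edges)
```

Here `calls` equals |V| and `scans` equals |E|, which is where the Θ(|V| + |E|) total comes from.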
Next
• DFS Forest
• Different edges: tree edge, back edge, …
• Application of DFS
• cycle detection
• topological sorting
DFS Tree Edge and DFS Forest
DFS-VISIT(u)
    color[u] ← GRAY
    time ← time + 1
    d[u] ← time
    for each v ∈ Adj[u]
        if color[v] = WHITE
            pred[v] ← u        // (u, v) is a tree edge
            DFS-VISIT(v)
    color[u] ← BLACK           // done with u
    time ← time + 1
    f[u] ← time
• When we follow an edge of u and find a white neighbor v, (u, v) is a tree edge
• The DFS forest G’ = (V’, E’) is a subgraph of G = (V, E)
• V’ = V // all nodes are included
• E’ = {all tree edges in DFS}
• Roots are the nodes from which DFS calls DFS_VISIT
[Figure: example graph with discovery/finish times u 1/8, v 2/7, w 9/12, x 4/5, y 3/6, z 10/11; non-tree edges are labeled B (back), F (forward), C (cross)]
Edge Classification
• In DFS, when we follow an edge of u to find its neighbor v:
• If v is WHITE:
– (u, v) is a tree edge
– v was first discovered by exploring edge (u, v)
• If v is GRAY:
– (u, v) is a back edge, connecting a vertex u to an ancestor v in a depth first tree
– Self loops (in directed graphs) are also back edges
[Figure: the example graph partway through DFS, with discovery times u 1/, v 2/, y 3/, x 4/ and a back edge labeled B]
Edge Classification (cont’d)
• If v is BLACK and d[u] < d[v]:
– (u, v) is a forward edge: a non-tree edge that connects a vertex u to a descendant v in a depth first tree
• If v is BLACK and d[u] > d[v]:
– (u, v) is a cross edge
– It can go between vertices in the same depth-first tree (as long as there is no ancestor/descendant relation) or between different depth-first trees
– e.g., (w, y) in the example
[Figure: the example graph at two later stages of DFS (u 1/8, v 2/7, y 3/6, x 4/5, then w 9/), with edges labeled B, F, and C]
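The four classification rules can be applied mechanically while DFS runs. The following is a sketch, assuming the example graph has the edge set reconstructed from the traversal narrative; the function name `classify_edges` is made up for this example.

```python
WHITE, GRAY, BLACK = "white", "gray", "black"

def classify_edges(adj):
    """Label every edge (u, v) as tree, back, forward, or cross."""
    color = {u: WHITE for u in adj}
    d = {}
    kind = {}
    time = 0

    def visit(u):
        nonlocal time
        color[u] = GRAY
        time += 1
        d[u] = time
        for v in adj[u]:
            if color[v] == WHITE:
                kind[(u, v)] = "tree"       # v first discovered via (u, v)
                visit(v)
            elif color[v] == GRAY:
                kind[(u, v)] = "back"       # v is an ancestor of u (or u itself)
            elif d[u] < d[v]:
                kind[(u, v)] = "forward"    # v is a finished descendant of u
            else:
                kind[(u, v)] = "cross"      # v finished before u was discovered
        color[u] = BLACK

    for u in adj:
        if color[u] == WHITE:
            visit(u)
    return kind

# Assumed edges: u->v, u->x, v->y, y->x, x->v, w->y, w->z, z->z.
graph = {"u": ["v", "x"], "v": ["y"], "w": ["y", "z"],
         "x": ["v"], "y": ["x"], "z": ["z"]}
for edge, label in classify_edges(graph).items():
    print(edge, label)
```

Under these assumptions (w, y) comes out as a cross edge, as stated above, and the self loop (z, z) is a back edge.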
Example (cont.)
[Figure: step-by-step DFS on the example graph, ending with u 1/8, v 2/7, w 9/12, x 4/5, y 3/6, z 10/11 and edges labeled B, F, and C]
The results of DFS may depend on:
• The order in which nodes are explored in procedure DFS
• The order in which the neighbors of a vertex are visited in DFS-VISIT
Predecessor and Descendant
• u = pred[v] ⟺ DFS-VISIT(v) was called during a search of u’s adjacency list
• u is the “direct” predecessor of v
• v is a “direct” descendant of u
• Vertex v is a descendant of vertex u in the depth first forest ⟺ v is discovered while u is gray
• If we follow predecessor pointers (back pointers to predecessor nodes) from v, we will reach u
[Figure: the example graph partway through DFS, with discovery times 1/, 2/, 3/ assigned so far]
Other Properties of DFS
Corollary: Vertex v is a descendant of u ⟺ d[u] < d[v] < f[v] < f[u]
i.e., v is discovered after u is discovered, and v is finished before u is finished
Verify this using the example
[Figure: example graph with times u 1/8, v 2/7, w 9/12, x 4/5, y 3/6, z 10/11 and edges labeled B, F, C]
Parenthesis Theorem*
In any DFS of a graph G, for all u, v, exactly one of the following holds:
1. [d[u], f[u]] and [d[v], f[v]] are disjoint, and neither of u and v is a descendant of the other
2. [d[v], f[v]] is entirely within [d[u], f[u]], and v is a descendant of u
3. [d[u], f[u]] is entirely within [d[v], f[v]], and u is a descendant of v
[Figure: DFS of a graph with times s 1/10, z 2/9, y 3/6, x 4/5, w 7/8, t 11/16, v 12/13, u 14/15, and the intervals [d, f] of each node drawn on a time axis from 1 to 16]
(s (z (y (x x) y) (w w) z) s) (t (v v) (u u) t)
Well-formed expression: parentheses are properly nested
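The nesting can be reproduced from the timestamps alone. The sketch below writes "(u" at time d[u] and "u)" at time f[u], and also tests the descendant condition d[u] < d[v] < f[v] < f[u]. The assignment of timestamps to node names is an assumption, read off the figure residue (s 1/10, z 2/9, y 3/6, x 4/5, w 7/8, t 11/16, v 12/13, u 14/15).

```python
# Discovery and finish times, assumed from the figure.
d = {"s": 1, "z": 2, "y": 3, "x": 4, "w": 7, "t": 11, "v": 12, "u": 14}
f = {"s": 10, "z": 9, "y": 6, "x": 5, "w": 8, "t": 16, "v": 13, "u": 15}

def parenthesization(d, f):
    """Emit '(u' at time d[u] and 'u)' at time f[u], in time order."""
    events = {}
    for node in d:
        events[d[node]] = "(" + node
        events[f[node]] = node + ")"
    return " ".join(events[t] for t in sorted(events))

def is_descendant(u, v):
    """True if v is a proper descendant of u: v's interval nests inside u's."""
    return d[u] < d[v] < f[v] < f[u]

print(parenthesization(d, f))
# prints: (s (z (y (x x) y) (w w) z) s) (t (v v) (u u) t)
print(is_descendant("z", "x"))   # True: [4, 5] is inside [2, 9]
print(is_descendant("s", "t"))   # False: [1, 10] and [11, 16] are disjoint
```

Every pair of intervals is either nested or disjoint, which is exactly the three cases of the theorem.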
Directed Acyclic Graph
• DAG: a directed graph that has no cycle
• Often used to represent precedence among events or processes that have a partial order
• For some pairs there is a precedence relation, e.g., PutOnSocks and PutOnShoes
• For other pairs of events there is no precedence relation between them, e.g., PutOnSocks and PutOnWatch
• How to decide whether a directed graph G = (V, E) has a cycle or not?
[Figure: getting-dressed DAG with nodes undershorts, pants, belt, socks, shoes, watch, shirt, tie, jacket]
cycle and back edge
• A directed graph is acyclic ⟺ a DFS on G yields no back edges (i.e., when exploring the adjacent nodes of node u, we never see a gray node).
• Proof of “acyclic ⇒ no back edge”, by contraposition: assume there is a back edge and show that there is a cycle
• If there is a back edge (u, v) (v is gray when exploring u) ⇒ v is an ancestor of u, i.e., v = pred[pred[…pred[u]…]]
• ⇒ there is a path from v to u in G: v, …, u
• ⇒ v, …, u, v is a path (because of the back edge (u, v)), which is a cycle
[Figure: a path from v down to u, closed by the back edge (u, v)]
cycle and back edge (cont’d)
• A directed graph is acyclic ⟺ a DFS on G yields no back edges (i.e., when exploring the adjacent nodes of node u, we never see a gray node).
• Proof of “no back edge ⇒ acyclic”, by contraposition: show cyclic ⇒ back edge
• Consider a shortest cycle
• Suppose that among the nodes in the cycle, v is discovered first, and let (u, v) be the edge of the cycle that enters v
• DFS discovers all nodes that are reachable from v, including u, before v is finished
• When exploring u, we reach v via a back edge (i.e., v is still GRAY, not yet finished)
[Figure: a path from v down to u, closed by the back edge (u, v)]
Using DFS to detect cycle
• A directed graph is acyclic ⟺ a DFS on G yields no back edges (i.e., when exploring the adjacent nodes of the current node u, we never see a gray node).
• Is there a cycle?
[Figure: example directed graph with nodes u, v, w (top row) and x, y, z (bottom row)]
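The back-edge test translates into a short cycle detector: report a cycle as soon as DFS sees a gray neighbor. This is a sketch; the function name `has_cycle` is made up, and the edge set of the u..z graph is an assumption reconstructed from the traversal narrative.

```python
WHITE, GRAY, BLACK = "white", "gray", "black"

def has_cycle(adj):
    """Return True if the directed graph has a cycle (DFS finds a back edge)."""
    color = {u: WHITE for u in adj}

    def visit(u):
        color[u] = GRAY
        for v in adj[u]:
            if color[v] == GRAY:            # back edge (u, v): cycle found
                return True
            if color[v] == WHITE and visit(v):
                return True
        color[u] = BLACK
        return False

    for u in adj:
        if color[u] == WHITE and visit(u):
            return True
    return False

# Assumed example graph: contains the cycle v -> y -> x -> v.
graph = {"u": ["v", "x"], "v": ["y"], "w": ["y", "z"],
         "x": ["v"], "y": ["x"], "z": ["z"]}
# A small hypothetical DAG for comparison.
chain = {"a": ["b", "c"], "b": ["c"], "c": []}

print(has_cycle(graph))   # True
print(has_cycle(chain))   # False
```

Note that a BLACK neighbor (forward or cross edge) does not indicate a cycle, which is why a plain visited set is not enough for directed graphs.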
Topological Sort: intro.
[Figure: getting-dressed DAG with nodes undershorts, pants, belt, socks, shoes, watch, shirt, tie, jacket]
A DAG:
• Nodes represent the various steps in getting dressed
• Edge (a, b) means a needs to be done before b
• e.g., we need to put on undershorts before putting on pants
• How to get dressed, i.e., what to do first, second, third, … and last?
• Is there only one way?
[Figure: the nine nodes arranged in a single line]
Topological Sort
A topological sort of a DAG G = (V, E) is a linear ordering of the nodes such that if there exists an edge (u, v), then u appears before v.
If we rearrange all nodes in one line, the arrows on all edges point to the right:
[Figure: the getting-dressed DAG, and its nodes in one line from left to right: socks, undershorts, pants, shoes, watch, shirt, belt, tie, jacket]
Topological Sort Algorithm
• Given a DAG G = (V, E)
• Output: a topological ordering of the nodes such that if there exists an edge (u, v), then u appears before v
• Brute force way?
[Figure: getting-dressed DAG with nodes undershorts, pants, belt, socks, shoes, watch, shirt, tie, jacket]
Topological Sort Algorithm
• Given a DAG G = (V, E)
• Output: a topological ordering of the nodes such that for any u, v, if there exists an edge (u, v), then u appears before v
• Fact: If there is an edge from u to v in a DAG, then during any DFS, u finishes at a later time than v (i.e., f[u] > f[v]) (to be proved)
• Algorithm: Run DFS, then sort the nodes in descending order of finish time to get a topological order
• The node that finishes last is put first, …, the node that finishes first is put last
• Correctness:
• If there is an edge from u to v, then f[u] > f[v]
• So the algorithm above puts u before v in the topological ordering: … u … v …
Topological Sort
TOPOLOGICAL-SORT(V, E)
1. Call DFS(V, E); as each node is finished, insert it at the front of a linked list
2. Return the linked list of nodes as the topological order
Running time: Θ(|V| + |E|)
[Figure: the getting-dressed DAG labeled with discovery/finish times (1 through 18), and its nodes laid out in one line in decreasing order of finish time]
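TOPOLOGICAL-SORT can be sketched in Python by pushing each node onto the front of a deque when it finishes. The edge set of the getting-dressed DAG is an assumption (the figure is not available; the edges below follow the usual textbook version of this example), and the sketch assumes the input really is a DAG, since it does not check for cycles.

```python
from collections import deque

def topological_sort(adj):
    """DFS-based topological sort of a DAG given as adjacency lists."""
    visited = set()
    order = deque()

    def visit(u):
        visited.add(u)
        for v in adj[u]:
            if v not in visited:
                visit(v)
        order.appendleft(u)     # u is finished: insert at the front of the list

    for u in adj:
        if u not in visited:
            visit(u)
    return list(order)

# Assumed edges: (a, b) means a must be put on before b.
dressing = {
    "shirt": ["tie", "belt"],
    "tie": ["jacket"],
    "jacket": [],
    "belt": ["jacket"],
    "watch": [],
    "undershorts": ["pants", "shoes"],
    "pants": ["shoes", "belt"],
    "shoes": [],
    "socks": ["shoes"],
}
order = topological_sort(dressing)
print(order)

# Every edge points "to the right" in the resulting order.
pos = {u: i for i, u in enumerate(order)}
assert all(pos[u] < pos[v] for u in dressing for v in dressing[u])
```

A different choice of starting nodes or neighbor order can produce a different, equally valid ordering, so there is generally more than one way to get dressed.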
Topological Sort Algorithm
• Fact: If there is an edge from u to v in a DAG, then during a DFS (regardless of the choice of starting nodes), u always finishes at a later time than v (i.e., f[u] > f[v])
• (Prove by considering two possible cases)
• Case 1: d[u] < d[v] (i.e., u is discovered before v is discovered)
• Consider the recursive version: DFS_VISIT(u) calls DFS_VISIT(v), directly or through other recursive calls, before it returns (similarly for the non-recursive implementation)
• So f[u] > f[v] (u finishes at a later time than v)
• Case 2: d[v] < d[u] (i.e., v is discovered before u is discovered)
• There is no cycle => there is no path from v to u (otherwise a cycle v, …, u, v exists)
• => u is not discovered during DFS_VISIT(v)
• => v finishes before u is discovered, so f[v] < d[u] < f[u]
Summary
• Graphs are everywhere: they represent binary relations
• Graph Representation
• Adjacency lists, adjacency matrix
• Path, Cycle, Tree, Connectivity
• Graph Traversal Algorithm: a systematic way to explore a graph (its nodes)
• BFS yields a fat and short tree
• App: find a shortest hop path from a node to other nodes
• DFS yields a forest made up of lean and tall trees
• App: detect cycles and topological sorting (for DAGs)