CSCI 3160 Design and Analysis of Algorithms
Tutorial 3
Chengyu Lin
Outline
• Union-find data structure
• Minimum spanning tree (MST)
• Kruskal’s algorithm
Union-find
• A data structure for disjoint sets
• n = number of members, forming disjoint groups
• Two members are in the same group if and only if they have a common leader
• Operations:
  o Union: merge two groups
  o Find: name the leader of the group
• We do not really care about who the leader is; we only want to tell one group from another
Union-find
• Idea: use a forest (= collection of trees)
  o a group → a tree
  o leader → root of the tree
• Example:
  o Group 1: Alice, Bob
  o Group 2: Carol, Dave, Eve
• Store the height of each tree at the root
[Figure: the two example groups drawn as trees, each with height 1 stored at its root]
Union-find
• Find: return the root of the group
• Union: make the leader of one group the boss of the other
  o Our heuristic: make the root of the shorter tree point to the root of the other tree
  o If both trees are of the same height h, then the resulting tree has height h+1
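The Find and Union rules above can be sketched in Python as follows. This is a minimal illustration, not the tutorial's own code: the class name `UnionFind` and the dictionaries `boss` and `height` are assumed names, and on a tie in height the root of the first argument is arbitrarily chosen as the new boss.

```python
class UnionFind:
    """Disjoint sets stored as a forest, merged by tree height."""

    def __init__(self, members):
        # Everyone starts as his or her own boss, in a tree of height 0.
        self.boss = {m: m for m in members}
        self.height = {m: 0 for m in members}

    def find(self, x):
        # Follow boss pointers up to the root, which is the leader.
        while self.boss[x] != x:
            x = self.boss[x]
        return x

    def union(self, x, y):
        # Merge the groups of x and y; return False if they were already merged.
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False
        # Make the root of the shorter tree point to the root of the other tree.
        if self.height[rx] < self.height[ry]:
            rx, ry = ry, rx
        # Two trees of equal height h give a tree of height h + 1.
        if self.height[rx] == self.height[ry]:
            self.height[rx] += 1
        self.boss[ry] = rx
        return True
```

Only the value stored at a root is meaningful as a tree height; the numbers left at non-root members are simply never read again.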
Union-find in action
• Initialize
  o Everyone is his/her own boss
Member:  A  B  C  D  E
Height:  0  0  0  0  0
Union-find in action
• Union(A, B)
  o Find(A) returns A and Find(B) returns B
  o Make Alice the boss of Bob
  o Increase the height of the tree
Member:  A  B  C  D  E
Height:  1  0  0  0  0
Union-find in action
• Union(C, D)
  o Find(C) returns C and Find(D) returns D
  o Make Carol the boss of Dave
  o Increase the height of the tree
Member:  A  B  C  D  E
Height:  1  0  1  0  0
Union-find in action
• Union(B, D)
  o Find(B) returns A and Find(D) returns C
  o Make Alice the boss of Carol
  o Increase the height of the tree
Member:  A  B  C  D  E
Height:  2  0  1  0  0
Your turn
• Find(C)?
  o Find(C) returns A
Member:  A  B  C  D  E
Height:  2  0  1  0  0
Your turn
• Union(B, E)?
  o Find(B) returns A and Find(E) returns E
  o Make Alice the boss of Eve
  o No increase in the tree height (Why?)
Member:  A  B  C  D  E
Height:  2  0  1  0  0
Final result
• Does this look more like a tree?
[Figure: the final tree. A (height 2) is the root with children B (0), C (1) and E (0); D (0) is the child of C.]
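The whole sequence of operations from the slides can be replayed in a few lines of Python. This is a sketch under assumptions: the names `boss`, `height`, `find` and `union` are illustrative, and ties in height are broken in favour of the first argument's root, which matches the choices made on the slides (Alice becomes the boss of Bob, and later of Carol).

```python
# Everyone is his or her own boss, in a tree of height 0.
boss = {m: m for m in "ABCDE"}
height = {m: 0 for m in "ABCDE"}

def find(x):
    # Walk up the boss pointers until we reach a root.
    while boss[x] != x:
        x = boss[x]
    return x

def union(x, y):
    rx, ry = find(x), find(y)
    if rx == ry:
        return  # already in the same group
    if height[rx] < height[ry]:
        rx, ry = ry, rx  # rx is now the root of the taller (or equal) tree
    if height[rx] == height[ry]:
        height[rx] += 1  # equal heights h give height h + 1
    boss[ry] = rx

union("A", "B")
union("C", "D")
union("B", "D")
leader_of_c = find("C")
union("B", "E")

print(leader_of_c)                    # A
print([height[m] for m in "ABCDE"])   # [2, 0, 1, 0, 0]
print(boss)  # {'A': 'A', 'B': 'A', 'C': 'A', 'D': 'C', 'E': 'A'}
```

The last call attaches Eve's tree of height 0 under Alice's tree of height 2, so the stored height does not change.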
Analysis
• Height of each tree = O(log n)
• Cost of Find(): O(log n)
• Cost of Union(): O(log n)
  o Dominated by the cost of Find()
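The O(log n) height bound is stated without proof on the slide. One standard way to justify it, sketched here as a supporting argument rather than as the tutorial's own derivation, is an induction on the height of a tree built by the shorter-under-taller heuristic:

```latex
\textbf{Claim.} A tree of height $h$ built by these unions has at least $2^h$ members.

\textbf{Base case.} A single member is a tree of height $0$ with $1 = 2^0$ members.

\textbf{Inductive step.} The height grows from $h$ to $h+1$ only when two trees of
height $h$ are merged. Each has at least $2^h$ members, so the result has at least
\[
  2^h + 2^h = 2^{h+1}
\]
members. Merging trees of different heights keeps the larger height and only adds members.

\textbf{Conclusion.} With $n$ members in total, $2^h \le n$, hence
$h \le \log_2 n = O(\log n)$.
```

Find() walks at most one root-to-leaf path, so its cost is bounded by the height, and Union() is two Find() calls plus constant work.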
Minimum spanning tree
• G = (V, E): undirected, connected, (non-negatively) weighted
• Problem: find a subset of edges with minimum total weight such that all vertices in V are connected using these edges
[Figure: example graph on vertices A, B, C, D, E with edge weights 1 to 7]
Total weight = 1 + 2 + 3 + 6 = 12
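For a graph this small, the problem statement can be checked directly by exhaustive search. The sketch below is not Kruskal's algorithm: it simply tries every set of |V| - 1 edges and keeps the cheapest one that connects all vertices. The edge list is the one given in the dry-run table of this tutorial; the names `connects_all`, `candidates` and `best` are illustrative.

```python
from itertools import combinations

vertices = "ABCDE"
edges = [("A", "B", 1), ("C", "D", 2), ("B", "D", 3), ("A", "C", 4),
         ("A", "D", 5), ("B", "E", 6), ("D", "E", 7)]

def connects_all(chosen):
    # Grow the set of vertices reachable from A using only the chosen edges.
    reached = {"A"}
    changed = True
    while changed:
        changed = False
        for u, v, _ in chosen:
            if (u in reached) != (v in reached):
                reached |= {u, v}
                changed = True
    return len(reached) == len(vertices)

# A spanning tree on 5 vertices has exactly 4 edges.
candidates = [c for c in combinations(edges, len(vertices) - 1)
              if connects_all(c)]
best = min(candidates, key=lambda c: sum(w for _, _, w in c))

print(sorted(w for _, _, w in best))  # [1, 2, 3, 6]
print(sum(w for _, _, w in best))     # 12
```

The number of candidate subsets grows exponentially with the graph size, so this only works as a sanity check on toy inputs.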
Kruskal’s algorithm
• In words:
  o Sort the edges in ascending order of weight
  o Initialize: set T = Ø
  o While T is not a spanning tree
    • Consider the next edge in the sorted list
    • If adding it to T does not cause a cycle, add it
• The union-find structure helps us check for cycles
  o Adding an edge corresponds to putting the two endpoints into the same group
  o Connecting two vertices in the same group causes a cycle
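The steps above can be sketched in Python with a small union-find kept inline. This is a minimal sketch under assumptions: the function name `kruskal`, the `(weight, u, v)` edge format and the helper names are illustrative, and the input graph is assumed to be connected. The usage lines run it on the edge list from this tutorial's dry run.

```python
def kruskal(vertices, edges):
    """Return (tree_edges, total_weight); edges are (weight, u, v) tuples."""
    boss = {v: v for v in vertices}
    height = {v: 0 for v in vertices}

    def find(x):
        while boss[x] != x:
            x = boss[x]
        return x

    tree, total = [], 0
    for weight, u, v in sorted(edges):  # ascending order of weight
        ru, rv = find(u), find(v)
        if ru == rv:
            continue  # same group: this edge would cause a cycle
        # Union: hang the shorter tree under the root of the other one.
        if height[ru] < height[rv]:
            ru, rv = rv, ru
        if height[ru] == height[rv]:
            height[ru] += 1
        boss[rv] = ru
        tree.append((u, v))
        total += weight
        if len(tree) == len(vertices) - 1:
            break  # T is now a spanning tree
    return tree, total


edges = [(1, "A", "B"), (2, "C", "D"), (3, "B", "D"), (4, "A", "C"),
         (5, "A", "D"), (6, "B", "E"), (7, "D", "E")]
tree, total = kruskal("ABCDE", edges)
print(tree)   # [('A', 'B'), ('C', 'D'), ('B', 'D'), ('B', 'E')]
print(total)  # 12
```

Stopping once T has |V| - 1 edges is equivalent to the "while T is not a spanning tree" condition, because every accepted edge joins two different groups.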
Dry run
• 1: Sort the edges
Edge    Weight
(A, B)  1
(C, D)  2
(B, D)  3
(A, C)  4
(A, D)  5
(B, E)  6
(D, E)  7
[Figure: the example graph on vertices A, B, C, D, E with edge weights 1 to 7]
Dry run
• 2: Set T = Ø
[Figure: T drawn on vertices A, B, C, D, E, with no edges yet]
Dry run
• 3: Consider the first edge
  o (A, B), weight 1: A and B are in different groups, so add it to T
[Figure: T = {(A, B)}]
Dry run
• 4: Consider the second edge
  o (C, D), weight 2: C and D are in different groups, so add it to T
[Figure: T = {(A, B), (C, D)}]
Dry run
• 5: Consider the third edge
  o (B, D), weight 3: B and D are in different groups, so add it to T
[Figure: T = {(A, B), (C, D), (B, D)}]
Dry run
• 6: Consider the fourth edge
  o (A, C), weight 4: A and C are already in the same group, so adding it would cause a cycle; skip it
[Figure: T = {(A, B), (C, D), (B, D)}]
Dry run
• 7: Consider the fifth edge
  o (A, D), weight 5: A and D are already in the same group, so adding it would cause a cycle; skip it
[Figure: T = {(A, B), (C, D), (B, D)}]
Dry run
• 8: Consider the sixth edge
  o (B, E), weight 6: B and E are in different groups, so add it to T
[Figure: T = {(A, B), (C, D), (B, D), (B, E)}]
Dry run
• 9: We have found an MST!
[Figure: T = {(A, B), (C, D), (B, D), (B, E)}, total weight 1 + 2 + 3 + 6 = 12]
Analysis
• Correctness
  o In each step, we keep a set of edges T that is a subset of an MST
  o Theorem: Let S be the vertex set of any tree in the forest (V, T). If we add to T the lightest edge between S and V-S, the resulting set of edges is also a subset of an MST
• Space complexity = O(|V|+|E|)
• Time complexity = O(|E|log|V|)
  o Sorting alone takes time O(|E|log|E|) = O(|E|log|V|) (note that |E| = O(|V|^2), so log|E| = O(log|V|))
  o Cycle checking takes O(log|V|) operations per edge (Find())
  o Adding an edge also takes O(log|V|) operations (Union())
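The time bound can be written out term by term. The derivation below is a sketch that assumes a simple graph (no parallel edges), so that |E| is at most |V|^2, and that each edge in the sorted list costs two Find() calls and at most one Union():

```latex
\[
  T(|V|,|E|)
  = \underbrace{O(|E|\log|E|)}_{\text{sorting}}
  + \underbrace{|E| \cdot O(\log|V|)}_{\text{Find() and Union() over all edges}}
\]
\[
  |E| \le |V|^2
  \;\Longrightarrow\; \log|E| \le 2\log|V|
  \;\Longrightarrow\; T(|V|,|E|) = O(|E|\log|V|)
\]
```

Both terms are therefore of the same order, which is why sorting alone already accounts for the stated bound.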
Dry run revisited
• 2: Set T = Ø
Member:  A  B  C  D  E
Height:  0  0  0  0  0
Dry run revisited
• 3: Consider the first edge
  o (A, B): Find(A) returns A and Find(B) returns B, so add the edge and call Union(A, B)
Member:  A  B  C  D  E
Height:  1  0  0  0  0
Dry run revisited
• 4: Consider the second edge
  o (C, D): Find(C) returns C and Find(D) returns D, so add the edge and call Union(C, D)
Member:  A  B  C  D  E
Height:  1  0  1  0  0
Dry run revisited
• 5: Consider the third edge
  o (B, D): Find(B) returns A and Find(D) returns C, so add the edge and call Union(B, D)
Member:  A  B  C  D  E
Height:  2  0  1  0  0
Dry run revisited
• 6: Consider the fourth edge
  o (A, C): Find(A) and Find(C) both return A, so the edge would cause a cycle; skip it
Member:  A  B  C  D  E
Height:  2  0  1  0  0
Dry run revisited
• 7: Consider the fifth edge
  o (A, D): Find(A) and Find(D) both return A, so the edge would cause a cycle; skip it
Member:  A  B  C  D  E
Height:  2  0  1  0  0
Dry run revisited
• 8: Consider the sixth edge
  o (B, E): Find(B) returns A and Find(E) returns E, so add the edge and call Union(B, E); the height of the tree does not increase
Member:  A  B  C  D  E
Height:  2  0  1  0  0
Dry run revisited
• 9: We have found an MST!
Member:  A  B  C  D  E
Height:  2  0  1  0  0
End
• Questions