COMP 271 Dave Mount
COMP 271: Lecture 14: Prim's and Baruvka's Algorithms for MSTs
Tuesday, Oct 16, 2001
Read: Chapt 24 in CLR. Baruvka's algorithm is not described in CLR.
Prims Algorithm: Prim's algorithm is another greedy algorithm for minimum spanning trees. It differs from Kruskal's algorithm only in how it selects the next safe edge to add at each step. Its running time is essentially the same as Kruskal's algorithm, O((V + E) log V). There are two reasons for studying Prim's algorithm. The first is to show that there is more than one way to solve a problem (an important lesson to learn in algorithm design), and the second is that Prim's algorithm looks very much like another greedy algorithm, called Dijkstra's algorithm, that we will study for a completely different problem, shortest paths. Thus, not only is Prim's a different way to solve the same MST problem, it is also the same way to solve a different problem. (Whatever that means!)
Different ways to grow a tree: Kruskal's algorithm worked by ordering the edges, and inserting them one by one into the spanning tree, taking care never to introduce a cycle. Intuitively Kruskal's works by merging or splicing two trees together, until all the vertices are in the same tree.
In contrast, Prim's algorithm builds the tree up by adding leaves one at a time to the current tree. We start with a root vertex r (it can be any vertex). At any time, the subset of edges forms a single tree (in Kruskal's it formed a forest). We look to add a single vertex as a leaf to the tree. The process is illustrated in the following figure.
Figure 1: Prim's Algorithm.
Observe that if we consider the set of vertices S currently part of the tree, and its complement V - S, we have a cut of the graph, and the current set of tree edges respects this cut. Which edge should we add next? The MST Lemma from the previous lecture tells us that it is safe to add the light edge. In the figure, this is the edge of weight 4 going to vertex u. Then u is added to the vertices of S, and the cut changes. Note that some edges that crossed the cut before are no longer crossing it, and others that were not crossing the cut are.
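To make the "light edge" notion concrete, here is a small Python sketch that finds the minimum-weight edge crossing a cut (S, V - S). The edge-list representation and the toy weights are illustrative assumptions, not taken from the figure.

```python
def light_edge(edges, S):
    """Return the minimum-weight edge crossing the cut (S, V - S).

    edges: iterable of (u, v, w) tuples for an undirected graph.
    S: set of vertices on one side of the cut.
    Returns (w, u, v), or None if no edge crosses the cut.
    """
    crossing = [(w, u, v) for (u, v, w) in edges
                if (u in S) != (v in S)]      # exactly one endpoint in S
    return min(crossing, default=None)

# A toy graph; S is the side of the cut containing the root r.
edges = [("r", "a", 5), ("r", "b", 7), ("a", "b", 6),
         ("a", "u", 4), ("b", "u", 10)]
print(light_edge(edges, {"r", "a", "b"}))  # (4, 'a', 'u')
```

Only the edges (a, u) and (b, u) cross this cut, and the lighter of the two is the safe edge to add.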
It is easy to see that the key questions in the efficient implementation of Prim's algorithm are how to update the cut efficiently, and how to determine the light edge quickly. To do this, we will make use of a priority queue data structure. Recall that this is the data structure used in HeapSort. This is a data structure that stores a set of items, where each item is associated with a key value. The priority queue supports three operations.
Lecture 14 1 Fall 2001
insert(u, key): Insert u with the key value key in Q.
u = extractMin(): Extract the item with the minimum key value in Q.
decreaseKey(u, new_key): Decrease the value of u's key to new_key.
A priority queue can be implemented using the same heap data structure used in heapsort. All of the above operations can be performed in O(log n) time, where n is the number of items in the heap.
What do we store in the priority queue? At first you might think that we should store the edges that cross the cut, since this is what we are removing with each step of the algorithm. The problem is that when a vertex is moved from one side of the cut to the other, this results in a complicated sequence of updates. (Many edges are removed and many new edges need to be added.)
There is a much more elegant solution, and this is what makes Prim's algorithm so nice. For each vertex u in V - S (not part of the current spanning tree) we associate u with a key value key[u], which is the weight of the lightest edge going from u to any vertex in S. We also store in pred[u] the end vertex of this edge in S. If there is no edge from u to a vertex in S, then we set its key value to +infinity. We will also need to know which vertices are in S and which are not. We do this by coloring the vertices in S black.
Here is Prim's algorithm. The root vertex r can be any vertex in V.
Prim's Algorithm
Prim(G, w, r) {
    for each (u in V) {                     // initialization
        key[u] = +infinity;
        color[u] = white;
    }
    key[r] = 0;                             // start at root
    pred[r] = nil;
    Q = new PriQueue(V);                    // put vertices in Q
    while (Q.nonEmpty()) {                  // until all vertices in MST
        u = Q.extractMin();                 // vertex with lightest edge
        for each (v in Adj[u]) {
            if ((color[v] == white) && (w(u,v) < key[v])) {
                key[v] = w(u,v);            // new lighter edge out of v
                Q.decreaseKey(v, key[v]);
                pred[v] = u;
            }
        }
        color[u] = black;
    }
    [The pred pointers define the MST as an inverted tree rooted at r]
}
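For concreteness, here is a short Python rendering of the pseudocode above. It is a sketch under assumptions of my own: the graph is given as an adjacency dict plus a symmetric weight map, and, since Python's heapq has no decreaseKey, the decrease is simulated by pushing a duplicate heap entry and skipping stale ones.

```python
import heapq
from math import inf

def prim(adj, w, r):
    """Prim's algorithm on a connected undirected graph.

    adj: dict vertex -> list of neighbors.
    w:   dict (u, v) -> weight, assumed symmetric.
    r:   root vertex.
    Returns pred, which defines the MST as an inverted tree rooted at r.
    """
    key = {u: inf for u in adj}
    color = {u: "white" for u in adj}
    pred = {u: None for u in adj}
    key[r] = 0
    heap = [(0, r)]
    while heap:
        k, u = heapq.heappop(heap)
        if color[u] == "black" or k > key[u]:
            continue                        # stale duplicate entry
        color[u] = "black"
        for v in adj[u]:
            if color[v] == "white" and w[(u, v)] < key[v]:
                key[v] = w[(u, v)]          # new lighter edge out of v
                pred[v] = u
                heapq.heappush(heap, (key[v], v))  # "decreaseKey"
    return pred

adj = {"r": ["a", "b"], "a": ["r", "b"], "b": ["r", "a"]}
w = {("r", "a"): 1, ("a", "r"): 1, ("r", "b"): 4,
     ("b", "r"): 4, ("a", "b"): 2, ("b", "a"): 2}
print(prim(adj, w, "r"))  # {'r': None, 'a': 'r', 'b': 'a'}
```

Note that vertex b first enters the tree's frontier via the weight-4 edge from r, but its key is later decreased to 2 when a joins the tree, exactly the key-update step in the pseudocode.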
The following figure illustrates Prim's algorithm. The arrows on edges indicate the predecessor pointers, and the numeric label in each vertex is the key value.
To analyze Prim's algorithm, we account for the time spent on each vertex as it is extracted from the priority queue. It takes O(log V) time to extract this vertex from the queue. For each incident edge, we spend potentially O(log V) time decreasing the key of the neighboring vertex. Thus the time spent on vertex u is O(log V + deg(u) log V). The other steps of the update are constant time. So the overall running time is
Figure 2: Prim's Algorithm.
    sum over u in V of (log V + deg(u) log V) = (log V) * sum over u in V of (1 + deg(u))
                                              = (log V)(V + 2E) = Theta((V + E) log V).
Since G is connected, V is asymptotically no greater than E, so this is Theta(E log V). This is exactly the same as Kruskal's algorithm.
Baruvkas Algorithm: We have seen two ways (Kruskal's and Prim's algorithms) for solving the MST problem. So, it may seem like complete overkill to consider yet another algorithm. This one is called Baruvka's algorithm. It is actually the oldest of the three algorithms (invented in 1926, well before the first computers). The reason for studying this algorithm is that of the three algorithms, it is the easiest to implement on a parallel computer. Unlike Kruskal's and Prim's algorithms, which add edges one at a time, Baruvka's algorithm adds a whole set of edges all at once to the MST.
Baruvka's algorithm is similar to Kruskal's algorithm, in the sense that it works by maintaining a collection of disconnected trees. Let us call each subtree a component. Initially, each vertex is by itself in a one-vertex component. Recall that with each stage of Kruskal's algorithm, we add the lightest-weight edge that connects two different components together. To prove Kruskal's algorithm correct, we argued (from the MST Lemma) that the lightest such edge will be safe to add to the MST.
In fact, a closer inspection of the proof reveals that the cheapest edge leaving any component is always safe. This suggests a more parallel way to grow the MST. Each component determines the lightest edge that goes from inside the component to outside the component (we don't care where). We say that such an edge leaves the component. Note that two components might select the same edge by this process. By the above observation, all of these edges are safe, so we may add them all at once to the set A of edges in the MST. As a result, many components will be
merged together into a single component. We then apply DFS to the edges of A, to identify the new components. This process is repeated until only one component remains. A fairly high-level description of Baruvka's algorithm is given below.
Baruvka's Algorithm
Baruvka(G=(V,E), w) {
    initialize each vertex to be its own component;
    A = {};                                 // A holds edges of the MST
    do {
        for (each component C) {
            find the lightest edge (u,v) with u in C and v not in C;
            add {u,v} to A (unless it is already there);
        }
        apply DFS to the graph H=(V,A), to compute the new components;
    } while (there are 2 or more components);
    return A;                               // return final MST edges
}
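The high-level pseudocode above might be rendered in Python as follows. This is a sequential sketch, not the parallel implementation the notes allude to; it assumes G is connected with distinct edge weights, uses an edge list of (u, v, w) tuples, and recomputes components each round by an iterative DFS over the MST edges collected so far.

```python
from collections import defaultdict

def baruvka(vertices, edges):
    """Baruvka's algorithm; returns the set A of MST edges.

    Assumes the graph is connected and edge weights are distinct
    (so the MST is unique and no cycles can be created).
    """
    A = set()
    while True:
        # Label components via DFS on H = (V, A).
        adj = defaultdict(list)
        for (u, v, w) in A:
            adj[u].append(v)
            adj[v].append(u)
        comp = {}
        for s in vertices:
            if s not in comp:
                comp[s] = s                 # label = DFS start vertex
                stack = [s]
                while stack:
                    x = stack.pop()
                    for y in adj[x]:
                        if y not in comp:
                            comp[y] = s
                            stack.append(y)
        if len(set(comp.values())) <= 1:
            return A
        # Lightest edge leaving each component; endpoints with
        # different labels identify the edges between components.
        lightest = {}
        for (u, v, w) in edges:
            if comp[u] != comp[v]:
                for c in (comp[u], comp[v]):
                    if c not in lightest or w < lightest[c][2]:
                        lightest[c] = (u, v, w)
        A |= set(lightest.values())         # safe edges, added all at once

V = ["a", "b", "c", "d"]
E = [("a", "b", 1), ("b", "c", 3), ("c", "d", 2), ("a", "d", 4)]
print(sorted(baruvka(V, E)))  # [('a', 'b', 1), ('b', 'c', 3), ('c', 'd', 2)]
```

Storing A as a set handles the "unless it is already there" clause: when two components select the same edge, the duplicate insertion is a no-op.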
There are a number of unspecified details in Baruvka's algorithm, which we will not spell out in detail, except to note that they can be solved in O(V + E) time through DFS. First, we may apply DFS, but only traversing the edges of A, to compute the components. Each DFS tree will correspond to a separate component. We label each vertex with its component number as part of this process. With these labels it is easy to determine which edges go between components (since their endpoints have different labels). Then we can traverse each component again to determine the lightest edge that leaves the component. (In fact, with a little more cleverness, we can do all this without having to perform two separate DFSs.) The algorithm is illustrated in the figure below.
Figure 3: Baruvka's Algorithm.
Analysis: How long does Baruvka's algorithm take? Observe that because each iteration involves doing a DFS, each iteration (of the outer do-while loop) can be performed in O(V + E) time. The question is how many iterations are required in general. We claim that there are never more than O(log V) iterations needed. To see why, let m denote the number of components at some stage.
Each of the m components will merge with at least one other component. Afterwards the number of remaining components could be as low as 1 (if they all merge together), but never higher than m/2 (if they merge in pairs). Thus, the number of components decreases by at least half with each iteration. Since we start with V components, this can happen at most log V times, until only one component remains. Thus, the total running time is O((V + E) log V) time. Again, since G is connected, V is asymptotically no larger than E, so we can write this more succinctly as O(E log V). Thus all three algorithms have the same asymptotic running time.