The Minimum Spanning Tree of an Undirected Graph |
|
A graph often contains redundancy in that there can be multiple paths between two vertices. This redundancy may be desirable, for example to offer alternative routes in the case of breakdown or overloading of an edge (road, connection, phone line) in a network. However, we often require the cheapest sub-network that connects the vertices of a given graph. This must in fact be an unrooted tree, because there is only one path between any two vertices in a tree; if there is a cycle then at least one edge can be removed. The total cost or weight of a tree is the sum of the weights of the edges in the tree. We assume that the weight of every edge is greater than zero. Given a connected, undirected graph G=<V,E>, the minimum spanning tree problem is to find a tree T=<V,E'> such that E' subset_of E and the cost of T is minimal. Note that a minimum spanning tree is not necessarily unique. Recall that a tree over |V| vertices contains |V|-1 edges. A tree can be represented by an array of this many edges. Note on TreesAn unrooted tree is a connected, acyclic,
undirected graph.
Each leaf vertex has degree one,
Prim's AlgorithmPrim's algorithm for the minimum spanning tree problem follows the strategy of beginning with a small tree, i.e. <{v1},{ }>, and growing it until it includes all vertices in the given graph. Initially the tree contains just an arbitrary starting node v1. At each stage a vertex not yet in the tree but closest to (some vertex that is in) the tree is found. This closest vertex is added to the tree. Finding the vertex allows us to improve our knowledge of the distances of the remaining vertices to the tree. A set `done' contains the vertices in the tree. --Graph G = <V, E> done := {v1} --initial Tree is <{v1},{}> for vertex i in V-{v1} T[i] := E[1,i] --direct edges (possibly "missing") end for loop |V|-1 times --Inv: {T[v]|v in done} represents a min' spanning Tree -- of the nodes in done and -- {T[u]|u not in done} contains the shortest known -- distances from the (sub-)spanning Tree to -- such vertices u. find closest vertex to (sub-)spanning Tree in V - done done +:= {closest} add closest & edge T[closest] to (sub-)spanning Tree for vertex j in V - done T[j] := min(T[j], --update knowledge on paths, E[closest,j]) --perhaps better? end for end loop --- Prim's Minimum Spanning Tree Algorithm --- ComplexityPrim's algorithm is very similar to Dijkstra's single-source shortest path algorithm and indeed the given version of the former was created by simple edits to the latter. The principal difference is that the criteria for choosing a new vertex is proximity to the tree rather than to a source. Prim's algorithm also clearly takes O(|V|2) time. CorrectnessPrim's algorithm is correct because at each stage
it has built a minimum spanning tree over those vertices
in the set `done' which eventually contains all the vertices:
Kruskal's AlgorithmKruskal's algorithm for the minimum spanning tree problem
begins with many disjoint spanning trees,
a spanning forest.
It repeatedly joins two trees together until
a spanning tree of the entire given graph remains.
CorrectnessKruskal's algorithm certainly leads to a spanning tree T but it is necessary to prove that T is minimal. The invariant shows that this is so: The invariant is certainly true on the initial iteration. In the body of the loop, two partial minimum spanning trees T1 and T2 are joined by an edge `e'. The vertices in T1 and T2 must be connected somehow in a final minimum spanning tree T'. Suppose they could be connected more cheaply in T' than in T. Add e to T'. This would create a cycle but it could be broken by removing an edge from T to T'-T. This would leave the vertices in T1 and T2 connected but at a lower cost than in T' because e is chosen as the cheapest satisfactory edge. Therefore T1, T2 and e form a minimum spanning subtree which must be part of a minimum spanning tree of G. The invariant really is invariant. The algorithm continues to join subtrees until a single minimum cost tree remains, spanning all vertices. ComplexityThe time complexity of Kruskal's algorithm hinges on finding the shortest edge to join two subtrees and on the joining itself. A priority queue can be used to organise the edges. A heap is a suitable implementation; see heap-sort but remember that we require fast access to the smallest edge. The priority queue takes O(|E|log|E|) time to create and O(log|E|) time to find a shortest edge while maintaining the priority queue. The later is done |V|-1 times. The joining of two subtrees can be done in O(|V|log|V|) total time over all the iterations as seen below. Thus Kruskal's algorithm takes O(|E|log|E|) time overall. This is better than Prim's algorithm for sparse graphs for which |E|<<|V|2.
The forest is represented by a partition of the vertices of the graph. Each partial tree spans a subset of the vertices. An array gives the size and first member of each subset. A second array gives the subset of each member and links the members of a subset together.
DemonstrationThe HTML FORM below allows a random
edge-weighted, undirected graph
with |V| vertices to be created, manipulated and analysed.
The value `pr' is the probability of there being an edge <i,j>;
it controls the sparseness of the graph
and on average there will be pr*|V|*(|V|-1) edges:
Note that the edges, (vi,vj), in the tree produced by Prim's algorithm are sorted on vj, because the algorithm works by considering how best to incorporate each vj into the tree, gradually refining this knowledge (the tree starts as <{v1},{ }>). On the other hand, the edges in the tree produced by Kruskal's algorithm are sorted by their weight, because it considers short edges first. Notes
© L_A, Dept. Comp. Sci., U.W.A. 1984-1986, Department of Computer Science, Monash University 1986, and (HTML) School of Computer Science and Software Engineering 1999. |
|
|