Minimum Spanning Tree

Overview#

Spanning tree of a graph is a connected subgraph with no cycles (acyclic) that includes all the vertices.

A Minimum Spanning Tree (MST) of an edge-weighted graph is a spanning tree whose weight (the sum of the weights of the edges) is no larger than the weight of any other spanning tree.

Hence, a MST is a spanning tree that is:

• Connected
• Acyclic
• Includes all the vertices

Greedy Algorithm#

Cut Property

A cut in a graph is a partition of its vertices into two (non-empty) sets.

A crossing edge connects a vertex in one set with a vertex in the other.

Steps

• Find cut with no black crossing edges, color its min-weight edge black
• Repeat until V-1 edges are colored black

Kruskal's Algorithm#

Steps

• Consider edges in ascending order of weight (priority queue - min heap of edges by weight)

• Add next edge to tree T unless doing so creates a cycle

• Check with union find
• If we are connecting two vertices in the same set (component), we will obtain a cycle
• Stop when `V-1` edges are attained

Pseudocode#

from collections import dequeue
import heapq
# Pseudoclass - some methods have not been implemented
class KruskalMST:
# g is an edge-weighted graph
def __init__(self, g):
self.mst = deque()
pq = []
# heapq by default creates a min-heap
for edge in g.get_edges():
heapq.heappush(pq, edge)
uf = UnionFind(g.num_of_vertices())
while(len(pq) > 0 && len(self.mst) < g.num_of_vertices() - 1):
edge = heapq.heappop(pq)
v = edge.either()
w = edge.other()
if(!uf.connected(v, w)):
uf.union(v, w)
mst.append(edge)
def get_edges(self):
return self.mst

Time Complexity#

The overall time complexity for Kruskal's MST algorithm is `Elog(E)` in worst case. As shown in the table below, the dominating factor is `E` and `log(E)` which gives rise to the complexity.

The time complexity of `Elog(E)` can be represented as `Elog(V)` since number of edges is at most `V^2` and hence, `log(E) = log(V^2) = 2log(V)`.

Why is `E` at most `V^2`?

• Imagine a adjacency matrix. At most, every cell is filled, which is `V^2`.
OperationFrequencyTime per Operation
Build PQ1`E` (bottom-up construction )
Delete MinE`log(E)`
UnionV`log*(V)`
ConnectedE`log*(V)`

Note: `log**(V)` arises due to amortised bound using weight quick union with path compression

As per the table, we can see that the time complexity is dominated by `E*log(E)` times of `delete_min()` operation.