Unit 3 — Area of Study 3 (From the Study Design)
On completion of this unit the student should be able to evaluate and document algorithms and data representations, and solve a real-world problem, the solution for which requires the integration of algorithms and data types.

Note that this section is shorter because it mostly consists of a big project that you’ll have to do.

Proof by Induction

The idea behind proof by induction is that, if the previous statement is true, it follows that the next statement in the sequence is also true. You could think of it like a set of dominoes, all falling down: if the first falls down, then it will cause the second to fall down, which will cause the third to go, and so on [Hammack2009.10].

Say we want to prove that the statements \(S_1, S_2, S_3, ...\) are all true, then the following is how we’d do it with induction. First, prove that the first statement (\(S_1\)) is true. Second, given any positive integer \(k\), prove that the statement \(S_k\) being true implies \(S_{k+1}\) (This is also written sometimes as \(S_k \implies S_{k+1}\)). If both of these are done, it follows by induction that every \(S_n\) is true.

Let’s try and prove the statement "The sum of the first \(n\)th odd numbers is equal to \(n^2\)". As a mathematical statement: if \(n \in \mathbb{N}\), then \(1 + 3 + 5 + 7 + ... + (2n-1) = n^2\).

Proof. (induction).

  1. If \(n = 1\), then \(1 = 1^2\), which is true.

  2. Now we have to prove that \(S_k \implies S_{k+1}\) for \(k \geqslant 1\). We assume that \(1 + 3 + 5 + ... + (2k-1) = k^2\) is true. From this, we will try to show that \(S_{k+1}\) is also true.

    \[\begin{align*} 1 + 3 + 5 + ... + (2(k+1) - 1) &= (k+1)^2 \\ 1 + 3 + 5 + ... + (2k -1) + (2(k+1) - 1) &= (k +1)^2 \\ k^2 + (2(k+1) - 1) &= (k +1)^2 \\ k^2 + (2k + 2 -1) &= (k+1)^2 \\ k^2 + 2k +1 &= (k+1)^2 \\ (k+1)^2 &= (k+1)^2 \\ LHS &= RHS \end{align*}\]

    From this, \(1 + 3 + 5 + ... + (2(k+1) - 1) = (k+1)^2 \) is true. This proves that \(S_k \implies S_{k+1}\). By induction is follows that \(1 + 3 + 5 + 7 + ... + (2n-1) = n^2\) for every \(n \in \mathbb{N}\). \(\blacksquare\)

Each of the steps in induction have a specific name tied to them. The first step (1) is the base case, this is usually \(S_1\) or \(S_0\), and is often quite a simple statement to show as true. The assumption that \(S_k\) is true is called the induction hypothesis. The induction step (step 2) is showing that \(S_k \implies S_{k+1}\) is true. And finally, the statement is the closing line, stating that the original proposition is true by induction.

Loop Invariants

A Loop Invariant is statement about some relationship between your variables that is true both at the start and the end of each loop iteration. However, it may or may not be true in the middle of the loop. We use these loop invariants to help understand why an algorithm is correct [CLRS2009].

When we use a loop invariant to show that an algorithm works, we need to show the following about the loop invariant:

  • Initialisation: the invariant is true before the first iteration of the loop.

  • Maintenance: If it is true before an iteration of the loop, it also needs to be true at the beginning of the next loop as well.

  • Termination: When the loop ends, the loop invariant tells us something useful that helps us show that the algorithm is correct.

Note the similarity between this and how induction works. Showing that the loop invariant holds at the start is the base case, and showing that it is true from iteration to iteration is the inductive step [CLRS2009]. The termination property of the loop invariant is a key difference between induction and this, we basically "stop" the induction step when it terminates, rather than just going on forever.

Example

As an example, let’s try to show that this algorithm that squares a given number works.

pseudocode
function square(n):
    s <- 0
    i <- 0
    while i < n
        s <- s + n
        i <- i + 1
    return s

First we have to find a loop invariant (something that’s true at the start and end of every iteration). Well, after \(k\) iterations, \(s = kn\) and \(i = k\).

Now we can use proof by induction to show that this holds.

  1. Before entering the loop, \(k = 0\) (initialisation). So \(s = 0 \times 0 = 0\) and \(i = 0\). This is true.

  2. Assuming that \(s = kn\) and \(i = k\) are true (\(S_{k+1}\)), we need to show that this implies that \(S_{k + 1}\) is also true (maintenance). Because the algorithm adds \(n\) to \(s\), we would expect that after \(k+1\) iterations, \(s\) would equal the value of \(s\) from the previous iteration plus \(n\). This would be \(s = kn + n\).

    \[\begin{align*} s = (k+1)n & \text{ and } i = k + 1 \\ s = kn + n & \text{ and } i = k + 1 \end{align*}\]

Finally, we need to show that when the algorithm terminates, the loop invariant gives us information about the algorithm. In this case, the algorithm stops after \(n\) iterations. So \(s = n \times n = n^2 \). So the algorithm works. \(\blacksquare\)

Proofs for Specific Algorithms

We’ve already seen the proof by contradiction for Prim’s algorithm in the last topic, but know that you know about proof by induction and loop invariants, we can prove the following algorithms.

Dijkstra’s Algorithm

Recall that Dijkstra’s algorithm’s pseudocode looks something like:

pseudocode
function DIJKSTRA(graph, s):
    // s is the source node

    // initialise
    for each node in graph:
        node.distance = Infinity
        node.prev = null
    s.distance = 0

    S = []
    Q = graph.nodes

    while Q is not empty:
        u = lowest cost node in Q
        S.append(u)
        for each adjacent node (v) from u:􏰃
            // relaxation
            if v.distance > u.distance + edge(u, v).width:
                v.distance = u.distance + edge(u, v).width
                v.prev = u
        remove u from Q

To prove the above algorithm, we use the following loop invariant:

At the start of each iteration of the while loop, v.distance equals the same as the distance of the shortest path between source and v for each node v in S.

Initialisation

At the start, S == [], and so the invariant is true. There are no nodes to check.

Maintenance

We will now prove, by contradiction, that the distances in S are the shortest from s to every other node. Basically, we want to show that u.distance equals the shortest distance from s to u for the node u just added to the set in this iteration.

We will choose u such that it is the first node where its distance value is not equal to the shortest distance from s to u. (This will be important in the contradiction).

Let’s look at the start of the while loop, when u is added to S. And let’s say that there is a shortest path \(p\) from s to u. Before adding u to S, the path \(p\) would be between one node in S and one node in Q. Let’s say that the node \(y \in\) Q, and \(x \in\) S, such that s \(\leadsto x \to y \leadsto\) u (where \(\leadsto\) indicates a path containing any number of edges between those two nodes, and \(\to\) indicates a direct edge between the two nodes).

proof dijkstra
Figure 1. An image representing the connections between s, \(x\), \(y\), and u.

Because \(y\) appears before u on the shortest path from s to u, the shortest path from \(y\) to u must be a lower cost. From this we are able to say that \(y\tt{.distance}\) is less than or equal to u.distance. And thus we are also able to say that \(y\tt{.distance}\) is less than or equal to the cost of the shortest path from s to u.

But because both \(y\) and u were in Q when u was chosen, we know that u.distance is less than or equal to \(y\tt{.distance}\). Combining this with the inequality from the previous paragraph, we are able to see that they are actually an equality. Meaning that \(y\tt{.distance}\) is equal to u.distance, and because \(y\tt{.distance}\) is also equal to the cost of the shortest path between s and u, therefore u.distance equals the cost of the shortest path between s and u.

But this contradicts the initial choice of the node u. By contradiction we must conclude that u.distance equals the shortest distance from s to u. And therefore out loop invariant must be maintained at the start and end of the loop.

Termination

At the end of the loop, Q == [], which, along with the statement that Q == graph.nodes - S, implies that S contains all nodes in the graph. Thus, the v.distance equals the cost of the shortest path between s and v for all nodes v in the graph. \(\blacksquare\)

Bellman-Ford Algorithm

Here is what the pseudocode for the Bellman-Ford algorithm looks like:

pseudocode
function bellman-ford(graph, s):
    // s is the source node

    // initialise
    for each node in graph:
        node.distance = Infinity
        node.prev = null
    s.distance = 0

    for i = 1 to length(graph.nodes)-1:
        for each edge (u, v) in graph.edges:
            // relaxation
            if v.distance > u.distance + edge(u, v).width:
                v.distance = u.distance + edge(u, v).width
                v.prev = u

    // Detecting for negative weight cycles
    // Returns True if the graph contains none.
    for each edge (u, v) in graph.edges:
        if v.distance > u.distance + edge(u, v).width:
            return False
    return True

We will focus on proving that the algorithm finds the shortest path to all nodes from the source node. If you want to, you can prove that it is able to detect negative weight cycles.

Assuming that graph contains no negative weight cycles that are reachable from node s, then after length(graph.nodes)-1 iterations of the second for loop, we will have u.distance equal the shortest distance between s and u for all nodes u that are reachable from s.

Proof. Consider any node \(v\) that is reachable from \(s\), and let \(p\) be a shortest path of nodes from \(s\) to \(v\). Where \(p = {\tt [}v_0, v_1, v_2, ..., v_k {\tt ]}\), and \(v_0 = s\), and \(v_k = v\). Because shortest paths can only contain one edge at most once, \(p\) must have at most length(graph.nodes)-1 edges. And therefore \(k \leqslant\) length(graph.nodes)-1.

The for loop relaxes all edges in the graph length(graph.nodes)-1 times. In the edges relaxed by the \(i\text{th}\) iteration, there will be the edge \((v_{i-1}, v_i)\). Because we relax all edges along the path \(p\), \(v_k\tt.distance\) will equal the shortest distance from \(s\) to \(v_k\). \(\blacksquare\)

But we need to prove this new fact that if \((v_0,v_1), (v_1,v_2), ..., (v_{k-1}, v_k)\) are relaxed in order, then \(v_k\tt.distance\) equals the shortest distance from \(s\) to \(v_k\). And to do that, we will use induction.

  • Base case: if \(i = 0\), because no edges in \(p\) have been relaxed, \(v_0 = s\) and \(s\tt.distance\) equals the shortest distance from it to itself (this is set in the initialisation step of the algorithm)

  • Inductive step: We assume that \(v_{i-1}\tt.distance\) equals the shortest distance from \(s\) to \(v_{i-1}\). After relaxing the edge \((v_{i-1}, v_i)\), we know that \(v_i\tt.distance\) will equal the shortest distance from \(s\) to \(v_i\). This is because if we have \(s \leadsto u \to v\), and \(u\tt.distance\) equals the shortest distance from \(s\) to \(u\) before the relax call on \((u, v)\), then \(v\tt.distance\) will equal the shortest distance from \(s\) to \(v\) after the call. \(\blacksquare\)

Floyd-Warshall Algorithm

Recall that the Floyd-Warshall algorithm solves all-pairs shortest-paths for a graph. And that the pseudocode for the Floyd-Warshall algorithm looks like:

pseudocode
for each node k in the graph:
    for each node i in the graph:
        for each node j in the graph:
            ij = distance from i to j
            ikj = distance from i to k + distance from k to j
            compare ij and ikj
            update node distance with the shorter of the two

To prove that this algorithm works, we will assume that after \(n\) iterations of the outer loop, the path from node \(i\) to \(j\) will be the shortest path that uses only the first \(n\) nodes. When \(n = \tt length(graph.nodes)\) the algorithm will be done.

  • Base case: Initially, \(n = 0\) and so all distances between \(i\) and \(j\) nodes use zero nodes as there are no distance values assigned yet.

  • Induction step: Assuming that on the \(n\)th iteration, the distance between \(i\) and \(j\) is along the smallest length path between them that uses only the first \(n\) nodes.

    On the \(n + 1\)th iteration, either the shortest distance path doesn’t go via the \(n+1\)th node, and such the path was found on the previous iteration. Or the shortest path will use the \(n+1\)th node, and the data will be updated with this information.

    In either case, the distance being used is the shortest one, resulting in the shortest path between \(i\) and \(j\) using the first \(n+1\) nodes. When \(n = \tt length(graph.nodes)\), the algorithm will have considered all nodes and all possible paths involving \(n\) nodes. As such, the algorithm is correct. \(\blacksquare\)

Challenges
  1. Prove, by induction, that the following are true.

    1. Given a fully connected tree with \(n\) nodes, it will contain \(n-1\) edges.

  2. Prove, by using the loop invariant, that the following algorithms work. ..

Answers
  1. Induction proofs (these are only my examples, you could have proved them differently)

    1. Given a fully connected tree with \(n\) nodes, it will contain \(n-1\) edges.

      • Base case: If \(n = 3\), the tree will contain two edges which is \(n-1=2\). Therefore it works for the base step.

      • Induction hypothesis: Assuming that for \(n\) nodes, the graph contains \(n-1\) edges (Let the number of edges be \(E\)).

      • Induction step: Looking at when we add a node to the tree, we see that the number of nodes increases by one, but so does the number of edges for it to be fully connected. Therefore for \(n+1\) nodes, there needs to be \(E + 1\) edges.

        \[\begin{align*} E + 1 &= (n+1) - 1 \\ (n - 1) + 1 &= (n + 1) - 1\\ n &= n\\ LHS &= RHS ~~ \end{align*}\]
      • From this we are able to say that \(S_n \implies S_{n+1}\) and so by induction there are \(n-1\) edges on a fully connected tree with \(n\) nodes. \(\blacksquare\)

References