Unit 4 — Area of Study 1 (From the Study Design)
On completion of this unit the student should be able to establish the efficiency of simple algorithms and explain soft limits of computability.

RAM Model of Computation

Machine-independent algorithm design relies on a hypothetical computer called a Random Access Machine (RAM) [Skiena2020]. In this model of computation, we have a computer where:

  • Every simple operation (\(+\), \(-\), \(*\), \(=\), if, call) takes exactly one time step.

  • Loops (for, while) are not considered simple operations. Instead, they are made up of many single-step (simple) operations.

  • Each memory access takes exactly one step, and the RAM has no limit on the amount of memory available.

In the RAM model, we measure the run time of an algorithm by counting the number of steps it takes on a given instance.
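
To make this concrete, here is a minimal step-counting sketch in Python (my own illustration, not from the study design; the charging scheme of one step per assignment, addition and memory access is a simplification):

python
def sum_array_with_steps(A):
    # Sums the elements of A while counting RAM-model steps.
    # We charge one step per assignment, addition and memory access.
    steps = 0
    total = 0
    steps += 1            # the assignment total <- 0
    for x in A:
        steps += 1        # one memory access to read the next element
        total = total + x
        steps += 2        # one addition and one assignment
    return total, steps

# An instance of size 5 takes 1 + 5 * 3 = 16 steps under this scheme.
print(sum_array_with_steps([2, 4, 6, 8, 10]))  # (30, 16)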

A Note on Accuracy

The RAM is a simple model of how computers work, and it is not necessarily the most accurate one. After all, multiplying two numbers tends to take longer than adding them on most computers, which violates the first assumption. Modern multi-threaded execution of loops will probably break the second assumption. And the time to access memory depends on where that memory is stored (cache, main memory or disk), so there goes the third assumption as well.

But even with all of these issues, the RAM model gives us a good understanding of how an algorithm will behave on a real computer, which is why we use it.

The RAM model allows us to study and understand algorithms in a machine and language-independent way.

Best-, Worst-, and Average-Case Complexity

By using the RAM model, we can count how many steps our algorithm will take on any given input. But this is not enough: to know how good or bad our algorithm is in general, we need to know how it behaves across all possible instances.

To do this, we can plot the problem size (\(n\)) that we give to our algorithm against the number of steps it takes to complete, with a bunch of different cases for each size.

[Figure: best-, worst- and average-case complexity]

From this plot, we can determine three different functions:

Worst-Case Complexity

The function defined by the maximum number of steps taken in any instance of size \(n\). This is the curve passing through the highest point in each column.

Average-Case Complexity

The function defined by the average number of steps over all instances of size \(n\). This can also be called the expected time of the algorithm.

Best-Case Complexity

The function defined by the minimum number of steps taken in any instance of size \(n\). This is the curve passing through the lowest point in each column.

In algorithm analysis, we tend to focus on the worst-case complexity. This is because it lets us avoid working out what an "average" input even looks like, and focus purely on the absolute maximum amount of time our algorithm will take to run. (We rarely consider the best-case complexity because the input required to achieve it is unlikely to arise.)
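
As an illustration (my own sketch, not from the referenced texts), we can measure all three cases empirically for a linear search by counting comparisons for every possible position of the target in an array of size \(n\):

python
def linear_search_comparisons(A, target):
    # Counts the comparisons linear search makes before finding target
    # (or scanning the whole array without success).
    comparisons = 0
    for item in A:
        comparisons += 1
        if item == target:
            break
    return comparisons

def best_worst_average(n):
    # One instance for every position the target could sit in.
    counts = [linear_search_comparisons(list(range(n)), k) for k in range(n)]
    return min(counts), max(counts), sum(counts) / n

# For n = 10: best case 1 comparison, worst case 10, average 5.5.
print(best_worst_average(10))  # (1, 10, 5.5)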

Each of these complexities can be described by a mathematical function, like \(y=2x^2 + 4x - 2\). But because these functions can get quite complicated, we simplify them by using Big Oh notation.

Big Oh Notation

As mentioned before, these complexities can be represented as mathematical functions with respect to the problem size \(n\). But it is difficult to work precisely with these functions because they:

Have too many bumps

Because some algorithms work better with inputs of particular sizes, their worst-case complexity may fluctuate between values of \(n\). For example, binary search usually works faster on arrays of size \(n=2^k - 1\) (where \(k\) is an integer), because the array divisions work out nicely.

Require too much detail to specify precisely

If you wanted to calculate the exact number of steps that the algorithm would run in the worst case, you would need to specify the algorithm to the detail of a complete computer program. Also, the precise function would probably rely on small programming details (see switch cases vs if statements).

For example, take the worst-case complexity function:

\[T(n) = 19324n^2 + 1230n - 110\]

This may look quite complicated, but it really tells us little more than that "the time to run grows quadratically with \(n\)."

It turns out that it’s much easier to work with upper and lower bounds on these complexity functions. We start with upper bounds, using Big Oh notation, which simplifies our analysis by removing levels of detail that are not required.

Formal Definition [Levitin2006]
\(f(n) \in \O(g(n))\) means that \(c \cdot g(n)\) is an upper bound on \(f(n)\): there is a constant \(c\) such that \(f(n) \leqslant c \cdot g(n)\) for every large value of \(n\) (that is, for all \(n \geqslant n_0\)).

In Big Oh notation, the functions \(f(n)= 2n + 3\) and \(h(n) = n\) are treated as the same. This is because Big Oh notation ignores multiplicative and additive constants; that is, it doesn’t care about constants that multiply the function or are added to it.

The following functions all have the same Big Oh:

\[n \text{, } 100n+2 \text{, } 0.5(10n + 2) ~~ \in \O(n)\]

The formal definition of \(O\) assumes that there is a constant \(n_0\) beyond which \(c \cdot g(n)\) is the upper bound. This is because we do not care about very small values of \(n\) (values to the left of \(n_0\)). This makes sense because we don’t really care about which algorithm sorts four items the fastest, but we do need to know which algorithm is faster when sorting 10,000 items.

[Figure: Big Oh]

As an example, let’s determine the Big Oh class of \(f(n) = 3n^2 - 100n + 6\). If we take \(c = 3\) and \(n_0 = 1\), then \(3n^2 \geqslant f(n)\) for all \(n \geqslant 1\), and so \(f(n) \in \O(n^2)\).
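
We can sanity-check those constants numerically; this is just a quick sketch of my own using the \(c = 3\) and \(n_0 = 1\) chosen above:

python
# Check that f(n) = 3n^2 - 100n + 6 <= c * g(n), with c = 3 and g(n) = n^2,
# for every n from n0 = 1 up to a reasonably large value.
f = lambda n: 3 * n**2 - 100 * n + 6
g = lambda n: n**2
c, n0 = 3, 1

assert all(f(n) <= c * g(n) for n in range(n0, 100_000))
print("f(n) <= 3 * n^2 holds for every tested n >= 1")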

I should note that all functions that are \(\O(n)\) are also technically \(\O(n^2)\). This can get a little confusing, but we usually describe a function using the smallest class that fits.

Generalised Classes of Efficiencies

Below is a list of the general forms of different efficiency functions (\(g(n)\)), each with its informal name, in order from best to worst efficiency. We tend to use the informal name when describing algorithms in general, as we mightn’t particularly care about exact values.

\(1\) (Constant)

The best possible efficiency; however, there aren’t many actual algorithms that are this good. Things that happen in constant time tend to be single calculations (\(a + b\)).

\(\log{n}\) (Logarithmic)

Usually the result of cutting the problem’s size by a constant factor on each iteration. For an algorithm to be logarithmic it can’t consider all of the input (\(n\)).

\(n\) (Linear)

Algorithms that make a single pass over a set of size \(n\).

\(n \log{n}\) (Linearithmic)

Most divide-and-conquer algorithms fit into this class (in fact, their complexity can be found with the master theorem, which we will discuss later).

\(n^2\) (Quadratic)

Usually algorithms with a loop nested inside another loop.

\(n^3\) (Cubic)

The same as quadratic, but with three nested loops rather than two.

\(n^k\) (Polynomial)

Basically any algorithm with \(k\) nested loops.

\(2^n\) (Exponential)

Usually algorithms that generate all subsets of an \(n\)-element set.

\(n!\) (Factorial)

Usually algorithms that generate all permutations of an \(n\)-element set (brute force tends to be an example of factorial).

Source: [Levitin2006].
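
As a rough guide (illustrative snippets of my own, not definitive implementations), typical members of a few of these classes look like this in code:

python
def constant(A):     # O(1): a single calculation, independent of len(A)
    return A[0] + A[-1]

def logarithmic(n):  # O(log n): the problem size is halved on each iteration
    count = 0
    while n > 1:
        n //= 2
        count += 1
    return count

def linear(A):       # O(n): one pass over the whole input
    return sum(A)

def quadratic(A):    # O(n^2): a loop nested inside another loop
    return [(x, y) for x in A for y in A]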

We will look at figuring out what an algorithm’s complexity is a bit later.

Big Omega Notation

We’ve discussed notation for the upper bounds of the worst-case performance of an algorithm. But now we can discuss Big Omega notation, which is for the lower bound of an algorithm’s worst-case complexity.

[Figure: Big Omega]

A function \(f(n)\) is said to be in \(\Omega(g(n))\) if \(f(n)\) is bounded below by a constant multiple of \(g(n)\) for all large values of \(n\). That is, there exist a positive constant \(c\) and a non-negative integer \(n_0\) such that \(f(n) \geqslant c \cdot g(n)\) for all \(n \geqslant n_0\). [Levitin2006]

For example, a function with complexity \(n^3\) is in \(\Omega(n^2)\), because \(n^3 \geqslant n^2\) for all \(n \geqslant 0\) (taking \(c = 1\) and \(n_0 = 0\)); that is, \(n^2\) is a lower bound on \(n^3\).
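
As a quick numeric check (again, a sketch of my own), that bound does hold for every small value we test:

python
# Check the Omega bound n^3 >= c * n^2 with c = 1 and n0 = 0.
assert all(n**3 >= 1 * n**2 for n in range(0, 10_000))
print("n^3 >= n^2 for every tested n >= 0, so n^3 is in Omega(n^2)")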

Big Theta Notation

If Big Oh is for the upper bound of worst-case performance, and Big Omega is for its lower bound, then Big Theta must be for the middle ground: Big-\(\Theta\) bounds the function both above and below. Because of this, the definition is a little more complicated, but not by much.

[Figure: Big Theta]

A function \(f(n)\) is in \(\Theta(g(n))\) if \(f(n)\) is bounded both above and below by constant multiples of \(g(n)\) for all large values of \(n\). That is, there exist positive constants \(c_1\) and \(c_2\) and a non-negative integer \(n_0\) such that \(c_2 \cdot g(n) \leqslant f(n) \leqslant c_1 \cdot g(n)\) for all values of \(n\) that are greater than or equal to \(n_0\).
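
For example (a quick check of my own), \(f(n) = 2n + 3\) is in \(\Theta(n)\): the constants \(c_2 = 2\), \(c_1 = 5\) and \(n_0 = 1\) satisfy the definition.

python
f = lambda n: 2 * n + 3
g = lambda n: n
c1, c2, n0 = 5, 2, 1   # hypothesised constants for the Theta definition

# Lower bound: 2n <= 2n + 3.  Upper bound: 2n + 3 <= 5n once n >= 1.
assert all(c2 * g(n) <= f(n) <= c1 * g(n) for n in range(n0, 100_000))
print("2n <= 2n + 3 <= 5n for every tested n >= 1, so f(n) is in Theta(n)")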

Calculating Big-O

Non-Recursive Algorithms

To figure out the complexity of a non-recursive algorithm, follow a pretty simple series of steps:

  1. Determine a parameter to indicate the input’s size.

  2. Figure out what the algorithm’s basic operation is (check the innermost loop).

  3. Check if the number of times the basic operation is run is only dependent on the input size. If it isn’t, you might have to look at best-, worst- and average-case complexity individually.

  4. Create a sum that expresses the number of times the algorithm’s basic operation is executed.

  5. Use sum manipulation to find a closed-form formula for the count, or, at the least, establish the order of growth.

Let’s try this out on an algorithm to determine the largest value in an array.

pseudocode
function MaxElem(A[0 ... n-1]):
    // input: an array A of n integers
    // output: the value of the largest element in A
    max_val <- A[0]
    for i <- 0 to n-1 do:
        if A[i] > max_val:
            max_val <- A[i]
    return max_val

The first thing we do is look at the input’s size; in this case it is \(n\), the length of the array. Then we look at the basic operation, which is usually in the innermost loop; here it is the if-statement, so we are counting comparisons. Because the number of comparisons depends only on the value of \(n\), we can go straight to making the sum. In this case, there is one comparison per item in the array, so we can write our function to count comparisons like this:

\[C(n) = \underbrace{1 + 1 + \cdots + 1}_{n \text{ times}}\]

Where \(C(n)\) represents the number of comparisons for an input of size \(n\).

But this isn’t really in a nice form, so we use summation notation, which you might recognise from Mathematical Methods or Specialist Mathematics:

\[C(n) = \sum_{i=1}^{n} 1\]

which, by using our summation rules, simplifies to \(C(n) = n\). Therefore our MaxElem function is \(\O(n)\).
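
To check that count, here is a Python translation of MaxElem (my own sketch) with an explicit comparison counter added purely for illustration:

python
def max_elem(A):
    # Returns the largest element of A and the number of comparisons made.
    comparisons = 0
    max_val = A[0]
    for i in range(len(A)):
        comparisons += 1          # the basic operation: A[i] > max_val
        if A[i] > max_val:
            max_val = A[i]
    return max_val, comparisons

# For an array of size n = 8 the counter reports exactly 8 comparisons.
print(max_elem([3, 1, 4, 1, 5, 9, 2, 6]))  # (9, 8)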

Recursive Algorithms

The process of finding the complexity of a recursive algorithm is similar to that of a non-recursive algorithm. But instead of creating a sum, we create a recurrence relation.

  1. Determine a parameter to indicate the input’s size.

  2. Figure out what the algorithm’s basic operation is.

  3. Check if the number of times the basic operation is run is only dependent on the input size. If it isn’t, you might have to look at best-, worst- and average-case complexity individually.

  4. Create a recurrence relation (with an appropriate initial condition) for the number of times the basic operation is executed.

  5. Solve the recurrence relation.

pseudocode
function Fact(n):
    // finds n!
    // input: a non-negative integer n
    // output: n!
    if n is 0 then return 1
    else return Fact(n-1) * n

Let’s pick the multiplication on the final line to be our basic operation. And now we can create a recurrence relation, which consists of a recursive mathematical function and its base case.

\[\begin{align*} M(0) &= 0 \\ M(n) &= M(n-1) + 1 \\ \end{align*}\]

From here we need to "solve" the relation. That is, we need to find a non-recursive way of writing \(M(n)\). The easiest way I’ve found to do this is to write out the first few values of \(n\) and try to spot the pattern. So let’s try that:

\[\begin{align*} M(1) &= M(0) + 1 = 1 \\ M(2) &= M(1) + 1 = 2 \\ M(3) &= M(2) + 1 = 3 \\ \end{align*}\]

From that information it looks to be pretty simple: \(M(n) = n\). Now, whilst this is fine, it’s always a good idea to check your working. To do this, we check our recurrence relation by subbing our proposed equation (\(M(n) = n\)) into the right-hand side.

\[ \begin{align*} M(n) &= M(n-1) + 1 \\ M(n) &= (n-1) + 1 \\ M(n) &= n - 1 + 1 \\ M(n) &= n \\ \end{align*}\]

After subbing in our new equation, we can see that it is indeed the same as what we came up with when looking at the different values of \(n\). Therefore \(\text{Fact} (n)\in \O(n)\).
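
We can also confirm the closed form empirically with a Python translation of Fact (my own sketch; the multiplication counter is added only for illustration):

python
def fact(n):
    # Returns (n!, number of multiplications performed).
    if n == 0:
        return 1, 0
    sub_result, sub_mults = fact(n - 1)
    return sub_result * n, sub_mults + 1

# M(n) = n: computing 5! = 120 takes exactly 5 multiplications.
print(fact(5))  # (120, 5)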

Because recursive algorithms can be a little bit confusing, let’s try another one.

pseudocode
function S(n):
    // input: a positive integer n
    // output: the sum of the first n cubes
    if n == 1 return 1
    else return S(n-1) + n * n * n

And in this case we’re going to count the number of multiplications again. So let’s set up our recurrence relation:

\[\begin{align*} M(1) &= 0 \\ M(n) &= M(n-1) + 2 \\ \end{align*}\]

Let’s look at the value of \(M(n)\) for \(n \in \{2,3,4\}\):

\[\begin{align*} M(2) &= M(1) + 2 = 2 \\ M(3) &= M(2) + 2 = 4 \\ M(4) &= M(3) + 2 = 6 \\ \end{align*}\]

From this we can see that our equation is going to be \(M(n) = 2(n-1)\). But we do need to check to make sure:

\[\begin{align*} M(n) &= M(n-1) + 2 \\ &= 2((n-1)-1) + 2\\ &= 2(n-2) + 2 \\ &= 2n-4 +2 \\ &= 2n-2 \\ &= 2(n-1) \\ \end{align*}\]

Therefore \(S(n) \in \O(n)\); remember that Big Oh doesn’t care about the multiplicative and additive constants in its approximation.
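
Again, a quick sketch of my own confirms the closed form \(M(n) = 2(n-1)\):

python
def sum_cubes(n):
    # Returns (sum of the first n cubes, number of multiplications performed).
    if n == 1:
        return 1, 0
    sub_sum, sub_mults = sum_cubes(n - 1)
    return sub_sum + n * n * n, sub_mults + 2

# For n = 4: 1 + 8 + 27 + 64 = 100, using 2(4 - 1) = 6 multiplications.
print(sum_cubes(4))  # (100, 6)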

The Master Theorem

The master theorem is a way to quickly find the Big Oh class of a recursive divide-and-conquer algorithm.

It applies when you have a recurrence relation of the form:

\[T(n) = \begin{cases} aT(\frac{n}{b}) + kn^c \quad &\text{if } n > 1 \\ d \quad &\text{if } n=1 \\ \end{cases}\]

as long as \(a\), \(b\), \(c\), \(d\) and \(k\) are all constants (with \(a \gt 0\), \(b \gt 1\), \(c \geqslant 0\), \(d \geqslant 0\) and \(k \gt 0\)), then it solves to:

\[T(n) \in \begin{cases} \O(n^c) & \text{if } \log_b{a} < c\\ \O(n^c \log{n}) &\text{if } \log_b{a}=c\\ \O(n^{\log_b{a}}) &\text{if } \log_b{a}> c \\ \end{cases}\]
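
Here is a small helper of my own (not a standard library function) that applies the three cases; \(k\) and \(d\) don’t affect the order of growth, so they aren’t parameters:

python
import math

def master_theorem(a, b, c):
    # Classifies T(n) = a*T(n/b) + k*n^c using the three cases above.
    log_b_a = math.log(a, b)
    if math.isclose(log_b_a, c):
        return f"O(n^{c} log n)"
    elif log_b_a < c:
        return f"O(n^{c})"
    else:
        return f"O(n^{log_b_a:.2f})"

# Merge sort: T(n) = 2T(n/2) + kn, so a = 2, b = 2, c = 1.
print(master_theorem(2, 2, 1))  # O(n^1 log n), i.e. O(n log n)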

P and NP Class Problems

In algorithm analysis we deal with a special subset of problems called P and NP problems. P class problems are problems that can be solved in polynomial time (hence P); that is, they are \(\O(n^k)\) where \(k\) is a positive integer. NP class problems are ones where we can verify a solution in polynomial time, but might not be able to solve the problem in polynomial time. [Cormen2013]

A problem in NP is solvable in polynomial time by a "lucky" algorithm: one that always happens to pick a correct option on the first try, so it never needs to try all the options. This is basically the same as saying that we can verify a solution in polynomial time, but mightn’t be able to find one.
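
To make the "verify versus solve" distinction concrete, here is a sketch using the subset-sum problem (my own choice of example, not from the referenced texts): checking a proposed subset takes a single pass, but the obvious solver tries every one of the \(2^n\) subsets.

python
from itertools import combinations

def verify(numbers, subset, target):
    # Verification: one membership check per element plus one sum - polynomial time.
    return all(x in numbers for x in subset) and sum(subset) == target

def solve(numbers, target):
    # Brute-force solving: try all 2^n subsets - exponential time.
    for r in range(len(numbers) + 1):
        for subset in combinations(numbers, r):
            if sum(subset) == target:
                return list(subset)
    return None

nums = [3, 7, 12, 5, 8]
print(solve(nums, 20))            # [12, 8]
print(verify(nums, [12, 8], 20))  # True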

NP-Complete Problems

NP-Complete problems are the "hardest" problems in NP. Informally, an NP-complete problem must satisfy two conditions:

  1. the problem must be in NP, and

  2. if there is a polynomial time algorithm for the problem, then every problem in NP can be converted into this problem in such a way that they can all be solved in polynomial time.

NP-Hard Problems

For a problem to be NP-hard it must satisfy the second condition of an NP-complete problem (if there is a polynomial time algorithm for the problem, then every problem in NP can be converted into this problem in such a way that they can all be solved in polynomial time), but it may or may not be in NP. An NP-hard problem is at least as hard as any other problem in NP.

An example of an NP-hard problem is the travelling salesman problem (find the shortest tour): no polynomial-time algorithm is known to solve it, and there is no known way to verify a proposed solution (to check that a tour really is the shortest) in polynomial time either. This makes it NP-hard: it is at least as hard as the hardest problems in NP.

Summary

  • P: Problems solvable in polynomial time.

  • NP: Problems whose solutions are verifiable in polynomial time, but which we may not be able to solve in polynomial time.

  • NP-Hard: A problem that may or may not be in NP, but is at least as hard as the hardest problems in NP; if a polynomial time algorithm were found to solve it, then every problem in NP could be converted into this problem and solved in polynomial time.

  • NP-Complete: A problem that is both NP-hard and in NP. \(\text{NP-Complete} = \text{NP-Hard} \cap \text{NP}\) [Demaine2014]

Figure 1. A line graph of computational difficulty comparing P, NP, and NP-hard

Decision Problems

When working with P and NP, we tend to consider a subset of all problems called "decision problems". Decision problems are problems where the answer is a simple yes or no, where you can decide if the statement is true or not. Most problems can be converted into decision problems.

An example of a decision problem might be: "given a list of coins and a value \(k\), can the coins be used to sum to \(k\)?". Note that the problem isn’t asking for the coins that add up to \(k\), just whether or not they can add up to \(k\).

Decision problems can be used to solve optimisation problems as well, that is, problems like "what is the length of the shortest path in this graph" or "what is the minimum number of coins I can use to give change for \(k\) dollars". The way this works is by framing the optimisation problem as a series of decision problems, each with an extra bound on the answer, and then adjusting that bound question by question.

So, if we were to take the question "what is the minimum number of coins I can use to give change for \(k\) dollars", we could convert it to a decision problem like so: "can I give change for \(k\) dollars using only \(n\) coin(s)?". Notice how we’ve added the variable \(n\), which keeps track of the number of coins we are allowed to use. Solving the original problem is then as easy as solving the decision problem with increasing values of \(n\) until we find a solution.
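
Here is a sketch of that idea (my own illustration; it treats the list of coins as denominations that can be reused, and uses brute force only to keep the example short):

python
from itertools import combinations_with_replacement

def can_make_change(coins, k, n):
    # Decision problem: can we give change for k using exactly n coins?
    return any(sum(combo) == k
               for combo in combinations_with_replacement(coins, n))

def min_coins_for_change(coins, k):
    # Optimisation via repeated decisions: ask with n = 1, 2, 3, ... coins
    # until the answer is "yes".  Assumes k can actually be made.
    n = 1
    while not can_make_change(coins, k, n):
        n += 1
    return n

# With 10c, 20c and 50c coins, 90c needs a minimum of 3 coins (50 + 20 + 20).
print(min_coins_for_change([10, 20, 50], 90))  # 3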

Challenges
  1. What is meant by the term "asymptotic time complexity"?

  2. How does asymptotic time complexity differ from asymptotic space complexity?

  3. Use the most appropriate notation (\(\O, \Theta, \Omega\)) to find the following classes of time complexity for the following algorithm.

    python
    def alg(A):
        # where A is an array of at least six elements
        max = 0
        if A[5] == A[0]:
            return A[5]
        for i in A:
            if i > max:
                max = i
        return max
    1. Worst-case complexity

    2. Best-case complexity

    3. Average-case complexity

  4. Using the algorithm from the previous question, what input would result in the best-case performance?

  5. For each of the following, state an example of an algorithm that would result in the time complexity presented.

    1. \(\O(n)\)

    2. \(\O(\log{n})\)

    3. \(\O(n^2)\)

    4. \(\O(n \log{n})\)

    5. \(\O(n!)\)

  6. In your own words, what are P class problems? And how are they different to NP problems?

  7. Given the recurrence relation \(T(n) = 3T(n/2) + n^2\), determine its big oh value by using the master theorem.

  8. Label this diagram with the terms P, NP, NP-complete and NP-hard, and state the relationship between P and NP for each section.

    [Diagram: p eq np]
  9. Robert has created a video game that they think is NP-hard. But Ben says that he’s found a way to win the game in polynomial time. If the game is indeed NP-hard — and Ben’s algorithm does run in polynomial time — how would that affect the relationships between P, NP, NP-hard and NP-complete problems?

  10. Darren and Will are discussing finding the largest number in a random set of numbers that has been given to them.

    Will thinks that he has made the best algorithm to find the largest number given an array:

    python
    def will_alg(A):
        # input: A, an array of numbers
        for i in range(len(A)):
            for j in range(len(A) - i - 1):
                if A[j] > A[j + 1]:
                    A[j], A[j + 1] = A[j + 1], A[j]
        return A[-1]

    But Darren thinks that Will’s algorithm is garbage, and so he suggests a better one:

    python
    def darren_alg(A):
        # input: A, an array of numbers
        max_elem = 0
        for i in A:
            if i > max_elem:
                max_elem = i
        return max_elem

    Out of these two algorithms, which one is better? Justify your answer, making reference to each algorithm’s time complexity.

  11. Marcus has developed the following algorithm to find the \(k\)th even natural number (the first being \(0\)).

    pseudocode
    def even(k):
        // input: k, a positive number
        if k = 1 return 0
        else return even(k-1) + 2

    Marcus is unsure of the time complexity of their algorithm, but they think that it could be in \(\O(n)\). Create a recurrence relation for even() and use it to say whether or not Marcus is correct.

Answers
  1. Asymptotic time complexity gives an indication of how fast an algorithm will run in terms of the input size as the input size becomes very large (towards infinity, hence "asymptotic").

  2. Asymptotic space complexity refers to the amount of memory space an algorithm will take to run, whereas time complexity refers to how long it will take to run.

  3. Find the classes of time complexity for:

    1. Worst-case: \(\O(n)\)

    2. Best-case: \(\Omega(1)\)

    3. Average-case: \(\Theta(n)\)

  4. It would be an array where A[0] == A[5] because then the algorithm would avoid looping through all items in A.

  5. State an example of an algorithm for:

    1. \(\O(n)\): Anything that loops through all the items in a list once.

    2. \(\O(\log{n})\): Anything that divides the input size by a constant factor for each operation, but does not need to recombine those results. An algorithm that takes \(\O(\log{n})\) time, by definition, cannot examine all \(n\) elements of its input.

    3. \(\O(n^2)\): An algorithm that consists of one loop through \(n\) inside another loop through \(n\).

    4. \(\O(n \log{n})\): An algorithm that divides the input size by a constant factor with each iteration, but rejoins all the items together at the end. So, most divide-and-conquer algorithms.

    5. \(\O(n!)\): Any algorithm that generates all permutations of its input will run in factorial time.

  6. P problems are those that are solvable in polynomial time, whereas NP problems are those where potential solutions can be verified in polynomial time, but no algorithm exists to solve them in polynomial time. P is a subset of NP.

  7. \( a = 3, b = 2, c = 2, k = 1 \\ \log_b{a} = \log_2{3} \approx 1.58 \\ \log_b{a} \lt c \\ \therefore T(n) = 3T(n/2) + n^2 \in \O(n^2) \)

  8. Label this diagram with the terms P, NP, NP-complete and NP-hard.

    [Diagram: p eq np answer]
  9. By finding a solution to an NP-hard problem in polynomial time, Ben has shown that P=NP=NP-complete. But NP-hard stays the same as it is the set of all problems that are at least as hard as the hardest in NP.

  10. Darren’s algorithm is better than Will’s because it has a lower time complexity. If we define \(n\) as the length of the input array, then darren_alg() runs in \(\O(n)\), whereas will_alg() runs in \(\O(n^2)\).

  11. Counting additions.

    \( A(1) = 0 \\ A(n) = A(n - 1) + 1 \)

    which then simplifies to \(A(n) = n - 1\), therefore Marcus is correct. Their algorithm runs in \(\O(n)\).

References