Algorithms I CMPUT204

Loop Invariant (Topic 2)

Proving Correctness

Loop Invariant

Steps for Using a Loop Invariant

  1. Identify the loop invariant
  2. Prove the loop invariant holds in the following code states: initialization (before the first iteration), maintenance (it stays true after each iteration), and termination (when the loop ends, the invariant gives correctness)

Insertion Sort - Loop Invariant Proof

Example Library

Asymptotic Notation (Topic 3)

Analysis of algorithms becomes the analysis of functions

Breakdown of Big \(O\)

Big O

Definition
\(h(n) \in O(f(n)) \iff \exists c, n_0\) such that \(\forall n \geq n_0, h(n) \leq cf(n)\)
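
A quick worked example (mine, not from the slides): to show \(3n^2 + 5n \in O(n^2)\), pick \(c = 4\) and \(n_0 = 5\). Then for all \(n \geq 5\), \[3n^2 + 5n \leq 3n^2 + n^2 = 4n^2 = c f(n),\] so the definition is satisfied.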

Polynomials

Big \(\Omega\)

Big \(\Omega\)

Definition
\(h(n) \in \Omega(f(n)) \iff \exists c, n_0\) such that \(\forall n \geq n_0, h(n) \geq cf(n)\)

Big \(\Theta\)

Little \(o\) and \(\omega\)

Useful Tips

Limits

Proving Non-membership

Properties of Big-O Notation

Logarithms Review

Functions in Increasing order of Growth

Divide and Conquer, Merge Sort (Topic 4)

Splitting up

Merge Sort

mergeSort(Arr, low, high) {
    if(low < high) {
        let mid = (low + high)/2;  //integer (floor) division
        mergeSort(Arr, low, mid);  //each works one half of Arr
        mergeSort(Arr, mid + 1, high);
        merge(Arr, low, mid, high) //merges the sublists
        // this "merge" functions needs all the endpoints to know what to merge
    }
}
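
The pseudocode above leaves out the merge subprocedure; here is a minimal runnable Python sketch of the whole routine (my own illustration, not the course's reference code):

def merge(arr, low, mid, high):
    # merge the two sorted halves arr[low..mid] and arr[mid+1..high]
    left = arr[low:mid + 1]
    right = arr[mid + 1:high + 1]
    i = j = 0
    k = low
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            arr[k] = left[i]
            i += 1
        else:
            arr[k] = right[j]
            j += 1
        k += 1
    arr[k:high + 1] = left[i:] + right[j:]   # copy whatever remains

def merge_sort(arr, low, high):
    if low < high:
        mid = (low + high) // 2        # integer (floor) division
        merge_sort(arr, low, mid)      # sort each half
        merge_sort(arr, mid + 1, high)
        merge(arr, low, mid, high)     # merge the sorted halves

# usage: merge_sort(a, 0, len(a) - 1) sorts a in place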

Efficiency of Merge Sort

Solving Recurrences

Iterated Substitution

Merge Sort Application

- $T(n) = \begin{cases} 0 & \text{if } n = 1 \\ 2\,T\left(\frac{n}{2}\right) + (n-1) & \text{if } n \geq 2 \end{cases}$
- For $n = 2^k$: $T(2^{k}) = (2^{k} - 1) + 2 \times T(2^{k-1}) = (2^{k} - 1) + 2 \times ((2^{k-1} - 1) + 2 \times T(2^{k-2})) = \dots$
- Writing this out to the base case and factoring, we get $k \times 2^{k} - \sum\limits^{k-1}_{i=0} 2^{i} = k \times 2^{k} - (2^{k} - 1) = (k-1)2^{k} + 1$
- Since $n = 2^k$, we have $k = \log{n} \implies T(n) = n(\log{n} - 1) + 1 = n\log{n} - n + 1$
- We have found $T(n)$, but not proven it. For that, we need induction; the induction proof follows structurally from the recurrence relation
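
A sketch of the induction the last point refers to, proving \(T(2^k) = (k-1)2^k + 1\) for all \(k \geq 0\):

- Base case: \(T(2^0) = T(1) = 0 = (0-1)\cdot 2^0 + 1\)
- Inductive step: assume \(T(2^{k-1}) = (k-2)2^{k-1} + 1\). Then \[T(2^k) = (2^k - 1) + 2\,T(2^{k-1}) = (2^k - 1) + 2\big((k-2)2^{k-1} + 1\big) = 2^k - 1 + (k-2)2^k + 2 = (k-1)2^k + 1\]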

Recurrence Tree

Guess and Test

Master Theorem

Let \(a \geq 1\) and \(b > 1\) be constants, and let \(f(n)\) be a function. Let \(T(n)\) be defined on non-negative integers by the recurrence \[T(n) = aT\left(\frac{n}{b}\right)+ f(n)\]

Then \(T(n)\) has the following bounds
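
For reference, the standard three cases (as stated in CLRS), comparing \(f(n)\) against \(n^{\log_b a}\):

  1. If \(f(n) \in O(n^{\log_b a - \epsilon})\) for some constant \(\epsilon > 0\), then \(T(n) \in \Theta(n^{\log_b a})\)
  2. If \(f(n) \in \Theta(n^{\log_b a})\), then \(T(n) \in \Theta(n^{\log_b a}\log{n})\)
  3. If \(f(n) \in \Omega(n^{\log_b a + \epsilon})\) for some constant \(\epsilon > 0\), and \(a f(n/b) \leq c f(n)\) for some constant \(c < 1\) and all sufficiently large \(n\), then \(T(n) \in \Theta(f(n))\)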

Categorizing Recurrences (appendix)

Heaps, Priority Queue, HeapSort (Topic 5)

Heaps

Array Representation

Array Indices for Heaps

Definition

Almost-Heaps and Max-Heapify

// precondition: Tree rooted at A[i] is an almost-heap
// postcondition: Tree rooted at A[i] is a heap

lc = leftchild(i), rc = rightchild(i), largest = i;

//finds the largest child of A[i] (either left or right) 
if(lc <= heapsize(A) && A[lc] > A[largest]) {
    largest = lc
} //else?
if(rc <= heapsize(A) && A[rc] > A[largest]) {
    largest = rc
}

//if the root is not in the right spot, swap it with
//the largest child, then recursively call Max-Heapify
//on that child's subheap
if(largest != i) {
    swap(A[i], A[largest])
    Max-Heapify(A, largest)
}
Running Time

Building a Heap

Procedure
  1. Each leaf node is already a heap on its own, since it trivially satisfies the max-heap property
  2. Thus, the nodes on the second-bottom layer are almost-heaps, since both of their children are the roots of heaps
  3. Max-Heapify turns those into heaps, so the nodes on the third-from-bottom layer become almost-heaps, which we can Max-Heapify in turn
  4. Keep repeating this process recursively up the tree
  5. The whole tree becomes a heap
Pseudocode
let i = floor(length(A)/2);

for (i downto 1) {
    Max-Heapify(A, i)
}
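
A runnable Python sketch of Max-Heapify plus the build procedure (my own illustration; it uses 0-based indices, so the children of i are 2i+1 and 2i+2 instead of 2i and 2i+1):

def max_heapify(A, i, heap_size):
    # precondition: the tree rooted at i is an almost-heap
    lc, rc = 2 * i + 1, 2 * i + 2
    largest = i
    if lc < heap_size and A[lc] > A[largest]:
        largest = lc
    if rc < heap_size and A[rc] > A[largest]:
        largest = rc
    if largest != i:
        A[i], A[largest] = A[largest], A[i]   # swap root with its larger child
        max_heapify(A, largest, heap_size)    # fix the subtree we swapped into

def build_max_heap(A):
    # leaves are already heaps; heapify the internal nodes from the bottom up
    for i in range(len(A) // 2 - 1, -1, -1):
        max_heapify(A, i, len(A))

# usage: build_max_heap(a) rearranges a into a max-heap in place
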
Running Time (loose)
Running Time (tight)
- $c$ is the worst-case running time of `Max-Heapify` on a tree with a root and two leaves (it is a constant)
- To get the worst-case contribution of one layer, we take $\text{number of nodes in the layer} \times \text{height of those nodes} \times c$
- Summing these over all layers gives the running time of the whole algorithm
- Height of the heap: $\lfloor \log{n} \rfloor$; the bottom layer (height $0$) has at most $2^{\lfloor \log{n} \rfloor}$ nodes, and the whole heap has $n$ nodes (at most $2^{\lfloor \log{n} \rfloor + 1} - 1$)
- For an $n$-element heap, the maximum number of nodes of height $k$ is $\dfrac{n}{2^k}$
- So: $T(n) \leq \displaystyle\sum\limits_{k=1}^{\log{n}} kc \times \dfrac{n}{2^{k}}$
- Algebra shows this is $O(n)$ (i.e. it reduces to a constant times $n$)
- Another way to look at it: $T(n) \leq \displaystyle\sum\limits_{k=1}^{\log{n}} kc \times \dfrac{n}{2^{k}} \leq nc \sum\limits_{k=1}^{\infty}\dfrac{k}{2^k}$, and $\displaystyle\sum\limits_{k=1}^{\infty}\dfrac{k}{2^{k}}$ is bounded by a constant (compare it with $\int^{\infty}_{0}\dfrac{x}{2^{x}} \, dx$)
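
Why that infinite sum is a constant (a side calculation, differentiating the geometric series): \[\sum_{k=0}^{\infty} x^{k} = \frac{1}{1-x} \implies \sum_{k=1}^{\infty} k x^{k-1} = \frac{1}{(1-x)^{2}} \implies \sum_{k=1}^{\infty} k x^{k} = \frac{x}{(1-x)^{2}},\] and plugging in \(x = \frac{1}{2}\) gives \(\sum_{k=1}^{\infty} \frac{k}{2^{k}} = 2\).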

Priority Queues

Definition and Motivation

Heap Implementation

Heapsort

Motivation and Implementation Sketch

Runtime

Quicksort, Sorting Lower Bound, BST, Balanced BST, Hash Table (Topic 6)

Quicksort

Pseudocode

QuickSort(A, p, r): 
    // sorts A = [p ... r]
    if(p < r) {
        // A[q] is the pivot
        // all elements before q are smaller, all after are larger
        // note: partition both modifies A (side effect) AND returns q
        q = Partition(A, p, r)
        // recursive calls
        QuickSort(A, p, q-1)
        QuickSort(A, q+1, r)
    }

// Partition subprocedure
Partition(A, p, r):
    // last element of array being examined is picked as a pivot
    pivot = A[r];
    // "switch counter" (i is an index; counts up from start index)
    i = p-1
    for(j from p to r-1) { // counts up from start to second last element
        // if current element is smaller than pivot, switch and increment counter
        if(A[j] <= pivot) {
            i = i+1;
            swap(A[i], A[j])
        }
    }
    // switch last element and element in the "pivot spot", as determined by
    // incrementing i
    // note that we compared with r when switching and incrementing, so this
    // is why we can simply swap these and know the list is in the right form
    swap(A[i+1], A[r])
    return i+1
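
A direct Python translation of the pseudocode above (a sketch for experimenting with, not the course's reference code):

def partition(A, p, r):
    pivot = A[r]                  # last element is the pivot
    i = p - 1                     # right edge of the "<= pivot" region
    for j in range(p, r):         # scan p .. r-1
        if A[j] <= pivot:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]   # move the pivot into its final spot
    return i + 1

def quicksort(A, p, r):
    if p < r:
        q = partition(A, p, r)    # A[q] is now in its sorted position
        quicksort(A, p, q - 1)
        quicksort(A, q + 1, r)

# usage: quicksort(a, 0, len(a) - 1) sorts a in place
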
Partition Explanation

Runtime

Notes (meta)
Recursion Tree
Recurrence
Worst-case
Best-case
Average Case
(More) Formal Analysis

Storage

Random Quicksort

Notes on Randomness

Lower Bound for Sorting

Useful Trees

Recursion Tree

Decision Tree

Showing a Sorting LB with Decision Trees

Dynamic Sizes

Binary Search Trees (BSTs)

Tree Terms

Binary Search Tree Definition

Binary Tree Operations

AVL Trees (Balancing BSTs)

AVL Rotations

AVL Tree Definition

Maintaining the AVL Property

Hash Tables

Hash Table Definition

Hash Table Runtime

Types of Hashing

Greedy Methods (Topic 7)

Greedy Algorithms

Greedy Algorithm Paradigm

Paradigm
Always make choices that look best at the current step. These choices are final and cannot be changed later
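
A concrete instance of the paradigm, a minimal Python sketch of the Fractional Knapsack greedy covered below (the (value, weight) item format is my own assumption):

def fractional_knapsack(items, capacity):
    # items: list of (value, weight) pairs; best value/weight ratio goes first
    total = 0.0
    for value, weight in sorted(items, key=lambda vw: vw[0] / vw[1], reverse=True):
        if capacity <= 0:
            break
        take = min(weight, capacity)       # take as much of this item as fits
        total += value * (take / weight)   # fractions are allowed
        capacity -= take
    return total

# e.g. fractional_knapsack([(60, 10), (100, 20), (120, 30)], 50) returns 240.0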

Optimization Problems

Proof Strategy

Properties of Greedy Solutions

Fractional Knapsack Problem

Intuition/Proof

Job Scheduling Problem

Activity Selection

Divide-and-Conquer Revisited (Topic 8)

Exponentiation

Karatsuba's Algorithm for Large Integer Multiplication

Strassen Matrix Multiplication

Summary

Dynamic Programming (Topic 9)

Dynamic Programming Paradigm

Paradigm
Problems are solved recursively, but the results of individual recursive calls are saved and may be re-used
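
For example, a memoized Fibonacci in Python (a minimal sketch of the Fibonacci example below):

from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    # each subproblem is computed once, then re-used from the cache
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)

# fib(80) returns instantly; the naive recursion would make exponentially many calls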

Divide and Conquer vs. DP

Example: Fibonacci

Properties of DP Solutions

General Steps for DP

  1. Find a recurrence relation for the problem
  2. Check whether the recurrence ever repeats any of the same calls, and whether there is a reasonable number of distinct subproblems
  3. Describe an array (or, really, any data structure) of values that you wish to compute. Each value will be the result of a specific recursive call
  4. Fill the array from the bottom up* (backward induction). Solutions to subproblems are then looked up from the array instead of being recomputed

Integral Knapsack Problem

Naive Recursive Solution

Applying DP

Better Solution: Working Bottom-up
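
A minimal bottom-up sketch in Python (my own illustration; it assumes the usual inputs of item weights, item values, and a capacity W):

def knapsack(weights, values, W):
    n = len(weights)
    # best[i][w] = best value using only the first i items with capacity w
    best = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for w in range(W + 1):
            best[i][w] = best[i - 1][w]              # option 1: skip item i
            if weights[i - 1] <= w:                  # option 2: take it, if it fits
                best[i][w] = max(best[i][w],
                                 values[i - 1] + best[i - 1][w - weights[i - 1]])
    return best[n][W]

# e.g. knapsack([1, 3, 4, 5], [1, 4, 5, 7], 7) returns 9 (take the items of weight 3 and 4)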

Rod Cutting Problem

Longest Common Subsequences

Graphs, BFS, DFS (Topic 10)

Graph Terms

Paths and Cycles

Graph Representations

Graph Traversals

Node Classification

Generally, we color a node grey when we visit it for the first time, then color it black when we have exhausted (searched) all of its adjacencies. We ignore black nodes for further searches.

Nodes are always visited in layer order

Pseudocode

// G is a graph
// s is the starting vertex

void BFS(G, s)

for each vertex v in the graph, do
    v.color = "WHITE";
    v.dist = Infinity;
    v.predec = NULL;

Initialize a queue Q

// start examining s because it is the first node
// add it to the queue and change color to grey
s.color = "GREY"
s.dist = 0
enqueue(Q, s)

while(Q !== []) {
    u = dequeue(Q)    // remove from the queue
    for each neighbour v of u {
        if(v.color == "WHITE") {
            v.color = "GREY"
            v.dist = u.dist + 1
            v.predec = u
            enqueue(Q, v)   // add v to the Queue
        }
    }
    u.color = "BLACK"   // done visiting the neighbours of u
}

Explanation

BFS Trees

Properties

Running time

BFS for Disconnected Graphs

Pseudocode

int time  // global

void DFS(G)

    // initialize graph
    for each vertex v
        v.color = "WHITE"
        v.predec = NULL
    
    time = 0

    // visit all unvisited nodes
    for each vertex v
        if(v.color == "WHITE") {
            DFS-visit(G, v)
        }

void DFS-visit(G, s)

    // examining "first" node
    s.color = "GREY"
    time = time + 1
    s.dtime = time

    for each u, neighbour of s {
        if(u.color == "WHITE") {
            u.predec = s
            // recursive call
            DFS-visit(G, u)
        }
    }

    // done visiting everything
    s.color = "BLACK"
    time = time + 1
    s.ftime = time
    

Explanation

Running Time

DFS Parenthesis Theorem

White Path Theorem

Edge Classification for BFS/DFS

For an undirected graph, DFS produces only tree and back edges, never forward or cross edges.

BFS produces only tree and cross edges, never forward or back edges.

DFS Applications

Directed Acyclic Graphs (DAGs)

Using DFS to Determine if a Graph is a DAG

Topological Sorting

Motivation
Algorithm I - Kahn's Algorithm
void topo-sort(G) { // Kahn's algorithm
    S = []
    for each vertex v {
        if in-degree(v) == 0 {
            S.enqueue(v)
        }
    }
    
    i = 0;   // counts how many vertices have been output
    
    while(S !== []) {
        v = S.dequeue();
        print(v);
        i++;
        for each edge [v, u] {
            remove edge [v, u]
            if in-degree(u) == 0 {
                S.enqueue(u)
            }
        }
    }
    
    if i < n {
        print("G has a cycle")
    }
}
Algorithm II - DFS

Strongly Connected Component

Algorithm
  1. Run DFS on \(G\)
  2. Flip \(G\)'s edges to create \(G^T\) (i.e. we do \(A \to A^\top\) to the adjacency matrix)
  3. Following the decreasing order of \(ftime\), run DFS on \(G^T\)
  4. SCCs of \(G\) are the trees of the DFS forest of \(G^T\)

Minimum Spanning Tree (Topic 12)

Introduction (recap)

Greedy Algorithms for MST

Kruskal's Algorithm

We keep adding the smallest-weight edge that does not create a cycle to the spanning tree

Pseudocode

T = []
for each (v in V(G)) {
    Define cluster C(v) = v
}

sort edges in E(G) into non-decreasing weight order

for each (edge e = (u, v) in E(G)) {
    if(C(u) != C(v)) {
        T = [...T, e]   // union
        merge clusters C(u) and C(v)
    }
}

return T
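
A Python sketch of the same idea, with the clusters kept in a simple union-find (disjoint-set) structure (my own illustration, not the course's reference code):

def kruskal(n, edges):
    # n vertices labelled 0..n-1; edges is a list of (weight, u, v) triples
    parent = list(range(n))

    def find(x):
        # follow parent pointers to the cluster representative (with path halving)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    T = []
    for w, u, v in sorted(edges):      # non-decreasing weight order
        ru, rv = find(u), find(v)
        if ru != rv:                   # different clusters, so no cycle is created
            T.append((u, v, w))
            parent[ru] = rv            # merge the two clusters
    return T

# e.g. kruskal(4, [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)]) keeps the edges of
# weight 1, 2 and 4 and skips the weight-3 edge (it would close a cycle)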

Explanation

Correctness

Runtime

So the total time is \(O((m+n)\log{n})\)

Useful Theorem for Kruskal Proofs

Prim's Algorithm

Pseudocode

void primMST(G)

for each (v in V(G)) {
    v.key = infinity
    v.predec = NULL
}

// s is an arbitrary starting node (can be treated as a parameter)
s.key = 0

// Q is the set of nodes that haven't been looked at yet
Initialize a min-priority-queue Q on V using V.key

while (Q != []) {
    u = ExtractMin(Q)
    // finds the node with the lowest cost to connect to
    for each (v neighbour of u) {
        // w(u, v) is the weight of the edge between nodes u and v
        // if there's a shorter way to get to the node than moving directly
        if(v in Q and w(u, v) < v.key) {
            v.predec = u
            decrease-key(Q, v, w(u, v))
        }
    }
}

Running Time

Aside: Cuts

Shortest Path (Topic 13-14)

From MST to Shortest Path

Dijkstra's Algorithm

Overview

Pseudocode

// G is the graph, G = (V, E)
// w is the weights
// s is the starting node
void dijkstra(G, w, s) {
    for(each v) {
        v.dist = infinity
        v.predec = NULL
    }
    // distance to self is 0
    s.dist = 0

    Build a Min-Priority-Queue Q on all nodes, key = dist
    // Q is the set of nodes whose shortest paths we are not sure about
    // while we still have nodes to consider:
    while(Q != []) {
        // get the closest node
        u = ExtractMin(Q)
        for(each v neighbour of u) {
            // if we find a shorter path going through an existing subpath
            // update to make that path the new shortest path
            // this is called RELAXATION
            if(v.dist > u.dist + w(u, v)) {
                v.dist = u.dist + w(u, v)
                v.predec = u
                decrease-key(Q, v, v.dist)
            }
        }
        // (ExtractMin already removed u from Q, so no separate dequeue is needed)
    }
}

Relaxation

if(v.dist > u.dist + w(u, v)) {
    v.dist = u.dist + w(u, v) 
}

Correctness

Loop invariant: At the beginning of each while-loop iteration for node \(u\), we claim:

- \(u.dist = d(s, u)\) for all \(u\) that are not in the queue (i.e. \(u \in V - Q\)), i.e. the nodes we are no longer considering
- \(u.dist \geq d(s, u)\) for all \(u\) that are still in the queue (i.e. \(u \in Q\))

Runtime

Adding everything up, we have \(O(n) + O(n\log{n}) + O(m\log{n})\), so the total runtime is \(O((n+m)\log{n})\) with an adjacency list

Drawbacks

Dijkstra's Algorithm for DAGs

If the graph has been sorted topologically, we already know that we don't need to "backtrack" to find a shortest path: paths can only move "forward" in the DAG by definition. So, we don't need to build a priority queue; we can simply process the vertices in topological order and relax each outgoing edge.

Pseudocode

for each vertex v {
    v.dist = infinity
    v.predec = NULL
}

s.dist = 0

// v[1], ..., v[n] is the vertices in topological order
for i from 1 to n-1 {
    for each u adjacent to v[i] {
        // relaxation: update the distance if needed
        if(u.dist > v[i].dist + w(v[i], u)) {
            u.dist = v[i].dist + w(v[i], u)
            u.predec = v[i]
        }
    }
}

Runtime

It takes \(O(n)\) to preprocess and \(O(m)\) to execute the for loops (each edge is relaxed exactly once, doing \(O(1)\) work). So, the total runtime is \(O(n+m)\).

Bellman-Ford Algorithm

Slower than Dijkstra's algorithm, but can handle negative edge weights and is more flexible if the weight of an edge changes.

Idea: instead of picking the node of smallest distance sequentially, we relax every edge, and repeat this enough times that every needed relaxation can occur

Pseudocode

for each vertex v {
    v.dist = infinity
    v.predec = NULL
}

s.dist = 0

// relax every edge n-1 times; this accounts for all the relaxations that may be needed (although it does so rather inelegantly)
for i from 1 to n-1 {
    for each edge (u, v) {
        // recall: relax checks whether a shorter path exists
        relax(u, v)
    }
}

// if things are still changing, it must be because a negative cycle exists
// (which will keep decrementing weights indefinitely)
// so, we return false in this case
for each edge (u, v) {
    if u.dist + w(u, v) < v.dist {
        return false
    }
}

return true

Runtime

The first for-loop runs in \(O(nm)\), since it executes \(n-1\) times and does \(O(1)\) work per edge on each pass (\(O(m)\) per pass).

Notes

The algorithm can find the shortest paths after less than \(n-1\) iterations, but still completes the rest of the computation every time.

Floyd-Warshall Algorithm

Floyd-Warshall aims to solve the All-Pairs Shortest Path problem (APSP): what is the shortest path between any two pairs of nodes in a graph?

Idea: a naive approach is to run Dijkstra or Bellman-Ford from every vertex (\(n\) times). This would be very slow; Floyd-Warshall improves on it with dynamic programming.

Subproblems

Subproblem \(k-1\): for given vertices \(i, j\), find the length of the shortest path from \(i\) to \(j\) whose intermediate vertices all lie in \(\set{1, 2, \dots, k-1}\) (the vertices are numbered \(1 \dots n\))

Either we don't need vertex \(k\) (i.e. the shortest path is the same as in subproblem \(k-1\) for \(i, j\)), in which case \(d[i, j, k]=d[i, j, k-1]\), or we do need \(k\), in which case \(d[i, j, k]=d[i, k, k-1]+d[k, j, k-1]\).

So we get the recurrence \(d[i, j, k]=\min\set{d[i, j, k-1], d[i, k, k-1]+d[k, j, k-1]}\), which is dynamic-programmable.
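
A direct Python translation of that recurrence (a sketch; it updates the table in place, which is safe because row and column \(k\) do not change during iteration \(k\)):

def floyd_warshall(dist):
    # dist: n x n matrix of direct edge weights
    # (float('inf') where there is no edge, 0 on the diagonal)
    n = len(dist)
    d = [row[:] for row in dist]          # work on a copy
    for k in range(n):                    # now allow vertex k as an intermediate
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:   # i -> k -> j is shorter
                    d[i][j] = d[i][k] + d[k][j]
    return d                              # d[i][j] is the shortest i -> j distance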

Runtime

Our table is \(n\times n\times n\) (it is \(3\)-dimensional with \(n\) nodes per dimension), and each entry takes \(O(1)\) work to fill, so filling the entire table is \(O(n^3)\).

Cheat Sheet

Master Theorem (not sure if will be given)

Theorem
Let \(a \geq 1\) and \(b > 1\) be constants, and let \(f(n)\) be a function. Let \(T(n)\) be defined on non-negative integers by the recurrence \[T(n) = aT\left(\frac{n}{b}\right)+ f(n)\] Then \(T(n)\) has the following bounds

Heaps, Priority Queue, Heapsort

Array Indices for Heaps

Definition

Topic 6

QuickSort(A, p, r): 
    // sorts A = [p ... r]
    if(p < r) {
        // A[q] is the pivot
        // all elements before q are smaller, all after are larger
        // note: partition both modifies A (side effect) AND returns q
        q = Partition(A, p, r)
        // recursive calls
        QuickSort(A, p, q-1)
        QuickSort(A, q+1, r)
    }

// Partition subprocedure
Partition(A, p, r):
    // last element of array being examined is picked as a pivot
    pivot = A[r];
    // "switch counter" (i is an index; counts up from start index)
    i = p-1
    for(j from p to r-1) { // counts up from start to second last element
        // if current element is smaller than pivot, switch and increment counter
        if(A[j] <= pivot) {
            i = i+1;
            swap(A[i], A[j])
        }
    }
    // switch last element and element in the "pivot spot", as determined by
    // incrementing i
    // note that we compared with r when switching and incrementing, so this
    // is why we can simply swap these and know the list is in the right form
    swap(A[i+1], A[r])
    return i+1
- **Right-rotate**: the old root becomes the right child of the new root
- **Left-rotate**: the old root becomes the left child of the new root
- An **AVL** tree is a BST where, for any node, $|h_L - h_R| \leq 1$, i.e. the heights of the two subtrees differ by at most one

Greedy Algorithms

D and C again

DP

  1. Check whether the recurrence ever repeats any of the same calls, and whether there is a reasonable number of distinct subproblems
  2. Describe an array (or, really, any data structure) of values that you wish to compute. Each value will be the result of a specific recursive call
  3. Fill the array from the bottom up* (backward induction). Solutions to subproblems are then looked up from the array instead of being recomputed

all dp examples

Graphs

BFS

// G is a graph
// s is the starting vertex

void BFS(G, s)

for each vertex v in the graph, do
    v.color = "WHITE";
    v.dist = Infinity;
    v.predec = NULL;

Initialize a queue Q

// start examining s because it is the first node
// add it to the queue and change color to grey
s.color = "GREY"
s.dist = 0
enqueue(Q, s)

while(Q !== []) {
    u = dequeue(Q)    // remove from the queue
    for each neighbour v of u {
        if(v.color == "WHITE") {
            v.color = "GREY"
            v.dist = u.dist + 1
            v.predec = u
            enqueue(Q, v)   // add v to the Queue
        }
    }
    u.color = "BLACK"   // done visiting the neighbours of u
}

DFS

int time  // global

void DFS(G)

    // initialize graph
    for each vertex v
        v.color = "WHITE"
        v.predec = NULL
    
    time = 0

    // visit all unvisited nodes
    for each vertex v
        if(v.color == "WHITE") {
            DFS-visit(G, v)
        }

void DFS-visit(G, s)

    // examining "first" node
    s.color = "GREY"
    time = time + 1
    s.dtime = time

    for each u, neighbour of s {
        if(u.color == "WHITE") {
            u.predec = s
            // recursive call
            DFS-visit(G, u)
        }
    }

    // done visiting everything
    s.color = "BLACK"
    time = time + 1
    s.ftime = time
    

MST

Kruskal

T = []
for each (v in V(G)) {
    Define cluster C(v) = v
}

sort edges in E(G) into non-decreasing weight order

for each (edge e = (u, v) in E(G)) {
    if(C(u) != C(v)) {
        T = [...T, e]   // union
        merge clusters C(u) and C(v)
    }
}

return T

Prim

SP

Dijkstra

// G is the graph, G = (V, E)
// w is the weights
// s is the starting node
void dijkstra(G, w, s) {
    for(each v) {
        v.dist = infinity
        v.predec = NULL
    }
    // distance to self is 0
    s.dist = 0

    Build a Min-Priority-Queue Q on all nodes, key = dist
    // Q is the set of nodes whose shortest paths we are not sure about
    while(Q != []) {
        // get the closest node
        u = ExtractMin(Q)
        for(each v neighbour of u) {
            // relaxation
            if(v.dist > u.dist + w(u, v)) {
                v.dist = u.dist + w(u, v)
                v.predec = u
                decrease-key(Q, v, v.dist)
            }
        }
        // (ExtractMin already removed u from Q, so no separate dequeue is needed)
    }
}

or

Recursive Algorithms (Warm Up)

TODO

Fast multiplication where you square values instead of recalculating them every time: was a quiz question that I ate shit on
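
For reference, a minimal Python sketch of the repeated-squaring idea (my guess at the technique in question):

def power(x, n):
    # computes x**n with O(log n) multiplications by reusing the half-size result
    if n == 0:
        return 1
    half = power(x, n // 2)     # compute x**(n//2) once ...
    if n % 2 == 0:
        return half * half      # ... and square it, instead of recursing twice
    return half * half * x

# e.g. power(3, 13) == 1594323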