What is Complexity Analysis?

  • Complexity Analysis is the mathematical evaluation of the resources required by an algorithm.
  • Instead of measuring execution time in seconds (which varies across different hardware, operating systems, and CPU loads), we measure complexity in terms of the number of basic operations and memory slots used relative to the input size $n$.
  • We express this using asymptotic notations to describe the growth rate of resource usage as the input size scales to infinity.

Why It Matters

  • Hardware improvements (faster CPUs, more RAM) cannot fix a fundamentally inefficient algorithm. As the scale of data grows, algorithmic efficiency dominates hardware performance.
  • Scaling Breakdown

    Suppose a computer can perform $10^9$ operations per second. Here is how different algorithms scale for an input size $n = 10^6$ (one million elements):

    • Logarithmic time $O(\log n)$ $\approx 20$ operations $\implies$ $0.02$ microseconds
    • Linear time $O(n)$ $= 10^6$ operations $\implies$ $1$ millisecond
    • Linearithmic time $O(n \log n)$ $\approx 2 \cdot 10^7$ operations $\implies$ $20$ milliseconds
    • Quadratic time $O(n^2)$ $= 10^{12}$ operations $\implies$ $1,000$ seconds (~$16.6$ minutes)
    • Exponential time $O(2^n)$ $= 2^{10^6}$ operations $\implies$ Will never finish (longer than the age of the universe)

    Choosing the correct algorithm can mean the difference between an instant response and system failure.

How It Works

1. Asymptotic Notations

  • We use three primary mathematical notations to describe the bounds of an algorithm:
NotationNamePurposeMathematical Definition
$O(g(n))$Big OUpper Bound (Worst case)$T(n) \le c \cdot g(n)$ for all $n \ge n_0$
$\Omega(g(n))$Big OmegaLower Bound (Best case)$T(n) \ge c \cdot g(n)$ for all $n \ge n_0$
$\Theta(g(n))$Big ThetaTight Bound (Average case)$c_1 \cdot g(n) \le T(n) \le c_2 \cdot g(n)$
    • Worst-case ($O$) gives a guarantee that the algorithm will never take longer than this. This is the standard notation used in industry.
    • Best-case ($\Omega$) shows the absolute minimum operations needed.
    • Tight bound ($\Theta$) describes the behavior when the best and worst cases have the same growth rate.

2. Common Complexity Classes

  • Below are the most common complexity classes encountered in algorithms:
Complexity ClassNameExample Algorithm
$O(1)$ConstantAccessing an array index, push/pop in stack
$O(\log n)$LogarithmicBinary Search
$O(n)$LinearLinear Search, single loop traversal
$O(n \log n)$LinearithmicMerge Sort, Heap Sort, Quick Sort (average)
$O(n^2)$QuadraticBubble Sort, Selection Sort, nested loops
$O(2^n)$ExponentialNaive recursive Fibonacci computation
$O(n!)$FactorialGenerating all permutations of a string
xychart-beta
    title "Growth Rate Curves (n vs Operations)"
    x-axis ["n=1", "n=2", "n=4", "n=8", "n=16"]
    y-axis "Operations" 0 --> 300
    line "O(1) Constant" [1, 1, 1, 1, 1]
    line "O(log n) Logarithmic" [0, 1, 2, 3, 4]
    line "O(n) Linear" [1, 2, 4, 8, 16]
    line "O(n log n) Linearithmic" [0, 2, 8, 24, 64]
    line "O(n^2) Quadratic" [1, 4, 16, 64, 256]

3. Mathematical Derivation of $O(\log n)$

  • Logarithmic growth is incredibly efficient. Here is the step-by-step mathematical proof of why dividing the search space in half (like in Binary Search) yields $O(\log n)$:
    1. Let the initial size of the input search space be $n$.
    1. In each iteration, we divide the search space by $2$.
    1. After $k$ iterations, the remaining search space size is: $$\text{Size} = \frac{n}{2^k}$$
    1. The search terminates when the search space is reduced to a single element ($1$): $$\frac{n}{2^k} = 1 \implies n = 2^k$$
    1. To solve for the number of steps $k$, we take the base-2 logarithm of both sides: $$\log_2(n) = \log_2(2^k)$$ $$\log_2(n) = k$$
    1. Therefore, the maximum number of operations $k$ is proportional to $\log_2(n)$, which is written as $O(\log n)$.
  • Logarithmic Growth Rule doubles, a logarithmic algorithm requires only one extra step.

    Every time the input size $n$

4. Code Complexity Analysis Examples

  • Here is how you identify these complexities in real code:

Constant Time $O(1)$

def get_first_element(arr):
    # This runs in O(1) time regardless of how large arr is
    return arr[0] if arr else None

Linear Time $O(n)$

def find_element(arr, target):
    # Single loop checking every element
    for item in arr:
        if item == target:
            return True
    return False

Quadratic Time $O(n^2)$

def print_pairs(arr):
    # Nested loop: outer loop runs n times, inner loop runs n times
    for i in range(len(arr)):
        for j in range(len(arr)):
            print(arr[i], arr[j])

Logarithmic Time $O(\log n)$

def divide_by_two(n):
    # The counter is halved each iteration
    steps = 0
    while n > 1:
        n //= 2
        steps += 1
    return steps

5. Recurrence Relations & Master Theorem

  • For recursive divide-and-conquer algorithms, we represent time complexity using a recurrence relation: $$T(n) = aT(n/b) + f(n)$$
  • Where:
    • $a$: Number of subproblems generated recursively.
    • $b$: Factor by which the input size is divided.
    • $f(n)$: Work done outside the recursive calls (splitting and merging).
  • The Master Theorem provides a shortcut to solve recurrences of this form:
  • The 3 Cases of Master Theorem

    Compare $f(n)$ with $n^{\log_b a}$:

    1. Case 1 (Recursive work dominates): If $f(n) = O(n^{\log_b a - \epsilon})$ for some constant $\epsilon > 0$, then: $$T(n) = \Theta(n^{\log_b a})$$

    2. Case 2 (Recursive work and split/merge work are equal): If $f(n) = \Theta(n^{\log_b a})$, then: $$T(n) = \Theta(n^{\log_b a} \log n)$$

    3. Case 3 (Split/merge work dominates): If $f(n) = \Omega(n^{\log_b a + \epsilon})$ for some constant $\epsilon > 0$, and if $a f(n/b) \le c f(n)$ for some constant $c < 1$, then: $$T(n) = \Theta(f(n))$$

Real-World Recurrence Examples:

  • Binary Search: $T(n) = T(n/2) + O(1)$
    • Here, $a=1$, $b=2$, $f(n) = \Theta(1)$.
    • $n^{\log_b a} = n^{\log_2 1} = n^0 = 1$. Since $f(n) = \Theta(1)$, this is Case 2.
    • $$T(n) = \Theta(1 \cdot \log n) = \Theta(\log n)$$
  • Merge Sort: $T(n) = 2T(n/2) + O(n)$
    • Here, $a=2$, $b=2$, $f(n) = \Theta(n)$.
    • $n^{\log_b a} = n^{\log_2 2} = n^1 = n$. Since $f(n) = \Theta(n)$, this is Case 2.
    • $$T(n) = \Theta(n \log n)$$

6. Space Complexity & The Call Stack

  • Space complexity measures the amount of memory an algorithm needs relative to input size $n$.
    • Auxiliary Space: The extra or temporary space used by the algorithm, excluding the input space.
    • Total Space: Input space + Auxiliary space.
  • When evaluating space complexity, we must account for the memory allocated on the recursion call stack:
Call Stack (Recursive Binary Search)
+-----------------------------------+
| binary_search(arr, low, high)     |  --> O(1) space variables per frame
+-----------------------------------+
| binary_search(arr, mid+1, high)   |  --> Frame 2
+-----------------------------------+
| binary_search(arr, low, mid-1)    |  --> Frame 1
+-----------------------------------+
    • Iterative Algorithms run inside a single stack frame and update variables in-place, achieving $O(1)$ auxiliary space.
    • Recursive Algorithms spawn a stack frame for every nested call. If recursion goes $d$ levels deep, the call stack memory complexity is $O(d)$. For recursive binary search, this stack depth is $O(\log n)$.

7. Amortized Analysis

  • Some operations might be highly expensive on rare occasions, but extremely cheap most of the time. Amortized Analysis averages the time spent per operation over a sequence of operations.
  • When you insert elements into a dynamic array:

Example: Dynamic Array (e.g. Python list / C++ std::vector)

  1. Insertion is normally a constant time $O(1)$ operation because empty slots are available.
  2. When the array is full, it allocates a new array of double the size, copies all $n$ existing elements to the new space, and inserts the new element. This single insertion takes $O(n)$ time.
  3. Doubling only happens once every $n$ insertions.
  4. Amortized Time: $$\text{Total Cost of } n \text{ insertions} = \underbrace{n \cdot O(1)}{\text{Normal inserts}} + \underbrace{O(n)}{\text{Doubling/copying}} = O(n)$$ $$\text{Amortized Cost per operation} = \frac{O(n)}{n} = \mathbf{O(1)}$$
  • Even though some inserts take $O(n)$, the average/amortized cost per insert is guaranteed to be $O(1)$.

Real-World Applications

  • Database Indexes: Databases store millions of rows. Searching without an index takes $O(n)$ time. Creating a B-Tree index changes search time to $O(\log n)$, turning a 1-second query into less than 1 millisecond.
  • High-Frequency Trading: Cryptographic signers and trade brokers use hash maps ($O(1)$ amortized lookup) instead of binary search trees ($O(\log n)$) to minimize latency.
  • Router Routing Tables: Network routers handle millions of packets per second. They use trie structures to perform prefix matching in $O(L)$ time, where $L$ is IP address length, completely independent of the total number of routes $n$.

Key Takeaways

  • Time & Space — We analyze operations performed (Time) and extra memory allocated (Space).
  • Asymptotic Bounds — Big O ($O$) represents worst-case, Omega ($\Omega$) best-case, and Theta ($\Theta$) tight-bound.
  • Logarithmic Growth — $O(\log n)$ means dividing the problem size in half each step. It scales incredibly well.
  • Recursion Cost — Recursive algorithms require stack space proportional to the depth of recursion.
  • Amortized Cost — The average cost of an operation over a long sequence, smoothing out occasional expensive calls.

More Learn