What is Complexity Analysis?

  • Complexity Analysis is the mathematical evaluation of the resources (time and memory) consumed by an algorithm.
  • Think of it like ordering food at a restaurant:

🧠 Beginner Mental Model

  • Bad waiter: Takes 5 minutes per person, so 100 people = 500 minutes. 😰 → per group
  • Good waiter: Takes a menu order once for the whole table → per group
  • The algorithm is the waiter. The input size is the number of people.
  • Instead of measuring execution time in seconds (which varies across hardware, OS, and CPU load), we measure complexity in terms of the number of basic operations and memory slots used relative to input size .
  • We express this using asymptotic notations to describe the growth rate of resource usage as input size scales toward infinity.

Key Terms

TermMeaning
Input size ()The quantity the algorithm works on (array length, number of nodes, etc.)
Basic operationA single comparison, assignment, arithmetic op, or memory access
Growth rateHow fast the operation count increases as grows
AsymptoticBehavior of the algorithm as (very large values)

Why It Matters

  • Hardware improvements (faster CPUs, more RAM) cannot fix a fundamentally inefficient algorithm. As data scales, algorithmic efficiency dominates hardware performance.
  • Scaling Breakdown operations per second. Here is how different algorithms scale for input size (one million elements):

    Suppose a computer can perform

    • Logarithmic time operations microseconds
    • Linear time operations millisecond
    • Linearithmic time operations milliseconds
    • Quadratic time operations seconds (~ minutes)
    • Exponential time operations Will never finish (longer than the age of the universe)

    Choosing the correct algorithm can mean the difference between an instant response and a system crash.

📦 Real Analogy — Searching in a Phone Book

StrategyComplexityHow it works
Read every single nameLinear search
Open to the middle, eliminate halfBinary search
Teleport to exact pageHash map lookup

How It Works

1. Asymptotic Notations

  • We use three primary mathematical notations to describe the bounds of an algorithm:
NotationNamePurposeMathematical DefinitionIntuition
Big OUpper Bound (Worst case) such that for all “Will never be slower than this”
Big OmegaLower Bound (Best case) such that for all “Will never be faster than this”
Big ThetaTight Bound (Exact) such that “Always grows at this rate”
  • When do we use which?

    • In interviews and practice, people almost always say “Big O” even when they technically mean Big Theta.
    • Big O = “ceiling” → useful to guarantee a worst case bound.
    • Big Omega = “floor” → useful to prove no algorithm can beat this limit.
    • Big Theta = “exact fit” → used in rigorous academic proofs.
  • The definition for all means:

Formal Definition Explained (Beginner Friendly)

  • We’re allowed to pick any positive constant (scaling factor).
  • We’re allowed to ignore a finite “startup” region (i.e., for small , the constants can dominate).
  • Beyond some threshold , is always a valid upper bound on .
  • In practice: constants are dropped because they vary by machine.
  • Beyond , the function is sandwiched between and .

Visualizing the Notations

T(n)
  |              ↑ c₂·g(n)  (upper bound, O)
  |            ↗
  |         ↗ T(n)           (actual runtime, inside Θ band)
  |       ↗
  |     ↗  ↗ c₁·g(n)        (lower bound, Ω)
  +------------------------→ n
       n₀

2. Simplification Rules

  • When computing Big O, we apply these simplification rules:
  • Constants don’t affect the growth rate:

Rule 1 — Drop Constants

  • Keep only the fastest-growing term:

Rule 2 — Drop Non-Dominant Terms

  • If two loops iterate over different inputs, use separate variables:

Rule 3 — Different Variables

def process(arr_a, arr_b):
    for a in arr_a:  # O(a)
        print(a)
    for b in arr_b:  # O(b)
        print(b)
# Total: O(a + b), NOT O(n)

Rule 4 — Sequences vs. Nesting

PatternComplexityCode Structure
Sequential loopsfor ... : ; for ... :
Nested loopsfor ...: for ...:
  • Can you simplify these?

    1. → ?
    2. → ?
    3. where → ?

    Answers: , ,

Quick Practice Problems

3. Common Complexity Classes

  • Below are the most common complexity classes, ordered from fastest to slowest:
Complexity ClassNameExample AlgorithmAnalogy
ConstantArray index access, push/popFinding your seat by ticket number
LogarithmicBinary Search, BST lookupPhone book binary search
Square RootTrial division primality testChecking grid diagonals
LinearLinear Search, array traversalReading a book page by page
LinearithmicMerge Sort, Heap SortSorting a deck optimally
QuadraticBubble Sort, nested loopsComparing every pair in a group
CubicNaive matrix multiplicationTriple nested loop
ExponentialRecursive subset generationAll possible subsets
FactorialGenerating all permutationsBrute-force travelling salesman
xychart-beta
    title "Growth Rate Curves (n vs Operations)"
    x-axis ["n=1", "n=2", "n=4", "n=8", "n=16"]
    y-axis "Operations" 0 --> 300
    line "O(1) Constant" [1, 1, 1, 1, 1]
    line "O(log n) Logarithmic" [0, 1, 2, 3, 4]
    line "O(n) Linear" [1, 2, 4, 8, 16]
    line "O(n log n) Linearithmic" [0, 2, 8, 24, 64]
    line "O(n^2) Quadratic" [1, 4, 16, 64, 256]
  • The Complexity Ladder Always aim as far left on this ladder as possible.

4. Mathematical Derivation of

  • Logarithmic growth is incredibly efficient. Here is a step-by-step mathematical proof of why halving the search space (like in Binary Search) yields :
    1. Let the initial search space size be .
    1. Each iteration divides the search space by .
    1. After iterations, the remaining size is:
    1. The search terminates when the space is reduced to :
    1. Take the base-2 logarithm of both sides:
    1. Therefore, the number of steps , which we write as .
  • Logarithmic Growth Rule doubles, a logarithmic algorithm requires only one extra step.

    Every time the input size

    10
    21
    42
    83
    1,02410
    1,048,57620
    1,073,741,824 (1 billion)30
  • Since and is a constant, the bases only differ by a constant factor: This is why we drop the base in Big O notation.

Why the base of log doesn’t matter in Big O

5. Code Complexity Analysis — Step by Step

  • Here is how to identify complexities by reading code patterns:
  • Operations that don’t depend on input size at all:

🟢 Constant Time —

def get_first_element(arr):
    # Always exactly 1 operation, regardless of len(arr)
    return arr[0] if arr else None
 
def is_even(n):
    # Single arithmetic check — constant time
    return n % 2 == 0
 
def swap(arr, i, j):
    # Three assignments — still O(1) since it's a fixed number
    arr[i], arr[j] = arr[j], arr[i]
  • Each iteration cuts the problem in half (or by a constant fraction):

🟡 Logarithmic Time —

def binary_search(arr, target):
    low, high = 0, len(arr) - 1
    while low <= high:
        mid = (low + high) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            low = mid + 1      # Discard left half
        else:
            high = mid - 1     # Discard right half
    return -1
# Search space: n → n/2 → n/4 → ... → 1
# Steps = log₂(n)  →  O(log n)
 
def count_digits(n):
    count = 0
    while n > 0:
        n //= 10   # Divides by 10 each time
        count += 1
    return count
# Steps ≈ log₁₀(n)  →  O(log n)
  • A single pass through all elements:

🟠 Linear Time —

def find_max(arr):
    max_val = arr[0]
    for item in arr:       # Loop runs n times
        if item > max_val:
            max_val = item
    return max_val
 
def sum_array(arr):
    total = 0
    for x in arr:          # One operation per element
        total += x
    return total
  • A nested loop where both loops scale with :

🔴 Quadratic Time —

def bubble_sort(arr):
    n = len(arr)
    for i in range(n):           # Outer: n iterations
        for j in range(n - i - 1): # Inner: up to n iterations
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
# Total comparisons ≈ n*(n-1)/2  =  O(n²)
 
def print_all_pairs(arr):
    for i in range(len(arr)):    # O(n)
        for j in range(len(arr)): # O(n) for each i
            print(arr[i], arr[j])
# Total: O(n × n) = O(n²)
  • Divide-and-conquer: split times, process elements each split:

🔵 Linearithmic Time —

def merge_sort(arr):
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])   # T(n/2) — split left half
    right = merge_sort(arr[mid:])  # T(n/2) — split right half
    return merge(left, right)      # O(n)   — merge step
 
def merge(left, right):
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i]); i += 1
        else:
            result.append(right[j]); j += 1
    result.extend(left[i:])
    result.extend(right[j:])
    return result
# Recurrence: T(n) = 2T(n/2) + O(n) → O(n log n)
  • Each call generates two more calls — the tree doubles at each level:

⚫ Exponential Time —

def fibonacci_naive(n):
    # For each call, we make 2 more recursive calls
    if n <= 1:
        return n
    return fibonacci_naive(n - 1) + fibonacci_naive(n - 2)
# Call tree has ~2^n nodes → O(2^n) time, O(n) space
 
def all_subsets(arr):
    # Generates all 2^n subsets of arr
    if not arr:
        return [[]]
    first = arr[0]
    rest_subsets = all_subsets(arr[1:])
    return rest_subsets + [[first] + s for s in rest_subsets]
  • Not all nested loops are . Always trace the actual iterations:

🟣 Tricky Multi-Loop Example

def mystery(n):
    i = 1
    while i < n:
        i *= 2   # i doubles: 1, 2, 4, 8, ..., n
    # Loop runs log₂(n) times → O(log n) ❗ NOT O(n)
 
def print_triangle(n):
    for i in range(n):           # n rows
        for j in range(i + 1):   # 1, 2, 3, ..., n iterations
            print("*", end=" ")
    # Total = 1 + 2 + ... + n = n(n+1)/2 = O(n²)

6. Recurrence Relations & Master Theorem

  • For recursive divide-and-conquer algorithms, we express time complexity as a recurrence relation:
  • Where:
    • : Number of subproblems at each recursion level.
    • : Factor by which the input size shrinks.
    • : Work done outside the recursive calls (splitting, merging, processing).
  • For merge_sort: T(n) = 2T(n/2) + O(n)

Step-by-Step: Reading a Recurrence

SymbolMeaningValue
Split into 2 halvesTwo recursive calls
Each half is sizeProblem halves each level
Merging two sorted arraysLinear merge step
  • The Master Theorem provides a shortcut to solve recurrences of this form:
  • The 3 Cases of the Master Theorem critical exponent:

    First compute the

    Case 1 — Recursive work dominates (subproblems take most time):

    Case 2 — Work is balanced (split cost equals recursive cost):

    Case 3 — Combining work dominates (merge/split takes most time):

Real-World Recurrence Examples

  • Binary Search:
    • , ,
    • Critical exponent: , so
    • Case 2
  • Merge Sort:
    • , ,
    • Critical exponent: , so
    • Case 2
  • Strassen’s Matrix Multiply:
    • , ,
    • Critical exponent:
    • Case 1
  • For :

Recursion Tree Method (Alternative to Master Theorem)

Level 0:        n                     → cost: n
Level 1:    n/2   n/2                 → cost: n
Level 2:  n/4 n/4 n/4 n/4             → cost: n
...
Level k:  [2^k nodes, each n/2^k]     → cost: n
                                        (log n levels)
Total cost = n × log n = O(n log n)

7. Space Complexity & the Call Stack

  • Space complexity measures the memory an algorithm uses relative to input size .
TypeDefinition
Input SpaceMemory occupied by the input data itself
Auxiliary SpaceExtra memory used by the algorithm (variables, call stack, data structures)
Total SpaceInput Space + Auxiliary Space
  • Interview Convention auxiliary space when they say "space complexity". We generally exclude the input itself.

    Most algorithm interviews refer to

  • When a function calls itself recursively, each active call occupies a stack frame on the call stack:

Call Stack Memory

Call Stack for: factorial(4)
┌─────────────────────┐
│  factorial(1) → 1   │  ← Top of stack (most recent)
├─────────────────────┤
│  factorial(2)        │
├─────────────────────┤
│  factorial(3)        │
├─────────────────────┤
│  factorial(4)        │  ← Bottom of stack (first call)
└─────────────────────┘
Depth = 4 = n  →  Space: O(n)

Space Complexity Examples

# O(1) Space — Iterative: single stack frame, fixed variables
def sum_iterative(n):
    total = 0
    for i in range(1, n + 1):
        total += i
    return total
 
# O(n) Space — Recursive: n stack frames deep
def sum_recursive(n):
    if n == 0:
        return 0
    return n + sum_recursive(n - 1)
 
# O(log n) Space — Recursive Binary Search: log n stack frames
def binary_search_recursive(arr, low, high, target):
    if low > high:
        return -1
    mid = (low + high) // 2
    if arr[mid] == target:
        return mid
    elif arr[mid] < target:
        return binary_search_recursive(arr, mid + 1, high, target)
    else:
        return binary_search_recursive(arr, low, mid - 1, target)
 
# O(n) Space — Creating a new array
def double_array(arr):
    return [x * 2 for x in arr]  # New list of size n

Iterative vs Recursive Space

AlgorithmTimeSpace (Iterative)Space (Recursive)
Factorial
Binary Search
Merge Sort
Fibonacci (naive)

8. Amortized Analysis

  • Some operations are expensive occasionally but cheap most of the time. Amortized Analysis averages the cost per operation across a long sequence to give a tighter bound than worst-case alone.
  • When inserting elements into a dynamic array:

Example: Dynamic Array (Python list / C++ std::vector)

  1. Normal insert (space available): — just write to empty slot.
  2. Full array insert (no space): Allocate slots, copy all elements, then insert → one-time cost.
  3. After doubling, the next inserts are all again.
  • Amortized Cost:
  • Banker's Argument insert as paying 2 coins:

    Think of each

    • 1 coin for itself
    • 1 coin saved in a “doubling bank”

    When doubling happens, you have coins saved → can afford the copy. Every insert’s amortized cost = 2 coins = .

Other Amortized Examples

Data StructureOperationAmortized CostWorst Case
Dynamic Arrayappend
Binary Heap (heapify)Build from elements total naive
Splay TreeAny single op
Union-Find (path compression)find

9. Best / Average / Worst Case — When Do They Differ?

  • Complexity can vary depending on the specific input. A complete analysis covers all three cases:
AlgorithmBest CaseAverage CaseWorst Case
Linear Search (first element) (not found)
Binary Search (midpoint is target)
Quick Sort (balanced pivot) (sorted array, bad pivot)
Bubble Sort (already sorted)
Hash Map lookup (all keys collide)
def linear_search(arr, target):
    for i, item in enumerate(arr):
        if item == target:
            return i    # Best case: O(1) if target is at index 0
    return -1           # Worst case: O(n) if not found
  • Quick Sort and Pivot Selection of Quick Sort occurs when the pivot always splits the array into (e.g., always picking the smallest element on a sorted array). Randomized Quick Sort avoids this by choosing pivots randomly → expected .

    The worst case

Real-World Applications

  • Databases store millions of rows. A full-table scan without an index is . Adding a B-Tree index transforms lookups to .

🗄️ Database Indexing

  • Searching 1 million rows without index: ~ comparisons
  • Searching with B-Tree index: ~ comparisons ()
  • Trade brokers use hash maps ( amortized lookup) instead of balanced BSTs () to minimize microsecond-scale latency. At 10,000 trades/second, even can be a bottleneck.

⚡ High-Frequency Trading (HFT)

  • Network routers handle millions of packets/second. They use Trie structures for prefix matching IP addresses in time (where bits for IPv4), completely independent of total number of routes .

🌐 Network Routing

  • The security of RSA encryption relies on the fact that multiplying two large primes is but factoring the product back is believed to be exponential — the computational gap is the security guarantee.

🔒 Cryptography

  • DNA sequence alignment uses Dynamic Programming algorithms like Smith-Waterman in time. For whole-genome alignment (billions of bases), approximations and heuristics reduce this to .

🧬 Bioinformatics

Key Takeaways

  • Time & Space — Analyze operations performed (time) and extra memory allocated (space). Both matter.
  • Big O is the Standard — In practice, Big O (worst case) is what engineers discuss. It’s the safety guarantee.
  • Simplify ruthlessly — Drop constants and non-dominant terms. .
  • Logarithmic is powerful means doubling the input only costs 1 more step. Always prefer it over when possible.
  • Recursion has hidden costs — Every recursive call adds a stack frame. A -level-deep recursion costs space.
  • Amortized smooths spikes — When one expensive operation enables many cheap ones, the true per-operation cost is lower than it looks.
  • Context matters is fine for ; catastrophic for . Always estimate the expected input size.
graph TD
  A["Input: n"] --> B{"What does the algorithm do?"}
  B -->|"Fixed ops regardless of n"| C["O(1) — Constant"]
  B -->|"Halves problem each step"| D["O(log n) — Logarithmic"]
  B -->|"One pass through all n"| E["O(n) — Linear"]
  B -->|"Divide + merge each level"| F["O(n log n) — Linearithmic"]
  B -->|"Nested loop over n"| G["O(n²) — Quadratic"]
  B -->|"Two branches per call"| H["O(2^n) — Exponential"]

More Learn