What is Complexity Analysis?
- Complexity Analysis is the mathematical evaluation of the resources (time and memory) consumed by an algorithm.
- Think of it like ordering food at a restaurant:
🧠 Beginner Mental Model
- Bad waiter: Takes 5 minutes per person, so 100 people = 500 minutes. 😰 → per group
- Good waiter: Takes a menu order once for the whole table → per group
- The algorithm is the waiter. The input size is the number of people.
- Instead of measuring execution time in seconds (which varies across hardware, OS, and CPU load), we measure complexity in terms of the number of basic operations and memory slots used relative to input size .
- We express this using asymptotic notations to describe the growth rate of resource usage as input size scales toward infinity.
Key Terms
| Term | Meaning |
|---|---|
| Input size () | The quantity the algorithm works on (array length, number of nodes, etc.) |
| Basic operation | A single comparison, assignment, arithmetic op, or memory access |
| Growth rate | How fast the operation count increases as grows |
| Asymptotic | Behavior of the algorithm as (very large values) |
Why It Matters
- Hardware improvements (faster CPUs, more RAM) cannot fix a fundamentally inefficient algorithm. As data scales, algorithmic efficiency dominates hardware performance.
-
Scaling Breakdown operations per second. Here is how different algorithms scale for input size (one million elements):
Suppose a computer can perform
- Logarithmic time operations microseconds
- Linear time operations millisecond
- Linearithmic time operations milliseconds
- Quadratic time operations seconds (~ minutes)
- Exponential time operations Will never finish (longer than the age of the universe)
Choosing the correct algorithm can mean the difference between an instant response and a system crash.
📦 Real Analogy — Searching in a Phone Book
| Strategy | Complexity | How it works |
|---|---|---|
| Read every single name | Linear search | |
| Open to the middle, eliminate half | Binary search | |
| Teleport to exact page | Hash map lookup |
How It Works
1. Asymptotic Notations
- We use three primary mathematical notations to describe the bounds of an algorithm:
| Notation | Name | Purpose | Mathematical Definition | Intuition |
|---|---|---|---|---|
| Big O | Upper Bound (Worst case) | such that for all | “Will never be slower than this” | |
| Big Omega | Lower Bound (Best case) | such that for all | “Will never be faster than this” | |
| Big Theta | Tight Bound (Exact) | such that | “Always grows at this rate” |
-
When do we use which?
- In interviews and practice, people almost always say “Big O” even when they technically mean Big Theta.
- Big O = “ceiling” → useful to guarantee a worst case bound.
- Big Omega = “floor” → useful to prove no algorithm can beat this limit.
- Big Theta = “exact fit” → used in rigorous academic proofs.
- The definition for all means:
Formal Definition Explained (Beginner Friendly)
- We’re allowed to pick any positive constant (scaling factor).
- We’re allowed to ignore a finite “startup” region (i.e., for small , the constants can dominate).
- Beyond some threshold , is always a valid upper bound on .
- In practice: constants are dropped because they vary by machine.
- Beyond , the function is sandwiched between and .
Visualizing the Notations
T(n)
| ↑ c₂·g(n) (upper bound, O)
| ↗
| ↗ T(n) (actual runtime, inside Θ band)
| ↗
| ↗ ↗ c₁·g(n) (lower bound, Ω)
+------------------------→ n
n₀
2. Simplification Rules
- When computing Big O, we apply these simplification rules:
- Constants don’t affect the growth rate:
Rule 1 — Drop Constants
- Keep only the fastest-growing term:
Rule 2 — Drop Non-Dominant Terms
- If two loops iterate over different inputs, use separate variables:
Rule 3 — Different Variables
def process(arr_a, arr_b):
for a in arr_a: # O(a)
print(a)
for b in arr_b: # O(b)
print(b)
# Total: O(a + b), NOT O(n)Rule 4 — Sequences vs. Nesting
| Pattern | Complexity | Code Structure |
|---|---|---|
| Sequential loops | for ... : ; for ... : | |
| Nested loops | for ...: for ...: |
-
Can you simplify these?
- → ?
- → ?
- where → ?
Answers: , ,
Quick Practice Problems
3. Common Complexity Classes
- Below are the most common complexity classes, ordered from fastest to slowest:
| Complexity Class | Name | Example Algorithm | Analogy |
|---|---|---|---|
| Constant | Array index access, push/pop | Finding your seat by ticket number | |
| Logarithmic | Binary Search, BST lookup | Phone book binary search | |
| Square Root | Trial division primality test | Checking grid diagonals | |
| Linear | Linear Search, array traversal | Reading a book page by page | |
| Linearithmic | Merge Sort, Heap Sort | Sorting a deck optimally | |
| Quadratic | Bubble Sort, nested loops | Comparing every pair in a group | |
| Cubic | Naive matrix multiplication | Triple nested loop | |
| Exponential | Recursive subset generation | All possible subsets | |
| Factorial | Generating all permutations | Brute-force travelling salesman |
xychart-beta title "Growth Rate Curves (n vs Operations)" x-axis ["n=1", "n=2", "n=4", "n=8", "n=16"] y-axis "Operations" 0 --> 300 line "O(1) Constant" [1, 1, 1, 1, 1] line "O(log n) Logarithmic" [0, 1, 2, 3, 4] line "O(n) Linear" [1, 2, 4, 8, 16] line "O(n log n) Linearithmic" [0, 2, 8, 24, 64] line "O(n^2) Quadratic" [1, 4, 16, 64, 256]
-
The Complexity Ladder Always aim as far left on this ladder as possible.
4. Mathematical Derivation of
- Logarithmic growth is incredibly efficient. Here is a step-by-step mathematical proof of why halving the search space (like in Binary Search) yields :
-
- Let the initial search space size be .
-
- Each iteration divides the search space by .
-
- After iterations, the remaining size is:
-
- The search terminates when the space is reduced to :
-
- Take the base-2 logarithm of both sides:
-
- Therefore, the number of steps , which we write as .
-
Logarithmic Growth Rule doubles, a logarithmic algorithm requires only one extra step.
Every time the input size
1 0 2 1 4 2 8 3 1,024 10 1,048,576 20 1,073,741,824 (1 billion) 30
- Since and is a constant, the bases only differ by a constant factor: This is why we drop the base in Big O notation.
Why the base of log doesn’t matter in Big O
5. Code Complexity Analysis — Step by Step
- Here is how to identify complexities by reading code patterns:
- Operations that don’t depend on input size at all:
🟢 Constant Time —
def get_first_element(arr):
# Always exactly 1 operation, regardless of len(arr)
return arr[0] if arr else None
def is_even(n):
# Single arithmetic check — constant time
return n % 2 == 0
def swap(arr, i, j):
# Three assignments — still O(1) since it's a fixed number
arr[i], arr[j] = arr[j], arr[i]- Each iteration cuts the problem in half (or by a constant fraction):
🟡 Logarithmic Time —
def binary_search(arr, target):
low, high = 0, len(arr) - 1
while low <= high:
mid = (low + high) // 2
if arr[mid] == target:
return mid
elif arr[mid] < target:
low = mid + 1 # Discard left half
else:
high = mid - 1 # Discard right half
return -1
# Search space: n → n/2 → n/4 → ... → 1
# Steps = log₂(n) → O(log n)
def count_digits(n):
count = 0
while n > 0:
n //= 10 # Divides by 10 each time
count += 1
return count
# Steps ≈ log₁₀(n) → O(log n)- A single pass through all elements:
🟠 Linear Time —
def find_max(arr):
max_val = arr[0]
for item in arr: # Loop runs n times
if item > max_val:
max_val = item
return max_val
def sum_array(arr):
total = 0
for x in arr: # One operation per element
total += x
return total- A nested loop where both loops scale with :
🔴 Quadratic Time —
def bubble_sort(arr):
n = len(arr)
for i in range(n): # Outer: n iterations
for j in range(n - i - 1): # Inner: up to n iterations
if arr[j] > arr[j + 1]:
arr[j], arr[j + 1] = arr[j + 1], arr[j]
# Total comparisons ≈ n*(n-1)/2 = O(n²)
def print_all_pairs(arr):
for i in range(len(arr)): # O(n)
for j in range(len(arr)): # O(n) for each i
print(arr[i], arr[j])
# Total: O(n × n) = O(n²)- Divide-and-conquer: split times, process elements each split:
🔵 Linearithmic Time —
def merge_sort(arr):
if len(arr) <= 1:
return arr
mid = len(arr) // 2
left = merge_sort(arr[:mid]) # T(n/2) — split left half
right = merge_sort(arr[mid:]) # T(n/2) — split right half
return merge(left, right) # O(n) — merge step
def merge(left, right):
result = []
i = j = 0
while i < len(left) and j < len(right):
if left[i] <= right[j]:
result.append(left[i]); i += 1
else:
result.append(right[j]); j += 1
result.extend(left[i:])
result.extend(right[j:])
return result
# Recurrence: T(n) = 2T(n/2) + O(n) → O(n log n)- Each call generates two more calls — the tree doubles at each level:
⚫ Exponential Time —
def fibonacci_naive(n):
# For each call, we make 2 more recursive calls
if n <= 1:
return n
return fibonacci_naive(n - 1) + fibonacci_naive(n - 2)
# Call tree has ~2^n nodes → O(2^n) time, O(n) space
def all_subsets(arr):
# Generates all 2^n subsets of arr
if not arr:
return [[]]
first = arr[0]
rest_subsets = all_subsets(arr[1:])
return rest_subsets + [[first] + s for s in rest_subsets]- Not all nested loops are . Always trace the actual iterations:
🟣 Tricky Multi-Loop Example
def mystery(n):
i = 1
while i < n:
i *= 2 # i doubles: 1, 2, 4, 8, ..., n
# Loop runs log₂(n) times → O(log n) ❗ NOT O(n)
def print_triangle(n):
for i in range(n): # n rows
for j in range(i + 1): # 1, 2, 3, ..., n iterations
print("*", end=" ")
# Total = 1 + 2 + ... + n = n(n+1)/2 = O(n²)6. Recurrence Relations & Master Theorem
- For recursive divide-and-conquer algorithms, we express time complexity as a recurrence relation:
- Where:
- : Number of subproblems at each recursion level.
- : Factor by which the input size shrinks.
- : Work done outside the recursive calls (splitting, merging, processing).
- For
merge_sort:T(n) = 2T(n/2) + O(n)
Step-by-Step: Reading a Recurrence
| Symbol | Meaning | Value |
|---|---|---|
| Split into 2 halves | Two recursive calls | |
| Each half is size | Problem halves each level | |
| Merging two sorted arrays | Linear merge step |
- The Master Theorem provides a shortcut to solve recurrences of this form:
-
The 3 Cases of the Master Theorem critical exponent:
First compute the
Case 1 — Recursive work dominates (subproblems take most time):
Case 2 — Work is balanced (split cost equals recursive cost):
Case 3 — Combining work dominates (merge/split takes most time):
Real-World Recurrence Examples
- Binary Search:
- , ,
- Critical exponent: , so
- → Case 2
- Merge Sort:
- , ,
- Critical exponent: , so
- → Case 2
- Strassen’s Matrix Multiply:
- , ,
- Critical exponent:
- → Case 1
- For :
Recursion Tree Method (Alternative to Master Theorem)
Level 0: n → cost: n
Level 1: n/2 n/2 → cost: n
Level 2: n/4 n/4 n/4 n/4 → cost: n
...
Level k: [2^k nodes, each n/2^k] → cost: n
(log n levels)
Total cost = n × log n = O(n log n)
7. Space Complexity & the Call Stack
- Space complexity measures the memory an algorithm uses relative to input size .
| Type | Definition |
|---|---|
| Input Space | Memory occupied by the input data itself |
| Auxiliary Space | Extra memory used by the algorithm (variables, call stack, data structures) |
| Total Space | Input Space + Auxiliary Space |
-
Interview Convention auxiliary space when they say "space complexity". We generally exclude the input itself.
Most algorithm interviews refer to
- When a function calls itself recursively, each active call occupies a stack frame on the call stack:
Call Stack Memory
Call Stack for: factorial(4)
┌─────────────────────┐
│ factorial(1) → 1 │ ← Top of stack (most recent)
├─────────────────────┤
│ factorial(2) │
├─────────────────────┤
│ factorial(3) │
├─────────────────────┤
│ factorial(4) │ ← Bottom of stack (first call)
└─────────────────────┘
Depth = 4 = n → Space: O(n)
Space Complexity Examples
# O(1) Space — Iterative: single stack frame, fixed variables
def sum_iterative(n):
total = 0
for i in range(1, n + 1):
total += i
return total
# O(n) Space — Recursive: n stack frames deep
def sum_recursive(n):
if n == 0:
return 0
return n + sum_recursive(n - 1)
# O(log n) Space — Recursive Binary Search: log n stack frames
def binary_search_recursive(arr, low, high, target):
if low > high:
return -1
mid = (low + high) // 2
if arr[mid] == target:
return mid
elif arr[mid] < target:
return binary_search_recursive(arr, mid + 1, high, target)
else:
return binary_search_recursive(arr, low, mid - 1, target)
# O(n) Space — Creating a new array
def double_array(arr):
return [x * 2 for x in arr] # New list of size nIterative vs Recursive Space
| Algorithm | Time | Space (Iterative) | Space (Recursive) |
|---|---|---|---|
| Factorial | |||
| Binary Search | |||
| Merge Sort | |||
| Fibonacci (naive) |
8. Amortized Analysis
- Some operations are expensive occasionally but cheap most of the time. Amortized Analysis averages the cost per operation across a long sequence to give a tighter bound than worst-case alone.
- When inserting elements into a dynamic array:
Example: Dynamic Array (Python list / C++ std::vector)
- Normal insert (space available): — just write to empty slot.
- Full array insert (no space): Allocate slots, copy all elements, then insert → one-time cost.
- After doubling, the next inserts are all again.
- Amortized Cost:
-
Banker's Argument insert as paying 2 coins:
Think of each
- 1 coin for itself
- 1 coin saved in a “doubling bank”
When doubling happens, you have coins saved → can afford the copy. Every insert’s amortized cost = 2 coins = .
Other Amortized Examples
| Data Structure | Operation | Amortized Cost | Worst Case |
|---|---|---|---|
| Dynamic Array | append | ||
Binary Heap (heapify) | Build from elements | total | naive |
| Splay Tree | Any single op | ||
| Union-Find (path compression) | find | ≈ |
9. Best / Average / Worst Case — When Do They Differ?
- Complexity can vary depending on the specific input. A complete analysis covers all three cases:
| Algorithm | Best Case | Average Case | Worst Case |
|---|---|---|---|
| Linear Search | (first element) | (not found) | |
| Binary Search | (midpoint is target) | ||
| Quick Sort | (balanced pivot) | (sorted array, bad pivot) | |
| Bubble Sort | (already sorted) | ||
| Hash Map lookup | (all keys collide) |
def linear_search(arr, target):
for i, item in enumerate(arr):
if item == target:
return i # Best case: O(1) if target is at index 0
return -1 # Worst case: O(n) if not found-
Quick Sort and Pivot Selection of Quick Sort occurs when the pivot always splits the array into (e.g., always picking the smallest element on a sorted array). Randomized Quick Sort avoids this by choosing pivots randomly → expected .
The worst case
Real-World Applications
- Databases store millions of rows. A full-table scan without an index is . Adding a B-Tree index transforms lookups to .
🗄️ Database Indexing
- Searching 1 million rows without index: ~ comparisons
- Searching with B-Tree index: ~ comparisons ()
- Trade brokers use hash maps ( amortized lookup) instead of balanced BSTs () to minimize microsecond-scale latency. At 10,000 trades/second, even can be a bottleneck.
⚡ High-Frequency Trading (HFT)
- Network routers handle millions of packets/second. They use Trie structures for prefix matching IP addresses in time (where bits for IPv4), completely independent of total number of routes .
🌐 Network Routing
- The security of RSA encryption relies on the fact that multiplying two large primes is but factoring the product back is believed to be exponential — the computational gap is the security guarantee.
🔒 Cryptography
- DNA sequence alignment uses Dynamic Programming algorithms like Smith-Waterman in time. For whole-genome alignment (billions of bases), approximations and heuristics reduce this to .
🧬 Bioinformatics
Key Takeaways
- Time & Space — Analyze operations performed (time) and extra memory allocated (space). Both matter.
- Big O is the Standard — In practice, Big O (worst case) is what engineers discuss. It’s the safety guarantee.
- Simplify ruthlessly — Drop constants and non-dominant terms. .
- Logarithmic is powerful — means doubling the input only costs 1 more step. Always prefer it over when possible.
- Recursion has hidden costs — Every recursive call adds a stack frame. A -level-deep recursion costs space.
- Amortized smooths spikes — When one expensive operation enables many cheap ones, the true per-operation cost is lower than it looks.
- Context matters — is fine for ; catastrophic for . Always estimate the expected input size.
graph TD A["Input: n"] --> B{"What does the algorithm do?"} B -->|"Fixed ops regardless of n"| C["O(1) — Constant"] B -->|"Halves problem each step"| D["O(log n) — Logarithmic"] B -->|"One pass through all n"| E["O(n) — Linear"] B -->|"Divide + merge each level"| F["O(n log n) — Linearithmic"] B -->|"Nested loop over n"| G["O(n²) — Quadratic"] B -->|"Two branches per call"| H["O(2^n) — Exponential"]
More Learn
-
- Big-O Cheat Sheet — Quick reference for all complexity classes with data structures
-
- Khan Academy: Asymptotic Notation — Thorough introduction to algorithm math
-
- Master Theorem - Wikipedia — In-depth proof and edge cases
-
- Visualgo — Algorithm Visualizer — Interactive visual animations for sorting and searching
-
- MIT OpenCourseWare — Introduction to Algorithms — Full lecture series with problem sets