Concept: BVH Traversal

Parent: PathTracer Learning - Phase 2 - CPU Ray Tracing

Basic Traversal Algorithm

Recursive version (conceptual)

float traverse(Ray ray, BVHNode node):
    if not intersect(ray, node.aabb): return infinity
    if node.is_leaf:
        return intersect_primitives(ray, node.primitives)
    t_left  = traverse(ray, node.left)
    t_right = traverse(ray, node.right)
    return min(t_left, t_right)

Iterative version (GPU-friendly — no recursion)

Use an explicit stack (array of node indices)

float traverseBVH(Ray ray) {
    int stack[64];
    int stackPtr = 0;
    stack[stackPtr++] = 0;  // root node
    float tMin = infinity;
    
    while (stackPtr > 0) {
        int nodeIdx = stack[--stackPtr];
        BVHNode node = bvhNodes[nodeIdx];
        
        float t;
        if (!intersectAABB(ray, node.aabb_min, node.aabb_max, t)) continue;
        if (t > tMin) continue;  // already found closer hit
        
        if (node.prim_count > 0) {  // leaf
            for (int i = 0; i < node.prim_count; i++) {
                float tHit = intersectTriangle(ray, node.prim_offset + i);
                tMin = min(tMin, tHit);
            }
        } else {  // internal
            stack[stackPtr++] = node.left_child;
            stack[stackPtr++] = node.left_child + 1;  // right child
        }
    }
    return tMin;
}

Ordered Traversal (Optimization)

Visit the closer child first
If closer child has a hit, the farther child may be skipped entirely
How to determine closer child
- Compare AABB intersection distances: t_left vs t_right
- Push farther child first (so closer child is on top of stack)
Significant speedup for primary rays (coherent rays)
Less benefit for secondary rays (incoherent)

Shadow Ray Optimization

Shadow rays only need to know IF there’s a hit, not WHICH hit
Use gl_RayFlagsTerminateOnFirstHitEXT in Vulkan
In software BVH: return immediately on first primitive hit
Avoids traversing the entire BVH for occlusion queries

Stack Overflow

BVH depth can exceed stack size for degenerate trees
Typical max depth: 30-40 for well-built BVH
Stack size of 64 is usually sufficient
Degenerate case: all primitives in a line → depth = N (use SAH to avoid)

GPU Traversal Considerations

Warp divergence
- Different threads in a warp may take different BVH paths
- Reduces GPU utilization (some threads idle while others traverse)
- Hardware RT (Vulkan) handles this internally — major advantage
Memory access patterns
- BVH nodes should be in a contiguous buffer (good cache behavior)
- Leaf primitives should be sorted to match BVH order
Persistent threads
- Keep threads alive across multiple rays (avoid kernel launch overhead)
- Used in production GPU path tracers

Help keep this knowledge base fast, ad-free, and expanding! We are currently at 0% of our domain & hosting goal.

Code Notes made by Vaibhav Rathod

Guest Mode

User

Roadmap & Features

My Bookmarks

Table of Contents

Explorer

PathTracer Learning - Concept - BVH Traversal

Concept: BVH Traversal

Basic Traversal Algorithm

Ordered Traversal (Optimization)

Shadow Ray Optimization

Stack Overflow

GPU Traversal Considerations

Enjoying the Notes?

Graph View

Backlinks

Recently Updated

A Search Algorithm

PathTracer Learning

PathTracer Learning - Chat Analysis

PathTracer Learning - Phase 3 - GPU and Vulkan

Arrays

Circular buffer

Dijkstras Algorithm

DSA Algo & System Design

Hash Tables

boruvkas-algorithm

boyer-moore-majority-vote

burrows-wheeler-transform

continuous-delivery

continuous-integration

convex-hull-divide-and-conquer

convex-hull-set2-graham-scan

cryptography

dekkers-algorithm

Table of Contents

Explorer

PathTracer Learning - Concept - BVH Traversal

Concept: BVH Traversal

Basic Traversal Algorithm

Ordered Traversal (Optimization)

Shadow Ray Optimization

Stack Overflow

GPU Traversal Considerations

Related

Enjoying the Notes?

Graph View

Backlinks

Recently Updated