Phase 4 — Godot Engine Internals

  • How Godot’s rendering architecture works and where a path tracer plugs in.
  • Based on the NVPathtracer contributor discussions — this is the real integration challenge.

4.1 Godot Rendering Architecture

  • Godot 4 uses a layered rendering system
    • RenderingServer — high-level API (scene objects, materials, lights)
    • RenderingDevice (RD) — low-level GPU API (Vulkan wrapper)
    • RendererSceneRender — abstract base for render pipelines
    • RendererSceneRenderRD — RD-based implementation
    • RenderSceneBuffersRD — per-viewport render targets and buffers
  • The NVPathtracer hooks into RendererSceneRenderRD
    • Overrides _render_scene() to replace rasterization with path tracing
    • Uses RenderingDevice directly for Vulkan RT commands
  • Key source files in Godot
    • servers/rendering/renderer_rd/renderer_scene_render_rd.cpp
    • servers/rendering/renderer_rd/storage_rd/render_scene_buffers_rd.cpp
    • servers/rendering/rendering_device.cpp
    • servers/rendering/renderer_rd/shaders/ — all GLSL shaders
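
The layering above can be illustrated with a toy class hierarchy. This is only a sketch of the override pattern, not Godot code: SceneRenderBase, RasterSceneRender, PathTraceSceneRender and render_frame are hypothetical stand-ins, and the logged step names are invented for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the abstract scene renderer layer.
struct SceneRenderBase {
    virtual ~SceneRenderBase() = default;

    // Public entry point: shared setup first, then the backend-specific pass.
    void render(std::vector<std::string> &log) {
        log.push_back("setup_buffers");
        _render_scene(log);
    }

protected:
    // The hook a concrete renderer overrides.
    virtual void _render_scene(std::vector<std::string> &log) = 0;
};

// Hypothetical stand-in for the rasterizing implementation.
struct RasterSceneRender : SceneRenderBase {
protected:
    void _render_scene(std::vector<std::string> &log) override {
        log.push_back("depth_prepass");
        log.push_back("opaque_pass");
    }
};

// Hypothetical path-tracing renderer: only the scene pass is swapped out.
struct PathTraceSceneRender : SceneRenderBase {
protected:
    void _render_scene(std::vector<std::string> &log) override {
        log.push_back("build_tlas");
        log.push_back("trace_rays");
    }
};

// Runs one frame and returns the recorded steps in order.
inline std::vector<std::string> render_frame(SceneRenderBase &renderer) {
    std::vector<std::string> log;
    renderer.render(log);
    assert(!log.empty());  // the shared setup step is always recorded
    return log;
}
```

Calling render_frame() on either renderer records the same setup step followed by that renderer's own scene pass, which is the property the override relies on.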

4.2 RenderingDevice API

  • Godot’s abstraction over Vulkan (and other backends)
  • Buffer management
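    • RD.storage_buffer_create(size_bytes, data, usage): allocate a GPU storage buffer (vertex_buffer_create(), index_buffer_create() and uniform_buffer_create() cover the other buffer kinds)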
    • RD.buffer_update(buffer, offset, size, data) — upload CPU data
    • RD.buffer_get_data(buffer) — readback (slow, avoid in hot path)
    • Usage flags: BUFFER_USAGE_STORAGE_BIT, BUFFER_USAGE_VERTEX_BIT, etc.
  • Texture management
    • RD.texture_create(format, view, data) — create texture
    • RD.texture_update(texture, layer, data) — upload texture data
    • RD.texture_get_data(texture, layer) — readback
  • Compute pipelines
    • RD.shader_compile_spirv_from_source(source) compiles GLSL to SPIR-V; RD.shader_create_from_spirv(spirv) creates the shader object from that SPIR-V
    • RD.compute_pipeline_create(shader) — create pipeline
    • Dispatch sequence: RD.compute_list_begin() → RD.compute_list_bind_*() → RD.compute_list_dispatch() → RD.compute_list_end()
  • Uniform sets
    • RD.uniform_set_create(uniforms, shader, set_index)
    • Uniforms: RDUniform with uniform_type, binding, ids
  • Ray tracing (Vulkan extension, not yet in stable RD API)
    • NVPathtracer calls Vulkan directly via vkGetDeviceProcAddr
    • vkCmdBuildAccelerationStructuresKHR, vkCmdTraceRaysKHR
    • Requires bypassing RenderingDevice for RT-specific commands
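
One detail of the compute path that is easy to get wrong is the group count handed to compute_list_dispatch(). The helper below is a minimal sketch of the usual round-up calculation. DispatchSize, ceil_div and dispatch_size_for_image are hypothetical names, and the local sizes are assumed to match the layout(local_size_x = ..., local_size_y = ...) declaration in the shader.

```cpp
#include <cassert>
#include <cstdint>

// Number of workgroups along each axis of a compute dispatch.
struct DispatchSize {
    uint32_t x_groups;
    uint32_t y_groups;
    uint32_t z_groups;
};

// Integer division that rounds up, so a partially covered tile at the
// right or bottom edge still gets a workgroup.
inline uint32_t ceil_div(uint32_t value, uint32_t divisor) {
    assert(divisor > 0);
    return (value + divisor - 1) / divisor;
}

// One invocation per pixel; the shader is expected to bounds-check against
// the image size because the last row/column of groups may overhang it.
inline DispatchSize dispatch_size_for_image(uint32_t width, uint32_t height,
                                            uint32_t local_size_x,
                                            uint32_t local_size_y) {
    return DispatchSize{ ceil_div(width, local_size_x),
                         ceil_div(height, local_size_y),
                         1 };
}
```

The three fields would be passed as the x, y and z group counts of the dispatch call.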

4.3 RenderSceneBuffers

  • Manages all per-viewport render targets
  • Key buffers in a path tracer context
    • Accumulation buffer — stores running average of path traced samples
    • Albedo buffer — demodulated color for denoiser
    • Normal buffer — world-space normals for denoiser
    • Motion vector buffer — for temporal reprojection
    • Depth buffer — for TAA and denoiser
  • Accessing existing Godot buffers
    • render_scene_buffers.get_depth_texture(): scene depth (still holds the previous frame's values until this frame's pre-pass runs)
    • render_scene_buffers.get_velocity_texture() — motion vectors
    • These are already populated by Godot’s pre-pass
  • Creating custom buffers
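    RD::TextureFormat tf;  // engine-side struct; RDTextureFormat is the script-facing wrapper
    tf.texture_type = RD::TEXTURE_TYPE_2D;
    tf.format = RD::DATA_FORMAT_R32G32B32A32_SFLOAT;
    tf.width  = viewport_width;
    tf.height = viewport_height;
    // Storage for compute/RT writes, sampling for the pass that displays the result.
    tf.usage_bits = RD::TEXTURE_USAGE_STORAGE_BIT | RD::TEXTURE_USAGE_SAMPLING_BIT;
    RID accum_buffer = RD::get_singleton()->texture_create(tf, RD::TextureView());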
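
The "running average" kept in the accumulation buffer can be sketched on the CPU as follows. This is an illustration only: Rgba and AccumulationBuffer are hypothetical types, a real implementation would do the same update per pixel in a shader, and resetting the sample count when the camera or scene changes is an assumption about typical usage rather than something taken from NVPathtracer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One RGBA32F texel, matching DATA_FORMAT_R32G32B32A32_SFLOAT (16 bytes).
struct Rgba {
    float r, g, b, a;
};

// Hypothetical CPU stand-in for the accumulation texture.
struct AccumulationBuffer {
    uint32_t width;
    uint32_t height;
    uint32_t sample_count = 0;
    std::vector<Rgba> pixels;

    AccumulationBuffer(uint32_t w, uint32_t h)
        : width(w), height(h), pixels(size_t(w) * h, Rgba{0, 0, 0, 0}) {}

    size_t size_bytes() const { return pixels.size() * sizeof(Rgba); }

    // Assumed policy: restart accumulation when the view becomes stale.
    void reset() { sample_count = 0; }

    // Incremental mean: avg += (sample - avg) / (n + 1).
    // With n == 0 the weight is 1, so the first frame overwrites old data.
    void add_frame(const std::vector<Rgba> &frame) {
        assert(frame.size() == pixels.size());
        const float weight = 1.0f / float(sample_count + 1);
        for (size_t i = 0; i < pixels.size(); ++i) {
            pixels[i].r += (frame[i].r - pixels[i].r) * weight;
            pixels[i].g += (frame[i].g - pixels[i].g) * weight;
            pixels[i].b += (frame[i].b - pixels[i].b) * weight;
            pixels[i].a += (frame[i].a - pixels[i].a) * weight;
        }
        ++sample_count;
    }
};
```

The incremental form avoids storing a separate sum and sample count per pixel, at the cost of one divide per frame.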

4.4 Acceleration Structure Management in Godot

  • Godot doesn’t have built-in BLAS/TLAS management (as of 4.x)
  • NVPathtracer implements its own
    • Iterates RenderingServer mesh instances
    • Extracts vertex/index buffers via RenderingServer.mesh_surface_get_arrays()
    • Builds BLAS per mesh, TLAS per frame
  • Mesh data extraction
    // mesh_surface_get_arrays() returns one Array per surface, indexed by RS::ArrayType.
    Array arrays = RS::get_singleton()->mesh_surface_get_arrays(mesh_rid, surface_idx);
    PackedVector3Array vertices = arrays[RS::ARRAY_VERTEX];
    PackedInt32Array   indices  = arrays[RS::ARRAY_INDEX];  // empty for non-indexed surfaces
    PackedVector3Array normals  = arrays[RS::ARRAY_NORMAL];
    PackedVector2Array uvs      = arrays[RS::ARRAY_TEX_UV];
  • Skinned mesh challenge
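    • Skinned meshes deform every frame, so their BLAS needs a rebuild or an in-place update (refit) each frame
    • Update path: build with VK_BUILD_ACCELERATION_STRUCTURE_ALLOW_UPDATE_BIT_KHR, then call vkCmdBuildAccelerationStructuresKHR with VK_BUILD_ACCELERATION_STRUCTURE_MODE_UPDATE_KHR
    • Each update needs a scratch buffer of at least updateScratchSize bytes, so keep one allocated instead of reallocating per frame
    • Alternative: deform vertices in a compute shader, then do a full BLAS rebuild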
  • Instance data buffer
    • Custom buffer storing per-instance material data, transform, etc.
    • Indexed by gl_InstanceCustomIndexEXT in the closest-hit shader
    struct InstanceData {
        uint64_t vertex_address;  // buffer device address
        uint64_t index_address;
        uint64_t material_address;
        mat4     transform;
        mat4     transform_inv_transpose;  // for normals
    };
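
When this struct is filled on the CPU, the host-side layout has to match what the shader reads. The sketch below assumes the shader declares the buffer with std430 rules, which place a mat4 on a 16-byte boundary and therefore leave 8 bytes of padding after the three addresses. Mat4, InstanceDataCpu and instance_offset_bytes are hypothetical names; under a scalar block layout the offsets would differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// 4x4 float matrix aligned like a GLSL mat4 under std430 (assumed layout).
struct alignas(16) Mat4 {
    float m[16];
};

// Hypothetical CPU-side mirror of the shader's InstanceData.
struct InstanceDataCpu {
    uint64_t vertex_address;    // offset 0
    uint64_t index_address;     // offset 8
    uint64_t material_address;  // offset 16, followed by 8 bytes of padding
    Mat4 transform;                // offset 32
    Mat4 transform_inv_transpose;  // offset 96
};

// Fail at compile time if the compiler lays the struct out differently.
static_assert(offsetof(InstanceDataCpu, material_address) == 16, "unexpected layout");
static_assert(offsetof(InstanceDataCpu, transform) == 32, "mat4 must be 16-byte aligned");
static_assert(offsetof(InstanceDataCpu, transform_inv_transpose) == 96, "unexpected layout");
static_assert(sizeof(InstanceDataCpu) == 160, "stride must match the shader-side array stride");

// Byte offset of one instance inside the instance data buffer.
inline size_t instance_offset_bytes(size_t index, size_t instance_count) {
    assert(index < instance_count);
    return index * sizeof(InstanceDataCpu);
}
```

Keeping the checks as static_assert means a layout mismatch shows up at build time rather than as corrupted transforms on screen.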

4.5 Material System Integration

  • Godot materials: StandardMaterial3D / BaseMaterial3D
  • Extracting material properties for RT shaders
    • material.albedo_color → base color
    • material.roughness → GGX roughness
    • material.metallic → metallic factor
    • material.emission → emissive color
    • Texture RIDs → need to pass to RT shader as bindless array
  • Bindless texture array
    • Collect all material textures into a descriptor array
    • Pass array index per instance in InstanceData
    • In closest-hit shader: texture(textures[nonuniformEXT(inst.albedo_tex_idx)], uv)
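
Collecting textures into one array and handing out indices can be sketched as a small deduplicating table. This is an illustration under assumptions: TextureId stands in for a Godot texture RID, BindlessTextureTable is a hypothetical helper, and the capacity is assumed to equal the size of the descriptor array declared in the shader.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Stand-in for a texture handle (a Godot RID in the real renderer).
using TextureId = uint64_t;

// Hypothetical helper that assigns each distinct texture one array slot.
class BindlessTextureTable {
public:
    explicit BindlessTextureTable(size_t capacity) : capacity_(capacity) {}

    // Returns the slot for `texture`, appending it on first use so that
    // materials sharing a texture also share an index.
    uint32_t index_of(TextureId texture) {
        auto found = slots_.find(texture);
        if (found != slots_.end()) {
            return found->second;
        }
        assert(textures_.size() < capacity_);  // descriptor array is full
        const uint32_t slot = static_cast<uint32_t>(textures_.size());
        textures_.push_back(texture);
        slots_.emplace(texture, slot);
        return slot;
    }

    // Slot-ordered list, in the order it would be written to the descriptor array.
    const std::vector<TextureId> &textures() const { return textures_; }

private:
    size_t capacity_;
    std::vector<TextureId> textures_;
    std::unordered_map<TextureId, uint32_t> slots_;
};
```

The index returned by index_of() is what would be stored per instance and used inside nonuniformEXT() in the closest-hit shader.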

4.6 Integration Points

  • Where the path tracer hooks in
    • _render_scene() — main render call, replace with RT dispatch
    • _render_shadow_pass() — can skip (path tracer handles shadows)
    • _post_opaque_render_step() — good place for denoising
  • GDExtension vs engine fork
    • NVPathtracer is an engine fork (modifies core rendering code)
    • GDExtension cannot access RenderingDevice internals directly
    • Future: Godot may expose more RT hooks via GDExtension
    • Workaround: use RenderingServer.call_on_render_thread() for some operations
  • Shader compilation
    • Godot compiles its GLSL shaders to SPIR-V at runtime and caches the result
    • RT shaders need GL_EXT_ray_tracing extension
    • Godot’s shader compiler may need patching to support RT extensions
    • Alternative: precompile RT shaders with glslangValidator or shaderc
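
If RT shaders are precompiled offline, the engine side only has to load the resulting .spv bytes and sanity-check them before creating a shader module. The function below is a minimal sketch of that check. spirv_words_from_bytes and kSpirvMagic are hypothetical names, and the code assumes the file and the host share the same byte order.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// First word of every SPIR-V module.
constexpr uint32_t kSpirvMagic = 0x07230203u;

// A SPIR-V header is five 32-bit words: magic, version, generator, bound, schema.
constexpr size_t kSpirvHeaderWords = 5;

// Converts raw file bytes into SPIR-V words.
// Returns an empty vector if the data cannot be a SPIR-V module.
inline std::vector<uint32_t> spirv_words_from_bytes(const std::vector<uint8_t> &bytes) {
    if (bytes.size() < kSpirvHeaderWords * sizeof(uint32_t) ||
        bytes.size() % sizeof(uint32_t) != 0) {
        return {};
    }
    std::vector<uint32_t> words(bytes.size() / sizeof(uint32_t));
    assert(words.size() * sizeof(uint32_t) == bytes.size());
    // memcpy avoids unaligned reads and strict-aliasing problems.
    std::memcpy(words.data(), bytes.data(), bytes.size());
    if (words[0] != kSpirvMagic) {
        return {};
    }
    return words;
}
```

This only rejects obviously wrong input; it does not validate the module, which is the job of spirv-val or the driver.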

4.7 Frame Synchronization

  • Godot’s render thread is separate from the main thread
  • Use RenderingServer.call_on_render_thread() to execute code on render thread
  • TLAS rebuild must happen before vkCmdTraceRaysKHR
  • Barrier between TLAS build and RT dispatch:
    VkMemoryBarrier barrier{};
    barrier.sType = VK_STRUCTURE_TYPE_MEMORY_BARRIER;  // required; zero-init leaves the wrong sType
    barrier.srcAccessMask = VK_ACCESS_ACCELERATION_STRUCTURE_WRITE_BIT_KHR;
    barrier.dstAccessMask = VK_ACCESS_ACCELERATION_STRUCTURE_READ_BIT_KHR;
    // Make the TLAS build's writes visible to ray tracing shader reads.
    vkCmdPipelineBarrier(cmd,
        VK_PIPELINE_STAGE_ACCELERATION_STRUCTURE_BUILD_BIT_KHR,
        VK_PIPELINE_STAGE_RAY_TRACING_SHADER_BIT_KHR,
        0, 1, &barrier, 0, nullptr, 0, nullptr);
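
The main-thread to render-thread handoff can be pictured as a queue of deferred callables that the render thread drains in submission order. The class below is a toy model of that pattern, not Godot's implementation: RenderThreadQueue and flush() are hypothetical, and only the method name call_on_render_thread mirrors the RenderingServer call mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical, simplified model of a render-thread command queue.
class RenderThreadQueue {
public:
    // Callable from any thread: defers `task` until the next flush().
    void call_on_render_thread(std::function<void()> task) {
        assert(task != nullptr);
        std::lock_guard<std::mutex> lock(mutex_);
        pending_.push_back(std::move(task));
    }

    // Called by the render thread: runs queued tasks in submission order
    // and returns how many were executed.
    size_t flush() {
        std::vector<std::function<void()>> tasks;
        {
            // Swap under the lock, run outside it, so tasks may enqueue more work.
            std::lock_guard<std::mutex> lock(mutex_);
            tasks.swap(pending_);
        }
        for (std::function<void()> &task : tasks) {
            task();
        }
        return tasks.size();
    }

private:
    std::mutex mutex_;
    std::vector<std::function<void()>> pending_;
};
```

Because tasks run in submission order, queuing the TLAS build before the trace dispatch preserves the CPU-side ordering; the GPU-side ordering still needs the pipeline barrier shown above.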

4.8 Key Chat Insights (NVPathtracer Discussion)

  • PathTracer Learning - Chat Analysis
    • The contributors discussed several key challenges
    • TLAS rebuild strategy — full rebuild vs incremental update
    • Material system — how to pass PBR material data to RT shaders
    • Denoising integration — DLSS vs OIDN vs temporal accumulation
    • Performance — async compute for BLAS builds
    • Shader hot-reloading — important for development iteration speed

Phase 4 Checklist

  • Read and understand renderer_scene_render_rd.cpp
  • Understand how RenderSceneBuffersRD manages textures
  • Trace how a mesh gets from MeshInstance3D to GPU buffers
  • Implement BLAS creation from Godot mesh data
  • Implement TLAS with Godot scene instances
  • Hook path tracer into _render_scene()
  • Pass material data to RT shaders via instance custom index
  • Set up bindless texture array for material textures
  • Implement accumulation buffer management