Phase 4 — Godot Engine Internals
- How Godot’s rendering architecture works and where a path tracer plugs in.
- Based on the NVPathtracer contributor discussions — this is the real integration challenge.
4.1 Godot Rendering Architecture
- Godot 4 uses a layered rendering system
  - RenderingServer: high-level API (scene objects, materials, lights)
  - RenderingDevice (RD): low-level GPU API (Vulkan wrapper)
  - RendererSceneRender: abstract base for render pipelines
  - RendererSceneRenderRD: RD-based implementation
  - RenderSceneBuffersRD: per-viewport render targets and buffers
- The NVPathtracer hooks into RendererSceneRenderRD
  - Overrides _render_scene() to replace rasterization with path tracing
  - Uses RenderingDevice directly for Vulkan RT commands
- Key source files in Godot
  - servers/rendering/renderer_rd/shaders/: all GLSL shaders
4.2 RenderingDevice API
- Godot’s abstraction over Vulkan (and other backends)
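- Buffer management
  - RD.buffer_create(size, usage, data): allocate GPU buffer (the public RenderingDevice API exposes this per buffer type: storage_buffer_create, uniform_buffer_create, vertex_buffer_create, index_buffer_create)
  - RD.buffer_update(buffer, offset, size, data): upload CPU data
  - RD.buffer_get_data(buffer): readback (slow, avoid in hot path)
  - Usage flags: BUFFER_USAGE_STORAGE_BIT, BUFFER_USAGE_VERTEX_BIT, etc.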
- Texture management
  - RD.texture_create(format, view, data): create texture
  - RD.texture_update(texture, layer, data): upload texture data
  - RD.texture_get_data(texture, layer): readback
- Compute pipelines
  - RD.shader_create_from_spirv(spirv): create a shader from already compiled SPIR-V
  - RD.compute_pipeline_create(shader): create pipeline
  - RD.compute_list_begin() → RD.compute_list_bind_*() → RD.compute_list_dispatch() → RD.compute_list_end()
- Uniform sets
  - Uniforms: RDUniform with uniform_type, binding, ids
- Ray tracing (Vulkan extension, not yet in stable RD API)
  - NVPathtracer calls Vulkan directly via vkGetDeviceProcAddr
  - vkCmdBuildAccelerationStructuresKHR, vkCmdTraceRaysKHR
  - Requires bypassing RenderingDevice for RT-specific commands
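The dispatch step of the compute list takes workgroup counts, not pixel counts, so a full-viewport pass has to round up. A minimal sketch, assuming an 8x8 local workgroup size in the shader; group_count, DispatchSize and dispatch_size_for_viewport are hypothetical helpers for illustration, not Godot API:

```cpp
#include <cstdint>

// Hypothetical helper (not Godot API): number of workgroups needed to cover
// `extent` items when each workgroup handles `local_size` items (rounds up).
constexpr uint32_t group_count(uint32_t extent, uint32_t local_size) {
    return (extent + local_size - 1) / local_size;
}

// Workgroup counts for one dispatch.
struct DispatchSize {
    uint32_t x;
    uint32_t y;
    uint32_t z;
};

// Assumes the compute shader declares local_size_x = local_x and
// local_size_y = local_y (8x8 is an assumption, not taken from the source).
constexpr DispatchSize dispatch_size_for_viewport(uint32_t width, uint32_t height,
                                                  uint32_t local_x = 8,
                                                  uint32_t local_y = 8) {
    return DispatchSize{ group_count(width, local_x), group_count(height, local_y), 1 };
}
```

The resulting x, y and z values are what would be passed as group counts to compute_list_dispatch(). The shader still needs a bounds check, because edge workgroups can extend past the viewport.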
4.3 RenderSceneBuffers
- Manages all per-viewport render targets
- Key buffers in a path tracer context
  - Accumulation buffer: stores running average of path traced samples
  - Albedo buffer: demodulated color for denoiser
  - Normal buffer: world-space normals for denoiser
  - Motion vector buffer: for temporal reprojection
  - Depth buffer: for TAA and denoiser
- Accessing existing Godot buffers
  - render_scene_buffers.get_depth_texture(): depth from previous frame
  - render_scene_buffers.get_velocity_texture(): motion vectors
  - These are already populated by Godot’s pre-pass
- Creating custom buffers
    // Describe a full-precision RGBA target that compute/RT shaders can write
    // (storage) and later passes can sample.
    RD::TextureFormat tf;
    tf.format = RD::DATA_FORMAT_R32G32B32A32_SFLOAT;
    tf.width = viewport_width;
    tf.height = viewport_height;
    tf.usage_bits = RD::TEXTURE_USAGE_STORAGE_BIT | RD::TEXTURE_USAGE_SAMPLING_BIT;
    RID accum_buffer = RD::get_singleton()->texture_create(tf, RD::TextureView());
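The "running average" kept in the accumulation buffer can be written as an incremental mean, so no per-pixel sum or sample history is needed. A minimal CPU-side sketch of the per-pixel update; Rgb and accumulate are hypothetical names, and in practice this arithmetic would run in the shader that writes the accumulation texture:

```cpp
#include <cstdint>

// One pixel of linear radiance.
struct Rgb {
    float r;
    float g;
    float b;
};

// Incremental mean: after processing sample_index = 0..n-1, the returned
// value is the average of those n samples.
// sample_index is 0-based and counts samples since the last reset.
inline Rgb accumulate(Rgb accum, Rgb sample, uint32_t sample_index) {
    const float w = 1.0f / static_cast<float>(sample_index + 1);
    return Rgb{
        accum.r + (sample.r - accum.r) * w,
        accum.g + (sample.g - accum.g) * w,
        accum.b + (sample.b - accum.b) * w,
    };
}
```

With sample_index = 0 the weight is 1, so the first sample overwrites whatever was in the buffer. Resetting the index to 0 (for example when the camera or scene changes) therefore restarts accumulation without a separate clear pass.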
4.4 Acceleration Structure Management in Godot
- Godot doesn’t have built-in BLAS/TLAS management (as of 4.x)
- NVPathtracer implements its own
  - Iterates RenderingServer mesh instances
  - Extracts vertex/index buffers via RenderingServer.mesh_get_surface_arrays()
  - Builds BLAS per mesh, TLAS per frame
- Mesh data extraction
    Array arrays = RS::get_singleton()->mesh_get_surface_arrays(mesh_rid, surface_idx);
    PackedVector3Array vertices = arrays[RS::ARRAY_VERTEX];
    PackedInt32Array indices = arrays[RS::ARRAY_INDEX];
    PackedVector3Array normals = arrays[RS::ARRAY_NORMAL];
    PackedVector2Array uvs = arrays[RS::ARRAY_TEX_UV];
- Skinned mesh challenge
  - Skinned meshes need BLAS rebuild every frame (or BLAS update)
  - ALLOW_UPDATE flag + vkCmdBuildAccelerationStructuresKHR with UPDATE mode
  - Requires scratch buffer to be kept alive
  - Alternative: use compute shader to deform vertices, then rebuild BLAS
- Instance data buffer
  - Custom buffer storing per-instance material data, transform, etc.
  - Indexed by gl_InstanceCustomIndexEXT in the closest-hit shader
    struct InstanceData {
        uint64_t vertex_address;       // buffer device address
        uint64_t index_address;
        uint64_t material_address;
        mat4 transform;
        mat4 transform_inv_transpose;  // for normals
    };
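When the instance buffer is filled from the CPU, the host-side struct has to match the layout the shader sees. A sketch of a matching C++ struct, assuming the GPU-side InstanceData lives in a std430 storage buffer, where a mat4 is aligned to 16 bytes; under that assumption the three 8-byte addresses (24 bytes) are followed by 8 bytes of padding. Mat4, InstanceDataHost and _pad0 are hypothetical names, and a different layout qualifier (for example scalar block layout) would change the offsets:

```cpp
#include <cstddef>
#include <cstdint>

// Plain 4x4 float matrix, column-major, 64 bytes.
struct Mat4 {
    float m[16];
};

// Host-side mirror of the shader's InstanceData.
// Assumption: the shader reads it with std430 rules (mat4 aligned to 16).
struct alignas(16) InstanceDataHost {
    uint64_t vertex_address;    // buffer device address
    uint64_t index_address;
    uint64_t material_address;
    uint64_t _pad0;             // explicit padding so `transform` starts at 32
    Mat4 transform;
    Mat4 transform_inv_transpose;  // for normals
};

// Compile-time checks that the host layout matches the assumed GPU layout.
static_assert(offsetof(InstanceDataHost, material_address) == 16, "unexpected offset");
static_assert(offsetof(InstanceDataHost, transform) == 32, "unexpected offset");
static_assert(offsetof(InstanceDataHost, transform_inv_transpose) == 96, "unexpected offset");
static_assert(sizeof(InstanceDataHost) == 160, "unexpected size");
```

Making the padding an explicit field keeps the array stride (160 bytes here) visible in the code, and the static_asserts fail at compile time if a field is added without updating the shader side.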
4.5 Material System Integration
- Godot materials: StandardMaterial3D / BaseMaterial3D
- Extracting material properties for RT shaders
  - material.albedo_color → base color
  - material.roughness → GGX roughness
  - material.metallic → metallic factor
  - material.emission → emissive color
  - Texture RIDs → need to pass to RT shader as bindless array
- Bindless texture array
  - Collect all material textures into a descriptor array
  - Pass array index per instance in InstanceData
  - In closest-hit shader: texture(textures[nonuniformEXT(inst.albedo_tex_idx)], uv), where albedo_tex_idx is an extra index field not shown in the InstanceData struct above
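The "collect textures, pass an index" step can be sketched on the CPU side as a small table that gives each distinct texture one slot in the descriptor array and returns that slot for storage in the per-instance data. This is an illustrative sketch only: BindlessTextureTable and TextureId are hypothetical, and TextureId stands in for whatever handle identifies a texture (for example a Godot RID id):

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

// Stand-in for a texture handle (assumption: a 64-bit id is available).
using TextureId = uint64_t;

// Assigns each distinct texture a stable slot in the bindless array.
class BindlessTextureTable {
public:
    // Returns the array slot for `tex`, appending it on first use so that
    // materials sharing a texture also share a slot.
    uint32_t index_of(TextureId tex) {
        auto it = slots_.find(tex);
        if (it != slots_.end()) {
            return it->second;
        }
        const uint32_t idx = static_cast<uint32_t>(textures_.size());
        textures_.push_back(tex);
        slots_.emplace(tex, idx);
        return idx;
    }

    // Textures in slot order; this is the order the descriptor array
    // would be filled in.
    const std::vector<TextureId>& textures() const { return textures_; }

private:
    std::vector<TextureId> textures_;
    std::unordered_map<TextureId, uint32_t> slots_;
};
```

The index returned by index_of() is the value the closest-hit shader would use to index the texture array, and textures() gives the order in which the descriptors need to be written.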
4.6 Integration Points
- Where the path tracer hooks in
  - _render_scene(): main render call, replace with RT dispatch
  - _render_shadow_pass(): can skip (path tracer handles shadows)
  - _post_opaque_render_step(): good place for denoising
- GDExtension vs engine fork
  - NVPathtracer is an engine fork (modifies core rendering code)
  - GDExtension cannot access RenderingDevice internals directly
  - Future: Godot may expose more RT hooks via GDExtension
  - Workaround: use RenderingServer.call_on_render_thread() for some operations
- Shader compilation
  - Godot compiles GLSL to SPIR-V at startup
  - RT shaders need GL_EXT_ray_tracing extension
  - Godot’s shader compiler may need patching to support RT extensions
  - Alternative: precompile RT shaders with glslangValidator or shaderc
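If the RT shaders are precompiled offline, the engine side only has to load the resulting SPIR-V binary and hand it to shader creation. A minimal loader sketch; load_spirv is a hypothetical helper, the file path is supplied by the caller, and the code assumes a little-endian host reading a little-endian file. The only format knowledge it relies on is that a SPIR-V module is a stream of 32-bit words starting with the magic number 0x07230203:

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Loads a precompiled SPIR-V module as 32-bit words.
// Throws std::runtime_error if the file is missing, has a size that is not
// a multiple of 4 bytes, or does not start with the SPIR-V magic number.
inline std::vector<uint32_t> load_spirv(const std::string& path) {
    // Open at the end so tellg() reports the file size.
    std::ifstream file(path, std::ios::binary | std::ios::ate);
    if (!file) {
        throw std::runtime_error("cannot open " + path);
    }
    const std::streamsize size = static_cast<std::streamsize>(file.tellg());
    if (size < 4 || size % 4 != 0) {
        throw std::runtime_error("not a SPIR-V file (bad size): " + path);
    }

    std::vector<uint32_t> words(static_cast<std::size_t>(size) / 4);
    file.seekg(0, std::ios::beg);
    file.read(reinterpret_cast<char*>(words.data()), size);
    if (!file) {
        throw std::runtime_error("short read: " + path);
    }

    // Assumes host and file share byte order (little-endian in practice).
    if (words[0] != 0x07230203u) {
        throw std::runtime_error("not a SPIR-V file (bad magic): " + path);
    }
    return words;
}
```

Checking the size and magic number up front turns a wrong or truncated file into a clear error at load time instead of a driver failure later.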
4.7 Frame Synchronization
- Godot’s render thread can be separate from the main thread (depending on the project’s rendering thread model)
- Use RenderingServer.call_on_render_thread() to execute code on the render thread
- TLAS rebuild must happen before vkCmdTraceRaysKHR
- Barrier between TLAS build and RT dispatch:
    VkMemoryBarrier barrier{};
    barrier.sType = VK_STRUCTURE_TYPE_MEMORY_BARRIER; // required; zero-init alone leaves sType wrong
    barrier.srcAccessMask = VK_ACCESS_ACCELERATION_STRUCTURE_WRITE_BIT_KHR;
    barrier.dstAccessMask = VK_ACCESS_ACCELERATION_STRUCTURE_READ_BIT_KHR;
    // Make acceleration structure writes from the build visible to ray tracing shaders.
    vkCmdPipelineBarrier(cmd,
        VK_PIPELINE_STAGE_ACCELERATION_STRUCTURE_BUILD_BIT_KHR,
        VK_PIPELINE_STAGE_RAY_TRACING_SHADER_BIT_KHR,
        0, 1, &barrier, 0, nullptr, 0, nullptr);
4.8 Key Chat Insights (NVPathtracer Discussion)
- PathTracer Learning - Chat Analysis
- The contributors discussed several key challenges
  - TLAS rebuild strategy: full rebuild vs incremental update
  - Material system: how to pass PBR material data to RT shaders
  - Denoising integration: DLSS vs OIDN vs temporal accumulation
  - Performance: async compute for BLAS builds
  - Shader hot-reloading: important for development iteration speed
Phase 4 Checklist
- Read and understand renderer_scene_render_rd.cpp
- Understand how RenderSceneBuffersRD manages textures
- Trace how a mesh gets from MeshInstance3D to GPU buffers
- Implement BLAS creation from Godot mesh data
- Implement TLAS with Godot scene instances
- Hook path tracer into _render_scene()
- Pass material data to RT shaders via instance custom index
- Set up bindless texture array for material textures
- Implement accumulation buffer management