Phase 6.2: RT shadows — inline ray queries with BLAS/TLAS fix

Add shadow compute shader (voxelShadowCS.hlsl) that traces rays toward
the sun using DXR inline ray queries (RayQuery<>, SM 6.5). Shadows
modulate voxelRT_ in-place via RWTexture2D (no extra render target).

Key fixes to Phase 6.1 BLAS/TLAS infrastructure:
- Sequential index buffer required: Wicked always passes the IndexBuffer
  address, and DX12 treats IndexCount=0 with a non-null IndexBuffer as
  "0 indexed triangles" → empty BLAS
- Memory barriers between BLAS→TLAS→RT: without GPUBarrier::Memory()
  the TLAS build races with BLAS builds, causing zero ray hits
- inverseViewProjection added to VoxelCB for depth reconstruction

F5 toggles shadows OFF→ON→DEBUG (red=hit, green=miss, blue=backface).
Samuel Bouchet 2026-03-28 20:01:18 +01:00
parent 7f36bdae38
commit 6b41da0932
6 changed files with 297 additions and 22 deletions


@ -32,7 +32,8 @@ bvle-voxels/
│ ├── voxelTopingPS.hlsl # Pixel shader topings (triplanar + directional lighting) │ ├── voxelTopingPS.hlsl # Pixel shader topings (triplanar + directional lighting)
│ ├── voxelSmoothVS.hlsl # Vertex shader smooth Surface Nets (vertex pulling, t6) │ ├── voxelSmoothVS.hlsl # Vertex shader smooth Surface Nets (vertex pulling, t6)
│ ├── voxelSmoothPS.hlsl # Pixel shader smooth (triplanar + material blending) │ ├── voxelSmoothPS.hlsl # Pixel shader smooth (triplanar + material blending)
│ └── voxelBLASExtractCS.hlsl # Compute shader BLAS position extraction (Phase 6.1) │ ├── voxelBLASExtractCS.hlsl # Compute shader BLAS position extraction (Phase 6.1)
│ └── voxelShadowCS.hlsl # Compute shader RT shadows (inline ray queries, Phase 6.2)
└── CLAUDE.md └── CLAUDE.md
``` ```
@ -555,15 +556,36 @@ Decorative bevel system ("topings") on exposed +Y faces for
- Smooth BLAS auto-recreated when vertex count changes - Smooth BLAS auto-recreated when vertex count changes
- **HUD** : RT status line showing TLAS state + triangle counts for blocky/smooth - **HUD** : RT status line showing TLAS state + triangle counts for blocky/smooth
- **Pitfalls solved**: - **Pitfalls solved**:
- **Index buffer mandatory in BLAS**: `CreateRaytracingAccelerationStructure` in Wicked ALWAYS accesses `index_buffer` via `to_internal()` (line 4356 of `wiGraphicsDevice_DX12.cpp`), even for non-indexed geometry. A default (invalid) `GPUBuffer` causes a null dereference at offset 0xd8. Solution: provide a valid dummy buffer + `index_count = 0` - **Index buffer mandatory in BLAS**: `CreateRaytracingAccelerationStructure` in Wicked ALWAYS accesses `index_buffer` via `to_internal()` (line 4356 of `wiGraphicsDevice_DX12.cpp`), even for non-indexed geometry. A default (invalid) `GPUBuffer` causes a null dereference at offset 0xd8. In addition, `index_count = 0` with `IndexBuffer != 0` makes DX12 interpret the geometry as "0 indexed triangles" → empty BLAS. Solution: provide a real sequential index buffer `[0,1,2,...]` with `index_count = vertex_count` and `index_format = UINT32`
- **`CreateBuffer2` for the TLAS instance buffer**: buffers with `ResourceMiscFlag::RAY_TRACING` do not support `UpdateBuffer` (state mismatch). Use `CreateBuffer2` with a callback to pre-fill the instances at creation - **`CreateBuffer2` for the TLAS instance buffer**: buffers with `ResourceMiscFlag::RAY_TRACING` do not support `UpdateBuffer` (state mismatch). Use `CreateBuffer2` with a callback to pre-fill the instances at creation
- **Memory barriers BLAS→TLAS→RT (MAJOR PITFALL)**: `BuildRaytracingAccelerationStructure` is asynchronous on the GPU. Without barriers:
- The TLAS build can execute before the BLAS builds have finished
- The ray queries can execute before the TLAS is ready
- Result: the BLAS appears empty (zero hits) with no crash or error
- Solution (pattern from `wiRenderer.cpp` lines 5788, 5808):
1. `GPUBarrier::Memory()` after all BLAS builds, before the TLAS build
2. `GPUBarrier::Memory(&tlas_)` after the TLAS build, before the ray queries
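The sequential index buffer from the first pitfall can be sketched on the CPU side as follows. This is a minimal standalone sketch, not the renderer's code: `makeSequentialIndices` is a hypothetical helper name, and in the renderer the same values are written through the `CreateBuffer2` fill callback rather than into a `std::vector`.

```cpp
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical helper (not part of Wicked Engine): builds the identity
// index list [0, 1, 2, ..., count-1]. With such a list, indexed triangle i
// uses vertices 3i, 3i+1, 3i+2, i.e. exactly the non-indexed layout.
std::vector<uint32_t> makeSequentialIndices(uint32_t count) {
    std::vector<uint32_t> indices(count);
    std::iota(indices.begin(), indices.end(), 0u);
    return indices;
}
```

Because the list is the identity mapping, one buffer sized for the largest vertex count can presumably be shared by every BLAS, with `index_count` set to that BLAS's own vertex count.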
#### Phase 6.2 - RT Shadows [TODO] #### Phase 6.2 - RT Shadows [DONE]
- Compute shader with inline ray queries (`TraceRayInline`) - **Compute shader** (`voxelShadowCS.hlsl`) with inline ray queries (`RayQuery<>`, SM 6.5)
- Bind TLAS as SRV, voxelNormalRT_ + voxelDepth_ as input - Reads `voxelDepth_` (t0, D32→R32_FLOAT) + `voxelNormalRT_` (t1) + TLAS (t2)
- Shadow map output (R8_UNORM or similar) - Reconstructs worldPos from depth via `inverseViewProjection` (added to VoxelCB)
- Sun direction ray: trace from surface point toward light - Traces a ray toward the sun: `L = normalize(-sunDirection.xyz)`
- `RAY_FLAG_ACCEPT_FIRST_HIT_AND_END_SEARCH` + `RAY_FLAG_SKIP_PROCEDURAL_PRIMITIVES` (binary shadow)
- Normal bias (0.15) to avoid self-intersection
- Back-facing surfaces (NdotL ≤ 0): darkened without tracing a ray
- **In-place modulation**: `RWTexture2D<float4>` on `voxelRT_` (u0); each thread reads and modifies its own pixel (no race)
- Shadow factor: `color.rgb *= 0.3` for shadowed pixels
- `voxelRT_` created with the additional `UNORDERED_ACCESS` bind flag to allow compute writes
- **Dispatch**: 8×8 threads per group, `ceil(w/8) × ceil(h/8)` groups, after the 3 render passes (blocky+topings+smooth)
- **Barriers**:
- Pre: `voxelDepth_` DEPTHSTENCIL→SHADER_RESOURCE + `voxelRT_` SHADER_RESOURCE→UAV
- Post: `voxelDepth_` SHADER_RESOURCE→DEPTHSTENCIL + `voxelRT_` UAV→SHADER_RESOURCE
- **Debug mode** (F5 × 2 = DBG): red=shadow hit, green=miss, blue=back-facing, dark gray=sky
- **Toggle**: F5 cycles OFF→ON→DBG→OFF
- **CB**: `inverseViewProjection` (float4x4) added after `viewProjection` in VoxelCB (HLSL + C++)
- **Push constants**: width, height, normalBias, maxDistance, debugMode
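The dispatch sizing and the first step of the depth-to-world reconstruction can be checked on the CPU with a small sketch. It is an illustration under assumptions, not the renderer's code: `Float2`, `pixelToNdc`, and `groupCount` are hypothetical names, and the multiplication by `inverseViewProjection` and the divide by `w` that follow in the shader are left out.

```cpp
#include <cstdint>

// Hypothetical 2D vector type used only for this sketch.
struct Float2 { float x, y; };

// Pixel centre -> NDC, mirroring the shader arithmetic:
//   uv    = (pixel + 0.5) / size
//   ndc.x = uv.x * 2 - 1
//   ndc.y = (1 - uv.y) * 2 - 1   (Y flipped: row 0 is the top of the screen)
Float2 pixelToNdc(uint32_t px, uint32_t py, uint32_t width, uint32_t height) {
    const float u = (static_cast<float>(px) + 0.5f) / static_cast<float>(width);
    const float v = (static_cast<float>(py) + 0.5f) / static_cast<float>(height);
    return { u * 2.0f - 1.0f, (1.0f - v) * 2.0f - 1.0f };
}

// Integer form of ceil(pixels / groupSize), the same as (w + 7) / 8 for 8×8 groups.
constexpr uint32_t groupCount(uint32_t pixels, uint32_t groupSize = 8) {
    return (pixels + groupSize - 1) / groupSize;
}

static_assert(groupCount(1920) == 240, "1920 px is an exact multiple of 8");
static_assert(groupCount(1081) == 136, "a partial group is rounded up");
```

Rounding up means the last row and column of groups can extend past the screen, which is why the shader returns early when `DTid.x >= push.width || DTid.y >= push.height`.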
#### Phase 6.3 - RT AO [TODO] #### Phase 6.3 - RT AO [TODO]


@ -66,6 +66,7 @@ add_custom_command(TARGET BVLEVoxels POST_BUILD
$<TARGET_FILE_DIR:BVLEVoxels>/shaders/hlsl6/voxel/voxelSmoothCentroidCS.cso $<TARGET_FILE_DIR:BVLEVoxels>/shaders/hlsl6/voxel/voxelSmoothCentroidCS.cso
$<TARGET_FILE_DIR:BVLEVoxels>/shaders/hlsl6/voxel/voxelSmoothCS.cso $<TARGET_FILE_DIR:BVLEVoxels>/shaders/hlsl6/voxel/voxelSmoothCS.cso
$<TARGET_FILE_DIR:BVLEVoxels>/shaders/hlsl6/voxel/voxelBLASExtractCS.cso $<TARGET_FILE_DIR:BVLEVoxels>/shaders/hlsl6/voxel/voxelBLASExtractCS.cso
$<TARGET_FILE_DIR:BVLEVoxels>/shaders/hlsl6/voxel/voxelShadowCS.cso
$<TARGET_FILE_DIR:BVLEVoxels>/shaders/hlsl6/voxel/voxelCommon.hlsli.cso $<TARGET_FILE_DIR:BVLEVoxels>/shaders/hlsl6/voxel/voxelCommon.hlsli.cso
COMMENT "Clearing stale voxel shader cache (forces recompilation from current .hlsl sources)" COMMENT "Clearing stale voxel shader cache (forces recompilation from current .hlsl sources)"
) )


@ -38,6 +38,7 @@
// ── Per-frame constant buffer (b0) ────────────────────────────── // ── Per-frame constant buffer (b0) ──────────────────────────────
cbuffer VoxelCB : register(b0) { cbuffer VoxelCB : register(b0) {
float4x4 viewProjection; float4x4 viewProjection;
float4x4 inverseViewProjection; // for depth-to-world reconstruction (RT shadows)
float4 cameraPosition; float4 cameraPosition;
float4 sunDirection; float4 sunDirection;
float4 sunColor; float4 sunColor;

shaders/voxelShadowCS.hlsl (new file, 105 lines)

@ -0,0 +1,105 @@
// BVLE Voxels - RT Shadow Compute Shader (Phase 6.2)
// Traces shadow rays from each pixel toward the sun using inline ray queries.
// Reads depth + normal to reconstruct world position, modulates voxelRT_ in-place.
#include "voxelCommon.hlsli"
// SRV bindings
Texture2D<float> depthTexture : register(t0); // voxelDepth_ (D32_FLOAT as R32_FLOAT SRV)
Texture2D<float4> normalTexture : register(t1); // voxelNormalRT_ (R16G16B16A16_SNORM)
RaytracingAccelerationStructure tlas : register(t2); // TLAS with blocky + smooth instances
// UAV: read-modify-write voxelRT_ (each thread handles exactly one pixel, no race)
RWTexture2D<float4> colorOutput : register(u0);
// Push constants
struct ShadowPush {
uint width;
uint height;
float normalBias;
float maxDistance;
uint debugMode; // 0=normal, 1=debug visualization
uint pad[7];
};
[[vk::push_constant]] ConstantBuffer<ShadowPush> push : register(b999);
[RootSignature(VOXEL_ROOTSIG)]
[numthreads(8, 8, 1)]
void main(uint3 DTid : SV_DispatchThreadID) {
if (DTid.x >= push.width || DTid.y >= push.height) return;
float depth = depthTexture[DTid.xy];
// depth == 0 means sky (reverse-Z: 0 = far plane)
if (depth == 0.0) {
if (push.debugMode > 0) colorOutput[DTid.xy] = float4(0.1, 0.1, 0.1, 1); // dark gray = sky
return;
}
// Reconstruct world position from depth via inverse VP
float2 uv = (float2(DTid.xy) + 0.5) / float2(push.width, push.height);
float2 ndc = float2(uv.x * 2.0 - 1.0, (1.0 - uv.y) * 2.0 - 1.0);
float4 clipPos = float4(ndc, depth, 1.0);
float4 worldPos4 = mul(inverseViewProjection, clipPos);
float3 worldPos = worldPos4.xyz / worldPos4.w;
// Read world-space normal
float3 N = normalTexture[DTid.xy].xyz;
// Light direction: sunDirection is the direction of travel, negate for "toward sun"
float3 L = normalize(-sunDirection.xyz);
// Skip surfaces facing away from the light (self-shadowed by geometry)
float NdotL = dot(N, L);
if (NdotL <= 0.0) {
if (push.debugMode > 0) {
colorOutput[DTid.xy] = float4(0.0, 0.0, 0.5, 1); // blue = back-facing
} else {
float4 color = colorOutput[DTid.xy];
color.rgb *= 0.3;
colorOutput[DTid.xy] = color;
}
return;
}
// Offset ray origin along normal to avoid self-intersection
float3 origin = worldPos + N * push.normalBias;
RayDesc ray;
ray.Origin = origin;
ray.Direction = L;
ray.TMin = 0.01;
ray.TMax = push.maxDistance;
// Inline ray query: accept first hit (binary shadow, don't need closest)
RayQuery<RAY_FLAG_SKIP_PROCEDURAL_PRIMITIVES | RAY_FLAG_ACCEPT_FIRST_HIT_AND_END_SEARCH> q;
q.TraceRayInline(tlas, 0, 0xFF, ray);
// With FLAG_OPAQUE geometry + ACCEPT_FIRST_HIT, Proceed() handles everything
while (q.Proceed()) {}
if (q.CommittedStatus() == COMMITTED_TRIANGLE_HIT) {
if (push.debugMode > 0) {
colorOutput[DTid.xy] = float4(1.0, 0.0, 0.0, 1); // RED = shadow ray hit (cast shadow!)
} else {
float4 color = colorOutput[DTid.xy];
color.rgb *= 0.3;
colorOutput[DTid.xy] = color;
}
} else {
if (push.debugMode > 0) {
// Debug: trace downward ray from reconstructed worldPos to verify BLAS
RayDesc testRay;
testRay.Origin = worldPos + float3(0, 5, 0); // 5 units above surface
testRay.Direction = float3(0, -1, 0); // straight down
testRay.TMin = 0.01;
testRay.TMax = 100.0;
RayQuery<RAY_FLAG_SKIP_PROCEDURAL_PRIMITIVES | RAY_FLAG_ACCEPT_FIRST_HIT_AND_END_SEARCH> testQ;
testQ.TraceRayInline(tlas, 0, 0xFF, testRay);
while (testQ.Proceed()) {}
if (testQ.CommittedStatus() == COMMITTED_TRIANGLE_HIT) {
colorOutput[DTid.xy] = float4(1, 0, 0, 1); // RED = hit (BLAS works!)
} else {
colorOutput[DTid.xy] = float4(0, 1, 0, 1); // GREEN = miss (BLAS broken)
}
}
}
}


@ -170,10 +170,25 @@ void VoxelRenderer::initialize(GraphicsDevice* dev) {
posDesc.stride = 0; // raw buffer, no stride posDesc.stride = 0; // raw buffer, no stride
posDesc.usage = Usage::DEFAULT; posDesc.usage = Usage::DEFAULT;
bool ok = device_->CreateBuffer(&posDesc, nullptr, &blasPositionBuffer_); bool ok = device_->CreateBuffer(&posDesc, nullptr, &blasPositionBuffer_);
if (ok && blasPositionBuffer_.IsValid()) { // Sequential index buffer for BLAS (DX12 requires valid index buffer,
// Wicked always writes IndexBuffer GPU address even for "non-indexed").
GPUBufferDesc idxDesc;
idxDesc.size = (uint64_t)MAX_BLAS_VERTICES * sizeof(uint32_t);
idxDesc.bind_flags = BindFlag::SHADER_RESOURCE;
idxDesc.usage = Usage::DEFAULT;
auto fillIndices = [](void* dest) {
uint32_t* p = (uint32_t*)dest;
for (uint32_t i = 0; i < MAX_BLAS_VERTICES; i++)
p[i] = i;
};
bool okIdx = device_->CreateBuffer2(&idxDesc, fillIndices, &blasIndexBuffer_);
if (ok && blasPositionBuffer_.IsValid() && okIdx && blasIndexBuffer_.IsValid()) {
device_->SetName(&blasPositionBuffer_, "VoxelRenderer::blasPositionBuffer"); device_->SetName(&blasPositionBuffer_, "VoxelRenderer::blasPositionBuffer");
wi::backlog::post("VoxelRenderer: RT available (BLAS position buffer " device_->SetName(&blasIndexBuffer_, "VoxelRenderer::blasIndexBuffer");
+ std::to_string(posDesc.size / (1024*1024)) + " MB)"); wi::backlog::post("VoxelRenderer: RT available (BLAS pos "
+ std::to_string(posDesc.size / (1024*1024)) + " MB + idx "
+ std::to_string(idxDesc.size / (1024*1024)) + " MB)");
} else { } else {
rtAvailable_ = false; rtAvailable_ = false;
wi::backlog::post("VoxelRenderer: RT buffer creation failed", wi::backlog::LogLevel::Warning); wi::backlog::post("VoxelRenderer: RT buffer creation failed", wi::backlog::LogLevel::Warning);
@ -182,6 +197,16 @@ void VoxelRenderer::initialize(GraphicsDevice* dev) {
rtAvailable_ = false; rtAvailable_ = false;
wi::backlog::post("VoxelRenderer: RT available but BLAS extraction shader failed", wi::backlog::LogLevel::Warning); wi::backlog::post("VoxelRenderer: RT available but BLAS extraction shader failed", wi::backlog::LogLevel::Warning);
} }
// ── RT Shadows (Phase 6.2) ────────────────────────────────────
wi::renderer::LoadShader(ShaderStage::CS, shadowShader_, "voxel/voxelShadowCS.cso",
wi::graphics::ShaderModel::SM_6_5);
if (shadowShader_.IsValid()) {
rtShadowsEnabled_ = true;
wi::backlog::post("VoxelRenderer: RT shadows available");
} else {
wi::backlog::post("VoxelRenderer: RT shadow shader failed to compile",
wi::backlog::LogLevel::Warning);
}
} else { } else {
wi::backlog::post("VoxelRenderer: RT not available (GPU does not support ray tracing)"); wi::backlog::post("VoxelRenderer: RT not available (GPU does not support ray tracing)");
} }
@ -1003,10 +1028,13 @@ void VoxelRenderer::buildAccelerationStructures(CommandList cmd) const {
geom.triangles.vertex_count = blockyVertCount; geom.triangles.vertex_count = blockyVertCount;
geom.triangles.vertex_stride = sizeof(float) * 3; // 12 bytes per float3 geom.triangles.vertex_stride = sizeof(float) * 3; // 12 bytes per float3
geom.triangles.vertex_format = Format::R32G32B32_FLOAT; geom.triangles.vertex_format = Format::R32G32B32_FLOAT;
// Wicked ALWAYS accesses index_buffer via to_internal() even for non-indexed. // Wicked ALWAYS accesses index_buffer via to_internal() — a default GPUBuffer
// Provide a valid buffer with index_count=0 to prevent null deref crash. // causes null deref. And DX12 treats non-zero IndexBuffer + IndexCount=0 as
geom.triangles.index_buffer = blasPositionBuffer_; // dummy, won't be used // "indexed with 0 triangles" → empty BLAS. Solution: real sequential index buffer.
geom.triangles.index_count = 0; geom.triangles.index_buffer = blasIndexBuffer_;
geom.triangles.index_count = blockyVertCount;
geom.triangles.index_format = IndexBufferFormat::UINT32;
geom.triangles.index_offset = 0;
bool ok = dev->CreateRaytracingAccelerationStructure(&desc, bool ok = dev->CreateRaytracingAccelerationStructure(&desc,
&blockyBLAS_); &blockyBLAS_);
@ -1048,9 +1076,11 @@ void VoxelRenderer::buildAccelerationStructures(CommandList cmd) const {
geom.triangles.vertex_byte_offset = 0; geom.triangles.vertex_byte_offset = 0;
geom.triangles.vertex_count = smoothVertCount; geom.triangles.vertex_count = smoothVertCount;
geom.triangles.vertex_stride = 32; // SmoothVtx struct = 32 bytes, position at offset 0 geom.triangles.vertex_stride = 32; // SmoothVtx struct = 32 bytes, position at offset 0
// Wicked always accesses index_buffer (null deref if invalid) // Wicked always accesses index_buffer — must be valid + use real indices
geom.triangles.index_buffer = smoothVB; // dummy, won't be used geom.triangles.index_buffer = blasIndexBuffer_;
geom.triangles.index_count = 0; geom.triangles.index_count = smoothVertCount;
geom.triangles.index_format = IndexBufferFormat::UINT32;
geom.triangles.index_offset = 0;
geom.triangles.vertex_format = Format::R32G32B32_FLOAT; geom.triangles.vertex_format = Format::R32G32B32_FLOAT;
bool ok = dev->CreateRaytracingAccelerationStructure(&desc, bool ok = dev->CreateRaytracingAccelerationStructure(&desc,
@ -1071,6 +1101,14 @@ void VoxelRenderer::buildAccelerationStructures(CommandList cmd) const {
rtSmoothVertexCount_ = smoothVertCount; rtSmoothVertexCount_ = smoothVertCount;
} }
// ── Memory barrier: sync BLAS builds before TLAS ──────────────
// Without this, TLAS build can execute before BLASes are complete.
// (Same pattern as wiRenderer.cpp line 5788)
{
GPUBarrier barriers[] = { GPUBarrier::Memory() };
dev->Barrier(barriers, 1, cmd);
}
// ── TLAS (2 instances: blocky + smooth) ────────────────────── // ── TLAS (2 instances: blocky + smooth) ──────────────────────
// Always recreate TLAS with pre-filled instance data via CreateBuffer2. // Always recreate TLAS with pre-filled instance data via CreateBuffer2.
// RAY_TRACING instance buffers have special resource state requirements, // RAY_TRACING instance buffers have special resource state requirements,
@ -1152,9 +1190,79 @@ void VoxelRenderer::buildAccelerationStructures(CommandList cmd) const {
// Build TLAS // Build TLAS
dev->BuildRaytracingAccelerationStructure(&tlas_, cmd, nullptr); dev->BuildRaytracingAccelerationStructure(&tlas_, cmd, nullptr);
// Memory barrier: sync TLAS build before ray queries can use it
// (Same pattern as wiRenderer.cpp line 5808)
{
GPUBarrier barriers[] = { GPUBarrier::Memory(&tlas_) };
dev->Barrier(barriers, 1, cmd);
}
rtDirty_ = false; rtDirty_ = false;
} }
// ── RT Shadow dispatch (Phase 6.2) ──────────────────────────────
void VoxelRenderer::dispatchShadows(CommandList cmd,
const Texture& depthBuffer,
const Texture& renderTarget,
const Texture& normalTarget) const
{
if (!rtShadowsEnabled_ || !shadowShader_.IsValid() || !tlas_.IsValid())
return;
auto* dev = device_;
uint32_t w = renderTarget.GetDesc().width;
uint32_t h = renderTarget.GetDesc().height;
// Pre-barriers:
// - voxelDepth_: DEPTHSTENCIL → SHADER_RESOURCE (for depth reads)
// - voxelRT_: SHADER_RESOURCE → UNORDERED_ACCESS (for in-place shadow modulation)
// - voxelNormalRT_ is already in SHADER_RESOURCE state from render pass
GPUBarrier preBarriers[] = {
GPUBarrier::Image(&const_cast<Texture&>(depthBuffer),
ResourceState::DEPTHSTENCIL, ResourceState::SHADER_RESOURCE),
GPUBarrier::Image(&const_cast<Texture&>(renderTarget),
ResourceState::SHADER_RESOURCE, ResourceState::UNORDERED_ACCESS),
};
dev->Barrier(preBarriers, 2, cmd);
dev->BindComputeShader(&shadowShader_, cmd);
// Bind resources
dev->BindResource(&depthBuffer, 0, cmd); // t0 = depth
dev->BindResource(&normalTarget, 1, cmd); // t1 = normals
dev->BindResource(&tlas_, 2, cmd); // t2 = TLAS
dev->BindUAV(&renderTarget, 0, cmd); // u0 = color (read-modify-write)
dev->BindConstantBuffer(&constantBuffer_, 0, cmd); // b0 = VoxelCB
// Push constants
struct ShadowPush {
uint32_t width;
uint32_t height;
float normalBias;
float maxDistance;
uint32_t debugMode;
uint32_t pad[7];
} pushData = {};
pushData.width = w;
pushData.height = h;
pushData.normalBias = 0.15f; // offset along normal to avoid self-intersection
pushData.maxDistance = 512.0f; // max shadow ray distance
pushData.debugMode = rtShadowDebug_ ? 1 : 0;
dev->PushConstants(&pushData, sizeof(pushData), cmd);
// Dispatch: 8×8 thread groups covering the screen
dev->Dispatch((w + 7) / 8, (h + 7) / 8, 1, cmd);
// Post-barriers: restore states for Compose()
GPUBarrier postBarriers[] = {
GPUBarrier::Image(&const_cast<Texture&>(depthBuffer),
ResourceState::SHADER_RESOURCE, ResourceState::DEPTHSTENCIL),
GPUBarrier::Image(&const_cast<Texture&>(renderTarget),
ResourceState::UNORDERED_ACCESS, ResourceState::SHADER_RESOURCE),
};
dev->Barrier(postBarriers, 2, cmd);
}
// ── Frustum plane extraction (Gribb-Hartmann method) ──────────── // ── Frustum plane extraction (Gribb-Hartmann method) ────────────
static void extractFrustumPlanes(const XMMATRIX& vp, XMFLOAT4 planes[6]) { static void extractFrustumPlanes(const XMMATRIX& vp, XMFLOAT4 planes[6]) {
XMFLOAT4X4 m; XMFLOAT4X4 m;
@ -1213,8 +1321,10 @@ void VoxelRenderer::render(
VoxelConstants cb = {}; VoxelConstants cb = {};
XMMATRIX vpMatrix = camera.GetViewProjection(); XMMATRIX vpMatrix = camera.GetViewProjection();
XMStoreFloat4x4(&cb.viewProjection, vpMatrix); XMStoreFloat4x4(&cb.viewProjection, vpMatrix);
XMMATRIX invVP = XMMatrixInverse(nullptr, vpMatrix);
XMStoreFloat4x4(&cb.inverseViewProjection, invVP);
cb.cameraPosition = XMFLOAT4(camera.Eye.x, camera.Eye.y, camera.Eye.z, 1.0f); cb.cameraPosition = XMFLOAT4(camera.Eye.x, camera.Eye.y, camera.Eye.z, 1.0f);
cb.sunDirection = XMFLOAT4(-0.5f, -0.8f, -0.3f, 0.0f); cb.sunDirection = XMFLOAT4(-0.7f, -0.4f, -0.3f, 0.0f); // lower sun = longer cast shadows
cb.sunColor = XMFLOAT4(1.2f, 1.1f, 0.9f, 1.0f); cb.sunColor = XMFLOAT4(1.2f, 1.1f, 0.9f, 1.0f);
cb.chunkSize = (float)CHUNK_SIZE; cb.chunkSize = (float)CHUNK_SIZE;
cb.textureTiling = 0.25f; cb.textureTiling = 0.25f;
@ -1313,7 +1423,7 @@ void VoxelRenderer::render(
XMMATRIX vpMatrix = camera.GetViewProjection(); XMMATRIX vpMatrix = camera.GetViewProjection();
XMStoreFloat4x4(&cb.viewProjection, vpMatrix); XMStoreFloat4x4(&cb.viewProjection, vpMatrix);
cb.cameraPosition = XMFLOAT4(camera.Eye.x, camera.Eye.y, camera.Eye.z, 1.0f); cb.cameraPosition = XMFLOAT4(camera.Eye.x, camera.Eye.y, camera.Eye.z, 1.0f);
cb.sunDirection = XMFLOAT4(-0.5f, -0.8f, -0.3f, 0.0f); cb.sunDirection = XMFLOAT4(-0.7f, -0.4f, -0.3f, 0.0f); // lower sun = longer cast shadows
cb.sunColor = XMFLOAT4(1.2f, 1.1f, 0.9f, 1.0f); cb.sunColor = XMFLOAT4(1.2f, 1.1f, 0.9f, 1.0f);
cb.chunkSize = (float)CHUNK_SIZE; cb.chunkSize = (float)CHUNK_SIZE;
cb.textureTiling = 0.25f; cb.textureTiling = 0.25f;
@ -2123,7 +2233,8 @@ void VoxelRenderPath::createRenderTargets() {
rtDesc.width = w; rtDesc.width = w;
rtDesc.height = h; rtDesc.height = h;
rtDesc.format = wi::graphics::Format::R8G8B8A8_UNORM; rtDesc.format = wi::graphics::Format::R8G8B8A8_UNORM;
rtDesc.bind_flags = wi::graphics::BindFlag::RENDER_TARGET | wi::graphics::BindFlag::SHADER_RESOURCE; rtDesc.bind_flags = wi::graphics::BindFlag::RENDER_TARGET | wi::graphics::BindFlag::SHADER_RESOURCE
| wi::graphics::BindFlag::UNORDERED_ACCESS; // RT shadows modify in-place
rtDesc.mip_levels = 1; rtDesc.mip_levels = 1;
rtDesc.sample_count = 1; rtDesc.sample_count = 1;
rtDesc.layout = wi::graphics::ResourceState::SHADER_RESOURCE; rtDesc.layout = wi::graphics::ResourceState::SHADER_RESOURCE;
@ -2179,6 +2290,21 @@ void VoxelRenderPath::handleInput(float dt) {
renderer.debugBlend_ = !renderer.debugBlend_; renderer.debugBlend_ = !renderer.debugBlend_;
wi::backlog::post(renderer.debugBlend_ ? "Blend debug: ON" : "Blend debug: OFF"); wi::backlog::post(renderer.debugBlend_ ? "Blend debug: ON" : "Blend debug: OFF");
} }
if (wi::input::Press(wi::input::KEYBOARD_BUTTON_F5)) {
// Cycle: OFF → ON → DEBUG → OFF
if (!renderer.rtShadowsEnabled_) {
renderer.rtShadowsEnabled_ = true;
renderer.rtShadowDebug_ = false;
wi::backlog::post("RT Shadows: ON");
} else if (!renderer.rtShadowDebug_) {
renderer.rtShadowDebug_ = true;
wi::backlog::post("RT Shadows: DEBUG (red=shadow, green=lit, blue=backface)");
} else {
renderer.rtShadowsEnabled_ = false;
renderer.rtShadowDebug_ = false;
wi::backlog::post("RT Shadows: OFF");
}
}
if (wi::input::Press(wi::input::MOUSE_BUTTON_RIGHT)) { if (wi::input::Press(wi::input::MOUSE_BUTTON_RIGHT)) {
mouseCaptured = !mouseCaptured; mouseCaptured = !mouseCaptured;
wi::input::HidePointer(mouseCaptured); wi::input::HidePointer(mouseCaptured);
@ -2369,6 +2495,11 @@ void VoxelRenderPath::Render() const {
// Phase 5: render smooth surfaces (separate render pass, preserves all prior output) // Phase 5: render smooth surfaces (separate render pass, preserves all prior output)
renderer.renderSmooth(cmd, voxelDepth_, voxelRT_, voxelNormalRT_); renderer.renderSmooth(cmd, voxelDepth_, voxelRT_, voxelNormalRT_);
// Phase 6.2: RT Shadows (modulates voxelRT_ in-place after all geometry is rendered)
if (renderer.isRTShadowsEnabled() && renderer.isRTReady()) {
renderer.dispatchShadows(cmd, voxelDepth_, voxelRT_, voxelNormalRT_);
}
auto tRender1 = std::chrono::high_resolution_clock::now(); auto tRender1 = std::chrono::high_resolution_clock::now();
profRender_.add(std::chrono::duration<float, std::milli>(tRender1 - tRender0).count()); profRender_.add(std::chrono::duration<float, std::milli>(tRender1 - tRender0).count());
} }
@ -2481,7 +2612,8 @@ void VoxelRenderPath::Compose(CommandList cmd) const {
if (renderer.isRTReady()) { if (renderer.isRTReady()) {
stats += "RT: TLAS ready | Blocky " stats += "RT: TLAS ready | Blocky "
+ std::to_string(renderer.getRTBlockyTriCount()) + " tris | Smooth " + std::to_string(renderer.getRTBlockyTriCount()) + " tris | Smooth "
+ std::to_string(renderer.getRTSmoothTriCount()) + " tris\n"; + std::to_string(renderer.getRTSmoothTriCount()) + " tris"
+ " | Shadows " + std::string(renderer.rtShadowDebug_ ? "DEBUG" : (renderer.isRTShadowsEnabled() ? "ON" : "OFF")) + "\n";
} else { } else {
stats += "RT: building...\n"; stats += "RT: building...\n";
} }
@ -2490,7 +2622,8 @@ void VoxelRenderPath::Compose(CommandList cmd) const {
} }
stats += "WASD+Space/Ctrl: move | Shift: fast | Right-click: capture mouse\n"; stats += "WASD+Space/Ctrl: move | Shift: fast | Right-click: capture mouse\n";
stats += "F2: console | F3: anim [" + std::string(animatedTerrain_ ? "ON" : "OFF") stats += "F2: console | F3: anim [" + std::string(animatedTerrain_ ? "ON" : "OFF")
+ "] | F4: dbg [" + std::string(renderer.debugBlend_ ? "ON" : "OFF") + "]"; + "] | F4: dbg [" + std::string(renderer.debugBlend_ ? "ON" : "OFF")
+ "] | F5: shadows [" + std::string(renderer.rtShadowDebug_ ? "DBG" : (renderer.isRTShadowsEnabled() ? "ON" : "OFF")) + "]";
wi::font::Draw(stats, fp, cmd); wi::font::Draw(stats, fp, cmd);
} }


@ -147,6 +147,7 @@ private:
// Constants buffer (must match HLSL VoxelCB) // Constants buffer (must match HLSL VoxelCB)
struct VoxelConstants { struct VoxelConstants {
XMFLOAT4X4 viewProjection; XMFLOAT4X4 viewProjection;
XMFLOAT4X4 inverseViewProjection; // for depth-to-world reconstruction (RT shadows)
XMFLOAT4 cameraPosition; XMFLOAT4 cameraPosition;
XMFLOAT4 sunDirection; XMFLOAT4 sunDirection;
XMFLOAT4 sunColor; XMFLOAT4 sunColor;
@ -192,6 +193,7 @@ private:
// ── Ray Tracing (Phase 6.1) ───────────────────────────────────── // ── Ray Tracing (Phase 6.1) ─────────────────────────────────────
wi::graphics::Shader blasExtractShader_; // voxelBLASExtractCS compute shader wi::graphics::Shader blasExtractShader_; // voxelBLASExtractCS compute shader
mutable wi::graphics::GPUBuffer blasPositionBuffer_; // float3[] for blocky BLAS (6 verts per quad) mutable wi::graphics::GPUBuffer blasPositionBuffer_; // float3[] for blocky BLAS (6 verts per quad)
wi::graphics::GPUBuffer blasIndexBuffer_; // sequential uint32 indices [0,1,2,...] for BLAS
mutable wi::graphics::RaytracingAccelerationStructure blockyBLAS_; mutable wi::graphics::RaytracingAccelerationStructure blockyBLAS_;
mutable wi::graphics::RaytracingAccelerationStructure smoothBLAS_; mutable wi::graphics::RaytracingAccelerationStructure smoothBLAS_;
mutable wi::graphics::RaytracingAccelerationStructure tlas_; mutable wi::graphics::RaytracingAccelerationStructure tlas_;
@ -204,6 +206,16 @@ private:
void dispatchBLASExtract(wi::graphics::CommandList cmd) const; void dispatchBLASExtract(wi::graphics::CommandList cmd) const;
void buildAccelerationStructures(wi::graphics::CommandList cmd) const; void buildAccelerationStructures(wi::graphics::CommandList cmd) const;
// ── RT Shadows (Phase 6.2) ─────────────────────────────────────
wi::graphics::Shader shadowShader_; // voxelShadowCS compute shader
mutable bool rtShadowsEnabled_ = false; // true when shader + TLAS ready
mutable bool rtShadowDebug_ = false; // debug visualization mode
void dispatchShadows(wi::graphics::CommandList cmd,
const wi::graphics::Texture& depthBuffer,
const wi::graphics::Texture& renderTarget,
const wi::graphics::Texture& normalTarget) const;
// Benchmark state machine: runs once after world gen // Benchmark state machine: runs once after world gen
enum class BenchState { IDLE, DISPATCH, READBACK, DONE }; enum class BenchState { IDLE, DISPATCH, READBACK, DONE };
mutable BenchState benchState_ = BenchState::IDLE; mutable BenchState benchState_ = BenchState::IDLE;
@ -271,6 +283,7 @@ public:
// Phase 6: Ray Tracing // Phase 6: Ray Tracing
bool isRTAvailable() const { return rtAvailable_; } bool isRTAvailable() const { return rtAvailable_; }
bool isRTReady() const { return rtAvailable_ && tlas_.IsValid(); } bool isRTReady() const { return rtAvailable_ && tlas_.IsValid(); }
bool isRTShadowsEnabled() const { return rtShadowsEnabled_; }
uint32_t getRTBlockyTriCount() const { return rtBlockyVertexCount_ / 3; } uint32_t getRTBlockyTriCount() const { return rtBlockyVertexCount_ / 3; }
uint32_t getRTSmoothTriCount() const { return rtSmoothVertexCount_ / 3; } uint32_t getRTSmoothTriCount() const { return rtSmoothVertexCount_ / 3; }
const wi::graphics::RaytracingAccelerationStructure& getTLAS() const { return tlas_; } const wi::graphics::RaytracingAccelerationStructure& getTLAS() const { return tlas_; }