Phase 6.2: RT shadows — inline ray queries with BLAS/TLAS fix
Add shadow compute shader (voxelShadowCS.hlsl) that traces rays toward the sun using DXR inline ray queries (RayQuery<>, SM 6.5). Shadows modulate voxelRT_ in-place via RWTexture2D (no extra render target).

Key fixes to Phase 6.1 BLAS/TLAS infrastructure:
- Sequential index buffer required: Wicked treats IndexCount=0 with non-null IndexBuffer as "0 indexed triangles" → empty BLAS
- Memory barriers between BLAS→TLAS→RT: without GPUBarrier::Memory() the TLAS build races with BLAS builds, causing zero ray hits
- inverseViewProjection added to VoxelCB for depth reconstruction

F5 toggles shadows OFF→ON→DEBUG (red=hit, green=miss, blue=backface).
parent
7f36bdae38
commit
6b41da0932
6 changed files with 297 additions and 22 deletions
36
CLAUDE.md
|
|
@ -32,7 +32,8 @@ bvle-voxels/
|
||||||
│ ├── voxelTopingPS.hlsl # Pixel shader topings (triplanar + directional lighting)
|
│ ├── voxelTopingPS.hlsl # Pixel shader topings (triplanar + directional lighting)
|
||||||
│ ├── voxelSmoothVS.hlsl # Vertex shader smooth Surface Nets (vertex pulling, t6)
|
│ ├── voxelSmoothVS.hlsl # Vertex shader smooth Surface Nets (vertex pulling, t6)
|
||||||
│ ├── voxelSmoothPS.hlsl # Pixel shader smooth (triplanar + material blending)
|
│ ├── voxelSmoothPS.hlsl # Pixel shader smooth (triplanar + material blending)
|
||||||
│ └── voxelBLASExtractCS.hlsl # Compute shader BLAS position extraction (Phase 6.1)
|
│ ├── voxelBLASExtractCS.hlsl # Compute shader BLAS position extraction (Phase 6.1)
|
||||||
|
│ └── voxelShadowCS.hlsl # Compute shader RT shadows (inline ray queries, Phase 6.2)
|
||||||
└── CLAUDE.md
|
└── CLAUDE.md
|
||||||
```
|
```
|
||||||
|
|
||||||
|
|
@ -555,15 +556,36 @@ Decorative bevel system ("topings") on exposed +Y faces for
|
||||||
- Smooth BLAS auto-recreated when vertex count changes
|
- Smooth BLAS auto-recreated when vertex count changes
|
||||||
- **HUD** : RT status line showing TLAS state + triangle counts for blocky/smooth
|
- **HUD** : RT status line showing TLAS state + triangle counts for blocky/smooth
|
||||||
- **Pitfalls resolved**:
|
- **Pitfalls resolved**:
|
||||||
- **Index buffer required in BLAS**: `CreateRaytracingAccelerationStructure` in Wicked ALWAYS accesses `index_buffer` via `to_internal()` (line 4356 of `wiGraphicsDevice_DX12.cpp`), even for non-indexed geometry. A default (invalid) `GPUBuffer` causes a null dereference at offset 0xd8. Solution: provide a valid dummy buffer + `index_count = 0`
|
- **Index buffer required in BLAS**: `CreateRaytracingAccelerationStructure` in Wicked ALWAYS accesses `index_buffer` via `to_internal()` (line 4356 of `wiGraphicsDevice_DX12.cpp`), even for non-indexed geometry. A default (invalid) `GPUBuffer` causes a null dereference at offset 0xd8. In addition, `index_count = 0` with `IndexBuffer != 0` makes DX12 interpret the geometry as "0 indexed triangles" → empty BLAS. Solution: provide a real sequential index buffer `[0,1,2,...]` with `index_count = vertex_count` and `index_format = UINT32`
|
||||||
- **`CreateBuffer2` for the TLAS instance buffer**: buffers with `ResourceMiscFlag::RAY_TRACING` do not support `UpdateBuffer` (state mismatch). Use `CreateBuffer2` with a callback to pre-fill the instances at creation
|
- **`CreateBuffer2` for the TLAS instance buffer**: buffers with `ResourceMiscFlag::RAY_TRACING` do not support `UpdateBuffer` (state mismatch). Use `CreateBuffer2` with a callback to pre-fill the instances at creation
|
||||||
|
- **Memory barriers BLAS→TLAS→RT (MAJOR PITFALL)**: `BuildRaytracingAccelerationStructure` is asynchronous on the GPU. Without barriers:
|
||||||
|
- The TLAS build can execute before the BLAS builds have finished
|
||||||
|
- Ray queries can execute before the TLAS is ready
|
||||||
|
- Result: the BLAS appears empty (zero hits) with no crash or error
|
||||||
|
- Solution (pattern from `wiRenderer.cpp` lines 5788, 5808):
|
||||||
|
1. `GPUBarrier::Memory()` after all BLAS builds, before the TLAS build
|
||||||
|
2. `GPUBarrier::Memory(&tlas_)` after the TLAS build, before the ray queries
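
The index-buffer pitfall above can be illustrated on the CPU side. This is a minimal sketch that assumes nothing about Wicked Engine itself: `makeSequentialIndices` and `triangleCount` are hypothetical helper names, and the vector only stands in for the data that would be uploaded into the GPU index buffer.

```cpp
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical helper: builds the identity index list [0, 1, 2, ...] so that
// non-indexed triangle data can be described as indexed geometry.
std::vector<uint32_t> makeSequentialIndices(uint32_t vertexCount) {
    std::vector<uint32_t> indices(vertexCount);
    std::iota(indices.begin(), indices.end(), 0u);
    return indices;
}

// With identity indices, index_count equals vertex_count and every
// consecutive triple of vertices forms one triangle.
uint32_t triangleCount(uint32_t indexCount) {
    return indexCount / 3;
}
```

Because the indices are the identity mapping, one buffer sized for the largest vertex count can probably be shared by several geometries, each passing its own `index_count`.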
|
||||||
|
|
||||||
#### Phase 6.2 - RT Shadows [TODO]
|
#### Phase 6.2 - RT Shadows [DONE]
|
||||||
|
|
||||||
- Compute shader with inline ray queries (`TraceRayInline`)
|
- **Compute shader** (`voxelShadowCS.hlsl`) with inline ray queries (`RayQuery<>`, SM 6.5)
|
||||||
- Bind TLAS as SRV, voxelNormalRT_ + voxelDepth_ as input
|
- Reads `voxelDepth_` (t0, D32→R32_FLOAT) + `voxelNormalRT_` (t1) + TLAS (t2)
|
||||||
- Shadow map output (R8_UNORM or similar)
|
- Reconstructs worldPos from depth via `inverseViewProjection` (added to VoxelCB)
|
||||||
- Sun direction ray: trace from surface point toward light
|
- Traces a ray toward the sun: `L = normalize(-sunDirection.xyz)`
|
||||||
|
- `RAY_FLAG_ACCEPT_FIRST_HIT_AND_END_SEARCH` + `RAY_FLAG_SKIP_PROCEDURAL_PRIMITIVES` (binary shadow)
|
||||||
|
- Normal bias (0.15) to avoid self-intersection
|
||||||
|
- Back-facing surfaces (NdotL ≤ 0): darkened without tracing a ray
|
||||||
|
- **In-place modulation**: `RWTexture2D<float4>` on `voxelRT_` (u0), each thread reads/modifies its own pixel (no race)
|
||||||
|
- Shadow factor: `color.rgb *= 0.3` for shadowed pixels
|
||||||
|
- `voxelRT_` created with the additional `UNORDERED_ACCESS` flag to allow compute writes
|
||||||
|
- **Dispatch**: 8×8 thread groups, `ceil(w/8) × ceil(h/8)`, after the 3 render passes (blocky+topings+smooth)
|
||||||
|
- **Barriers**:
|
||||||
|
- Pre: `voxelDepth_` DEPTHSTENCIL→SHADER_RESOURCE + `voxelRT_` SHADER_RESOURCE→UAV
|
||||||
|
- Post: `voxelDepth_` SHADER_RESOURCE→DEPTHSTENCIL + `voxelRT_` UAV→SHADER_RESOURCE
|
||||||
|
- **Debug mode** (F5 × 2 = DBG): red=shadow hit, green=miss, blue=back-facing, dark gray=sky
|
||||||
|
- **Toggle**: F5 cycles OFF→ON→DBG→OFF
|
||||||
|
- **CB**: `inverseViewProjection` (float4x4) added after `viewProjection` in VoxelCB (HLSL + C++)
|
||||||
|
- **Push constants**: width, height, normalBias, maxDistance, debugMode
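
The dispatch size and push-constant block listed above can be checked at compile time. The following is a small sketch, not the renderer's actual code: `groupCount` is a hypothetical helper name, and the struct simply mirrors the five listed fields plus the seven padding words used in this commit.

```cpp
#include <cstdint>

// CPU-side mirror of the push-constant block: five used fields followed by
// seven padding words, i.e. twelve 4-byte values in total.
struct ShadowPush {
    uint32_t width;
    uint32_t height;
    float    normalBias;
    float    maxDistance;
    uint32_t debugMode;
    uint32_t pad[7];
};
static_assert(sizeof(ShadowPush) == 48, "12 x 4-byte fields expected");

// Integer form of ceil(pixels / groupSize); avoids floating point.
constexpr uint32_t groupCount(uint32_t pixels, uint32_t groupSize = 8) {
    return (pixels + groupSize - 1) / groupSize;
}

static_assert(groupCount(1920) == 240, "exact multiple of the group size");
static_assert(groupCount(1081) == 136, "a partial group rounds up");
```

When the resolution is not a multiple of 8, the last row or column of groups contains threads outside the image, which is why the shader returns early when `DTid.x >= push.width` or `DTid.y >= push.height`.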
|
||||||
|
|
||||||
#### Phase 6.3 - RT AO [TODO]
|
#### Phase 6.3 - RT AO [TODO]
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -66,6 +66,7 @@ add_custom_command(TARGET BVLEVoxels POST_BUILD
|
||||||
$<TARGET_FILE_DIR:BVLEVoxels>/shaders/hlsl6/voxel/voxelSmoothCentroidCS.cso
|
$<TARGET_FILE_DIR:BVLEVoxels>/shaders/hlsl6/voxel/voxelSmoothCentroidCS.cso
|
||||||
$<TARGET_FILE_DIR:BVLEVoxels>/shaders/hlsl6/voxel/voxelSmoothCS.cso
|
$<TARGET_FILE_DIR:BVLEVoxels>/shaders/hlsl6/voxel/voxelSmoothCS.cso
|
||||||
$<TARGET_FILE_DIR:BVLEVoxels>/shaders/hlsl6/voxel/voxelBLASExtractCS.cso
|
$<TARGET_FILE_DIR:BVLEVoxels>/shaders/hlsl6/voxel/voxelBLASExtractCS.cso
|
||||||
|
$<TARGET_FILE_DIR:BVLEVoxels>/shaders/hlsl6/voxel/voxelShadowCS.cso
|
||||||
$<TARGET_FILE_DIR:BVLEVoxels>/shaders/hlsl6/voxel/voxelCommon.hlsli.cso
|
$<TARGET_FILE_DIR:BVLEVoxels>/shaders/hlsl6/voxel/voxelCommon.hlsli.cso
|
||||||
COMMENT "Clearing stale voxel shader cache (forces recompilation from current .hlsl sources)"
|
COMMENT "Clearing stale voxel shader cache (forces recompilation from current .hlsl sources)"
|
||||||
)
|
)
|
||||||
|
|
|
||||||
|
|
@ -38,6 +38,7 @@
|
||||||
// ── Per-frame constant buffer (b0) ──────────────────────────────
|
// ── Per-frame constant buffer (b0) ──────────────────────────────
|
||||||
cbuffer VoxelCB : register(b0) {
|
cbuffer VoxelCB : register(b0) {
|
||||||
float4x4 viewProjection;
|
float4x4 viewProjection;
|
||||||
|
float4x4 inverseViewProjection; // for depth-to-world reconstruction (RT shadows)
|
||||||
float4 cameraPosition;
|
float4 cameraPosition;
|
||||||
float4 sunDirection;
|
float4 sunDirection;
|
||||||
float4 sunColor;
|
float4 sunColor;
|
||||||
|
|
|
||||||
105
shaders/voxelShadowCS.hlsl
Normal file
|
|
@ -0,0 +1,105 @@
|
||||||
|
// BVLE Voxels - RT Shadow Compute Shader (Phase 6.2)
|
||||||
|
// Traces shadow rays from each pixel toward the sun using inline ray queries.
|
||||||
|
// Reads depth + normal to reconstruct world position, modulates voxelRT_ in-place.
|
||||||
|
|
||||||
|
#include "voxelCommon.hlsli"
|
||||||
|
|
||||||
|
// SRV bindings
|
||||||
|
Texture2D<float> depthTexture : register(t0); // voxelDepth_ (D32_FLOAT as R32_FLOAT SRV)
|
||||||
|
Texture2D<float4> normalTexture : register(t1); // voxelNormalRT_ (R16G16B16A16_SNORM)
|
||||||
|
RaytracingAccelerationStructure tlas : register(t2); // TLAS with blocky + smooth instances
|
||||||
|
|
||||||
|
// UAV: read-modify-write voxelRT_ (each thread handles exactly one pixel, no race)
|
||||||
|
RWTexture2D<float4> colorOutput : register(u0);
|
||||||
|
|
||||||
|
// Push constants
|
||||||
|
struct ShadowPush {
|
||||||
|
uint width;
|
||||||
|
uint height;
|
||||||
|
float normalBias;
|
||||||
|
float maxDistance;
|
||||||
|
uint debugMode; // 0=normal, 1=debug visualization
|
||||||
|
uint pad[7];
|
||||||
|
};
|
||||||
|
[[vk::push_constant]] ConstantBuffer<ShadowPush> push : register(b999);
|
||||||
|
|
||||||
|
[RootSignature(VOXEL_ROOTSIG)]
|
||||||
|
[numthreads(8, 8, 1)]
|
||||||
|
void main(uint3 DTid : SV_DispatchThreadID) {
|
||||||
|
if (DTid.x >= push.width || DTid.y >= push.height) return;
|
||||||
|
|
||||||
|
float depth = depthTexture[DTid.xy];
|
||||||
|
// depth == 0 means sky (reverse-Z: 0 = far plane)
|
||||||
|
if (depth == 0.0) {
|
||||||
|
if (push.debugMode > 0) colorOutput[DTid.xy] = float4(0.1, 0.1, 0.1, 1); // dark gray = sky
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
// Reconstruct world position from depth via inverse VP
|
||||||
|
float2 uv = (float2(DTid.xy) + 0.5) / float2(push.width, push.height);
|
||||||
|
float2 ndc = float2(uv.x * 2.0 - 1.0, (1.0 - uv.y) * 2.0 - 1.0);
|
||||||
|
float4 clipPos = float4(ndc, depth, 1.0);
|
||||||
|
float4 worldPos4 = mul(inverseViewProjection, clipPos);
|
||||||
|
float3 worldPos = worldPos4.xyz / worldPos4.w;
|
||||||
|
|
||||||
|
// Read world-space normal
|
||||||
|
float3 N = normalTexture[DTid.xy].xyz;
|
||||||
|
|
||||||
|
// Light direction: sunDirection is the direction of travel, negate for "toward sun"
|
||||||
|
float3 L = normalize(-sunDirection.xyz);
|
||||||
|
|
||||||
|
// Skip surfaces facing away from the light (self-shadowed by geometry)
|
||||||
|
float NdotL = dot(N, L);
|
||||||
|
if (NdotL <= 0.0) {
|
||||||
|
if (push.debugMode > 0) {
|
||||||
|
colorOutput[DTid.xy] = float4(0.0, 0.0, 0.5, 1); // blue = back-facing
|
||||||
|
} else {
|
||||||
|
float4 color = colorOutput[DTid.xy];
|
||||||
|
color.rgb *= 0.3;
|
||||||
|
colorOutput[DTid.xy] = color;
|
||||||
|
}
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
// Offset ray origin along normal to avoid self-intersection
|
||||||
|
float3 origin = worldPos + N * push.normalBias;
|
||||||
|
|
||||||
|
RayDesc ray;
|
||||||
|
ray.Origin = origin;
|
||||||
|
ray.Direction = L;
|
||||||
|
ray.TMin = 0.01;
|
||||||
|
ray.TMax = push.maxDistance;
|
||||||
|
|
||||||
|
// Inline ray query: accept first hit (binary shadow, don't need closest)
|
||||||
|
RayQuery<RAY_FLAG_SKIP_PROCEDURAL_PRIMITIVES | RAY_FLAG_ACCEPT_FIRST_HIT_AND_END_SEARCH> q;
|
||||||
|
q.TraceRayInline(tlas, 0, 0xFF, ray);
|
||||||
|
// With FLAG_OPAQUE geometry + ACCEPT_FIRST_HIT, Proceed() handles everything
|
||||||
|
while (q.Proceed()) {}
|
||||||
|
|
||||||
|
if (q.CommittedStatus() == COMMITTED_TRIANGLE_HIT) {
|
||||||
|
if (push.debugMode > 0) {
|
||||||
|
colorOutput[DTid.xy] = float4(1.0, 0.0, 0.0, 1); // RED = shadow ray hit (cast shadow!)
|
||||||
|
} else {
|
||||||
|
float4 color = colorOutput[DTid.xy];
|
||||||
|
color.rgb *= 0.3;
|
||||||
|
colorOutput[DTid.xy] = color;
|
||||||
|
}
|
||||||
|
} else {
|
||||||
|
if (push.debugMode > 0) {
|
||||||
|
// Debug: trace downward ray from reconstructed worldPos to verify BLAS
|
||||||
|
RayDesc testRay;
|
||||||
|
testRay.Origin = worldPos + float3(0, 5, 0); // 5 units above surface
|
||||||
|
testRay.Direction = float3(0, -1, 0); // straight down
|
||||||
|
testRay.TMin = 0.01;
|
||||||
|
testRay.TMax = 100.0;
|
||||||
|
RayQuery<RAY_FLAG_SKIP_PROCEDURAL_PRIMITIVES | RAY_FLAG_ACCEPT_FIRST_HIT_AND_END_SEARCH> testQ;
|
||||||
|
testQ.TraceRayInline(tlas, 0, 0xFF, testRay);
|
||||||
|
while (testQ.Proceed()) {}
|
||||||
|
if (testQ.CommittedStatus() == COMMITTED_TRIANGLE_HIT) {
|
||||||
|
colorOutput[DTid.xy] = float4(1, 0, 0, 1); // RED = hit (BLAS works!)
|
||||||
|
} else {
|
||||||
|
colorOutput[DTid.xy] = float4(0, 1, 0, 1); // GREEN = miss (BLAS broken)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
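
The per-pixel setup in the shader above (pixel center to NDC, back-face test, normal-biased ray origin) can be mirrored on the CPU for unit testing. This is only a sketch under stated assumptions: `Vec2`, `Vec3` and the function names are hypothetical, and the multiplication by `inverseViewProjection` with its perspective divide is deliberately left out.

```cpp
#include <cstdint>

struct Vec2 { float x, y; };
struct Vec3 { float x, y, z; };

// Maps a pixel's center to NDC with +Y up (texture rows grow downward).
Vec2 pixelToNdc(uint32_t px, uint32_t py, uint32_t width, uint32_t height) {
    float u = (static_cast<float>(px) + 0.5f) / static_cast<float>(width);
    float v = (static_cast<float>(py) + 0.5f) / static_cast<float>(height);
    return { u * 2.0f - 1.0f, (1.0f - v) * 2.0f - 1.0f };
}

float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Pushes the ray origin off the surface along its normal so the shadow ray
// does not immediately hit the triangle it starts on.
Vec3 biasedOrigin(Vec3 worldPos, Vec3 n, float normalBias) {
    return { worldPos.x + n.x * normalBias,
             worldPos.y + n.y * normalBias,
             worldPos.z + n.z * normalBias };
}

// True when the surface faces the light, so a shadow ray is worth tracing.
bool facesLight(Vec3 n, Vec3 towardLight) {
    return dot(n, towardLight) > 0.0f;
}
```

Surfaces for which `facesLight` is false correspond to the `NdotL <= 0` branch, where the shader darkens the pixel without tracing a ray.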
|
||||||
|
|
@ -170,10 +170,25 @@ void VoxelRenderer::initialize(GraphicsDevice* dev) {
|
||||||
posDesc.stride = 0; // raw buffer, no stride
|
posDesc.stride = 0; // raw buffer, no stride
|
||||||
posDesc.usage = Usage::DEFAULT;
|
posDesc.usage = Usage::DEFAULT;
|
||||||
bool ok = device_->CreateBuffer(&posDesc, nullptr, &blasPositionBuffer_);
|
bool ok = device_->CreateBuffer(&posDesc, nullptr, &blasPositionBuffer_);
|
||||||
if (ok && blasPositionBuffer_.IsValid()) {
|
// Sequential index buffer for BLAS (DX12 requires valid index buffer,
|
||||||
|
// Wicked always writes IndexBuffer GPU address even for "non-indexed").
|
||||||
|
GPUBufferDesc idxDesc;
|
||||||
|
idxDesc.size = (uint64_t)MAX_BLAS_VERTICES * sizeof(uint32_t);
|
||||||
|
idxDesc.bind_flags = BindFlag::SHADER_RESOURCE;
|
||||||
|
idxDesc.usage = Usage::DEFAULT;
|
||||||
|
auto fillIndices = [](void* dest) {
|
||||||
|
uint32_t* p = (uint32_t*)dest;
|
||||||
|
for (uint32_t i = 0; i < MAX_BLAS_VERTICES; i++)
|
||||||
|
p[i] = i;
|
||||||
|
};
|
||||||
|
bool okIdx = device_->CreateBuffer2(&idxDesc, fillIndices, &blasIndexBuffer_);
|
||||||
|
|
||||||
|
if (ok && blasPositionBuffer_.IsValid() && okIdx && blasIndexBuffer_.IsValid()) {
|
||||||
device_->SetName(&blasPositionBuffer_, "VoxelRenderer::blasPositionBuffer");
|
device_->SetName(&blasPositionBuffer_, "VoxelRenderer::blasPositionBuffer");
|
||||||
wi::backlog::post("VoxelRenderer: RT available (BLAS position buffer "
|
device_->SetName(&blasIndexBuffer_, "VoxelRenderer::blasIndexBuffer");
|
||||||
+ std::to_string(posDesc.size / (1024*1024)) + " MB)");
|
wi::backlog::post("VoxelRenderer: RT available (BLAS pos "
|
||||||
|
+ std::to_string(posDesc.size / (1024*1024)) + " MB + idx "
|
||||||
|
+ std::to_string(idxDesc.size / (1024*1024)) + " MB)");
|
||||||
} else {
|
} else {
|
||||||
rtAvailable_ = false;
|
rtAvailable_ = false;
|
||||||
wi::backlog::post("VoxelRenderer: RT buffer creation failed", wi::backlog::LogLevel::Warning);
|
wi::backlog::post("VoxelRenderer: RT buffer creation failed", wi::backlog::LogLevel::Warning);
|
||||||
|
|
@ -182,6 +197,16 @@ void VoxelRenderer::initialize(GraphicsDevice* dev) {
|
||||||
rtAvailable_ = false;
|
rtAvailable_ = false;
|
||||||
wi::backlog::post("VoxelRenderer: RT available but BLAS extraction shader failed", wi::backlog::LogLevel::Warning);
|
wi::backlog::post("VoxelRenderer: RT available but BLAS extraction shader failed", wi::backlog::LogLevel::Warning);
|
||||||
}
|
}
|
||||||
|
// ── RT Shadows (Phase 6.2) ────────────────────────────────────
|
||||||
|
wi::renderer::LoadShader(ShaderStage::CS, shadowShader_, "voxel/voxelShadowCS.cso",
|
||||||
|
wi::graphics::ShaderModel::SM_6_5);
|
||||||
|
if (shadowShader_.IsValid()) {
|
||||||
|
rtShadowsEnabled_ = true;
|
||||||
|
wi::backlog::post("VoxelRenderer: RT shadows available");
|
||||||
|
} else {
|
||||||
|
wi::backlog::post("VoxelRenderer: RT shadow shader failed to compile",
|
||||||
|
wi::backlog::LogLevel::Warning);
|
||||||
|
}
|
||||||
} else {
|
} else {
|
||||||
wi::backlog::post("VoxelRenderer: RT not available (GPU does not support ray tracing)");
|
wi::backlog::post("VoxelRenderer: RT not available (GPU does not support ray tracing)");
|
||||||
}
|
}
|
||||||
|
|
@ -1003,10 +1028,13 @@ void VoxelRenderer::buildAccelerationStructures(CommandList cmd) const {
|
||||||
geom.triangles.vertex_count = blockyVertCount;
|
geom.triangles.vertex_count = blockyVertCount;
|
||||||
geom.triangles.vertex_stride = sizeof(float) * 3; // 12 bytes per float3
|
geom.triangles.vertex_stride = sizeof(float) * 3; // 12 bytes per float3
|
||||||
geom.triangles.vertex_format = Format::R32G32B32_FLOAT;
|
geom.triangles.vertex_format = Format::R32G32B32_FLOAT;
|
||||||
// Wicked ALWAYS accesses index_buffer via to_internal() even for non-indexed.
|
// Wicked ALWAYS accesses index_buffer via to_internal() — a default GPUBuffer
|
||||||
// Provide a valid buffer with index_count=0 to prevent null deref crash.
|
// causes null deref. And DX12 treats non-zero IndexBuffer + IndexCount=0 as
|
||||||
geom.triangles.index_buffer = blasPositionBuffer_; // dummy, won't be used
|
// "indexed with 0 triangles" → empty BLAS. Solution: real sequential index buffer.
|
||||||
geom.triangles.index_count = 0;
|
geom.triangles.index_buffer = blasIndexBuffer_;
|
||||||
|
geom.triangles.index_count = blockyVertCount;
|
||||||
|
geom.triangles.index_format = IndexBufferFormat::UINT32;
|
||||||
|
geom.triangles.index_offset = 0;
|
||||||
|
|
||||||
bool ok = dev->CreateRaytracingAccelerationStructure(&desc,
|
bool ok = dev->CreateRaytracingAccelerationStructure(&desc,
|
||||||
&blockyBLAS_);
|
&blockyBLAS_);
|
||||||
|
|
@ -1048,9 +1076,11 @@ void VoxelRenderer::buildAccelerationStructures(CommandList cmd) const {
|
||||||
geom.triangles.vertex_byte_offset = 0;
|
geom.triangles.vertex_byte_offset = 0;
|
||||||
geom.triangles.vertex_count = smoothVertCount;
|
geom.triangles.vertex_count = smoothVertCount;
|
||||||
geom.triangles.vertex_stride = 32; // SmoothVtx struct = 32 bytes, position at offset 0
|
geom.triangles.vertex_stride = 32; // SmoothVtx struct = 32 bytes, position at offset 0
|
||||||
// Wicked always accesses index_buffer (null deref if invalid)
|
// Wicked always accesses index_buffer — must be valid + use real indices
|
||||||
geom.triangles.index_buffer = smoothVB; // dummy, won't be used
|
geom.triangles.index_buffer = blasIndexBuffer_;
|
||||||
geom.triangles.index_count = 0;
|
geom.triangles.index_count = smoothVertCount;
|
||||||
|
geom.triangles.index_format = IndexBufferFormat::UINT32;
|
||||||
|
geom.triangles.index_offset = 0;
|
||||||
geom.triangles.vertex_format = Format::R32G32B32_FLOAT;
|
geom.triangles.vertex_format = Format::R32G32B32_FLOAT;
|
||||||
|
|
||||||
bool ok = dev->CreateRaytracingAccelerationStructure(&desc,
|
bool ok = dev->CreateRaytracingAccelerationStructure(&desc,
|
||||||
|
|
@ -1071,6 +1101,14 @@ void VoxelRenderer::buildAccelerationStructures(CommandList cmd) const {
|
||||||
rtSmoothVertexCount_ = smoothVertCount;
|
rtSmoothVertexCount_ = smoothVertCount;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// ── Memory barrier: sync BLAS builds before TLAS ──────────────
|
||||||
|
// Without this, TLAS build can execute before BLASes are complete.
|
||||||
|
// (Same pattern as wiRenderer.cpp line 5788)
|
||||||
|
{
|
||||||
|
GPUBarrier barriers[] = { GPUBarrier::Memory() };
|
||||||
|
dev->Barrier(barriers, 1, cmd);
|
||||||
|
}
|
||||||
|
|
||||||
// ── TLAS (2 instances: blocky + smooth) ──────────────────────
|
// ── TLAS (2 instances: blocky + smooth) ──────────────────────
|
||||||
// Always recreate TLAS with pre-filled instance data via CreateBuffer2.
|
// Always recreate TLAS with pre-filled instance data via CreateBuffer2.
|
||||||
// RAY_TRACING instance buffers have special resource state requirements,
|
// RAY_TRACING instance buffers have special resource state requirements,
|
||||||
|
|
@ -1152,9 +1190,79 @@ void VoxelRenderer::buildAccelerationStructures(CommandList cmd) const {
|
||||||
// Build TLAS
|
// Build TLAS
|
||||||
dev->BuildRaytracingAccelerationStructure(&tlas_, cmd, nullptr);
|
dev->BuildRaytracingAccelerationStructure(&tlas_, cmd, nullptr);
|
||||||
|
|
||||||
|
// Memory barrier: sync TLAS build before ray queries can use it
|
||||||
|
// (Same pattern as wiRenderer.cpp line 5808)
|
||||||
|
{
|
||||||
|
GPUBarrier barriers[] = { GPUBarrier::Memory(&tlas_) };
|
||||||
|
dev->Barrier(barriers, 1, cmd);
|
||||||
|
}
|
||||||
|
|
||||||
rtDirty_ = false;
|
rtDirty_ = false;
|
||||||
}
|
}
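
The ordering enforced by the two memory barriers above (BLAS builds, barrier, TLAS build, barrier, ray queries) can be modelled with a toy checker. This sketch is purely illustrative: `Op` and `orderingIsSafe` are hypothetical stand-ins, not Wicked Engine types, and the model treats every barrier as a full memory barrier.

```cpp
#include <vector>

// Hypothetical stand-ins for recorded GPU commands.
enum class Op { BuildBLAS, BuildTLAS, MemoryBarrier, TraceRays };

// Returns true when each producer is separated from its consumer by a
// barrier: BLAS builds before the TLAS build, TLAS build before ray tracing.
bool orderingIsSafe(const std::vector<Op>& cmds) {
    bool blasPending = false;  // BLAS written, not yet made visible
    bool tlasPending = false;  // TLAS written, not yet made visible
    for (Op op : cmds) {
        switch (op) {
        case Op::BuildBLAS:
            blasPending = true;
            break;
        case Op::MemoryBarrier:
            blasPending = false;
            tlasPending = false;
            break;
        case Op::BuildTLAS:
            if (blasPending) return false;  // TLAS would read unfinished BLAS
            tlasPending = true;
            break;
        case Op::TraceRays:
            if (blasPending || tlasPending) return false;
            break;
        }
    }
    return true;
}
```

In this model, the command stream recorded by `buildAccelerationStructures` followed by the shadow dispatch is safe only with both barriers in place; removing either one makes the checker fail, matching the "zero hits, no error" symptom described in the commit message.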
|
||||||
|
|
||||||
|
// ── RT Shadow dispatch (Phase 6.2) ──────────────────────────────
|
||||||
|
void VoxelRenderer::dispatchShadows(CommandList cmd,
|
||||||
|
const Texture& depthBuffer,
|
||||||
|
const Texture& renderTarget,
|
||||||
|
const Texture& normalTarget) const
|
||||||
|
{
|
||||||
|
if (!rtShadowsEnabled_ || !shadowShader_.IsValid() || !tlas_.IsValid())
|
||||||
|
return;
|
||||||
|
|
||||||
|
auto* dev = device_;
|
||||||
|
uint32_t w = renderTarget.GetDesc().width;
|
||||||
|
uint32_t h = renderTarget.GetDesc().height;
|
||||||
|
|
||||||
|
// Pre-barriers:
|
||||||
|
// - voxelDepth_: DEPTHSTENCIL → SHADER_RESOURCE (for depth reads)
|
||||||
|
// - voxelRT_: SHADER_RESOURCE → UNORDERED_ACCESS (for in-place shadow modulation)
|
||||||
|
// - voxelNormalRT_ is already in SHADER_RESOURCE state from render pass
|
||||||
|
GPUBarrier preBarriers[] = {
|
||||||
|
GPUBarrier::Image(&const_cast<Texture&>(depthBuffer),
|
||||||
|
ResourceState::DEPTHSTENCIL, ResourceState::SHADER_RESOURCE),
|
||||||
|
GPUBarrier::Image(&const_cast<Texture&>(renderTarget),
|
||||||
|
ResourceState::SHADER_RESOURCE, ResourceState::UNORDERED_ACCESS),
|
||||||
|
};
|
||||||
|
dev->Barrier(preBarriers, 2, cmd);
|
||||||
|
|
||||||
|
dev->BindComputeShader(&shadowShader_, cmd);
|
||||||
|
|
||||||
|
// Bind resources
|
||||||
|
dev->BindResource(&depthBuffer, 0, cmd); // t0 = depth
|
||||||
|
dev->BindResource(&normalTarget, 1, cmd); // t1 = normals
|
||||||
|
dev->BindResource(&tlas_, 2, cmd); // t2 = TLAS
|
||||||
|
dev->BindUAV(&renderTarget, 0, cmd); // u0 = color (read-modify-write)
|
||||||
|
dev->BindConstantBuffer(&constantBuffer_, 0, cmd); // b0 = VoxelCB
|
||||||
|
|
||||||
|
// Push constants
|
||||||
|
struct ShadowPush {
|
||||||
|
uint32_t width;
|
||||||
|
uint32_t height;
|
||||||
|
float normalBias;
|
||||||
|
float maxDistance;
|
||||||
|
uint32_t debugMode;
|
||||||
|
uint32_t pad[7];
|
||||||
|
} pushData = {};
|
||||||
|
pushData.width = w;
|
||||||
|
pushData.height = h;
|
||||||
|
pushData.normalBias = 0.15f; // offset along normal to avoid self-intersection
|
||||||
|
pushData.maxDistance = 512.0f; // max shadow ray distance
|
||||||
|
pushData.debugMode = rtShadowDebug_ ? 1 : 0;
|
||||||
|
dev->PushConstants(&pushData, sizeof(pushData), cmd);
|
||||||
|
|
||||||
|
// Dispatch: 8×8 thread groups covering the screen
|
||||||
|
dev->Dispatch((w + 7) / 8, (h + 7) / 8, 1, cmd);
|
||||||
|
|
||||||
|
// Post-barriers: restore states for Compose()
|
||||||
|
GPUBarrier postBarriers[] = {
|
||||||
|
GPUBarrier::Image(&const_cast<Texture&>(depthBuffer),
|
||||||
|
ResourceState::SHADER_RESOURCE, ResourceState::DEPTHSTENCIL),
|
||||||
|
GPUBarrier::Image(&const_cast<Texture&>(renderTarget),
|
||||||
|
ResourceState::UNORDERED_ACCESS, ResourceState::SHADER_RESOURCE),
|
||||||
|
};
|
||||||
|
dev->Barrier(postBarriers, 2, cmd);
|
||||||
|
}
|
||||||
|
|
||||||
// ── Frustum plane extraction (Gribb-Hartmann method) ────────────
|
// ── Frustum plane extraction (Gribb-Hartmann method) ────────────
|
||||||
static void extractFrustumPlanes(const XMMATRIX& vp, XMFLOAT4 planes[6]) {
|
static void extractFrustumPlanes(const XMMATRIX& vp, XMFLOAT4 planes[6]) {
|
||||||
XMFLOAT4X4 m;
|
XMFLOAT4X4 m;
|
||||||
|
|
@ -1213,8 +1321,10 @@ void VoxelRenderer::render(
|
||||||
VoxelConstants cb = {};
|
VoxelConstants cb = {};
|
||||||
XMMATRIX vpMatrix = camera.GetViewProjection();
|
XMMATRIX vpMatrix = camera.GetViewProjection();
|
||||||
XMStoreFloat4x4(&cb.viewProjection, vpMatrix);
|
XMStoreFloat4x4(&cb.viewProjection, vpMatrix);
|
||||||
|
XMMATRIX invVP = XMMatrixInverse(nullptr, vpMatrix);
|
||||||
|
XMStoreFloat4x4(&cb.inverseViewProjection, invVP);
|
||||||
cb.cameraPosition = XMFLOAT4(camera.Eye.x, camera.Eye.y, camera.Eye.z, 1.0f);
|
cb.cameraPosition = XMFLOAT4(camera.Eye.x, camera.Eye.y, camera.Eye.z, 1.0f);
|
||||||
cb.sunDirection = XMFLOAT4(-0.5f, -0.8f, -0.3f, 0.0f);
|
cb.sunDirection = XMFLOAT4(-0.7f, -0.4f, -0.3f, 0.0f); // lower sun = longer cast shadows
|
||||||
cb.sunColor = XMFLOAT4(1.2f, 1.1f, 0.9f, 1.0f);
|
cb.sunColor = XMFLOAT4(1.2f, 1.1f, 0.9f, 1.0f);
|
||||||
cb.chunkSize = (float)CHUNK_SIZE;
|
cb.chunkSize = (float)CHUNK_SIZE;
|
||||||
cb.textureTiling = 0.25f;
|
cb.textureTiling = 0.25f;
|
||||||
|
|
@ -1313,7 +1423,7 @@ void VoxelRenderer::render(
|
||||||
XMMATRIX vpMatrix = camera.GetViewProjection();
|
XMMATRIX vpMatrix = camera.GetViewProjection();
|
||||||
XMStoreFloat4x4(&cb.viewProjection, vpMatrix);
|
XMStoreFloat4x4(&cb.viewProjection, vpMatrix);
|
||||||
cb.cameraPosition = XMFLOAT4(camera.Eye.x, camera.Eye.y, camera.Eye.z, 1.0f);
|
cb.cameraPosition = XMFLOAT4(camera.Eye.x, camera.Eye.y, camera.Eye.z, 1.0f);
|
||||||
cb.sunDirection = XMFLOAT4(-0.5f, -0.8f, -0.3f, 0.0f);
|
cb.sunDirection = XMFLOAT4(-0.7f, -0.4f, -0.3f, 0.0f); // lower sun = longer cast shadows
|
||||||
cb.sunColor = XMFLOAT4(1.2f, 1.1f, 0.9f, 1.0f);
|
cb.sunColor = XMFLOAT4(1.2f, 1.1f, 0.9f, 1.0f);
|
||||||
cb.chunkSize = (float)CHUNK_SIZE;
|
cb.chunkSize = (float)CHUNK_SIZE;
|
||||||
cb.textureTiling = 0.25f;
|
cb.textureTiling = 0.25f;
|
||||||
|
|
@ -2123,7 +2233,8 @@ void VoxelRenderPath::createRenderTargets() {
|
||||||
rtDesc.width = w;
|
rtDesc.width = w;
|
||||||
rtDesc.height = h;
|
rtDesc.height = h;
|
||||||
rtDesc.format = wi::graphics::Format::R8G8B8A8_UNORM;
|
rtDesc.format = wi::graphics::Format::R8G8B8A8_UNORM;
|
||||||
rtDesc.bind_flags = wi::graphics::BindFlag::RENDER_TARGET | wi::graphics::BindFlag::SHADER_RESOURCE;
|
rtDesc.bind_flags = wi::graphics::BindFlag::RENDER_TARGET | wi::graphics::BindFlag::SHADER_RESOURCE
|
||||||
|
| wi::graphics::BindFlag::UNORDERED_ACCESS; // RT shadows modify in-place
|
||||||
rtDesc.mip_levels = 1;
|
rtDesc.mip_levels = 1;
|
||||||
rtDesc.sample_count = 1;
|
rtDesc.sample_count = 1;
|
||||||
rtDesc.layout = wi::graphics::ResourceState::SHADER_RESOURCE;
|
rtDesc.layout = wi::graphics::ResourceState::SHADER_RESOURCE;
|
||||||
|
|
@ -2179,6 +2290,21 @@ void VoxelRenderPath::handleInput(float dt) {
|
||||||
renderer.debugBlend_ = !renderer.debugBlend_;
|
renderer.debugBlend_ = !renderer.debugBlend_;
|
||||||
wi::backlog::post(renderer.debugBlend_ ? "Blend debug: ON" : "Blend debug: OFF");
|
wi::backlog::post(renderer.debugBlend_ ? "Blend debug: ON" : "Blend debug: OFF");
|
||||||
}
|
}
|
||||||
|
if (wi::input::Press(wi::input::KEYBOARD_BUTTON_F5)) {
|
||||||
|
// Cycle: OFF → ON → DEBUG → OFF
|
||||||
|
if (!renderer.rtShadowsEnabled_) {
|
||||||
|
renderer.rtShadowsEnabled_ = true;
|
||||||
|
renderer.rtShadowDebug_ = false;
|
||||||
|
wi::backlog::post("RT Shadows: ON");
|
||||||
|
} else if (!renderer.rtShadowDebug_) {
|
||||||
|
renderer.rtShadowDebug_ = true;
|
||||||
|
wi::backlog::post("RT Shadows: DEBUG (red=shadow, green=lit, blue=backface)");
|
||||||
|
} else {
|
||||||
|
renderer.rtShadowsEnabled_ = false;
|
||||||
|
renderer.rtShadowDebug_ = false;
|
||||||
|
wi::backlog::post("RT Shadows: OFF");
|
||||||
|
}
|
||||||
|
}
|
||||||
if (wi::input::Press(wi::input::MOUSE_BUTTON_RIGHT)) {
|
if (wi::input::Press(wi::input::MOUSE_BUTTON_RIGHT)) {
|
||||||
mouseCaptured = !mouseCaptured;
|
mouseCaptured = !mouseCaptured;
|
||||||
wi::input::HidePointer(mouseCaptured);
|
wi::input::HidePointer(mouseCaptured);
|
||||||
|
|
@ -2369,6 +2495,11 @@ void VoxelRenderPath::Render() const {
|
||||||
|
|
||||||
// Phase 5: render smooth surfaces (separate render pass, preserves all prior output)
|
// Phase 5: render smooth surfaces (separate render pass, preserves all prior output)
|
||||||
renderer.renderSmooth(cmd, voxelDepth_, voxelRT_, voxelNormalRT_);
|
renderer.renderSmooth(cmd, voxelDepth_, voxelRT_, voxelNormalRT_);
|
||||||
|
|
||||||
|
// Phase 6.2: RT Shadows (modulates voxelRT_ in-place after all geometry is rendered)
|
||||||
|
if (renderer.isRTShadowsEnabled() && renderer.isRTReady()) {
|
||||||
|
renderer.dispatchShadows(cmd, voxelDepth_, voxelRT_, voxelNormalRT_);
|
||||||
|
}
|
||||||
auto tRender1 = std::chrono::high_resolution_clock::now();
|
auto tRender1 = std::chrono::high_resolution_clock::now();
|
||||||
profRender_.add(std::chrono::duration<float, std::milli>(tRender1 - tRender0).count());
|
profRender_.add(std::chrono::duration<float, std::milli>(tRender1 - tRender0).count());
|
||||||
}
|
}
|
||||||
|
|
@ -2481,7 +2612,8 @@ void VoxelRenderPath::Compose(CommandList cmd) const {
|
||||||
if (renderer.isRTReady()) {
|
if (renderer.isRTReady()) {
|
||||||
stats += "RT: TLAS ready | Blocky "
|
stats += "RT: TLAS ready | Blocky "
|
||||||
+ std::to_string(renderer.getRTBlockyTriCount()) + " tris | Smooth "
|
+ std::to_string(renderer.getRTBlockyTriCount()) + " tris | Smooth "
|
||||||
+ std::to_string(renderer.getRTSmoothTriCount()) + " tris\n";
|
+ std::to_string(renderer.getRTSmoothTriCount()) + " tris"
|
||||||
|
+ " | Shadows " + std::string(renderer.rtShadowDebug_ ? "DEBUG" : (renderer.isRTShadowsEnabled() ? "ON" : "OFF")) + "\n";
|
||||||
} else {
|
} else {
|
||||||
stats += "RT: building...\n";
|
stats += "RT: building...\n";
|
||||||
}
|
}
|
||||||
|
|
@ -2490,7 +2622,8 @@ void VoxelRenderPath::Compose(CommandList cmd) const {
|
||||||
}
|
}
|
||||||
stats += "WASD+Space/Ctrl: move | Shift: fast | Right-click: capture mouse\n";
|
stats += "WASD+Space/Ctrl: move | Shift: fast | Right-click: capture mouse\n";
|
||||||
stats += "F2: console | F3: anim [" + std::string(animatedTerrain_ ? "ON" : "OFF")
|
stats += "F2: console | F3: anim [" + std::string(animatedTerrain_ ? "ON" : "OFF")
|
||||||
+ "] | F4: dbg [" + std::string(renderer.debugBlend_ ? "ON" : "OFF") + "]";
|
+ "] | F4: dbg [" + std::string(renderer.debugBlend_ ? "ON" : "OFF")
|
||||||
|
+ "] | F5: shadows [" + std::string(renderer.rtShadowDebug_ ? "DBG" : (renderer.isRTShadowsEnabled() ? "ON" : "OFF")) + "]";
|
||||||
|
|
||||||
wi::font::Draw(stats, fp, cmd);
|
wi::font::Draw(stats, fp, cmd);
|
||||||
}
|
}
|
||||||
|
|
|
||||||
|
|
@ -147,6 +147,7 @@ private:
|
||||||
// Constants buffer (must match HLSL VoxelCB)
|
// Constants buffer (must match HLSL VoxelCB)
|
||||||
struct VoxelConstants {
|
struct VoxelConstants {
|
||||||
XMFLOAT4X4 viewProjection;
|
XMFLOAT4X4 viewProjection;
|
||||||
|
XMFLOAT4X4 inverseViewProjection; // for depth-to-world reconstruction (RT shadows)
|
||||||
XMFLOAT4 cameraPosition;
|
XMFLOAT4 cameraPosition;
|
||||||
XMFLOAT4 sunDirection;
|
XMFLOAT4 sunDirection;
|
||||||
XMFLOAT4 sunColor;
|
XMFLOAT4 sunColor;
|
||||||
|
|
@ -192,6 +193,7 @@ private:
|
||||||
// ── Ray Tracing (Phase 6.1) ─────────────────────────────────────
|
// ── Ray Tracing (Phase 6.1) ─────────────────────────────────────
|
||||||
wi::graphics::Shader blasExtractShader_; // voxelBLASExtractCS compute shader
|
wi::graphics::Shader blasExtractShader_; // voxelBLASExtractCS compute shader
|
||||||
mutable wi::graphics::GPUBuffer blasPositionBuffer_; // float3[] for blocky BLAS (6 verts per quad)
|
mutable wi::graphics::GPUBuffer blasPositionBuffer_; // float3[] for blocky BLAS (6 verts per quad)
|
||||||
|
wi::graphics::GPUBuffer blasIndexBuffer_; // sequential uint32 indices [0,1,2,...] for BLAS
|
||||||
mutable wi::graphics::RaytracingAccelerationStructure blockyBLAS_;
|
mutable wi::graphics::RaytracingAccelerationStructure blockyBLAS_;
|
||||||
mutable wi::graphics::RaytracingAccelerationStructure smoothBLAS_;
|
mutable wi::graphics::RaytracingAccelerationStructure smoothBLAS_;
|
||||||
mutable wi::graphics::RaytracingAccelerationStructure tlas_;
|
mutable wi::graphics::RaytracingAccelerationStructure tlas_;
|
||||||
|
|
@ -204,6 +206,16 @@ private:
|
||||||
void dispatchBLASExtract(wi::graphics::CommandList cmd) const;
|
void dispatchBLASExtract(wi::graphics::CommandList cmd) const;
|
||||||
void buildAccelerationStructures(wi::graphics::CommandList cmd) const;
|
void buildAccelerationStructures(wi::graphics::CommandList cmd) const;
|
||||||
|
|
||||||
|
// ── RT Shadows (Phase 6.2) ─────────────────────────────────────
|
||||||
|
wi::graphics::Shader shadowShader_; // voxelShadowCS compute shader
|
||||||
|
mutable bool rtShadowsEnabled_ = false; // true when shader + TLAS ready
|
||||||
|
mutable bool rtShadowDebug_ = false; // debug visualization mode
|
||||||
|
|
||||||
|
void dispatchShadows(wi::graphics::CommandList cmd,
|
||||||
|
const wi::graphics::Texture& depthBuffer,
|
||||||
|
const wi::graphics::Texture& renderTarget,
|
||||||
|
const wi::graphics::Texture& normalTarget) const;
|
||||||
|
|
||||||
// Benchmark state machine: runs once after world gen
|
// Benchmark state machine: runs once after world gen
|
||||||
enum class BenchState { IDLE, DISPATCH, READBACK, DONE };
|
enum class BenchState { IDLE, DISPATCH, READBACK, DONE };
|
||||||
mutable BenchState benchState_ = BenchState::IDLE;
|
mutable BenchState benchState_ = BenchState::IDLE;
|
||||||
|
|
@ -271,6 +283,7 @@ public:
|
||||||
// Phase 6: Ray Tracing
|
// Phase 6: Ray Tracing
|
||||||
bool isRTAvailable() const { return rtAvailable_; }
|
bool isRTAvailable() const { return rtAvailable_; }
|
||||||
bool isRTReady() const { return rtAvailable_ && tlas_.IsValid(); }
|
bool isRTReady() const { return rtAvailable_ && tlas_.IsValid(); }
|
||||||
|
bool isRTShadowsEnabled() const { return rtShadowsEnabled_; }
|
||||||
uint32_t getRTBlockyTriCount() const { return rtBlockyVertexCount_ / 3; }
|
uint32_t getRTBlockyTriCount() const { return rtBlockyVertexCount_ / 3; }
|
||||||
uint32_t getRTSmoothTriCount() const { return rtSmoothVertexCount_ / 3; }
|
uint32_t getRTSmoothTriCount() const { return rtSmoothVertexCount_ / 3; }
|
||||||
const wi::graphics::RaytracingAccelerationStructure& getTLAS() const { return tlas_; }
|
const wi::graphics::RaytracingAccelerationStructure& getTLAS() const { return tlas_; }
|
||||||
|
|
|
||||||