Add shadow compute shader (voxelShadowCS.hlsl) that traces rays toward
the sun using DXR inline ray queries (RayQuery<>, SM 6.5). Shadows
modulate voxelRT_ in-place via RWTexture2D (no extra render target).
Key fixes to Phase 6.1 BLAS/TLAS infrastructure:
- Sequential index buffer required: Wicked treats IndexCount=0 with
non-null IndexBuffer as "0 indexed triangles" → empty BLAS
- Memory barriers between BLAS→TLAS→RT: without GPUBarrier::Memory()
the TLAS build races with BLAS builds, causing zero ray hits
- inverseViewProjection added to VoxelCB for depth reconstruction
F5 toggles shadows OFF→ON→DEBUG (red=hit, green=miss, blue=backface).
- Add bleedMask/resistBleedMask bitmasks to CB for per-material blend control
- Grass: canBleed + resistsBleed (bleeds onto others, nothing bleeds onto it)
- Stone: no bleed (doesn't overflow, but accepts bleed from others)
- Other materials: normal bidirectional blending
- PS checks flags before blending: mainResists → skip, !neighCanBleed → skip
- Flatten terrain (heightScale 64→20) for better surface visibility
- Replace altitude-based material bands with noise-based 2D patches
(3 noise channels create organic patches of all 5 materials on surface)
- Make stone/sand more visually distinct (stone=blue-gray, sand=warm yellow)
- Lower stone heightContrast (1.2→0.5) so neighbors bleed onto it more
Replace per-chunk DrawInstanced loop with a single DrawInstancedIndirectCount.
CPU fills indirect args buffer with same frustum+backface cull logic as Phase 2.1.
Key discoveries:
- Wicked Engine command signature includes push constant (20-byte stride, not 16)
- SV_VertexID does not reliably include startVertexLocation with ExecuteIndirect
- Solution: pack chunkIndex|(faceIndex<<16) in push constant, VS reconstructs
quad offset from GPUChunkInfo lookup
- No explicit DX12 barriers needed (implicit promotion from COMMON suffices)
Also adds voxel_engine_spec.md and updates references from .docx to .md.
Mega-buffer architecture replacing per-chunk GPU buffers:
- Single StructuredBuffer<PackedQuad> for all chunks (2M quads, 16 MB)
- StructuredBuffer<GPUChunkInfo> with per-chunk metadata (position, quad offsets, face groups)
- VS reads chunk info via push constants (b999) for driver-safe chunk indexing
- CPU frustum culling with wi::primitive::Frustum + AABB per chunk
- Quads sorted by face direction in greedy mesher (faceOffsets/faceCounts)
- GPU frustum + backface cull compute shader (voxelCullCS.hlsl)
- GPU binary mesher compute shader baseline (voxelMeshCS.hlsl)
- Indirect draw buffers and timestamp query infrastructure
- README with build instructions and project architecture