- Load CC0 FreeStylized textures (6 materials: grass, dirt, stone, sand, snow, smoothstone)
as Texture2DArray: t1=albedo+heightmap RGBA, t7=normal maps GL format
- Height-based texture blending: winner-takes-all with sharpness=16, 40% blend zone,
asymmetric bias (coeff 1.6) for resistBleed materials (grass resists sand bleed)
- UDN triplanar normal mapping with 3 critical fixes:
* Use raw normal (NOT abs) in UDN formula — abs inverts lighting on -X/-Y/-Z faces
* sign(normal) correction on tangent X for back-facing UV mirror
* GL green channel flip on Y-projection only (not X/Z where V=worldY is correct)
- Dirt material rendered smooth (FLAG_SMOOTH), ground_02 texture darkened 0.75
- Sun orbit debug mode (F7): 10s cycle with sinusoidal altitude
- Crosshair + face debug HUD (F8): DDA raycast, camera/target/face/normal info
- Screenshot F6 now writes companion .log file with full debug state
- Document UDN pitfalls and logical vs physical coordinates in TROUBLESHOOTING.md
- Add tools/prepare_textures.py for texture pipeline (ZIP → albedo+height RGBA + normal)
- Remove ~430 lines of dead CPU mesh, MDI, and GPU cull render paths
(rebuildMegaBuffer, IndirectDrawArgs, drawCountBuffer, cullShader, etc.)
- Add voxelTopingBLASCS.hlsl compute shader replacing 196ms CPU loop
for toping BLAS position extraction (<1ms on GPU)
- Reduce animation rate from 60Hz to 30Hz (halves CPU regen cost)
- Simplify render() to GPU mesh path only (no conditional branches)
- Remove benchmark state machine and stale mode strings
Per-frame CreateRaytracingAccelerationStructure calls during F3 animation
caused VRAM explosion (especially toping BLAS at ~23M vertices). Now all
3 BLASes use capacity-based allocation with 25% headroom — only recreated
when vertex count exceeds capacity, otherwise just BuildRaytracingAS with
updated desc.vertex_count. TLAS only recreated when instance count changes.
Also adds deferred toping BLAS position upload via UpdateBuffer in Render()
(topingBLASDirty_ flag), enabling toping shadows to update during animation.
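The capacity rule above can be sketched in a few lines of C++. This is only an illustration of the decision logic, not the engine code: `BlasSlot`, `BlasAction` and `updateBlas` are hypothetical names, and the exact rounding of the 25% headroom is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bookkeeping for one BLAS (not a Wicked Engine type).
struct BlasSlot {
    uint32_t capacity = 0;     // vertices the current allocation can hold
    uint32_t vertexCount = 0;  // vertices used by the latest build
};

enum class BlasAction { Recreate, RebuildInPlace };

// Decide whether the acceleration structure must be recreated, or whether
// a rebuild with an updated vertex count is enough.
inline BlasAction updateBlas(BlasSlot& slot, uint32_t newVertexCount) {
    assert(newVertexCount > 0);
    slot.vertexCount = newVertexCount;
    if (newVertexCount > slot.capacity) {
        // Grow with 25% headroom so small fluctuations do not reallocate.
        slot.capacity = newVertexCount + newVertexCount / 4;
        return BlasAction::Recreate;
    }
    return BlasAction::RebuildInPlace;
}
```

In this sketch only the `Recreate` branch would reach CreateRaytracingAccelerationStructure; the other branch would just rebuild with the new count.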
Split CLAUDE.md into CLAUDE.md + TROUBLESHOOTING.md for maintainability.
Wonderbox-inspired lighting overhaul across all 3 pixel shaders:
- Hemisphere ambient (sky blue above, warm brown below) replaces flat ambient
- RT shadows lerp toward blue-violet tint instead of plain darkening (factor 0.55)
- Rim light (fresnel) with warm golden color on silhouettes (30% on vegetation)
- Soft exponential tone mapping + saturation boost in final post-process pass
- CB parameters for all lighting values (skyAmbient, groundAmbient, shadowTint, etc.)
- Fog color/density centralized from CB instead of hardcoded per-shader
- Screenshot mode (CLI "screenshot"): fixed camera, AO convergence, auto-capture
- AO noise stability: world-space hash using voxel center + tangent-axis frac position
- AO distance-weighted falloff: continuous occlusion values instead of binary hit/miss
- Interleaved Gradient Noise replaces world-space hash for ray sampling
- Cranley-Patterson rotation (golden ratio × frameIndex) per frame
- Temporal accumulation: blend 5% current + 95% reprojected history (~20 frames)
- aoHistoryTexture_ persists between frames and is filled from the pre-blur AO for the next frame
- prevViewProjection added to VoxelCB for screen-space reprojection
- Push constants: frameIndex + historyValid for temporal control
- Result: nearly noise-free AO with only 8 rays per pixel
- 8 cosine-weighted hemisphere rays per pixel (inline ray queries, SM 6.5)
- Distance-weighted AO: quadratic falloff (1-hitT/aoRadius)² instead of binary hit/miss
- World-space hash seed: voxel coord + tangent-plane frac position (stable, no flicker)
- Bilateral blur pipeline: 2-pass separable (H+V), radius 6, depth+normal edge-stopping
- 4-pass dispatch: shadow+rawAO → blur H → blur V → apply
- AO written to separate R8_UNORM texture, blurred, then applied to color buffer
- Debug mode (F5 x3): grayscale AO visualization
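The AO math in the entries above (quadratic distance falloff, golden-ratio rotation per frame, 5%/95% temporal blend) is small enough to sketch on the CPU. The shaders themselves are HLSL; the C++ below is an illustrative transcription with hypothetical function names, and it assumes a ray miss is treated as a hit at or beyond aoRadius.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Quadratic distance falloff: 1 for a hit at the surface, 0 at aoRadius
// or beyond, following (1 - hitT/aoRadius)^2.
inline float aoOcclusion(float hitT, float aoRadius) {
    assert(aoRadius > 0.0f);
    const float t = std::clamp(1.0f - hitT / aoRadius, 0.0f, 1.0f);
    return t * t;
}

// Cranley-Patterson rotation: shift a [0,1) noise value by a per-frame
// offset (golden ratio fraction times frameIndex) and wrap back to [0,1).
inline float rotateSample(float noise, uint32_t frameIndex) {
    const float kGoldenRatioFrac = 0.61803398875f;
    const float shifted = noise + kGoldenRatioFrac * static_cast<float>(frameIndex);
    return shifted - std::floor(shifted);
}

// Temporal accumulation: 5% current + 95% reprojected history. When the
// history is not valid, the current value is used as is.
inline float temporalBlend(float current, float history, bool historyValid) {
    const float alpha = 0.05f;
    return historyValid ? alpha * current + (1.0f - alpha) * history : current;
}
```

A blend weight of 0.05 corresponds to an effective window of about 1/0.05 = 20 frames, which matches the "~20 frames" figure quoted above.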
Add shadow compute shader (voxelShadowCS.hlsl) that traces rays toward
the sun using DXR inline ray queries (RayQuery<>, SM 6.5). Shadows
modulate voxelRT_ in-place via RWTexture2D (no extra render target).
Key fixes to Phase 6.1 BLAS/TLAS infrastructure:
- Sequential index buffer required: Wicked treats IndexCount=0 with
non-null IndexBuffer as "0 indexed triangles" → empty BLAS
- Memory barriers between BLAS→TLAS→RT: without GPUBarrier::Memory()
the TLAS build races with BLAS builds, causing zero ray hits
- inverseViewProjection added to VoxelCB for depth reconstruction
F5 toggles shadows OFF→ON→DEBUG (red=hit, green=miss, blue=backface).
- Normal render target (R16G16B16A16_SNORM) as MRT SV_TARGET1 in all 3 pixel
shaders (voxelPS, voxelTopingPS, voxelSmoothPS) for future RT shadow/AO
- BLAS extraction compute shader (voxelBLASExtractCS.hlsl): converts PackedQuad
StructuredBuffer to float3 position buffer for DXR BLAS input
- Blocky BLAS: single BLAS from all GPU-meshed quads (~1.5M triangles)
- Smooth BLAS: single BLAS from smooth vertex buffer directly
- TLAS: 2 instances (blocky + smooth), identity transforms, CreateBuffer2 with
callback to avoid UpdateBuffer on RAY_TRACING flagged buffers
- Fix: Wicked always accesses index_buffer in CreateRaytracingAccelerationStructure
via to_internal() even for non-indexed geometry — provide dummy valid buffer
- Smooth vertex normals: area-weighted accumulation of face normals per
indexed vertex before triangle expansion. Gives Gouraud-smooth shading
without adding geometry.
- Triplanar fix: PS uses geometric normal (ddx/ddy of worldPos) for
texture projection weights, smooth normal for lighting only. Prevents
texture stretching on smoothed surfaces.
- Depth bias: custom rasterizer state (depth_bias=2, slope_scaled=1.0)
on smooth PSO resolves z-fighting at smooth↔blocky overlap.
- hasSmooth filter tightened: check face-adjacent voxels of each corner
(1-voxel reach) instead of neighbor cells' corners (2-cell cascade).
Prevents smooth mesh from extending into underground blocky territory.
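Area-weighted normal accumulation, as described in the first item above, can be sketched as follows. This is a generic version of the technique, not the SmoothMesher code: `Vec3` and `areaWeightedNormals` are hypothetical names. The unnormalized cross product of two triangle edges has a length of twice the triangle area, so summing it per vertex weights each face by its area.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <initializer_list>
#include <vector>

struct Vec3 { float x, y, z; };

inline Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Returns one unit normal per indexed vertex (zero if the vertex is unused).
std::vector<Vec3> areaWeightedNormals(const std::vector<Vec3>& positions,
                                      const std::vector<uint32_t>& indices) {
    assert(indices.size() % 3 == 0);
    std::vector<Vec3> normals(positions.size(), Vec3{0.0f, 0.0f, 0.0f});
    for (std::size_t i = 0; i < indices.size(); i += 3) {
        const uint32_t ia = indices[i], ib = indices[i + 1], ic = indices[i + 2];
        // Unnormalized face normal: length is 2x the triangle area.
        const Vec3 n = cross(sub(positions[ib], positions[ia]),
                             sub(positions[ic], positions[ia]));
        for (uint32_t v : {ia, ib, ic}) {
            normals[v].x += n.x;
            normals[v].y += n.y;
            normals[v].z += n.z;
        }
    }
    for (Vec3& n : normals) {
        const float len = std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
        if (len > 0.0f) { n.x /= len; n.y /= len; n.z /= len; }
    }
    return normals;
}
```

Running this on the indexed mesh before triangle expansion means every expanded copy of a vertex carries the same normal.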
Cells at the smooth↔blocky boundary had no smooth corners themselves,
so the strict hasSmooth filter skipped them entirely. This prevented
quad emission between the smooth mesh and blocky territory, leaving
a visible gap. Now checks 6-connected neighbor cells for smooth corners,
ensuring boundary vertices exist for connecting quads.
Rewrote voxelSmoothPS.hlsl to derive a dominant face axis from the smooth
normal, then use the exact same neighbor verification as voxelPS.hlsl:
faceU/faceV tangent tables, stair-priority getNeighborMat(), face-aligned
fractional coords, blendZone 0.25, corner attenuation, bleedMask checks.
Added generateDebugSmooth() with 11 isolated test configurations
(smooth↔blocky transitions, staircases, surrounded patches, reference
blocky pairs). Launch with: BVLEVoxels.exe debugsmooth
Implement CPU-side Naive Surface Nets for smooth voxel surfaces (SmoothStone,
Snow) coexisting with blocky voxels (Grass, Dirt, Stone, Sand).
Key features:
- SmoothMesher with binary SDF, centroid vertex placement, per-axis boundary
clamping to align with blocky grid at smooth↔blocky transitions
- Cross-chunk connectivity: PAD=2 SDF grid, vertex range [-1, CHUNK_SIZE),
canonical edge ownership (no duplicate triangles, no z-fighting)
- Face normals oriented by edge axis+sign (robust with binary SDF, unlike
SDF gradient dot or centroid sampling approaches)
- Y-axis winding fix: the cells sharing a Y-axis edge are arranged differently
in space, so they need the opposite winding from the X and Z axes
- GPU mesher treats smooth neighbors as solid (no blocky faces toward smooth)
- Material blending: primary (smooth-only) + secondary (all counts) per vertex
- Dedicated shaders: voxelSmoothVS (vertex pulling t6) + voxelSmoothPS
(triplanar + lerp blending between two materials)
- Separate render pass with LoadOp::LOAD after voxels+topings
- New materials: SmoothStone (mat 6), blocky Stone (mat 3) and Dirt patches
added to world generation for boundary testing
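Centroid vertex placement with a binary SDF can be illustrated on a single cell. The sketch below is a generic Naive Surface Nets step, not the SmoothMesher implementation: `surfaceNetsVertex` and the corner numbering are assumptions, and the per-axis boundary clamping from the list above is left out. Because a binary SDF carries no distance information, each sign-changing edge contributes its midpoint.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

struct Vec3 { float x, y, z; };

// Assumed numbering: corner i sits at (i & 1, (i >> 1) & 1, (i >> 2) & 1).
inline Vec3 cornerPos(int i) {
    return {float(i & 1), float((i >> 1) & 1), float((i >> 2) & 1)};
}

// solidMask bit i is 1 when corner i is solid. Writes the cell vertex
// (in cell-local [0,1]^3 coordinates) and returns true if the surface
// crosses the cell; returns false for fully solid or fully empty cells.
bool surfaceNetsVertex(uint8_t solidMask, Vec3& out) {
    Vec3 sum{0.0f, 0.0f, 0.0f};
    int crossings = 0;
    for (int a = 0; a < 8; ++a) {
        for (int axis = 0; axis < 3; ++axis) {
            if (a & (1 << axis)) continue;  // visit each of the 12 edges once
            const int b = a | (1 << axis);
            const bool solidA = ((solidMask >> a) & 1) != 0;
            const bool solidB = ((solidMask >> b) & 1) != 0;
            if (solidA == solidB) continue;  // no surface crossing on this edge
            const Vec3 pa = cornerPos(a), pb = cornerPos(b);
            sum.x += 0.5f * (pa.x + pb.x);
            sum.y += 0.5f * (pa.y + pb.y);
            sum.z += 0.5f * (pa.z + pb.z);
            ++crossings;
        }
    }
    if (crossings == 0) return false;
    out = {sum.x / crossings, sum.y / crossings, sum.z / crossings};
    return true;
}
```

For a flat floor (the four lower corners solid) the vertex lands at the cell center; for a single solid corner it is pulled toward that corner.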
Use same ambient (0.15, 0.18, 0.25) as voxel PS instead of greener
tint. Increase translucency (0.6) to reduce contrast when orbiting
around grass. Wrap at 0.85 for balanced lit-side brightness.
Stone: add corner fill triangles at adjacent open edges and cap
triangles at strip terminations. Grass: replace bevel strips with
tuft-based grass blades — clusters of 3-9 curved double-sided
blades with per-tuft height/lean personality and hash-driven
placement (quadratic inset 0-0.30 from edge). Vegetation PS uses
half-Lambert wrap lighting + translucency for soft stylized shading
(inspired by Airborn Trees). Stone keeps classic Lambert.
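For reference, wrap lighting is commonly written as (N·L + w) / (1 + w), clamped to [0, 1]. The sketch below uses that common form as an assumption; the exact expression in the vegetation PS is not given here, and the translucency term is not modeled. `wrapDiffuse` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>

// Wrapped diffuse term. wrap = 0 gives classic Lambert (as kept for stone),
// wrap = 1 gives half-Lambert, values in between soften the terminator.
inline float wrapDiffuse(float nDotL, float wrap) {
    assert(wrap >= 0.0f);
    return std::clamp((nDotL + wrap) / (1.0f + wrap), 0.0f, 1.0f);
}
```

With wrap = 1, a surface facing 90 degrees away from the sun still receives half intensity, which is what removes the hard dark side on thin blades.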
Add instanced rendering for toping bevels: dedicated shaders
(voxelTopingVS/PS), PSO, GPU buffers (t4 vertices, t5 instances),
per-group DrawInstanced in a separate render pass with LoadOp::LOAD.
Fix inverted face winding (emitTri auto-winding condition flipped for
CW front-facing), slope normals (use inward direction not outward),
and PS lighting (negate sunDirection like voxelPS). Update CLAUDE.md
with Phase 4.1/4.2 documentation.
- Add bleedMask/resistBleedMask bitmasks to CB for per-material blend control
- Grass: canBleed + resistsBleed (bleeds onto others, nothing bleeds onto it)
- Stone: no bleed (doesn't overflow, but accepts bleed from others)
- Other materials: normal bidirectional blending
- PS checks flags before blending: mainResists → skip, !neighCanBleed → skip
- Flatten terrain (heightScale 64→20) for better surface visibility
- Replace altitude-based material bands with noise-based 2D patches
(3 noise channels create organic patches of all 5 materials on surface)
- Make stone/sand more visually distinct (stone=blue-gray, sand=warm yellow)
- Lower stone heightContrast (1.2→0.5) so neighbors bleed onto it more
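The two-mask check described above (mainResists → skip, !neighCanBleed → skip) could look like this on the CPU side. It is a sketch only: the function name is hypothetical, and the material IDs are placeholders for illustration (only Stone = 3 appears elsewhere in this log).

```cpp
#include <cassert>
#include <cstdint>

// Placeholder material IDs for illustration; only Stone = 3 is from the log.
enum Material : uint32_t { Grass = 1, Dirt = 2, Stone = 3, Sand = 4 };

constexpr uint32_t matBit(uint32_t mat) { return 1u << mat; }

// bleedMask: materials allowed to bleed onto neighbors.
// resistBleedMask: materials that refuse bleed from neighbors.
constexpr uint32_t kBleedMask = matBit(Grass) | matBit(Dirt) | matBit(Sand);
constexpr uint32_t kResistBleedMask = matBit(Grass);

// True when neighMat may blend onto a pixel whose main material is mainMat.
inline bool shouldBlend(uint32_t mainMat, uint32_t neighMat,
                        uint32_t bleedMask, uint32_t resistBleedMask) {
    assert(mainMat < 32 && neighMat < 32);
    if (resistBleedMask & matBit(mainMat)) return false;  // main resists
    if (!(bleedMask & matBit(neighMat))) return false;    // neighbor cannot bleed
    return true;
}
```

With these masks grass bleeds onto sand but sand never bleeds onto grass, and stone accepts bleed from dirt without bleeding back.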
Compute shader fills indirect args buffer, replacing CPU cull loop.
Single DrawInstancedIndirectCount renders all visible face groups.
Key fixes:
- Compute shader: pack chunkIndex|(faceIndex<<16) in push constant,
startVertexLocation=0 (aligned with Phase 2.2 SV_VertexID fix)
- PushConstants must be called AFTER BindPipelineState, not before.
Wicked Engine dispatches to SetGraphicsRoot32BitConstants only when
active_pso is set; after BindComputeShader it targets compute instead.
- Barriers: UNDEFINED(COMMON)→UAV before compute, UAV→INDIRECT_ARGUMENT after
- Buffer decay: DX12 buffers always return to COMMON between frames,
no cross-frame state tracking needed
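The chunkIndex|(faceIndex<<16) packing mentioned above is easy to get subtly wrong, so here is a minimal round-trip sketch. Function names are hypothetical; the 16-bit split implies that chunkIndex must stay below 65536.

```cpp
#include <cassert>
#include <cstdint>

// Pack a chunk index (low 16 bits) and a face index (high 16 bits)
// into the single 32-bit push constant.
inline uint32_t packDrawId(uint32_t chunkIndex, uint32_t faceIndex) {
    assert(chunkIndex <= 0xFFFFu);  // must fit in the low 16 bits
    assert(faceIndex <= 0xFFFFu);   // 6 face directions in practice
    return chunkIndex | (faceIndex << 16);
}

// The vertex shader side would undo the packing the same way.
constexpr uint32_t unpackChunk(uint32_t packed) { return packed & 0xFFFFu; }
constexpr uint32_t unpackFace(uint32_t packed) { return packed >> 16; }
```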
Replace per-chunk DrawInstanced loop with a single DrawInstancedIndirectCount.
CPU fills indirect args buffer with same frustum+backface cull logic as Phase 2.1.
Key discoveries:
- Wicked Engine command signature includes push constant (20-byte stride, not 16)
- SV_VertexID does not reliably include startVertexLocation with ExecuteIndirect
- Solution: pack chunkIndex|(faceIndex<<16) in push constant, VS reconstructs
quad offset from GPUChunkInfo lookup
- No explicit DX12 barriers needed (implicit promotion from COMMON suffices)
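The 20-byte stride follows from one 32-bit push constant plus the four 32-bit fields of a D3D12 draw argument. A possible CPU-side layout is sketched below. The struct name and the position of the push constant ahead of the draw arguments are assumptions; only the stride and the packed value are stated above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical layout of one indirect command (20 bytes).
struct PackedIndirectDraw {
    uint32_t pushConstant;            // chunkIndex | (faceIndex << 16)
    uint32_t vertexCountPerInstance;  // same order as D3D12_DRAW_ARGUMENTS
    uint32_t instanceCount;
    uint32_t startVertexLocation;     // not relied on by the VS
    uint32_t startInstanceLocation;
};
static_assert(sizeof(PackedIndirectDraw) == 20, "stride must be 20 bytes, not 16");

// Build one command for a face group; the VS recovers the quad offset
// from the push constant instead of from startVertexLocation.
inline PackedIndirectDraw makeDraw(uint32_t chunkIndex, uint32_t faceIndex,
                                   uint32_t vertexCount) {
    assert(chunkIndex <= 0xFFFFu && faceIndex <= 0xFFFFu);
    return {chunkIndex | (faceIndex << 16), vertexCount, 1u, 0u, 0u};
}
```

Writing these records with a 16-byte stride would shift every command after the first by one field, which is consistent with the stride pitfall noted above.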
Also adds voxel_engine_spec.md and updates references from .docx to .md.
- VS supports dual mode: CPU path (push constants) and MDI path (binary search)
- CPU render loop now does per-face-group draws with backface culling (6 draws/chunk max)
- Frustum planes extracted and populated in constant buffer for GPU cull shader
- GPU cull + MDI path fully implemented but disabled (barrier/state debugging needed)
- GPU timestamp query infrastructure with readback for cull/draw timing
- HUD shows rendering mode (GPU cull vs CPU fallback)
Mega-buffer architecture replacing per-chunk GPU buffers:
- Single StructuredBuffer<PackedQuad> for all chunks (2M quads, 16 MB)
- StructuredBuffer<GPUChunkInfo> with per-chunk metadata (position, quad offsets, face groups)
- VS reads chunk info via push constants (b999) for driver-safe chunk indexing
- CPU frustum culling with wi::primitive::Frustum + AABB per chunk
- Quads sorted by face direction in greedy mesher (faceOffsets/faceCounts)
- GPU frustum + backface cull compute shader (voxelCullCS.hlsl)
- GPU binary mesher compute shader baseline (voxelMeshCS.hlsl)
- Indirect draw buffers and timestamp query infrastructure
- README with build instructions and project architecture
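Per-chunk frustum culling against an AABB is typically a six-plane test. The sketch below is a standalone version of that idea and does not use the wi::primitive::Frustum API; `Plane`, `Aabb` and `aabbVisible` are hypothetical names, and planes are assumed to point inward (a point p is inside when n·p + d >= 0).

```cpp
#include <array>
#include <cassert>

struct Vec3 { float x, y, z; };
struct Plane { float nx, ny, nz, d; };  // inside when n.p + d >= 0
struct Aabb { Vec3 min, max; };

// Conservative test: returns false only when the box lies entirely
// outside at least one plane.
bool aabbVisible(const std::array<Plane, 6>& planes, const Aabb& box) {
    for (const Plane& p : planes) {
        // Pick the box corner farthest along the plane normal.
        const float x = p.nx >= 0.0f ? box.max.x : box.min.x;
        const float y = p.ny >= 0.0f ? box.max.y : box.min.y;
        const float z = p.nz >= 0.0f ? box.max.z : box.min.z;
        if (p.nx * x + p.ny * y + p.nz * z + p.d < 0.0f) {
            return false;  // even the best corner is outside this plane
        }
    }
    return true;
}
```

The test can keep a few boxes that sit outside near frustum corners, which is harmless for culling since it never rejects a visible chunk.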