bvle-voxels

Author	SHA1	Message	Date
Samuel Bouchet	bc29a02c35	Phase 4.2: GPU toping rendering pipeline + winding/lighting fixes Add instanced rendering for toping bevels: dedicated shaders (voxelTopingVS/PS), PSO, GPU buffers (t4 vertices, t5 instances), per-group DrawInstanced in a separate render pass with LoadOp::LOAD. Fix inverted face winding (emitTri auto-winding condition flipped for CW front-facing), slope normals (use inward direction not outward), and PS lighting (negate sunDirection like voxelPS). Update CLAUDE.md with Phase 4.1/4.2 documentation.	2026-03-26 17:47:08 +01:00
Samuel Bouchet	9e777d653b	Phase 4.1: TopingSystem infrastructure + procedural mesh generation - TopingSystem with TopingDef registry, procedural mesh gen, instance collection - 2 toping types: stone bevel (h=0.06, smooth) + grass edge (h=0.12, bumpy) - 16 mesh variants per type indexed by 4-bit adjacency bitmask (~6 unique with symmetry) - Wedge cross-section: outer wall + sloped top, grass has sinusoidal height profile - Instance collection scans exposed +Y faces, checks same-material neighbors - Cross-chunk adjacency via VoxelWorld::getVoxel() - Integrated into VoxelRenderPath: init at Start(), stats in HUD - ~191K instances, 1920 mesh vertices for 170 chunks (validated) - Research doc (research_connected_meshes.md) + plan (plan_phase4.md)	2026-03-26 15:27:15 +01:00
Samuel Bouchet	f166394b60	Phase 3: per-material bleed flags + patch-based terrain for blend testing - Add bleedMask/resistBleedMask bitmasks to CB for per-material blend control - Grass: canBleed + resistsBleed (bleeds onto others, nothing bleeds onto it) - Stone: no bleed (doesn't overflow, but accepts bleed from others) - Other materials: normal bidirectional blending - PS checks flags before blending: mainResists → skip, !neighCanBleed → skip - Flatten terrain (heightScale 64→20) for better surface visibility - Replace altitude-based material bands with noise-based 2D patches (3 noise channels create organic patches of all 5 materials on surface) - Make stone/sand more visually distinct (stone=blue-gray, sand=warm yellow) - Lower stone heightContrast (1.2→0.5) so neighbors bleed onto it more	2026-03-26 12:47:10 +01:00
Samuel Bouchet	d7e69f97ca	Phase 3: PS-based texture blending with winner-takes-all heightmap Replace pre-encoded quad blend data (v1) with per-pixel voxel data lookups in the pixel shader. The PS reads voxelDataBuffer (SRV t3) to find neighbor materials dynamically, enabling 2 independent blend axes, stair-priority neighbor detection, and winner-takes-all heightmap-driven transitions. Key design decisions validated through 6 iterations (see blending_experiments.md): - Winner-takes-all: material with highest heightmap score wins 100% (sharp but organic transitions, not smooth gradient) - Symmetric bias: bias = 0.5 - weight ensures equal chance at border - Subtractive corner attenuation (param=0.80): xAdj = xEdge - saturate(yEdge - 0.80) reduces blend at corners naturally - Blend zone = 0.25 voxels from each edge (50% of face) - Debug mode (F4) visualizes blend zones as colors	2026-03-26 12:14:08 +01:00
Samuel Bouchet	21f1bd1a12	Phase 2.5: GPU meshing production pipeline + perf optimizations (80+ FPS) Replace CPU greedy mesher with GPU compute mesher as default rendering pipeline. Key optimizations identified via CPU profiling (ProfileAccum, 5s averages): - Fused regenerate+pack: parallel noise gen + memcpy in same jobsystem pass (6ms → 0ms) - VoxelData memcpy: sizeof(VoxelData)==2 enables direct memcpy instead of bit-shift loop (28ms → <1ms) - Dirty-skip: GPU dispatch/upload only when chunks change, not every frame - Animation: 2 fBm octaves + no caves in animation mode (54ms → 8ms) - Result: 80-110 FPS with 60Hz terrain animation, 700+ FPS static	2026-03-26 09:05:52 +01:00
Samuel Bouchet	9a8f80de51	Phase 2.4: GPU compute mesher benchmark (CPU greedy vs GPU baseline) One-shot benchmark runs automatically after world generation: - CPU greedy mesher: 277ms, 358K quads (binary greedy merge) - GPU baseline (1x1): 5.3ms, 2.43M quads (no merge, 52x faster) - Greedy merge reduces quad count by 6.8x Implementation: - State machine: DISPATCH (upload voxels + dispatch) → READBACK → DONE - GPU timestamps for accurate timing - Readback buffer for quad counter - Each chunk's voxel data uploaded and dispatched sequentially	2026-03-25 22:51:22 +01:00
Samuel Bouchet	1bfadc2f7c	Phase 2.3: GPU compute culling with frustum + backface cull Compute shader fills indirect args buffer, replacing CPU cull loop. Single DrawInstancedIndirectCount renders all visible face groups. Key fixes: - Compute shader: pack chunkIndex\|(faceIndex<<16) in push constant, startVertexLocation=0 (aligned with Phase 2.2 SV_VertexID fix) - PushConstants must be called AFTER BindPipelineState, not before. Wicked Engine dispatches to SetGraphicsRoot32BitConstants only when active_pso is set; after BindComputeShader it targets compute instead. - Barriers: UNDEFINED(COMMON)→UAV before compute, UAV→INDIRECT_ARGUMENT after - Buffer decay: DX12 buffers always return to COMMON between frames, no cross-frame state tracking needed	2026-03-25 22:30:50 +01:00
Samuel Bouchet	45af49a659	Phase 2.2: MDI rendering with CPU-filled indirect args Replace per-chunk DrawInstanced loop with a single DrawInstancedIndirectCount. CPU fills indirect args buffer with same frustum+backface cull logic as Phase 2.1. Key discoveries: - Wicked Engine command signature includes push constant (20-byte stride, not 16) - SV_VertexID does not reliably include startVertexLocation with ExecuteIndirect - Solution: pack chunkIndex\|(faceIndex<<16) in push constant, VS reconstructs quad offset from GPUChunkInfo lookup - No explicit DX12 barriers needed (implicit promotion from COMMON suffices) Also adds voxel_engine_spec.md and updates references from .docx to .md.	2026-03-25 22:07:22 +01:00
Samuel Bouchet	46e8f50f37	Phase 2 complete: per-face-group backface culling, frustum planes, GPU cull infrastructure - VS supports dual mode: CPU path (push constants) and MDI path (binary search) - CPU render loop now does per-face-group draws with backface culling (6 draws/chunk max) - Frustum planes extracted and populated in constant buffer for GPU cull shader - GPU cull + MDI path fully implemented but disabled (barrier/state debugging needed) - GPU timestamp query infrastructure with readback for cull/draw timing - HUD shows rendering mode (GPU cull vs CPU fallback)	2026-03-25 14:50:55 +01:00
Samuel Bouchet	5f346bb14a	Phase 2: GPU-driven voxel rendering pipeline Mega-buffer architecture replacing per-chunk GPU buffers: - Single StructuredBuffer<PackedQuad> for all chunks (2M quads, 16 MB) - StructuredBuffer<GPUChunkInfo> with per-chunk metadata (position, quad offsets, face groups) - VS reads chunk info via push constants (b999) for driver-safe chunk indexing - CPU frustum culling with wi::primitive::Frustum + AABB per chunk - Quads sorted by face direction in greedy mesher (faceOffsets/faceCounts) - GPU frustum + backface cull compute shader (voxelCullCS.hlsl) - GPU binary mesher compute shader baseline (voxelMeshCS.hlsl) - Indirect draw buffers and timestamp query infrastructure - README with build instructions and project architecture	2026-03-25 14:24:05 +01:00

10 commits