bvle-voxels

Author	SHA1	Message	Date
Samuel Bouchet	cd9814e494	Phase 5.2-5.3: CPU perf optimizations + GPU compute Surface Nets. CPU smooth mesher optimizations (560ms → 17ms): VoxelData grid cache eliminates redundant readVoxel calls; pre-cached 27 neighbor chunk pointers (readVoxelFast); smoothNear dilation (8 lookups/cell instead of 56); early exit via containsSmooth flag on chunks; thread-local scratch buffers (SmoothScratch ~600KB); wi::jobsystem parallelization across all cores; persistent staging vectors for upload. TopingSystem optimizations (58ms → 6ms): collectInstancesParallel() with per-chunk local vectors; neighbor chunk pointer caching. GPU compute Surface Nets (Phase 5.3): two-pass compute shader (centroid grid + emit with smooth normals); Pass 1 (voxelSmoothCentroidCS) computes centroids + solid flags for cells [-1..32], with cross-chunk neighbor voxel reading; Pass 2 (voxelSmoothCS) reads ONLY from the centroid grid and computes area-weighted smooth normals from 12 incident edges per vertex; batched dispatch runs all centroid passes, then all emit passes, with a single UAV→SRV barrier (instead of 2 barriers per chunk); smooth chunk filtering dispatches only chunks with containsSmooth; centroid grid buffer is dynamically sized per smooth chunk count; 1-frame readback delay with auto-redispatch on first frame.	2026-03-27 22:30:43 +01:00
Samuel Bouchet	b45d5a1884	Phase 5.1: smooth PS blending uses same logic as blocky PS + debug scene. Rewrote voxelSmoothPS.hlsl to derive a dominant face axis from the smooth normal, then use the exact same neighbor verification as voxelPS.hlsl: faceU/faceV tangent tables, stair-priority getNeighborMat(), face-aligned fractional coords, blendZone 0.25, corner attenuation, bleedMask checks. Added generateDebugSmooth() with 11 isolated test configurations (smooth↔blocky transitions, staircases, surrounded patches, reference blocky pairs). Launch with: BVLEVoxels.exe debugsmooth	2026-03-27 14:21:35 +01:00
Samuel Bouchet	aab38bb9b9	Phase 5.1: Naive Surface Nets smooth rendering. Implement CPU-side Naive Surface Nets for smooth voxel surfaces (SmoothStone, Snow) coexisting with blocky voxels (Grass, Dirt, Stone, Sand). Key features: SmoothMesher with binary SDF, centroid vertex placement, per-axis boundary clamping to align with blocky grid at smooth↔blocky transitions; cross-chunk connectivity: PAD=2 SDF grid, vertex range [-1, CHUNK_SIZE), canonical edge ownership (no duplicate triangles, no z-fighting); face normals oriented by edge axis+sign (robust with binary SDF, unlike SDF gradient dot or centroid sampling approaches); Y-axis winding fix: sharing cells have different spatial arrangement, requiring opposite winding from X and Z axes; GPU mesher treats smooth neighbors as solid (no blocky faces toward smooth); material blending: primary (smooth-only) + secondary (all counts) per vertex; dedicated shaders: voxelSmoothVS (vertex pulling t6) + voxelSmoothPS (triplanar + lerp blending between two materials); separate render pass with LoadOp::LOAD after voxels+topings; new materials: SmoothStone (mat 6), blocky Stone (mat 3) and Dirt patches added to world generation for boundary testing.	2026-03-27 13:03:55 +01:00
Samuel Bouchet	f166394b60	Phase 3: per-material bleed flags + patch-based terrain for blend testing. Add bleedMask/resistBleedMask bitmasks to CB for per-material blend control; Grass: canBleed + resistsBleed (bleeds onto others, nothing bleeds onto it); Stone: no bleed (doesn't overflow, but accepts bleed from others); other materials: normal bidirectional blending; PS checks flags before blending: mainResists → skip, !neighCanBleed → skip; flatten terrain (heightScale 64→20) for better surface visibility; replace altitude-based material bands with noise-based 2D patches (3 noise channels create organic patches of all 5 materials on surface); make stone/sand more visually distinct (stone=blue-gray, sand=warm yellow); lower stone heightContrast (1.2→0.5) so neighbors bleed onto it more.	2026-03-26 12:47:10 +01:00
Samuel Bouchet	21f1bd1a12	Phase 2.5: GPU meshing production pipeline + perf optimizations (80+ FPS). Replace CPU greedy mesher with GPU compute mesher as default rendering pipeline. Key optimizations identified via CPU profiling (ProfileAccum, 5s averages): fused regenerate+pack: parallel noise gen + memcpy in same jobsystem pass (6ms → 0ms); VoxelData memcpy: sizeof(VoxelData)==2 enables direct memcpy instead of bit-shift loop (28ms → <1ms); dirty-skip: GPU dispatch/upload only when chunks change, not every frame; animation: 2 fBm octaves + no caves in animation mode (54ms → 8ms). Result: 80-110 FPS with 60Hz terrain animation, 700+ FPS static.	2026-03-26 09:05:52 +01:00
Samuel Bouchet	5f346bb14a	Phase 2: GPU-driven voxel rendering pipeline. Mega-buffer architecture replacing per-chunk GPU buffers: single StructuredBuffer<PackedQuad> for all chunks (2M quads, 16 MB); StructuredBuffer<GPUChunkInfo> with per-chunk metadata (position, quad offsets, face groups); VS reads chunk info via push constants (b999) for driver-safe chunk indexing; CPU frustum culling with wi::primitive::Frustum + AABB per chunk; quads sorted by face direction in greedy mesher (faceOffsets/faceCounts); GPU frustum + backface cull compute shader (voxelCullCS.hlsl); GPU binary mesher compute shader baseline (voxelMeshCS.hlsl); indirect draw buffers and timestamp query infrastructure; README with build instructions and project architecture.	2026-03-25 14:24:05 +01:00
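
The bleed rule described in commit f166394b60 (mainResists → skip, !neighCanBleed → skip) can be sketched on the CPU side as below. This is only an illustration, not the repository's shader code: the `BleedFlags` struct, the `shouldBlend` and `makeExampleFlags` helpers, and every material index except Stone (mat 3, taken from the log) are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Material indices are hypothetical, except Stone = 3 which the log mentions.
enum Material : uint32_t { Grass = 1, Dirt = 2, Stone = 3, Sand = 4 };

// Assumed mirror of the two bitmasks the commit adds to the constant buffer.
struct BleedFlags {
    uint32_t bleedMask;        // bit m set: material m may bleed onto neighbors
    uint32_t resistBleedMask;  // bit m set: nothing may bleed onto material m
};

// Returns true when the neighbor material is allowed to blend onto the main one.
inline bool shouldBlend(const BleedFlags& f, uint32_t mainMat, uint32_t neighMat) {
    const bool mainResists   = ((f.resistBleedMask >> mainMat) & 1u) != 0;
    const bool neighCanBleed = ((f.bleedMask >> neighMat) & 1u) != 0;
    if (mainResists) return false;     // mainResists -> skip
    if (!neighCanBleed) return false;  // !neighCanBleed -> skip
    return true;
}

// Flags matching the behavior listed in the commit message:
// Grass bleeds and resists, Stone never bleeds but accepts bleed,
// the other materials blend in both directions.
inline BleedFlags makeExampleFlags() {
    BleedFlags f{};
    f.bleedMask       = (1u << Grass) | (1u << Dirt) | (1u << Sand);
    f.resistBleedMask = (1u << Grass);
    return f;
}
```

With these example flags, Grass blends onto Stone, nothing blends onto Grass, and Stone never blends onto its neighbors.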

6 commits
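
Commit 21f1bd1a12 notes that sizeof(VoxelData)==2 allows a direct memcpy instead of a bit-shift loop. A minimal sketch of that idea follows. The field layout of `VoxelData` and the names `packShift` and `packMemcpy` are assumptions for illustration; the log only states the struct size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical layout: the log only guarantees sizeof(VoxelData) == 2.
struct VoxelData {
    uint8_t material;
    uint8_t flags;
};
static_assert(sizeof(VoxelData) == 2, "packing relies on a 2-byte voxel");
static_assert(std::is_trivially_copyable<VoxelData>::value,
              "memcpy requires a trivially copyable type");

// Per-voxel bit-shift packing: one shift and one OR per voxel.
inline std::vector<uint16_t> packShift(const std::vector<VoxelData>& voxels) {
    std::vector<uint16_t> out(voxels.size());
    for (std::size_t i = 0; i < voxels.size(); ++i) {
        out[i] = static_cast<uint16_t>(voxels[i].material |
                                       (voxels[i].flags << 8));
    }
    return out;
}

// Bulk packing: copies the raw bytes of the whole array in one call.
// The result matches packShift only on little-endian targets.
inline std::vector<uint16_t> packMemcpy(const std::vector<VoxelData>& voxels) {
    std::vector<uint16_t> out(voxels.size());
    if (!voxels.empty()) {
        std::memcpy(out.data(), voxels.data(),
                    voxels.size() * sizeof(VoxelData));
    }
    return out;
}
```

The two static_asserts make the assumption explicit: if the struct ever grows or gains a non-trivial member, the build fails instead of silently uploading a wrong layout.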