bvle-voxels

Author	SHA1	Message	Date
Samuel Bouchet	c2d1a1e0b6	Commit plan and iteration instructions	2026-03-31 20:04:00 +02:00
Samuel Bouchet	57ac08f231	Refactor: extract VoxelRTManager, DeferredGPUBuffer, decompose VoxelRenderPath - Extract DeferredGPUBuffer utility (staging→dirty→capacity GPU buffer pattern) - Extract VoxelRTManager from VoxelRenderer (~500 lines: BLAS/TLAS, RT shadows+AO) - Decompose VoxelRenderPath into CameraController, AnimationState, VoxelProfiler - Replace toping std::sort with O(n) counting sort by (type, variant) - Update CLAUDE.md architecture docs to reflect new file structure	2026-03-31 13:46:35 +02:00
Samuel Bouchet	f134a5786d	Add sky and refs	2026-03-30 21:54:55 +02:00
Samuel Bouchet	cd9814e494	Phase 5.2-5.3: CPU perf optimizations + GPU compute Surface Nets CPU smooth mesher optimizations (560ms → 17ms): - VoxelData grid cache eliminates redundant readVoxel calls - Pre-cached 27 neighbor chunk pointers (readVoxelFast) - smoothNear dilation (8 lookups/cell instead of 56) - Early exit via containsSmooth flag on chunks - Thread-local scratch buffers (SmoothScratch ~600KB) - wi::jobsystem parallelization across all cores - Persistent staging vectors for upload TopingSystem optimizations (58ms → 6ms): - collectInstancesParallel() with per-chunk local vectors - Neighbor chunk pointer caching GPU compute Surface Nets (Phase 5.3): - Two-pass compute shader: centroid grid + emit with smooth normals - Pass 1 (voxelSmoothCentroidCS): computes centroids + solid flags for cells [-1..32], cross-chunk neighbor voxel reading - Pass 2 (voxelSmoothCS): reads ONLY from centroid grid, computes area-weighted smooth normals from 12 incident edges per vertex - Batched dispatch: all centroid passes then all emit passes with single UAV→SRV barrier (instead of 2 barriers per chunk) - Smooth chunk filtering: only dispatches chunks with containsSmooth - Centroid grid buffer dynamically sized per smooth chunk count - 1-frame readback delay with auto-redispatch on first frame	2026-03-27 22:30:43 +01:00
Samuel Bouchet	45af49a659	Phase 2.2: MDI rendering with CPU-filled indirect args Replace per-chunk DrawInstanced loop with a single DrawInstancedIndirectCount. CPU fills indirect args buffer with same frustum+backface cull logic as Phase 2.1. Key discoveries: - Wicked Engine command signature includes push constant (20-byte stride, not 16) - SV_VertexID does not reliably include startVertexLocation with ExecuteIndirect - Solution: pack chunkIndex|(faceIndex<<16) in push constant, VS reconstructs quad offset from GPUChunkInfo lookup - No explicit DX12 barriers needed (implicit promotion from COMMON suffices) Also adds voxel_engine_spec.md and updates references from .docx to .md.	2026-03-25 22:07:22 +01:00
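
The push-constant packing mentioned in commit 45af49a659 can be sketched on the CPU side roughly as follows. This is a minimal illustration, not the repository's code: the names `DrawKey`, `packDrawKey` and `unpackDrawKey` are hypothetical, and the 16-bit limit on both fields is an assumption inferred from the `<<16` shift in the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout, assumed from "chunkIndex|(faceIndex<<16)":
// low 16 bits hold the chunk index, high 16 bits hold the face index.
struct DrawKey {
    std::uint32_t chunkIndex;
    std::uint32_t faceIndex;
};

// Pack both indices into one 32-bit value suitable for a push constant.
inline std::uint32_t packDrawKey(std::uint32_t chunkIndex, std::uint32_t faceIndex) {
    // Assumption: both values fit in 16 bits; otherwise the fields would overlap.
    assert(chunkIndex <= 0xFFFFu && faceIndex <= 0xFFFFu);
    return chunkIndex | (faceIndex << 16);
}

// Inverse operation, mirroring what a vertex shader would do before
// looking up the chunk's quad offset.
inline DrawKey unpackDrawKey(std::uint32_t packed) {
    return DrawKey{packed & 0xFFFFu, packed >> 16};
}
```

A vertex shader would apply the same mask and shift to recover both indices, then use the chunk index for the GPUChunkInfo lookup described in the commit.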

5 commits