Commit graph

40 commits

Author SHA1 Message Date
626fbaea80 Fix smooth Surface Nets rendering: eliminate faceting, fix blocky junction
- Remove geoN (ddx/ddy) from smooth PS entirely — use smooth interpolated
  normal N for all triplanar sampling (albedo, heightmap, normal map).
  geoN changes discontinuously at triangle edges, causing per-triangle
  faceting in texture weights and normal perturbation.
- Tune consistency-based vertex normal blend to smoothstep(0.70, 0.90):
  snaps to face normal at 90° boundaries (seamless blocky join) while
  preserving smooth normals on curved terrain.
- Unify all 3 edge axes (X/Y/Z) to same smoothstep formula (was mixed
  smoothstep + pow4).
- Remove grass-specific hardcoded shading from both PS (side darkening,
  warm shift, ambient boost) — will be data-driven per-material later.
- Remove CPU SmoothMesher code (GPU-only path).
- Document all findings in TROUBLESHOOTING.md with calibration table.
2026-04-01 20:35:42 +02:00
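
The consistency-based normal blend in this commit can be sketched on the CPU as follows. This is a minimal illustration, not the shader code: `blendVertexNormal` and the `consistency` input are hypothetical names, and `consistency` is assumed to lie in [0, 1], low at 90° blocky edges and high on gently curved terrain.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

// Hermite smoothstep, same definition as the HLSL intrinsic.
inline float smoothstep(float edge0, float edge1, float x) {
    assert(edge0 < edge1);
    float t = std::min(std::max((x - edge0) / (edge1 - edge0), 0.0f), 1.0f);
    return t * t * (3.0f - 2.0f * t);
}

inline Vec3 normalize(Vec3 v) {
    float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return { v.x / len, v.y / len, v.z / len };
}

// Hypothetical helper: below 0.70 the result is the face normal, above 0.90
// it is the smooth normal, with a Hermite ramp in between.
inline Vec3 blendVertexNormal(Vec3 faceN, Vec3 smoothN, float consistency) {
    float t = smoothstep(0.70f, 0.90f, consistency);  // 0 = face, 1 = smooth
    return normalize({ faceN.x + (smoothN.x - faceN.x) * t,
                       faceN.y + (smoothN.y - faceN.y) * t,
                       faceN.z + (smoothN.z - faceN.z) * t });
}
```

With these thresholds a vertex on a hard 90° edge gets the exact face normal, so it lines up with the adjacent blocky face.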
d5bf499375 Add debug tools 2026-04-01 18:12:58 +02:00
4c50727cb6 Ignore some files 2026-04-01 18:12:53 +02:00
4419c612bd Phase 8: Real stylized textures with UDN triplanar normal mapping
- Load CC0 FreeStylized textures (6 materials: grass, dirt, stone, sand, snow, smoothstone)
  as Texture2DArray: t1=albedo+heightmap RGBA, t7=normal maps GL format
- Height-based texture blending: winner-takes-all with sharpness=16, 40% blend zone,
  asymmetric bias (coeff 1.6) for resistBleed materials (grass resists sand bleed)
- UDN triplanar normal mapping with 3 critical fixes:
  * Use raw normal (NOT abs) in UDN formula — abs inverts lighting on -X/-Y/-Z faces
  * sign(normal) correction on tangent X for back-facing UV mirror
  * GL green channel flip on Y-projection only (not X/Z where V=worldY is correct)
- Dirt material rendered smooth (FLAG_SMOOTH), ground_02 texture darkened 0.75
- Sun orbit debug mode (F7): 10s cycle with sinusoidal altitude
- Crosshair + face debug HUD (F8): DDA raycast, camera/target/face/normal info
- Screenshot F6 now writes companion .log file with full debug state
- Document UDN pitfalls and logical vs physical coordinates in TROUBLESHOOTING.md
- Add tools/prepare_textures.py for texture pipeline (ZIP → albedo+height RGBA + normal)
2026-04-01 13:41:06 +02:00
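
The first UDN fix above (raw normal, not abs) can be illustrated with a CPU-side sketch of the commonly used UDN triplanar blend. The UV conventions per projection are an assumption taken from the widely circulated formulation, not from this repository, and the sign(normal) tangent correction and the green-channel flip are deliberately left out. `udnTriplanar` is a hypothetical name.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

inline Vec3 normalize(Vec3 v) {
    float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return { v.x / len, v.y / len, v.z / len };
}

// tnX/tnY/tnZ: tangent-space normals sampled for the X, Y and Z projections,
// already unpacked to [-1, 1]. n: world-space surface normal.
// w: triplanar blend weights, assumed to sum to 1.
inline Vec3 udnTriplanar(Vec3 tnX, Vec3 tnY, Vec3 tnZ, Vec3 n, Vec3 w) {
    // UDN: add the sampled XY to the matching world-normal components and
    // keep the raw (signed) world normal as the third component.
    Vec3 x = { tnX.x + n.z, tnX.y + n.y, n.x };  // X projection, UV = (z, y)
    Vec3 y = { tnY.x + n.x, tnY.y + n.z, n.y };  // Y projection, UV = (x, z)
    Vec3 z = { tnZ.x + n.x, tnZ.y + n.y, n.z };  // Z projection, UV = (x, y)
    // Swizzle each back to world space (x.zyx, y.xzy, z.xyz) and blend.
    return normalize({ x.z * w.x + y.x * w.y + z.x * w.z,
                       x.y * w.x + y.z * w.y + z.y * w.z,
                       x.x * w.x + y.y * w.y + z.z * w.z });
}
```

With a flat normal map (0, 0, 1) on a -X face the result is (-1, 0, 0). Feeding abs(n) instead would return (+1, 0, 0), which is the inverted lighting the commit describes.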
c2d1a1e0b6 Commit plan and iteration instructions 2026-03-31 20:04:00 +02:00
8ab908054c Fix HDR screenshot, reduce sun size, windowed 1080p by default
- Add F6 in-app screenshot saving voxelRT_ directly (bypasses Windows HDR)
- Shrink sun disc (pow 256), glow (pow 64), and haze (pow 8) for subtler sky
- Launch as centered 1920x1080 window instead of maximized
2026-03-31 14:58:44 +02:00
57ac08f231 Refactor: extract VoxelRTManager, DeferredGPUBuffer, decompose VoxelRenderPath
- Extract DeferredGPUBuffer utility (staging→dirty→capacity GPU buffer pattern)
- Extract VoxelRTManager from VoxelRenderer (~500 lines: BLAS/TLAS, RT shadows+AO)
- Decompose VoxelRenderPath into CameraController, AnimationState, VoxelProfiler
- Replace toping std::sort with O(n) counting sort by (type, variant)
- Update CLAUDE.md architecture docs to reflect new file structure
2026-03-31 13:46:35 +02:00
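
A counting sort keyed on (type, variant) might look like the sketch below. The `Instance` layout and the key ranges are assumptions for illustration (2 types and 16 variants, matching the Phase 4.1 numbers), not the project's actual structures.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical instance record; only type and variant matter for sorting.
struct Instance { uint8_t type; uint8_t variant; uint32_t payload; };

constexpr std::size_t kTypes = 2;      // assumed number of toping types
constexpr std::size_t kVariants = 16;  // assumed variants per type

// Stable O(n + k) sort with k = kTypes * kVariants buckets.
inline std::vector<Instance> countingSort(const std::vector<Instance>& in) {
    std::vector<std::size_t> offset(kTypes * kVariants + 1, 0);
    for (const Instance& i : in) {
        assert(i.type < kTypes && i.variant < kVariants);
        ++offset[i.type * kVariants + i.variant + 1];  // histogram, shifted by 1
    }
    // Prefix sum: offset[key] becomes the first output slot for that key.
    for (std::size_t k = 1; k < offset.size(); ++k) offset[k] += offset[k - 1];
    std::vector<Instance> out(in.size());
    for (const Instance& i : in) out[offset[i.type * kVariants + i.variant]++] = i;
    return out;
}
```

Because the key space is tiny and fixed, the sort makes two linear passes over the instances and needs no comparisons.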
53df73e5e6 Fixes after performance improvements 2026-03-31 08:53:37 +02:00
0d93cef8f1 GPU profiling + staggered BLAS builds + RT disable during animation
- Add comprehensive GPU timestamp queries for all major operations
  (mesh, smooth mesh, BLAS extract, BLAS build, draw, RT shadows)
- Add full-frame profiling: Wicked Render, GPU Wait/Sync, true FPS
- Stagger BLAS builds during animation: alternate blocky/smooth per
  frame, skip toping BLAS entirely (~130ms savings per frame)
- Auto-disable RT shadows on F3 animation start (prevents stale
  shadow artifacts), auto-restore on F3 stop with full BLAS rebuild
- Split buildAccelerationStructures() with selective build flags
- Result: animation ~24 FPS (CPU-bound on Regenerate 27ms)
  vs previous 2 FPS (GPU-bound on BLAS Build 1368ms)
2026-03-31 02:21:11 +02:00
0d3f8200b4 Refactor: remove dead CPU/MDI paths, GPU BLAS compute, 30Hz animation
- Remove ~430 lines of dead CPU mesh, MDI, and GPU cull render paths
  (rebuildMegaBuffer, IndirectDrawArgs, drawCountBuffer, cullShader, etc.)
- Add voxelTopingBLASCS.hlsl compute shader replacing 196ms CPU loop
  for toping BLAS position extraction (<1ms on GPU)
- Reduce animation rate from 60Hz to 30Hz (halves CPU regen cost)
- Simplify render() to GPU mesh path only (no conditional branches)
- Remove benchmark state machine and stale mode strings
2026-03-31 01:43:53 +02:00
f134a5786d Add sky and refs 2026-03-30 21:54:55 +02:00
afb86446cd Fix VRAM leak: capacity-based BLAS/TLAS + deferred toping BLAS upload
Per-frame CreateRaytracingAccelerationStructure calls during F3 animation
caused VRAM explosion (especially toping BLAS at ~23M vertices). Now all
3 BLASes use capacity-based allocation with 25% headroom — only recreated
when vertex count exceeds capacity, otherwise just BuildRaytracingAS with
updated desc.vertex_count. TLAS only recreated when instance count changes.

Also adds deferred toping BLAS position upload via UpdateBuffer in Render()
(topingBLASDirty_ flag), enabling toping shadows to update during animation.

Split CLAUDE.md into CLAUDE.md + TROUBLESHOOTING.md for maintainability.
2026-03-30 21:37:39 +02:00
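
The capacity-based allocation policy can be sketched independently of the graphics API. `BlasSlot` is a hypothetical bookkeeping type. In the real code, a `true` return would correspond to recreating the acceleration structure and `false` to a rebuild with an updated vertex count.

```cpp
#include <cstdint>

// Hypothetical bookkeeping for one BLAS allocation.
struct BlasSlot {
    uint32_t capacity = 0;     // vertices the current allocation can hold
    uint32_t recreations = 0;  // how many times the allocation was recreated

    // Returns true when the allocation had to grow (recreate),
    // false when the existing one can be rebuilt in place.
    bool update(uint32_t vertexCount) {
        if (vertexCount <= capacity) return false;
        capacity = vertexCount + vertexCount / 4;  // 25% headroom
        ++recreations;
        return true;
    }
};
```

With 25% headroom, a vertex count that fluctuates within a few percent per frame triggers a recreation only occasionally rather than every frame.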
dac63e3be5 Phase 6.2: toping BLAS shadows + adaptive TMin + perf optimization
- Re-enable toping BLAS in TLAS (3 instances: blocky + smooth + topings)
  with PREFER_FAST_TRACE for optimized BVH traversal (23M tris)
- Separate shadow/AO ray origins: shadow uses worldPos directly (zero bias),
  AO keeps normal bias (0.15) for hemisphere self-avoidance
- Adaptive TMin solves self-hit vs gap dilemma:
  ground (N.y≈1) → TMin=0.002 for tight blade shadows,
  blade surfaces (N.y≈0) → TMin=0.10 to skip own geometry
- Shadow rays 4→3 with tight cone (0.012 rad), AO rays 8→4
  (7 total rays/pixel, temporal accumulation compensates)
- Remove screen-space contact shadows (they don't work for thin geometry)
2026-03-30 13:58:57 +02:00
3d0c4f2f80 Phase 4.2+7: grass blade rework + soft RT shadows + toping BLAS optimization
Grass blades:
- Leaf-shaped profile (4 sections: base→belly→taper→tip) instead of spiky triangles
- Wider blades (base 0.055-0.095), more spacing between blades (±0.07 scatter)
- Natural green texture (50,140,35 → 80,180,55) instead of neon lime
- Reduced warm shift and removed artificial saturation boost
- Side faces at 60% brightness (dark green) instead of 38% (near-black)

Soft RT shadows:
- 2 jittered shadow rays per pixel with IGN+Cranley-Patterson temporal variation
- 2.3° cone around sun direction for soft penumbra
- Gradual shadow factor (0-100%) instead of binary on/off

Performance:
- Toping BLAS removed from TLAS (23M+ tris caused massive ray traversal slowdown)
- Toping BLAS position/index buffer construction skipped entirely
- Shadow rays reduced from 4 to 2 (temporal accumulation compensates)
2026-03-29 19:46:25 +02:00
82307269e8 Phase 7.1 tuning: reduce saturation, increase contrast, multi-angle screenshots
- Saturation 1.40→1.15, exposure 2.2→1.8 (less oversaturated)
- Shadow factor 0.55→0.45 (more contrast between lit and shadow)
- Ambient reduced slightly for better contrast
- Screenshot mode: 4 camera views (landscape, sideview, topdown, backlit)
- AO history reset between view changes (prevents temporal contamination)
2026-03-29 15:11:42 +02:00
55c67686f2 Phase 7.1: stylized lighting — hemisphere ambient, colored shadows, rim light, tone mapping
Wonderbox-inspired lighting overhaul across all 3 pixel shaders:
- Hemisphere ambient (sky blue above, warm brown below) replaces flat ambient
- RT shadows lerp toward blue-violet tint instead of plain darkening (factor 0.55)
- Rim light (fresnel) with warm golden color on silhouettes (30% on vegetation)
- Soft exponential tone mapping + saturation boost in final post-process pass
- CB parameters for all lighting values (skyAmbient, groundAmbient, shadowTint, etc.)
- Fog color/density centralized from CB instead of hardcoded per-shader
- Screenshot mode (CLI "screenshot"): fixed camera, AO convergence, auto-capture
- AO noise stability: world-space hash using voxel center + tangent-axis frac position
- AO distance-weighted falloff: continuous occlusion values instead of binary hit/miss
2026-03-29 15:00:12 +02:00
40560c25ef Phase 6.3: temporal accumulation + IGN noise for RT AO
- Interleaved Gradient Noise replaces world-space hash for ray sampling
- Cranley-Patterson rotation (golden ratio × frameIndex) per frame
- Temporal accumulation: blend 5% current + 95% reprojected history (~20 frames)
- aoHistoryTexture_ persists between frames, copy pre-blur for next frame
- prevViewProjection added to VoxelCB for screen-space reprojection
- Push constants: frameIndex + historyValid for temporal control
- Result: nearly noise-free AO with only 8 rays per pixel
2026-03-29 09:55:08 +02:00
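
The three ingredients of this commit (IGN, per-frame Cranley-Patterson rotation, and the 5%/95% history blend) can be sketched on the CPU as below. The IGN constants are the published ones. The function names and the `frameIndex % 64` wrap (used here to preserve float precision) are assumptions, not taken from the shader.

```cpp
#include <cmath>
#include <cstdint>

inline float fract(float x) { return x - std::floor(x); }

// Interleaved Gradient Noise: per-pixel value in [0, 1).
inline float ign(float px, float py) {
    return fract(52.9829189f * fract(0.06711056f * px + 0.00583715f * py));
}

// Cranley-Patterson rotation: shift the noise by a per-frame offset and wrap.
inline float rotatedNoise(float px, float py, uint32_t frameIndex) {
    const float kGoldenRatio = 1.61803398875f;
    return fract(ign(px, py) + kGoldenRatio * float(frameIndex % 64u));
}

// Exponential history blend: 5% current sample, 95% reprojected history.
inline float accumulateAO(float history, float current, bool historyValid) {
    if (!historyValid) return current;  // no usable history: start over
    return history + (current - history) * 0.05f;
}
```

A 5% blend weight gives an effective averaging window of roughly 1 / 0.05 = 20 frames, which matches the figure quoted in the commit.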
9de53e5293 Phase 6.3: RT ambient occlusion with bilateral blur
- 8 cosine-weighted hemisphere rays per pixel (inline ray queries, SM 6.5)
- Distance-weighted AO: quadratic falloff (1-hitT/aoRadius)² instead of binary hit/miss
- World-space hash seed: voxel coord + tangent-plane frac position (stable, no flicker)
- Bilateral blur pipeline: 2-pass separable (H+V), radius 6, depth+normal edge-stopping
- 4-pass dispatch: shadow+rawAO → blur H → blur V → apply
- AO written to separate R8_UNORM texture, blurred, then applied to color buffer
- Debug mode (F5 x3): grayscale AO visualization
2026-03-29 09:31:19 +02:00
6b41da0932 Phase 6.2: RT shadows — inline ray queries with BLAS/TLAS fix
Add shadow compute shader (voxelShadowCS.hlsl) that traces rays toward
the sun using DXR inline ray queries (RayQuery<>, SM 6.5). Shadows
modulate voxelRT_ in-place via RWTexture2D (no extra render target).

Key fixes to Phase 6.1 BLAS/TLAS infrastructure:
- Sequential index buffer required: Wicked treats IndexCount=0 with
  non-null IndexBuffer as "0 indexed triangles" → empty BLAS
- Memory barriers between BLAS→TLAS→RT: without GPUBarrier::Memory()
  the TLAS build races with BLAS builds, causing zero ray hits
- inverseViewProjection added to VoxelCB for depth reconstruction

F5 toggles shadows OFF→ON→DEBUG (red=hit, green=miss, blue=backface).
2026-03-28 20:01:18 +01:00
7f36bdae38 Phase 6.1: RT infrastructure — MRT normals + BLAS/TLAS build
- Normal render target (R16G16B16A16_SNORM) as MRT SV_TARGET1 in all 3 pixel
  shaders (voxelPS, voxelTopingPS, voxelSmoothPS) for future RT shadow/AO
- BLAS extraction compute shader (voxelBLASExtractCS.hlsl): converts PackedQuad
  StructuredBuffer to float3 position buffer for DXR BLAS input
- Blocky BLAS: single BLAS from all GPU-meshed quads (~1.5M triangles)
- Smooth BLAS: single BLAS from smooth vertex buffer directly
- TLAS: 2 instances (blocky + smooth), identity transforms, CreateBuffer2 with
  callback to avoid UpdateBuffer on RAY_TRACING flagged buffers
- Fix: Wicked always accesses index_buffer in CreateRaytracingAccelerationStructure
  via to_internal() even for non-indexed geometry — provide dummy valid buffer
2026-03-28 14:48:11 +01:00
cd9814e494 Phase 5.2-5.3: CPU perf optimizations + GPU compute Surface Nets
CPU smooth mesher optimizations (560ms → 17ms):
- VoxelData grid cache eliminates redundant readVoxel calls
- Pre-cached 27 neighbor chunk pointers (readVoxelFast)
- smoothNear dilation (8 lookups/cell instead of 56)
- Early exit via containsSmooth flag on chunks
- Thread-local scratch buffers (SmoothScratch ~600KB)
- wi::jobsystem parallelization across all cores
- Persistent staging vectors for upload

TopingSystem optimizations (58ms → 6ms):
- collectInstancesParallel() with per-chunk local vectors
- Neighbor chunk pointer caching

GPU compute Surface Nets (Phase 5.3):
- Two-pass compute shader: centroid grid + emit with smooth normals
- Pass 1 (voxelSmoothCentroidCS): computes centroids + solid flags
  for cells [-1..32], cross-chunk neighbor voxel reading
- Pass 2 (voxelSmoothCS): reads ONLY from centroid grid, computes
  area-weighted smooth normals from 12 incident edges per vertex
- Batched dispatch: all centroid passes then all emit passes with
  single UAV→SRV barrier (instead of 2 barriers per chunk)
- Smooth chunk filtering: only dispatches chunks with containsSmooth
- Centroid grid buffer dynamically sized per smooth chunk count
- 1-frame readback delay with auto-redispatch on first frame
2026-03-27 22:30:43 +01:00
d075a8492c Phase 5.1: smooth normals, triplanar fix, depth bias, hasSmooth tighten
- Smooth vertex normals: area-weighted accumulation of face normals per
  indexed vertex before triangle expansion. Gives Gouraud-smooth shading
  without adding geometry.
- Triplanar fix: PS uses geometric normal (ddx/ddy of worldPos) for
  texture projection weights, smooth normal for lighting only. Prevents
  texture stretching on smoothed surfaces.
- Depth bias: custom rasterizer state (depth_bias=2, slope_scaled=1.0)
  on smooth PSO resolves z-fighting at smooth↔blocky overlap.
- hasSmooth filter tightened: check face-adjacent voxels of each corner
  (1-voxel reach) instead of neighbor cells' corners (2-cell cascade).
  Prevents smooth mesh from extending into underground blocky territory.
2026-03-27 15:08:35 +01:00
c755f20325 Fix smooth↔blocky gap by extending hasSmooth filter to adjacent cells
Cells at the smooth↔blocky boundary had no smooth corners themselves,
so the strict hasSmooth filter skipped them entirely. This prevented
quad emission between the smooth mesh and blocky territory, leaving
a visible gap. Now checks 6-connected neighbor cells for smooth corners,
ensuring boundary vertices exist for connecting quads.
2026-03-27 14:39:54 +01:00
b45d5a1884 Phase 5.1: smooth PS blending uses same logic as blocky PS + debug scene
Rewrote voxelSmoothPS.hlsl to derive a dominant face axis from the smooth
normal, then use the exact same neighbor verification as voxelPS.hlsl:
faceU/faceV tangent tables, stair-priority getNeighborMat(), face-aligned
fractional coords, blendZone 0.25, corner attenuation, bleedMask checks.

Added generateDebugSmooth() with 11 isolated test configurations
(smooth↔blocky transitions, staircases, surrounded patches, reference
blocky pairs). Launch with: BVLEVoxels.exe debugsmooth
2026-03-27 14:21:35 +01:00
aab38bb9b9 Phase 5.1: Naive Surface Nets smooth rendering
Implement CPU-side Naive Surface Nets for smooth voxel surfaces (SmoothStone,
Snow) coexisting with blocky voxels (Grass, Dirt, Stone, Sand).

Key features:
- SmoothMesher with binary SDF, centroid vertex placement, per-axis boundary
  clamping to align with blocky grid at smooth↔blocky transitions
- Cross-chunk connectivity: PAD=2 SDF grid, vertex range [-1, CHUNK_SIZE),
  canonical edge ownership (no duplicate triangles, no z-fighting)
- Face normals oriented by edge axis+sign (robust with binary SDF, unlike
  SDF gradient dot or centroid sampling approaches)
- Y-axis winding fix: sharing cells have different spatial arrangement,
  requiring opposite winding from X and Z axes
- GPU mesher treats smooth neighbors as solid (no blocky faces toward smooth)
- Material blending: primary (smooth-only) + secondary (all counts) per vertex
- Dedicated shaders: voxelSmoothVS (vertex pulling t6) + voxelSmoothPS
  (triplanar + lerp blending between two materials)
- Separate render pass with LoadOp::LOAD after voxels+topings
- New materials: SmoothStone (mat 6), blocky Stone (mat 3) and Dirt patches
  added to world generation for boundary testing
2026-03-27 13:03:55 +01:00
72af8af979 Tweak grass blade color 2026-03-26 20:00:33 +01:00
36b8de9285 Phase 4.2: match grass blade colors to voxel faces + stronger translucency
Use same ambient (0.15, 0.18, 0.25) as voxel PS instead of greener
tint. Increase translucency (0.6) to reduce contrast when orbiting
around grass. Wrap at 0.85 for balanced lit-side brightness.
2026-03-26 19:07:04 +01:00
9086a794a8 Add wind to grass toping 2026-03-26 18:58:19 +01:00
ef89bd8c49 Phase 4.2: grass blade tufts, stone corner fills/caps, vegetation shading
Stone: add corner fill triangles at adjacent open edges and cap
triangles at strip terminations. Grass: replace bevel strips with
tuft-based grass blades — clusters of 3-9 curved double-sided
blades with per-tuft height/lean personality and hash-driven
placement (quadratic inset 0-0.30 from edge). Vegetation PS uses
half-Lambert wrap lighting + translucency for soft stylized shading
(inspired by Airborn Trees). Stone keeps classic Lambert.
2026-03-26 18:48:35 +01:00
bc29a02c35 Phase 4.2: GPU toping rendering pipeline + winding/lighting fixes
Add instanced rendering for toping bevels: dedicated shaders
(voxelTopingVS/PS), PSO, GPU buffers (t4 vertices, t5 instances),
per-group DrawInstanced in a separate render pass with LoadOp::LOAD.
Fix inverted face winding (emitTri auto-winding condition flipped for
CW front-facing), slope normals (use inward direction not outward),
and PS lighting (negate sunDirection like voxelPS). Update CLAUDE.md
with Phase 4.1/4.2 documentation.
2026-03-26 17:47:08 +01:00
9e777d653b Phase 4.1: TopingSystem infrastructure + procedural mesh generation
- TopingSystem with TopingDef registry, procedural mesh gen, instance collection
- 2 toping types: stone bevel (h=0.06, smooth) + grass edge (h=0.12, bumpy)
- 16 mesh variants per type indexed by 4-bit adjacency bitmask (~6 unique with symmetry)
- Wedge cross-section: outer wall + sloped top, grass has sinusoidal height profile
- Instance collection scans exposed +Y faces, checks same-material neighbors
- Cross-chunk adjacency via VoxelWorld::getVoxel()
- Integrated into VoxelRenderPath: init at Start(), stats in HUD
- ~191K instances, 1920 mesh vertices for 170 chunks (validated)
- Research doc (research_connected_meshes.md) + plan (plan_phase4.md)
2026-03-26 15:27:15 +01:00
f166394b60 Phase 3: per-material bleed flags + patch-based terrain for blend testing
- Add bleedMask/resistBleedMask bitmasks to CB for per-material blend control
  - Grass: canBleed + resistsBleed (bleeds onto others, nothing bleeds onto it)
  - Stone: no bleed (doesn't overflow, but accepts bleed from others)
  - Other materials: normal bidirectional blending
- PS checks flags before blending: mainResists → skip, !neighCanBleed → skip
- Flatten terrain (heightScale 64→20) for better surface visibility
- Replace altitude-based material bands with noise-based 2D patches
  (3 noise channels create organic patches of all 5 materials on surface)
- Make stone/sand more visually distinct (stone=blue-gray, sand=warm yellow)
- Lower stone heightContrast (1.2→0.5) so neighbors bleed onto it more
2026-03-26 12:47:10 +01:00
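
The flag check described above reduces to two bit tests. In this sketch the material IDs other than Stone (3) are hypothetical, and the mask values are built from the rules listed in the commit (grass bleeds and resists, stone does not bleed, the rest blend normally).

```cpp
#include <cassert>
#include <cstdint>

// Material IDs: only Stone = 3 appears in the log; the others are hypothetical.
enum Material : uint32_t { Grass = 1, Dirt = 2, Stone = 3, Sand = 4 };

constexpr uint32_t kBleedMask       = (1u << Grass) | (1u << Dirt) | (1u << Sand);
constexpr uint32_t kResistBleedMask = (1u << Grass);

// True when neighMat is allowed to bleed onto a pixel whose main material
// is mainMat.
inline bool allowBlend(uint32_t bleedMask, uint32_t resistBleedMask,
                       uint32_t mainMat, uint32_t neighMat) {
    assert(mainMat < 32u && neighMat < 32u);
    bool mainResists   = ((resistBleedMask >> mainMat) & 1u) != 0u;
    bool neighCanBleed = ((bleedMask >> neighMat) & 1u) != 0u;
    return !mainResists && neighCanBleed;  // mainResists -> skip, !neighCanBleed -> skip
}
```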
d7e69f97ca Phase 3: PS-based texture blending with winner-takes-all heightmap
Replace pre-encoded quad blend data (v1) with per-pixel voxel data
lookups in the pixel shader. The PS reads voxelDataBuffer (SRV t3)
to find neighbor materials dynamically, enabling 2 independent blend
axes, stair-priority neighbor detection, and winner-takes-all
heightmap-driven transitions.

Key design decisions validated through 6 iterations (see
blending_experiments.md):
- Winner-takes-all: material with highest heightmap score wins 100%
  (sharp but organic transitions, not smooth gradient)
- Symmetric bias: bias = 0.5 - weight ensures equal chance at border
- Subtractive corner attenuation (param=0.80): xAdj = xEdge -
  saturate(yEdge - 0.80) reduces blend at corners naturally
- Blend zone = 0.25 voxels from each edge (50% of face)
- Debug mode (F4) visualizes blend zones as colors
2026-03-26 12:14:08 +01:00
21f1bd1a12 Phase 2.5: GPU meshing production pipeline + perf optimizations (80+ FPS)
Replace CPU greedy mesher with GPU compute mesher as default rendering pipeline.
Key optimizations identified via CPU profiling (ProfileAccum, 5s averages):
- Fused regenerate+pack: parallel noise gen + memcpy in same jobsystem pass (6ms → 0ms)
- VoxelData memcpy: sizeof(VoxelData)==2 enables direct memcpy instead of bit-shift loop (28ms → <1ms)
- Dirty-skip: GPU dispatch/upload only when chunks change, not every frame
- Animation: 2 fBm octaves + no caves in animation mode (54ms → 8ms)
- Result: 80-110 FPS with 60Hz terrain animation, 700+ FPS static
2026-03-26 09:05:52 +01:00
9a8f80de51 Phase 2.4: GPU compute mesher benchmark (CPU greedy vs GPU baseline)
One-shot benchmark runs automatically after world generation:
- CPU greedy mesher: 277ms, 358K quads (binary greedy merge)
- GPU baseline (1x1): 5.3ms, 2.43M quads (no merge, 52x faster)
- Greedy merge reduces quad count by 6.8x

Implementation:
- State machine: DISPATCH (upload voxels + dispatch) → READBACK → DONE
- GPU timestamps for accurate timing
- Readback buffer for quad counter
- Each chunk's voxel data uploaded and dispatched sequentially
2026-03-25 22:51:22 +01:00
1bfadc2f7c Phase 2.3: GPU compute culling with frustum + backface cull
Compute shader fills indirect args buffer, replacing CPU cull loop.
Single DrawInstancedIndirectCount renders all visible face groups.

Key fixes:
- Compute shader: pack chunkIndex|(faceIndex<<16) in push constant,
  startVertexLocation=0 (aligned with Phase 2.2 SV_VertexID fix)
- PushConstants must be called AFTER BindPipelineState, not before.
  Wicked Engine dispatches to SetGraphicsRoot32BitConstants only when
  active_pso is set; after BindComputeShader it targets compute instead.
- Barriers: UNDEFINED(COMMON)→UAV before compute, UAV→INDIRECT_ARGUMENT after
- Buffer decay: DX12 buffers always return to COMMON between frames,
  no cross-frame state tracking needed
2026-03-25 22:30:50 +01:00
45af49a659 Phase 2.2: MDI rendering with CPU-filled indirect args
Replace per-chunk DrawInstanced loop with a single DrawInstancedIndirectCount.
CPU fills indirect args buffer with same frustum+backface cull logic as Phase 2.1.

Key discoveries:
- Wicked Engine command signature includes push constant (20-byte stride, not 16)
- SV_VertexID does not reliably include startVertexLocation with ExecuteIndirect
- Solution: pack chunkIndex|(faceIndex<<16) in push constant, VS reconstructs
  quad offset from GPUChunkInfo lookup
- No explicit DX12 barriers needed (implicit promotion from COMMON suffices)

Also adds voxel_engine_spec.md and updates references from .docx to .md.
2026-03-25 22:07:22 +01:00
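
The push-constant packing `chunkIndex|(faceIndex<<16)` can be sketched as a pair of helpers. The function names are hypothetical, and the bounds in the assert (16-bit chunk index, 6 face directions) are assumptions implied by the bit layout and the 6-draws-per-chunk figure.

```cpp
#include <cassert>
#include <cstdint>

// Packs a chunk index (low 16 bits) and a face index (high 16 bits)
// into a single 32-bit push constant.
inline uint32_t packDrawId(uint32_t chunkIndex, uint32_t faceIndex) {
    assert(chunkIndex <= 0xFFFFu && faceIndex < 6u);
    return chunkIndex | (faceIndex << 16);
}

// Inverse operations, mirroring what the vertex shader would do before
// looking up the quad offset for that chunk and face group.
inline uint32_t chunkIndexOf(uint32_t packed) { return packed & 0xFFFFu; }
inline uint32_t faceIndexOf(uint32_t packed)  { return packed >> 16; }
```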
abc640c2d0 cleanup 2026-03-25 19:38:50 +01:00
46e8f50f37 Phase 2 complete: per-face-group backface culling, frustum planes, GPU cull infrastructure
- VS supports dual mode: CPU path (push constants) and MDI path (binary search)
- CPU render loop now does per-face-group draws with backface culling (6 draws/chunk max)
- Frustum planes extracted and populated in constant buffer for GPU cull shader
- GPU cull + MDI path fully implemented but disabled (barrier/state debugging needed)
- GPU timestamp query infrastructure with readback for cull/draw timing
- HUD shows rendering mode (GPU cull vs CPU fallback)
2026-03-25 14:50:55 +01:00
5f346bb14a Phase 2: GPU-driven voxel rendering pipeline
Mega-buffer architecture replacing per-chunk GPU buffers:
- Single StructuredBuffer<PackedQuad> for all chunks (2M quads, 16 MB)
- StructuredBuffer<GPUChunkInfo> with per-chunk metadata (position, quad offsets, face groups)
- VS reads chunk info via push constants (b999) for driver-safe chunk indexing
- CPU frustum culling with wi::primitive::Frustum + AABB per chunk
- Quads sorted by face direction in greedy mesher (faceOffsets/faceCounts)
- GPU frustum + backface cull compute shader (voxelCullCS.hlsl)
- GPU binary mesher compute shader baseline (voxelMeshCS.hlsl)
- Indirect draw buffers and timestamp query infrastructure
- README with build instructions and project architecture
2026-03-25 14:24:05 +01:00