Chessistics/autonomous_plan.md

Headless Linux dev container: Godot + .NET + Xvfb for autonomous testing

Claude Code running inside the project's dev container can now build the game, launch a real Godot instance under Xvfb, and drive the automation harness end-to-end, with no Windows dependency.

Dockerfile adds (as root, before USER node):
- X11 / Mesa software GL / audio runtime deps + python3
- .NET SDK 9.0 via upstream dot.net install script -> /usr/local/dotnet
- Godot 4.6.2-stable mono Linux x86_64 -> /opt/godot/godot
- /usr/local/bin/godot-xvfb wrapper: auto-wraps invocations in xvfb-run -a --server-args="-screen 0 1280x720x24 ..."

harness.py picks GODOT_BIN from env, defaults to /opt/godot/godot on Linux, and auto-wraps the subprocess in xvfb-run when DISPLAY is unset. Windows code path unchanged.

init-firewall.sh adds api.nuget.org to the allowlist so dotnet restore works post-boot. Godot + .NET SDK are fetched at image build time, before the firewall exists.

New docs:
- autonomous_plan.md: design rationale, alternatives considered
- README.md: launch instructions for Windows terminal / Docker Desktop / VS Code Dev Containers / native WSL2
- CLAUDE.md already documents the harness (done in previous commit)

Validation: docker build succeeds; inside the container, dotnet --version=9.0.313, godot --version=4.6.2.stable.mono, dotnet test=102/102, python3 tools/automation/smoke.py passes end-to-end with 14 non-black 1280x720 PNGs. Mission 1 screenshot is visually identical to the Windows build, and Xvfb determinism is a bonus (det_a.png ≡ det_b.png bytewise).
2026-04-17 16:57:56 +02:00
# Headless Linux dev container for autonomous Chessistics testing
## Why
Today the automation harness only runs on Windows because the Godot binary is
hardcoded to `C:\Apps\godot\Godot_v4.6.2-stable_mono_win64_console.exe`. When
Claude Code runs inside the project's dev container (Linux / `node:20`), it
can read source code and run `dotnet test`, but it **cannot launch the actual
game** — there's no Godot binary, no .NET SDK, and no display server for the
renderer.
The goal: make the dev container a self-contained environment where Claude
Code can build the project, launch a real Godot instance in headless Linux
mode, drive it via the automation harness, and read back 1280×720 PNG
screenshots — all without any Windows dependency.
## Design
### Pieces required
1. **Godot 4.6.2-stable Mono for Linux** — matches the Windows editor the
project already uses. Installed once at image build time to `/opt/godot/`
with a symlink `/opt/godot/godot`.
2. **.NET SDK 9.0** — the project targets `net9.0`. Installed via the
upstream `dot.net` install script to `/usr/local/dotnet/`.
3. **Xvfb + Mesa software GL** — a virtual framebuffer at `:99` so Godot's
GL-compatibility renderer has somewhere to draw. `xvfb-run` wraps any
command transparently.
4. **Python 3** — the automation harness is stdlib-only Python.
5. **Minimal X / audio runtime deps**: `libx11`, `libxcursor`, `libxrandr`,
`libxi`, `libgl1`, `libgles2`, `libasound2`, `libxkbcommon0`, etc.
Without these, Godot exits on startup with `libXext not found`-style errors.
### How Godot reaches the framebuffer
Two options considered:
- **(A)** Run `Xvfb :99 -screen 0 1280x720x24 &` as a background process,
export `DISPLAY=:99`, launch Godot normally. Persistent display, shared by
many Godot runs.
- **(B)** Use `xvfb-run -a --server-args="-screen 0 1280x720x24"` as a prefix
on every Godot invocation. A fresh display per launch; cleans up
automatically on exit.
**Chosen: (B)**, because the automation harness already spawns Godot once per
`Harness.launch()` and cleans up on context exit. A fresh display per launch
matches that lifecycle naturally: no daemon to keep alive, no race on the
display number.
A tiny wrapper `/usr/local/bin/godot-xvfb` wraps `xvfb-run … $GODOT_BIN
"$@"`, so the harness (or a human) only has to invoke one path.
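The per-launch lifecycle that motivates option (B) can be sketched as a context manager. This is a minimal illustration, not the actual `Harness.launch()` code: `launched` is a hypothetical name, and a sleeping Python child stands in for the `godot-xvfb` invocation so the example runs anywhere.

```python
import contextlib
import subprocess
import sys


@contextlib.contextmanager
def launched(cmd, shutdown_timeout=5.0):
    """Spawn cmd, yield the process, and make sure it is gone on exit."""
    proc = subprocess.Popen(cmd)
    try:
        yield proc
    finally:
        if proc.poll() is None:  # still running: ask politely first
            proc.terminate()
            try:
                proc.wait(timeout=shutdown_timeout)
            except subprocess.TimeoutExpired:
                proc.kill()  # escalate if the child ignores the request
                proc.wait()


if __name__ == "__main__":
    # Stand-in for something like ["godot-xvfb", "--path", "."].
    stand_in = [sys.executable, "-c", "import time; time.sleep(60)"]
    with launched(stand_in) as proc:
        print("running:", proc.poll() is None)
    print("reaped:", proc.returncode is not None)
```

Because one process is spawned and reaped per `with` block, a display created by the wrapper lives exactly as long as that block. How termination signals propagate through `xvfb-run` to Godot and Xvfb is not covered by this sketch and would need checking against the real harness.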
### Integration with the existing harness
`tools/automation/harness.py` currently hardcodes the Windows Godot path. We
teach it two things:
1. Read `GODOT_BIN` from the environment first; fall back to the platform
default (Windows path on Windows, `/opt/godot/godot` on Linux).
2. On Linux, auto-prepend `["xvfb-run", "-a", "--server-args=-screen 0
1280x720x24"]` to the Godot launch command unless `DISPLAY` is already
set (someone has a real display, skip the wrap).
With those two tweaks, every existing Python helper (`smoke.py`,
`run_game.py`, `solve_*.py`) works unchanged inside the container.
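The two tweaks can be sketched as a pair of small functions. This is an illustration under assumptions, not the code in `harness.py`: the names `resolve_godot_bin` and `build_launch_cmd` are hypothetical, and an empty `DISPLAY` is treated the same as an unset one.

```python
import os
import sys

# Defaults taken from this plan; adjust if the install paths differ.
WINDOWS_GODOT = r"C:\Apps\godot\Godot_v4.6.2-stable_mono_win64_console.exe"
LINUX_GODOT = "/opt/godot/godot"
XVFB_PREFIX = ["xvfb-run", "-a", "--server-args=-screen 0 1280x720x24"]


def resolve_godot_bin(env, platform):
    """GODOT_BIN wins; otherwise fall back to the platform default."""
    override = env.get("GODOT_BIN")
    if override:
        return override
    return WINDOWS_GODOT if platform.startswith("win") else LINUX_GODOT


def build_launch_cmd(godot_args, env=None, platform=None):
    """Return the argv list to spawn, wrapped in xvfb-run when needed."""
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform
    cmd = [resolve_godot_bin(env, platform), *godot_args]
    # Wrap only on Linux, and only when no display is already available.
    if platform.startswith("linux") and not env.get("DISPLAY"):
        cmd = XVFB_PREFIX + cmd
    return cmd


if __name__ == "__main__":
    print(build_launch_cmd(["--path", "."], env={}, platform="linux"))
    print(build_launch_cmd(["--path", "."], env={"DISPLAY": ":0"},
                           platform="linux"))
```

Passing `env` and `platform` explicitly keeps the decision logic testable on any machine: the first call yields the `xvfb-run` prefix followed by `/opt/godot/godot --path .`, while the second returns the bare Godot command because a display is already set.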
### Firewall considerations
The container's `init-firewall.sh` runs at `postStartCommand`, **after** the
image is built, and drops all outbound traffic except to a small allowlist
(GitHub, npmjs, Anthropic, Sentry, Statsig). Impact on our pieces:
- **Godot binary + .NET SDK**: downloaded during `docker build`, which runs
_before_ the firewall exists → works unconditionally.
- **`dotnet restore`** (runtime, e.g. after a `git pull`): needs
`api.nuget.org`. Added to the allowlist.
- **Godot runtime**: no outbound traffic required — the engine runs fully
offline once installed.
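One way to confirm the allowlist from inside the running container is a plain TCP probe. The sketch below is an assumption about how such a check might look, not part of `init-firewall.sh`; `can_reach` is a hypothetical helper.

```python
import socket


def can_reach(host, port=443, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts alike.
        return False
```

After the firewall is up, `can_reach("api.nuget.org")` would be expected to return `True` if the new allowlist entry took effect, while a host that is not on the allowlist should return `False`. A firewall that silently drops packets shows up as a timeout rather than an immediate refusal, so each blocked probe can take up to `timeout` seconds.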
### Build sequence inside the Dockerfile
As `root`, before the existing `USER node` switch:
```
# 1. X/GL/audio runtime + python + xvfb
apt-get update && apt-get install -y xvfb xauth libx11-6 libxcursor1 libxinerama1 libxrandr2 \
libxi6 libxext6 libgl1 libglx-mesa0 libgl1-mesa-dri libglu1-mesa \
libasound2 libxkbcommon0 libxkbcommon-x11-0 libfontconfig1 libdbus-1-3 \
python3 python3-pip
# 2. .NET SDK 9.0 via upstream install script
curl -sSL https://dot.net/v1/dotnet-install.sh | bash -s -- \
--channel 9.0 --install-dir /usr/local/dotnet
ln -s /usr/local/dotnet/dotnet /usr/local/bin/dotnet
# 3. Godot 4.6.2-stable mono, Linux x86_64
wget https://github.com/godotengine/godot/releases/download/${VERSION}/\
Godot_v${VERSION}_mono_linux_x86_64.zip
unzip … -d /opt/godot
ln -s /opt/godot/Godot_v…/Godot_v…_mono_linux.x86_64 /opt/godot/godot
```
Then:
- `ENV GODOT_BIN=/opt/godot/godot`
- `ENV PATH=$PATH:/opt/godot:/usr/local/dotnet`
- Drop in `/usr/local/bin/godot-xvfb` wrapper
## File-by-file change list
| File | Change |
|------|--------|
| `.devcontainer/Dockerfile` | Add Godot / dotnet / xvfb installs before `USER node` |
| `.devcontainer/init-firewall.sh` | Append `api.nuget.org` to the domain allowlist |
| `.devcontainer/godot-xvfb.sh` *(new)* | `exec xvfb-run -a … "$GODOT_BIN" "$@"` |
| `tools/automation/harness.py` | Env-aware Godot path + Linux xvfb auto-wrap |
| `README.md` *(new)* | Windows / WSL2 launch instructions for the dev container |

Nothing inside `Scripts/` or `chessistics-engine/` changes. The harness
contract (inbox/outbox/screens) is platform-agnostic already.
## Verification
After rebuild:
1. `docker build .devcontainer -t chessistics-dev` succeeds.
2. `devcontainer up --workspace-folder .` starts the container and
post-start firewall passes.
3. Inside: `dotnet --version` → `9.0.x`, `godot --version` →
`4.6.2.stable.mono.official.*`.
4. `dotnet build Chessistics.csproj` → green.
5. `dotnet test chessistics-tests/` → 102 / 102.
6. `python3 tools/automation/smoke.py` → loads mission 1, takes PNG
screenshots that are non-black, quits cleanly.
7. `Read` one of the PNGs — it should show the same mission 1 UI as the
Windows run (title bar, board, objectives, stock panel).
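The "non-black 1280×720" check in step 6 can be automated with the standard library alone, in keeping with the stdlib-only harness. The sketch below is an assumption about how such a check might be written, not what `smoke.py` does: `inspect_png` and `solid_png` are hypothetical names, only 8-bit non-interlaced RGB/RGBA files are handled, and chunk CRCs are not verified.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def _chunks(data):
    """Yield (type, payload) for each chunk, stopping at IEND."""
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        yield ctype, data[pos + 8:pos + 8 + length]
        if ctype == b"IEND":
            return
        pos += 12 + length  # 4 length + 4 type + payload + 4 CRC


def _paeth(a, b, c):
    """Paeth predictor from the PNG specification."""
    p = a + b - c
    pa, pb, pc = abs(p - a), abs(p - b), abs(p - c)
    if pa <= pb and pa <= pc:
        return a
    return b if pb <= pc else c


def inspect_png(data):
    """Return (width, height, has_non_black_pixel)."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    idat = bytearray()
    for ctype, payload in _chunks(data):
        if ctype == b"IHDR":
            width, height, depth, color, _, _, interlace = struct.unpack(
                ">IIBBBBB", payload)
            if depth != 8 or color not in (2, 6) or interlace:
                raise ValueError("only 8-bit non-interlaced RGB/RGBA handled")
            bpp = 3 if color == 2 else 4  # bytes per pixel
        elif ctype == b"IDAT":
            idat += payload
    raw = zlib.decompress(bytes(idat))
    stride = width * bpp
    prev = bytearray(stride)
    for y in range(height):
        start = y * (stride + 1)  # each row is prefixed by a filter byte
        ftype = raw[start]
        line = bytearray(raw[start + 1:start + 1 + stride])
        if ftype:  # filter 0 stores the bytes as-is
            for i in range(stride):
                a = line[i - bpp] if i >= bpp else 0
                b = prev[i]
                c = prev[i - bpp] if i >= bpp else 0
                pred = (a, b, (a + b) // 2, _paeth(a, b, c))[ftype - 1]
                line[i] = (line[i] + pred) & 0xFF
        # Ignore the alpha byte: opaque black is still black.
        if any(v for i, v in enumerate(line) if i % bpp < 3):
            return width, height, True
        prev = line
    return width, height, False


def solid_png(width, height, rgb):
    """Build a solid-colour RGB PNG, only to exercise inspect_png."""
    def chunk(ctype, payload):
        crc = zlib.crc32(ctype + payload)
        return (struct.pack(">I", len(payload)) + ctype + payload
                + struct.pack(">I", crc))
    rows = (b"\x00" + bytes(rgb) * width) * height
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    return (PNG_SIGNATURE + chunk(b"IHDR", ihdr)
            + chunk(b"IDAT", zlib.compress(rows)) + chunk(b"IEND", b""))


if __name__ == "__main__":
    print(inspect_png(solid_png(4, 3, (0, 0, 0))))   # (4, 3, False)
    print(inspect_png(solid_png(4, 3, (0, 90, 0))))  # (4, 3, True)
```

For a real screenshot, the bytes would come from reading the file the harness wrote, and the caller would compare the result against `(1280, 720, True)`. The function returns at the first non-black row, so typical screenshots finish quickly; an all-black frame is the slow path because every row has to be decoded.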
## Out of scope (explicit non-goals)
- **GPU acceleration**: we use Mesa software rendering. Xvfb + llvmpipe is
enough for 1280×720 at a few FPS, which is what the harness needs.
- **Real display forwarding** (X11 forwarding, VNC, noVNC): doable but
unnecessary — Claude reads PNGs, not a live video feed.
- **Multi-arch images**: we ship x86_64 only. ARM (Apple Silicon via
Docker Desktop emulation) would need `Godot_v…_mono_linux_arm64.zip`;
that is straightforward to add if needed, but not done here.
- **Shrinking the image**: Godot + .NET SDK adds ~500 MB. Worth it;
multi-stage builds could trim later.
- **Keeping Xvfb warm across launches**: the single-launch pattern is clean
enough. If someone ever scripts dozens of rapid Godot starts and the
xvfb-run startup cost shows up, revisit approach (A).