Chessistics/autonomous_plan.md
Samuel Bouchet c451a50349 Headless Linux dev container: Godot + .NET + Xvfb for autonomous testing
Claude Code running inside the project's dev container can now build the
game, launch a real Godot instance under Xvfb, and drive the automation
harness end-to-end — no Windows dependency.

Dockerfile adds (as root, before USER node):
- X11 / Mesa software GL / audio runtime deps + python3
- .NET SDK 9.0 via upstream dot.net install script -> /usr/local/dotnet
- Godot 4.6.2-stable mono Linux x86_64 -> /opt/godot/godot
- /usr/local/bin/godot-xvfb wrapper: auto-wraps invocations in
  xvfb-run -a --server-args="-screen 0 1280x720x24 ..."

harness.py picks GODOT_BIN from env, defaults to /opt/godot/godot on
Linux, and auto-wraps the subprocess in xvfb-run when DISPLAY is unset.
Windows code path unchanged.

init-firewall.sh adds api.nuget.org to the allowlist so dotnet restore
works post-boot. Godot + .NET SDK are fetched at image build time, before
the firewall exists.

New docs:
- autonomous_plan.md: design rationale, alternatives considered
- README.md: launch instructions for Windows terminal / Docker Desktop /
  VS Code Dev Containers / native WSL2
- CLAUDE.md already documents the harness (done in previous commit)

Validation: docker build succeeds; inside the container, dotnet --version
=9.0.313, godot --version=4.6.2.stable.mono, dotnet test=102/102,
python3 tools/automation/smoke.py passes end-to-end with 14 non-black
1280x720 PNGs. Mission 1 screenshot is visually identical to the Windows
build, and Xvfb determinism is a bonus (det_a.png ≡ det_b.png bytewise).
2026-04-17 16:57:56 +02:00


# Headless Linux dev container for autonomous Chessistics testing
## Why
Today the automation harness only runs on Windows because the Godot binary is
hardcoded to `C:\Apps\godot\Godot_v4.6.2-stable_mono_win64_console.exe`. When
Claude Code runs inside the project's dev container (Linux / `node:20`), it
can read source code and run `dotnet test`, but it **cannot launch the actual
game** — there's no Godot binary, no .NET SDK, and no display server for the
renderer.
The goal: make the dev container a self-contained environment where Claude
Code can build the project, launch a real Godot instance in headless Linux
mode, drive it via the automation harness, and read back 1280×720 PNG
screenshots — all without any Windows dependency.
## Design
### Pieces required
1. **Godot 4.6.2-stable Mono for Linux** — matches the Windows editor the
project already uses. Installed once at image build time to `/opt/godot/`
with a symlink `/opt/godot/godot`.
2. **.NET SDK 9.0** — the project targets `net9.0`. Installed via the
upstream `dot.net` install script to `/usr/local/dotnet/`.
3. **Xvfb + Mesa software GL** — a virtual framebuffer at `:99` so Godot's
GL-compatibility renderer has somewhere to draw. `xvfb-run` wraps any
command transparently.
4. **Python 3** — the automation harness is stdlib-only Python.
5. **Minimal X / audio runtime deps**: `libx11`, `libxcursor`, `libxrandr`,
`libxi`, `libgl1`, `libgles2`, `libasound2`, `libxkbcommon0`, etc.
Without these, Godot exits on startup with `libXext not found`-style errors.
### How Godot reaches the framebuffer
Two options considered:
- **(A)** Run `Xvfb :99 -screen 0 1280x720x24 &` as a background process,
export `DISPLAY=:99`, launch Godot normally. Persistent display, shared by
many Godot runs.
- **(B)** Use `xvfb-run -a --server-args="-screen 0 1280x720x24"` as a prefix
on every Godot invocation. A fresh display per launch; cleans up
automatically on exit.
**Chosen: (B)**, because the automation harness already spawns Godot once per
`Harness.launch()` and cleans up on context exit. A per-launch display matches
that lifecycle naturally: no daemon to keep alive, no race on the display number.
A tiny wrapper `/usr/local/bin/godot-xvfb` wraps `xvfb-run … $GODOT_BIN
"$@"`, so the harness (or a human) only has to invoke one path.
### Integration with the existing harness
`tools/automation/harness.py` currently hardcodes the Windows Godot path. We
teach it two things:
1. Read `GODOT_BIN` from the environment first; fall back to the platform
default (Windows path on Windows, `/opt/godot/godot` on Linux).
2. On Linux, auto-prepend `["xvfb-run", "-a", "--server-args=-screen 0
1280x720x24"]` to the Godot launch command unless `DISPLAY` is already
set (someone has a real display, skip the wrap).
With those two tweaks, every existing Python helper (`smoke.py`,
`run_game.py`, `solve_*.py`) works unchanged inside the container.
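
The two tweaks above can be sketched as follows. This is a minimal illustration, not the actual contents of `harness.py`: the function names `resolve_godot_bin` and `build_launch_command`, and the optional `env` / `platform` parameters (added only to make the logic easy to exercise), are assumptions.

```python
import os
import sys

# Paths taken from this document; the constant names are hypothetical.
WINDOWS_GODOT = r"C:\Apps\godot\Godot_v4.6.2-stable_mono_win64_console.exe"
LINUX_GODOT = "/opt/godot/godot"
XVFB_PREFIX = ["xvfb-run", "-a", "--server-args=-screen 0 1280x720x24"]


def resolve_godot_bin(env=None, platform=None):
    """Prefer GODOT_BIN from the environment, else the platform default."""
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform
    override = env.get("GODOT_BIN")
    if override:
        return override
    return WINDOWS_GODOT if platform.startswith("win") else LINUX_GODOT


def build_launch_command(godot_args, env=None, platform=None):
    """Return the argv list for Godot, wrapped in xvfb-run when needed."""
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform
    command = [resolve_godot_bin(env, platform), *godot_args]
    # Only wrap on Linux, and only when no display is available already.
    if platform.startswith("linux") and not env.get("DISPLAY"):
        command = XVFB_PREFIX + command
    return command
```

A caller would pass the result straight to `subprocess.Popen`, for example `subprocess.Popen(build_launch_command(["--path", "."]))`. Keeping the command as a list avoids shell quoting issues with the `--server-args` value, which contains spaces.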
### Firewall considerations
The container's `init-firewall.sh` runs at `postStartCommand`, **after** the
image is built, and drops all outbound traffic except to a small allowlist
(GitHub, npmjs, Anthropic, Sentry, Statsig). Impact on our pieces:
- **Godot binary + .NET SDK**: downloaded during `docker build`, which runs
_before_ the firewall exists → works unconditionally.
- **`dotnet restore`** (runtime, e.g. after a `git pull`): needs
`api.nuget.org`. Added to the allowlist.
- **Godot runtime**: no outbound traffic required — the engine runs fully
offline once installed.
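
To confirm from inside the running container that the allowlist behaves as described, a small probe along these lines could help. It is a sketch under assumptions: `probe_hosts` is a hypothetical helper, not part of the repository, and it treats a successful TCP connection on port 443 as "reachable".

```python
import socket


def probe_hosts(hosts, port=443, timeout=3.0, connect=socket.create_connection):
    """Map each host to True if a TCP connection succeeds, else False.

    `connect` is injectable so the logic can be exercised without a network.
    """
    results = {}
    for host in hosts:
        try:
            conn = connect((host, port), timeout=timeout)
        except OSError:
            # Covers DNS failures, refused connections and timeouts.
            results[host] = False
        else:
            conn.close()
            results[host] = True
    return results
```

Running `probe_hosts(["api.nuget.org", "example.com"])` after the firewall is up should report the first host as reachable and the second as blocked, assuming `example.com` is not on the allowlist.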
### Build sequence inside the Dockerfile
As `root`, before the existing `USER node` switch:
```
# 1. X/GL/audio runtime + python + xvfb
apt-get update && apt-get install -y xvfb xauth libx11-6 libxcursor1 libxinerama1 libxrandr2 \
libxi6 libxext6 libgl1 libglx-mesa0 libgl1-mesa-dri libglu1-mesa \
libasound2 libxkbcommon0 libxkbcommon-x11-0 libfontconfig1 libdbus-1-3 \
python3 python3-pip
# 2. .NET SDK 9.0 via upstream install script
curl -sSL https://dot.net/v1/dotnet-install.sh | bash -s -- \
--channel 9.0 --install-dir /usr/local/dotnet
ln -s /usr/local/dotnet/dotnet /usr/local/bin/dotnet
# 3. Godot 4.6.2-stable mono, Linux x86_64
wget https://github.com/godotengine/godot/releases/download/${VERSION}/\
Godot_v${VERSION}_mono_linux_x86_64.zip
unzip … -d /opt/godot
ln -s /opt/godot/Godot_v…/Godot_v…_mono_linux.x86_64 /opt/godot/godot
```
Then:
- `ENV GODOT_BIN=/opt/godot/godot`
- `ENV PATH=$PATH:/opt/godot:/usr/local/dotnet`
- Drop in `/usr/local/bin/godot-xvfb` wrapper
## File-by-file change list
| File | Change |
|------|--------|
| `.devcontainer/Dockerfile` | Add Godot / dotnet / xvfb installs before `USER node` |
| `.devcontainer/init-firewall.sh` | Append `api.nuget.org` to the domain allowlist |
| `.devcontainer/godot-xvfb.sh` *(new)* | `exec xvfb-run -a … "$GODOT_BIN" "$@"` |
| `tools/automation/harness.py` | Env-aware Godot path + Linux xvfb auto-wrap |
| `README.md` *(new)* | Windows / WSL2 launch instructions for the dev container |
Nothing inside `Scripts/` or `chessistics-engine/` changes. The harness
contract (inbox/outbox/screens) is platform-agnostic already.
## Verification
After rebuild:
1. `docker build .devcontainer -t chessistics-dev` succeeds.
2. `devcontainer up --workspace-folder .` starts the container and the
post-start firewall script succeeds.
3. Inside: `dotnet --version` → `9.0.x`, `godot --version` →
`4.6.2.stable.mono.official.*`.
4. `dotnet build Chessistics.csproj` → green.
5. `dotnet test chessistics-tests/` → 102 / 102.
6. `python3 tools/automation/smoke.py` → loads mission 1, takes PNG
screenshots that are non-black, quits cleanly.
7. `Read` one of the PNGs — it should show the same mission 1 UI as the
Windows run (title bar, board, objectives, stock panel).
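
The "non-black" check in step 6 can be automated with the standard library alone, in keeping with the stdlib-only harness. The sketch below is an illustration rather than the code `smoke.py` actually uses: `decode_rows` and `is_non_black` are hypothetical names, and it assumes the screenshots are non-interlaced 8-bit RGB or RGBA PNGs.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def _paeth(a, b, c):
    """Paeth predictor as defined by the PNG specification."""
    p = a + b - c
    pa, pb, pc = abs(p - a), abs(p - b), abs(p - c)
    if pa <= pb and pa <= pc:
        return a
    return b if pb <= pc else c


def decode_rows(data):
    """Return (width, height, bytes_per_pixel, rows) for an 8-bit RGB/RGBA PNG."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos, header, idat = 8, None, bytearray()
    while pos < len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        pos += 12 + length  # length field + type + body + CRC
        if ctype == b"IHDR":
            header = struct.unpack(">IIBBBBB", body)
        elif ctype == b"IDAT":
            idat += body
    if header is None:
        raise ValueError("missing IHDR chunk")
    width, height, depth, color, _, _, interlace = header
    if depth != 8 or color not in (2, 6) or interlace:
        raise ValueError("only non-interlaced 8-bit RGB/RGBA is handled")
    bpp = 3 if color == 2 else 4
    stride = width * bpp
    raw = zlib.decompress(bytes(idat))
    rows, prev, pos = [], bytearray(stride), 0
    for _ in range(height):
        ftype = raw[pos]
        line = bytearray(raw[pos + 1:pos + 1 + stride])
        pos += 1 + stride
        for i in range(stride):
            a = line[i - bpp] if i >= bpp else 0  # left neighbour
            b = prev[i]                           # pixel above
            c = prev[i - bpp] if i >= bpp else 0  # upper-left neighbour
            if ftype == 0:
                pred = 0
            elif ftype == 1:
                pred = a
            elif ftype == 2:
                pred = b
            elif ftype == 3:
                pred = (a + b) // 2
            elif ftype == 4:
                pred = _paeth(a, b, c)
            else:
                raise ValueError("unknown PNG filter type")
            line[i] = (line[i] + pred) & 0xFF
        rows.append(line)
        prev = line
    return width, height, bpp, rows


def is_non_black(data, expected_size=(1280, 720)):
    """True if the PNG has the expected size and any non-zero colour byte."""
    width, height, bpp, rows = decode_rows(data)
    if expected_size is not None and (width, height) != expected_size:
        return False
    # Skip the alpha channel (every 4th byte) for RGBA images.
    return any(
        value
        for row in rows
        for i, value in enumerate(row)
        if bpp == 3 or i % 4 != 3
    )
```

Typical use would be `is_non_black(pathlib.Path(png_path).read_bytes())`. The scanlines have to be unfiltered before inspection because an opaque black RGBA image still contains non-zero bytes in its raw stream. Pure-Python decoding of a 1280×720 frame is slow, so expect it to take a noticeable moment per screenshot.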
## Out of scope (explicit non-goals)
- **GPU acceleration**: we use Mesa software rendering. Xvfb + llvmpipe is
enough for 1280×720 at a few FPS, which is what the harness needs.
- **Real display forwarding** (X11 forwarding, VNC, noVNC): doable but
unnecessary — Claude reads PNGs, not a live video feed.
- **Multi-arch images**: we ship x86_64 only. ARM (Apple Silicon via
Docker Desktop emulation) would need `Godot_v…_mono_linux_arm64.zip` —
straightforward to add if needed, not done here.
- **Shrinking the image**: Godot + .NET SDK add ~500 MB. Worth it;
multi-stage builds could trim later.
- **Keeping Xvfb warm across launches**: the single-launch pattern is clean
enough. If someone ever scripts dozens of rapid Godot starts and the
xvfb-run startup cost shows up, revisit approach (A).