Chessistics/autonomous_plan.md
Samuel Bouchet c451a50349 Headless Linux dev container: Godot + .NET + Xvfb for autonomous testing
Claude Code running inside the project's dev container can now build the
game, launch a real Godot instance under Xvfb, and drive the automation
harness end-to-end — no Windows dependency.

Dockerfile adds (as root, before USER node):
- X11 / Mesa software GL / audio runtime deps + python3
- .NET SDK 9.0 via upstream dot.net install script -> /usr/local/dotnet
- Godot 4.6.2-stable mono Linux x86_64 -> /opt/godot/godot
- /usr/local/bin/godot-xvfb wrapper: auto-wraps invocations in
  xvfb-run -a --server-args="-screen 0 1280x720x24 ..."

harness.py picks GODOT_BIN from env, defaults to /opt/godot/godot on
Linux, and auto-wraps the subprocess in xvfb-run when DISPLAY is unset.
Windows code path unchanged.

init-firewall.sh adds api.nuget.org to the allowlist so dotnet restore
works post-boot. Godot + .NET SDK are fetched at image build time, before
the firewall exists.

New docs:
- autonomous_plan.md: design rationale, alternatives considered
- README.md: launch instructions for Windows terminal / Docker Desktop /
  VS Code Dev Containers / native WSL2
- CLAUDE.md already documents the harness (done in previous commit)

Validation: docker build succeeds; inside the container, dotnet --version
=9.0.313, godot --version=4.6.2.stable.mono, dotnet test=102/102,
python3 tools/automation/smoke.py passes end-to-end with 14 non-black
1280x720 PNGs. Mission 1 screenshot is visually identical to the Windows
build, and Xvfb determinism is a bonus (det_a.png ≡ det_b.png bytewise).
2026-04-17 16:57:56 +02:00

Headless Linux dev container for autonomous Chessistics testing

Why

Today the automation harness only runs on Windows because the Godot binary is hardcoded to C:\Apps\godot\Godot_v4.6.2-stable_mono_win64_console.exe. When Claude Code runs inside the project's dev container (Linux / node:20), it can read source code and run dotnet test, but it cannot launch the actual game — there's no Godot binary, no .NET SDK, and no display server for the renderer.

The goal: make the dev container a self-contained environment where Claude Code can build the project, launch a real Godot instance in headless Linux mode, drive it via the automation harness, and read back 1280×720 PNG screenshots — all without any Windows dependency.

Design

Pieces required

  1. Godot 4.6.2-stable Mono for Linux — matches the Windows editor the project already uses. Installed once at image build time to /opt/godot/ with a symlink /opt/godot/godot.
  2. .NET SDK 9.0 — the project targets net9.0. Installed via the upstream dot.net install script to /usr/local/dotnet/.
  3. Xvfb + Mesa software GL — a virtual framebuffer at :99 so Godot's GL-compatibility renderer has somewhere to draw. xvfb-run wraps any command transparently.
  4. Python 3 — the automation harness is stdlib-only Python.
  5. Minimal X / audio runtime deps: libx11, libxcursor, libxrandr, libxi, libgl1, libgles2, libasound2, libxkbcommon0, etc. Without these, Godot exits on startup with "libXext not found"-style errors.

How Godot reaches the framebuffer

Two options considered:

  • (A) Run Xvfb :99 -screen 0 1280x720x24 & as a background process, export DISPLAY=:99, launch Godot normally. Persistent display, shared by many Godot runs.
  • (B) Use xvfb-run -a --server-args="-screen 0 1280x720x24" as a prefix on every Godot invocation. A fresh display per launch; cleans up automatically on exit.

Chosen: (B). The automation harness already spawns Godot once per Harness.launch() and cleans up on context exit, so a fresh display per launch matches that lifecycle naturally: there is no daemon to keep alive and no race on the display number.

A tiny wrapper, /usr/local/bin/godot-xvfb, runs xvfb-run … "$GODOT_BIN" "$@", so the harness (or a human) only has to invoke one path.

Integration with the existing harness

tools/automation/harness.py currently hardcodes the Windows Godot path. We teach it two things:

  1. Read GODOT_BIN from the environment first; fall back to the platform default (Windows path on Windows, /opt/godot/godot on Linux).
  2. On Linux, auto-prepend ["xvfb-run", "-a", "--server-args=-screen 0 1280x720x24"] to the Godot launch command unless DISPLAY is already set (someone has a real display, skip the wrap).

With those two tweaks, every existing Python helper (smoke.py, run_game.py, solve_*.py) works unchanged inside the container.
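The two tweaks above can be sketched roughly as follows. This is a minimal illustration, not the actual harness.py code: the function names resolve_godot_bin and build_launch_command are hypothetical, and treating every non-Windows platform as Linux is an assumption. Only the paths and the xvfb-run arguments come from this plan.

```python
import os
import sys

# Default paths quoted from this plan; everything else here is illustrative.
WINDOWS_GODOT = r"C:\Apps\godot\Godot_v4.6.2-stable_mono_win64_console.exe"
LINUX_GODOT = "/opt/godot/godot"
XVFB_PREFIX = ["xvfb-run", "-a", "--server-args=-screen 0 1280x720x24"]


def resolve_godot_bin(env, platform):
    """GODOT_BIN wins if set; otherwise fall back to the platform default."""
    override = env.get("GODOT_BIN")
    if override:
        return override
    # Assumption: anything that is not Windows gets the Linux default.
    return WINDOWS_GODOT if platform.startswith("win") else LINUX_GODOT


def build_launch_command(godot_args, env=None, platform=None):
    """Return the argv list to spawn, wrapped in xvfb-run when needed."""
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform
    command = [resolve_godot_bin(env, platform), *godot_args]
    # On Linux with no DISPLAY there is nowhere to draw, so add the wrapper.
    if platform.startswith("linux") and not env.get("DISPLAY"):
        command = XVFB_PREFIX + command
    return command


if __name__ == "__main__":
    # Prints the command this machine would use to launch the project.
    print(build_launch_command(["--path", "."]))
```

Passing env and platform explicitly keeps the decision logic testable without touching the real environment. The resulting list can be handed to subprocess.Popen as-is.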

Firewall considerations

The container's init-firewall.sh runs at postStartCommand, after the image is built, and drops all outbound traffic except to a small allowlist (GitHub, npmjs, Anthropic, Sentry, Statsig). Impact on our pieces:

  • Godot binary + .NET SDK: downloaded during docker build, which runs before the firewall exists → works unconditionally.
  • dotnet restore (runtime, e.g. after a git pull): needs api.nuget.org. Added to the allowlist.
  • Godot runtime: no outbound traffic required — the engine runs fully offline once installed.
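One way to confirm the allowlist behaves as described is a small reachability probe, run inside the container after the firewall is up. The sketch below is an assumption about how such a check could look, not part of init-firewall.sh: probe_hosts is a hypothetical helper, and a TCP connect on port 443 is only a rough stand-in for what dotnet restore actually needs.

```python
import socket


def probe_hosts(hosts, port=443, timeout=3.0, connect=socket.create_connection):
    """Map each host to True if a TCP connection on `port` succeeds."""
    results = {}
    for host in hosts:
        try:
            conn = connect((host, port), timeout=timeout)
        except OSError:
            # Covers DNS failures, refused connections, and timeouts
            # caused by dropped packets.
            results[host] = False
        else:
            conn.close()
            results[host] = True
    return results


if __name__ == "__main__":
    # api.nuget.org is the host this plan adds to the allowlist.
    for host, ok in probe_hosts(["api.nuget.org"]).items():
        print(f"{host}: {'reachable' if ok else 'blocked'}")
```

Inside the firewalled container, api.nuget.org should come back reachable, while a host outside the allowlist should come back blocked after the timeout. The connect parameter exists only so the function can be exercised without network access.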

Build sequence inside the Dockerfile

As root, before the existing USER node switch:

# 1. X/GL/audio runtime + python + xvfb
apt-get update && apt-get install -y xvfb xauth libx11-6 libxcursor1 libxinerama1 libxrandr2 \
  libxi6 libxext6 libgl1 libglx-mesa0 libgl1-mesa-dri libglu1-mesa \
  libasound2 libxkbcommon0 libxkbcommon-x11-0 libfontconfig1 libdbus-1-3 \
  python3 python3-pip

# 2. .NET SDK 9.0 via upstream install script
curl -sSL https://dot.net/v1/dotnet-install.sh | bash -s -- \
  --channel 9.0 --install-dir /usr/local/dotnet
ln -s /usr/local/dotnet/dotnet /usr/local/bin/dotnet

# 3. Godot 4.6.2-stable mono, Linux x86_64
wget https://github.com/godotengine/godot/releases/download/${VERSION}/\
Godot_v${VERSION}_mono_linux_x86_64.zip
unzip … -d /opt/godot
ln -s /opt/godot/Godot_v…/Godot_v…_mono_linux.x86_64 /opt/godot/godot

Then:

  • ENV GODOT_BIN=/opt/godot/godot
  • ENV PATH=$PATH:/opt/godot:/usr/local/dotnet
  • Drop in /usr/local/bin/godot-xvfb wrapper

File-by-file change list

| File | Change |
| --- | --- |
| .devcontainer/Dockerfile | Add Godot / dotnet / xvfb installs before USER node |
| .devcontainer/init-firewall.sh | Append api.nuget.org to the domain allowlist |
| .devcontainer/godot-xvfb.sh (new) | exec xvfb-run -a … "$GODOT_BIN" "$@" |
| tools/automation/harness.py | Env-aware Godot path + Linux xvfb auto-wrap |
| README.md (new) | Windows / WSL2 launch instructions for the dev container |

Nothing inside Scripts/ or chessistics-engine/ changes. The harness contract (inbox/outbox/screens) is platform-agnostic already.

Verification

After rebuild:

  1. docker build .devcontainer -t chessistics-dev succeeds.
  2. devcontainer up --workspace-folder . starts the container, and the post-start firewall script passes.
  3. Inside: dotnet --version → 9.0.x, godot --version → 4.6.2.stable.mono.official.*.
  4. dotnet build Chessistics.csproj → green.
  5. dotnet test chessistics-tests/ → 102 / 102.
  6. python3 tools/automation/smoke.py → loads mission 1, takes PNG screenshots that are non-black, quits cleanly.
  7. Read one of the PNGs — it should show the same mission 1 UI as the Windows run (title bar, board, objectives, stock panel).
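Steps 6 and 7 can be partly automated with a stdlib-only check of each screenshot's size and whether it contains any non-black pixel. The sketch below is an assumption, not the check smoke.py actually performs: inspect_png is a hypothetical helper, and it only handles 8-bit, non-interlaced RGB or RGBA PNGs. It relies on a property of PNG row filters: each filter predicts a byte only from bytes of the same channel, so an image is fully black exactly when every filtered colour byte is zero, and no unfiltering is needed.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def inspect_png(data):
    """Return (width, height, has_non_black_pixel) for an 8-bit RGB/RGBA PNG."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos, header, idat = 8, None, []
    while pos < len(data):
        # Each chunk: 4-byte length, 4-byte type, body, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if chunk_type == b"IHDR":
            header = struct.unpack(">IIBBBBB", body)
        elif chunk_type == b"IDAT":
            idat.append(body)
        elif chunk_type == b"IEND":
            break
        pos += 12 + length
    if header is None:
        raise ValueError("missing IHDR chunk")
    width, height, depth, color_type, _, _, interlace = header
    if depth != 8 or color_type not in (2, 6) or interlace != 0:
        raise ValueError("sketch only handles 8-bit non-interlaced RGB/RGBA")
    bytes_per_pixel = 3 if color_type == 2 else 4
    raw = zlib.decompress(b"".join(idat))
    stride = width * bytes_per_pixel
    for row in range(height):
        start = row * (stride + 1) + 1  # skip the per-row filter-type byte
        line = raw[start:start + stride]
        # Look at the R, G, B channels only; alpha (if present) is ignored.
        if any(any(line[channel::bytes_per_pixel]) for channel in range(3)):
            return width, height, True
    return width, height, False
```

A caller would read a screenshot with open(path, "rb").read() and expect a result of the form (1280, 720, True). Comparing against the Windows run (step 7) still needs a human or a model looking at the image.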

Out of scope (explicit non-goals)

  • GPU acceleration: we use Mesa software rendering. Xvfb + llvmpipe is enough for 1280×720 at a few FPS, which is what the harness needs.
  • Real display forwarding (X11 forwarding, VNC, noVNC): doable but unnecessary — Claude reads PNGs, not a live video feed.
  • Multi-arch images: we ship x86_64 only. ARM (Apple Silicon via Docker Desktop emulation) would need Godot_v…_mono_linux_arm64.zip — straightforward to add if needed, not done here.
  • Shrinking the image: Godot and the .NET SDK add ~500 MB. Worth it; multi-stage builds could trim this later.
  • Keeping Xvfb warm across launches: the single-launch pattern is clean enough. If someone ever scripts dozens of rapid Godot starts and the xvfb-run startup cost shows up, revisit approach (A).