Contributing

Testing

Overview

Tests are split into two categories:

  • Unit tests — no infrastructure dependency; run by default.
  • End-to-end tests — exercise the Docker-backed orchestrator lifecycle with real runtime assets; gated by the e2e build tag.

End-to-end tests are reserved for system-level paths that are not well represented by unit tests:

ScopeTestsWhat it provesRequirement
OrchestratorTestE2E_Orchestrator_*Real orchestrator and agent assets can pass unauthenticated lifecycle flows.docker on PATH and daemon

Conventions

CategoryBuild tagFile suffixExample
Unit(none)_test.godocker_test.go
End-to-end//go:build e2e_e2e_test.goorchestrator_e2e_test.go

Every end-to-end file must start with:

//go:build e2e

(blank line after the directive is required by Go tooling.)

End-to-end files may also keep a requireDockerOrSkip(t) helper for local runs where Docker is absent but the tag is active.

End-to-end test functions must use the TestE2E_ prefix:

go test -tags=e2e -run '^TestE2E_' ./...

Running tests locally

# Unit tests only (no Docker required)
make test-unit

# All Go tests with end-to-end files enabled
make test-e2e

# Only end-to-end tests
make test-e2e-only

# Only one orchestrator's end-to-end tests
make test-e2e-claude-code
make test-e2e-codex
make test-e2e-gastown

# Only up tests, across orchestrators
make test-e2e-up

# Only phase-by-phase tests, across orchestrators
make test-e2e-phase-by-phase

# Shell script tests
make test-scripts

make test is an alias for make test-unit.

Command Reference

CommandWhat it runsIncludes unit testsIncludes end-to-end tests
make test-unitgo test ./...yesno
make test-e2ego test -tags=e2e ./...yesyes
make test-e2e-onlygo test -tags=e2e -run '^TestE2E_' ./...noyes
make test-e2e-upAggregate taxiway up tests for all orchestratorsnoyes
make test-e2e-phase-by-phasePhase-by-phase lifecycle tests for all orchestratorsnoyes
make test-e2e-claude-codeClaude Code orchestrator end-to-end testsnoyes
make test-e2e-claude-code-upClaude Code taxiway up end-to-end testnoyes
make test-e2e-claude-code-phase-by-phaseClaude Code phase-by-phase end-to-end testnoyes
make test-e2e-codexCodex orchestrator end-to-end testsnoyes
make test-e2e-codex-upCodex taxiway up end-to-end testnoyes
make test-e2e-codex-phase-by-phaseCodex phase-by-phase end-to-end testnoyes
make test-e2e-gastownGas Town orchestrator end-to-end testsnoyes
make test-e2e-gastown-upGas Town taxiway up end-to-end testnoyes
make test-e2e-gastown-phase-by-phaseGas Town phase-by-phase end-to-end testnoyes
make test-scriptsshell script tests under tests/scripts/nono

Use make test-e2e for the Go end-to-end gate. Use make test-e2e-only when you want only end-to-end test functions. Use the orchestrator-specific targets when debugging one integration locally. Use the *-up targets for the shorter up path before running the *-phase-by-phase targets.

E2E targets use E2E_TIMEOUT as their Go test timeout. Override it locally when debugging slow Docker pulls or first-run setup:

E2E_TIMEOUT=1800s make test-e2e-gastown-up

Shell Script Tests

Shell script tests validate small runtime shell contracts without creating a lab or requiring Docker. They run through make test-scripts:

ScriptWhat it validates
tests/scripts/test_standalone_install.shRoot install.sh defaults, help text, runtime directory, and staged runtime replacement.
tests/scripts/test_verify_headers.shVerify scripts start user-visible output with section headers before OK lines.
tests/scripts/infra/commands/test_reset.shinfra/commands/reset.sh confirmation behavior, cleanup, and marker preservation.
tests/scripts/infra/trace/test_events.shinfra/trace/events.sh event output, source fallback, and ignored trace env vars.
tests/scripts/infra/workspace/test_clone.shinfra/workspace/clone.sh URL rewriting, clone flow, ref checkout, and token handling.
tests/scripts/orchestrators/gastown/test_workspace.shorchestrators/gastown/workspace.sh skip behavior, name validation, HQ initialization, rig registration, ref checkout, and crew workspace provisioning.

End-to-end Scope

The end-to-end suite runs Taxiway through the Docker driver and real orchestrator/agent asset layouts for claude-code, codex, and gastown. It runs install and verify scripts for each orchestrator and its declared agents, then verifies that each lab can reconcile gateway access, mirror the public manufacture-dev/agreement-hub fixture repository into a lab-local bare Git repository, let the orchestrator workspace scripts clone from that isolated local remote, start, stop, restart, diagnose, reset, and remove a lab. After start and restart, the suite also runs taxiway shell <lab> --check to verify that the public shell command can resolve a ready target without opening an interactive session.

Separate Docker-driver, fixture-orchestrator, and installed-runtime-layout end-to-end tests are redundant for this lifecycle path; driver details, runtime path resolution, gateway handling, and phase edge cases are covered by unit tests, including Docker driver command construction against a fake docker binary.

These tests intentionally pass --skip-auth-check, do not run the interactive taxiway auth command, do not use real API keys, and do not exercise browser or device-flow login. Authenticated agent execution is outside this layer because it depends on external accounts, quotas, opaque local credential stores, and interactive state.

Docker-backed end-to-end tests skip cleanly when Docker is unavailable. Set LAB_NO_DOCKER=1 to force the skip path:

LAB_NO_DOCKER=1 make test-e2e-only

The Docker driver details, prerequisites, and troubleshooting notes live in Docker.

CI Structure

core-checks runs on every push and pull request. It covers build, unit tests, shell script tests, and lints.

JobWhat it runsDocker daemon
core-checksgo build + make test-unit + make test-scripts + lintsNo

The Docker-backed orchestrator end-to-end suite is intentionally not a PR merge gate because real installs make it too slow and sometimes dependent on external package availability. It runs from the End-to-end workflow on a daily schedule and through workflow_dispatch. That workflow runs a matrix with one job per orchestrator. Each job runs the taxiway up test for its orchestrator first, then runs that same orchestrator's phase-by-phase test. The orchestrator jobs run in parallel with fail-fast disabled.

Adding a new test

Unit test — write it in the nearest _test.go file (no build tag needed). Verify it compiles and passes with make test-unit.

End-to-end test — create or extend a *_e2e_test.go file in the same package. Add //go:build e2e as the very first line, followed by a blank line. Name the test function TestE2E_<area>_<case>. Verify with make test-e2e-only and make test-e2e.