Testing Guide
Overview
Clawker uses a multi-tier testing strategy with no build tags: test categories are separated by directory.

| Category | Directory | Docker Required | Purpose |
|---|---|---|---|
| Unit | *_test.go (co-located) | No | Pure logic, fakes, mocks |
| E2E | test/e2e/ | Yes | Full-stack integration (firewall, mounts, migrations, presets) |
| Whail | test/whail/ | Yes + BuildKit | BuildKit integration, engine-level builds |
Running Tests
Additional Makefile Targets
| Target | Purpose |
|---|---|
| make test / make test-unit | Unit tests only (excludes test/ suites) |
| make test-ci | Unit tests with race detector + coverage output |
| make test-all | All test suites in sequence |
| make test-coverage | Unit tests with HTML coverage report |
| make test-clean | Remove Docker resources labeled dev.clawker.test=true |

The Makefile targets use gotestsum (if installed) for human-friendly output with icons and colors, falling back to go test.
Running Specific Tests
Golden File Testing
Standard Golden Files
Some tests compare output against golden files or recorded data; update these after intentional changes.

Firewall Corefile Golden
The firewall package has a golden file test for CoreDNS config generation (internal/firewall/coredns_test.go). The golden file at internal/firewall/testdata/corefile_basic.golden must be hand-edited to update.
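
As a rough illustration of a compare-only golden check (one that never rewrites the file, so a mismatch has to be resolved by hand), here is a minimal sketch. The names renderCorefile and compareGolden and the sample Corefile content are assumptions made for this example, not the actual code in internal/firewall.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"path/filepath"
)

// renderCorefile is a hypothetical stand-in for the generator under test.
func renderCorefile(zones []string) []byte {
	var b bytes.Buffer
	for _, z := range zones {
		fmt.Fprintf(&b, "%s {\n    forward . 1.1.1.1\n}\n", z)
	}
	return b.Bytes()
}

// compareGolden returns an error when got differs from the golden file.
// It never writes to the file: updating the baseline is a manual edit.
func compareGolden(path string, got []byte) error {
	want, err := os.ReadFile(path)
	if err != nil {
		return fmt.Errorf("read golden: %w", err)
	}
	if !bytes.Equal(want, got) {
		return fmt.Errorf("golden mismatch for %s\n--- want\n%s--- got\n%s", path, want, got)
	}
	return nil
}

// run writes a blessed file into a temp dir and checks both outcomes.
func run() error {
	dir, err := os.MkdirTemp("", "golden-sketch")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)

	golden := filepath.Join(dir, "corefile_basic.golden")
	blessed := []byte("example.com {\n    forward . 1.1.1.1\n}\n")
	if err := os.WriteFile(golden, blessed, 0o644); err != nil {
		return err
	}

	// Output identical to the blessed file passes.
	if err := compareGolden(golden, renderCorefile([]string{"example.com"})); err != nil {
		return err
	}
	// Changed output must be reported, not silently accepted.
	if err := compareGolden(golden, renderCorefile([]string{"example.org"})); err == nil {
		return fmt.Errorf("expected a mismatch for changed output")
	}
	return nil
}

func main() {
	if err := run(); err != nil {
		panic(err)
	}
}
```

In a real test the comparison would live in a *_test.go file and report through t.Fatalf; the temp-dir setup here only exists so the sketch runs as a standalone program.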
Storage Oracle + Golden Strategy
The internal/storage package uses a defense-in-depth approach with two independent guards for merge correctness:
| Layer | How it works | What it catches |
|---|---|---|
| Oracle (randomized) | Computes expected merge from spec rules (~15 lines), independent of prod code. Runs every time with a new seed. | Any merge bug that manifests for the random placement |
| Golden (fixed seed) | Hardcoded struct literal blessed from known-correct state. No auto-update. | Any regression from the blessed baseline, including oracle bugs |
make storage-golden prints new values with interactive confirmation. The STORAGE_GOLDEN_BLESS env var is specific to this one test (no global sweep risk).
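
The two guards can be sketched as follows. This is an illustration under assumptions, not the internal/storage implementation: the merge rule ("the last layer that sets a key wins"), the layer type, and every function name are hypothetical, and the golden check uses a fixed literal input in place of a fixed seed so the blessed value can be verified by reading it.

```go
package main

import (
	"fmt"
	"math/rand"
	"reflect"
	"time"
)

// layer is a hypothetical config layer; later layers override earlier ones.
type layer map[string]int

// mergeLayers stands in for the production merge: apply layers in order.
func mergeLayers(layers []layer) map[string]int {
	out := map[string]int{}
	for _, l := range layers {
		for k, v := range l {
			out[k] = v
		}
	}
	return out
}

// oracleMerge derives the expected result straight from the rule, scanning
// each key from the last layer backwards. It shares no code with mergeLayers.
func oracleMerge(layers []layer, keys []string) map[string]int {
	out := map[string]int{}
	for _, k := range keys {
		for i := len(layers) - 1; i >= 0; i-- {
			if v, ok := layers[i][k]; ok {
				out[k] = v
				break
			}
		}
	}
	return out
}

// randomLayers places each key in a random subset of one to four layers.
func randomLayers(r *rand.Rand, keys []string) []layer {
	layers := make([]layer, 1+r.Intn(4))
	for i := range layers {
		layers[i] = layer{}
		for _, k := range keys {
			if r.Intn(2) == 0 {
				layers[i][k] = r.Intn(100)
			}
		}
	}
	return layers
}

// checkOracle compares the merge against the oracle for one random placement.
func checkOracle(seed int64) error {
	keys := []string{"a", "b", "c"}
	layers := randomLayers(rand.New(rand.NewSource(seed)), keys)
	got, want := mergeLayers(layers), oracleMerge(layers, keys)
	if !reflect.DeepEqual(got, want) {
		return fmt.Errorf("seed %d: got %v, oracle says %v", seed, got, want)
	}
	return nil
}

// checkGolden compares the merge against a hand-blessed literal.
func checkGolden() error {
	layers := []layer{{"a": 1, "b": 2}, {"b": 3}, {"a": 5, "c": 4}}
	golden := map[string]int{"a": 5, "b": 3, "c": 4}
	if got := mergeLayers(layers); !reflect.DeepEqual(got, golden) {
		return fmt.Errorf("golden regression: got %v, want %v", got, golden)
	}
	return nil
}

func main() {
	seed := time.Now().UnixNano() // new seed on every run
	if err := checkOracle(seed); err != nil {
		panic(err) // the message carries the seed so the failure can be replayed
	}
	if err := checkGolden(); err != nil {
		panic(err)
	}
}
```

The point of keeping both is that they fail independently: a bug in the oracle itself cannot move the hardcoded literal, and a case the literal does not cover can still be hit by a random placement.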
Local Development Environment
The make localenv target creates an isolated XDG directory tree for manual UAT without polluting your real config:
The target creates the base XDG directories (.config/, .local/share/, .local/state/, .cache/). The CLI creates its own clawker/ subdirectories on first use (e.g., clawker project init). The exported env vars point to the app-level paths so the storage resolver picks them up.
Writing Tests
Isolated Test Environments (internal/testenv)
The testenv package provides unified, progressively-configured test environments for any test that needs XDG directory isolation. It eliminates duplicated directory setup across test helpers.
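
To make "progressively configured" concrete, here is a dependency-free sketch of the functional-options shape such a helper could take. It is not the testenv API: newEnv, option, withConfigFile, and the settings.yaml file name are hypothetical, and a real test helper would take a *testing.T and use t.TempDir() and t.Setenv() instead of the manual temp-dir handling shown here.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// env holds the isolated directory tree, keyed by XDG variable name.
type env struct {
	Root string
	Vars map[string]string
}

// option adds one capability on top of the bare directory tree.
type option func(*env) error

// newEnv creates the four base XDG directories under root, then applies
// each option in order.
func newEnv(root string, opts ...option) (*env, error) {
	e := &env{Root: root, Vars: map[string]string{
		"XDG_CONFIG_HOME": filepath.Join(root, ".config"),
		"XDG_DATA_HOME":   filepath.Join(root, ".local", "share"),
		"XDG_STATE_HOME":  filepath.Join(root, ".local", "state"),
		"XDG_CACHE_HOME":  filepath.Join(root, ".cache"),
	}}
	for _, dir := range e.Vars {
		if err := os.MkdirAll(dir, 0o755); err != nil {
			return nil, err
		}
	}
	for _, opt := range opts {
		if err := opt(e); err != nil {
			return nil, err
		}
	}
	return e, nil
}

// withConfigFile is a hypothetical option that seeds a file in the config dir.
func withConfigFile(name, content string) option {
	return func(e *env) error {
		path := filepath.Join(e.Vars["XDG_CONFIG_HOME"], name)
		return os.WriteFile(path, []byte(content), 0o644)
	}
}

// run builds an environment with one option and reads the seeded file back.
func run() error {
	root, err := os.MkdirTemp("", "testenv-sketch")
	if err != nil {
		return err
	}
	defer os.RemoveAll(root)

	e, err := newEnv(root, withConfigFile("settings.yaml", "key: value\n"))
	if err != nil {
		return err
	}
	data, err := os.ReadFile(filepath.Join(e.Vars["XDG_CONFIG_HOME"], "settings.yaml"))
	if err != nil {
		return err
	}
	if string(data) != "key: value\n" {
		return fmt.Errorf("unexpected config content %q", data)
	}
	return nil
}

func main() {
	if err := run(); err != nil {
		panic(err)
	}
}
```

Callers that only need directories pass no options; callers that need more stack options on top, which is what keeps the directory setup in one place.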
Writing Config Files in Tests
Use WriteYAML to place config files at canonical locations:
ConfigFile constants: ProjectConfig, ProjectConfigLocal, Settings, EgressRules, ProjectRegistry.
Delegation
Higher-level helpers delegate to testenv:

- configmocks.NewIsolatedTestConfig(t) → testenv.New(t, testenv.WithConfig())
- projectmocks.NewTestProjectManager(t, gf) → testenv.New(t, testenv.WithProjectManager(gf))
- test/e2e/harness.NewIsolatedFS() → testenv.New(h.T) + project dir + chdir
Test Infrastructure
Each package in the dependency DAG provides test utilities so dependents can mock the entire chain:

| Package | Test Utils | Provides |
|---|---|---|
| internal/testenv | testenv/ | New(t, opts...) → isolated XDG dirs + optional Config/ProjectManager; WriteYAML for config file placement |
| internal/docker | dockertest/ | FakeClient (wraps whailtest.FakeAPIClient), SetupXxx helpers, fixtures, assertions (AssertCalled, AssertNotCalled, AssertCalledN) |
| internal/config | mocks/ | NewBlankConfig(), NewFromString(projectYAML, settingsYAML), NewIsolatedTestConfig(t), ConfigMock (moq-generated) |
| internal/git | gittest/ | InMemoryGitManager (memfs-backed, seeded with initial commit) |
| internal/project | mocks/ | NewMockProjectManager(), NewMockProject(name, repoPath), NewTestProjectManager(t, gitFactory) |
| pkg/whail | whailtest/ | FakeAPIClient (80+ Fn fields, call recording), build scenarios (Simple, Cached, MultiStage, Error, etc.), EventRecorder |
| internal/firewall | mocks/ | FirewallManagerMock (moq-generated, 15 methods) |
| internal/iostreams | Test() | iostreams.Test() → (*IOStreams, *bytes.Buffer, *bytes.Buffer, *bytes.Buffer) |
| internal/hostproxy | hostproxytest/ | MockHostProxy for integration tests |
| internal/storage | ValidateDirectories() | XDG directory collision detection |
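
The "Fn fields plus call recording" style of fake listed above can be sketched in a few lines. Everything here is illustrative: containerStopper, fakeClient, ContainerStopFn, calledN, and stopAll are hypothetical names, and the real whailtest.FakeAPIClient and dockertest assertions may be shaped differently.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// containerStopper is the narrow interface the code under test depends on.
type containerStopper interface {
	ContainerStop(ctx context.Context, id string) error
}

// fakeClient exposes one overridable Fn field per method and records
// every call by method name.
type fakeClient struct {
	ContainerStopFn func(ctx context.Context, id string) error
	calls           []string
}

func (f *fakeClient) ContainerStop(ctx context.Context, id string) error {
	f.calls = append(f.calls, "ContainerStop")
	if f.ContainerStopFn == nil {
		// Fail loudly so a test cannot pass on behavior it never configured.
		return errors.New("fakeClient: ContainerStopFn not set")
	}
	return f.ContainerStopFn(ctx, id)
}

// calledN reports how many times a method was invoked.
func (f *fakeClient) calledN(method string) int {
	n := 0
	for _, c := range f.calls {
		if c == method {
			n++
		}
	}
	return n
}

// stopAll is the hypothetical code under test.
func stopAll(ctx context.Context, c containerStopper, ids []string) error {
	for _, id := range ids {
		if err := c.ContainerStop(ctx, id); err != nil {
			return fmt.Errorf("stop %s: %w", id, err)
		}
	}
	return nil
}

// run exercises both a configured and an unconfigured fake.
func run() error {
	var stopped []string
	fake := &fakeClient{ContainerStopFn: func(_ context.Context, id string) error {
		stopped = append(stopped, id)
		return nil
	}}
	if err := stopAll(context.Background(), fake, []string{"a", "b"}); err != nil {
		return err
	}
	if n := fake.calledN("ContainerStop"); n != 2 || len(stopped) != 2 {
		return fmt.Errorf("expected 2 calls, got %d (%v)", n, stopped)
	}
	// An unconfigured fake surfaces an error instead of silently succeeding.
	if err := stopAll(context.Background(), &fakeClient{}, []string{"a"}); err == nil {
		return errors.New("expected an error from the unconfigured fake")
	}
	return nil
}

func main() {
	if err := run(); err != nil {
		panic(err)
	}
}
```

Per-method Fn fields keep each test's setup local: a test overrides only the calls it cares about, and the recorded call list backs assertions in the spirit of AssertCalled and AssertCalledN.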
Command Test Pattern
Commands are tested using the Cobra+Factory pattern with dockertest.FakeClient. Each command’s test file typically defines a testFactory helper that wires the minimum closures needed (Config, Logger, Client, etc.). The pattern looks like:
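
A dependency-free sketch of that wiring follows. To stay runnable without third-party modules it uses the standard flag package in place of Cobra, and every type in it (factory, stopOptions, stopper, fakeStopper) is a hypothetical stand-in rather than Clawker's cmdutil.Factory or dockertest.FakeClient. It shows the factory built as a struct literal of closures, and a runF hook that either traps the parsed options or, when nil, lets the real run function execute against the fake.

```go
package main

import (
	"bytes"
	"flag"
	"fmt"
	"io"
)

// stopper is a hypothetical slice of the Docker client a command needs.
type stopper interface{ Stop(agent string) error }

// factory stands in for cmdutil.Factory: plain fields and closures, built
// as a struct literal in tests rather than through a constructor.
type factory struct {
	Out    io.Writer
	Client func() (stopper, error)
}

type stopOptions struct {
	Agent  string
	Force  bool
	Out    io.Writer
	Client func() (stopper, error)
}

// newCmdStop parses flags into stopOptions. A non-nil runF intercepts the
// parsed options; a nil runF runs the real implementation.
func newCmdStop(f *factory, runF func(*stopOptions) error) func(args []string) error {
	return func(args []string) error {
		opts := &stopOptions{Out: f.Out, Client: f.Client}
		fs := flag.NewFlagSet("stop", flag.ContinueOnError)
		fs.SetOutput(io.Discard)
		fs.StringVar(&opts.Agent, "agent", "", "agent to stop")
		fs.BoolVar(&opts.Force, "force", false, "skip graceful shutdown")
		if err := fs.Parse(args); err != nil {
			return err
		}
		if runF != nil {
			return runF(opts)
		}
		return stopRun(opts)
	}
}

// stopRun is the real command body: resolve the client, act, report.
func stopRun(opts *stopOptions) error {
	client, err := opts.Client()
	if err != nil {
		return err
	}
	if err := client.Stop(opts.Agent); err != nil {
		return err
	}
	_, err = fmt.Fprintf(opts.Out, "stopped %s\n", opts.Agent)
	return err
}

// fakeStopper records calls instead of talking to Docker.
type fakeStopper struct{ stopped []string }

func (s *fakeStopper) Stop(agent string) error {
	s.stopped = append(s.stopped, agent)
	return nil
}

// testFactory wires only what the command under test needs.
func testFactory(c stopper) (*factory, *bytes.Buffer) {
	out := &bytes.Buffer{}
	return &factory{Out: out, Client: func() (stopper, error) { return c, nil }}, out
}

func run() error {
	// Trap the options to check flag parsing only.
	var got *stopOptions
	f, _ := testFactory(&fakeStopper{})
	trap := func(o *stopOptions) error { got = o; return nil }
	if err := newCmdStop(f, trap)([]string{"--agent", "dev", "--force"}); err != nil {
		return err
	}
	if got == nil || got.Agent != "dev" || !got.Force {
		return fmt.Errorf("flags not mapped: %+v", got)
	}

	// A nil runF exercises the full path against the fake client.
	fake := &fakeStopper{}
	f, out := testFactory(fake)
	if err := newCmdStop(f, nil)([]string{"--agent", "dev"}); err != nil {
		return err
	}
	if len(fake.stopped) != 1 || out.String() != "stopped dev\n" {
		return fmt.Errorf("unexpected result: %v %q", fake.stopped, out.String())
	}
	return nil
}

func main() {
	if err := run(); err != nil {
		panic(err)
	}
}
```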
Three Test Tiers for Commands
| Tier | Method | What It Tests |
|---|---|---|
| 1. Flag Parsing | runF trapdoor | Flags map correctly to Options fields |
| 2. Integration | nil runF + fake Docker | Full pipeline (flags + Docker calls + output) |
| 3. Unit | Direct function call | Domain logic without Cobra or Factory |
E2E Test Harness (test/e2e/harness/)
For E2E tests exercising the full stack with real Docker:
Unset FactoryOptions fall back to test doubles (configmocks.NewBlankConfig, dockertest.FakeClient, hostproxytest.MockManager, firewallmocks.FirewallManagerMock), while Logger always creates a real file logger via logger.New, and ProjectManager, GitManager, and SocketBridge default to nil.
Harness Types
| Type | Purpose |
|---|---|
| Harness | Isolated test environment with CLI execution (T, Opts) |
| RunResult | CLI command outcome (ExitCode, Err, Stdout, Stderr, Factory) |
| SetupResult | Embeds *testenv.Env + ProjectDir from NewIsolatedFS |
| FSOptions | Override project dir name (default: "testproject") |
| FactoryOptions | 7 pluggable constructors: Config, Client, ProjectManager, GitManager, HostProxy, SocketBridge, Firewall |
Harness Functions
| Function | Purpose |
|---|---|
| NewIsolatedFS(opts) | Creates isolated XDG dirs, builds clawker binary, registers cleanup |
| Run(args...) | Fresh Factory → root.NewCmdRoot → execute (full Cobra pipeline) |
| RunInContainer(agent, cmd...) | container run --rm --agent <agent> @ <cmd> |
| ExecInContainer(agent, cmd...) | container exec --user claude --agent <agent> <cmd> |
| ExecInContainerAsRoot(agent, cmd...) | container exec --agent <agent> <cmd> (root) |
| NewFactory(t, opts) | Constructs Factory with lazy singletons; returns in/out/err buffers |
Cleanup
NewIsolatedFS registers a single cleanup chain:
- Stop daemons (firewall down, host-proxy stop)
- Remove shared firewall containers (clawker-envoy, clawker-coredns)
- Remove test-labeled containers, volumes, networks (by dev.clawker.test.name label)
clawker.log and firewall.log from the test’s state dir.
Project Test Double Scenarios
Use internal/project/mocks/stubs.go to pick the lightest project dependency double:
| Need | Helper | What You Get |
|---|---|---|
| Pure behavior mock, no config/git I/O | projectmocks.NewMockProjectManager() | Panic-safe ProjectManagerMock with no-op defaults, easy per-method overrides |
| Mock project with identity | projectmocks.NewMockProject(name, repoPath) | ProjectMock with read accessors (Name, RepoPath, Record) populated; mutation methods return zero values |
| Isolated file-backed config + real PM | projectmocks.NewTestProjectManager(t, gitFactory) | Real ProjectManager backed by testenv; supports Register/Remove/List round-trips |
Key Conventions
- All tests must pass before any change is complete: make test at minimum
- No build tags: test categories are separated by directory
- Always use t.Cleanup() for resource cleanup
- Use context.Background() in cleanup functions, because the parent context may be cancelled
- Unique agent names: include timestamp + random suffix for parallel safety
- Never import test/e2e/harness in co-located unit tests: it is too heavy (pulls Docker SDK)
- Never call factory.New() in tests; construct &cmdutil.Factory{} struct literals directly
- Docker resource labeling: all test resources carry dev.clawker.test=true + dev.clawker.test.name=TestName; whail tests use com.whail.test.managed=true
- Use make test-clean to remove leaked Docker resources from failed test runs
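
Two of these conventions are easy to get subtly wrong, so here is a small sketch of each: a collision-resistant agent name, and a cleanup context that does not inherit the test's cancelled context. The helper names uniqueAgentName and cleanupContext, the name format, and the 30-second timeout are assumptions made for illustration, not helpers that exist in the repository.

```go
package main

import (
	"context"
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"time"
)

// uniqueAgentName combines a timestamp with a random suffix so parallel
// tests (and quick reruns) do not collide on container names.
func uniqueAgentName(prefix string) string {
	suffix := make([]byte, 4)
	if _, err := rand.Read(suffix); err != nil {
		panic(err) // the system randomness source is unavailable
	}
	return fmt.Sprintf("%s-%d-%s", prefix, time.Now().UnixNano(), hex.EncodeToString(suffix))
}

// cleanupContext returns a fresh context for teardown work. It derives from
// context.Background() rather than the test's context, which may already be
// cancelled by the time cleanup runs.
func cleanupContext(timeout time.Duration) (context.Context, context.CancelFunc) {
	return context.WithTimeout(context.Background(), timeout)
}

func main() {
	// Simulate a test whose own context has already been cancelled.
	testCtx, cancel := context.WithCancel(context.Background())
	cancel()

	ctx, done := cleanupContext(30 * time.Second)
	defer done()

	fmt.Println("test context error:", testCtx.Err())
	fmt.Println("cleanup context usable:", ctx.Err() == nil)
	fmt.Println("agent:", uniqueAgentName("e2e"))
}
```

In a test, the cleanup context would be created inside the function passed to t.Cleanup(), so teardown calls to Docker still have a live context after the test body returns.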