Testing Guide
Overview
Clawker uses a multi-tier testing strategy with no build tags — test categories are separated by directory.| Category | Directory | Docker Required | Purpose |
|---|---|---|---|
| Unit | *_test.go (co-located) | No | Pure logic, fakes, mocks |
| CLI | test/cli/ | Yes | Testscript-based CLI workflow validation |
| Commands | test/commands/ | Yes | Command integration (create/exec/run/start) |
| Internals | test/internals/ | Yes | Container scripts/services (firewall, SSH) |
| Whail | test/whail/ | Yes + BuildKit | BuildKit integration, engine-level builds |
| Agents | test/agents/ | Yes | Full agent lifecycle, loop tests |
Running Tests
Running Specific CLI Tests
Golden File Testing
Some tests compare output against golden files. To update after intentional changes:Fawker Demo CLI
Fawker is a demo CLI with faked dependencies and recorded scenarios — no Docker required. Use it for visual UAT:Writing Tests
Test Infrastructure
Each package in the dependency DAG provides test utilities so dependents can mock the entire chain:| Package | Test Utils | Provides |
|---|---|---|
internal/docker | dockertest/ | FakeClient, fixtures, assertions |
internal/config | configtest/ | InMemoryRegistryBuilder, InMemoryProjectBuilder |
internal/git | gittest/ | InMemoryGitManager |
pkg/whail | whailtest/ | FakeAPIClient |
internal/iostreams | iostreamstest/ | iostreamstest.New() |
Command Test Pattern
Commands are tested using the Cobra+Factory pattern withdockertest.FakeClient:
Three Test Tiers for Commands
| Tier | Method | What It Tests |
|---|---|---|
| 1. Flag Parsing | runF trapdoor | Flags map correctly to Options fields |
| 2. Integration | nil runF + fake Docker | Full pipeline (flags + Docker calls + output) |
| 3. Unit | Direct function call | Domain logic without Cobra or Factory |
Test Harness (test/harness/)
For integration tests with real Docker:
Config Builder Presets
Key Conventions
- All tests must pass before any change is complete —
make testat minimum - No build tags — test categories separated by directory
- Always use
t.Cleanup()for resource cleanup - Use
context.Background()in cleanup functions — parent context may be cancelled - Unique agent names — include timestamp + random suffix for parallel safety
- Never import
test/harnessin co-located unit tests — too heavy (pulls Docker SDK) - Never call
factory.New()in tests — construct&cmdutil.Factory{}struct literals directly