Docker harness contract

This contract applies when you pass --docker to archal run. Outside Docker or sandbox mode, service simulation is low-fidelity and intended for debugging only. In Docker mode, Archal builds and runs your repo in a container, injects environment variables describing the scenario, and captures stdout as the agent response.

How it works

Archal resolves the repo-local harness you passed to --harness.
Archal builds your image from the repo-root Dockerfile (or a generated one if none exists).
Archal starts a TLS intercept sidecar that transparently forwards clone API traffic to the hosted cloud clones and adds the session auth header.
Archal runs the container, passing all env vars described below and mounting an output directory at /agent-output/.
Your container exits. Archal reads stdout as the agent response, logs stderr, and marks the run as failed if the exit code is non-zero.

Environment variables

All values are strings.

Required

Variable	Contains
`AGENT_TASK`	The scenario `## Prompt` text. `## Setup` seeds the clone and is not included; `## Expected Behavior` is the evaluator holdout and is also excluded.
`AGENT_RUN_MODE`	Always `"local"`. Identifies this as a local harness run.
`AGENT_METRICS_FILE`	Absolute path to `/agent-output/metrics.json`. Write a JSON metrics payload here before exiting (see Metrics below).
`AGENT_TRACE_FILE`	Absolute path to `/agent-output/agent-trace.json`. Write a trace payload here before exiting (optional).
`NODE_EXTRA_CA_CERTS` / `SSL_CERT_FILE` / `REQUESTS_CA_BUNDLE` / `CURL_CA_BUNDLE`	CA bundle paths for clients that need to trust the TLS intercept sidecar (see TLS trust vars below).

The harness should call normal service domains and use normal SDK credentials. Archal-owned clone URLs, MCP server configs, and bearer tokens stay outside the container process.
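
The required variables above can be collected in one place at startup. The following is a minimal sketch, not part of Archal: readHarnessEnv is a hypothetical helper name, and failing fast when AGENT_TASK is missing is a design assumption rather than something the contract mandates.

```javascript
// Sketch: gather the contract's environment variables into one object.
// readHarnessEnv is a hypothetical helper, not an Archal API.
function readHarnessEnv(env = process.env) {
  const task = env.AGENT_TASK;
  if (!task) {
    // Assumption: without a task there is nothing useful the harness can do.
    throw new Error('AGENT_TASK is not set; was the harness started by archal run --docker?');
  }
  return {
    task,
    runMode: env.AGENT_RUN_MODE,
    metricsFile: env.AGENT_METRICS_FILE,
    traceFile: env.AGENT_TRACE_FILE,
    // Optional variable: undefined unless the caller supplied a model.
    model: env.AGENT_MODEL,
  };
}
```

A harness would typically call readHarnessEnv() once at the top of its entry point and pass the result to its agent loop.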

Optional - present when set by the caller

Variable	Contains
`AGENT_MODEL`	Model identifier the harness should use, e.g. `claude-sonnet-4-6`. Set by `--agent-model` or `-m`.
`AGENT_SESSION_ID`	Session ID for the current run.
`ANTHROPIC_API_KEY`	Forwarded from the host environment if set.
`OPENAI_API_KEY`	Forwarded from the host environment if set.
`GEMINI_API_KEY`	Forwarded from the host environment if set.
`NODE_ENV`	Forwarded from the host environment if set.
`AGENT_CLONE_URLS`	JSON map of clone names to service-shaped base URLs reachable inside the container, e.g. `{"stripe":"https://api.stripe.com"}`. Most harnesses can ignore this and call normal SDK defaults.

AGENT_CLONE_URLS never contains hosted Archal clone URLs or Archal bearer tokens in Docker mode. If a custom client reads it, it should still make normal service-domain requests; the sidecar handles routing and run auth.
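
If a custom client does read AGENT_CLONE_URLS, it is worth parsing it defensively, since the variable is optional. This sketch assumes only the JSON-map shape described above; parseCloneUrls and baseUrlFor are hypothetical helper names.

```javascript
// Sketch: parse AGENT_CLONE_URLS, tolerating an absent or malformed value.
function parseCloneUrls(raw = process.env.AGENT_CLONE_URLS) {
  if (!raw) return {};
  try {
    const parsed = JSON.parse(raw);
    // Accept only a plain object that maps clone names to URL strings.
    if (parsed === null || typeof parsed !== 'object' || Array.isArray(parsed)) {
      return {};
    }
    return Object.fromEntries(
      Object.entries(parsed).filter(([, url]) => typeof url === 'string'),
    );
  } catch {
    return {};
  }
}

// Pick the base URL for a service, falling back to the SDK's normal default.
function baseUrlFor(service, fallback, cloneUrls = parseCloneUrls()) {
  return Object.hasOwn(cloneUrls, service) ? cloneUrls[service] : fallback;
}
```

For example, baseUrlFor('stripe', 'https://api.stripe.com') returns the mapped value when one is present and the normal service domain otherwise; in both cases the request still targets a service-shaped URL.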

TLS trust vars

These are injected automatically for clients that do not use the container trust store. They do not route service traffic; Docker networking maps real service domains to the TLS intercept sidecar.

Variable	Value
`NODE_EXTRA_CA_CERTS`	`/agent-output/ca.crt`
`SSL_CERT_FILE`	`/agent-output/ca.crt`
`REQUESTS_CA_BUNDLE`	`/agent-output/ca.crt` (for Python `requests`)
`CURL_CA_BUNDLE`	`/agent-output/ca.crt`

Node.js, Python requests, and curl all respect these vars out of the box. Other runtimes may need explicit CA configuration pointing to /agent-output/ca.crt.
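
Node.js normally needs nothing beyond NODE_EXTRA_CA_CERTS, but a client that builds its own TLS options bypasses that default. The sketch below shows one way such a client could load the mounted certificate explicitly. resolveCaPath and createTrustedAgent are hypothetical names, and the fallback order across the variables is an assumption.

```javascript
// Sketch: locate the injected CA certificate, falling back to the mounted path.
function resolveCaPath(env = process.env) {
  return (
    env.NODE_EXTRA_CA_CERTS ||
    env.SSL_CERT_FILE ||
    env.REQUESTS_CA_BUNDLE ||
    env.CURL_CA_BUNDLE ||
    '/agent-output/ca.crt'
  );
}

// Build an https.Agent that trusts the sidecar CA in addition to the
// bundled root certificates. Setting `ca` replaces the defaults, so the
// bundled roots are added back explicitly.
async function createTrustedAgent(env = process.env) {
  const { readFile } = await import('node:fs/promises');
  const https = await import('node:https');
  const { rootCertificates } = await import('node:tls');
  const ca = await readFile(resolveCaPath(env), 'utf8');
  return new https.Agent({ ca: [...rootCertificates, ca] });
}
```

The returned agent can be passed as the agent option of https.request or to any SDK that accepts a custom HTTPS agent.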

Mounted files

The directory /agent-output/ is bind-mounted into the container. It contains:

File	Contents
`ca.crt`	PEM-encoded CA certificate for the intercepting proxy. Trust this cert in any runtime that does not read the standard env vars above.

Service access

Call normal service domains with normal SDKs or REST clients. For example, GitHub harnesses can use gh, Octokit, or native fetch with signal: AbortSignal.timeout(15000); Slack harnesses can use https://slack.com/api/...; Stripe harnesses can use https://api.stripe.com/.... Archal routes supported service traffic to the scenario clones and applies the run credential.
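
As an illustration of the native fetch option, the sketch below lists issues through the public GitHub REST endpoint with the 15-second timeout mentioned above. The helper names are hypothetical, and reading the credential from GITHUB_TOKEN is an assumption about how your harness stores its normal SDK credential.

```javascript
// Sketch: build a request against the normal GitHub API domain.
// GITHUB_TOKEN is an assumed variable name for the harness's own credential.
function buildIssuesRequest(owner, repo, token = process.env.GITHUB_TOKEN) {
  const url =
    `https://api.github.com/repos/${encodeURIComponent(owner)}` +
    `/${encodeURIComponent(repo)}/issues`;
  const headers = { Accept: 'application/vnd.github+json' };
  if (token) headers.Authorization = `Bearer ${token}`;
  return { url, headers };
}

// Send the request; the timeout signal aborts a stalled call after 15 s.
async function listIssues(owner, repo) {
  const { url, headers } = buildIssuesRequest(owner, repo);
  const res = await fetch(url, { headers, signal: AbortSignal.timeout(15000) });
  if (!res.ok) throw new Error(`GitHub request failed with status ${res.status}`);
  return res.json();
}
```

Nothing in this code refers to Archal: the URL is the ordinary service domain, which is the point of the contract.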

Output contract

Stream / code	Meaning
stdout	Captured as the agent response text. Write your final answer or summary here.
stderr	Logged by Archal for debugging. Write progress, tool call results, and diagnostics here.
exit 0	Run succeeded. Archal proceeds to evaluation.
exit non-zero	Run failed. Archal marks the run as an error and skips evaluation for this run.

Your harness should write its final answer to stdout exactly once, at the end of execution. Progress output, tool call logs, and error messages should go to stderr.
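
One way to keep the streams separate is to route all diagnostics through a stderr logger and write to stdout in a single place. This is a sketch only; runHarness and log are hypothetical names, and agentLoop stands in for your own agent code.

```javascript
// Diagnostics go to stderr so they never mix into the agent response.
const log = (message) => process.stderr.write(`[harness] ${message}\n`);

// Run the agent loop, write its answer to stdout once, and return the
// exit code that the output contract expects.
async function runHarness(agentLoop, out = process.stdout) {
  try {
    log('starting agent loop');
    const answer = await agentLoop();
    out.write(`${answer}\n`); // the only stdout write in the harness
    return 0;
  } catch (error) {
    log(`agent loop failed: ${error.message}`);
    return 1;
  }
}
```

A harness entry point could then end with runHarness(myAgentLoop).then((code) => { process.exitCode = code; }), which sets the exit status without cutting off pending output.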

Metrics file (optional)

Write a JSON payload to $AGENT_METRICS_FILE before exiting to surface your agent's token usage in the run report and dashboard:

{
  "version": 1,
  "inputTokens": 12400,
  "outputTokens": 830,
  "llmCallCount": 7,
  "toolCallCount": 14,
  "toolErrorCount": 0,
  "totalTimeMs": 18240,
  "exitReason": "completed",
  "provider": "anthropic",
  "model": "claude-sonnet-4-6"
}

Archal reads the file after the container exits. If the file is absent or malformed, Archal does not infer metrics from the black-box harness; the run shows "not reported by harness" instead of token counts, and the run does not fail. Tokens used by Archal-managed evaluation and seed generation are not included in customer-visible token usage. exitReason should be one of: completed, max_steps, no_tool_calls, consecutive_errors, llm_error.
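
A harness might assemble and write this payload as sketched below. buildMetrics and writeMetrics are hypothetical helpers. Rejecting an unknown exitReason locally and skipping the write when AGENT_METRICS_FILE is unset are assumptions; the contract does not state which fields are mandatory.

```javascript
// Exit reasons listed in the metrics contract.
const EXIT_REASONS = [
  'completed', 'max_steps', 'no_tool_calls', 'consecutive_errors', 'llm_error',
];

// Combine counters tracked by the agent loop into a version 1 payload.
function buildMetrics(counters, exitReason = 'completed') {
  if (!EXIT_REASONS.includes(exitReason)) {
    throw new Error(`unknown exitReason: ${exitReason}`);
  }
  return { version: 1, ...counters, exitReason };
}

// Write the payload to the injected path; resolves to false when the
// variable is unset (for example, when running outside Archal).
async function writeMetrics(metrics, path = process.env.AGENT_METRICS_FILE) {
  if (!path) return false;
  const { writeFile } = await import('node:fs/promises');
  await writeFile(path, JSON.stringify(metrics, null, 2));
  return true;
}
```

Calling writeMetrics(buildMetrics(counters)) just before the final stdout write keeps the file in place by the time the container exits.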

Minimal harness example

#!/usr/bin/env node
// minimal-harness.mjs

const task  = process.env.AGENT_TASK;
const model = process.env.AGENT_MODEL;

// Run your normal agent loop here. Service SDKs should use their standard
// domains and env vars; the proxy handles routing and credentials.

process.stdout.write('Agent completed the task.\n');
