A packaged agent is a Dockerfile plus a drive
script that reads the task, runs the agent, and prints its answer to stdout.
Archal ships a few, and you can package your own. You run one with a single flag, --agent.
The agent itself is unmodified: it calls api.github.com
and api.openai.com exactly as it would in production.
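The shape of a drive script can be sketched as follows. This is a minimal illustration, not Archal's actual contract: `TASK_TEXT` is a hypothetical variable name invented for the sketch (the real way the task is delivered is defined by the Docker harness contract), `run_agent` is a stand-in for a real agent, and `default-model` is a made-up fallback. Only `AGENT_MODEL` and the "answer goes to stdout" rule come from this page.

```python
import os
import sys


def run_agent(task: str, model: str) -> str:
    """Stand-in for a real agent, which would call its service and model APIs here."""
    return f"[{model}] handled task: {task}"


def drive(env: dict) -> str:
    # TASK_TEXT is a hypothetical name used only in this sketch.
    task = env.get("TASK_TEXT", "")
    # AGENT_MODEL is set by --agent-model; a deterministic agent can ignore it.
    model = env.get("AGENT_MODEL", "default-model")
    return run_agent(task, model)


if __name__ == "__main__":
    # Diagnostics go to stderr so that stdout carries only the answer.
    print("drive: starting", file=sys.stderr)
    print(drive(dict(os.environ)))
```

Keeping the logic in `drive` and the I/O in the `__main__` block makes the script easy to exercise outside a container.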
The mental model
A packaged agent runs in its own container. Alongside it, Archal runs a sidecar that owns all of the container’s networking and transparently intercepts outbound HTTPS through DNS and TLS man-in-the-middle (it writes its CA into the agent container, and the harness trusts it). Two kinds of domains are handled differently:
- Service domains (api.github.com, api.stripe.com, …) are routed to the matching clone, so the agent’s real API calls hit a fake, evidence-scoped service instead of production.
- Model-provider domains (api.openai.com, api.anthropic.com, generativelanguage.googleapis.com) are intercepted too: the agent container only ever receives a placeholder provider key, and the sidecar swaps in your real host key (for whichever provider keys you set) before forwarding to the real provider. The agent runs against a real model without ever seeing your key.
(Not every agent needs a model: github-octokit
is a deterministic Octokit script, while openclaw and hermes drive a model.)
The result: a real, unmodified agent drives a real model against fake
services, and every clone tool call is captured in the trace and scored.
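The two-way split described in this section can be modelled in a few lines. This is a conceptual sketch only, not the sidecar's implementation: the function names, the `Decision` type and the placeholder value are all hypothetical, and the pairing of each provider domain with an environment variable is inferred from the key names listed under Choosing the model. The page does not say how other domains are treated, so the sketch simply labels them "other".

```python
from dataclasses import dataclass

# Domains named in the text; clone names match the Bundled agents table.
SERVICE_CLONES = {"api.github.com": "github", "api.stripe.com": "stripe"}
MODEL_PROVIDERS = {
    "api.openai.com": "OPENAI_API_KEY",
    "api.anthropic.com": "ANTHROPIC_API_KEY",
    "generativelanguage.googleapis.com": "GEMINI_API_KEY",
}
PLACEHOLDER_KEY = "placeholder-key"  # hypothetical value, for illustration only


@dataclass
class Decision:
    action: str  # "clone", "provider", or "other"
    target: str  # clone name, or the upstream host


def route(host: str) -> Decision:
    """Decide where an intercepted request for `host` should go."""
    if host in SERVICE_CLONES:
        return Decision("clone", SERVICE_CLONES[host])
    if host in MODEL_PROVIDERS:
        return Decision("provider", host)
    return Decision("other", host)  # behaviour not described on this page


def swap_key(host: str, headers: dict, host_env: dict) -> dict:
    """Replace the placeholder key in every header value with the real host key."""
    real_key = host_env[MODEL_PROVIDERS[host]]
    return {name: value.replace(PLACEHOLDER_KEY, real_key)
            for name, value in headers.items()}
```

For example, `route("api.github.com")` yields a clone decision for `github`, while a request to `api.openai.com` is forwarded upstream after `swap_key` has rewritten its credentials.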
Bundled agents
The CLI ships with these agents:

| Agent | What it is | Clone |
|---|---|---|
| openclaw | The real OpenClaw gateway agent (successor to the legacy --sandbox path) | github |
| hermes | A full third-party Stripe support agent | stripe |
| github-octokit | A thin single-file Octokit GitHub agent | github |
Each agent lives in examples/agents/<name> and is bundled into the published
archal package, so --agent <name> resolves whether you run from a source
checkout or an installed CLI.
--agent <name|dir>
--agent accepts either a bundled name or a path to your own packaged-agent
directory (one holding a Dockerfile and a drive script).
--agent ./dir is equivalent to
--harness ./dir --dockerfile ./dir/Dockerfile. Name the agent once: --agent
is mutually exclusive with --sandbox, --harness, and --dockerfile.
You can also declare the agent in .archal.json so a bare archal run scenario.md
picks it up. See Run scenarios against your agent for the
config shape and the Docker harness contract for
the image/stdout/env contract every packaged agent follows.
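The equivalence and the mutual-exclusivity rule for --agent can be written down as a small function. This is a sketch under stated assumptions, not the CLI's code: `resolve_agent` is a hypothetical name, the test for "is this a directory or a bundled name" (leading dot or a slash) is a guess, and the bundled location is taken from the source-checkout layout mentioned above.

```python
EXCLUSIVE_WITH_AGENT = ("sandbox", "harness", "dockerfile")


def resolve_agent(flags: dict) -> dict:
    """Expand --agent into the equivalent --harness / --dockerfile pair."""
    agent = flags.get("agent")
    if agent is None:
        return dict(flags)  # nothing to expand

    # --agent may not be combined with the flags it replaces.
    clashes = [name for name in EXCLUSIVE_WITH_AGENT if flags.get(name)]
    if clashes:
        raise ValueError(f"--agent cannot be combined with: {', '.join(clashes)}")

    # Assumption: values that start with "." or contain "/" are directories;
    # anything else is treated as a bundled agent name.
    if agent.startswith(".") or "/" in agent:
        directory = agent.rstrip("/")
    else:
        directory = f"examples/agents/{agent}"

    return {"harness": directory, "dockerfile": f"{directory}/Dockerfile"}
```

With this sketch, `resolve_agent({"agent": "./dir"})` returns the same pair as passing `--harness ./dir --dockerfile ./dir/Dockerfile` by hand.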
Choosing the model
--agent-model sets AGENT_MODEL in the container. A model-driven agent reads it
to pick its model (OpenClaw does); an agent that selects its own model, or a
deterministic one, ignores it. Use a provider-prefixed id and set the key for
that provider on the host (OPENAI_API_KEY, ANTHROPIC_API_KEY, or
GEMINI_API_KEY).
OpenClaw maps the provider to its native API (openai-responses, anthropic-messages,
google-generative-ai) so non-OpenAI models reach their native paths
(/v1/messages, :generateContent) instead of OpenAI-compatible
/chat/completions; a packaged agent you write calls whatever endpoint its own
SDK uses. Either way the sidecar intercepts the domain and swaps the placeholder
key for your real one. (-m, --model is a separate flag: it sets the
evaluator model, not the agent’s.)
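The relationship between a provider-prefixed model id, the native API and the host key can be sketched as a lookup. The API names and key variables come from this section; everything else is an assumption made for illustration: the `provider/model` separator, the prefix spellings (`openai`, `anthropic`, `google`) and the function name `plan_model` are hypothetical, so check the real id format before relying on it.

```python
# provider prefix (assumed spelling) -> (native API named above, host key variable)
PROVIDERS = {
    "openai": ("openai-responses", "OPENAI_API_KEY"),
    "anthropic": ("anthropic-messages", "ANTHROPIC_API_KEY"),
    "google": ("google-generative-ai", "GEMINI_API_KEY"),
}


def plan_model(agent_model: str) -> dict:
    """Split a provider-prefixed id and look up its API and key variable."""
    # Assumption: the id looks like "<provider>/<model>".
    provider, separator, model = agent_model.partition("/")
    if not separator or not model or provider not in PROVIDERS:
        raise ValueError(f"not a recognised provider-prefixed id: {agent_model!r}")
    api, key_var = PROVIDERS[provider]
    return {
        "AGENT_MODEL": agent_model,  # what --agent-model places in the container
        "api": api,
        "key_var": key_var,          # must be set on the host, not in the container
    }
```

For a hypothetical id such as `anthropic/some-model`, the result names `anthropic-messages` as the API and `ANTHROPIC_API_KEY` as the variable to export on the host.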
OpenClaw home and eval mode
OpenClaw keeps a couple of dedicated options (other packaged agents do not need them):
- --openclaw-home <dir>: mount your own OpenClaw home (auth profiles, extensions, persona) read-only into the agent. If the path does not exist, Archal warns and falls back to the bundled persona rather than silently ignoring the flag.
- --openclaw-eval-mode <isolated|stateful>: isolated drops the business-tool plugins for a clean eval; both modes surface the mode to the drive script.
Common failures
- Docker is not available: packaged agents run in Docker so the sidecar can control DNS and TLS. Start Docker and retry.
- The image fails to build: the error includes the Docker build log; the agent Dockerfile is at <agent-dir>/Dockerfile.
- Missing provider key: set the env var matching your --agent-model provider (OPENAI_API_KEY / ANTHROPIC_API_KEY / GEMINI_API_KEY).
- No interception observed: the trace shows zero clone tool calls; confirm the agent actually called the service domain and that the scenario targets the agent’s declared clone.
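Two of these failures can be caught before a run with a small local check. The following is a hypothetical helper, not part of Archal: `preflight` and its messages are invented for this sketch. It only confirms that a `docker` executable is on the PATH, which does not prove the Docker daemon is running.

```python
import shutil


def preflight(required_key_var: str, env: dict, which=shutil.which) -> list:
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    # Finding the binary does not guarantee the daemon is up; it is only a first check.
    if which("docker") is None:
        problems.append("docker executable not found on PATH")
    # The key must be set on the host; the container only sees a placeholder.
    if not env.get(required_key_var):
        problems.append(f"{required_key_var} is not set on the host")
    return problems
```

Calling `preflight("OPENAI_API_KEY", dict(os.environ))` after `import os` checks the current shell; the `which` parameter exists so the lookup can be replaced in tests.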
Go deeper
- Run scenarios against your agent: wire your own agent + .archal.json
- Docker harness contract: the image/env/stdout contract
- Sandbox mode: the OpenClaw-specific options in full
