# Archal -- QA for AI Agents

Archal tests AI agents before they touch production. Write a scenario in markdown; Archal spins up digital twins of real services, runs your agent against them, and scores how well it did.

## Install and authenticate

```
npx archal init   # recommended: sets up harness, .archal.json, skills, devDependency
archal login
```

`npx archal init` is the canonical onboarding path. `npm install -g archal` still works if you only want the CLI binary. In CI or over SSH, set `ARCHAL_TOKEN=arc_...` instead of running `archal login`.

## Quick start

```
archal run --task "Create an issue titled hello world" --harness ./.archal/harness.ts --twin github
```

## .archal.json format

Create this in your project root:

```json
{
  "agent": {
    "command": "npx",
    "args": ["tsx", "./.archal/harness.ts"]
  },
  "twins": ["github"],
  "runs": 1,
  "timeout": 180
}
```

Fields:

- title (optional): Display name for this project.
- agent (required unless you pass `--harness` or Archal can discover a repo-local harness): Shell command to run your agent. String or `{ command, args, env }` object.
- twins (optional): Which twins to start. Inferred from scenario if omitted.
- scenarios (optional): Array of scenario file paths relative to .archal.json.
- seeds (optional): Per-twin seed names. Example: `{ "github": "small-project" }`.
- agentModel (optional): LLM model for the agent (e.g. "claude-sonnet-4-6").
- evaluatorModel (optional): Evaluator/judge model for [P] criteria (e.g. "gemini-2.5-flash"). The bare "model" key is ignored at the .archal.json level; only a scenario's `## Config` block accepts "model" as an alias.
- runs (optional): Default run count. Default 1.
- timeout (optional): Default timeout per run in seconds. Default 180.

## Scenario format

Scenarios are markdown files:

```markdown
# Close Stale Issues

## Setup

A GitHub repository with 10 open issues. 4 have no activity in 90 days.

## Prompt

Close all issues with no activity in the last 90 days. Add a comment explaining why.

## Success Criteria

- [D] Exactly 4 issues are closed
- [D] All closed issues have a new comment
- [P] Each closing comment explains the reason for closure

## Config

twins: github
timeout: 90
```

Sections:

- `# Title` (required)
- `## Setup` (optional): Starting state in plain English
- `## Prompt` or `## Task` (required): The agent's instruction
- `## Expected Behavior` (optional): Evaluator-only answer key, never shown to agent
- `## Success Criteria` or `## Checks` (required): [D] = deterministic, [P] = probabilistic
- `## Config` (optional): twins, timeout, runs, seed, tags, evaluator-model

## Available twins (10)

- github: Repos, issues, PRs, branches, files, commits, actions, releases
- slack: Channels, messages, threads, reactions, users, files, admin
- stripe: Customers, products, prices, payments, invoices, subscriptions
- jira: Issues, projects, boards, sprints, fields, workflows, service desk
- linear: Issues, projects, teams, cycles, initiatives, roadmaps
- supabase: SQL, tables, migrations, extensions, edge functions
- google-workspace: Gmail, Calendar, Drive, Sheets, Contacts
- discord: Guilds, channels, messages, reactions, threads, webhooks, commands
- ramp: Cards, funds, transactions, reimbursements, bills (MCP-only, no REST fidelity surface)
- telegram: Bot API (chats, messages, updates, inline queries)

## How agents connect to twins

Two modes:

1. Default: Archal resolves a runnable headless harness, then sets the `ARCHAL_<TWIN>_REST_URL` (ends in /api) and `ARCHAL_<TWIN>_MCP_URL` (ends in /mcp) env vars (plus `_BASE_URL`/`_URL` aliases for back-compat). The harness uses those directly.
2. Proxy (`--proxy` flag): Optional route mode for existing agents that still call real service domains over raw HTTPS. Archal starts a TLS proxy and redirects that traffic to twins without rewriting the agent code.

`--task` only replaces the scenario file. It still needs a runnable agent path from `--harness`, repo-local harness discovery, or `.archal.json`.
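To make the default mode concrete, here is a minimal sketch of how a harness might collect what Archal hands it through the environment. `readHarnessContext` and its types are hypothetical names, not part of Archal. The sketch also assumes that the twin segment of each variable name is the twin name upper-cased with hyphens turned into underscores; that naming rule is a guess, not something this document states.

```ts
// Hypothetical helper, not part of Archal's API.
type Env = Record<string, string | undefined>;

interface TwinEndpoints {
  name: string;
  restUrl?: string; // expected to end in /api
  mcpUrl?: string;  // expected to end in /mcp
}

interface HarnessContext {
  task: string;
  preflight: boolean;
  twins: TwinEndpoints[];
}

// Assumption: "google-workspace" maps to the segment GOOGLE_WORKSPACE.
function envSegment(twin: string): string {
  return twin.toUpperCase().replace(/-/g, "_");
}

function readHarnessContext(env: Env): HarnessContext {
  // ARCHAL_TWIN_NAMES is a comma-separated list of twin names.
  const names = (env["ARCHAL_TWIN_NAMES"] ?? "")
    .split(",")
    .map((n) => n.trim())
    .filter((n) => n.length > 0);

  return {
    task: env["ARCHAL_ENGINE_TASK"] ?? "",
    // ARCHAL_PREFLIGHT=1 marks the boot check, where the harness should exit early.
    preflight: env["ARCHAL_PREFLIGHT"] === "1",
    twins: names.map((name) => ({
      name,
      restUrl: env[`ARCHAL_${envSegment(name)}_REST_URL`],
      mcpUrl: env[`ARCHAL_${envSegment(name)}_MCP_URL`],
    })),
  };
}
```

In a real harness you would call `readHarnessContext(process.env)`, return immediately when `preflight` is true, and otherwise send `task` to your agent with the resolved URLs. Taking the environment as a parameter keeps the helper easy to test.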
## Environment variables set for your agent

When `archal run` spawns your agent, these are available:

- ARCHAL_ENGINE_TASK: The scenario `## Prompt` text (Setup and Expected Behavior are excluded)
- ARCHAL_TWIN_NAMES: Comma-separated twin names
- `ARCHAL_<TWIN>_REST_URL`: REST endpoint per twin, ends in /api
- `ARCHAL_<TWIN>_MCP_URL`: MCP endpoint per twin, ends in /mcp
  (Backward-compat aliases `ARCHAL_<TWIN>_BASE_URL` and `ARCHAL_<TWIN>_URL` are also set; prefer the explicit `_REST_URL` / `_MCP_URL` pair.)
- ARCHAL_MCP_CONFIG: Path to MCP server config JSON
- ARCHAL_TOKEN: Auth token for twin API calls
- ARCHAL_PREFLIGHT: Set to 1 during boot check (exit early)
- HTTPS_PROXY: TLS proxy URL (when `--proxy` is used)
- NODE_EXTRA_CA_CERTS: Path to proxy CA cert (when `--proxy` is used)

## Vitest integration

```
pnpm add -D vitest archal
```

```ts
import { archalVitestProject } from 'archal/vitest';

export default [
  archalVitestProject(
    {
      name: 'hosted-twins',
      services: {
        github: { mode: 'route', seed: 'small-project' },
      },
    },
    {
      include: ['__tests__/hosted.test.ts'],
    },
  ),
];
```

## Common commands

- `archal run [scenario]`: Run a scenario (uses `--harness`, repo-local harness discovery, or `.archal.json`)
- `archal run --task "..." --harness ./.archal/harness.ts --twin github`
- `archal run scenario.md --runs 5`: Run 5 times for satisfaction score
- `archal run scenario.md --pass-threshold 80`: Fail if score below 80
- `archal run scenario.md -o json -q`: JSON output, quiet mode
- `archal run scenario.md --proxy`: Route agent HTTP traffic through TLS proxy to twins

## Local run artifacts

Every run also writes:

- .archal/cache/last-run.json
- .archal/cache/runs/*.json

Use `--output json` only when you need machine-readable stdout. It is not required to save traces locally.
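As a rough illustration of consuming these artifacts, the sketch below takes the text of a run file and reports its top-level keys. The schema of `last-run.json` is not described here, so the helper (`describeRunArtifact`, a hypothetical name) assumes only that the file holds a JSON object.

```ts
// Hypothetical helper, not part of Archal: inspects a run artifact
// without assuming anything about its schema.
function describeRunArtifact(jsonText: string): string[] {
  const parsed: unknown = JSON.parse(jsonText);
  if (parsed === null || typeof parsed !== "object" || Array.isArray(parsed)) {
    throw new Error("expected the run artifact to be a JSON object");
  }
  // Sort so the result is stable regardless of key order in the file.
  return Object.keys(parsed).sort();
}
```

In a Node script the input would typically come from `readFileSync(".archal/cache/last-run.json", "utf8")`. Listing the keys first is a cautious way to see what a run recorded before writing code that depends on specific fields.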
Twin session, scenario, and login commands:

- `archal twin start github slack`: Start persistent twins
- `archal twin status`: Show active session
- `archal twin seed github enterprise-repo`: Load a named seed
- `archal twin reset`: Reset all twins to clean state
- `archal twin stop`: Tear down session
- `archal scenario list`: List available scenarios
- `archal login`: Authenticate via browser
- `archal login --token`: Authenticate with API token

## Authentication

- Option 1 (CI/headless): `export ARCHAL_TOKEN=arc_...`
- Option 2 (interactive): `archal login`

Get your token at archal.ai/dashboard.

## Documentation

- Full docs: https://docs.archal.ai
- Quickstart: https://docs.archal.ai/quickstart
- Test your agent: https://docs.archal.ai/guides/run-with-agent
- Writing scenarios: https://docs.archal.ai/guides/writing-scenarios
- Twin sessions: https://docs.archal.ai/guides/twin-sessions
- Vitest integration: https://docs.archal.ai/guides/vitest
- CI integration: https://docs.archal.ai/guides/ci-integration
- Twins overview: https://docs.archal.ai/twins/overview
- CLI reference: https://docs.archal.ai/cli/run