Overview

Archal runs in any CI environment. Set your auth token, pick an output format, and set a pass threshold. If the agent’s satisfaction score drops below the threshold, the build fails.

Secrets

The only required secret is your Archal token:
ARCHAL_TOKEN=arc_...
If your agent needs a model API key, set that too:
ARCHAL_ENGINE_API_KEY=sk-...
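
A missing secret is easier to diagnose when it is caught before the run starts. The sketch below is one way to do that in a CI step. `check_archal_env` is a hypothetical helper, not part of the Archal CLI; only the variable names come from this page.

```shell
# check_archal_env is a hypothetical helper, not part of the Archal CLI.
# It reports every named variable that is missing or empty in the environment.
check_archal_env() {
  missing=0
  for name in "$@"; do
    # printenv only sees exported variables, which is how CI secrets arrive.
    if [ -z "$(printenv "$name")" ]; then
      echo "error: $name is not set" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage in a CI step, before installing or running anything:
#   check_archal_env ARCHAL_TOKEN || exit 1
```

Add `ARCHAL_ENGINE_API_KEY` to the argument list only if your agent needs a model API key.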

GitHub Actions

name: Agent tests
on: [push]

jobs:
  archal:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: 20

      - name: Install Archal
        run: npm install -g archal

      - name: Run scenarios
        env:
          ARCHAL_TOKEN: ${{ secrets.ARCHAL_TOKEN }}
          ARCHAL_ENGINE_API_KEY: ${{ secrets.ENGINE_API_KEY }}
        run: |
          archal run scenarios/close-stale-issues.md \
            --runs 3 \
            --pass-threshold 80 \
            -o json \
            -q

GitLab CI

archal:
  image: node:20
  script:
    - npm install -g archal
    - >
      archal run scenarios/close-stale-issues.md
      --runs 3
      --pass-threshold 80
      -o json
      -q
  variables:
    ARCHAL_TOKEN: $ARCHAL_TOKEN
    ARCHAL_ENGINE_API_KEY: $ENGINE_API_KEY

Useful flags

| Flag | What it does |
| --- | --- |
| --pass-threshold <score> | Exit 1 if satisfaction is below this (0-100) |
| -o json | Machine-readable JSON output |
| -q | Suppress non-error output |
| -n, --runs <count> | Run the scenario multiple times for a real satisfaction score |
| --tag <tag> | Only run scenarios with a matching tag (exits 0 if no match) |
| --preflight-only | Validate config and exit without running |
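
As a rough illustration of combining these flags, the sketch below validates the config first and then runs only scenarios with a given tag. `run_gated` is a hypothetical wrapper. It assumes `--preflight-only` is accepted by `archal run` alongside a scenario path, which this page implies but does not state.

```shell
# run_gated is a hypothetical wrapper around the flags in the table above.
# Assumption: --preflight-only can be passed to `archal run` with a scenario path.
run_gated() {
  scenario=$1
  tag=$2
  # Validate config first and stop early, passing the failing exit code through.
  archal run "$scenario" --preflight-only || return $?
  # Then do the real, tag-filtered run with a pass threshold.
  archal run "$scenario" --tag "$tag" --runs 3 --pass-threshold 80 -q
}

# Usage ("smoke" is a placeholder tag):
#   run_gated scenarios/close-stale-issues.md smoke
```

Because a non-matching tag exits 0, a job that uses `--tag` passes even when nothing ran.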

Exit codes

| Code | Meaning |
| --- | --- |
| 0 | Score met the threshold (or scenario skipped by --tag) |
| 1 | Score below threshold or runtime error |
| 2 | Validation error (bad flags, missing scenario, invalid config) |
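
If you want CI logs to say why a step failed, the exit codes above can be mapped to messages. This is a minimal sketch. `explain_exit` is a hypothetical helper, and the wording of each message is taken from the table rather than from the CLI.

```shell
# explain_exit is a hypothetical helper that maps the documented exit codes
# to a one-line log message. It does not call the Archal CLI itself.
explain_exit() {
  case "$1" in
    0) echo "pass: score met the threshold, or the scenario was skipped by --tag" ;;
    1) echo "fail: score below threshold, or a runtime error" ;;
    2) echo "invalid: bad flags, missing scenario, or invalid config" ;;
    *) echo "unexpected exit code: $1" ;;
  esac
}

# Usage in a CI step:
#   status=0
#   archal run scenarios/close-stale-issues.md --pass-threshold 80 -q || status=$?
#   explain_exit "$status"
#   exit "$status"
```

Exiting with the original status keeps the build result unchanged; the helper only adds a log line.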

Running multiple scenarios

If you have multiple scenario files, run them in a loop or use a suite in .archal.json:
{
  "agent": "npx tsx src/agent.ts",
  "twins": ["github"],
  "suites": {
    "ci": {
      "scenarios": [
        "scenarios/close-stale-issues.md",
        "scenarios/triage-new-issues.md"
      ],
      "runs": 3,
      "timeout": 120
    }
  }
}
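
The loop alternative could look something like the sketch below. `run_all` is a hypothetical helper. It runs every scenario even after a failure and returns the highest exit code it saw, so one bad scenario still fails the build.

```shell
# run_all is a hypothetical helper: it runs each scenario file in turn
# and remembers the worst (highest) exit code.
run_all() {
  worst=0
  for scenario in "$@"; do
    status=0
    archal run "$scenario" --runs 3 --pass-threshold 80 -q || status=$?
    if [ "$status" -ne 0 ]; then
      echo "FAILED ($status): $scenario" >&2
    fi
    # Keep the highest code so a validation error (2) is not masked by a later pass (0).
    if [ "$status" -gt "$worst" ]; then
      worst=$status
    fi
  done
  return "$worst"
}

# Usage:
#   run_all scenarios/*.md
```

Continuing past the first failure costs CI time but reports every failing scenario in a single pipeline run.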

Tips

  • Start with --pass-threshold 60 and tighten as your agent improves.
  • Use --runs 3 or higher. A single run can be noisy; multiple runs give you a more reliable satisfaction score.
  • Use -o json when you want to parse the output or store it as an artifact.
  • Set --timeout to a value that fits your CI time budget. The default is 180 seconds per run.
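
Several of these tips can be combined in one step. The sketch below saves the JSON report to a file for upload as an artifact while preserving the exit code. `save_report` is a hypothetical helper, and it rests on two assumptions this page does not confirm: that `-o json` writes the report to stdout, and that `--timeout` takes a value in seconds.

```shell
# save_report is a hypothetical helper.
# Assumptions: `-o json` prints the report on stdout, and --timeout is in seconds.
save_report() {
  scenario=$1
  out=$2
  status=0
  archal run "$scenario" --runs 3 --pass-threshold 60 --timeout 120 -o json -q > "$out" || status=$?
  # Return the original exit code so the CI step still fails on a low score.
  return "$status"
}

# Usage:
#   save_report scenarios/close-stale-issues.md archal-report.json
```

The report file is written even when the run fails, which is usually when it is most useful.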