Using a coding agent?

If you use Claude Code, Cursor, or a similar coding agent, install the Archal skills and let your agent handle the rest:
npx @archal/skills
The onboard skill walks you through setup, detects which services your project uses, and runs a first test. You can skip the rest of this page.

Install

npm install -g archal
Requires Node.js 20 or later.

Log in

archal login
This opens a browser window where you approve the CLI. Once approved, your credentials are saved locally and you won’t need to log in again on this machine. If you’re in a CI environment or working over SSH, you can grab an API token from the dashboard and set it as an environment variable instead:
export ARCHAL_TOKEN=arc_...

Run your first test

The fastest way to see Archal in action is an inline task. This starts a GitHub twin, runs the task against it, and scores the result:
archal run --task "Create an issue titled 'hello world'" --twin github
You should see Archal provision a twin session, execute the task, and print a satisfaction score. A cold start takes about thirty seconds; later runs take a few seconds.

Test your own agent

To test your own agent, create a .archal.json file in your project root. The agent field tells Archal how to run your code, and the twins field tells it which services to spin up:
{
  "agent": "npx tsx src/my-agent.ts",
  "twins": ["github"]
}
Then run a task against it:
archal run --task "Close all issues older than 90 days"
Archal reads your config, starts the twins, spawns your agent as a child process, and passes it the task text along with the twin API endpoints as environment variables. Your agent makes its API calls against the twins instead of production, and when it exits, Archal evaluates what happened.
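As a rough illustration of the agent side of this handoff, the sketch below reads the task text and a twin endpoint from an environment map and fails fast if either is missing. The variable names ARCHAL_TASK and ARCHAL_GITHUB_URL are placeholders invented for this example, not names confirmed by this page; check the Test your agent page for the variables your agent actually receives.

```typescript
// Hypothetical variable names, used only for illustration.
// The real names are listed in the "Test your agent" documentation.
const TASK_VAR = "ARCHAL_TASK";
const GITHUB_URL_VAR = "ARCHAL_GITHUB_URL";

type Env = Record<string, string | undefined>;

interface AgentInputs {
  task: string;          // the task text passed by the runner
  githubBaseUrl: string; // base URL of the GitHub twin, without a trailing slash
}

// Read the inputs from an environment map and fail fast when one is missing,
// so a misconfigured run stops before any API call is attempted.
function readAgentInputs(env: Env): AgentInputs {
  const task = env[TASK_VAR];
  const githubBaseUrl = env[GITHUB_URL_VAR];
  if (!task) {
    throw new Error(`Missing environment variable ${TASK_VAR}`);
  }
  if (!githubBaseUrl) {
    throw new Error(`Missing environment variable ${GITHUB_URL_VAR}`);
  }
  // Strip trailing slashes so request paths can be appended safely.
  return { task, githubBaseUrl: githubBaseUrl.replace(/\/+$/, "") };
}

// In a real agent you would call: readAgentInputs(process.env)
```

Taking the environment as a parameter instead of reading process.env directly keeps the function easy to unit test, and the agent only ever talks to the base URL it was handed, so the same code can point at a twin or at production.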

Write a scenario

Inline tasks are good for quick smoke tests, but for anything you want to run repeatedly you should write a scenario file. Scenarios are Markdown files that describe the starting state, the task, and what success looks like:
# Close Stale Issues

## Setup
A GitHub repository with 10 open issues. 4 of them have no activity in 90 days.

## Prompt
Close all issues with no activity in the last 90 days. Add a comment explaining why.

## Success Criteria
- [D] Exactly 4 issues are closed
- [D] All closed issues have a new comment
- [P] Each closing comment explains the reason for closure

## Config
twins: github
timeout: 90
Criteria tagged [D] are checked deterministically against the twin’s final state. Criteria tagged [P] are assessed by an LLM that reviews the trace and state. Run the scenario once to see if it works, then run it multiple times for a statistical satisfaction score:
archal run scenarios/close-stale-issues.md
archal run scenarios/close-stale-issues.md --runs 5
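To make the idea of a deterministic check concrete, here is a minimal sketch of how the two [D] criteria from the scenario above could be evaluated against a before and after snapshot of the repository. The Snapshot and IssueState shapes are assumptions made for this example; the twin's real state format and Archal's actual evaluator are not described on this page and may differ.

```typescript
// Hypothetical snapshot shape, invented for this sketch.
interface IssueState {
  number: number;
  state: "open" | "closed";
  commentCount: number;
}

interface Snapshot {
  issues: IssueState[];
}

// Issues that were open before the run and are closed after it.
function newlyClosed(before: Snapshot, after: Snapshot): IssueState[] {
  const openBefore = new Set(
    before.issues.filter((i) => i.state === "open").map((i) => i.number),
  );
  return after.issues.filter(
    (i) => i.state === "closed" && openBefore.has(i.number),
  );
}

// "[D] Exactly 4 issues are closed"
function exactlyClosed(before: Snapshot, after: Snapshot, expected: number): boolean {
  return newlyClosed(before, after).length === expected;
}

// "[D] All closed issues have a new comment"
function closedHaveNewComment(before: Snapshot, after: Snapshot): boolean {
  const countsBefore = new Map<number, number>();
  for (const issue of before.issues) {
    countsBefore.set(issue.number, issue.commentCount);
  }
  return newlyClosed(before, after).every(
    (i) => i.commentCount > (countsBefore.get(i.number) ?? 0),
  );
}
```

Both checks are pure functions of the recorded state, so they return the same answer every time for the same run. The [P] criterion has no equivalent here because judging whether a comment explains the closure needs a reader, which is the role the LLM plays.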

Next steps

  • “Test your agent” covers the full .archal.json configuration, how the proxy works, and the environment variables your agent receives
  • “Writing scenarios” explains the scenario format in detail, including how to write good success criteria and how evaluation works
  • “Twin sessions” is for when you want persistent twins you can interact with manually during development
  • “Twins overview” lists every available twin and what it covers