Using a coding agent?
If you use Claude Code, Cursor, or a similar coding agent, install the Archal skills and let your agent handle the rest:
Install
Log in
Run your first test
The fastest way to see Archal in action is an inline task. This starts a GitHub twin, runs the task against it, and scores the result:
Test your own agent
To test your own agent, create a .archal.json file in your project root. The agent field tells Archal how to run your code, and the twins field tells it which services to spin up:
Write a scenario
Inline tasks are good for quick smoke tests, but anything you want to run repeatedly belongs in a scenario file. Scenarios are markdown files that describe the starting state, the task, and what success looks like:
Criteria tagged [D] are checked deterministically against the twin’s final state. Criteria tagged [P] are assessed by an LLM that reviews the trace and state.
Run the scenario once to see if it works, then run it multiple times for a statistical satisfaction score:
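To make the idea of a score over repeated runs concrete, here is a minimal Python sketch. It is not Archal's scoring code, and the exact formula Archal uses is not stated here. It assumes each run yields a pass/fail result per criterion, and that the score is the average fraction of criteria met across runs. The names CriterionResult, run_satisfaction, and satisfaction_score are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CriterionResult:
    """One success criterion's outcome in a single run (hypothetical shape)."""
    tag: str      # "D" for deterministic, "P" for LLM-assessed
    passed: bool


def run_satisfaction(results: list[CriterionResult]) -> float:
    """Fraction of criteria that passed in one run (0.0 if there are none)."""
    if not results:
        return 0.0
    return sum(r.passed for r in results) / len(results)


def satisfaction_score(runs: list[list[CriterionResult]]) -> float:
    """Assumed aggregate: mean per-run fraction, expressed as a percentage."""
    if not runs:
        return 0.0
    return 100.0 * sum(run_satisfaction(r) for r in runs) / len(runs)


# Two runs of the same scenario: the [P] criterion fails in the second run.
example_runs = [
    [CriterionResult("D", True), CriterionResult("P", True)],
    [CriterionResult("D", True), CriterionResult("P", False)],
]
example_score = satisfaction_score(example_runs)
```

Under these assumptions, a single run can only tell you whether the agent succeeded that time, while several runs show how often it succeeds, which matters when the agent or the [P] evaluation is non-deterministic.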
What to read next
- Test your agent covers the full .archal.json configuration, how the proxy works, and the environment variables your agent receives
- Writing scenarios explains the scenario format in detail, including how to write good success criteria and how evaluation works
- Twin sessions is for when you want persistent twins you can interact with manually during development
- Twins overview lists every available twin and what it covers