| Surface | Location | Count | Run command |
|---|---|---|---|
| OpenClaw benchmark collections | scenarios/ | 32 | archal openclaw run scenarios/<group>/<scenario>.md |
| Bundled CLI library | cli/scenarios/ | 59 | archal run <group>/<scenario>.md |
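
As a rough illustration of how the two surfaces map to different commands, the sketch below builds the argument list for a given scenario path. The helper name `run_command_for` is hypothetical and not part of archal; it assumes the group and scenario file are the last two path components, as in the table's run-command templates.

```python
from pathlib import PurePosixPath


def run_command_for(path: str) -> list[str]:
    """Return the argv for a scenario path, following the table's templates.

    Hypothetical helper for illustration only; not an archal API.
    """
    parts = PurePosixPath(path).parts
    if parts[:2] == ("cli", "scenarios") and len(parts) == 4:
        # Bundled CLI library: pass <group>/<scenario>.md without the cli/scenarios/ prefix.
        return ["archal", "run", "/".join(parts[2:])]
    if parts[:1] == ("scenarios",) and len(parts) == 3:
        # OpenClaw benchmark collections: pass the path including scenarios/.
        return ["archal", "openclaw", "run", "/".join(parts)]
    raise ValueError(f"not a recognized scenario path: {path}")
```

Returning an argument list rather than a single string keeps the result safe to hand to a process runner without shell quoting.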
scenarios/discord/thread-escalation.md exercises the REST-first Discord twin in scenario runs, alongside the new archal/vitest route-mode support.
The web scenario catalog now exposes a canonical risk taxonomy derived from scenario tags so clients can group cases by failure mode instead of scraping filenames. The categories are:
| Risk category | Meaning |
|---|---|
| identity-and-access | Wrong actor, wrong account, or stale authorization |
| data-exposure | Sensitive data crossing an unsafe boundary |
| financial-controls | Refunds, payments, billing, and approval scope |
| change-management | Risk hidden in releases, diffs, or migrations |
| governance-and-approval | Policy precedence, escalation, and truthful approval checks |
| cross-system-reasoning | Safe action depends on correlating evidence across systems or time |
| secrets-and-supply-chain | Credentials, dependency trust, and hidden payloads |
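
A client that wants to group cases by failure mode might do something like the following. This is a minimal sketch, not the catalog's actual schema: it assumes each catalog entry is a dict with an `id` and a hypothetical `risk_categories` list holding the slugs from the table above. The real field names and the tag-to-category derivation are not specified here.

```python
from collections import defaultdict

# Category slugs copied from the taxonomy table, in table order.
RISK_CATEGORIES = (
    "identity-and-access",
    "data-exposure",
    "financial-controls",
    "change-management",
    "governance-and-approval",
    "cross-system-reasoning",
    "secrets-and-supply-chain",
)


def group_by_risk(entries: list[dict]) -> dict[str, list[str]]:
    """Group scenario ids by risk category, ignoring unknown slugs.

    `risk_categories` is an assumed field name, used for illustration only.
    """
    groups: dict[str, list[str]] = defaultdict(list)
    for entry in entries:
        for category in entry.get("risk_categories", []):
            if category in RISK_CATEGORIES:
                groups[category].append(entry["id"])
    # Emit categories in taxonomy order and skip the empty ones.
    return {c: groups[c] for c in RISK_CATEGORIES if c in groups}
```

A scenario may legitimately fall under more than one category, so the sketch lists it under each one rather than picking a single bucket.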
OpenClaw benchmark collections
These are the benchmark-oriented scenario sets under scenarios/. They are the right reference point for the hosted OpenClaw and security-benchmark docs.
Security suite (15)
Social-engineering and policy-verification scenarios across GitHub, Jira, Slack, Stripe, and Linear.

Adversarial (15)
Newer adversarial scenarios focused on same-name confusion, revoked credentials, Google Workspace and Ramp workflows, and hidden policy violations.

OpenClaw scenarios (2)
Hosted OpenClaw scenarios centered on safe-subset behavior and privacy-queue handling.

Bundled CLI library
These scenarios ship in cli/scenarios/ and are the default library for archal run.
Use archal scenario list --json to enumerate the bundled library from the CLI. That command covers cli/scenarios/; it does not list the separate OpenClaw benchmark collections under scenarios/.
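
For scripting, the listing command can be wrapped along these lines. This is a sketch under stated assumptions: the output schema of archal scenario list --json is not documented in this section, so the code assumes only that the command prints a top-level JSON array and treats each element as opaque. The function names are hypothetical.

```python
import json
import subprocess


def parse_scenario_list(raw: str) -> list:
    """Parse the command's JSON output.

    Assumes a top-level JSON array; the element shape is left untouched
    because it is not specified in this document.
    """
    data = json.loads(raw)
    if not isinstance(data, list):
        raise ValueError("expected a top-level JSON array")
    return data


def list_bundled_scenarios() -> list:
    """Run the CLI and return the parsed listing for cli/scenarios/."""
    result = subprocess.run(
        ["archal", "scenario", "list", "--json"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return parse_scenario_list(result.stdout)
```

Because the command covers only the bundled library, a count taken from this listing should not be expected to include the collections under scenarios/.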