AI-powered development toolkit for Claude Code
38 agents · 53 skills · 29 hooks — 12 installable bundles
Get Started
Prerequisites
- Claude Code installed and working
- Node.js 22+ (for npx arthai-activate)
- Git
- macOS or Linux
- An Arth AI license key (ARTH-XXXX-XXXX-XXXX-XXXX)
- For observability: Docker Desktop running, ports 4319, 3100, 5432 available on localhost
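The prerequisites above can be checked before you start. The following is a minimal sketch, assuming bash; the helper names (node_major, node_ok, port_in_use, preflight) are illustrative and not part of the toolkit.

```shell
#!/usr/bin/env bash
# Preflight sketch for the prerequisites above (helper names are hypothetical).

# Extract the major version from a string like "v22.3.0".
node_major() {
  local v="${1#v}"
  printf '%s\n' "${v%%.*}"
}

# Succeed if the given Node.js version string is 22 or newer.
node_ok() {
  [ "$(node_major "$1")" -ge 22 ]
}

# Succeed if something is already listening on the port (uses bash /dev/tcp).
port_in_use() {
  (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

# Report every prerequisite that looks unmet; prints nothing when all is well.
preflight() {
  command -v git >/dev/null || echo "git not found"
  node_ok "$(node --version 2>/dev/null || echo v0)" || echo "Node.js 22+ required"
  for port in 4319 3100 5432; do
    if port_in_use "$port"; then echo "port $port is already in use"; fi
  done
}
```

Source the file and run preflight. The port checks only matter if you plan to enable observability.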
Eight steps from zero to your first AI-assisted session.
Get a license key
Email productive@getarth.ai to request your license key.
Activate
Add the marketplace
Install a bundle
prime is the everything bundle — all agents, skills, and hooks. To pick a focused bundle instead, see the specific bundles.
Enable auto-updates
Keeps your toolkit up to date automatically — new agents, skills, and fixes land without manual intervention.
Enable observability · optional · experimental, limited preview
Optional — you can skip this and jump to Step 7. The toolkit works without observability. Come back here whenever you want a dashboard view of what Claude Code is doing.
If you installed prime, /otel-setup is already available. Otherwise install sentinel@arthai-marketplace first. Pick Local in the prompt — starts a Docker container (engine on :4319, dashboard on :3100) and writes CLAUDE_CODE_ENABLE_TELEMETRY=1 + OTLP env to .claude/settings.local.json.
Then restart your Claude Code session so it picks up the new env block — without the restart, the env vars aren't loaded and traces won't flow.
Then verify it's working. Do these in order:
- Open the dashboard. Go to http://localhost:3100 in your browser. You should see the Arth Intelligence UI (Sessions / Traces / Insights tabs). If the page doesn't load, run docker ps; you should see arthai-intelligence and arthai-db. If they're missing, run docker compose -f ~/.arthai/docker-compose.yml up -d.
- Generate some activity. Back in Claude Code, run any prompt, even something trivial like "what's in package.json?". The toolkit emits trace spans for every prompt, tool call, agent spawn, and stop event.
- Refresh the dashboard. Your session appears in the Sessions list with a recent timestamp. If nothing shows after 10 seconds, check the engine: curl -s http://localhost:4319/api/health.
- Click into your session. You see a waterfall of spans (your prompt, tool calls, agent spawns), each with duration and metadata.
- Confirm cost columns are populated. If the cost_usd / token columns show values, native OTEL is flowing and you're done. If they show a dash (—) or are empty, only the toolkit hook is on. Check: grep CLAUDE_CODE_ENABLE_TELEMETRY .claude/settings.local.json. It should print "CLAUDE_CODE_ENABLE_TELEMETRY": "1". If it's missing, re-run /otel-setup, pick Local again, then restart Claude Code.
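The two command-line checks in this list can be wrapped in a small script. This is a sketch, assuming bash and curl; the function names are illustrative, while the health URL and settings path are the ones given above.

```shell
#!/usr/bin/env bash
# Sketch of the verification checks above; function names are hypothetical.

# Succeed if the settings file enables Claude Code's native telemetry.
telemetry_enabled() {
  grep -Eq '"CLAUDE_CODE_ENABLE_TELEMETRY"[[:space:]]*:[[:space:]]*"1"' "$1" 2>/dev/null
}

# Succeed if the engine health endpoint answers with a 2xx status.
engine_healthy() {
  curl -sf --max-time 5 "${1:-http://localhost:4319/api/health}" >/dev/null
}

# Print a hint for each check that fails; prints nothing when both pass.
verify_observability() {
  local settings="${1:-.claude/settings.local.json}"
  engine_healthy || echo "engine not reachable on :4319 (check docker ps)"
  telemetry_enabled "$settings" || echo "native OTEL is off: re-run /otel-setup, pick Local, restart Claude Code"
}
```

Run verify_observability from the project root. It cannot confirm that cost columns are populated in the dashboard, so that last step remains a visual check.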
Observability is in active development, so expect rough edges. The full guide is in the Observability section below.
Calibrate
Deep-learns your project’s architecture, patterns, and domain. Builds a knowledge graph that all agents query.
Start a new session
Restart your Claude Code session so the knowledge graph gets built and the OTEL env block is picked up. Then run /onboard for a prioritized work briefing.
Observability · Experimental, limited preview
The toolkit ships an OTEL hook that emits a span for every prompt, tool call, agent spawn, skill invocation, and stop event. Paired with Claude Code's native OTEL, you get cost USD and token data on those spans — both streams flow into the same local Arth Intelligence container.
Prerequisites
- Docker Desktop installed and running
- Ports 4319, 3100, 5432 available on localhost (engine, dashboard, postgres)
- The sentinel plugin installed (or prime, which includes sentinel)
Setup (one time)
The /otel-setup skill verifies Docker is running, writes ~/.arthai/docker-compose.yml, starts the engine + dashboard + Postgres + Watchtower auto-updater, and writes the OTEL env vars to .claude/settings.local.json (project-local, git-ignored):
CLAUDE_CODE_ENABLE_TELEMETRY=1
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4319
OTEL_EXPORTER_OTLP_PROTOCOL=http/json
OTEL_METRICS_EXPORTER=otlp
OTEL_LOGS_EXPORTER=otlp
OTEL_TRACES_EXPORTER=otlp
Why CLAUDE_CODE_ENABLE_TELEMETRY=1 is required
Two telemetry streams, complementary:
| Stream | Source | Carries |
|---|---|---|
| Trace spans | toolkit otel-telemetry hook | session, prompt, tool calls, agent spawns, skill invocations, stop events |
| Cost + token metrics | Claude Code native OTEL (env-gated) | per-call cost USD, input/output/cache tokens, model |
If native OTEL is off, traces still flow but the dashboard's cost and token columns stay empty.
Verify
After /otel-setup finishes, start a new Claude Code session in the project, run any prompt, then visit:
- Dashboard: http://localhost:3100 — Sessions, Traces, Insights tabs
- Engine health: curl -s http://localhost:4319/api/health | jq .
If the dashboard shows your session with non-empty cost, you're done.
What survives a reboot
After a Mac reboot or Docker Desktop restart:
- ✅ Env vars in .claude/settings.local.json: file on disk, picked up on next session
- ✅ Compose file at ~/.arthai/docker-compose.yml: file on disk
- ✅ Trace data in the arthai_data Docker volume: preserved across container restarts
- ✅ Engine + DB + Watchtower containers: these auto-restart because the compose template sets restart: unless-stopped on every service
- ⚠ Docker Desktop itself: depends on a per-user OS toggle (Settings → General → "Start Docker Desktop when you log in"). We can't set this for you.
Quick verify after a reboot:
Should show three running containers (arthai-intelligence, arthai-db, arthai-watchtower). If any are missing:
Migration for existing customers (set up before the reboot-durability fix landed) — your engine and DB may have RestartPolicy: no. One-line fix, no data loss:
Or re-run /otel-setup → Local. The skill detects the legacy compose, prints the migration command, and the new template overwrites ~/.arthai/docker-compose.yml with the right policy on every service.
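The reboot checks in this section can be approximated with standard Docker commands. This is a sketch only: the function names are hypothetical and the exact commands the toolkit recommends are not reproduced here. The container names and compose path are the ones listed above.

```shell
#!/usr/bin/env bash
# Sketch of the post-reboot checks; function names are hypothetical.

EXPECTED_CONTAINERS="arthai-intelligence arthai-db arthai-watchtower"

# Print each expected container that is absent from the names passed in.
missing_containers() {
  local running=" $* " name
  for name in $EXPECTED_CONTAINERS; do
    case "$running" in
      *" $name "*) ;;
      *) echo "$name" ;;
    esac
  done
}

# Compare against what Docker reports and bring the stack back up if needed.
check_after_reboot() {
  local missing
  missing="$(missing_containers $(docker ps --format '{{.Names}}'))"
  if [ -n "$missing" ]; then
    echo "missing: $missing"
    docker compose -f ~/.arthai/docker-compose.yml up -d
  fi
}

# Show the restart policy of one container, e.g. "no" or "unless-stopped".
restart_policy() {
  docker inspect --format '{{.HostConfig.RestartPolicy.Name}}' "$1"
}
```

If restart_policy arthai-db prints no, the setup predates the durability fix. One plausible remedy, stated here as an assumption rather than the documented fix, is docker update --restart unless-stopped arthai-db arthai-intelligence, which mirrors the opt-out command under Updating & disabling.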
Updating & disabling
The Watchtower sidecar pulls the latest arthai/intelligence:latest image once a day. Trace data lives in the arthai_data Docker volume and is preserved across updates. Force-update now: docker compose -f ~/.arthai/docker-compose.yml pull && docker compose -f ~/.arthai/docker-compose.yml up -d.
To disable telemetry: export OTEL_DISABLED=true or remove the env block from .claude/settings.local.json. To opt out of auto-restart: docker update --restart no arthai-db arthai-intelligence arthai-watchtower.
What Can I Do?
Pick an intent to see the recommended workflow.
Learn
Set up your project and learn the codebase
Build
Plan features, implement with an AI team, ship PRs
Fix
Debug bugs, repair CI, triage incidents
Test
4-layer QA, E2E generation, visual regression, 8 agents
Automate
Autonomous mode and event-driven remediation
Learn Your Project
New to a project? /calibrate deep-scans your codebase — architecture, conventions, stack, domain model. Then /onboard gives you a prioritized briefing: what’s broken, what’s waiting, what to work on.
/calibrate
.claude/project-profile.md. Run once per project.
/onboard
Build a Feature
Full feature development with an adversarial planning team, parallel implementation agents, QA, and automated PR creation.
/planning
--lite, --fast, full.
/implement
/qa commit
/pr
Fix a Bug
6-step formal pipeline. Root cause analysis, scope lock, behavior contract, fix, verify, PR.
Hotfix mode
--hotfix for production emergencies (skips non-essential steps).
Severity levels
--severity critical|high|medium to set priority.
Fix CI
Auto-reads CI failure, diagnoses root cause, patches, resubmits. 3 retry attempts. Discord alert if all fail.
/ci-fix details
ci, staging, and prod targets.
Quality Assurance
Four-layer test strategy with 8 specialized agents. Commit mode for fast checks, full mode for comprehensive validation, plus opt-in E2E generation and visual regression.
Four-Layer Test Strategy
1. Baseline Tests
Existing test suites — regression anchor, same every run
2. Generated Scenarios
Fresh every run — thinks like real users based on the diff
3. Property-Based Tests
Infer invariants from code changes, test with random/edge-case inputs
4. Coverage Audit
Reviews if existing tests still match the codebase
Modes
- /qa: Commit mode. Targeted checks on changed files (~1-3 min)
- /qa full: Comprehensive. All checks across full codebase (~10-20 min)
- /qa staging: Health + smoke + E2E against deployed staging
- /qa prod: Read-only health + smoke against production
- /qa e2e-gen: Generate exploratory Playwright tests for changed components (opt-in)
- /qa visual: Computer-use visual regression at desktop + mobile viewports (opt-in)
QA Agents (8)
Orchestrator — spawns sub-agents, collects results, produces report
Playwright E2E tests for user workflows
Generates exploratory Playwright tests from diffs
Computer-use visual regression (desktop + mobile)
Adversarial red-teaming of test plans
Domain logic validation (state machines, constraints)
Manages test snapshots and golden files
Promotes generated tests that caught bugs to baselines
Typical Flow
Related skills
- /qa-incident: log a QA incident from a known issue
- /qa-learn: review QA knowledge base stats, prune stale entries
- /ci-fix: auto-remediate CI failures (3 retry attempts)
Ship Code
/precheck runs tests locally in ~30s. /qa validates changed files. /pr creates the PR.
/precheck
/qa commit
/pr
Autonomous Mode
Fully autonomous. Picks highest-priority unassigned issue, plans, implements, QAs, PRs. Stops for merge approval.
/autopilot details
Research & Knowledge
Build curated topic wikis. Init scaffolds, ingest processes sources, query synthesizes answers, lint health-checks.
init
ingest
query
lint
Operations
Health checks, log tailing, deploy watching, incident triage, server restarts.
- /sre status: Health check across all services and infrastructure
- /sre logs: Tail and analyze logs from running services
- /sre watch: Watch an active deployment for issues
- /incident: Classify severity, diagnose in parallel, route to the right fix skill
- /restart: Discover, restart, and validate local dev servers with health checks
Event-Driven Monitors
Instead of polling ("check CI every 5 minutes"), monitors sleep until something happens and then wake the toolkit to respond automatically. Zero API calls while idle — you only pay when an event fires.
How it works
1. /calibrate detects your stack (GitHub Actions, Railway, Sentry...)
2. Monitor configs land in .claude/monitors/, and you add the webhook URL on each platform
3. When an event fires, the matching skill runs: /ci-fix, /sre, /qa, /fix
Each monitor is a JSON config in .claude/monitors/. Calibrate generates them from templates in monitors/ (repo root), adapted to your project's platform and branch.
Available monitor templates
github-ci.json
When CI fails on any non-default branch, /ci-fix automatically reads the failure log, diagnoses the issue, patches, and resubmits. Up to 3 retries with different strategies.
workflow_run.conclusion == failure
deploy-health.json
When a deploy fails or a service crashes, /sre debug investigates logs, checks health endpoints, and attempts remediation before paging you.
status == FAILED or CRASHED
staging-qa.json
When staging deploys successfully, /qa staging automatically runs the full QA suite against the live staging environment. No manual trigger needed.
status == SUCCESS and environment == staging
runtime-errors.json
When runtime errors exceed a threshold (e.g., 5+ occurrences of a new error), /fix auto-runs with the error details. Includes 60-min cooldown and deduplication by fingerprint.
occurrences >= 5 and status == unresolved
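To make the trigger conditions concrete, here is a sketch of how the runtime-errors rule (threshold, status, and 60-minute cooldown) could be evaluated. It is a hypothetical illustration, not the toolkit's implementation; the function name and argument order are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical evaluation of the runtime-errors trigger; not the toolkit's code.

COOLDOWN_SECONDS=$((60 * 60))   # the 60-min cooldown described above
THRESHOLD=5                     # occurrences >= 5

# should_fire OCCURRENCES STATUS LAST_FIRED_EPOCH NOW_EPOCH
# LAST_FIRED_EPOCH is 0 when this error fingerprint has never fired.
should_fire() {
  local occurrences="$1" status="$2" last_fired="$3" now="$4"
  [ "$occurrences" -ge "$THRESHOLD" ] || return 1
  [ "$status" = "unresolved" ] || return 1
  [ "$last_fired" -eq 0 ] || [ $((now - last_fired)) -ge "$COOLDOWN_SECONDS" ]
}
```

Deduplication by fingerprint falls out of the caller keeping one last-fired timestamp per fingerprint, so a repeat of the same error inside the cooldown window is ignored.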
Setup
# Step 1: Run /calibrate — it auto-detects your stack and generates
# the right monitor configs in .claude/monitors/
/calibrate
# Step 2: Calibrate shows which monitors it generated:
# ✓ .claude/monitors/github-ci.json (GitHub Actions detected)
# ✓ .claude/monitors/deploy-health.json (Railway detected)
# ✗ runtime-errors.json (no Sentry/Datadog detected)
# Step 3: Add webhook URLs on your platforms
# GitHub: Repo Settings → Webhooks → paste monitor endpoint URL
# Railway: Project Settings → Webhooks → paste monitor endpoint URL
# Sentry: Settings → Integrations → Webhooks
# Step 4: Set webhook secrets in your environment
export GITHUB_WEBHOOK_SECRET="your-secret-here"
export DEPLOY_WEBHOOK_SECRET="your-secret-here"
# Done. Events fire → toolkit responds automatically.
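For context on what the webhook secret is for: GitHub signs each delivery with an HMAC-SHA256 of the request body and sends it in the X-Hub-Signature-256 header as sha256=<hex>. The sketch below shows that check with openssl. Whether the monitor endpoint validates deliveries exactly this way is an assumption, and the function names are illustrative.

```shell
#!/usr/bin/env bash
# Sketch: validate a GitHub webhook delivery against GITHUB_WEBHOOK_SECRET.
# Function names are hypothetical; requires openssl and awk.

# Print the signature GitHub would send for this body: "sha256=<hex HMAC>".
sign_payload() {
  local secret="$1" body="$2"
  printf 'sha256=%s\n' "$(printf '%s' "$body" | openssl dgst -sha256 -hmac "$secret" | awk '{print $NF}')"
}

# verify_signature SECRET BODY HEADER_VALUE
# HEADER_VALUE is the X-Hub-Signature-256 header from the request.
verify_signature() {
  [ "$(sign_payload "$1" "$2")" = "$3" ]
}
```

A call would look like verify_signature "$GITHUB_WEBHOOK_SECRET" "$body" "$header". A production receiver should hash the raw request bytes and use a constant-time comparison, which plain shell string comparison does not provide.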
Safety: Monitors include built-in loop guards — after 3 failed auto-fix attempts on the same branch, the monitor suspends itself and sends a Discord alert instead of retrying forever.
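The loop guard idea can be sketched as a per-branch failure counter. This is a hypothetical illustration of the behavior described above, not the toolkit's code; the state directory layout and function names are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical loop guard: count failed auto-fix attempts per branch in a
# state directory and report when the limit is reached. Not the toolkit's code.

MAX_ATTEMPTS=3

# record_failure STATE_DIR BRANCH -> prints the new failure count
record_failure() {
  local file="$1/$(printf '%s' "$2" | tr '/' '_').count" count=0
  mkdir -p "$1"
  [ -f "$file" ] && count="$(cat "$file")"
  count=$((count + 1))
  printf '%s\n' "$count" > "$file"
  printf '%s\n' "$count"
}

# Succeed once the branch has used up its attempts and should be suspended.
should_suspend() {
  local file="$1/$(printf '%s' "$2" | tr '/' '_').count"
  [ -f "$file" ] && [ "$(cat "$file")" -ge "$MAX_ATTEMPTS" ]
}
```

A caller would check should_suspend before each retry and send the alert instead of retrying once it succeeds. Resetting the counter after a successful fix is left out of the sketch.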
Re-running: If you add Sentry or change deploy platforms later, run /calibrate rescan to regenerate monitors for the new stack.
Consulting
Full consulting engagement pipeline. Discovery, assessment, opportunity mapping, solution design, deliverables.
/client-discovery
/consulting
/opportunity-map
/solution-architect
/deliverable-builder
Design
Add --design to planning for UX research and design critique before implementation.
Design workflow details
The --design flag adds UX researcher and design critic agents to the planning team. They conduct user journey mapping, heuristic evaluation, and accessibility review before any code is written.
All Agents (38)
Bundles (12)
Bundles are curated packages of agents, skills, and hooks that work together. Install one bundle to get a complete workflow. Bundles compose — install multiple without conflicts.
Which bundle should I pick?
atlas · canvas · compass · counsel · cruise · forge · prime · prism · scalpel · sentinel · shield · spark
Hooks (29)
Lifecycle hooks fire automatically at key moments. No manual invocation needed.