RunbookAI is an open-source CLI agent that investigates production incidents using hypothesis-driven reasoning. When an alert fires, it forms ranked hypotheses, runs targeted queries against your infrastructure (AWS, Kubernetes, CloudWatch), and systematically narrows to root cause.
Key design decisions:
- Hypothesis branching/pruning instead of broad exploration. Forms 3-5 hypotheses, tests each with causal queries, prunes dead ends, branches deeper on strong evidence (max depth: 4).
- Every mutation requires human approval. Full audit trail of what the agent thought, what it queried, and why.
- Pulls context from your runbooks, postmortems, and architecture docs (Confluence, Notion, Google Drive, or local markdown). The agent doesn't investigate in a vacuum.
- Deep Claude Code integration — auto-injects relevant operational context into coding sessions via MCP.
Try the demo without any API keys:
npx @runbook-agent/runbook demoGitHub: https://github.com/Runbook-Agent/RunbookAI
Would love to hear any questions/feedback!