TL;DR: Slack-first incident creation, automatic paging, and AI root-cause analysis have shifted incident management from rare crisis handling to continuous operational response, according to WorkOS’s conversation with Incident.io CTO Chris Evans. The conversation also covers a multi-agent root-cause system that needed 18 months of ground-truthing before it became useful. The central lesson is that fast AI workflows without rigorous evaluation produce convincing demos, not dependable incident governance.
NHIMG editorial — based on content published by WorkOS: Incident.io is redefining what an incident can be
Questions worth separating out
Q: How should security teams govern AI-assisted incident response workflows?
A: Security teams should govern AI-assisted incident response as delegated authority, not as a convenience feature.
Q: Why do incident workflows need identity governance as much as operational runbooks?
A: Incident workflows depend on trusted identities to create channels, page responders, and move state between tools.
Q: What breaks when AI root-cause analysis is used without ground truth?
A: Without ground truth, AI root-cause analysis can sound persuasive while being operationally wrong.
Practitioner guidance
- Define who can trigger incident workflows: Limit incident creation, channel spawning, and paging authority to roles that genuinely own urgent reactive work.
- Validate AI summaries against labelled incident history: Use a curated set of past incidents with known outcomes to test whether AI summaries improve triage, not just readability.
- Treat incident orchestration as governed delegation: Document which identities can open incidents, call tools, notify customers, and modify the response state.
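One way to make the delegation guidance above concrete is to write the policy down as data and check every action against it. The following is a minimal sketch, not a description of how Incident.io or any other tool implements authorisation: the identity names, the `Action` values, and the `authorize` helper are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, FrozenSet, List


class Action(Enum):
    OPEN_INCIDENT = "open_incident"
    PAGE_RESPONDER = "page_responder"
    NOTIFY_CUSTOMERS = "notify_customers"
    MODIFY_STATE = "modify_state"


# Hypothetical policy: which identity (human role or non-human agent)
# is allowed to take which incident action.
POLICY: Dict[str, FrozenSet[Action]] = {
    "oncall-engineer": frozenset(
        {Action.OPEN_INCIDENT, Action.PAGE_RESPONDER, Action.MODIFY_STATE}
    ),
    "comms-lead": frozenset({Action.NOTIFY_CUSTOMERS}),
    "ai-triage-agent": frozenset({Action.OPEN_INCIDENT}),
}


@dataclass(frozen=True)
class Decision:
    identity: str
    action: Action
    allowed: bool


def is_allowed(identity: str, action: Action) -> bool:
    """Deny by default: identities missing from the policy get no permissions."""
    return action in POLICY.get(identity, frozenset())


def authorize(identity: str, action: Action, audit_log: List[Decision]) -> bool:
    """Check the policy and record the decision, allowed or not."""
    allowed = is_allowed(identity, action)
    audit_log.append(Decision(identity, action, allowed))
    return allowed
```

In this sketch the AI agent can open an incident but cannot notify customers, and every decision is appended to an audit log so that denied attempts stay visible. A real deployment would more likely source the policy from an identity provider than from a hard-coded dictionary.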
What's in the full article
WorkOS's full interview covers the operational detail this post intentionally leaves for the source:
- How Incident.io structures Slack-first incident creation, paging, and customer communication in live environments
- How the multi-agent root-cause system was evaluated against real incident data over an 18-month build cycle
- How the team used labelled incidents to separate useful analysis from confident but incorrect outputs
- How incident response workflows are adapted for continuous operational use instead of rare crisis events
👉 Read WorkOS's interview on Incident.io's approach to AI-assisted incident response →
Explore further
Incident response is becoming an identity problem, not just an operations problem. The article shows that the value of incident tooling comes from low-friction delegation: triggering, paging, channel creation, and customer communication all depend on trusted identities and pre-authorised workflows. Once response is continuous, the identity layer is no longer background plumbing. It becomes the mechanism that determines whether urgent work moves fast or becomes chaotic. Practitioners should treat incident orchestration as governed access to action, not only a communications process.
A few things that frame the scale:
- The average estimated time to remediate a leaked secret is 27 days, despite 75% of organisations expressing strong confidence in their secrets management capabilities, according to The State of Secrets in AppSec.
- Only 44% of developers are reported to follow security best practices for secrets management, exposing a significant developer behaviour gap.
A question worth separating out:
Q: How do organisations know if incident automation is actually helping?
A: They know it is helping when it reduces time to correct action, not just time to generate output. Measure whether the system improves triage accuracy, lowers rework, and shortens the path to a verified root cause. If it only creates cleaner summaries, it is documentation support, not operational intelligence.
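The measurements named in that answer can be sketched as a small evaluation over labelled incident history. This is an illustrative sketch under stated assumptions, not the evaluation method described in the WorkOS interview: the `LabelledIncident` fields are hypothetical, root causes are assumed to be categorical labels that can be compared by exact match, and `ai_assisted` is assumed to record whether responders saw the AI analysis during the incident.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Optional


@dataclass(frozen=True)
class LabelledIncident:
    incident_id: str
    verified_root_cause: str          # ground truth confirmed after the incident
    ai_root_cause: str                # what the AI analysis proposed
    ai_assisted: bool                 # assumption: responders saw the AI output
    minutes_to_correct_action: float  # time until the right fix was started


def root_cause_accuracy(incidents: List[LabelledIncident]) -> float:
    """Share of incidents where the AI label matches the verified root cause."""
    if not incidents:
        return 0.0
    hits = sum(1 for i in incidents if i.ai_root_cause == i.verified_root_cause)
    return hits / len(incidents)


def mean_minutes_to_correct_action(
    incidents: List[LabelledIncident], ai_assisted: bool
) -> Optional[float]:
    """Mean time to correct action for one group; None if the group is empty."""
    times = [
        i.minutes_to_correct_action
        for i in incidents
        if i.ai_assisted == ai_assisted
    ]
    return mean(times) if times else None
```

Comparing `mean_minutes_to_correct_action(history, True)` with the same call using `False` gives a rough read on time to correct action, while `root_cause_accuracy` shows whether the analysis is right rather than merely readable. Exact-match scoring is a simplification; free-text root causes would need a more careful comparison.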
👉 Read our full editorial: Incident.io shows why AI incident response needs real evaluation