Durable execution for AI agents: are your controls keeping up?


(@nhi-mgmt-group)

TL;DR: AI agents that run for minutes or hours can lose state, repeat work, or fail mid-task unless their execution is durably recorded, according to WorkOS’s interview with Temporal co-founder Maxim Fateev. The governance question is not whether agents are clever, but whether their runtime assumptions survive crashes, retries, and intervention windows.

NHIMG editorial — based on content published by WorkOS: Maxim Fateev on why durable execution matters for AI agents

Questions worth separating out

Q: How should security teams govern AI agents that run long, multi-step workflows?

A: Security teams should require durable execution, full event history, and clear ownership for every multi-step agent workflow that touches sensitive data or privileged tools.

Q: Why do AI agents complicate access governance more than ordinary automation?

A: AI agents complicate access governance because they can branch at runtime, wait on external services, and continue later with the same operational context.

Q: What breaks when AI workflows cannot survive crashes or restarts?

A: When AI workflows cannot survive crashes or restarts, completed steps are lost, retries become manual, and teams risk duplicate actions or incomplete records.
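
One common guard against the duplicate actions mentioned above is to key each side effect by workflow and step, so that a retry returns the recorded result instead of acting again. The following is a minimal sketch of that idea, not something taken from the WorkOS interview: `ActionLedger`, `run_once`, and `send_refund` are hypothetical names, and the in-memory dictionary stands in for what would need to be a durable store in practice.

```python
import hashlib


class ActionLedger:
    """Hypothetical ledger of completed side effects.

    Assumption: a real deployment would back this with a durable store;
    the dict here only illustrates the lookup-before-act pattern.
    """

    def __init__(self):
        self._results = {}

    def run_once(self, workflow_id, step, action):
        # Derive a stable idempotency key from the workflow and step names.
        key = hashlib.sha256(f"{workflow_id}:{step}".encode()).hexdigest()
        if key in self._results:
            # The step already ran: return the recorded result, do not act again.
            return self._results[key]
        result = action()
        self._results[key] = result
        return result


sent = []  # records how many times the side effect really happened


def send_refund():
    sent.append("refund")
    return "refund-issued"


ledger = ActionLedger()
first = ledger.run_once("wf-42", "issue_refund", send_refund)
retry = ledger.run_once("wf-42", "issue_refund", send_refund)  # simulated retry
```

In this sketch the retry returns the same value as the first call while `sent` still holds a single entry, which is the property a reviewer would want to see for any privileged agent action.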

Practitioner guidance

  • Inventory agent workflows that cannot survive interruption: Map every AI agent flow that loses state on crash, restart, or worker reschedule.
  • Require persisted event history for all privileged agent tasks: Make complete step history mandatory for any workflow that touches customer data, secrets, or production systems.
  • Separate business logic from retry and recovery logic: Keep workflow decisions in the agent code and place retries, persistence, and scheduling in the orchestration layer.
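
The second and third points above can be illustrated with a small, self-contained sketch. It is not Temporal's API or any vendor's implementation: `DurableRunner`, `step`, and `agent_workflow` are hypothetical names, and an append-only JSON Lines file is assumed as the event history. The runner owns persistence and retries; the workflow function contains only decisions.

```python
import json
import os
import tempfile


class DurableRunner:
    """Hypothetical orchestration layer: persists every attempt and retries."""

    def __init__(self, history_path, max_attempts=3):
        self.history_path = history_path
        self.max_attempts = max_attempts
        self.history = self._load()

    def _load(self):
        if not os.path.exists(self.history_path):
            return []
        with open(self.history_path) as f:
            return [json.loads(line) for line in f if line.strip()]

    def _append(self, event):
        # Append-only write, flushed to disk before the step is considered done.
        self.history.append(event)
        with open(self.history_path, "a") as f:
            f.write(json.dumps(event) + "\n")
            f.flush()
            os.fsync(f.fileno())

    def step(self, name, fn):
        # Replay: a step already completed in history is never executed again.
        for event in self.history:
            if event["step"] == name and event["status"] == "completed":
                return event["result"]
        last_error = None
        for attempt in range(1, self.max_attempts + 1):
            try:
                result = fn()  # assumption: results are JSON-serialisable
            except Exception as exc:
                last_error = exc
                self._append({"step": name, "status": "failed",
                              "attempt": attempt, "error": str(exc)})
                continue
            self._append({"step": name, "status": "completed",
                          "attempt": attempt, "result": result})
            return result
        raise RuntimeError(f"step {name!r} exhausted retries") from last_error


def agent_workflow(runner, calls):
    """Business logic only: no retry, persistence, or scheduling code here."""
    def fetch():
        calls.append("fetch")
        return {"ticket": 7}

    ticket = runner.step("fetch_ticket", fetch)

    def summarise():
        calls.append("summarise")
        return f"summary of ticket {ticket['ticket']}"

    return runner.step("summarise", summarise)


path = os.path.join(tempfile.mkdtemp(), "history.jsonl")
calls = []
first = agent_workflow(DurableRunner(path), calls)
# Simulated restart: a fresh runner reloads the same history file.
second = agent_workflow(DurableRunner(path), calls)
```

After the simulated restart, `calls` still contains each step once, because the second run is served from the recorded history. A sketch like this does not cover a crash between performing a side effect and recording it, which is one reason production systems rely on a dedicated orchestration layer.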

What's in the full article

WorkOS's full interview covers the operational detail this post intentionally leaves for the source:

  • How Temporal records workflow history and resumes execution after worker failure
  • Why distributed systems patterns matter for AI agent reliability in production
  • How event history supports debugging, observability, and intervention during long-running runs
  • What teams should consider when choosing an orchestration layer for agentic workloads

👉 Read WorkOS's interview on durable execution for AI agents →

(@mr-nhi)

Durable execution is becoming an identity control surface for AI agents. Once an agent can run for extended periods, branch at runtime, and survive failures, the workflow engine is no longer just infrastructure. It becomes the place where state, retries, and evidence are preserved. That makes it relevant to identity governance because the organisation can only govern what it can later reconstruct, and reconstruction depends on durable event history.

A few things that frame the scale:

  • 80% of organisations report their AI agents have already performed actions beyond their intended scope, including accessing unauthorised systems (39%), inappropriately sharing sensitive data (31%), and revealing access credentials (23%), according to the AI Agents: The New Attack Surface report.
  • Only 44% of organisations have implemented any policies to govern AI agents, despite 92% agreeing that governing them is critical to enterprise security, according to SailPoint research.

A question worth separating out:

Q: How do teams know if durable execution is actually working for agents?

A: Teams know durable execution is working when a workflow can resume from a failure point without redoing completed work, and when the event history shows every decision and activity in order. If the only recovery option is to restart from scratch, the control is not real.
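
The test described in that answer can be turned into a simple audit over an exported event history. The sketch below assumes a hypothetical event shape with `seq`, `step`, and `status` fields; real workflow engines use their own schemas, so the field names would need mapping. It flags two things: sequence numbers that are missing or out of order, and any step recorded as completed more than once.

```python
def audit_history(events):
    """Return a list of findings; an empty list means the history passes.

    Assumed (hypothetical) event shape:
    {"seq": int, "step": str, "status": "completed" | "failed"}
    """
    findings = []
    completed = set()
    expected_seq = 1
    for event in events:
        # Every event should follow its predecessor with no gap or reorder.
        if event["seq"] != expected_seq:
            findings.append(
                f"gap or reorder at seq {event['seq']} (expected {expected_seq})"
            )
        expected_seq = event["seq"] + 1
        if event["status"] == "completed":
            # Completed work being redone is the failure mode to catch.
            if event["step"] in completed:
                findings.append(f"step {event['step']!r} completed more than once")
            completed.add(event["step"])
    return findings


clean = [
    {"seq": 1, "step": "fetch", "status": "completed"},
    {"seq": 2, "step": "summarise", "status": "failed"},
    {"seq": 3, "step": "summarise", "status": "completed"},
]
redone = clean + [{"seq": 4, "step": "fetch", "status": "completed"}]
gapped = [clean[0], clean[2]]

clean_findings = audit_history(clean)
redone_findings = audit_history(redone)
gapped_findings = audit_history(gapped)
```

A failed attempt followed by a successful retry passes the audit, since that is the recovery behaviour durable execution is meant to provide. A history where `fetch` completes twice, or where an event is missing, does not.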

👉 Read our full editorial: Durable execution is becoming a control plane for AI agents



   