Durable execution for AI agents: are your controls keeping up?

NHI Mgmt Group

(@nhi-mgmt-group)

Topic starter 06/06/2026 2:28 am

TL;DR: AI agents that run for minutes or hours can lose state, repeat work, or fail mid-task unless their execution is durably recorded, according to WorkOS’s interview with Temporal co-founder Maxim Fateev. The governance question is not whether agents are clever, but whether their runtime assumptions survive crashes, retries, and intervention windows.

NHIMG editorial — based on content published by WorkOS: Maxim Fateev on why durable execution matters for AI agents

Questions worth separating out

Q: How should security teams govern AI agents that run long, multi-step workflows?

A: Security teams should require durable execution, full event history, and clear ownership for every multi-step agent workflow that touches sensitive data or privileged tools.

Q: Why do AI agents complicate access governance more than ordinary automation?

A: AI agents complicate access governance because they can branch at runtime, wait on external services, and continue later with the same operational context.

Q: What breaks when AI workflows cannot survive crashes or restarts?

A: When AI workflows cannot survive crashes or restarts, completed steps are lost, retries become manual, and teams risk duplicate actions or incomplete records.

Practitioner guidance

Inventory agent workflows that cannot survive interruption: Map every AI agent flow that loses state on crash, restart, or worker reschedule.
Require persisted event history for all privileged agent tasks: Make complete step history mandatory for any workflow that touches customer data, secrets, or production systems.
Separate business logic from retry and recovery logic: Keep workflow decisions in the agent code and place retries, persistence, and scheduling in the orchestration layer.
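
To make the third point concrete, here is a minimal sketch of that separation using only the Python standard library. It is not Temporal's API and not taken from the WorkOS interview: `DurableRunner`, `agent_workflow`, the step names, and the JSON-lines history file are all hypothetical names chosen for illustration. The runner owns retries and persistence; the workflow function only decides which step comes next.

```python
import json
import time
from pathlib import Path
from typing import Any, Callable, Dict


class DurableRunner:
    """Hypothetical orchestration layer: persists step results and retries failures."""

    def __init__(self, history_path: Path, max_attempts: int = 3, backoff_s: float = 0.0):
        self.history_path = history_path
        self.max_attempts = max_attempts
        self.backoff_s = backoff_s
        self.completed = {}
        # Replay: load results of steps that finished before a crash or restart.
        if history_path.exists():
            for line in history_path.read_text().splitlines():
                event = json.loads(line)
                if event["type"] == "step_completed":
                    self.completed[event["step"]] = event["result"]

    def _record(self, event: Dict[str, Any]) -> None:
        # Append-only history; results must be JSON-serialisable in this sketch.
        with self.history_path.open("a") as f:
            f.write(json.dumps(event) + "\n")

    def step(self, name: str, fn: Callable[[], Any]) -> Any:
        # A step that already completed is never executed again.
        if name in self.completed:
            return self.completed[name]
        last_error = None
        for attempt in range(1, self.max_attempts + 1):
            try:
                result = fn()
            except Exception as exc:
                last_error = exc
                self._record({"type": "step_failed", "step": name,
                              "attempt": attempt, "error": repr(exc)})
                time.sleep(self.backoff_s)
                continue
            self._record({"type": "step_completed", "step": name,
                          "attempt": attempt, "result": result})
            self.completed[name] = result
            return result
        raise RuntimeError(f"step {name!r} failed after {self.max_attempts} attempts") from last_error


def agent_workflow(runner: DurableRunner, tools: Dict[str, Callable[[], Any]]) -> Any:
    """Business logic only: chooses the next step, never how to retry or persist."""
    ticket = runner.step("fetch_ticket", tools["fetch_ticket"])
    if ticket["risk"] == "high":
        return runner.step("escalate", tools["escalate"])
    return runner.step("auto_close", tools["auto_close"])
```

If the process dies after `fetch_ticket` completes, a new `DurableRunner` pointed at the same history file returns the recorded result instead of calling the tool again, so only the unfinished step is retried. A production engine would also need to handle concurrent workers, timeouts, and non-deterministic workflow code, which this sketch deliberately ignores.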

What's in the full article

WorkOS's full interview covers the operational detail this post intentionally leaves for the source:

How Temporal records workflow history and resumes execution after worker failure
Why distributed systems patterns matter for AI agent reliability in production
How event history supports debugging, observability, and intervention during long-running runs
What teams should consider when choosing an orchestration layer for agentic workloads

👉 Read WorkOS's interview on durable execution for AI agents →

Mr NHI

(@mr-nhi)

06/06/2026 4:08 am

Durable execution is becoming an identity control surface for AI agents. Once an agent can run for extended periods, branch at runtime, and survive failures, the workflow engine is no longer just infrastructure. It becomes the place where state, retries, and evidence are preserved. That makes it relevant to identity governance because the organisation can only govern what it can later reconstruct, and reconstruction depends on durable event history.

A few things that frame the scale:

80% of organisations report that their AI agents have already performed actions beyond their intended scope, including accessing unauthorised systems (39%), inappropriately sharing sensitive data (31%), and revealing access credentials (23%), according to the report AI Agents: The New Attack Surface.
Only 44% of organisations have implemented any policies to govern AI agents, despite 92% agreeing that governing them is critical to enterprise security, according to SailPoint research.

A question worth separating out:

Q: How do teams know if durable execution is actually working for agents?

A: Teams know durable execution is working when a workflow can resume from a failure point without redoing completed work, and when the event history shows every decision and activity in order. If the only recovery option is to restart from scratch, the control is not real.
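
One way to turn that test into an automated check is to audit the recorded history itself. The sketch below assumes a hypothetical event format (a list of dicts with `seq`, `type`, and, for step events, `step`); real orchestration engines use their own schemas, so treat the field names and the `audit_history` helper as illustrative only.

```python
from typing import Any, Dict, List


def audit_history(events: List[Dict[str, Any]]) -> List[str]:
    """Return a list of problems found in a workflow event history.

    Assumed (hypothetical) event shape: {"seq": int, "type": str, "step": str}.
    Checks that sequence numbers are contiguous and that no step is
    started or completed again after it has already completed.
    """
    problems = []
    completed = set()
    last_seq = 0
    for event in events:
        seq = event["seq"]
        if seq != last_seq + 1:
            problems.append(f"gap or reordering at seq {seq} (expected {last_seq + 1})")
        last_seq = seq
        if event["type"] == "step_completed":
            if event["step"] in completed:
                problems.append(f"step {event['step']!r} completed more than once")
            completed.add(event["step"])
        elif event["type"] == "step_started" and event["step"] in completed:
            problems.append(f"step {event['step']!r} re-executed after completion")
    return problems
```

A history where the worker restarts and then continues with the next unfinished step yields an empty list. A history where the restart is followed by a second `step_started` for an already completed step is flagged, which is the "restart from scratch" pattern described above.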

👉 Read our full editorial: Durable execution is becoming a control plane for AI agents
