Security teams should require durable execution, full event history, and clear ownership for every multi-step agent workflow that touches sensitive data or privileged tools. If the agent can lose state on failure, the organisation cannot reliably audit what happened or prove which actions were completed versus replayed.
Why This Matters for Security Teams
Long-running AI agents are not just another application workload. They are autonomous actors that can chain tools, pursue goals, and accumulate privilege across many steps, which makes static RBAC a poor fit. Governance has to focus on what the agent is trying to do at runtime, not only on a pre-approved role. That is why current guidance is moving toward intent-based authorisation, durable execution, and verifiable identity for the workload itself, as reflected in the OWASP Agentic AI Top 10 and the NIST AI Risk Management Framework.
This matters because agent failures are not limited to broken prompts. They also involve state loss, hidden retries, tool misuse, and actions that outlive the original user intent. NHIMG research shows how quickly secrets become exploitable in the real world: in the AI LLM hijack breach and similar incidents, credential exposure turns into rapid abuse. The OWASP NHI Top 10 likewise highlights why agent identity and secret handling must be treated as first-class controls. In practice, many security teams encounter this only after an agent has already executed an unauthorised side task or accessed data outside its intended scope.
How It Works in Practice
For multi-step workflows, governance should start with workload identity and per-task authority. The agent should present a cryptographic identity for the workload, then receive just-in-time credentials only for the specific step it is authorised to perform. That means short-lived tokens, narrow scopes, and automatic revocation when the task ends. Static service accounts and long-lived API keys are poor choices because autonomous systems are not predictable in the same way as human-operated jobs. CSA frames this clearly in the CSA MAESTRO agentic AI threat modeling framework, while NIST zero trust guidance supports evaluating every request at the moment it is made.
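The per-step credential pattern described above can be sketched as follows. This is a minimal in-memory illustration, not a reference implementation: `CredentialBroker`, `StepCredential`, the scope strings, and the 300-second default lifetime are all hypothetical names and values chosen for the example, and it assumes the workload's cryptographic identity has already been verified upstream.

```python
import secrets
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StepCredential:
    """A short-lived, narrowly scoped credential for one workflow step."""
    workload_id: str
    step: str
    scopes: frozenset
    expires_at: float
    token: str = field(default_factory=lambda: secrets.token_urlsafe(32))
    revoked: bool = False

    def allows(self, scope: str, now: Optional[float] = None) -> bool:
        # Valid only if not revoked, not expired, and the scope was granted.
        now = time.time() if now is None else now
        return (not self.revoked) and now < self.expires_at and scope in self.scopes


class CredentialBroker:
    """Hypothetical broker: issues per-step credentials and revokes them when the step ends."""

    def __init__(self, ttl_seconds: int = 300):
        self.ttl_seconds = ttl_seconds
        self._active = {}

    def issue(self, workload_id, step, scopes, now=None):
        # Assumption: the caller has already verified the workload identity.
        now = time.time() if now is None else now
        cred = StepCredential(workload_id, step, frozenset(scopes), now + self.ttl_seconds)
        self._active[cred.token] = cred
        return cred

    def revoke(self, cred):
        # Revoke explicitly at task end instead of waiting for expiry.
        cred.revoked = True
        self._active.pop(cred.token, None)
```

In use, the orchestrator would call `issue` immediately before a step, pass the credential to the tool call, and call `revoke` as soon as the step finishes, so a leaked token is useful only for one scope and a few minutes at most.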
- Use durable execution so each step, retry, and tool call is recorded as an event, not a transient process state.
- Issue ephemeral secrets per workflow stage, not broad credentials for the full agent lifecycle.
- Evaluate policy at runtime with context such as target data, tool sensitivity, user intent, and step history.
- Require human or policy approval for privilege escalation, external sharing, or irreversible actions.
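A runtime policy check along the lines of the list above might look like the sketch below. It is written in plain Python for illustration and is not the syntax of any particular policy engine. The field names, the `SENSITIVE_TOOLS` entries, the data classification labels, and the exact-match intent comparison are all assumptions; a real deployment would need a richer notion of intent.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    REQUIRE_APPROVAL = "require_approval"
    DENY = "deny"


@dataclass(frozen=True)
class StepRequest:
    tool: str
    data_classification: str        # e.g. "public", "internal", "restricted"
    declared_intent: str            # the task the user originally authorised
    step_intent: str                # what this step claims to be doing
    escalates_privilege: bool = False
    shares_externally: bool = False
    irreversible: bool = False
    prior_data_classes: tuple = ()  # classifications touched by earlier steps


SENSITIVE_TOOLS = {"payments.transfer", "iam.grant_role"}  # hypothetical tool names


def evaluate(req: StepRequest) -> Decision:
    # Deny steps that drift from the intent the user authorised.
    if req.step_intent != req.declared_intent:
        return Decision.DENY
    # Step history: sharing externally after touching restricted data is denied.
    if req.shares_externally and "restricted" in req.prior_data_classes:
        return Decision.DENY
    # Escalation, external sharing, and irreversible actions need approval.
    if req.escalates_privilege or req.shares_externally or req.irreversible:
        return Decision.REQUIRE_APPROVAL
    # Sensitive tools or restricted data also need approval.
    if req.tool in SENSITIVE_TOOLS or req.data_classification == "restricted":
        return Decision.REQUIRE_APPROVAL
    return Decision.ALLOW
```

The check runs on every step rather than once per session, so a workflow that starts with an allowed read can still be stopped when a later step tries to share the result externally.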
That runtime evaluation can be implemented with policy-as-code patterns such as OPA or Cedar, but the key is consistency, not the specific engine. NHI governance also needs an audit trail that can reconstruct exactly which actions were authorised, replayed, or aborted. NHIMG’s Ultimate Guide to NHIs — Lifecycle Processes for Managing NHIs is useful here because agent credentials must be treated as managed identities, not incidental application secrets. These controls tend to break down when an agent spans multiple vendors or ephemeral compute environments, because identity continuity and event retention become inconsistent across boundaries.
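To make the audit requirement concrete, the following sketch records each step as an event and skips steps that already completed when a workflow is re-run after a failure. It is an assumption-laden toy: `EventLog`, `run_step`, and the event kinds are hypothetical, the log lives in memory, and a real durable execution system would persist each event to storage before acting on it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    workflow_id: str
    step: str
    kind: str        # "authorised", "completed", "replayed", or "aborted"
    detail: str = ""


class EventLog:
    """Append-only log (in memory here; assume durable storage in practice)."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)

    def completed_steps(self, workflow_id):
        return {e.step for e in self._events
                if e.workflow_id == workflow_id and e.kind == "completed"}

    def history(self, workflow_id):
        return [e for e in self._events if e.workflow_id == workflow_id]


def run_step(log, workflow_id, step, action):
    """Run a step at most once; on a re-run, record a replay instead of re-executing."""
    if step in log.completed_steps(workflow_id):
        log.append(Event(workflow_id, step, "replayed", "skipped: already completed"))
        return False
    log.append(Event(workflow_id, step, "authorised"))
    try:
        action()
    except Exception as exc:
        # Record the failure before propagating it, so the abort is auditable.
        log.append(Event(workflow_id, step, "aborted", str(exc)))
        raise
    log.append(Event(workflow_id, step, "completed"))
    return True
```

Calling `history` for a workflow then yields an ordered record that distinguishes steps that were authorised and completed from those that were replayed or aborted.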
Common Variations and Edge Cases
Tighter step-level control often increases operational overhead, so organisations have to balance safety against workflow latency and developer friction. That tradeoff is real, especially for agents that perform many small actions in quick succession. Best practice is still evolving and there is no universal standard yet, so teams should classify workflows by blast radius rather than applying one governance pattern everywhere.
Low-risk agents may only need scoped JIT credentials and logging, while agents that touch finance, customer data, or production controls need stronger approval gates, richer audit history, and stricter separation of duties. High-autonomy systems also need special handling when they can self-assign subgoals or branch into parallel tool use, because failures may happen after a valid first step and before a risky second step. That is where the Top 10 NHI Issues and the NIST Cybersecurity Framework 2.0 remain useful as operational references for accountability, detection, and recovery. When agents can move across long-lived sessions, shared toolchains, or loosely governed MCP connections, the model breaks down because the organisation can no longer prove which action belonged to which intent.
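One way to express the low-risk versus high-risk split described above is a simple lookup from the domains a workflow touches to a control set. The two tiers mirror the paragraph; the domain labels, the `Controls` fields, and the rule that any high-risk domain selects the stricter tier are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Controls:
    jit_credentials: bool
    logging: bool
    approval_gate: bool
    full_event_history: bool
    separation_of_duties: bool


# Low-risk workflows: scoped JIT credentials and logging only.
LOW = Controls(True, True, False, False, False)
# High-risk workflows: approval gates, richer audit history, separation of duties.
HIGH = Controls(True, True, True, True, True)

# Hypothetical labels for finance, customer data, and production controls.
HIGH_RISK_DOMAINS = {"finance", "customer_data", "production"}


def controls_for(domains_touched):
    """Pick the stricter tier if the workflow touches any high-risk domain."""
    return HIGH if HIGH_RISK_DOMAINS & set(domains_touched) else LOW
```

Classifying on the full set of domains a workflow can reach, not only on its first step, matters for the branching case above, where a valid first step is followed by a riskier second one.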
Standards & Framework Alignment
This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.
OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.
| Framework | Control / Reference | Relevance |
|---|---|---|
| OWASP Agentic AI Top 10 | A2 | Covers agent misuse and unsafe tool execution in autonomous workflows. |
| CSA MAESTRO | | Provides threat modeling for multi-step agentic workflows and control mapping. |
| NIST AI RMF | | Supports governance, mapping, and measurement for autonomous AI risk. |
Define ownership, monitor behavior, and measure agent risk across the full workflow lifecycle.
Reviewed and updated by the NHIMG editorial team on June 6, 2026.
NHI Mgmt Group — the #1 independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org