Check whether every create, update, and delete request is blocked until an explicit human decision is recorded, and whether the logs show the proposed plan, the approval event, and the executed result. If the agent can proceed after a partial review, the control is only procedural, not enforced.
Why This Matters for Security Teams
Agent approvals are only meaningful if they stop execution until a human decision is recorded and tied to the exact request. For autonomous systems, a “review” that happens after the agent has already staged changes, called tools, or prepared follow-on actions is not a control. That gap matters because agents can chain actions faster than humans can intervene, so approval must be enforced at runtime rather than assumed from process.
This is where current guidance from the OWASP Agentic AI Top 10 and the NIST AI Risk Management Framework aligns with what NHI teams see in production: approvals need to be auditable, contextual, and bound to the action being authorised. NHI Mgmt Group’s Ultimate Guide to NHIs notes that 90% of IT leaders say properly managing NHIs is essential for zero trust, a reminder that identity alone is not enough without enforced decision points.
In practice, many security teams discover approval bypasses only after an agent has already completed a risky operation, rather than through intentional control testing.
How It Works in Practice
To know whether approvals are actually working, organisations need to test the full control path, not just the workflow. The agent should submit a proposed plan, wait in a blocked state, and only continue after a recorded human decision authorises the specific request. The evidence should connect three things: the requested action, the approval event, and the executed result. If those records do not match, the approval is likely cosmetic.
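One way to check that the three records really describe the same request is to compare a digest of the action across them. The sketch below is illustrative only: the record types, field names, and the `evidence_gaps` helper are assumptions made for this example, not part of any specific product or framework.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import List


def action_digest(action: dict) -> str:
    """Stable SHA-256 digest of an action, independent of key order."""
    canonical = json.dumps(action, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical record shapes; real log schemas will differ.
@dataclass(frozen=True)
class PlanRecord:
    request_id: str
    action: dict
    proposed_at: float


@dataclass(frozen=True)
class ApprovalRecord:
    request_id: str
    approver: str
    approved_digest: str  # digest of the action the human actually approved
    decided_at: float


@dataclass(frozen=True)
class ExecutionRecord:
    request_id: str
    action: dict
    executed_at: float


def evidence_gaps(plan: PlanRecord, approval: ApprovalRecord,
                  execution: ExecutionRecord) -> List[str]:
    """Return the reasons the three records fail to line up (empty if none)."""
    gaps = []
    if not (plan.request_id == approval.request_id == execution.request_id):
        gaps.append("request IDs do not match")
    if approval.approved_digest != action_digest(plan.action):
        gaps.append("approval does not cover the proposed action")
    if action_digest(execution.action) != approval.approved_digest:
        gaps.append("executed action differs from the approved action")
    if not (plan.proposed_at <= approval.decided_at <= execution.executed_at):
        gaps.append("events are out of order")
    return gaps
```

An empty result means the requested action, the approval event, and the executed result agree. Any entry in the list is a sign that the approval may be cosmetic for that request.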
Strong implementations usually enforce this with policy checks at the tool boundary, not in the user interface. That means the agent cannot write, delete, deploy, or access a protected API until the decision service returns allow for that exact operation, resource, and context. Teams often pair this with immutable logging, short-lived credentials, and request-level correlation IDs so investigators can prove that the approved plan is the one that executed. For agentic systems, the question is not whether a human saw a ticket; it is whether the runtime blocked the tool call until the right approval existed.
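A minimal sketch of enforcement at the tool boundary might look like the following. It assumes an in-memory decision store and a hypothetical `ToolGate` wrapper; a real deployment would call an external decision service and write to an immutable log, neither of which is modelled here.

```python
import hashlib
import json
import uuid
from typing import Any, Callable, Dict, List


class ApprovalRequired(Exception):
    """Raised when a tool call has no matching allow decision."""


def request_key(operation: str, resource: str, context: Dict[str, Any]) -> str:
    """Digest that binds an approval to one operation, resource, and context."""
    payload = json.dumps(
        {"operation": operation, "resource": resource, "context": context},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ToolGate:
    """Hypothetical gate: every tool call passes through call()."""

    def __init__(self) -> None:
        self._allowed: Dict[str, str] = {}  # request key -> approver
        self.audit_log: List[Dict[str, Any]] = []

    def record_approval(self, approver: str, operation: str, resource: str,
                        context: Dict[str, Any]) -> None:
        key = request_key(operation, resource, context)
        self._allowed[key] = approver
        self.audit_log.append({"event": "approved", "key": key, "approver": approver})

    def call(self, tool: Callable[[str], Any], operation: str, resource: str,
             context: Dict[str, Any]) -> Any:
        key = request_key(operation, resource, context)
        correlation_id = str(uuid.uuid4())
        # Approvals are single use (an assumption of this sketch): pop on use.
        approver = self._allowed.pop(key, None)
        if approver is None:
            self.audit_log.append(
                {"event": "blocked", "key": key, "correlation_id": correlation_id})
            raise ApprovalRequired(f"no approval for {operation} on {resource}")
        result = tool(resource)
        self.audit_log.append(
            {"event": "executed", "key": key, "correlation_id": correlation_id,
             "approver": approver, "result": result})
        return result
```

Because the key covers the operation, resource, and context together, an approval for one request does not unlock a different one, and consuming the approval on use blocks a replay of the same call.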
Security teams should validate the control with negative testing. Attempt a blocked action, confirm the agent cannot proceed, then verify the log trail includes the pre-approval plan, the approver identity, the timestamp, and the final action outcome. This approach is consistent with the risk patterns described in the OWASP NHI Top 10 and with the emphasis on agentic threat modelling in the CSA MAESTRO agentic AI threat modeling framework. These controls tend to break down when approvals are implemented only in chat, ticketing, or UI layers, because the agent can still invoke downstream tools directly.
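The log checks in a negative test can be automated. The sketch below assumes a simple list-of-dicts log with hypothetical event types (`plan_proposed`, `approval_recorded`, `action_blocked`, `action_executed`) and field names; actual log schemas will vary by platform.

```python
from typing import Any, Dict, List

# Hypothetical schema: the fields each event type must carry.
REQUIRED_FIELDS = {
    "plan_proposed": ("plan", "timestamp"),
    "approval_recorded": ("approver", "timestamp"),
    "action_executed": ("outcome", "timestamp"),
}


def negative_test_passed(events: List[Dict[str, Any]]) -> bool:
    """An unapproved attempt should leave a block event and no execution."""
    types = [event.get("type") for event in events]
    return "action_blocked" in types and "action_executed" not in types


def missing_evidence(events: List[Dict[str, Any]]) -> List[str]:
    """List what an approved request's log trail lacks (empty if complete)."""
    latest = {event.get("type"): event for event in events}
    problems = []
    for event_type, fields in REQUIRED_FIELDS.items():
        event = latest.get(event_type)
        if event is None:
            problems.append(f"missing event: {event_type}")
            continue
        for name in fields:
            if event.get(name) in (None, ""):
                problems.append(f"{event_type} lacks {name}")
    return problems
```

Run `negative_test_passed` against the log of an unapproved attempt, and `missing_evidence` against the log of an approved one. A trail that passes the first check but fails the second blocks correctly yet cannot prove who authorised what.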
Common Variations and Edge Cases
Tighter approval controls often increase friction, requiring organisations to balance safety against response speed and operational throughput. That tradeoff becomes visible when high-volume agents need rapid access for routine tasks, or when emergency changes must be approved in seconds rather than minutes. Current guidance suggests using risk-based approvals for lower-impact actions and explicit human sign-off for destructive or externally visible actions, but there is no universal standard for this yet.
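As a rough illustration of that split, a routing function can decide which approval path an action takes. The action attributes, the list of low-impact operations, and the choice to fail closed for anything unrecognised are all assumptions of this sketch, since no universal standard defines them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    operation: str
    destructive: bool = False
    externally_visible: bool = False


# Hypothetical allow-list of routine, low-impact operations.
LOW_IMPACT_OPERATIONS = frozenset({"read", "list", "describe"})


def approval_mode(action: Action) -> str:
    """Pick the approval path for an action."""
    if action.destructive or action.externally_visible:
        return "human_sign_off"
    if action.operation in LOW_IMPACT_OPERATIONS:
        return "risk_based"
    # Fail closed: anything not explicitly low impact goes to a human.
    return "human_sign_off"
```

Failing closed trades some throughput for safety: a new or misclassified operation waits for a person rather than slipping through the faster path.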
Some environments also use delegated approval, where a named operator pre-authorises a narrow task class under strict policy. That can work, but only if the scope is bounded, time-limited, and revocable. Another edge case is multi-agent orchestration: one agent may request approval while another executes a dependent action, so teams need correlation across the entire chain, not just the first request. This is especially important when the system can spawn subtasks or retry failed operations automatically.
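The three conditions on delegated approval (bounded, time-limited, revocable) can be expressed as a single check. The `Delegation` class below is a hypothetical sketch with assumed field names, not a reference to any particular policy engine.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass
class Delegation:
    operator: str               # named human who pre-authorised the task class
    task_class: str             # the narrow class of task covered
    resources: FrozenSet[str]   # bounded scope: explicit resource list
    expires_at: float           # time limit, as a Unix timestamp
    revoked: bool = False

    def revoke(self) -> None:
        """Withdraw the delegation; later checks will fail."""
        self.revoked = True

    def permits(self, task_class: str, resource: str, now: float) -> bool:
        """True only if the request is in scope, in time, and not revoked."""
        return (
            not self.revoked
            and now < self.expires_at
            and task_class == self.task_class
            and resource in self.resources
        )
```

Passing `now` explicitly keeps the check deterministic and easy to test; callers would typically supply `time.time()`.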
For deeper implementation patterns, the Analysis of Claude Code Security and the Anthropic report on the first AI-orchestrated cyber espionage campaign both reinforce the need for runtime enforcement over procedural review. Organisations should treat any approval model that cannot prove blocked execution as incomplete.
Standards & Framework Alignment
This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.
OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.
| Framework | Control / Reference | Relevance |
|---|---|---|
| OWASP Agentic AI Top 10 | A5 | Agent approval failures often stem from tool-use and execution bypasses. |
| CSA MAESTRO | GOV-2 | MAESTRO covers governance and runtime controls for agentic decision points. |
| NIST AI RMF | | AI RMF supports measurable governance and accountability for agent actions. |
Enforce approval at the tool boundary so no agent action executes before policy allow.
Reviewed and updated by the NHIMG editorial team on July 1, 2026.
NHI Mgmt Group — the #1 independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org