How do organisations govern agent security without over-trusting platform safeguards?

Why This Matters for Security Teams

Platform safeguards matter, but they are not a complete control plane for autonomous agents. Agents can chain tools, follow injected instructions, and take actions outside the narrow assumptions encoded by product defaults. That means security teams need governance that observes intent, evaluates context at runtime, and blocks unsafe tool use before damage occurs. The risk is not theoretical: the OWASP Agentic AI Top 10 and NIST AI Risk Management Framework both emphasise that dynamic AI behaviour creates control gaps that static policy cannot reliably close. NHIMG research on agentic risk and the OWASP Agentic Applications Top 10 show why relying only on the platform’s native guardrails places too much trust in components that can be bypassed through prompt injection or tool chaining.

That gap is especially dangerous because many organisations already underestimate their NHI exposure. In The State of Non-Human Identity Security, only 1.5 out of 10 organisations (15%) reported high confidence in securing NHIs, which is a warning sign for agent governance too. In practice, many security teams discover unsafe agent behaviour only after an over-permissioned workflow has already acted on real data or real systems.

How It Works in Practice

Effective agent governance treats the platform as an execution environment, not the final arbiter of trust. The control model should sit above and beside the platform, using workload identity, policy-as-code, and short-lived credentials to decide what the agent may do at the moment of action. The operational goal is simple: prove what the agent is, constrain what it can attempt, and expire access as soon as the task ends.

Current guidance suggests building this in layers:

Use workload identity for the agent, not a shared static secret, so every request can be tied to a cryptographic identity.

Issue just-in-time credentials with tight TTLs so access is task-scoped and automatically revoked after use.

Evaluate policy in real time against request context, tool sensitivity, data classification, and session state rather than predefining broad role grants.

Log every tool invocation, prompt transition, and approval decision so investigators can reconstruct the full agent thread.
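
The layers above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: `TaskCredential`, `issue_credential`, `invoke_tool`, and the in-memory `audit_log` are hypothetical names, and the `agent_id` is assumed to have already been authenticated by a workload identity system, which the sketch does not implement.

```python
import secrets
import time
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TaskCredential:
    """Short-lived, task-scoped credential (hypothetical shape, not a real API)."""
    agent_id: str              # workload identity of the agent (assumed verified upstream)
    task_id: str               # the task this credential was issued for
    allowed_tools: frozenset   # tools the task is permitted to call
    expires_at: float          # Unix timestamp after which the credential is rejected
    token: str = field(default_factory=lambda: secrets.token_urlsafe(32))


def issue_credential(agent_id, task_id, tools, ttl_seconds=300, now=None):
    """Issue a just-in-time credential that expires ttl_seconds from now."""
    now = time.time() if now is None else now
    return TaskCredential(agent_id, task_id, frozenset(tools), now + ttl_seconds)


audit_log = []  # in-memory stand-in for an append-only audit store


def invoke_tool(cred, tool, now=None):
    """Allow the call only if the credential is unexpired and covers the tool."""
    now = time.time() if now is None else now
    allowed = now < cred.expires_at and tool in cred.allowed_tools
    # Record every decision, allowed or denied, so the thread can be reconstructed.
    audit_log.append({
        "time": now,
        "agent_id": cred.agent_id,
        "task_id": cred.task_id,
        "tool": tool,
        "allowed": allowed,
    })
    return allowed
```

With a 60-second TTL, a call to a tool in scope succeeds while the credential is live and is denied once the TTL has passed. A call to a tool outside the task scope is denied at any time, and each of those decisions lands in the audit log.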

That approach aligns with the NIST AI Risk Management Framework, which pushes governance toward measurable controls and accountability, and with the CSA MAESTRO agentic AI threat modeling framework, which focuses on threat-driven design. For identity depth, NHIMG’s Ultimate Guide to NHIs is a useful reference for lifecycle, rotation, and visibility patterns that translate directly into agent controls.

In practice, this often means placing an authorization gateway or policy engine in front of tool calls, pairing it with secrets managers or identity brokers, and refusing any action that is not explicitly justified by the current task. These controls tend to break down when agents are allowed direct network reach, because tool autonomy and lateral movement become harder to intercept at the point of decision.
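
A default-deny gateway of this kind might look like the following Python sketch. Everything named here is an assumption made for illustration: the `ToolRequest` fields, the `TASK_POLICY` table, the three-level classification ranking, and the rule that irreversible actions need human approval. A production deployment would more likely delegate the decision to a dedicated policy engine.

```python
from dataclasses import dataclass

# Hypothetical ranking of data classifications, lowest to highest sensitivity.
SENSITIVITY = {"public": 0, "internal": 1, "restricted": 2}

# Hypothetical per-task policy: which tools a task justifies and its data ceiling.
TASK_POLICY = {
    "summarise-tickets": {
        "tools": {"ticket_search", "summarise"},
        "max_classification": "internal",
    },
}


@dataclass(frozen=True)
class ToolRequest:
    agent_id: str
    task_id: str
    tool: str
    data_classification: str
    irreversible: bool = False
    human_approved: bool = False


def authorize(req):
    """Return (allowed, reason). Anything not explicitly permitted is denied."""
    policy = TASK_POLICY.get(req.task_id)
    if policy is None:
        return False, "no policy for task"
    if req.tool not in policy["tools"]:
        return False, "tool not justified by current task"
    if req.data_classification not in SENSITIVITY:
        return False, "unknown data classification"
    ceiling = SENSITIVITY[policy["max_classification"]]
    if SENSITIVITY[req.data_classification] > ceiling:
        return False, "data classification exceeds task ceiling"
    if req.irreversible and not req.human_approved:
        return False, "irreversible action requires approval"
    return True, "allowed"
```

The ordering of checks is a design choice: the gateway fails closed on unknown tasks and unknown classifications before it evaluates anything else, so a missing policy entry results in a denial, not an implicit grant.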

Common Variations and Edge Cases

Tighter agent control often increases latency, operational complexity, and developer friction, so organisations must balance runtime safety against workflow speed. Best practice is evolving here, and there is no universal standard for exactly where to place guardrails in every stack. The right pattern depends on whether the agent is customer-facing, internal-only, or allowed to trigger irreversible actions.

One common edge case is vendor-hosted platforms that promise built-in protections. Those safeguards can reduce baseline risk, but they should not replace independent policy enforcement because the organisation still owns the data, the access, and the blast radius. Another edge case is delegated tool use in multi-agent systems, where one agent’s approved action becomes another agent’s input. That creates hidden privilege escalation paths unless each hop is re-authorized.
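
One way to re-authorize each hop is to limit a delegated action to the intersection of the grants held by every agent in the delegation chain. The Python sketch below assumes that approach. The function names, the `grants` mapping, and the agent names in the usage note are hypothetical, and other designs, such as per-hop token exchange, are equally plausible.

```python
def effective_scope(chain, grants):
    """Intersect the grants of every agent in the delegation chain.

    chain:  ordered list of agent IDs, from originator to the acting agent
    grants: mapping of agent ID to the set of actions that agent may perform
    """
    if not chain:
        return frozenset()  # no identified caller means no permissions
    scope = None
    for agent in chain:
        # An agent with no entry in grants contributes an empty set.
        agent_grants = frozenset(grants.get(agent, ()))
        scope = agent_grants if scope is None else scope & agent_grants
    return scope


def authorize_hop(chain, action, grants):
    """Allow the action only if every agent in the chain could perform it."""
    return action in effective_scope(chain, grants)
```

For example, if a `planner` agent may only `read_docs` while an `executor` agent may also `send_email`, then `authorize_hop(["planner", "executor"], "send_email", grants)` is denied, because the executor would otherwise be exercising a privilege the originating agent never held.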

Security teams should also treat long-lived credentials as a failure mode, not a convenience. NHIMG’s research shows how credential weakness and visibility gaps repeatedly drive NHI incidents, and the same pattern applies to agents that are left with persistent tokens. The safer model is to combine the NIST Cybersecurity Framework 2.0 with agent-specific controls and to use threat references such as the MITRE ATLAS adversarial AI threat matrix when deciding how much trust a platform default actually deserves.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	A1	Agent prompt injection and tool abuse require runtime controls beyond platform defaults.
CSA MAESTRO	TRM	Threat modeling agent workflows helps expose hidden trust in platform safeguards.
NIST AI RMF		AI RMF supports accountable governance for autonomous, context-driven agent behaviour.

Gate every tool call with policy checks and deny actions that lack current-task justification.
