When policy lives only in prompts or application code, it becomes easy to bypass, hard to audit, and fragile across model changes. The organisation loses a reliable control point for routing, logging, and privilege reduction. That creates hidden access paths that are difficult to detect after the agent has already acted.
Why This Matters for Security Teams
When agent policy exists only in prompts or application code, it is not acting as a control, only as an instruction. That distinction matters because autonomous agents can be re-tasked, chained into other tools, or influenced by input that changes their behaviour at runtime. Security teams lose a durable enforcement point for privilege, routing, and logging, and audit evidence becomes dependent on whatever the application happened to record.
This is why NHI Management Group treats policy-as-code and runtime enforcement as a governance issue, not just an engineering preference. The risk is especially visible in agentic systems where a prompt may look safe in testing but fail once the agent encounters a new tool, a new context, or a model update. NHIMG research on the OWASP NHI Top 10 shows how hidden access paths and weak control placement turn into operational exposure.
In practice, many security teams discover these gaps only after an agent has already touched sensitive systems, rather than through intentional policy review.
How It Works in Practice
Prompt text can influence behaviour, but it does not reliably enforce security decisions. If a model is asked to summarise data, open a ticket, or call an API, the real question is whether the agent is authorised to do so right now, with this input, in this context. That requires a separate policy decision point outside the prompt and outside the application’s business logic.
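A minimal sketch of such a decision point, assuming a simple allow-list keyed on agent, action, and resource. The names `Request`, `ALLOWED`, `decide`, and `guarded_call` are hypothetical and used only for illustration; they are not part of any NHIMG or vendor API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    agent_id: str   # identity of the calling agent (hypothetical value)
    action: str     # what the agent wants to do, e.g. "ticket:create"
    resource: str   # what it wants to act on, e.g. "helpdesk"


# Hypothetical allow-list. In a real deployment this would live in a policy
# store managed outside the application, not in a module-level constant.
ALLOWED = frozenset({
    ("support-agent", "data:summarise", "kb-articles"),
    ("support-agent", "ticket:create", "helpdesk"),
})


def decide(request: Request) -> bool:
    """Default deny: allow only explicitly listed (agent, action, resource) tuples."""
    return (request.agent_id, request.action, request.resource) in ALLOWED


def guarded_call(request: Request, tool):
    """Run `tool` only if the policy decision point allows the request."""
    if not decide(request):
        raise PermissionError(f"denied: {request}")
    return tool()
```

The point of the design is that the model's output never reaches `tool` directly: whatever the prompt says, the call is refused unless the decision function allows it.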
Current guidance suggests treating the agent as an identity-bearing workload and evaluating its access at runtime, not at build time. Pair the agent with a workload identity, then issue short-lived credentials only for the specific task. This is the operating model described across the NIST AI Risk Management Framework and the OWASP Agentic AI Top 10, and it aligns with NHIMG guidance on lifecycle control in the Ultimate Guide to NHIs.
- Use runtime policy evaluation for each request, not a one-time prompt safeguard.
- Bind the agent to workload identity so the system knows what it is, not just what it claims.
- Issue JIT credentials with a short TTL and revoke them when the task ends.
- Log policy decisions separately from application logs so approvals and denials are auditable.
- Restrict tool scope so one successful action cannot silently become lateral movement.
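The credential and logging items in the list above can be sketched as a small in-memory broker. This is an illustration under stated assumptions, not a reference implementation: `CredentialBroker`, its methods, and the 60-second default TTL are hypothetical, and a production system would rely on a real secrets manager or workload identity platform.

```python
import secrets
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Credential:
    token: str
    agent_id: str
    scope: str
    expires_at: float


class CredentialBroker:
    """Issues short-lived, single-scope credentials and logs every decision."""

    def __init__(self, ttl_seconds: float = 60.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so expiry can be tested
        self._active = {}            # token -> Credential
        self.decision_log = []       # kept separate from application logs

    def _log(self, event: str, agent_id: str, scope: str, allowed: bool):
        self.decision_log.append(
            {"event": event, "agent": agent_id, "scope": scope, "allowed": allowed}
        )

    def issue(self, agent_id: str, scope: str) -> Credential:
        cred = Credential(
            secrets.token_urlsafe(16), agent_id, scope, self._clock() + self._ttl
        )
        self._active[cred.token] = cred
        self._log("issue", agent_id, scope, True)
        return cred

    def is_valid(self, token: str, scope: str) -> bool:
        cred = self._active.get(token)
        allowed = (
            cred is not None
            and cred.scope == scope
            and self._clock() < cred.expires_at
        )
        self._log("check", cred.agent_id if cred else "unknown", scope, allowed)
        return allowed

    def revoke(self, token: str) -> None:
        cred = self._active.pop(token, None)
        if cred is not None:
            self._log("revoke", cred.agent_id, cred.scope, True)
```

Each check is evaluated at request time against scope and expiry, and both approvals and denials land in `decision_log`, so the audit trail does not depend on what the application chose to record.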
For example, a code assistant that can read a repo should not automatically be able to push to production, fetch secrets, or approve its own escalation. NHIMG's Analysis of Claude Code Security and its coverage of the Moltbook AI agent keys breach both illustrate how quickly weak credential placement becomes a control failure. These controls tend to break down when the agent can call unmanaged third-party tools, because the policy engine no longer sees the full action chain.
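The code-assistant example can be sketched as a scope check in front of every tool call. The scope strings, tool names, and functions below are hypothetical; the sketch assumes each tool declares exactly one required scope and that unregistered tools are denied.

```python
# Hypothetical scope grants per agent identity.
GRANTED_SCOPES = {
    "code-assistant": frozenset({"repo:read"}),
}

# Hypothetical registry: each managed tool declares the scope it requires.
TOOL_SCOPES = {
    "read_file": "repo:read",
    "push_to_production": "deploy:prod",
    "fetch_secret": "secrets:read",
}


def is_tool_allowed(agent_id: str, tool_name: str) -> bool:
    """Deny unknown tools and any tool whose scope the agent does not hold."""
    required = TOOL_SCOPES.get(tool_name)
    if required is None:
        return False  # unmanaged tool: the policy layer cannot reason about it
    return required in GRANTED_SCOPES.get(agent_id, frozenset())


def approve_escalation(requester: str, approver: str, scope: str) -> None:
    """Grant an extra scope, but never on the requester's own approval."""
    if requester == approver:
        raise PermissionError("an agent cannot approve its own escalation")
    current = GRANTED_SCOPES.get(requester, frozenset())
    GRANTED_SCOPES[requester] = current | {scope}
```

Denying unregistered tools by default is one way to handle the unmanaged third-party case: a tool the policy layer cannot see is a tool the agent cannot call.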
Common Variations and Edge Cases
Tighter runtime policy often increases latency, integration effort, and operational overhead, requiring organisations to balance stronger enforcement against developer convenience. That tradeoff is real, but it is usually preferable to relying on prompts that can be bypassed by model drift, context injection, or code changes.
There is no universal standard for prompt-only governance, and best practice is evolving. Some teams try to harden prompts with “do not” rules, but those controls are advisory, not enforceable. Others embed policy into application code, yet code still sits too close to the execution path and is easy to bypass through alternate routes, retries, or background jobs. The more autonomous the agent, the less trustworthy static rules become.
The hardest cases are multi-agent workflows, delegated tool use, and environments with shared secrets. In those settings, a single prompt can trigger downstream actions that were never reviewed as a chain. NHI Management Group’s Top 10 NHI Issues and the Regulatory and Audit Perspectives section both reinforce the same operational lesson: if access cannot be enforced, logged, and revoked outside the prompt, it is not a dependable control.
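One possible way to review a delegation chain as a whole, rather than hop by hop, is to let a chain act only with the scopes that every agent in it holds. This is an illustrative assumption, not a method described in the cited NHIMG material; `effective_scopes` and the agent names are hypothetical.

```python
def effective_scopes(chain, grants):
    """Return the scopes shared by every agent in a delegation chain.

    `chain` is an ordered list of agent ids; `grants` maps agent id to a
    frozenset of scopes. An empty chain or an unknown agent yields no scopes.
    """
    if not chain:
        return frozenset()
    result = grants.get(chain[0], frozenset())
    for agent_id in chain[1:]:
        result = result & grants.get(agent_id, frozenset())
    return result


# Hypothetical grants for a two-agent workflow.
GRANTS = {
    "planner": frozenset({"repo:read", "ticket:create"}),
    "executor": frozenset({"repo:read", "deploy:prod"}),
}
```

Under this rule, a planner that delegates to an executor cannot reach `deploy:prod` through it, because the planner never held that scope.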
In environments with legacy systems, offline batch jobs, or shared service accounts, this guidance breaks down because the agent cannot be cleanly separated from broader application privilege.
Standards & Framework Alignment
This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.
OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.
| Framework | Control / Reference | Relevance |
|---|---|---|
| OWASP Agentic AI Top 10 | A2 | Covers prompt and policy bypass risks in agentic systems. |
| CSA MAESTRO | M2 | Addresses agentic control-plane design and enforcement boundaries. |
| NIST AI RMF | GOVERN | Supports governance, accountability, and traceable AI control placement. |
Separate orchestration logic from enforceable policy and audit each decision.
Reviewed and updated by the NHIMG editorial team on June 23, 2026.
NHI Mgmt Group — the #1 independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org