They treat prompt injection as a content problem instead of an access problem. The real issue is whether the agent can ever see or disclose usable secrets. If the credential lives in the agent’s memory, configuration, or environment, the attacker only needs to persuade the system to reveal it. Identity design has to remove that exposure path.
Why This Matters for Security Teams
Prompt injection becomes a security problem when it can influence identity-bearing actions, not just model output. Teams often focus on filtering malicious text, but the real exposure is whether an agent can reach secrets, tokens, or privileged tool actions in the first place. When those controls are weak, a single injected instruction can turn a harmless chat input into credential disclosure or unauthorized execution. Current guidance from the OWASP Agentic AI Top 10 treats this as an access control failure, not a content moderation issue.
NHIMG research shows why that framing matters: the Ultimate Guide to NHIs reports that 79% of organisations have experienced secrets leaks, and 77% of those incidents caused tangible damage. That is the operational reality behind prompt injection. If an agent can read environment variables, config files, or long-lived API keys, the attack surface is already exposed before any prompt is sent. In practice, many security teams discover identity abuse only after the model has already been prompted into revealing something useful, instead of preventing it through deliberate access design.
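One way to picture closing the environment-variable path is to start the agent process with an allowlisted environment instead of letting it inherit everything from its parent. The following is a minimal sketch, not a hardened launcher: the `ALLOWED_ENV` set and the `minimal_env` and `run_agent` helpers are hypothetical names chosen for illustration, and it assumes the agent runs as a separate process on a POSIX-like system.

```python
import os
import subprocess
import sys

# Hypothetical allowlist: only variables the agent process genuinely needs.
ALLOWED_ENV = {"PATH", "LANG", "HOME"}


def minimal_env(parent_env, allowed=ALLOWED_ENV):
    """Return a copy of parent_env containing only allowlisted variables."""
    return {k: v for k, v in parent_env.items() if k in allowed}


def run_agent(argv, parent_env=None):
    """Start the agent process with a scrubbed environment.

    Anything not on the allowlist (API keys, cloud credentials, tokens)
    is never visible to the child, so no prompt can make it print them.
    """
    env = minimal_env(os.environ if parent_env is None else parent_env)
    return subprocess.run(argv, env=env, capture_output=True, text=True, check=True)
```

An allowlist is used here instead of a denylist because a denylist only removes the secret names someone remembered to list.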
How It Works in Practice
The practical mistake is letting the agent inherit broad identity context from the runtime. A prompt injection does not need to “hack the model” if the model is sitting next to usable credentials. Security teams should treat the agent as a workload with narrowly scoped identity, short-lived authority, and request-time policy checks. That means separating the model from secrets, issuing task-bound credentials only when needed, and revoking them as soon as the action completes.
In mature designs, the agent’s identity is a workload identity, not a human-like account. Cryptographic proof from systems such as SPIFFE or OIDC establishes what the workload is, while authorization is evaluated dynamically for the specific action being requested. That is closer to the direction described in the Ultimate Guide to NHIs — Standards than to legacy IAM patterns. The policy decision should happen at runtime with full context: which tool is being called, what data is involved, whether the request is expected, and whether the current session still deserves access.
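A request-time policy decision of this kind can be sketched in a few lines. This is an illustrative assumption, not the API of any particular policy engine: the `ActionRequest` fields, the `POLICY` table, the example SPIFFE ID, and the data-class labels are all hypothetical, and the workload identity is assumed to have been cryptographically verified upstream before `authorize` is called.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionRequest:
    workload_id: str     # identity already verified upstream (e.g. via SPIFFE or OIDC)
    tool: str            # which tool is being called
    data_class: str      # hypothetical label for the data involved
    expected: bool       # whether the call matches the task the agent was given
    session_valid: bool  # whether the current session still deserves access


# Hypothetical policy table: (workload, tool) -> data classes it may touch.
POLICY = {
    ("spiffe://example.org/agent/support", "ticket_lookup"): {"public", "internal"},
}


def authorize(req: ActionRequest) -> bool:
    """Default-deny decision evaluated for each individual action."""
    if not req.session_valid or not req.expected:
        return False
    allowed = POLICY.get((req.workload_id, req.tool))
    return allowed is not None and req.data_class in allowed
```

The point of the sketch is that nothing in the prompt text feeds the decision: an injected instruction can change what the agent asks for, but not what the policy table permits.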
- Use just-in-time credentials instead of embedding static keys in prompts, memory, or environment variables.
- Bind each secret to a single task or API call, then revoke it automatically on completion.
- Keep tool permissions separate from model instructions so prompt text cannot directly expand identity scope.
- Evaluate authorization at request time using policy-as-code, not only during provisioning.
This aligns with the OWASP Agentic AI Top 10, which emphasises tool abuse and excessive authority as core agentic risks. These controls tend to break down when legacy applications force agents to reuse shared service accounts, because the system then cannot distinguish intended automation from attacker-driven instruction following.
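The just-in-time and task-binding recommendations in the list above can be sketched as a small in-memory broker. This is a simplified illustration under stated assumptions: `CredentialBroker`, `TaskCredential`, and `run_task` are hypothetical names, the token is an opaque random string, and a real deployment would delegate issuance and revocation to a secrets manager or identity provider.

```python
import secrets
import time
from dataclasses import dataclass


@dataclass
class TaskCredential:
    token: str
    task_id: str
    scope: str
    expires_at: float
    revoked: bool = False


class CredentialBroker:
    """Issues short-lived credentials bound to one task and one scope."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._issued = {}

    def issue(self, task_id, scope):
        cred = TaskCredential(
            secrets.token_urlsafe(32), task_id, scope, self._clock() + self._ttl
        )
        self._issued[cred.token] = cred
        return cred

    def is_valid(self, token, task_id, scope):
        # Valid only for the same task and scope, before expiry, and if not revoked.
        cred = self._issued.get(token)
        return (
            cred is not None
            and not cred.revoked
            and cred.task_id == task_id
            and cred.scope == scope
            and self._clock() < cred.expires_at
        )

    def revoke(self, token):
        cred = self._issued.get(token)
        if cred is not None:
            cred.revoked = True


def run_task(broker, task_id, scope, action):
    """Issue a credential for one action and revoke it even if the action fails."""
    cred = broker.issue(task_id, scope)
    try:
        return action(cred.token)
    finally:
        broker.revoke(cred.token)
```

Because the token is created inside `run_task` and revoked in the `finally` block, a token that leaks through model output is already useless by the time an attacker could replay it.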
Common Variations and Edge Cases
Tighter identity controls often increase integration overhead, requiring organisations to balance security against developer friction and runtime complexity. Best practice is evolving, especially for multi-agent systems and long-running workflows where there is no universal standard for how often context should be revalidated. Some teams also overcorrect by moving every secret into a vault but still granting the agent broad retrieval rights, which preserves the same exposure path under a different wrapper.
The most common edge case is a toolchain that mixes user prompts, system prompts, and operational credentials in one process. In those environments, prompt injection can trigger secret exfiltration even if the model itself never “breaks” policy. Another weak point is delegated access across agents: one compromised agent can call another agent that has higher privilege, which creates an internal lateral-movement path. The Top 10 NHI Issues is useful here because it highlights how excessive privilege and poor rotation turn routine automation into a durable compromise path. Current guidance suggests reducing standing access first, then layering content controls on top. In practice, prompt injection becomes dangerous whenever the agent can translate words into privileged system state faster than defenders can revoke the underlying identity.
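The agent-to-agent lateral-movement path can be narrowed by attenuating scopes on delegation, so a delegated call never carries more authority than the caller already holds. The sketch below assumes scopes are plain strings and that both agents' scope sets are known to the component mediating the call; `delegated_scopes` and `call_agent` are hypothetical helpers, not part of any named framework.

```python
def delegated_scopes(caller_scopes, callee_scopes, requested):
    """Scopes a delegated call may use: the intersection of all three sets."""
    return set(requested) & set(caller_scopes) & set(callee_scopes)


def call_agent(caller_scopes, callee_scopes, requested):
    """Refuse the delegated call if any requested scope exceeds the caller's own."""
    granted = delegated_scopes(caller_scopes, callee_scopes, requested)
    missing = set(requested) - granted
    if missing:
        raise PermissionError(f"delegation denied for scopes: {sorted(missing)}")
    return granted
```

With this check in place, a compromised low-privilege agent that asks a higher-privilege agent to act on its behalf is limited to what it could already do directly.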
Standards & Framework Alignment
This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.
OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.
| Framework | Control / Reference | Relevance |
|---|---|---|
| OWASP Agentic AI Top 10 | TBD | Prompt injection is an agentic tool-abuse and excess-authority problem. |
| CSA MAESTRO | TBD | MAESTRO addresses agent workflows where identity and tool use intersect. |
| NIST AI RMF | | AI RMF governs risk, accountability, and runtime controls for AI systems. |
Use AI RMF to assign owners, assess prompt-injection risk, and monitor control effectiveness.