Prompt injection works because the agent uses its own authorised access. The attacker does not need to steal a key if they can steer the agent into using that key for an unintended action. Valid credentials confirm identity, but they do not prove the intent behind each runtime decision.
Why This Matters for Security Teams
Prompt injection is dangerous because it targets the decision layer, not the login layer. A credential can be valid, fully rotated, and still be used to execute an attacker-influenced action if the agent accepts malicious instructions embedded in content, tools, or workflows. That is why identity proof alone does not close the risk: the agent is operating with authorised access, but not necessarily authorised intent.
This is where classic IAM expectations break down. The most common failure is assuming that if a workload is authenticated, every downstream action is safe. For autonomous systems, the question is not only “who is this?” but “what is this entity trying to do right now?” Current guidance from the OWASP Non-Human Identity Top 10 and the Ultimate Guide to NHIs — Static vs Dynamic Secrets both point to the same operational reality: static trust does not map cleanly to dynamic execution.
NHIMG research shows the wider non-human identity problem is still immature, with only 19.6% of security professionals expressing strong confidence in their organisation’s ability to securely manage non-human workload identities, according to The 2024 Non-Human Identity Security Report. In practice, many security teams encounter prompt injection only after an agent has already used legitimate access in an unintended way, rather than through intentional abuse testing.
How It Works in Practice
Prompt injection succeeds when an agent treats untrusted input as actionable instruction. The attacker may place malicious directives in a ticket, document, web page, email, API response, or retrieval corpus. If the agent has tool access, it can chain those instructions into real operations such as reading data, sending messages, creating records, or invoking privileged APIs. The credential remains valid throughout; the misuse happens because the runtime policy did not distinguish approved purpose from attacker-shaped purpose.
Defence therefore has to move closer to the decision point. Best practice is evolving toward runtime controls that combine workload identity, policy evaluation, and short-lived authorisation:
- Use workload identity so the system can prove what the agent is, not just what secret it holds.
- Issue dynamic, short-lived secrets instead of long-lived static credentials whenever possible.
- Apply intent-based or context-aware authorisation at request time, not only at onboarding time.
- Scope tool use to the minimum viable action and revoke access when the task completes.
- Evaluate policy continuously using policy-as-code patterns rather than fixed allowlists alone.
That approach aligns with the OWASP Agentic AI Top 10 and the NIST Cybersecurity Framework 2.0, which both emphasise governance, least privilege, and monitored execution. The practical takeaway is simple: the credential authenticates the agent, but runtime policy must authorise each action. These controls tend to break down when agents have broad tool access across loosely governed SaaS, CI/CD, or RAG pipelines because the input surface and the action surface are both too large.
Common Variations and Edge Cases
Tighter runtime controls often increase integration overhead, requiring organisations to balance security precision against delivery speed and operational complexity. There is no universal standard for this yet, especially for agentic systems that work across multiple tools and data sources.
One common edge case is retrieval-augmented workflows, where the model is reading untrusted external content but also has the authority to act on it. Another is multi-agent orchestration, where one compromised agent can pass poisoned context to another and turn a narrow prompt injection into a broader workflow compromise. In those environments, static RBAC alone is usually too coarse, and even strong authentication from NIST SP 800-63 Digital Identity Guidelines does not solve the intent problem.
For security teams, the practical question is where to place trust boundaries. That may mean separate identities per agent, explicit approval gates for high-impact actions, and stricter control over secrets exposed to tools. The CI/CD pipeline exploitation case study and the Reviewdog GitHub Action supply chain attack both illustrate how valid access can still be abused once untrusted input reaches an automated control plane. Guidance remains evolving, but current practice suggests treating prompt injection as a runtime authorisation failure, not just a content-safety issue.
Standards & Framework Alignment
This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.
OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF set the governance and control requirements practitioners need to meet.
| Framework | Control / Reference | Relevance |
|---|---|---|
| OWASP Agentic AI Top 10 | A1 | Prompt injection exploits agent instruction handling and tool use. |
| CSA MAESTRO | GOV-02 | Agent governance must limit unintended actions from valid access. |
| NIST AI RMF | AI RMF governs contextual risk management for autonomous decisions. |
Assess and monitor agent behaviour continuously, not just initial identity assurance.
Related resources from NHI Mgmt Group
- What is the difference between prompt injection risk and identity abuse in agents?
- Why do non-human identities create more risk than many human accounts?
- Why do non-human identities create more remediation risk than many human accounts?
- How should teams reduce the risk of exposed AI credentials being abused?
Deepen Your Knowledge
Reviewed and updated by the NHIMG editorial team on June 10, 2026.
NHI Mgmt Group — the #1 independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org