TL;DR: According to Scramble ID, prompt injection cannot be fully mitigated by better prompts, because current LLMs cannot reliably separate instructions from data. Consequential authority must therefore move behind cryptographic authorization boundaries. Scope-per-tool tokens, step-up approval, dual control, and chain-aware delegation make harmful actions fail at the resource boundary rather than depending on the model to refuse them.
NHIMG editorial — based on content published by Scramble ID: Prompt Injection Defense Through Identity Controls
Questions worth separating out
Q: How should security teams stop prompt injection from turning into tool misuse?
A: They should enforce authorization at the tool or resource boundary, not inside the model.
Q: Why do prompt filters fail against indirect prompt injection?
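A: Indirect injection hides malicious instructions inside content the agent is meant to process, such as documents, web pages, or emails. A filter on the user's prompt does not inspect that content, and the model cannot reliably tell the embedded instructions apart from legitimate data.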
Q: What breaks when an AI agent has more tool access than it needs?
A: Prompt injection can convert excess privilege into action.
Practitioner guidance
- Map every agent tool to a distinct scope. Inventory each tool an agent can invoke, then assign the minimum scope needed for that one function.
- Place approval gates on irreversible actions. Require human step-up or dual control before outbound payments, mass deletions, public posts, privilege grants, or security-policy changes can execute.
- Preserve delegation provenance across agent chains. Use chain-aware token exchange so each hop carries subject and actor context end to end.
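The first item above can be sketched in a few lines of Python. This is a minimal illustration, not Scramble ID's implementation: the tool names, scope strings, the `Token` class, and the `InsufficientScope` exception are all hypothetical, and the token is assumed to have already passed signature verification. The point is only that the check runs at the tool boundary, so an injected instruction cannot widen what the agent is allowed to do.

```python
from dataclasses import dataclass

# Hypothetical tool-to-scope map; names are illustrative only.
TOOL_SCOPES = {
    "read_invoice": "invoices:read",
    "send_payment": "payments:write",
}


class InsufficientScope(Exception):
    """Raised when the presented token lacks the scope a tool requires."""


@dataclass(frozen=True)
class Token:
    # Assumed to be the claims of an already-verified signed token.
    subject: str
    scopes: frozenset


def authorize_tool_call(token: Token, tool: str) -> None:
    """Check the token at the tool boundary, independent of model output."""
    required = TOOL_SCOPES.get(tool)
    if required is None:
        # Unmapped tools are denied by default.
        raise InsufficientScope(f"unknown tool: {tool}")
    if required not in token.scopes:
        raise InsufficientScope(f"{tool} requires scope {required}")


def call_tool(token: Token, tool: str, handler, *args):
    """Run the handler only after the scope check passes."""
    authorize_tool_call(token, tool)
    return handler(*args)
```

An agent holding only `invoices:read` can call `read_invoice`, but a prompt-injected attempt to call `send_payment` raises `InsufficientScope` no matter what the model was persuaded to request.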
What's in the full article
Scramble ID's full article covers the operational detail this post intentionally leaves for the source:
- A detailed walkthrough of scope-per-tool tokens at the MCP boundary, including how insufficient_scope failures should behave.
- The Lockstep dual-control flow for irreversible actions and the specific kinds of actions that should trigger it.
- Token exchange handling for multi-hop delegation chains, including subject_token and act claim propagation.
- The article's concrete prompt-injected document example and how each control layer blocks a different failure path.
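As a rough illustration of the `act` claim propagation mentioned above, the sketch below follows the nesting convention from OAuth 2.0 Token Exchange (RFC 8693), where the outermost `act` claim names the current actor and nested `act` claims name prior actors. The function names and the identifiers such as `agent:planner` are hypothetical, and the sketch works on plain claim dictionaries rather than signed tokens, so it should be read as a model of the data shape only.

```python
import copy


def exchange_token(subject_token: dict, actor_id: str) -> dict:
    """Return claims for a new token in which actor_id acts for the same subject.

    The subject is preserved unchanged; the new actor becomes the outermost
    `act` claim, and any existing actor chain is nested beneath it.
    """
    new_claims = {"sub": subject_token["sub"], "act": {"sub": actor_id}}
    if "act" in subject_token:
        new_claims["act"]["act"] = copy.deepcopy(subject_token["act"])
    return new_claims


def actor_chain(claims: dict) -> list:
    """List actors from the current one back to the first delegate."""
    chain = []
    act = claims.get("act")
    while act is not None:
        chain.append(act["sub"])
        act = act.get("act")
    return chain


# Hypothetical two-hop delegation: user -> planner agent -> mailer agent.
user_token = {"sub": "user:alice"}
hop1 = exchange_token(user_token, "agent:planner")
hop2 = exchange_token(hop1, "agent:mailer")
```

After the second hop the subject is still `user:alice`, and `actor_chain(hop2)` yields the mailer agent followed by the planner agent, so a resource server can see every hop rather than only the last caller.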
👉 Read Scramble ID's analysis of prompt injection defence through identity controls →
Prompt injection and identity controls: are your guardrails enough?
Explore further
Prompt injection is an authorization problem disguised as a language problem. The model can be manipulated only because it is sitting too close to consequential power. Better prompts may reduce obvious failures, but they do not create a trustworthy instruction boundary. Practitioners should treat the model as an untrusted decision surface and move the real decision into cryptographic enforcement.
A few things that frame the scale:
- 72% of organisations have experienced or suspect they have experienced a breach of non-human identities, according to The 2024 ESG Report: Managing Non-Human Identities.
- Two-thirds of enterprises have endured a successful cyberattack resulting from compromised non-human identities, with a quarter encountering multiple attacks, according to the same report.
A question worth separating out:
Q: Who should approve high-risk actions taken by AI agents?
A: A human approver independent of the initiating path should approve high-risk actions such as payments, bulk deletions, or privilege grants. For the highest-risk actions, dual control is better than single review because it adds segregation of duties and makes silent misuse harder to complete.
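The approval rule in that answer can be expressed as a small policy check. The sketch below is an assumption-laden illustration: the action names, the approval counts per action, and the `ApprovalRequired` exception are hypothetical, and it assumes approver identities have already been authenticated elsewhere. It shows one way to require that approvals come from humans independent of the initiating path, with two distinct approvers for the highest-risk actions.

```python
from collections.abc import Iterable

# Hypothetical policy: independent human approvals needed per action type.
REQUIRED_APPROVALS = {
    "outbound_payment": 2,  # dual control
    "privilege_grant": 2,   # dual control
    "bulk_delete": 1,       # single step-up approval
}


class ApprovalRequired(Exception):
    """Raised when an action lacks enough independent approvals."""


def authorize_action(action: str, initiator: str, approvers: Iterable[str]) -> None:
    """Allow the action only if enough distinct, independent approvers signed off."""
    needed = REQUIRED_APPROVALS.get(action, 0)
    # Deduplicate approvers and discard the initiator: self-approval never counts.
    independent = set(approvers) - {initiator}
    if len(independent) < needed:
        raise ApprovalRequired(
            f"{action} needs {needed} independent approval(s), got {len(independent)}"
        )
```

Counting distinct approvers and excluding the initiator is what provides the segregation of duties: a compromised agent cannot approve its own payment, and one reviewer approving twice does not satisfy dual control.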
👉 Read our full editorial: Identity controls are the structural defence against prompt injection