Why are AI gateways not enough to stop prompt injection and data leakage?

Why This Matters for Security Teams

AI gateways are useful for routing, throttling, and basic request filtering, but they do not solve the core problem of content that is adversarial, ambiguous, or context-dependent. Prompt injection can arrive in a user message, a retrieved document, or a tool response, and data leakage often happens when an agent is tricked into revealing secrets already present in its working context. That is why the security question is not just where the traffic flows, but what the model or agent is being induced to do.

This is especially important because current guidance on agentic systems treats the interaction itself as the attack surface. NHIMG research on OWASP Agentic Applications Top 10 and the Guide to the Secret Sprawl Challenge both show why secret exposure and unsafe tool use are frequently downstream of weak content-layer controls. In practice, many security teams discover prompt injection only after an agent has already called the wrong tool or exposed sensitive context, rather than through intentional testing.

How It Works in Practice

Effective protection requires controls that inspect the prompt, the retrieved context, the tool call, and the model output as a single security problem. A gateway can block known-bad destinations, but it cannot reliably determine whether a benign-looking instruction contains a hidden override, a data exfiltration attempt, or a chain that will steer an agent toward privileged actions. For that reason, current practice is moving toward layered inspection, content policy enforcement, and runtime decisioning rather than perimeter-only filtering.

Teams should combine several controls:

Prompt and response inspection for instructions that try to override system behavior or solicit secrets.

Context isolation so retrieved content cannot silently inherit authority from system prompts or tool memory.

Tool-level authorization so each action is checked against current intent, not just session origin.

Secret minimisation so API keys, tokens, and certificates are never placed where the model can echo them back.

Logging and redaction that preserve forensic value without storing raw sensitive content unnecessarily.

That approach aligns with the broader direction described in the OWASP Agentic AI Top 10 and with the incident patterns documented in the DeepSeek breach, where secrets exposure and unsafe data handling were not solved by simple routing controls. Anthropic’s report on the first AI-orchestrated cyber espionage campaign also underscores that autonomous workflows can chain actions faster than human review can intervene. These controls tend to break down when the model has broad tool access and long-lived memory because a single successful injection can cascade into multiple downstream leaks.

Common Variations and Edge Cases

Tighter content inspection often increases latency, tuning effort, and false positives, requiring organisations to balance detection quality against user experience and operational overhead. There is no universal standard for this yet, so best practice is evolving rather than settled.

Some environments need stronger controls than others. Customer-facing chat applications usually prioritise redaction and output filtering, while internal agentic workflows often need stricter tool authorization and least-privilege context segmentation. Retrieval-augmented systems are a common edge case because malicious text may be embedded in a document the gateway sees as harmless. Similarly, multi-agent pipelines can leak data between agents even when the initial user request looked safe, because one agent may inherit compromised context from another.

NHIMG analysis of 52 NHI Breaches Analysis shows that identity and secret exposure often become visible only after misuse has already spread across systems. The practical takeaway is that gateways remain a useful choke point, but they are not a substitute for semantic inspection, runtime policy enforcement, and secret-aware application design. Where agents can reason, retrieve, and act, a network boundary alone is too shallow to stop abuse.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	A3	Prompt injection and unsafe tool use are core agentic AI risks.
CSA MAESTRO	RUNTIME	MAESTRO emphasizes runtime governance for autonomous agent behavior.
NIST AI RMF		AI RMF addresses governance for harmful model outputs and misuse.

Establish monitoring, evaluation, and accountability for prompt and data handling risks.

#1 Authority in NHI Education, Research and Advisory, empowering organizations to tackle the critical risks posed by Non-Human Identities (NHIs), including AI Agents.

Why are AI gateways not enough to stop prompt injection and data leakage?

Why This Matters for Security Teams

How It Works in Practice

Common Variations and Edge Cases

Standards & Framework Alignment

Related resources from NHI Mgmt Group