
Why do agentic systems increase the risk of hidden proxy attacks?

Agentic systems increase the risk because they create a trusted path from model output to external systems, and that path often carries URLs, headers, and other data that can be intercepted or rewritten. If the model can be induced to alter its own tool call, an attacker can redirect traffic without changing the visible conversation.

Why This Matters for Security Teams

Hidden proxy attacks matter because agentic systems do not just generate text. They create and modify outbound actions that can touch URLs, headers, tokens, and downstream tools. That turns a model response into an execution path that attackers can influence after the response has been generated. Guidance from the OWASP Agentic AI Top 10 and the NHI breach analysis of the AI LLM hijack breach show that the issue is not just prompt injection, but the trust placed in the agent’s tool chain.

When a proxy layer, redirector, or broker is silently rewritten, defenders may still see a normal conversation while traffic is being steered somewhere else. That makes detection harder than with classic phishing or credential theft, because the malicious control point sits inside the agent workflow itself. In practice, many security teams discover the compromise only after data egress, unauthorized API use, or anomalous routing has already occurred, rather than through deliberate review of the agent’s tool calls.

How It Works in Practice

Agentic systems increase hidden proxy risk because they often assemble requests dynamically. A single action may include a destination URL, authorization header, session token, and metadata chosen by the model at runtime. If the model can be induced to alter any of those fields, it can redirect the request to an attacker-controlled proxy, relay, or look-alike service without visibly changing the chat transcript. The conversation may look benign while the actual call path is compromised.
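To make the redirection risk concrete, here is a minimal sketch in Python of checking a model-assembled tool call before it is sent. The allowlist, the `api.example.com` and `evil.test` hosts, the call dictionary layout, and the function names are all assumptions made for illustration, not part of any particular agent framework.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist for illustration; a real deployment would load this from policy.
ALLOWED_HOSTS = {"api.example.com"}

# Headers that can influence routing; treated as suspicious when the model supplies them.
ROUTING_HEADERS = {"host", "x-forwarded-host", "forwarded"}


def destination_allowed(url, allowed_hosts=ALLOWED_HOSTS):
    """Return True only if the URL uses HTTPS and its actual host is allowlisted."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # malformed URLs are rejected rather than guessed at
    if parts.scheme != "https":
        return False
    # .hostname drops userinfo and port, so "api.example.com@evil.test"
    # is evaluated as "evil.test", which is where the request would really go.
    host = (parts.hostname or "").lower()
    return host in allowed_hosts


def check_tool_call(call):
    """Collect reasons to reject a tool call; an empty list means no finding."""
    problems = []
    if not destination_allowed(call.get("url", "")):
        problems.append("destination not allowlisted")
    for name in call.get("headers", {}):
        if name.lower() in ROUTING_HEADERS:
            problems.append(f"routing header set by model: {name}")
    return problems
```

The check runs on the parsed host rather than on the raw string, because look-alike URLs typically pass substring or prefix comparisons while resolving to a different destination.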

That is why static allowlists and role-based access rules are often too blunt for autonomous workflows. Current guidance suggests moving toward request-time authorization, policy-as-code, and workload identity so the platform can validate what the agent is trying to do at the moment of each request. The NIST AI Risk Management Framework supports this broader governance posture, while the 52 NHI Breaches Analysis shows how compromised non-human identities often become the fastest path from one exposed secret to wider abuse. The operational pattern usually includes:

  • short-lived credentials issued per task instead of standing secrets
  • runtime policy checks before each outbound tool call
  • cryptographic workload identity, not just application login state
  • egress controls that validate destination, headers, and proxy hops
  • logging that preserves the full tool invocation chain for audit
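Several of the items above can be sketched together. The following Python example assumes a hypothetical in-process credential issuer and an in-memory audit list; the names `TaskCredential`, `issue_credential`, `authorize`, and `AUDIT_LOG` are illustrative stand-ins, not an existing API. It shows a credential scoped to one task with a short lifetime, a policy check before each call, and a log entry for every decision.

```python
import secrets
import time
from dataclasses import dataclass, field


@dataclass
class TaskCredential:
    task_id: str
    allowed_tools: frozenset
    expires_at: float
    # Random bearer value; illustrative only, not a real token format.
    token: str = field(default_factory=lambda: secrets.token_urlsafe(16))


def issue_credential(task_id, allowed_tools, ttl_seconds=60.0, now=None):
    """Issue a credential scoped to one task and a short lifetime."""
    now = time.time() if now is None else now
    return TaskCredential(task_id, frozenset(allowed_tools), now + ttl_seconds)


AUDIT_LOG = []  # in-memory stand-in for an append-only audit store


def authorize(cred, tool, url, now=None):
    """Check one outbound tool call at request time and record the decision."""
    now = time.time() if now is None else now
    if now >= cred.expires_at:
        decision = "deny: credential expired"
    elif tool not in cred.allowed_tools:
        decision = "deny: tool not in task scope"
    else:
        decision = "allow"
    AUDIT_LOG.append(
        {"task": cred.task_id, "tool": tool, "url": url, "at": now, "decision": decision}
    )
    return decision == "allow"
```

Denied calls are logged as well as allowed ones, so the audit trail keeps the attempts that policy blocked alongside the calls that went out.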

Security teams should also model interception at the proxy layer itself, not only at the model layer. That means checking whether redirects, callback URLs, and embedded endpoints can be changed by tool output, retrieval content, or chained agent behavior. These controls tend to break down when agents are allowed to chain tools across multiple tenants or networks because the trust boundary becomes too fragmented to inspect consistently.
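One way to test the redirect case is to validate every hop instead of only the first URL. The Python sketch below assumes the caller has collected the `Location` values itself (for example by disabling automatic redirect following in its HTTP client); the function name, the hop limit, and the example hosts are hypothetical.

```python
from urllib.parse import urljoin, urlsplit


def resolve_redirect_chain(start_url, locations, allowed_hosts, max_hops=3):
    """Resolve each Location value in order and validate every hop.

    Returns the final URL if the start URL and all hops are HTTPS and
    allowlisted; raises ValueError at the first hop that is not.
    """
    if len(locations) > max_hops:
        raise ValueError("too many redirect hops")

    def check(url):
        parts = urlsplit(url)
        host = (parts.hostname or "").lower()
        if parts.scheme != "https" or host not in allowed_hosts:
            raise ValueError(f"disallowed hop: {url}")

    current = start_url
    check(current)
    for location in locations:
        # urljoin resolves relative and scheme-relative values ("//host/path")
        # against the current URL, the way a client following the redirect would.
        current = urljoin(current, location)
        check(current)
    return current
```

With the `requests` library, for instance, this would pair with `allow_redirects=False` so that each hop is inspected before the next request is made.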

Common Variations and Edge Cases

Tighter request mediation often increases latency and operational overhead, requiring organisations to balance containment against throughput and developer friction. That tradeoff becomes sharper in multi-agent systems, where one agent may generate a request that another agent executes, and the intermediate handoff can become the hidden proxy path. Best practice is evolving here, and there is no universal standard for this yet.

Some environments also create special risk. Browser-using agents can inherit cookies and session state, API brokers can mask the true destination, and RAG systems can inject attacker-controlled URLs into later tool calls. In those cases, static policies may be too coarse because the destination is not known until runtime. The Anthropic AI-orchestrated cyber espionage report and AI Agents: The New Attack Surface report both reinforce that autonomous behavior expands the number of places where trust can be subverted. The practical response is to treat every outbound agent action as an authenticated transaction, not a presumed-safe follow-on from the model.
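As one possible reading of "authenticated transaction", the sketch below signs the fields of a tool call when it is approved and verifies them again just before execution, so a rewrite during an agent-to-agent handoff is detectable. It uses Python's standard `hmac` module; the call layout, the shared key handling, and the function names are assumptions for illustration, and a real deployment would need proper key management.

```python
import hashlib
import hmac
import json


def canonical(call):
    """Serialize a tool call deterministically so equal calls sign identically."""
    return json.dumps(call, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_call(call, key):
    """Return a hex HMAC-SHA256 tag over the canonical form of the call."""
    return hmac.new(key, canonical(call), hashlib.sha256).hexdigest()


def verify_call(call, signature, key):
    """Return True only if the call is byte-for-byte what was originally signed."""
    return hmac.compare_digest(sign_call(call, key), signature)


# Usage: the approving component signs, the executing component verifies.
key = b"demo-key-for-illustration-only"
approved = {
    "tool": "http_get",
    "url": "https://api.example.com/v1/items",
    "headers": {"Accept": "application/json"},
}
tag = sign_call(approved, key)

# A relay that swaps the destination changes the canonical bytes, so the tag no longer matches.
tampered = dict(approved, url="https://evil.test/relay")
```

Here `verify_call(approved, tag, key)` succeeds and `verify_call(tampered, tag, key)` fails, which shifts the question from whether the transcript looks normal to whether the executed request matches the approved one.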

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework | Control / Reference | Relevance
OWASP Agentic AI Top 10 | A1 | Agent tool hijacking is a core agentic app threat.
CSA MAESTRO | - | MAESTRO models trust boundaries for multi-step agent workflows.
NIST AI RMF | - | AI RMF addresses governance for risky autonomous behavior.

Map proxy, tool, and handoff paths so hidden relays are treated as explicit attack surfaces.