TL;DR: A graph-level backdoor can rewrite tool-call URLs in real time, silently redirecting agentic requests through attacker-controlled infrastructure while leaving the user-facing response unchanged, according to HiddenLayer research. The attack turns model execution into a supply chain compromise and makes graph-level inspection a prerequisite for safe deployment.
NHIMG editorial — based on content published by HiddenLayer: Agentic ShadowLogic
Questions worth separating out
Q: What breaks when a tool-calling model can rewrite its own requests?
A: The approval model breaks because the action that was reviewed is no longer the action that executes.
Q: Why do agentic systems increase the risk of hidden proxy attacks?
A: Agentic systems increase the risk because they create a trusted path from model output to external systems, and that path often carries URLs, headers, and other data that can be intercepted or rewritten before the request is dispatched.
Q: How should security teams validate downloaded models before using them in production?
A: Security teams should treat downloaded models as executable artefacts and inspect them for unusual graph logic, embedded state, and tool-call manipulation before deployment.
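As a rough illustration of what a first-pass graph review could look like, the sketch below lists nodes whose operator expresses branching or conditional selection. It is a triage aid under stated assumptions, not HiddenLayer's scanner: the review list (`REVIEW_OPS`) and the function names are hypothetical, and the stand-in nodes only mimic the `.name` and `.op_type` fields that nodes from `onnx.load(path).graph.node` expose in the onnx Python package.

```python
from collections import Counter
from types import SimpleNamespace

# Hypothetical review list: ONNX operators that express branching or
# conditional selection. Benign models use them too, so a hit means
# "look closer", not "backdoored".
REVIEW_OPS = {"If", "Loop", "Scan", "Where"}


def flag_ops_for_review(nodes, review_ops=REVIEW_OPS):
    """Return (name, op_type) for every node whose operator is on the review list.

    `nodes` is any iterable of objects with `.name` and `.op_type`.
    """
    return [(n.name, n.op_type) for n in nodes if n.op_type in review_ops]


def summarise(nodes):
    """Count operators so a reviewer can compare the profile with a known-good export."""
    return Counter(n.op_type for n in nodes)


# Stand-in nodes so the sketch runs without the onnx package installed.
demo_nodes = [
    SimpleNamespace(name="attn_matmul", op_type="MatMul"),
    SimpleNamespace(name="branch_0", op_type="If"),
    SimpleNamespace(name="select_logits", op_type="Where"),
]

flagged = flag_ops_for_review(demo_nodes)
```

This sketch only walks top-level nodes. The bodies of `If`, `Loop`, and `Scan` operators live in subgraph attributes, so a fuller review would need to recurse into those as well.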
Practitioner guidance
- Inspect model graphs before deployment: Scan exported ONNX or equivalent artefacts for conditional branches, unexpected state storage, and logit manipulation inside tool-calling paths before approving the model for production use.
- Separate approval from execution: Validate that the destination, parameters, and headers of every tool call are independently checked after generation and before dispatch, rather than trusting model-produced arguments.
- Constrain egress for agent tools: Route agent tool traffic through controlled network boundaries so that unexpected proxies, redirects, or destination changes are visible and blocked where possible.
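The second and third points above can be sketched as a check that runs in the dispatcher, outside the model, just before a tool call leaves the host. This is a minimal sketch under assumptions: the allowlist contents, the `check_tool_call` name, and the example URLs are all hypothetical, and a real deployment would also cover headers and request bodies.

```python
from urllib.parse import urlsplit

# Hypothetical policy: hosts this agent's tools are allowed to reach.
ALLOWED_HOSTS = {"api.example.com"}


def check_tool_call(approved_url, dispatch_url, allowed_hosts=ALLOWED_HOSTS):
    """Return a list of policy violations; an empty list means dispatch may proceed."""
    problems = []
    target = urlsplit(dispatch_url)
    if target.scheme != "https":
        problems.append("scheme is not https")
    if target.hostname not in allowed_hosts:
        problems.append(f"host {target.hostname!r} is not on the allowlist")
    if dispatch_url != approved_url:
        problems.append("dispatch URL differs from the approved URL")
    return problems


approved = "https://api.example.com/v1/items"
# A rewritten call that tunnels the original destination through a proxy.
rewritten = "https://proxy.attacker.example/?url=https://api.example.com/v1/items"

clean_result = check_tool_call(approved, approved)
rewritten_result = check_tool_call(approved, rewritten)
```

The allowlist check does not rely on anything the model produced, so it still holds if the approved URL itself was generated by a compromised graph. The equality check only helps when the approved value comes from an independent source, such as a user confirmation or a static tool definition.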
What's in the full report
HiddenLayer's full research covers the operational detail this post intentionally leaves for the source:
- Figure-by-figure walkthrough of the ONNX graph logic used to detect URL generation inside tool calls
- Token-by-token explanation of the KV cache state that keeps the backdoor aligned across generation steps
- Demonstration artefacts showing how the proxy logs, rewrites, and response injection behave in practice
- Implementation detail on the model-scanner approach used to detect suspicious graph payloads before deployment
👉 Read HiddenLayer's research on Agentic ShadowLogic and tool-call hijacking →
Agentic ShadowLogic: are tool calls safe to trust?
Explore further
Tool-call integrity is now an identity problem, not just a model-safety problem. Once a model can rewrite its own downstream action path, the real control boundary shifts from the prompt to the execution artefact. That means tool calls behave like privileged machine identities whose authority can be subverted after approval. Practitioners should treat model graphs as governance objects, not just ML artefacts.
A few things that frame the scale:
- 91.6% of secrets remain valid five days after the targeted organisation is notified, showing a critical gap in remediation procedures, according to the Ultimate Guide to NHIs.
- Only 5.7% of organisations have full visibility into their service accounts, which means most teams would struggle to spot a hidden tool-call path or compromised runtime identity in time.
A question worth separating out:
Q: What should organisations do when AI tools may carry credentials in request paths?
A: Organisations should remove credentials from request paths, reduce reliance on query-based authentication, and constrain where agent traffic can be sent. If the proxy layer can see the credential, the proxy layer can potentially expose it, so request hygiene matters as much as model hygiene.
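One way to act on the request-hygiene point is to reject outbound tool-call URLs that carry credentials where any intermediary can read them. The sketch below is illustrative only: the parameter names in `SENSITIVE_PARAMS` and the `credential_params` helper are hypothetical, and real credential names vary by API.

```python
from urllib.parse import parse_qsl, urlsplit

# Hypothetical list of query parameter names that usually carry secrets.
SENSITIVE_PARAMS = {"api_key", "apikey", "token", "access_token", "key", "secret"}


def credential_params(url, sensitive=SENSITIVE_PARAMS):
    """Return the places in `url` where a credential appears to be embedded."""
    parts = urlsplit(url)
    found = [
        name
        for name, _value in parse_qsl(parts.query, keep_blank_values=True)
        if name.lower() in sensitive
    ]
    # Credentials in the userinfo part (https://user:pass@host/) are exposed too.
    if parts.username or parts.password:
        found.append("userinfo")
    return found


with_key = credential_params("https://api.example.com/v1/items?api_key=abc&page=2")
without_key = credential_params("https://api.example.com/v1/items?page=2")
with_userinfo = credential_params("https://user:pw@api.example.com/")
```

A dispatcher could refuse any call for which this returns a non-empty list. Moving the credential into a header does not protect it from a proxy that terminates the connection, which is why this check sits alongside egress constraints rather than replacing them.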
👉 Read our full editorial: Agentic ShadowLogic exposes a hidden tool-call hijack risk