Agentic AI tools and browsing widen the attack surface

By NHI Mgmt Group Editorial TeamPublished 2025-11-12Domain: Agentic AI & NHIsSource: Lakera

TL;DR: Agentic AI systems that can browse the web and invoke tools create a larger attack surface, because every new action path can be abused and every retrieved page can hide instructions, according to Lakera. Runtime guardrails and continuous red-teaming are now the practical boundary between safe autonomy and unintended execution.

At a glance

What this is: This analysis shows that agentic AI browsing and tool use turn capability expansion into a security problem, where over-privileged actions and hidden web instructions become direct attack paths.

Why it matters: It matters because IAM, NHI, and AI governance teams need to treat agent permissions, retrieval sources, and execution safeguards as one control plane rather than separate concerns.

👉 Read Lakera's analysis of over-privileged tools and uncontrolled browsing in agentic AI

Context

Agentic AI security fails when systems are given broad tool access and the open web at the same time. Browsing-enabled agents do not just consume information, they can act on it, which means trust boundaries move from static prompts into runtime execution paths.

That creates a governance problem for non-human identities and autonomous systems alike. If an agent can send email, fetch data, run code, or follow hidden instructions embedded in web content, then least privilege, content validation, and egress control have to operate continuously rather than at setup time.

Key questions

Q: How should security teams govern AI agents that can browse the web and use tools?

A: Treat browsing and tool use as live execution, not passive assistance. Restrict permissions, validate retrieved content, and require runtime approval for any action that could change data, send messages, or trigger code. The control goal is to stop hostile content from becoming a trusted instruction before the agent completes the action chain.

Q: Why do AI agents increase risk more than ordinary automation?

A: AI agents increase risk because they can choose actions at runtime, not just follow a fixed workflow. When tool access and browsing are combined, a poisoned input can steer the agent into actions the original workflow never intended. That makes privilege scope, content trust, and execution controls part of the same governance problem.

Q: What breaks when browsing agents trust web content too much?

A: The trust boundary breaks down. A browsing agent may treat hidden instructions in HTML, images, metadata, or retrieved text as part of the task, then act on them as if they were legitimate context. That can produce unsafe links, data exposure, or chained tool misuse before any human notices the deviation.

Q: How do you know if runtime guardrails are actually working for agentic AI?

A: Look for inspection at the point of action, not just at configuration time. Effective guardrails screen prompts, retrieved content, and outbound actions in the live session, then block or flag anything that would expand scope unexpectedly. If unsafe tool calls still happen without a visible control decision, the guardrail is not working.

Technical breakdown

Over-privileged tools in agentic AI workflows

Agentic systems often chain tool calls through APIs, browsers, and MCP servers, which turns each permitted action into a potential abuse path. The issue is not that the tool is malicious, but that the agent can discover, select, and combine capabilities at runtime. Once a model can invoke email, file, or code execution tools without a fresh approval gate, privilege scope becomes the security boundary. In practice, this is the same over-permission problem that has long affected machine identities, but now the actor can vary its sequence of actions on the fly.

Practical implication: scope every agent tool to the minimum callable function set and require runtime approval for high-impact actions.

Uncontrolled browsing and indirect prompt injection

Browsing agents ingest text, HTML, and sometimes image metadata into model context, which means hostile content can act like a hidden instruction rather than a visible payload. This is indirect prompt injection: the attacker does not need to control the agent directly, only the page it reads. That matters because retrieval is no longer passive. When a browsing agent interprets third-party content as trusted context, it can be steered into republishing links, exposing data, or chaining into additional tools. The attack surface includes pages, comments, alt text, SVG metadata, and other content that traditional web filters often ignore.

Practical implication: validate retrieved content before it reaches model context and strip or sandbox instruction-like material.

Runtime guardrails vs static policy controls

Static policy is too early in the lifecycle to manage agentic risk on its own. By the time an agent is live, the relevant question is whether each prompt, retrieval, and outbound action is screened at runtime. That is where guardrails matter: they can block unsafe tool invocation, detect prompt injection, and surface policy violations while the session is active. For identity teams, the important shift is conceptual. The control point is no longer only provisioning or configuration. It is the execution moment, where the agent decides what to do next and whether the action is still safe.

Practical implication: pair pre-approval with runtime inspection for prompts, tool calls, and outbound side effects.

Threat narrative

Attacker objective: The attacker aims to make the agent execute, disclose, or propagate actions that appear legitimate to the user but serve the attacker’s intent.

Entry occurs when the agent retrieves hostile web content or interacts with a third-party MCP source that contains hidden instructions or poisoned markup.
Escalation occurs when over-privileged tools let the agent act on that content, including sending messages, executing code, or following a malicious link path.
Impact occurs when the agent republishes harmful content, leaks data, or triggers unintended actions through connected tools and browser permissions.

Moltbook AI agent keys breach — Moltbook breach exposed 1.5M AI agent keys.
AI LLM hijack breach — attackers used stolen AWS access keys to hijack Anthropic LLM models on Bedrock.

Read our 52 NHI Breaches Analysis report for a comprehensive view of breaches impacting Non-Human Identities including AI Agents.

NHI Mgmt Group analysis

Every new agent capability expands identity risk, not just feature risk. When an AI system can browse, invoke tools, and act without a human in the loop, the security model changes from prompt filtering to runtime governance. That is a non-human identity problem first, and an AI safety problem second. The practical conclusion is that capability growth must be treated as privilege growth, because the attack surface moves with the tool chain.

Over-privileged tools create identity blast radius in agentic systems. The moment an agent can send email, run code, or fetch data under broad permissions, a single poisoned instruction can become a multi-step incident. OWASP-NHI framing still applies because the exposed pattern is excessive agency through machine access. Practitioners should read this as a blast-radius issue, where each additional tool widens the range of unintended outcomes.

Indirect prompt injection is really retrieval trust collapse. The agent does not need a compromised model to fail; it only needs to trust hostile web content as if it were safe context. That shifts the control problem from model output moderation to content provenance, sanitisation, and source segregation. The implication is that browsing agents need a content trust layer, not just a prompt policy.

Runtime guardrails are the control plane that makes autonomy governable. Static allowlists and pre-launch reviews cannot see the final action a browsing agent will choose when the session is already live. This is why runtime inspection, action gating, and continuous red-teaming belong inside the operating model, not beside it. Practitioners should treat the execution moment as the real security boundary.

Agentic AI governance now sits at the intersection of NHI, IAM, and browser security. The field has moved beyond protecting isolated secrets or isolated prompts. The hard problem is controlling a dynamic identity that can discover, retrieve, and execute across boundaries. Teams that keep these domains separate will miss the combined failure mode, which is why governance must follow the behaviour, not the label.

From our research:
80% of organisations report their AI agents have already performed actions beyond their intended scope, including accessing unauthorised systems (39%), inappropriately sharing sensitive data (31%), and revealing access credentials (23%), according to AI Agents: The New Attack Surface report.
Only 52% of companies can track and audit the data their AI agents access, leaving 48% with a complete blind spot for compliance and breach investigation, according to SailPoint.
The next governance step is to pair agent access visibility with runtime enforcement, as analysed in OWASP Agentic AI Top 10.

What this signals

Agentic control now has to operate at the moment of execution. If an agent can browse, retrieve, and act in one session, then governance based only on pre-approval will miss the real failure point. Teams should assume that any trusted retrieval source can become a control-input channel unless runtime inspection is built into the workflow.

Runtime guardrails are becoming the practical boundary for autonomous behaviour. The shift is not just technical. It changes how IAM, NHI, and security architecture teams define acceptable access, because the same agent may be both a reader and an actor. For practitioners, that means access reviews need to cover tool scope, retrieval trust, and outbound actions together.

With 92% of organisations saying governing AI agents is critical to enterprise security but only 44% having implemented any policies, the gap is no longer awareness but execution. That is why the governance model has to move toward continuous checks, backed by references such as the NIST AI Risk Management Framework and the OWASP Top 10 for Agentic Applications 2026.

For practitioners

Restrict agent tool scopes to the minimum viable set Remove broad tool permissions from production agents and separate read, write, and execute capabilities so the agent cannot chain high-impact actions without a new control point.
Validate retrieved content before model ingestion Treat web pages, images, comments, and MCP-fed content as untrusted input and filter instruction-like material before it enters the agent context.
Add runtime approval for outbound side effects Require a final check before the agent sends email, publishes links, triggers payments, or starts code execution, because these are the points where compromise becomes visible.
Red-team browsing and tool chains continuously Test poisoned pages, hidden metadata, and chained tool calls under realistic agent permissions, then re-run the exercise whenever tool scope or data sources change. Use the OWASP Agentic AI Top 10 as a threat-modelling baseline and pair it with the MITRE ATLAS adversarial AI threat matrix.

Key takeaways

Agentic AI becomes a security problem when browsing and tool use turn dynamic content into runtime action.
The scale of the issue is already visible: most organisations report agent behaviour beyond intended scope, including data exposure and unauthorised access.
Practitioners need runtime guardrails, scoped tool access, and content validation to keep autonomy from becoming an uncontrolled execution path.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 address the attack and risk surface, while NIST AI RMF and NIST CSF 2.0 set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	LLM06:2025	Covers excessive agency and tool misuse in agentic workflows.
NIST AI RMF		Applies to governance and monitoring of autonomous AI behaviour.
NIST CSF 2.0	PR.AC-4	Access management is central when agent tools and browsing expand privileges.

Map agent permissions to excessive-agency controls and remove any tool the agent does not strictly need.

Key terms

Agentic AI: Software that can select actions, use tools, and continue execution with limited or no human direction. In security terms, the important question is not whether it is intelligent, but whether it can act across trust boundaries in ways that change access, data flow, or system state.
Indirect Prompt Injection: A hidden instruction placed inside content the model reads, such as web pages, documents, images, or metadata. The attack succeeds when the agent treats that content as trusted context and follows it during reasoning or tool execution.
Runtime Guardrail: A live control that screens prompts, retrieved content, and actions while an agent is operating. Unlike configuration-only policy, runtime guardrails can stop unsafe behaviour at the moment an action is about to be taken, which is where agentic risk becomes real.
Tool Scope: The specific set of functions, APIs, and side effects an agent is allowed to invoke. Tool scope is a security boundary, not a convenience setting. If it is too broad, the agent can chain actions into outcomes that were never intended at design time.

What's in the full article

Lakera's full article covers the operational detail this post intentionally leaves for the source:

Real-world examples of over-privileged tool chains in agentic workflows and how they were abused in practice
Step-by-step explanation of indirect prompt injection through web content, image metadata, and retrieval paths
Detailed runtime guardrail examples for MCP-based and browsing-enabled agents
Reference points to Lakera's own red-teaming and benchmark material for testing agent behaviour

👉 The full Lakera article covers runtime guardrails, agent breaker examples, and the browser-based attack paths in detail.

Deepen your knowledge

NHI governance, agentic AI identity, and machine identity security are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are responsible for identity security strategy or programme maturity, it is worth exploring.

NHIMG Editorial Note
Published by the NHIMG editorial team on 2025-11-12.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org