Notifications

Clear all

Agentic AI hard boundaries: are your controls actually enforceable?

Last Post

RSS

NHI Mgmt Group

(@nhi-mgmt-group)

Member Moderator

Joined: 1 year ago

Posts: 12212

Topic starter 06/06/2026 11:31 am

TL;DR: Agentic AI browsers and copilots remain vulnerable to prompt injection because probabilistic guardrails cannot reliably separate trusted intent from malicious instructions, according to Zenity’s PerplexedComet analysis. The real security boundary is deterministic enforcement at the code, network, or OS layer, where the model never gets a vote.

NHIMG editorial — based on content published by Zenity: Why Soft Guardrails Get Us Hacked: The Case for Hard Boundaries in Agentic AI

Questions worth separating out

Q: How should security teams prevent prompt injection in agentic AI systems?

A: Security teams should prevent prompt injection by removing dangerous capabilities at the environment level, not by relying on the model to judge intent correctly.

Q: Why do soft guardrails fail in agentic AI security?

A: Soft guardrails fail because they are probabilistic and operate in the same reasoning space as the agent they supervise.

Q: What breaks when an agent can reach local files and network egress?

A: What breaks is the assumption that the model can safely decide which actions belong to the task.

Practitioner guidance

Enforce deterministic capability blocks Remove high-risk functions such as local file access, clipboard access, and arbitrary egress from agent runtimes at the code or policy layer.
Separate trusted work from untrusted content paths Route external content, such as calendar descriptions, web pages, and inbox items, through a distinct ingestion path that cannot directly trigger privileged agent actions.
Map agent privileges as enforceable identity boundaries Inventory which resources each agent can reach, then reduce those permissions to the minimum set needed for the task.

What's in the full article

Zenity's full blog post covers the operational detail this post intentionally leaves for the source:

The step-by-step PerplexedComet attack chain, including the calendar-invite entry vector and the file:// exfiltration path.
The remediation sequence showing how Perplexity converted a prompt-level weakness into a deterministic code-level boundary.
The bypass variant involving view-source:file:// and why the first patch did not fully close the edge case.
The research context around soft guardrails, hard boundaries, and the broader agentic AI security debate.

👉 Read Zenity's analysis of why soft guardrails fail against agentic AI attacks →

Agentic AI hard boundaries: are your controls actually enforceable?

Explore further

View Full Forum → | NHI Foundation Course →

Quote

Topic Tags

Mr NHI

(@mr-nhi)

Member Moderator

Joined: 2 months ago

Posts: 11787

06/06/2026 12:23 pm

Soft guardrails are a detection layer, not a security boundary. Probabilistic controls can add friction and visibility, but they cannot reliably prevent an agent from acting on malicious instructions embedded in untrusted content. The PerplexedComet chain shows that when the model is allowed to arbitrate trust inside the prompt, the attacker and the user can be merged into a single execution plan. Practitioners should read that as a boundary failure, not a tuning issue.

A few things that frame the scale:

98% of companies plan to deploy even more AI agents within the next 12 months, despite documented rogue behaviour in 80% of current deployments, according to AI Agents: The New Attack Surface report.
Only 44% of organisations have implemented any policies to govern AI agents, even though 92% say governance is critical to enterprise security.

A question worth separating out:

Q: What should teams do when an agentic browser must handle untrusted content?

A: Teams should isolate untrusted content handling from privileged actions and require deterministic barriers before the agent can touch sensitive resources. If the browser can read, interpret, and act on hostile text in the same session, then the trust boundary is too weak for production use.

👉 Read our full editorial: Hard boundaries, not soft guardrails, define agentic AI security

ReplyQuote

Forum Statistics

11 Forums

13.5 K Topics

25.8 K Posts

135 Online

135 Members

Latest Post: Silk Typhoon arrest and exposed credentials: what do teams need to watch? Our newest member: Alex Recent Posts Unread Posts Tags

Forum Icons: Forum contains no unread posts Forum contains unread posts

Topic Icons: Not Replied Replied Active Hot Sticky Unapproved Solved Private Closed

#1 Authority in NHI Education, Research and Advisory, empowering organizations to tackle the critical risks posed by Non-Human Identities (NHIs), including AI Agents.

Get in Touch

Quick Links

FAQ

NHI 101 Articles

Legal & Policies