AI agent governance needs inline protection against rogue behavior

By NHI Mgmt Group Editorial TeamPublished 2025-09-08Domain: Agentic AI & NHIsSource: Zenity

TL;DR: As Microsoft reports more than 230,000 organisations, including 90% of the Fortune 500, using Copilot Studio to build custom agents, Zenity’s analysis shows why unauthenticated inputs, tool chaining, and indirect prompt injection can turn workflow automation into data exposure risk. Inline, behavior-driven controls are becoming the governance baseline for AI agents.

At a glance

What this is: Zenity’s analysis argues that AI agents need inline protection because indirect prompt injection and risky tool use can trigger data exposure without credential theft or firewall bypass.

Why it matters: For IAM, PAM, and identity teams, the lesson is that agent governance must cover runtime behaviour, tool access, and trigger handling, not just provisioning and authentication.

By the numbers:

While 71% of IT teams have been advised on AI agent data access, only 47% of compliance teams, 39% of legal teams, and 34% of executives have the same visibility.
96% of technology professionals identify AI agents as a growing security threat, and 66% believe this risk is immediate.

👉 Read Zenity's analysis of rogue AI agents and inline protection in Copilot Studio

Context

AI agent governance is no longer a theoretical concern. When agents can connect to email, CRM systems, MCP servers, and other business tools, a single crafted input can trigger actions that expose sensitive data without any credential theft or authentication failure.

The control gap is not just about who can build an agent. It is about whether organisations can govern how the agent behaves at runtime, what it can invoke, and whether untrusted input can be turned into unauthorised action across identity and access pathways.

Key questions

Q: How should security teams prevent AI agents from acting on malicious input?

A: Security teams should treat every external prompt, email, ticket, or chat message as untrusted input until it is validated against policy. The strongest control is runtime enforcement at the point of tool invocation, where the system can block risky actions before they reach CRM, email, or other sensitive tools.

Q: Why do AI agents complicate identity governance more than standard automation?

A: AI agents complicate identity governance because they do not just execute scripted steps. They can interpret context, select tools, and trigger downstream actions at runtime, which means entitlement review alone does not capture the real risk. Governance must include behaviour, trigger paths, and delegated tool use.

Q: What do teams get wrong about public agent workflows?

A: Teams often assume a public workflow is safe if it has no login or human approval step. In practice, public exposure can make malicious prompting easier, especially when the agent is connected to business systems and can perform privileged actions on behalf of the workflow owner.

Q: How can organisations tell whether AI agent controls are working?

A: Controls are working when teams can identify every agent, every trigger, every connected tool, and every identity the agent can use, then prevent unauthorised actions in real time. If those relationships are not visible, the organisation is already operating with a governance blind spot.

Technical breakdown

Indirect prompt injection as an access path

Indirect prompt injection happens when attacker-controlled content reaches an agent through a channel the system trusts, such as email, tickets, chat, or document text. The agent then treats that content as instruction, not as data. In integrated environments, the exploit works because the agent can carry untrusted intent into privileged tool calls. The danger increases when the agent has access to CRM records, email systems, or other business applications that hold sensitive data. The issue is not model accuracy alone. It is the collapse of the boundary between user content and executable instruction.

Practical implication: restrict which input channels can trigger high-impact tool actions and treat all external content as untrusted by default.

Why tool invocation is the real control point

In agentic systems, the critical decision is often not what the model says, but what it is allowed to do next. Tool invocation connects reasoning to action, so a misconfigured agent can read data, send messages, or call other systems with very little friction. If the agent can chain tools, the blast radius grows fast because one unsafe decision can cascade into multiple systems. This is why inline inspection matters. Security teams need to evaluate intent, context, and destination before a tool call executes, not after logs are written.

Practical implication: place policy enforcement at the tool boundary and block high-risk actions before the invocation completes.

Agent visibility must cover build, trigger, and behaviour

Copilot-style agent platforms lower the barrier to creation, which means governance cannot assume central engineering ownership. A useful control model needs to know who built the agent, what it connects to, which identity it runs under, and which triggers can activate it. Without that inventory, security teams cannot judge whether a public flow, unauthenticated chat input, or delegated connection is acceptable. This is a lifecycle problem as much as a runtime problem. Discovery, approval, and monitoring must span the full agent lifecycle rather than stop at deployment.

Practical implication: maintain an agent inventory with owner, trigger, identity, and tool mappings before allowing production use.

Threat narrative

Attacker objective: The attacker aims to make the agent perform the malicious action for them, turning trusted automation into data leakage or unauthorised system access.

Entry occurs through indirect prompt injection, where a malicious message or other untrusted input reaches the agent through an approved channel such as email or a public flow.
Escalation happens when the agent treats that content as instruction and invokes connected tools with the permissions already attached to the session or workflow.
Impact follows when the agent exposes sensitive CRM or business data, or performs other unauthorised actions, without a credential theft event or perimeter breach.

Moltbook AI agent keys breach — Moltbook breach exposed 1.5M AI agent keys.
Cisco Active Directory credentials breach — Kraken ransomware group leaked Cisco Active Directory credentials.

Read our 52 NHI Breaches Analysis report for a comprehensive view of breaches impacting Non-Human Identities including AI Agents.

NHI Mgmt Group analysis

Inline protection is becoming the governance boundary for AI agents: traditional IAM controls stop at authentication and entitlement assignment, but agent risk unfolds after access is already granted. Once an agent can interpret untrusted input and invoke tools autonomously, the decisive question becomes whether the action is blocked at runtime. That shifts security from perimeter trust to behaviour control, which is now the relevant governance layer for enterprise agents.

Trust in the trigger is the weak assumption: agents were designed for workflow efficiency, not for safely interpreting attacker-controlled content. The assumption that an input channel is benign fails when a public message, support email, or chat prompt can be converted into privileged action. The implication is not simply that controls are missing, but that the trust model for agent activation is already broken.

Agent lifecycle governance now spans build, configuration, and execution: democratized agent creation means business users can connect sensitive systems without the security team seeing the full dependency map. That creates misconfiguration risk, overbroad permissions, and hidden trigger paths that are difficult to audit after the fact. The practitioner lesson is that ownership and approval cannot begin at deployment; they must begin at creation.

AI agents sit at the intersection of NHI and autonomous behaviour: even where the platform looks like ordinary machine identity, the moment the system can choose actions and tool sequences at runtime, standard NHI controls become insufficient on their own. That is why agent governance must connect identity, policy, and execution monitoring in one model. Practitioners should treat AI agents as identities whose behaviour must be governed, not just secured.

From our research:
While 71% of IT teams have been advised on AI agent data access, only 47% of compliance teams, 39% of legal teams, and 34% of executives have the same visibility, according to AI Agents: The New Attack Surface report.
From our research: 98% of companies plan to deploy even more AI agents within the next 12 months, despite documented rogue behaviour in 80% of current deployments, according to AI Agents: The New Attack Surface report.
From our research: Read OWASP Agentic AI Top 10 for the control patterns that map to prompt injection, tool misuse, and agent scope drift.

What this signals

Inline enforcement is becoming the practical control plane for agent governance: when agents can be triggered by untrusted content and then reach into business systems, post-event logging is not enough. Security teams should expect runtime policy, connector-level controls, and owner accountability to become the baseline for production use, especially where agent behavior crosses email, CRM, and shared workflow tools.

Behavioural controls now sit beside identity controls: the platform may authenticate the agent, but governance succeeds only if the organisation can also prove who created it, what it can invoke, and how far its action set can expand during a session. That is why agent inventories, approval workflows, and policy enforcement need to be connected rather than managed as separate silos.

With 80% of organisations reporting agent actions beyond intended scope, the governance gap is no longer about experimentation. It is about whether production agents can be prevented from turning routine messages into high-impact system actions.

For practitioners

Classify every agent trigger path Map whether the agent can be activated by public flows, email, chat, or other untrusted inputs, then disable any path that can reach sensitive tools without review. Treat trigger exposure as an access decision, not just a workflow setting.
Enforce runtime policy at tool invocation Block or step up high-risk actions before the tool call executes, especially for CRM writes, email sends, and cross-system reads. The control point is the invocation boundary, where intent can still be evaluated.
Build an agent inventory with real ownership Record who created each agent, which identities and connectors it uses, and what data it can reach. Require that inventory before production use so security teams can identify hidden privilege and shadow AI.
Separate trusted data from executable instructions Filter or sanitize external content before it reaches the agent’s reasoning loop, especially if the content comes from customers, partners, or anonymous users. The goal is to prevent untrusted text from becoming policy-breaking behaviour.
Monitor for tool chaining and scope drift Track whether an agent moves from a narrow task into broader reads, writes, or delegations during a session. Escalate when tool use expands beyond the original business purpose or owner-approved scope.

Key takeaways

AI agents create a new governance problem when untrusted content can be converted into privileged action through tool invocation.
The article’s evidence shows that public flows, unauthenticated chat inputs, and connected business systems can expose sensitive data without any credential theft.
Practitioners need runtime policy enforcement, trigger-path control, and agent inventory discipline before broad deployment becomes a security liability.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and OWASP Non-Human Identity Top 10 address the attack and risk surface, while NIST AI RMF set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10		Agent tool misuse and prompt injection are central to the article.
OWASP Non-Human Identity Top 10	NHI-03	Connected agent identities and privileged access create NHI-style exposure.
NIST AI RMF		Runtime governance and accountability for AI systems fit AI RMF GOVERN.

Apply agent policy controls at the tool boundary and block untrusted instructions from becoming actions.

Key terms

Indirect Prompt Injection: A malicious instruction hidden inside data that an agent is expected to process, such as email, chat, or documents. The attack works when the system mistakes content for command, allowing untrusted text to influence tool use, data access, or downstream actions without a traditional login event.
Tool Invocation: The moment an agent calls an external system or function to take action, such as reading records, sending messages, or updating a workflow. In autonomous and semi-autonomous environments, this is where intent becomes execution and where policy enforcement has to happen to be effective.
Agent Inventory: A governed record of every AI agent in the environment, including owner, trigger paths, connected tools, identity bindings, and data access. It is the minimum visibility layer needed to evaluate whether the organisation can safely approve, monitor, and retire agents over time.
Scope Drift: The expansion of an agent’s actions beyond the task it was meant to perform. When scope drift occurs, the agent may begin reading, writing, or delegating more broadly than intended, which raises the blast radius of a single misconfiguration or malicious prompt.

Deepen your knowledge

NHI governance, agentic AI identity, and machine identity lifecycle are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are responsible for identity security strategy or NHI governance in your organisation, it is worth exploring.

This post draws on content published by Zenity: Preventing AI Agents from Going Rogue with inline protection in Microsoft Copilot Studio. Read the original.

NHIMG Editorial Note
Published by the NHIMG editorial team on 2025-09-08.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org