By NHI Mgmt Group Editorial Team
Published 2026-03-20
Domain: Agentic AI & NHIs
Source: WitnessAI

TL;DR: Enterprises can see who uses ChatGPT or Copilot, but often cannot see what is sent, what returns, or how shadow AI and agent actions expose proprietary data, according to WitnessAI. Legacy DLP and CASB controls struggle because AI risk lives in conversational payloads and intent, not file transfers or keywords.


At a glance

What this is: A generative AI security guide showing why conversation-level visibility, intent-aware policy, and runtime protection are now required for enterprise AI use.

Why it matters: IAM, security, and governance teams must control prompts, responses, and agent actions across human and digital workforces, not just monitor application access.

By the numbers:

  • WitnessAI’s Observe module provides network-level coverage across 4,000+ AI applications, with bidirectional capture of prompts and responses and continuous discovery, all without endpoint agents or browser extensions.



Context

Generative AI security is no longer just about who can sign in to a tool. The primary problem is that enterprise teams often lack visibility into prompts, responses, and the downstream data exposure that happens inside conversational workflows, especially when shadow AI sits outside approved inventory.

That gap matters because AI risk is semantic and runtime-driven. Traditional IAM, DLP, and CASB controls were built around files, sessions, and known applications, while modern AI use now spans chat interfaces, desktop apps, developer tools, and agentic workflows that can act on behalf of users.


Key questions

Q: How should security teams govern shadow AI without blocking all adoption?

A: Start by discovering where shadow AI is already in use, then classify interactions by data sensitivity and business purpose. Apply bidirectional inspection and intent-based policy so employees can use approved tools for legitimate work while risky prompts, outputs, and data flows are constrained before exposure occurs.

Q: Why do DLP and CASB tools struggle with generative AI security?

A: They were built for files, SaaS events, and known transfer patterns, not conversational payloads or agent actions. When employees paste data into chat tools or agents call APIs, the control point shifts to the interaction itself, and legacy tools often cannot see or interpret that movement.

Q: What should organisations do when AI tools can take actions on behalf of users?

A: Treat agent actions as governed identity events and require logging of tool calls, delegation chains, and execution context. Without that layer, investigations cannot tell whether an action came from the human operator, the agent, or a downstream delegated workflow.

Q: How can teams tell whether AI policy is actually working?

A: Measure whether prompts, responses, and tool calls are visible across sanctioned and unsanctioned workflows, then test whether intent-based rules distinguish legitimate use from risky disclosure. If the programme only catches obvious keywords or browser traffic, it is not controlling the real AI risk surface.
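One way to make that measurement concrete is to compute the share of observed interactions where both the prompt and the response were captured, split by sanctioned and unsanctioned use. The sketch below is illustrative only: the `ObservedInteraction` record and the `bidirectional_coverage` function are hypothetical names, and the sample records are invented for the example rather than drawn from any product or dataset.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ObservedInteraction:
    """One AI interaction as seen by monitoring (hypothetical record shape)."""
    sanctioned: bool         # True if the tool is in the approved inventory
    prompt_captured: bool    # True if the outbound prompt was inspected
    response_captured: bool  # True if the returned response was inspected


def bidirectional_coverage(
    interactions: Iterable[ObservedInteraction], sanctioned: bool
) -> Optional[float]:
    """Fraction of interactions in one group with both directions captured.

    Returns None when the group is empty, so "no data" is not mistaken
    for "zero coverage".
    """
    subset = [i for i in interactions if i.sanctioned == sanctioned]
    if not subset:
        return None
    fully_seen = sum(1 for i in subset if i.prompt_captured and i.response_captured)
    return fully_seen / len(subset)


# Invented sample data for illustration.
sample = [
    ObservedInteraction(sanctioned=True, prompt_captured=True, response_captured=True),
    ObservedInteraction(sanctioned=True, prompt_captured=True, response_captured=False),
    ObservedInteraction(sanctioned=False, prompt_captured=False, response_captured=False),
    ObservedInteraction(sanctioned=False, prompt_captured=True, response_captured=False),
]

sanctioned_coverage = bidirectional_coverage(sample, sanctioned=True)
shadow_coverage = bidirectional_coverage(sample, sanctioned=False)
```

A large gap between the two figures suggests the programme is governing approved tools while shadow AI remains largely unobserved.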


Technical breakdown

Why DLP and CASB miss AI conversation flows

Traditional DLP and CASB tools were designed for file-centric events, SaaS access, and predictable transfer patterns. Generative AI changes the data path because prompts and responses are conversational payloads that may never become files, never trigger a keyword, and sometimes never leave an encrypted session in a form legacy inspection can interpret. Certificate pinning can make inspection even harder. The result is a visibility gap at the point where sensitive data is most likely to be typed, transformed, or echoed back in a response.

Practical implication: teams need controls that inspect AI interactions directly, not just perimeter traffic and file events.
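To illustrate how a conversational payload can slip past a keyword rule, here is a minimal sketch. The rule, the function name `keyword_flags`, and both prompts are made up for the example and are not taken from any specific DLP product.

```python
import re

# A deliberately simple keyword rule of the kind a legacy control might apply.
KEYWORD_RULE = re.compile(r"\b(confidential|secret|password)\b", re.IGNORECASE)


def keyword_flags(prompt: str) -> bool:
    """Return True if the prompt contains any of the trigger words."""
    return bool(KEYWORD_RULE.search(prompt))


# Benign question that happens to contain a trigger word.
benign_prompt = "How do I reset my password on the intranet?"

# Sensitive content pasted into a chat with no trigger word at all.
risky_prompt = "Summarise these unreleased Q3 revenue figures: 41.2M, 38.7M, 44.9M"

benign_flagged = keyword_flags(benign_prompt)  # flagged although harmless
risky_flagged = keyword_flags(risky_prompt)    # not flagged although sensitive
```

The benign prompt is flagged and the risky one is not, which is the mismatch described above: the rule sees words, while the exposure lies in what the data is and why it is being shared.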

How intent-based policy enforcement differs from keyword detection

Intent-based enforcement classifies why an AI interaction is happening, not just which words appear in the prompt. That matters because risky use can be phrased indirectly, paraphrased, or hidden inside legitimate business language. Keyword systems produce false positives when prompts are benign and false negatives when employees share sensitive data without obvious trigger terms. Intent-aware policy allows security teams to distinguish summarisation, coding help, research, and exfiltration-like behaviour, then apply different responses rather than a blunt allow-or-block outcome.

Practical implication: route policy design around purpose and context, not keyword lists.
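A purpose-and-context policy can be sketched as a small decision function. This is an illustration under stated assumptions, not WitnessAI's implementation: the `Intent`, `Sensitivity`, and `Action` enums and the `decide` function are hypothetical, and the intent label is assumed to come from an upstream classifier that is out of scope here.

```python
from dataclasses import dataclass
from enum import Enum


class Intent(Enum):
    SUMMARISATION = "summarisation"
    CODING_HELP = "coding_help"
    RESEARCH = "research"
    EXFILTRATION_LIKE = "exfiltration_like"


class Sensitivity(Enum):
    PUBLIC = 1
    INTERNAL = 2
    RESTRICTED = 3


class Action(Enum):
    ALLOW = "allow"
    WARN = "warn"
    ROUTE = "route"  # for example, redirect to an approved internal model
    BLOCK = "block"


@dataclass(frozen=True)
class Interaction:
    intent: Intent            # assumed output of an upstream intent classifier
    sensitivity: Sensitivity  # assumed output of data classification


def decide(interaction: Interaction) -> Action:
    """Map purpose and data sensitivity to a graded response."""
    if interaction.intent is Intent.EXFILTRATION_LIKE:
        return Action.BLOCK
    if interaction.sensitivity is Sensitivity.RESTRICTED:
        return Action.ROUTE
    if interaction.sensitivity is Sensitivity.INTERNAL:
        return Action.WARN
    return Action.ALLOW


# Usage: the same intent yields different outcomes as sensitivity rises.
public_summary = decide(Interaction(Intent.SUMMARISATION, Sensitivity.PUBLIC))
restricted_summary = decide(Interaction(Intent.SUMMARISATION, Sensitivity.RESTRICTED))
```

The point of the design is the graded outcome: legitimate work proceeds, sometimes with a warning or rerouting, and only exfiltration-like behaviour is blocked outright.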

Agentic AI identity and tool-call protection

Agentic AI expands the problem from conversations to actions. When an agent can call tools, modify files, and execute API requests, the security question becomes identity-aware governance of runtime actions, not just content screening. The article highlights a core ambiguity: many controls cannot tell whether a given action came from the human who started the workflow or the agent that carried it out. In multi-agent chains, that attribution problem grows because delegation and re-execution obscure who initiated the act and who should be held accountable.

Practical implication: extend governance to tool calls, delegation paths, and agent identity attribution before agents are allowed to act.
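The attribution problem becomes easier to reason about if every tool call is logged with its full delegation chain. The sketch below shows one possible record shape; `ToolCallEvent`, its fields, and the identifiers in the usage example are hypothetical and do not follow any particular logging standard.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


def _utc_now() -> str:
    return datetime.now(timezone.utc).isoformat()


@dataclass(frozen=True)
class ToolCallEvent:
    """Audit record for one agent tool call (hypothetical schema)."""
    tool: str
    arguments: dict
    # Ordered from the identity that started the workflow to the one that acted.
    delegation_chain: tuple
    timestamp: str = field(default_factory=_utc_now)

    def __post_init__(self) -> None:
        if not self.delegation_chain:
            raise ValueError("delegation_chain must name at least one identity")

    @property
    def initiator(self) -> str:
        """Identity that started the workflow."""
        return self.delegation_chain[0]

    @property
    def executor(self) -> str:
        """Identity that actually performed the tool call."""
        return self.delegation_chain[-1]

    def to_json(self) -> str:
        record = asdict(self)
        record["initiator"] = self.initiator
        record["executor"] = self.executor
        return json.dumps(record, sort_keys=True)


# Usage: a human starts a workflow, a planner agent delegates to a writer agent.
event = ToolCallEvent(
    tool="crm.update_record",
    arguments={"record_id": 42},
    delegation_chain=("user:alice", "agent:planner", "agent:crm-writer"),
)
logged = json.loads(event.to_json())
```

Recording the initiator and executor separately lets an investigator distinguish the human who started the workflow from the agent that carried out the action.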




NHI Mgmt Group analysis

Conversation-level visibility is now the baseline control for generative AI. Security teams that can only see logins are governing the wrong layer. The meaningful risk sits in prompts, responses, and copy-paste workflows that bypass file-centric inspection entirely. For IAM and security programmes, that means the control objective is no longer application access alone but the content and intent moving through AI interactions.

Legacy DLP assumptions break when the data path is semantic. DLP was built for known files and predictable transfer events, while AI use routinely happens in browser, desktop, and IDE contexts where no file boundary exists. That exposes a governance gap, not just a tooling gap. Practitioners should treat semantic data movement as a distinct risk class in NIST CSF-aligned protection and detection planning.

Keyword-only enforcement leaves a named governance gap: semantic policy blind spots. These arise when security rules cannot tell legitimate assistance from risky disclosure or prompt manipulation because the system only sees text, not purpose. The gap becomes more pronounced as teams allow employees to use multiple models and workflows. The implication is that AI policy must be driven by use case, context, and data sensitivity, not static allow lists.

Agent identity turns AI governance into an accountability problem as much as a data problem. Once an AI system can take actions, the question shifts from what was typed to who or what executed the action and under which delegation chain. Without that identity layer, audit logs lose fidelity and incident response cannot reconstruct causality. The field needs governance models that treat agent actions as first-class identity events, not just application telemetry.

From our research:

  • 79% of organisations have experienced secrets leaks, with 77% of these incidents resulting in tangible damage, according to Ultimate Guide to NHIs.
  • 96% of organisations store secrets outside of secrets managers in vulnerable locations including code, config files, and CI/CD tools.
  • Ultimate Guide to NHIs: Key Challenges and Risks shows why visibility gaps and unmanaged credentials keep recurring across identity programmes.

What this signals

Semantic policy blind spots will become a standard audit finding if AI governance stays focused on applications instead of interactions. Teams should expect more pressure to evidence what entered the model, what came back, and how outputs were constrained before they influenced downstream work.

With NHIs outnumbering human identities by 25x to 50x, the practical lesson is that identity programmes will need to govern both human prompts and machine actions as one continuous control surface. That shift favours runtime visibility over periodic review.

The next control boundary is not the chat UI but the point where an AI interaction becomes a decision, a data movement, or a delegated action. Security teams that do not build that boundary into policy, telemetry, and incident response will keep detecting after exposure instead of before it.


For practitioners

  • Inventory sanctioned and shadow AI usage: Build a continuously updated catalog of approved and unapproved AI tools, including desktop apps and IDE integrations that browser controls miss. Use that inventory to decide where bidirectional inspection is required and where policy exceptions still create exposure.
  • Shift policy from keywords to intent: Classify AI interactions by purpose such as summarisation, coding help, research, or potential exfiltration, then apply allow, warn, block, or route responses based on sensitivity and business context. Keep keyword rules only as a narrow backstop.
  • Extend governance to prompts, responses, and tool calls: Treat prompts and model outputs as governed data flows, not passive chat text. For agentic workflows, log tool calls, delegation paths, and action timing so investigators can reconstruct who triggered each action and what the agent actually did.
  • Map AI controls to NIST CSF functions: Align discovery, content inspection, response filtering, and incident handling to the NIST Cybersecurity Framework 2.0 so AI governance is embedded in identify, protect, detect, respond, and recover rather than isolated as a point solution.

Key takeaways

  • Generative AI security fails when teams can only see access logs and not prompts, responses, or agent actions.
  • Legacy DLP and CASB controls are mismatched to semantic data flow, which is why AI risk persists inside ordinary workflows.
  • Practitioners need bidirectional visibility, intent-based policy, and runtime defense at the interaction point to govern AI safely.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and OWASP Non-Human Identity Top 10 address the attack and risk surface, while NIST CSF 2.0 sets the governance and control requirements practitioners need to meet.

| Framework | Control / Reference | Relevance |
|---|---|---|
| OWASP Agentic AI Top 10 | | Agent tool use and runtime actions are central to the article's risk model. |
| OWASP Non-Human Identity Top 10 | NHI-03 | AI service accounts and model integrations create non-human identity governance exposure. |
| NIST CSF 2.0 | PR.DS-5 | The article focuses on protecting data in transit through AI interactions and outputs. |

Inventory AI-related NHIs and enforce lifecycle controls where credentials, tokens, or API keys are used.


Key terms

  • Shadow AI: Shadow AI is the use of generative AI tools, models, or agents outside approved governance and visibility. The security problem is not only unsanctioned software but unsanctioned data movement, where sensitive prompts, outputs, and actions occur without the controls that normally apply to enterprise systems.
  • Intent-based enforcement: Intent-based enforcement is a policy approach that evaluates why an AI interaction is happening, not just the words it contains. It is used to distinguish legitimate work from risky disclosure or manipulation when paraphrasing, conversational context, and semantic variation defeat keyword-based rules.
  • Agent identity: Agent identity is the governance layer that attributes actions performed by an AI system to the specific agent, workflow, or delegation path that executed them. In autonomous or multi-agent environments, it is essential for audit fidelity, incident reconstruction, and privilege boundary enforcement.
  • Bidirectional visibility: Bidirectional visibility means security teams can inspect both what goes into an AI system and what comes back out. For generative AI, that is the minimum required to understand sensitive input exposure, unsafe outputs, and the point at which an interaction turns into a data loss event.

Deepen your knowledge

Generative AI security, bidirectional visibility, and runtime policy enforcement are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are building controls for AI chatflows, desktop tools, and agent actions, it is worth exploring.

This post draws on content published by WitnessAI: generative AI security and runtime guardrails for enterprise AI. Read the original.

NHIMG Editorial Note
Published by the NHIMG editorial team on 2026-03-20.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org