By NHI Mgmt Group Editorial Team
Published 2026-05-28
Domain: Agentic AI & NHIs
Source: WitnessAI

TL;DR: Enterprise AI use is expanding faster than organisations can prove business value, while governance friction, shadow AI, and weak measurement keep pilots from scaling, according to WitnessAI. The underlying problem is that legacy approval and audit models were built for deterministic software, not for runtime AI activity that needs continuous control and evidence.


At a glance

What this is: This analysis argues that enterprise AI adoption is outpacing governance, visibility, and ROI measurement, leaving many organisations with widespread usage but limited proof of business impact.

Why it matters: IAM, security, and compliance teams need to treat AI activity as a governed identity and access problem because unmeasured usage, weak enforcement, and poor auditability affect both human and machine-controlled workflows.

👉 Read WitnessAI's analysis of the AI adoption-impact gap and runtime AI governance


Context

The AI adoption-impact gap is the disconnect between broad AI deployment and weak evidence of measurable business return. In practice, that gap appears when organisations approve tools faster than they can instrument usage, define controls, or prove which AI workflows improve outcomes.

For identity and access teams, this is not just an AI operations problem. It is a governance problem spanning human users, non-human identity patterns, and agent-like systems that can move data, trigger actions, and create audit obligations without fitting older approval models.

WitnessAI uses the gap to argue for operational AI risk management, but the deeper issue is structural: many enterprises still treat AI as a software purchase rather than as a governed runtime that needs visibility, enforcement, and evidence. That starting point is now typical across the Global 2000.


Key questions

Q: How should organisations govern AI usage when employees use unapproved tools?

A: Organisations should start with visibility, not enforcement. If teams cannot see which apps, agents, or workflows are being used, they cannot assess data exposure or apply meaningful controls. Once usage is mapped, policy can shift from blanket bans to context-based decisions that reflect sensitivity, role, and business purpose.

Q: Why do AI projects often fail to show measurable business value?

A: AI projects often fail because measurement is an afterthought. Many organisations can report spend and pilot counts, but not which interactions changed outcomes or reduced cycle time. Without telemetry at the prompt, model, and workflow level, leaders get anecdotes instead of defensible ROI evidence.

Q: What breaks when AI governance relies only on approval workflows?

A: Approval-only governance breaks when usage shifts outside sanctioned channels. Employees then move to shadow AI, and security teams lose visibility into data flows, model use, and policy violations. The result is slower formal adoption, more informal usage, and less confidence that controls match actual risk.

Q: Who should own accountability for runtime AI controls and audit trails?

A: Accountability should be shared across security, compliance, AI platform owners, and identity teams, with each function owning a distinct part of the control chain. Runtime AI controls are operational safeguards, while audit trails provide evidence that those safeguards actually worked during production use.


Technical breakdown

Why governance models break under AI adoption

Traditional governance assumes software behaves predictably, with known inputs, outputs, and approval points. AI systems are different because their outputs vary with context, and the same model can be used in multiple workflows by different users, apps, or agents. That makes static approval gates slow and often ineffective. The real technical issue is not whether a policy exists, but whether the organisation can inspect, classify, and control interactions at runtime across the full path of use.

Practical implication: move governance from one-time approval into runtime control that can evaluate actual AI use as it happens.
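
The shift from one-time approval to runtime control can be sketched as a decision made per interaction rather than per tool. The example below is a minimal illustration, not WitnessAI's method: the sensitivity tiers, role names, and the `evaluate` function are hypothetical, and purpose is left out for brevity.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    WARN = "warn"
    BLOCK = "block"
    ROUTE = "route"


# Hypothetical roles cleared to use restricted data with sanctioned tools.
RESTRICTED_ROLES = {"finance-analyst", "legal-counsel"}


@dataclass(frozen=True)
class Interaction:
    user_role: str
    tool: str
    tool_sanctioned: bool
    data_sensitivity: str  # hypothetical tiers: "public", "internal", "restricted"


def evaluate(interaction: Interaction) -> Action:
    """Decide at the moment of use, based on context rather than tool brand."""
    if not interaction.tool_sanctioned:
        if interaction.data_sensitivity == "restricted":
            return Action.BLOCK   # restricted data never reaches an unsanctioned tool
        if interaction.data_sensitivity == "internal":
            return Action.ROUTE   # redirect to an approved equivalent
        return Action.WARN        # public data: allow the work, but nudge the user
    if (interaction.data_sensitivity == "restricted"
            and interaction.user_role not in RESTRICTED_ROLES):
        return Action.BLOCK       # sanctioned tool, but the role is not cleared
    return Action.ALLOW
```

The same tool can produce different outcomes for different users and data, which is the property a static approval gate cannot express.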

Shadow AI visibility across apps, agents, and endpoints

Shadow AI persists when monitoring only covers browser traffic or a narrow set of sanctioned applications. AI usage now appears in desktop tools, IDEs, embedded copilots, and API-driven agent workflows, which means traditional logging leaves large blind spots. Network-level visibility matters because it can identify where AI interactions occur without requiring endpoint clients or browser extensions. Once usage is visible, organisations can inventory tools, users, and data flows instead of guessing where exposure is occurring.

Practical implication: build discovery for AI activity at the network layer so unapproved use becomes measurable and governable.
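
As a rough sketch of what network-layer discovery might produce, the example below turns flow records into an inventory of AI tools, users, and outbound data volume. The `Flow` record shape, the hostnames, and the `AI_CATALOGUE` mapping are assumptions made for illustration; a real deployment would draw on whatever proxy, DNS, or gateway logs are available.

```python
from typing import Iterable, NamedTuple


class Flow(NamedTuple):
    user: str
    dest_host: str
    bytes_out: int


# Hypothetical catalogue: destination host -> (tool name, sanctioned?)
AI_CATALOGUE = {
    "chat.vendor-a.example": ("Vendor A chat", True),
    "api.vendor-b.example": ("Vendor B API", False),
}


def inventory(flows: Iterable[Flow]) -> dict:
    """Group observed AI traffic by tool: who used it and how much data left."""
    result = {}
    for flow in flows:
        entry = AI_CATALOGUE.get(flow.dest_host)
        if entry is None:
            continue  # not a known AI destination
        tool, sanctioned = entry
        row = result.setdefault(
            tool, {"sanctioned": sanctioned, "users": set(), "bytes_out": 0}
        )
        row["users"].add(flow.user)
        row["bytes_out"] += flow.bytes_out
    return result


def unsanctioned(inv: dict) -> list:
    """Names of tools seen on the network that have no formal approval."""
    return sorted(tool for tool, row in inv.items() if not row["sanctioned"])
```

The output is deliberately an inventory rather than a verdict: it shows where exposure is occurring so that policy can be applied afterwards.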

Auditability and runtime AI controls

Auditability in AI means more than storing a policy document or an approval record. It requires capturing prompts, responses, policy decisions, and the context needed to explain why an action was allowed, warned, blocked, or routed. Runtime controls are especially important because the same interaction can contain prompt injection, sensitive data exposure, or policy violations that only appear during execution. This is where AI risk management becomes operational rather than theoretical.

Practical implication: maintain defensible interaction-level evidence so boards and regulators can verify that controls were enforced, not merely designed.
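
One way to picture interaction-level evidence is an append-only log in which each record stores the prompt, response, decision, and reason, and also commits to the previous record's hash. The hash chain is an assumption added here to make tampering detectable; the source does not prescribe it, and the field names and the `append_record` and `verify` helpers are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone


def _digest(payload: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the record body."""
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_record(log: list, *, actor: str, tool: str, prompt: str,
                  response: str, decision: str, reason: str) -> dict:
    """Append one interaction record, linked to the previous record's hash."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "actor": actor,
        "tool": tool,
        "prompt": prompt,
        "response": response,
        "decision": decision,   # e.g. allow, warn, block, route
        "reason": reason,       # the context that explains the decision
        "prev_hash": log[-1]["hash"] if log else None,
    }
    record["hash"] = _digest(record)  # computed before the "hash" key exists
    log.append(record)
    return record


def verify(log: list) -> bool:
    """Return True only if no record was altered, removed, or reordered."""
    prev = None
    for record in log:
        body = {k: v for k, v in record.items() if k != "hash"}
        if record["prev_hash"] != prev or _digest(body) != record["hash"]:
            return False
        prev = record["hash"]
    return True
```

In practice, stored prompts and responses may themselves contain sensitive data, so retention and redaction rules would need to be decided alongside a design like this.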




NHI Mgmt Group analysis

AI adoption without interaction-level governance is a control illusion: enterprises can count deployments, but they cannot reliably govern what they cannot observe at the point of use. This is why approval-heavy programmes stall while informal usage expands elsewhere. The implication is that security leaders should stop treating usage telemetry as a nice-to-have and start treating it as a prerequisite for control.

Shadow AI is a visibility failure before it is a policy failure: when employees route work through unapproved tools, the issue is not simply non-compliance. It is that the organisation lacks enough instrumentation to see where AI activity is happening, what data is involved, and which workflows are bypassing formal review. Practitioner conclusion: the first governance problem is discovery, not enforcement.

Runtime AI risk management is the discipline that connects value, evidence, and accountability: governance sets intent, but only runtime controls and audit trails can prove that AI is being used safely in production. That makes AI risk management structurally closer to operational access governance than to static policy management. Practitioner conclusion: boards should demand evidence of enforcement, not just policy maturity.

Measurement vacuum: organisations often know spend, but not which prompts, models, or workflows produced value. That breaks the assumption that AI investment can be managed like a conventional software roll-out, because operational proof is missing at the interaction layer. The implication is that finance, security, and IAM teams need shared measurement models before scale becomes defensible.

AI governance now crosses human and non-human identity boundaries: employees, copilots, embedded models, and agentic workflows all participate in the same decision chain. That means identity governance can no longer stop at the human login or the service account alone. Practitioner conclusion: the programme boundary has to expand to the full AI interaction path.

From our research:

  • The average estimated time to remediate a leaked secret is 27 days, despite 75% of organisations expressing strong confidence in their secrets management capabilities, according to The State of Secrets in AppSec.
  • Only 44% of developers are reported to follow security best practices for secrets management, exposing a significant developer behaviour gap.
  • That behavioural gap connects directly to the adoption-impact problem, which is why the Ultimate Guide to NHIs, 2025 Outlook and Predictions is the next step for teams planning AI governance at scale.

What this signals

AI governance will increasingly be judged by evidence quality, not policy volume: the organisations that can show prompt-level control, traceable enforcement, and clear ownership will be able to scale AI with far less friction than those relying on annual review cycles.

The practical shift is toward treating AI usage as an operational identity surface, with discovery, policy, and audit stitched together in the same control plane. That approach is especially important where human users, service accounts, and agentic workflows intersect.

Measurement vacuum: teams that cannot connect prompts to outcomes will keep arguing about AI value from anecdotes instead of evidence. That is a governance weakness, not just a reporting problem.
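
Connecting AI usage to outcomes can start with something as simple as comparing cycle time for AI-assisted and unassisted cases within the same workflow. The sketch below assumes each case is already tagged with a workflow name, an assisted flag, and a cycle time in hours; those inputs and the `cycle_time_by_workflow` function are hypothetical, and a difference in means is a prompt for investigation rather than proof of causation.

```python
from collections import defaultdict
from statistics import mean


def cycle_time_by_workflow(cases):
    """cases: iterable of (workflow, ai_assisted, cycle_hours) tuples."""
    buckets = defaultdict(lambda: {True: [], False: []})
    for workflow, ai_assisted, hours in cases:
        buckets[workflow][ai_assisted].append(hours)

    report = {}
    for workflow, groups in buckets.items():
        # Only report workflows where both groups exist; otherwise there is
        # no baseline to compare against.
        if groups[True] and groups[False]:
            assisted = mean(groups[True])
            baseline = mean(groups[False])
            report[workflow] = {
                "assisted_mean": assisted,
                "baseline_mean": baseline,
                "delta_hours": assisted - baseline,
            }
    return report
```

Workflows with no unassisted baseline are omitted on purpose, since reporting them would reproduce the anecdote problem the measurement is meant to fix.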


For practitioners

  • Instrument AI usage at the network layer: Map where AI activity actually occurs across browsers, desktop applications, IDEs, embedded copilots, and API-driven workflows. Use that inventory to identify unsanctioned tools, sensitive-data paths, and the systems that need policy coverage first.
  • Replace binary AI approval decisions with context-based controls: Use allow, warn, block, and route actions so teams can govern high-risk use without forcing a department-wide ban that drives shadow AI. Tie the control decision to data sensitivity, purpose, and user role rather than tool brand.
  • Capture interaction-level audit trails: Log prompts, responses, policy decisions, and the context needed to explain enforcement outcomes. Retain those records in a form that supports internal review, regulatory challenge, and board reporting.
  • Align AI governance, compliance, and risk ownership: Separate policy setting, compliance checking, and operational risk management so each function has a distinct role in production AI oversight. Use a shared control model to avoid gaps between design intent and real-world enforcement.
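
The ownership point in the last item can be made checkable. The sketch below treats the shared control model as a mapping from each control stage to the functions accountable for it, then reports stages with no owner or with more than one. The stage names in `CONTROL_CHAIN` and the `ownership_gaps` helper are hypothetical and would need to reflect an organisation's own control chain.

```python
# Hypothetical stages of a production AI control chain.
CONTROL_CHAIN = [
    "discovery",
    "policy_setting",
    "runtime_enforcement",
    "compliance_checking",
    "audit_evidence",
]


def ownership_gaps(owners: dict) -> dict:
    """owners maps a control stage to a list of accountable functions."""
    unowned = [c for c in CONTROL_CHAIN if not owners.get(c)]
    multiple = [c for c in CONTROL_CHAIN if len(owners.get(c, [])) > 1]
    return {"unowned": unowned, "multiple_owners": multiple}


# Example: enforcement has two accountable owners, audit evidence has none.
example = ownership_gaps({
    "discovery": ["security"],
    "policy_setting": ["governance"],
    "runtime_enforcement": ["security", "ai-platform"],
    "compliance_checking": ["compliance"],
})
```

Whether a stage with several owners is a problem depends on the organisation; the check only surfaces it so the decision is explicit.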

Key takeaways

  • The core risk is not AI adoption itself, but the gap between widespread usage and weak proof of control or value.
  • The scale of the problem is already visible in shadow AI usage, AI spending growth, and the small share of pilots that deliver measurable revenue impact.
  • The response is to treat AI risk management as a runtime discipline with discovery, enforcement, and audit evidence, not as a static policy layer.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

The OWASP Agentic AI Top 10 addresses the attack and risk surface, while NIST AI RMF and NIST CSF 2.0 set the governance and control requirements practitioners need to meet.

Framework | Control / Reference | Relevance
OWASP Agentic AI Top 10 | A1 | Runtime AI controls address misuse, prompt injection, and policy bypass in agentic workflows.
NIST AI RMF | GOVERN, MEASURE | AI RMF GOVERN and MEASURE functions fit the accountability and telemetry gap described here.
NIST CSF 2.0 | PR.AC-4 | Access control and governance are central to managing AI usage across users and systems.

Treat AI tools and agent workflows as governed access paths with documented control ownership.


Key terms

  • AI Adoption-Impact Gap: The gap between broad AI deployment and weak proof that it improves revenue, cost, or operational outcomes. It appears when organisations scale usage faster than they build the telemetry, governance, and control structure needed to connect activity to measurable business value.
  • Shadow AI: AI tools, agents, or workflows used without formal approval or visibility from security and governance teams. The term covers unsanctioned usage that may still be business-driven but sits outside policy, audit, and control boundaries, creating unmanaged data and access exposure.
  • Runtime AI Risk Management: The ongoing operational discipline of identifying, controlling, and evidencing AI risk while the system is in production. It focuses on live prompts, responses, policy outcomes, and audit trails rather than static policy documents or one-time approval decisions.
  • Interaction-Level Telemetry: Evidence captured at the point where a person or system interacts with an AI model, including prompts, responses, and control decisions. It lets teams explain what happened, why it was allowed or blocked, and whether governance actually worked in production.

Deepen your knowledge

AI adoption-impact governance and runtime control are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are building oversight for AI tools, agents, and delegated workflows, it is worth exploring.

This post draws on content published by WitnessAI: AI adoption, governance, and the operational gap between AI investment and measurable impact. Read the original.

NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org