What do organisations get wrong about AI monitoring?

Why This Matters for Security Teams

AI monitoring often fails because teams measure availability, latency, and error rates while missing whether the system is still making safe, policy-compliant decisions. That gap matters most when the model is not broken in an obvious way but is drifting, being steered, or producing outputs that remain technically valid but operationally risky. NIST’s Cybersecurity Framework 2.0 treats resilience as more than uptime, and NHIMG’s Top 10 NHI Issues shows how identity and behaviour failures often surface before classic infrastructure alerts.

The common mistake is assuming that if the model endpoint is healthy, the AI is healthy. Adversarial ML rarely announces itself with a crash. It shows up as repeated prompt-pattern abuse, subtle output shifts, policy bypass attempts, or a gradual increase in anomalous tool use. Security teams that only watch service telemetry miss the behavioural layer where the actual risk lives. As a result, many of them discover AI misuse only after downstream decisions have already been influenced, rather than through deliberate behavioural detection.

How It Works in Practice

Effective AI monitoring needs to combine infrastructure telemetry, identity telemetry, and output behaviour analysis. The goal is to detect when an AI system is acting outside its expected intent, not just when it is unreachable. That means watching for repeated prompt sequences, unusual retrieval patterns, tool invocation spikes, escalating privilege requests, and output drift across similar inputs. NHIMG’s NHI Lifecycle Management Guide is useful here because monitoring should follow the full identity lifecycle, from provisioning to revocation and review.

Practitioners should separate three layers:

Service health: uptime, errors, token throughput, latency, and rate limits.

Behavioural health: output consistency, policy compliance, hallucination trends, and anomalous refusals or over-permissive responses.

Security health: prompt injection signals, suspicious tool calls, data exfiltration patterns, and identity misuse.
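The three layers above can be sketched as a simple coverage check. This is a minimal illustration, not a reference implementation: the signal names and the `uncovered_layers` helper are hypothetical, chosen only to show how a team might spot a layer it is not collecting any telemetry for.

```python
from dataclasses import dataclass
from enum import Enum


class Layer(Enum):
    SERVICE = "service"
    BEHAVIOURAL = "behavioural"
    SECURITY = "security"


# Hypothetical signal names, used for illustration only.
SIGNAL_LAYERS = {
    "latency_ms": Layer.SERVICE,
    "error_rate": Layer.SERVICE,
    "token_throughput": Layer.SERVICE,
    "policy_violation": Layer.BEHAVIOURAL,
    "anomalous_refusal": Layer.BEHAVIOURAL,
    "prompt_injection_signal": Layer.SECURITY,
    "suspicious_tool_call": Layer.SECURITY,
}


@dataclass
class Signal:
    name: str
    value: float


def uncovered_layers(signals):
    """Return the layers for which no known signal is being collected."""
    seen = {SIGNAL_LAYERS[s.name] for s in signals if s.name in SIGNAL_LAYERS}
    # Enum iteration follows definition order, so the result is stable.
    return [layer for layer in Layer if layer not in seen]


# A team that only collects service telemetry has two blind layers.
collected = [Signal("latency_ms", 120.0), Signal("error_rate", 0.01)]
gaps = uncovered_layers(collected)
```

Here `gaps` lists the behavioural and security layers, which is the situation the article describes: the endpoint looks healthy while nothing is watching what the model actually does.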

This is where current guidance increasingly aligns with the Ultimate Guide to NHIs — Key Challenges and Risks: the system can remain online while its trustworthiness erodes. Security teams should define baseline behaviour for common tasks, then alert on deviation from that baseline rather than waiting for a service outage. For higher-risk deployments, that should be paired with policy evaluation at request time and human review for sensitive actions. These controls tend to break down when logs are fragmented across model gateways, vector stores, and downstream tools because no single team can reconstruct the full decision path.
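Alerting on deviation from a baseline can be sketched with a plain z-score. This is one simple option among many and rests on assumptions: the metric (tool calls per hour), the sample values, and the threshold of 3.0 are all illustrative, and a real deployment would likely need a more robust statistic for skewed or seasonal data.

```python
from statistics import mean, stdev


def deviation_score(baseline, observed):
    """Z-score of an observed value against a baseline sample."""
    if len(baseline) < 2:
        raise ValueError("need at least two baseline observations")
    spread = stdev(baseline)
    if spread == 0:
        # Flat baseline: any change is a deviation, no change is not.
        return 0.0 if observed == baseline[0] else float("inf")
    return (observed - mean(baseline)) / spread


def should_alert(baseline, observed, threshold=3.0):
    """Alert when the observed value is far from the baseline in either direction."""
    return abs(deviation_score(baseline, observed)) >= threshold


# Hypothetical hourly tool-call counts for one agent during normal operation.
tool_calls_per_hour = [10, 12, 11, 9, 10, 11, 12, 10]

spike = should_alert(tool_calls_per_hour, 40)    # well outside the baseline
normal = should_alert(tool_calls_per_hour, 11)   # within normal variation
```

The service stays online in both cases, so an uptime check would see no difference between them. Only the comparison against expected behaviour separates the spike from ordinary variation.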

Common Variations and Edge Cases

Tighter behavioural monitoring often increases noise and analyst workload, requiring organisations to balance early detection against alert fatigue. That tradeoff is especially visible in high-volume environments where legitimate variation is normal. Best practice is evolving, but there is no universal standard for how much drift is acceptable across different model classes, tasks, and business contexts.

Some environments need more than generic anomaly detection. Customer support agents, code assistants, and autonomous workflow agents have very different baselines, so a single scoring model usually underperforms. Edge cases also include retraining cycles, seasonal query shifts, and chained-agent workflows, where one agent’s output becomes another agent’s input. In those settings, an apparently minor anomaly can propagate quickly across systems. NHIMG’s DeepSeek breach is a reminder that exposed data and compromised context can amplify model risk long before a traditional incident is declared.
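The propagation risk in chained-agent workflows can be made concrete with a small graph traversal. The sketch below assumes the workflow is known as a list of (producer, consumer) pairs. The agent names are hypothetical, and the breadth-first search is a generic technique rather than anything prescribed by the frameworks cited here.

```python
from collections import deque


def downstream_agents(edges, source):
    """Return every agent reachable from `source`, in breadth-first order."""
    graph = {}
    for producer, consumer in edges:
        graph.setdefault(producer, []).append(consumer)

    seen = {source}
    order = []
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for nxt in graph.get(current, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


# Hypothetical workflow: each pair means "output of A becomes input of B".
workflow = [
    ("triage", "retriever"),
    ("retriever", "summariser"),
    ("summariser", "ticketing"),
    ("triage", "ticketing"),
]

# If the retriever shows an anomaly, these agents may have consumed its output.
affected = downstream_agents(workflow, "retriever")
```

An anomaly in `retriever` puts `summariser` and `ticketing` in scope for review, even if neither raised an alert of its own.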

Organisations that only monitor the model will still miss the broader control plane. Monitoring should include prompts, embeddings, tool access, identity use, and downstream actions, not just model accuracy dashboards. That is the practical difference between supervising an application and governing an AI system.
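One way to monitor the control plane rather than the model alone is to join records from each source on a shared trace identifier and check which stages are visible. The sketch below is hypothetical throughout: the `trace_id`, `ts`, and `stage` fields and the stage names are assumptions, and real gateways, vector stores, and tools will each log in their own formats.

```python
from collections import defaultdict

# Hypothetical stage names standing in for the sources listed above.
EXPECTED_STAGES = ("prompt", "retrieval", "tool_call", "identity", "action")


def decision_paths(records):
    """Group records by trace_id and order each group by timestamp."""
    paths = defaultdict(list)
    for record in records:
        paths[record["trace_id"]].append(record)
    for events in paths.values():
        events.sort(key=lambda r: r["ts"])
    return dict(paths)


def missing_stages(events):
    """Stages with no record in this trace: possible gaps in visibility."""
    present = {e["stage"] for e in events}
    return [s for s in EXPECTED_STAGES if s not in present]


# Records as they might arrive, out of order, from separate log sources.
records = [
    {"trace_id": "t1", "ts": 2, "stage": "tool_call"},
    {"trace_id": "t1", "ts": 1, "stage": "prompt"},
    {"trace_id": "t2", "ts": 1, "stage": "prompt"},
]

paths = decision_paths(records)
gaps_t1 = missing_stages(paths["t1"])
```

A missing stage does not always mean a logging gap, since some requests legitimately involve no retrieval or tool call. It does show where the decision path cannot be reconstructed from the available records.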

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10		Addresses behavioural drift, prompt abuse, and unsafe agent actions.
CSA MAESTRO		Covers monitoring and runtime controls for agentic AI systems.
NIST AI RMF		Supports ongoing measurement and governance of AI risk over time.

Track agent outputs, tool use, and drift signals as security events, not just service metrics.

