
What do security teams get wrong about LLM monitoring?

They often monitor for bad prompts or unsafe outputs without watching the actions the model attempts to take. The more important signals are reachable tools, accessed datasets, and policy violations during execution. Monitoring has to prove whether the model stayed within its authorised boundary, not just whether it sounded safe.

Why Security Teams Misread LLM Monitoring

LLM monitoring fails when teams focus on text quality instead of execution risk. A model can produce polite, compliant output and still enumerate tools, touch restricted data, or attempt policy-breaking actions behind the scenes. That is especially true for agentic systems, where the real security question is whether the model stayed inside its authorised boundary while acting autonomously.

NHI Management Group research on AI Agents: The New Attack Surface shows why this matters: 80% of organisations report AI agents have already acted beyond their intended scope, yet only 44% have implemented any policies to govern them. That gap is consistent with what current guidance from the OWASP Agentic AI Top 10 and the NIST AI Risk Management Framework now emphasises: visibility has to cover tool use, context, and policy decisions, not just prompt content.

In practice, many security teams discover misuse only after the model has already accessed a dataset, called a downstream system, or completed an unsafe workflow step.

How LLM Monitoring Should Work in Practice

Useful monitoring starts with the agent’s actions, not its language. Security teams need telemetry for prompts, tool calls, retrieved documents, token usage, policy decisions, and the exact resources reached during execution. That makes it possible to answer a practical question: did the model request something it was allowed to do, and did the system actually permit it?

This is where runtime controls matter more than retrospective review. Monitoring should be tied to policy evaluation at the moment of execution, using rules that can block or downgrade risky actions in real time. The NIST AI 600-1 Generative AI Profile and the CSA MAESTRO agentic AI threat modeling framework both point toward contextual governance, where the system observes intent, data sensitivity, and tool exposure together.
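
A minimal sketch of policy evaluation at the moment of execution might look like the following. It is illustrative only: the tool names, sensitivity labels, task identifier, and the `evaluate` function are assumptions made for this example and are not taken from NIST AI 600-1 or CSA MAESTRO.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DOWNGRADE = "downgrade"  # e.g. route to human approval or read-only mode
    BLOCK = "block"


@dataclass(frozen=True)
class ToolRequest:
    tool: str              # hypothetical tool name, e.g. "export_csv"
    data_sensitivity: str  # hypothetical labels: "public", "internal", "restricted"
    task_id: str


# Hypothetical per-task allowlist; a real deployment would load this from a policy store.
TASK_ALLOWED_TOOLS = {
    "task-42": {"search_docs", "read_ticket", "export_csv"},
}


def evaluate(request: ToolRequest) -> Decision:
    """Decide before the tool runs, using tool exposure and data sensitivity together."""
    allowed = TASK_ALLOWED_TOOLS.get(request.task_id, set())
    if request.tool not in allowed:
        return Decision.BLOCK  # outside the approved task boundary
    if request.data_sensitivity == "restricted":
        return Decision.DOWNGRADE  # permitted tool, but sensitive data
    return Decision.ALLOW


print(evaluate(ToolRequest("export_csv", "restricted", "task-42")))
```

The point of the sketch is the ordering: the decision is produced before the tool call executes, so the same record can be both enforced and logged.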

NHI Management Group’s The State of Non-Human Identity Security also highlights the operational problem: only 52% of organisations can track and audit what their AI agents access, which leaves a large blind spot during incident response. Monitoring should therefore log:

  • Which tools were reachable, invoked, or denied
  • Which datasets, files, or APIs were accessed
  • Which policy rules were evaluated and how they resolved
  • Which identity or workload token authorised the action
  • Whether the action stayed within the approved task boundary
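
The items above can be captured in one structured record per agent action. The sketch below is one possible shape, not a standard schema: every field name and sample value is an assumption chosen for illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class AgentAuditEvent:
    agent_id: str
    task_id: str
    workload_identity: str      # identity or workload token that authorised the action
    tool: str
    tool_outcome: str           # "reachable", "invoked", or "denied"
    resources: list             # datasets, files, or APIs touched
    policy_rule: str            # rule that was evaluated
    policy_result: str          # how the rule resolved, e.g. "allow" or "block"
    within_task_boundary: bool
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json(self) -> str:
        """Serialise to one JSON line suitable for an append-only log."""
        return json.dumps(asdict(self), sort_keys=True)


# Hypothetical example values.
event = AgentAuditEvent(
    agent_id="agent-7",
    task_id="task-42",
    workload_identity="workload/billing-agent",
    tool="export_csv",
    tool_outcome="denied",
    resources=["dataset:customer_pii"],
    policy_rule="no-export-restricted",
    policy_result="block",
    within_task_boundary=False,
)
print(event.to_json())
```

Keeping the identity, the policy result, and the resource in the same record is what lets an investigator answer "who accessed what, under which rule" without joining several log sources.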

For agentic systems, the stronger pattern is workload identity plus just-in-time access. Short-lived credentials, per-task scoping, and cryptographic proof of workload identity reduce the damage if the model chains tools in unexpected ways. These controls tend to break down in long-running, multi-agent workflows because the number of intermediate actions and handoffs makes boundary validation harder to preserve.
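
To make the idea of short-lived, per-task credentials concrete, here is a simplified sketch. It uses an HMAC signature with a hard-coded demo key as a stand-in for real cryptographic workload identity; the `issue` and `authorises` helpers, the scope strings, and the key are all assumptions for illustration, and a production system would rely on a managed identity provider rather than a shared secret.

```python
import hashlib
import hmac
import time
from dataclasses import dataclass

SIGNING_KEY = b"demo-only-key"  # assumption: placeholder, never hard-code real keys


@dataclass(frozen=True)
class TaskCredential:
    workload_id: str
    task_id: str
    scopes: frozenset
    expires_at: float
    signature: str


def _sign(workload_id, task_id, scopes, expires_at) -> str:
    payload = "|".join([workload_id, task_id, ",".join(sorted(scopes)), repr(expires_at)])
    return hmac.new(SIGNING_KEY, payload.encode(), hashlib.sha256).hexdigest()


def issue(workload_id, task_id, scopes, ttl_seconds=300, now=None) -> TaskCredential:
    """Mint a credential bound to one workload, one task, and a short lifetime."""
    now = time.time() if now is None else now
    scopes = frozenset(scopes)
    expires_at = now + ttl_seconds
    signature = _sign(workload_id, task_id, scopes, expires_at)
    return TaskCredential(workload_id, task_id, scopes, expires_at, signature)


def authorises(cred: TaskCredential, scope: str, now=None) -> bool:
    """Check integrity first, then expiry, then whether the scope was granted."""
    now = time.time() if now is None else now
    expected = _sign(cred.workload_id, cred.task_id, cred.scopes, cred.expires_at)
    if not hmac.compare_digest(expected, cred.signature):
        return False  # credential was altered after issue
    if now >= cred.expires_at:
        return False  # lifetime elapsed
    return scope in cred.scopes


cred = issue("agent-a", "task-1", {"crm:read"}, ttl_seconds=60)
print(authorises(cred, "crm:read"), authorises(cred, "crm:export"))
```

Because the scope set and expiry are part of the signed payload, an agent that chains tools unexpectedly cannot widen its own access or extend its lifetime without invalidating the credential.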

Where Current Monitoring Breaks Down and What to Watch Instead

Tighter action-level monitoring increases operational overhead, so organisations must balance security value against noise and engineering cost. The tradeoff is real: richer telemetry can overwhelm analysts if it is not mapped to policy outcomes. Best practice is evolving, but there is no universal standard yet for which agent events must be logged across every stack.

Current guidance suggests that the most important edge cases are not obvious prompt injections, but situations where the model can chain permitted actions into an unsafe sequence. That includes search plus retrieval plus export, approval workflows with weak human oversight, and environments where the agent has broad connector access. The OWASP NHI Top 10 and OWASP Agentic Applications Top 10 both reinforce that identity, tool access, and authorization context are the real security perimeter for autonomous systems.
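
One way to watch for chained actions is to check each task's action history against known risky sequences, even when every individual step was permitted. The sketch below assumes hypothetical action labels ("search", "retrieve", "export") and a hand-written list of risky chains; real detection would need richer context such as data sensitivity and destination.

```python
from typing import List, Sequence, Tuple

# Hypothetical risky sequences; each step may be individually permitted.
RISKY_CHAINS: List[Tuple[str, ...]] = [
    ("search", "retrieve", "export"),
]


def contains_chain(actions: Sequence[str], chain: Sequence[str]) -> bool:
    """True if `chain` occurs in `actions` in order, allowing other steps in between."""
    remaining = iter(actions)
    # `step in remaining` advances the iterator, so order is preserved.
    return all(step in remaining for step in chain)


def flag_risky_chains(actions: Sequence[str]) -> List[Tuple[str, ...]]:
    """Return every configured risky chain that this action history completes."""
    return [chain for chain in RISKY_CHAINS if contains_chain(actions, chain)]


history = ["search", "summarise", "retrieve", "export"]
print(flag_risky_chains(history))
```

Matching on ordered subsequences rather than exact adjacency matters here, because an agent will often interleave harmless steps between the ones that make up the unsafe sequence.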

Security teams also underestimate how quickly monitoring degrades when logs are incomplete or detached from the workload identity that generated them. When audit trails cannot prove which agent accessed what, during which task, and under which policy, the result is visibility theatre rather than control. That is why NHI lifecycle tracking matters alongside model telemetry, especially in environments with many ephemeral agents and rotating secrets.

In practice, monitoring breaks down most often in multi-agent systems with broad connectors, weak identity binding, and asynchronous tool execution because the system can no longer attribute a single risky action to a single policy decision.
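
A simple attribution check can surface this failure mode: every logged action should carry a workload identity and point to exactly one recorded policy decision. The sketch below assumes hypothetical log records keyed by `action_id` and `decision_id`; the field names are illustrative, not a standard.

```python
def unattributed_actions(actions, decisions):
    """List actions that cannot be tied to an identity and a policy decision."""
    known_decisions = {d["decision_id"] for d in decisions}
    problems = []
    for action in actions:
        if not action.get("workload_identity"):
            problems.append((action["action_id"], "missing workload identity"))
        elif action.get("decision_id") not in known_decisions:
            problems.append((action["action_id"], "no matching policy decision"))
    return problems


# Hypothetical log extracts.
decisions = [{"decision_id": "d-1", "result": "allow"}]
actions = [
    {"action_id": "a-1", "workload_identity": "workload/agent-a", "decision_id": "d-1"},
    {"action_id": "a-2", "workload_identity": "", "decision_id": "d-1"},
    {"action_id": "a-3", "workload_identity": "workload/agent-b", "decision_id": "d-9"},
]
print(unattributed_actions(actions, decisions))
```

Running a check like this on a schedule gives an early signal that asynchronous tool execution or agent handoffs are dropping the link between an action and the decision that permitted it.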

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework | Control / Reference | Relevance
OWASP Agentic AI Top 10 | A4 | Monitoring must cover tool use and policy-violating agent actions.
CSA MAESTRO | TRUST-03 | MAESTRO focuses on runtime trust and contextual agent governance.
NIST AI RMF | GOVERN | AI RMF requires accountability, traceability, and risk ownership.

Log agent tool calls and block any runtime action that exceeds approved task scope.