The clearest signals are unexpected tool calls, access to data outside the approved scope, repeated retrieval of unneeded context, and actions that diverge from the documented workflow. If ownership, lineage, or policy posture is missing from the telemetry, teams cannot tell whether the system is drifting or simply behaving as designed.
Why This Matters for Security Teams
Drift is not just a policy problem; it is an identity and governance problem that shows up when a system begins operating beyond the scope it was approved for. For AI systems, that often means new tool use, broader retrieval, or access paths that were never part of the documented workflow. The NIST Cybersecurity Framework 2.0 remains useful here because it frames governance, access, and continuous monitoring as operational controls rather than one-time approvals.
Security teams get into trouble when they rely on the original use case as a static label instead of testing whether the system is still behaving inside that label. Drift is often invisible until a model starts retrieving unnecessary context, calling tools it was not meant to use, or touching data with a higher sensitivity than the business owner expected. NHIMG has documented how quickly exposed credentials can be abused, including the LLMjacking pattern where attackers move fast once a credential is available. Many security teams encounter drift only after a workflow has already been expanded for convenience rather than through intentional approval.
How It Works in Practice
Detecting drift requires looking at behaviour, not just configuration. The strongest signal is a mismatch between the approved task boundary and the actual runtime activity. That means monitoring what the system asks for, what it receives, and what it does next. For autonomous or semi-autonomous systems, the approval record should describe the task, data class, tools, and allowed side effects. Anything outside that envelope is a candidate drift signal.
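The envelope check described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `ApprovalEnvelope` and `RuntimeEvent` types, their field names, and the example values are all assumptions made for the sketch, and a real approval record would likely carry more context.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ApprovalEnvelope:
    """Hypothetical approval record: the boundary a system was approved for."""
    task: str
    data_classes: frozenset
    tools: frozenset
    side_effects: frozenset


@dataclass(frozen=True)
class RuntimeEvent:
    """Hypothetical telemetry event describing one runtime action."""
    tool: str
    data_class: str
    side_effect: Optional[str] = None


def drift_signals(envelope: ApprovalEnvelope, event: RuntimeEvent) -> list[str]:
    """Return one candidate drift signal per field that falls outside the envelope."""
    signals = []
    if event.tool not in envelope.tools:
        signals.append(f"unapproved tool: {event.tool}")
    if event.data_class not in envelope.data_classes:
        signals.append(f"unapproved data class: {event.data_class}")
    if event.side_effect is not None and event.side_effect not in envelope.side_effects:
        signals.append(f"unapproved side effect: {event.side_effect}")
    return signals


# Example: a ticket summariser approved for one tool and one data class.
envelope = ApprovalEnvelope(
    task="summarise support tickets",
    data_classes=frozenset({"internal"}),
    tools=frozenset({"ticket_search"}),
    side_effects=frozenset(),
)
event = RuntimeEvent(tool="send_email", data_class="confidential", side_effect="external_message")
for signal in drift_signals(envelope, event):
    print(signal)
```

The function returns candidates rather than verdicts, which matches the framing above: an event outside the envelope is something to examine, not automatically a violation.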
Useful telemetry usually includes:
- Tool invocation frequency and sequence, especially when the system begins chaining actions not required for the original workflow.
- Retrieval scope, including repeated pulls of context that do not improve the task outcome.
- Data access patterns, especially when the system reaches into repositories, records, or prompts beyond its expected scope.
- Policy exceptions, override requests, and fallback behaviours that indicate the system is compensating for missing permissions.
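Two of the telemetry signals listed above, tool sequence and repeated retrieval, lend themselves to simple checks. The sketch below assumes tool calls are logged as ordered lists of tool names and retrievals as lists of document identifiers; both function names and the default threshold of 3 are illustrative choices, not established values.

```python
from collections import Counter


def unseen_transitions(baseline_runs, observed_run):
    """Return tool-to-tool transitions in observed_run that never occur in the baseline runs."""
    known = {pair for run in baseline_runs for pair in zip(run, run[1:])}
    return [pair for pair in zip(observed_run, observed_run[1:]) if pair not in known]


def repeated_retrievals(doc_ids, threshold=3):
    """Return documents retrieved at least `threshold` times in one run (threshold is an assumption)."""
    counts = Counter(doc_ids)
    return {doc: n for doc, n in counts.items() if n >= threshold}


# Baseline: the approved workflow only ever goes search -> summarise.
baseline = [["search", "summarise"], ["search", "summarise"]]
observed = ["search", "export", "summarise"]
print(unseen_transitions(baseline, observed))
print(repeated_retrievals(["doc-a", "doc-a", "doc-a", "doc-b"]))
```

Comparing transitions rather than individual tool names catches chaining that uses only approved tools in an order the original workflow never required.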
For governance, current guidance suggests pairing policy-as-code with runtime evaluation so the system is judged against context, not just a static role. The NIST Cybersecurity Framework 2.0 supports this kind of continuous control thinking, while NHIMG research on the Salesloft OAuth token breach shows how quickly access can be repurposed once trust boundaries are loose. Where available, teams should also tie each system to an owner, a workload identity, and a known approval scope so that drift can be detected as a measurable change instead of a subjective judgment. These controls tend to break down when telemetry is fragmented across vendors, because no single log stream reveals the full sequence of decision, access, and action.
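One way to picture this is to merge events from separate log streams by workload identity, then evaluate each merged timeline against a registry of owners and approved scope. The sketch below is an assumption-heavy illustration: the event fields (`workload_id`, `ts`, `tool`), the `REGISTRY` structure, and the example identities are hypothetical and do not reflect any particular vendor's log format.

```python
from collections import defaultdict

# Hypothetical registry: workload identity -> owner and approved scope.
REGISTRY = {
    "wl-support-bot": {"owner": "support-platform", "approved_tools": {"ticket_search"}},
}


def build_timelines(*streams):
    """Merge events from several log streams into one time-ordered list per workload."""
    timelines = defaultdict(list)
    for stream in streams:
        for event in stream:
            timelines[event["workload_id"]].append(event)
    for events in timelines.values():
        events.sort(key=lambda e: e["ts"])
    return dict(timelines)


def evaluate(timelines, registry):
    """Flag workloads with no registry record and tool calls outside the approved scope."""
    findings = []
    for workload_id, events in timelines.items():
        record = registry.get(workload_id)
        if record is None:
            findings.append((workload_id, "no owner or approval scope on record"))
            continue
        for event in events:
            tool = event.get("tool")
            if tool is not None and tool not in record["approved_tools"]:
                findings.append((workload_id, f"tool outside approved scope: {tool}"))
    return findings


# Two hypothetical streams, e.g. an API gateway log and a model-side log.
gateway_log = [{"workload_id": "wl-support-bot", "ts": 2, "tool": "crm_export"}]
model_log = [
    {"workload_id": "wl-support-bot", "ts": 1, "tool": "ticket_search"},
    {"workload_id": "wl-unknown", "ts": 3, "tool": "ticket_search"},
]
for finding in evaluate(build_timelines(gateway_log, model_log), REGISTRY):
    print(finding)
```

A workload that is missing from the registry is reported separately from a scope violation, because without an owner and approval scope there is no baseline to measure drift against.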
Common Variations and Edge Cases
Tighter drift detection often increases operational overhead, requiring organisations to balance better assurance against more alert noise and more review work. That tradeoff is especially visible when systems are exploratory by design, such as research assistants, code-generation agents, or multi-step orchestration pipelines.
There is no universal standard for this yet, but current guidance suggests treating certain patterns as higher risk even when they are not outright violations. Examples include repeated retrieval of adjacent context, tool calls that expand privileges temporarily, and changes in data sensitivity after a prompt or workflow handoff. In some environments, extra retrieval is harmless; in others, it is an early sign that the system is compensating for weak task scoping. The difference comes down to whether the action is explainable by the approved use case and whether ownership and lineage are visible in telemetry.
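The two questions in the preceding paragraph can be expressed as a small triage rule. This is only a sketch of one possible ordering: the function name, the three outcome labels, and the decision to check explainability first are assumptions, and teams would tune them to their own risk appetite.

```python
def triage(explained_by_use_case: bool, owner_visible: bool, lineage_visible: bool) -> str:
    """Classify an out-of-pattern action using the two questions from the text.

    The labels "investigate", "review", and "expected" are illustrative.
    """
    if not explained_by_use_case:
        # The approved use case does not account for the action at all.
        return "investigate"
    if not (owner_visible and lineage_visible):
        # Plausibly in scope, but telemetry cannot confirm who owns it or where it came from.
        return "review"
    return "expected"


# Extra retrieval that fits the use case but arrives without lineage metadata.
print(triage(explained_by_use_case=True, owner_visible=True, lineage_visible=False))
```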
NHIMG’s coverage of the DeepSeek breach is a reminder that hidden scope expansion often coexists with exposed data paths, while the JetBrains GitHub plugin token exposure illustrates how quickly trusted tooling can become a wider access problem. Teams should be especially cautious in environments that mix human prompts, autonomous actions, and shared credentials, because drift can look like normal productivity until it crosses a boundary that was never enforced in the first place.
Standards & Framework Alignment
This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.
OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.
| Framework | Control / Reference | Relevance |
|---|---|---|
| OWASP Agentic AI Top 10 | A2 | Covers unsafe tool use and agent behaviour drift beyond intended scope. |
| CSA MAESTRO | GOV-03 | Addresses governance, policy enforcement, and runtime oversight for agentic systems. |
| NIST AI RMF | | Supports continuous measurement and governance of AI system behaviour and risk. |
Define allowed workflows, owners, and runtime checks so agent actions stay within approved boundaries.