Why do AI models create governance risk even without retraining?

Because behaviour can change at inference time when the model sees new context, examples, or instructions. That means access decisions made before a session starts are not enough on their own. Practitioners need controls that address what the model can consume and do during execution, not only what it was permitted to access originally.

Why This Matters for Security Teams

AI models can introduce governance risk even when weights stay frozen because the risk surface shifts at inference time. New prompts, tool calls, retrieval results, and chained instructions can change what the system consumes and what it is able to do in a live session. That makes pre-approved access lists and static review gates incomplete, especially when the model can act with authority through connected services.

The NIST Cybersecurity Framework 2.0 is useful here because it pushes teams toward continuous governance, not one-time approval. NHIMG’s Ultimate Guide to NHIs — Key Challenges and Risks similarly treats identity exposure as an operational control problem, not just a provisioning problem. The practical issue is that the model’s effective behaviour can diverge from the design-time assumptions embedded in policy, especially once tools, memory, and external context are attached.

In practice, many security teams encounter model-driven misuse only after an agent has already accessed data or executed a tool chain that nobody explicitly approved.

How It Works in Practice

Governance for frozen models is still necessary, but it is not sufficient. The real control point is the live session, where the model receives context, interprets instructions, and may call tools or APIs. That is why current guidance increasingly favours runtime controls that evaluate intent, context, and risk on each action rather than relying only on the original authorization decision.

A practical design usually combines workload identity, short-lived credentials, and policy checks at execution time. A model or agent should present a cryptographic workload identity, such as an OIDC token or SPIFFE-based identity, so the platform can distinguish the workload from the user who launched it. Access should then be issued just in time, scoped to the task, and revoked automatically when the task ends. This is the opposite of long-lived secrets sitting in environment variables or shared vault paths.

  • Use runtime authorization for each sensitive action, not only at session start.
  • Bind tool access to workload identity, not just to the application name or tenant.
  • Issue ephemeral secrets with tight TTLs and automatic revocation.
  • Log prompts, tool calls, and policy decisions together so investigators can reconstruct the full chain.
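The four controls above can be sketched together in a few lines of Python. This is a minimal in-memory illustration under stated assumptions, not a production design: `Broker`, `Grant`, the tool names, and the `spiffe://example.org/...` identifier are all hypothetical, and the sketch compares workload identities as plain strings where a real platform would verify a signed OIDC token or SPIFFE document.

```python
import secrets
import time
from dataclasses import dataclass, field


@dataclass
class Grant:
    """A short-lived credential bound to one workload and one set of tools."""
    token: str
    workload_id: str
    allowed_tools: frozenset
    expires_at: float
    revoked: bool = False


@dataclass
class Broker:
    """Hypothetical in-memory credential broker and per-action authorizer."""
    grants: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)

    def issue(self, workload_id, allowed_tools, ttl_seconds, now=None):
        """Issue a just-in-time grant scoped to the given tools, with a TTL."""
        now = time.time() if now is None else now
        grant = Grant(secrets.token_urlsafe(16), workload_id,
                      frozenset(allowed_tools), now + ttl_seconds)
        self.grants[grant.token] = grant
        return grant

    def revoke(self, token):
        """Revoke a grant, for example when the task ends."""
        if token in self.grants:
            self.grants[token].revoked = True

    def authorize(self, token, workload_id, tool, prompt, now=None):
        """Check one action and log prompt, tool call, and decision together."""
        now = time.time() if now is None else now
        grant = self.grants.get(token)
        if grant is None:
            decision = "deny:unknown_token"
        elif grant.revoked:
            decision = "deny:revoked"
        elif now >= grant.expires_at:
            decision = "deny:expired"
        elif grant.workload_id != workload_id:
            decision = "deny:workload_mismatch"
        elif tool not in grant.allowed_tools:
            decision = "deny:tool_out_of_scope"
        else:
            decision = "allow"
        self.audit_log.append({
            "time": now,
            "workload_id": workload_id,
            "tool": tool,
            "prompt": prompt,
            "decision": decision,
        })
        return decision == "allow"
```

The point of the design is that `authorize` runs on every sensitive call, so a grant that was valid at session start stops working once it expires, is revoked, or is presented by a different workload, and every decision lands in the same log record as the prompt and tool that triggered it.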

For non-human identity programs, NHIMG’s Top 10 NHI Issues and Ultimate Guide to NHIs — Lifecycle Processes for Managing NHIs both reinforce the same operational pattern: identity must be governed across the full lifecycle, not only at issuance. These controls tend to break down when agents are allowed broad tool access inside flat network segments, because lateral movement and chained actions become hard to distinguish from legitimate work.

Common Variations and Edge Cases

Tighter runtime control often increases friction, requiring organisations to balance user experience and automation speed against containment and auditability. That tradeoff is real, and best practice is still evolving for agentic systems that negotiate between autonomy and supervision.

One common edge case is retrieval-augmented generation. If the model can pull in external documents, its behaviour may change based on content it was never explicitly trained on, so the governance question becomes: what sources may it consult, and what may it do with that information? Another edge case is multi-agent orchestration, where one agent can inherit context from another and amplify a small policy gap into a full workflow escape.
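For the retrieval case, the "what sources may it consult" half of the question can be expressed as an allow-list applied before any document reaches the model. The sketch below is one possible shape, assuming retrieved documents arrive as dictionaries with a `url` key; the host names and function names are hypothetical. What the model may then do with admitted content is a separate question that belongs to per-action authorization.

```python
from urllib.parse import urlsplit

# Hypothetical allow-list of hosts the model may consult.
ALLOWED_HOSTS = frozenset({"docs.internal.example", "wiki.internal.example"})


def is_allowed_source(url, allowed_hosts=ALLOWED_HOSTS):
    """Return True only for HTTPS URLs whose exact host is on the allow-list."""
    try:
        parts = urlsplit(url)
    except ValueError:
        # Malformed URLs are treated as untrusted.
        return False
    host = (parts.hostname or "").lower()
    return parts.scheme == "https" and host in allowed_hosts


def filter_retrieved(documents, allowed_hosts=ALLOWED_HOSTS):
    """Split retrieved documents into (admitted, rejected) by source URL."""
    admitted, rejected = [], []
    for doc in documents:
        if is_allowed_source(doc["url"], allowed_hosts):
            admitted.append(doc)
        else:
            rejected.append(doc)
    return admitted, rejected
```

Matching on the exact host, not on a substring, matters here: a look-alike such as `docs.internal.example.evil.test` is rejected. Keeping the rejected list lets the platform log what the model was prevented from reading.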

Model governance also breaks down when organisations treat prompt filters as the primary control. Prompt screening helps, but it does not replace policy-as-code, tool-level authorization, or per-action approvals for high-risk operations. The OWASP NHI Top 10 is a strong reference point for these emerging agent risks, while the 2024 ESG Report: Managing Non-Human Identities shows how often NHI compromises already translate into real incidents. The reported 72% breach or suspected breach rate is a reminder that poor control over non-human access is not theoretical.
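To make the contrast with prompt filtering concrete, here is a small sketch of policy-as-code with per-action approval. The policy table, tool names, and `evaluate` function are illustrative assumptions, not taken from any specific framework or policy engine. The decision depends on the tool being invoked and on whether a human has approved it, not on the wording of the prompt.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    REQUIRE_APPROVAL = "require_approval"


# Hypothetical policy table: tool name -> rule. Unlisted tools are denied.
POLICY = {
    "search_docs": Decision.ALLOW,
    "read_ticket": Decision.ALLOW,
    "send_email": Decision.REQUIRE_APPROVAL,
    "delete_records": Decision.REQUIRE_APPROVAL,
}


def evaluate(tool, approved_by=None, policy=POLICY):
    """Evaluate one tool call against the policy table.

    High-risk tools are allowed only when an approver is recorded;
    tools missing from the table fall through to DENY.
    """
    rule = policy.get(tool, Decision.DENY)
    if rule is Decision.REQUIRE_APPROVAL:
        return Decision.ALLOW if approved_by else Decision.REQUIRE_APPROVAL
    return rule
```

Default deny is the deliberate choice in this sketch: a tool that nobody wrote a rule for is blocked until someone does.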

Where organisations run autonomous agents against production systems, this guidance breaks down fastest when the model is allowed to accumulate context across tasks without fresh authorization, because past trust then becomes a substitute for present validation.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework | Control / Reference | Relevance
OWASP Agentic AI Top 10 | A01 | Runtime model behavior changes create agent-specific abuse paths.
CSA MAESTRO | M2 | Covers governing autonomous agent actions and runtime authorization.
NIST AI RMF | - | AI RMF addresses ongoing governance of changing AI behavior at runtime.

Apply continuous monitoring and human accountability to live AI outputs and actions.