What do organisations get wrong when they secure AI only at the model layer?

They often protect the model while leaving the data and action layer under-governed. If an agent can retrieve sensitive information and act on it, model safety alone does not stop misuse. Effective governance has to control data exposure, tool use, and the authorisation path that turns context into action.

Why This Matters for Security Teams

Securing AI only at the model layer gives a false sense of control. The model may be constrained, but the surrounding system can still expose sensitive data, retrieve secrets, and trigger actions through tools or APIs. The real risk therefore sits in the data and action layers, where an AI agent can turn context into execution. Current guidance from NIST Cybersecurity Framework 2.0 emphasises governance, access control, and continuous risk management, not just content filtering. In practice, many security teams discover misuse only after an agent has already accessed data it should not have seen and acted on it with valid privileges; the trigger is over-broad access rather than a deliberate compromise of the model.

How It Works in Practice

The practical failure mode is simple: the model is treated as the security boundary, while the agent’s identity, permissions, and runtime context are left too broad. A safer design assumes the agent is an autonomous workload with execution authority, not a passive chatbot. That means the control plane must decide, at request time, whether the agent may read a record, call a tool, or submit an action. For agentic systems, that decision should be tied to workload identity, task scope, and policy rather than a static role alone.
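A request-time decision of this kind can be sketched as follows. This is a minimal illustration, not a reference implementation: the `AgentRequest` fields, the `POLICY` table, the workload ID, and the `authorise` function are all hypothetical names chosen for the example, and a real deployment would delegate the decision to a dedicated policy engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentRequest:
    workload_id: str  # identity of the agent workload, not a human user
    task_scope: str   # the task the agent was launched to perform
    action: str       # "read", "call_tool", or "submit"
    resource: str     # the record, tool, or endpoint being targeted


# Hypothetical registry of known agent workloads.
KNOWN_WORKLOADS = {"agent-7f3a"}

# Hypothetical policy: (task scope, action) -> resources permitted.
# Anything not listed is denied by default.
POLICY = {
    ("invoice-triage", "read"): {"invoices"},
    ("invoice-triage", "call_tool"): {"ocr"},
}


def authorise(req: AgentRequest) -> bool:
    """Decide at request time whether this agent may perform this action."""
    if req.workload_id not in KNOWN_WORKLOADS:
        return False
    allowed = POLICY.get((req.task_scope, req.action), set())
    return req.resource in allowed
```

Because the policy is keyed on task scope and action rather than on a role, the same agent identity can read invoices during invoice triage yet still be refused when it tries to submit a payment.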

In mature designs, this looks more like just-in-time (JIT) credential provisioning, short-lived secrets, and intent-based authorisation. The agent receives only the minimum capability needed for the task, and that capability expires quickly. This is where NIST Cybersecurity Framework 2.0 aligns with operational reality: protect identities, constrain access, and detect misuse continuously. It also reflects the direction of the emerging agentic guidance in NHI (non-human identity) security.

Use workload identity for the agent, not shared human credentials.
Issue JIT credentials per task and revoke them on completion.
Evaluate policy at runtime using context such as data sensitivity, tool type, and business purpose.
Separate read access from actuation so the agent cannot automatically turn retrieved context into action.
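The practices above can be illustrated with a small in-memory credential broker. This is a sketch under stated assumptions: `CredentialBroker`, `Credential`, and the capability strings such as `read:invoices` and `act:pay_invoice` are hypothetical, and a production system would rely on a secrets manager or workload identity platform rather than a Python dictionary.

```python
import secrets
import time
from dataclasses import dataclass


@dataclass
class Credential:
    token: str
    workload_id: str         # the agent workload, never a shared human account
    task_id: str             # the single task this credential is bound to
    capabilities: frozenset  # e.g. {"read:invoices"}; actuation granted separately
    expires_at: float
    revoked: bool = False


class CredentialBroker:
    """Hypothetical broker issuing short-lived, task-scoped credentials."""

    def __init__(self, ttl_seconds: float = 300.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._issued = {}

    def issue(self, workload_id: str, task_id: str, capabilities) -> Credential:
        cred = Credential(
            token=secrets.token_urlsafe(32),
            workload_id=workload_id,
            task_id=task_id,
            capabilities=frozenset(capabilities),
            expires_at=self._clock() + self._ttl,
        )
        self._issued[cred.token] = cred
        return cred

    def revoke(self, token: str) -> None:
        # Called on task completion so the credential cannot be reused.
        cred = self._issued.get(token)
        if cred is not None:
            cred.revoked = True

    def allows(self, token: str, capability: str) -> bool:
        cred = self._issued.get(token)
        if cred is None or cred.revoked:
            return False
        if self._clock() >= cred.expires_at:
            return False
        return capability in cred.capabilities
```

Issuing a credential with only `read:invoices` means the agent can retrieve context but cannot act on it; an `act:` capability has to be requested and granted as a separate decision.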

Threat research shows why this matters. In the DeepSeek breach, exposed data included backend credentials and API keys, which is exactly the kind of spill that model-only controls do not prevent. It is a reminder that retrieval and action pipelines must be governed end to end. These controls tend to break down when legacy applications require long-lived service accounts, because the agent can then no longer be cleanly scoped to a single task or session.

Common Variations and Edge Cases

Tighter runtime controls often increase operational overhead, requiring organisations to balance safety against latency, integration effort, and developer friction. That tradeoff is real, especially where agents must chain multiple tools or complete long-running workflows. Best practice is evolving, and there is no universal standard for this yet, but static RBAC alone is not enough for autonomous systems. The more an agent can plan, retry, and re-route around failures, the more important it becomes to control each step rather than trust the model’s output.
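Controlling each step rather than trusting the model's output might look like the sketch below. The `run_plan` function, the `is_allowed` callback, and the tool names are hypothetical; the point is only that the policy check sits inside the loop, so a retry or re-route by the agent is evaluated again rather than inheriting an earlier approval.

```python
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, str]  # (tool name, argument) as proposed by the agent


def run_plan(
    steps: List[Step],
    is_allowed: Callable[[str, str], bool],
    tools: Dict[str, Callable[[str], str]],
) -> List[Tuple[str, str]]:
    """Execute a plan one step at a time, authorising every step separately."""
    results = []
    for tool_name, arg in steps:
        # The check runs per step; earlier approvals do not carry over.
        if not is_allowed(tool_name, arg):
            results.append((tool_name, "denied"))
            break  # stop the chain instead of letting the agent route around it
        results.append((tool_name, tools[tool_name](arg)))
    return results
```

The cost is one policy evaluation per step, which is the latency and integration overhead described above.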

Edge cases appear in high-trust environments, internal copilots, and regulated workflows where teams assume the network perimeter is sufficient. It is not. Agent behaviour can change with prompt context, data retrieval results, and tool availability, so permissioning must reflect what the agent is trying to do right now. For implementation detail, NIST Cybersecurity Framework 2.0 supports the broader governance pattern, while the DeepSeek breach illustrates how quickly sensitive material can escape once retrieval paths are overexposed. The same risk pattern also shows up when organisations rely on static secrets instead of ephemeral credentials for agent workloads. In practice, the hardest failures emerge when an agent has legitimate access to useful data but no clear boundary around what actions that data is allowed to trigger.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	A03	Addresses over-permissioned agents and unsafe tool use at runtime.
CSA MAESTRO	M2	Covers agent identity, policy enforcement, and controlled actuation.
NIST AI RMF		Supports governance for AI systems whose behaviour can change with context.

Establish accountability, monitoring, and risk controls across the full agent lifecycle, not just the model.
