What breaks when code mode gives agents more runtime freedom?

Why This Matters for Security Teams

When code mode expands an agent from “requesting a tool” to “executing a plan,” the security question changes from access approval to runtime containment. Static IAM, RBAC, and even well-tuned PAM can authorize the initial action while still leaving the agent free to chain tools, write files, call APIs, or alter infrastructure in ways the original request did not make obvious. Current guidance therefore increasingly points to the OWASP Top 10 for Agentic Applications 2026 and the NIST AI Risk Management Framework as useful baselines: both emphasize context, accountability, and ongoing oversight rather than one-time approval.

This matters even more for NHIs because agents often operate through secrets, service accounts, and workload identities that can outlive the task they were created for. NHI Mgmt Group research shows 97% of NHIs carry excessive privileges, which means runtime freedom often lands in an environment that is already overexposed. In practice, many security teams discover the control gap only after the agent has already executed a broader sequence of actions than anyone intended.

How It Works in Practice

The practical fix is to treat agent runtime access as a series of constrained decisions, not a single login event. That usually means issuing JIT credentials for a narrowly scoped task, binding those credentials to a workload identity, and evaluating policy at request time. The identity primitive should describe what the agent is, not just what secret it holds. In mature designs, that means cryptographic workload identity, short-lived tokens, and explicit revocation when the task ends.
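As a rough illustration of a short-lived credential bound to a workload identity, the sketch below mints and verifies a signed token using only the Python standard library. The names issue_token, verify_token, SIGNING_KEY, and the claim fields are hypothetical; a real deployment would more likely rely on an established workload identity system and a managed signing key rather than a hand-rolled format.

```python
import base64
import hashlib
import hmac
import json
import time

# Assumption: demo key only. In practice the key would live in a KMS or HSM.
SIGNING_KEY = b"demo-only-key"


def issue_token(workload_id, audience, scopes, ttl_seconds, now=None):
    """Mint a signed token bound to one workload, one target, and a short TTL."""
    now = time.time() if now is None else now
    claims = {
        "sub": workload_id,          # what the agent is
        "aud": audience,             # the single service it may call
        "scopes": sorted(scopes),    # the narrow task scope
        "exp": now + ttl_seconds,    # hard expiry
    }
    payload = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    signature = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + signature


def verify_token(token, workload_id, audience, scope, now=None):
    """Evaluate the token at request time; any mismatch means deny."""
    now = time.time() if now is None else now
    try:
        payload, signature = token.rsplit(".", 1)
    except ValueError:
        return False
    expected = hmac.new(SIGNING_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return False
    claims = json.loads(base64.urlsafe_b64decode(payload))
    return (
        claims["sub"] == workload_id
        and claims["aud"] == audience
        and scope in claims["scopes"]
        and now < claims["exp"]
    )
```

The point of the design is that the same token fails for a different workload, a different target service, a scope outside the task, or any request made after expiry.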

For agentic systems, intent-based authorisation is often a better mental model than traditional role assignment. The policy engine should ask: what is the agent trying to do, against which resource, with what context, and under what risk threshold? That is where policy-as-code approaches fit, whether teams use OPA, Cedar, or another engine. The runtime then constrains tool calls, file writes, network reachability, approval thresholds, and data access based on context rather than a pre-baked role. Guidance in the CSA MAESTRO agentic AI threat modeling framework and the NIST AI Risk Management Framework aligns with this shift because both stress risk management across the full lifecycle, not just the moment of authentication.
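The four questions above can be sketched as a small decision function. This is plain Python used to illustrate the shape of an intent-based check, not OPA or Cedar syntax; the Request fields, the POLICY table, the "summarize-tickets" intent, and the risk thresholds are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    agent: str
    action: str           # what the agent is trying to do
    resource: str         # against which resource
    declared_intent: str  # the approved task this call claims to serve
    risk_score: float     # context signal, 0.0 (low) to 1.0 (high)


# Hypothetical policy table: one entry per approved intent.
POLICY = {
    "summarize-tickets": {
        "actions": {"read"},
        "resource_prefix": "tickets/",
        "max_risk": 0.5,
    },
}


def decide(req):
    """Return 'allow', 'deny', or 'escalate' for a single runtime request."""
    rule = POLICY.get(req.declared_intent)
    if rule is None:
        return "deny"  # unknown intent: nothing was approved
    if req.action not in rule["actions"]:
        return "deny"  # action falls outside the approved intent
    if not req.resource.startswith(rule["resource_prefix"]):
        return "deny"  # resource falls outside the approved intent
    if req.risk_score > rule["max_risk"]:
        return "escalate"  # in scope, but context calls for human approval
    return "allow"
```

Because the decision is keyed on the declared intent rather than a standing role, an agent approved to summarize tickets gets a deny when it tries to write to those tickets or read a different resource, even under the same identity.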

A useful operational pattern is:

issue a short-lived secret only after task approval;

bind it to the workload identity and the specific target service;

limit the agent to the minimum set of tools and scopes needed;

monitor for chaining behavior, lateral movement, and repeated escalation attempts;

revoke credentials automatically when the task completes or the policy signal changes.
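The pattern above can be sketched as a task-scoped session that enforces a tool allowlist, counts chained calls, and revokes itself when the task ends. TaskSession, approved_task, and the max_chain limit are hypothetical names chosen for illustration, and a simple call counter stands in for real chaining and lateral-movement detection.

```python
from contextlib import contextmanager


class TaskSession:
    """Holds the runtime limits for one approved task."""

    def __init__(self, task_id, allowed_tools, max_chain):
        self.task_id = task_id
        self.allowed_tools = frozenset(allowed_tools)
        self.max_chain = max_chain
        self.calls = []    # tools successfully invoked, in order
        self.denied = 0    # count of blocked attempts, for monitoring
        self.active = True

    def call(self, tool):
        if not self.active:
            raise PermissionError("credential revoked")
        if tool not in self.allowed_tools:
            self.denied += 1
            raise PermissionError(f"tool not allowed: {tool}")
        if len(self.calls) >= self.max_chain:
            # Policy signal changed: the chain is longer than approved.
            self.active = False
            raise PermissionError("chain limit exceeded; credential revoked")
        self.calls.append(tool)

    def revoke(self):
        self.active = False


@contextmanager
def approved_task(task_id, allowed_tools, max_chain=3):
    """Issue the session only for the approved task; always revoke on exit."""
    session = TaskSession(task_id, allowed_tools, max_chain)
    try:
        yield session
    finally:
        session.revoke()
```

Used as `with approved_task("t1", {"read_file"}) as session:`, the session stops accepting calls as soon as the block exits, whether the task finished normally or raised an error.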

NHIMG coverage of OWASP NHI Top 10 and Analysis of Claude Code Security shows why this matters: once an agent can translate intent into executable steps, the attacker is no longer limited to a single API call path. These controls tend to break down when the agent can reach general-purpose shells, broad cloud roles, or shared CI/CD runners because the runtime itself becomes the privilege boundary.

Common Variations and Edge Cases

Tighter runtime control often increases latency, policy complexity, and operator overhead, so organisations have to balance safety against deployment friction. There is no universal standard for this yet, especially across mixed stacks that combine MCP tools, cloud APIs, local code execution, and human approval loops. Best practice is evolving, not settled.

One common edge case is a semi-autonomous agent that starts in a narrow workflow but can escalate into broader actions when a tool returns unexpected output. Another is a distributed agent fleet, where one agent’s compromised token can cascade into many services if workload identity is not isolated per task. NHIMG reporting on the Moltbook AI agent keys breach and the AI LLM hijack breach illustrates how quickly credential exposure turns into execution exposure when secrets are long-lived or broadly reusable.

Another practical wrinkle is that JIT access is only effective if revocation is reliable and observable. If the environment cannot invalidate tokens quickly, or if the agent caches credentials locally, the supposed “ephemeral” control becomes a longer-lived shadow privilege. In guidance terms, that is where OWASP Agentic AI Top 10 and zero-trust design principles converge: treat every runtime action as untrusted until the current context proves otherwise.
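One way to make revocation both reliable and observable is to check every use of a credential against a server-side registry and log the outcome, so a locally cached copy stops working the moment it is revoked. The sketch below assumes an in-memory registry; CredentialRegistry and its audit list are hypothetical, and a production system would need shared storage and fast propagation to every enforcement point.

```python
class CredentialRegistry:
    """Server-side source of truth for which task credentials are still live."""

    def __init__(self):
        self._expiry = {}  # token_id -> expiry timestamp
        self.audit = []    # ordered (event, token_id) records for observability

    def issue(self, token_id, expires_at):
        self._expiry[token_id] = expires_at
        self.audit.append(("issued", token_id))

    def revoke(self, token_id):
        # Removing the entry invalidates every copy, including cached ones.
        if self._expiry.pop(token_id, None) is not None:
            self.audit.append(("revoked", token_id))

    def is_valid(self, token_id, now):
        expires_at = self._expiry.get(token_id)
        ok = expires_at is not None and now < expires_at
        if not ok:
            # Rejections are logged so use-after-revocation is visible.
            self.audit.append(("rejected", token_id))
        return ok
```

If an agent keeps a copy of a token ID after the task ends, the next is_valid call fails and leaves a "rejected" entry in the audit trail, which is the signal that a supposedly ephemeral credential was still in use.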

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	A3	Agentic runtime freedom creates prompt-to-action abuse and tool chaining risk.
CSA MAESTRO	M-2	MAESTRO covers runtime controls for autonomous agents and their tool access.
NIST AI RMF	GOVERN	AI RMF governance is needed when agents execute actions autonomously.

Assign ownership and oversight for agent behavior, with monitoring and incident response tied to runtime risk.

