Why is code scanning not enough for AI agent security?

Code scanning finds vulnerabilities in the software artefact, but it does not establish identity, privilege, or accountability for the runtime actor. An agent can run secure code and still be over-permissioned, unaudited, or impossible to revoke cleanly. Security teams need both application security controls and identity controls to close the gap between a secure artefact and a governed runtime actor.

Why This Matters for Security Teams

Code scanning is necessary, but it only answers whether the artefact contains known bugs, insecure patterns, or exposed secrets. It does not tell you whether an AI agent is authorized to use tools, call APIs, move laterally, or retain long-lived access after the task ends. For agentic systems, the real risk sits in runtime identity and privilege, not just in source code quality.

This is why practitioners increasingly pair application security with identity controls. The OWASP Agentic AI Top 10 and the NIST AI Risk Management Framework both push teams toward runtime risk management, because autonomous workloads can chain actions in ways static review never sees. NHIMG research shows the same pattern in the wild: the LLMjacking report highlights how compromised non-human identities (NHIs) become an immediate attack path once agents have usable credentials.

In practice, many security teams discover excessive agent privilege only after an API key is reused, a tool is chained unexpectedly, or a revoked workflow still has live access.

How It Works in Practice

Effective agent security starts by treating the agent as a runtime actor with its own workload identity, not as a normal application binary. Static IAM roles are a poor fit for autonomous systems because agents do not follow a fixed call path. Their behaviour depends on prompts, context, tool outputs, and intermediate decisions, which means pre-defined access rules quickly become either too broad or too brittle.

Current guidance suggests combining four controls. First, assign a workload identity that can be cryptographically asserted at runtime, using patterns such as SPIFFE or OIDC-backed service tokens. Second, issue just-in-time credentials for the specific task, with short TTLs and automatic revocation after completion. Third, evaluate authorization at request time with policy-as-code, rather than pre-baking every permission into a static role. Fourth, log every tool invocation, secret access, and downstream action so the agent’s behaviour can be reconstructed after the fact.
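The second and fourth controls can be sketched together as a small credential broker. This is a minimal in-memory illustration, not a production design: `CredentialBroker`, `TaskCredential`, the scope strings, and the SPIFFE-style agent ID are all hypothetical names chosen for the example, and a real deployment would delegate issuance to a vault or token service.

```python
import secrets
import time
from dataclasses import dataclass, field


@dataclass
class TaskCredential:
    token: str
    agent_id: str        # workload identity, e.g. a SPIFFE ID (illustrative)
    task_id: str
    scopes: frozenset
    expires_at: float
    revoked: bool = False


@dataclass
class CredentialBroker:
    """Hypothetical in-memory broker issuing per-task, short-lived credentials."""
    ttl_seconds: float = 300.0
    _issued: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)

    def issue(self, agent_id, task_id, scopes, now=None):
        # Mint a credential scoped to one task with a short TTL.
        now = time.time() if now is None else now
        cred = TaskCredential(
            token=secrets.token_urlsafe(32),
            agent_id=agent_id,
            task_id=task_id,
            scopes=frozenset(scopes),
            expires_at=now + self.ttl_seconds,
        )
        self._issued[cred.token] = cred
        self._record("issue", cred, now)
        return cred

    def is_valid(self, token, scope, now=None):
        # Deny unknown, revoked, expired, or out-of-scope credentials.
        now = time.time() if now is None else now
        cred = self._issued.get(token)
        return (
            cred is not None
            and not cred.revoked
            and now < cred.expires_at
            and scope in cred.scopes
        )

    def revoke_task(self, task_id, now=None):
        # Called when the task completes; revokes every credential issued for it.
        now = time.time() if now is None else now
        for cred in self._issued.values():
            if cred.task_id == task_id and not cred.revoked:
                cred.revoked = True
                self._record("revoke", cred, now)

    def _record(self, event, cred, now):
        # Append-only record so issuance and revocation can be reconstructed later.
        self.audit_log.append({
            "event": event,
            "agent": cred.agent_id,
            "task": cred.task_id,
            "scopes": sorted(cred.scopes),
            "at": now,
        })
```

The `now` parameter exists only to make expiry testable; the point of the sketch is that a credential is tied to one agent and one task, expires on its own, and leaves an audit record when it is issued or revoked.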

This is where identity governance becomes more important than code scanning alone. A clean scan does not prevent an agent from being over-permissioned. A secure build does not stop an agent from reading a secret it should never have seen. NHIMG’s OWASP NHI Top 10 analysis and the Ultimate Guide to NHIs both reinforce that secrets, entitlement scope, and revocation discipline are the practical control points.

Use short-lived tokens per task, not shared credentials for the whole agent fleet.
Bind tool access to context, such as user request, workflow stage, and risk level.
Separate read, write, and escalation paths so one compromised capability does not unlock the rest.
Revoke access automatically when the task is complete or the agent state changes materially.
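One way to picture context-bound, request-time authorization is a deny-by-default check evaluated on every tool call. The sketch below is an assumption-laden illustration: the stage names, risk levels, and the `ALLOWED_STAGES` and `MAX_RISK` tables are invented for the example, and a real system would more likely express them in a dedicated policy engine.

```python
from dataclasses import dataclass

# Hypothetical policy: which workflow stages may use each capability class.
ALLOWED_STAGES = {
    "read": {"triage", "investigate", "remediate"},
    "write": {"remediate"},
    "escalate": set(),  # never granted automatically in this sketch
}

RISK_ORDER = {"low": 0, "medium": 1, "high": 2}

# Hypothetical ceiling: the highest risk level at which each capability is allowed.
MAX_RISK = {"read": "high", "write": "medium", "escalate": "low"}


@dataclass(frozen=True)
class RequestContext:
    agent_id: str
    user_request_id: str
    workflow_stage: str
    risk_level: str
    capability: str  # "read", "write", or "escalate"
    tool: str


def authorize(ctx):
    """Return (allowed, reason) for a single tool call; deny by default."""
    if ctx.capability not in ALLOWED_STAGES:
        return False, "unknown capability"
    if ctx.risk_level not in RISK_ORDER:
        return False, "unknown risk level"
    if ctx.workflow_stage not in ALLOWED_STAGES[ctx.capability]:
        return False, "capability not permitted at this workflow stage"
    if RISK_ORDER[ctx.risk_level] > RISK_ORDER[MAX_RISK[ctx.capability]]:
        return False, "risk level too high for this capability"
    return True, "allowed"
```

Keeping read, write, and escalation as separate capability classes means a decision about one never implies a decision about the others, which is the separation the list above describes.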

These controls tend to break down in long-running multi-agent workflows with shared memory and loosely governed tool plugins, because the runtime context changes faster than static entitlements can be updated.

Common Variations and Edge Cases

Tighter runtime authorization often increases operational overhead, requiring organisations to balance security gains against workflow latency, debugging complexity, and developer friction. That tradeoff is real, especially where agents need to coordinate across many tools or hand off work between systems.

Best practice is evolving, but there is no universal standard yet for how much autonomy should be allowed before human approval is required. In lower-risk environments, teams may accept broader tool access with aggressive monitoring. In higher-risk environments, especially where agents touch production systems, customer data, or secrets, the safer pattern is explicit step-up approval and tightly scoped JIT access.
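The step-up pattern for higher-risk environments can be sketched as a gate in front of tool execution. Everything here is illustrative: `AgentAction`, `SENSITIVE_TARGETS`, and the `approver` callback are hypothetical, and where the approval actually comes from (a ticket, a chat prompt, a change-management system) is left open.

```python
from dataclasses import dataclass

# Target categories taken from the text above; the labels themselves are assumptions.
SENSITIVE_TARGETS = {"production", "customer_data", "secrets"}


@dataclass(frozen=True)
class AgentAction:
    agent_id: str
    tool: str
    target: str  # e.g. "staging" or "production"


def requires_approval(action):
    """Sensitive targets always need explicit human sign-off in this sketch."""
    return action.target in SENSITIVE_TARGETS


def execute(action, run, approver=None):
    """Run the action only if it is low-risk or a human approver accepts it.

    `run` performs the tool call; `approver` is a callable returning True or False.
    """
    if requires_approval(action):
        if approver is None or not approver(action):
            return "denied: step-up approval required"
    return run(action)
```

In a lower-risk environment the same gate could be kept in place with a smaller `SENSITIVE_TARGETS` set and heavier monitoring, rather than being removed.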

Another edge case is secret scanning in repositories. It can still catch exposed API keys, certificates, and tokens, and it remains valuable for finding bad hygiene early. But the control stops at the artefact boundary. It will not detect whether an agent has inherited a secret through environment variables, a vault policy mistake, or an overbroad service account. That is why the issue is often visible only after abuse, not during build time. NHIMG’s State of Secrets in AppSec research shows the scale of this challenge, including the fact that remediation of leaked secrets can take weeks even in well-funded programmes.
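Because inherited secrets never appear in the repository, the check has to happen where the agent runs. As a rough sketch, an agent launcher could compare the environment it is about to hand over against an allowlist. The name pattern and the `unexpected_secret_vars` helper are assumptions made for illustration, and a name-based heuristic will miss secrets stored under innocuous variable names.

```python
import re

# Heuristic only: flags variables whose *names* look like credentials.
SECRET_NAME_PATTERN = re.compile(
    r"(SECRET|TOKEN|PASSWORD|API_?KEY|CREDENTIAL)", re.IGNORECASE
)


def unexpected_secret_vars(environ, allowed_names):
    """Return sorted names of secret-looking variables not on the allowlist."""
    return sorted(
        name
        for name in environ
        if SECRET_NAME_PATTERN.search(name) and name not in allowed_names
    )
```

Calling it with `os.environ` and the set of variables the task is meant to receive gives a list that can be logged or used to block the launch. It complements repository secret scanning and does not replace a review of vault policies or service account scope.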

For agentic systems, the right question is not whether the code scanned cleanly, but whether the runtime identity can be trusted, limited, observed, and revoked without delay.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	A1	Addresses runtime agent abuse that code scanning cannot see.
CSA MAESTRO		Covers agentic threat modeling, identity, and control-plane governance.
NIST AI RMF		Supports governance of autonomous AI risk beyond software scanning.

Use the NIST AI RMF to assign ownership, monitor runtime behaviour, and escalate high-risk agent actions.
