How should security teams combine AI code scanning with runtime security?

Use AI code scanning to reduce obvious pre-deployment defects, then use runtime security to validate what is actually executing in production. The two controls answer different questions. Scanning improves code quality, while runtime controls expose privilege changes, process behaviour, and reachable attack paths that only appear after deployment.

Why This Matters for Security Teams

AI code scanning and runtime security address different failure modes, and treating them as substitutes leaves a gap that attackers routinely exploit. Scanning can catch insecure prompts, secret exposure, unsafe dependencies, and obvious policy violations before release. Runtime security confirms whether the deployed workload is actually behaving as intended once it begins calling tools, moving data, and inheriting permissions. Non-human identity (NHI) risk becomes sharper here because the same API key, token, or service account can be present in code, build artefacts, and live execution paths. The State of Non-Human Identity Security report shows how often organisations still lack confidence in this area, which is consistent with what happens when pre-deployment checks are assumed to be enough. Current guidance suggests pairing static analysis with continuous validation aligned to NIST Cybersecurity Framework 2.0, not replacing one with the other. In practice, many security teams discover excessive privilege or live token abuse only after a release has already reached production, rather than through intentional testing.

How It Works in Practice

The practical model is a two-layer control stack. AI code scanning runs early in the SDLC to identify insecure code paths, hard-coded secrets, risky model/tool integrations, insecure deserialisation, and policy violations before merge. Runtime security then watches the live workload for what scanning cannot know in advance: actual process execution, outbound connections, privilege escalation, secret use, file system access, and tool invocation patterns. That second layer matters because AI systems and agentic services often change behaviour after deployment based on inputs, context, and retrieved data.
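The scanning layer can be illustrated with a minimal sketch of a hard-coded secret check run against pull request text. The rule names and the two patterns below are illustrative assumptions, not a complete rule set; production scanners use far larger, tuned rule sets and entropy checks.

```python
import re

# Hypothetical, deliberately small rule set for illustration only.
SECRET_PATTERNS = {
    # AWS access key IDs start with "AKIA" followed by 16 upper-case alphanumerics.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # Assumed generic rule: a quoted value of 16+ characters assigned to a
    # variable named like api_key, token, or secret.
    "generic_api_key": re.compile(
        r"(?i)\b(?:api[_-]?key|token|secret)\b\s*[:=]\s*['\"]([A-Za-z0-9_\-]{16,})['\"]"
    ),
}


def scan_text(text):
    """Return (line_number, rule_name) pairs for lines that look like secrets."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for rule_name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, rule_name))
    return findings


if __name__ == "__main__":
    sample = 'x = 1\napi_key = "abcdefghijklmnop1234"\n'
    for line_number, rule_name in scan_text(sample):
        print(f"line {line_number}: {rule_name}")
```

A check like this only says that a credential-shaped string exists in the change. Whether that credential is live, over-privileged, or being used is a runtime question.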

A workable implementation usually includes:

Scanning pull requests and build artefacts for exposed credentials, unsafe libraries, and insecure agent tool calls.
Enforcing least privilege at runtime with short-lived tokens, strong workload identity, and tightly scoped secrets.
Monitoring execution for anomalous command spawning, unusual network destinations, and new access paths.
Blocking or alerting when the runtime behaviour diverges from the approved policy baseline.
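The monitoring and blocking steps above can be sketched as a comparison between observed runtime events and an approved baseline. This is a simplified illustration: the `RuntimeEvent` type, the `BASELINE` contents, and the host names are all hypothetical, and a real deployment would receive events from a runtime sensor rather than a hand-built list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuntimeEvent:
    kind: str    # "process", "network", or "file"
    target: str  # executable name, destination host, or file path


# Hypothetical approved baseline for a single workload.
BASELINE = {
    "process": {"python3", "gunicorn"},
    "network": {"api.internal.example", "vault.internal.example"},
    "file": {"/app", "/tmp"},
}


def is_allowed(event, baseline):
    """Check one event against the baseline for its kind."""
    allowed = baseline.get(event.kind, set())
    if event.kind == "file":
        # File entries are treated as directory prefixes.
        return any(
            event.target == prefix or event.target.startswith(prefix.rstrip("/") + "/")
            for prefix in allowed
        )
    return event.target in allowed


def evaluate(events, baseline, enforce=False):
    """Return (event, action) pairs for every event outside the baseline."""
    action = "block" if enforce else "alert"
    return [(event, action) for event in events if not is_allowed(event, baseline)]


if __name__ == "__main__":
    observed = [
        RuntimeEvent("process", "python3"),
        RuntimeEvent("process", "sh"),
        RuntimeEvent("network", "198.51.100.7"),
        RuntimeEvent("file", "/etc/shadow"),
    ]
    for event, action in evaluate(observed, BASELINE):
        print(action, event.kind, event.target)
```

Starting with `enforce=False` and switching to blocking once the baseline is stable is one way to limit alert fatigue while the baseline is still being learned.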

For teams dealing with NHIs, runtime controls are especially important because tokens and service identities are often more privileged than any single developer realises. The DeepSeek breach is a reminder that secrets and sensitive assets can surface in places static review does not fully control, while runtime telemetry shows whether those assets are actually being exercised. Best practice is evolving toward policy-as-code, where build-time findings and runtime detections feed one enforcement model instead of separate ticket queues. These controls tend to break down when pipelines are highly dynamic and ephemeral, because each deployment changes the execution context faster than manual approval and review can track.
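One way to picture a single enforcement model is a decision function that consumes build-time findings and runtime detections as the same kind of signal. The sketch below is an assumption about how such a policy could be expressed; the `Signal` type, the severity levels, and the escalation rule are hypothetical and not drawn from any specific policy engine.

```python
from dataclasses import dataclass

SEVERITY_ORDER = {"low": 0, "medium": 1, "high": 2}


@dataclass(frozen=True)
class Signal:
    source: str    # "build" (scanner finding) or "runtime" (live detection)
    workload: str  # hypothetical workload name
    rule: str
    severity: str  # "low", "medium", or "high"


def decide(signals, workload):
    """Return "allow", "alert", or "block" for one workload."""
    relevant = [s for s in signals if s.workload == workload]
    if not relevant:
        return "allow"
    worst = max(SEVERITY_ORDER[s.severity] for s in relevant)
    sources = {s.source for s in relevant}
    # Assumed escalation rule: a finding seen at build time AND confirmed by
    # runtime behaviour is treated as more serious than either alone.
    corroborated = sources == {"build", "runtime"}
    if worst == SEVERITY_ORDER["high"] or (corroborated and worst >= SEVERITY_ORDER["medium"]):
        return "block"
    return "alert"


if __name__ == "__main__":
    signals = [
        Signal("build", "billing-agent", "hardcoded-token", "medium"),
        Signal("runtime", "billing-agent", "unexpected-egress", "medium"),
        Signal("build", "report-job", "outdated-dependency", "low"),
    ]
    for name in ("billing-agent", "report-job", "idle-service"):
        print(name, decide(signals, name))
```

The design choice being illustrated is that both layers write to one decision point, so a scanner finding and a runtime detection for the same workload reinforce each other instead of landing in separate queues.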

Common Variations and Edge Cases

Tighter runtime control often increases operational overhead, requiring organisations to balance stronger containment against developer speed and alert volume. That tradeoff becomes sharper in agentic or self-modifying systems, where the code path at review time is not always the code path at execution time. In those environments, current guidance suggests treating runtime telemetry as the source of truth and using scanning as an early warning system rather than a gate that guarantees safety.

Edge cases matter. Some teams over-invest in scanning and under-invest in runtime baselining, which creates blind spots for lateral movement, chained tool use, and privilege drift. Others do the opposite and ignore code hygiene, which leaves obvious secrets, weak dependencies, and misconfigurations available to attackers before runtime controls ever activate. Neither control is complete on its own. The strongest pattern is to use scanning to reduce avoidable defects, then use runtime security to enforce the actual trust boundary for the deployed workload. This is especially relevant for organisations building on autonomous services, where runtime identity and live authorisation decisions matter more than pre-approved roles. The operational lesson is simple: if the workload can decide, adapt, or invoke tools after release, security must validate the execution path, not just the source code.
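A live authorisation decision, as opposed to a pre-approved role, can be sketched as a check performed at every tool invocation against a short-lived, narrowly scoped token. The `WorkloadToken` type, the scope names, and the five-minute lifetime are hypothetical values chosen for illustration.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkloadToken:
    subject: str        # workload identity the token was issued to
    scopes: frozenset   # actions this token may perform
    expires_at: float   # Unix timestamp after which the token is invalid


def authorise(token, required_scope, now=None):
    """Decide at call time whether this token may perform the requested action."""
    now = time.time() if now is None else now
    if now >= token.expires_at:
        return False, "token expired"
    if required_scope not in token.scopes:
        return False, "scope not granted"
    return True, "ok"


if __name__ == "__main__":
    issued_at = time.time()
    token = WorkloadToken(
        subject="billing-agent",                      # hypothetical workload
        scopes=frozenset({"invoices:read"}),          # hypothetical scope
        expires_at=issued_at + 300,                   # assumed 5-minute lifetime
    )
    print(authorise(token, "invoices:read"))
    print(authorise(token, "invoices:delete"))
    print(authorise(token, "invoices:read", now=issued_at + 301))
```

Because the check runs on each invocation, an agent that adapts its behaviour after release is still bounded by what the token allows at that moment.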

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	A01	Runtime checks are critical when agent behaviour diverges from static code review.
CSA MAESTRO	M2	MAESTRO ties agent governance to continuous validation across the runtime path.
NIST AI RMF		The AI RMF supports ongoing measurement of AI system behaviour after deployment.

Combine build-time scanning with runtime policy enforcement and telemetry for every agent action.
