Subscribe to the Non-Human & AI Identity Journal
Home FAQ Threats, Abuse & Incident Response What breaks when AI skills are judged only…
Threats, Abuse & Incident Response

What breaks when AI skills are judged only by static code review?

← Back to all FAQ
By NHI Mgmt Group Editorial Team Updated June 23, 2026 Domain: Threats, Abuse & Incident Response

Static review misses behaviours that only appear at runtime, including hidden exfiltration, environment variable access, and unauthorized network calls. A skill can look benign in source and still behave maliciously once executed, so code-only review creates a false sense of security and leaves the real attack path untested.

Why Static Code Review Fails for AI Skills

Static review is useful for spotting obvious unsafe code paths, but it is not enough for AI skills that execute with environment access, tool calls, and runtime inputs. A skill can appear harmless in source while still reading secrets, invoking external services, or changing behavior based on prompts, files, or network responses. That gap matters because runtime context is where most abuse actually emerges, which is why guidance from the NIST Cybersecurity Framework 2.0 and NHI research on the State of Secrets in AppSec both point toward operational validation, not code inspection alone.

For AI skills, the real question is not whether the source looks clean. It is whether the skill can be coerced at execution time into leaking credentials, chaining unauthorized tools, or reaching network destinations that were never intended by the developer. That is especially important when a skill inherits broad workload permissions from its host process or CI runner.

In practice, many security teams encounter misuse only after a skill has already executed in a production-like environment, rather than through intentional review of its source code.

How Runtime Behaviour Changes the Security Model

Static review checks what a skill is declared to do. Runtime testing checks what it can actually do when fed realistic inputs, connected to live services, and granted real credentials. That difference is critical for AI skills because their behavior is often conditional and context-driven. A benign-looking parser, retrieval helper, or orchestration routine may still access DeepSeek breach-style sensitive material if the execution environment exposes it.

Effective review usually combines code inspection with sandbox execution, policy enforcement, and telemetry. Security teams should verify:

  • What files, environment variables, and secrets the skill can read at runtime
  • Which outbound network calls occur during normal and adversarial prompts
  • Whether tool invocation is constrained by allowlists, not developer intent
  • Whether logs, traces, or callbacks leak tokens, payloads, or model outputs
  • Whether the skill behaves differently when system prompts, retries, or errors occur

Current guidance suggests treating AI skills as executable workloads, not passive code artifacts. That means pairing secure code review with runtime controls such as least privilege, ephemeral credentials, and request-time policy checks. Implementation thinking from NIST Cybersecurity Framework 2.0 is helpful here because it frames verification as an operational activity, not a one-time gate. The same logic appears in NHI research on secrets exposure, where leaked material often remains usable long after code has been approved.

These controls tend to break down when skills are executed inside broad CI runners or shared agent sandboxes, because the execution environment itself becomes the easiest path to secrets and network access.

Where Static Review Breaks Down in Real Environments

Tighter review often increases delivery overhead, so teams need to balance developer throughput against the risk of runtime abuse. The hard edge cases are usually environment-driven: a skill may pass review in a clean test harness but fail once deployed beside secret stores, metadata services, or internal APIs. Best practice is evolving, and there is no universal standard for this yet.

Two recurring failure modes stand out. First, static analysis cannot reliably prove the absence of hidden exfiltration when behavior depends on prompts, retrieved content, or third-party tool responses. Second, code review often assumes a stable trust boundary, while real AI workflows may discover new paths during execution, especially in multi-step orchestration. That is why organizations should validate skills in conditions that resemble production, including revoked-test credentials, restricted egress, and deliberate prompt abuse.

For teams managing secrets-heavy or agentic workloads, the practical lesson is simple: code review is necessary, but it is not a security verdict. The security decision belongs at runtime, where actual permissions, real data, and observable behavior can be measured together.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF set the governance and control requirements practitioners need to meet.

FrameworkControl / ReferenceRelevance
OWASP Agentic AI Top 10A04Static review misses agent behavior that only appears during execution.
CSA MAESTROGOV-05MAESTRO emphasizes runtime governance for autonomous tool-using systems.
NIST AI RMFAI RMF addresses operational validation of AI risks beyond source review.

Use AI RMF to assess behavior, context, and ongoing monitoring, not code alone.

NHIMG Editorial Note
Reviewed and updated by the NHIMG editorial team on June 23, 2026.
NHI Mgmt Group — the #1 independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org