TL;DR: AI-generated code introduces insecure-by-default patterns, hallucinated dependencies, and faster secret exposure. One USENIX study cited by Orca Security found hallucinated packages in at least 5.2% of commercial-model outputs and 21.7% of open-source-model outputs. Reachability, code-to-cloud context, and remediation quality now matter more than raw alert volume, because the real question is whether a finding can actually be exploited.
NHIMG editorial — based on content published by Orca Security: AI Code Security Solutions in 2026
By the numbers:
- A USENIX Security 2025 study of 576,000 AI-generated code samples found that hallucinated packages appeared in at least 5.2% of commercial-model outputs and 21.7% of open-source-model outputs.
- When AWS credentials are exposed publicly, attackers attempt access within an average of 17 minutes and as quickly as 9 minutes in some cases.
Questions worth separating out
Q: How should security teams govern AI-generated code in production environments?
A: Treat AI-generated code as untrusted until it passes the same controls you apply to external contributions.
Q: Why do AI-generated dependencies create more risk than normal dependency churn?
A: Because the model can invent a package name that looks legitimate but has no trusted history.
Q: What do security teams get wrong about vulnerability severity in AI-assisted code?
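A: They often assume the highest-severity finding should always be fixed first. A critical finding in code that never executes in production can matter less than a medium-severity finding on a reachable path.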
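A minimal sketch of reachability-first ordering, under stated assumptions: the Finding record, the SEVERITY_RANK scale, and the finding IDs are hypothetical, and the reachable flag is assumed to come from whatever reachability analysis or runtime context a team already has. It is an illustration of the idea, not any vendor's scoring model.

```python
from dataclasses import dataclass

# Hypothetical severity scale; real scanners use their own labels and scores.
SEVERITY_RANK = {"critical": 4, "high": 3, "medium": 2, "low": 1}


@dataclass(frozen=True)
class Finding:
    id: str
    severity: str
    reachable: bool  # assumed output of reachability analysis or runtime context


def prioritise(findings):
    """Put reachable findings first, then order by severity within each group."""
    return sorted(
        findings,
        key=lambda f: (f.reachable, SEVERITY_RANK[f.severity]),
        reverse=True,
    )


# Illustrative data only: one unreachable critical, two reachable lower-severity findings.
findings = [
    Finding("F-1", "critical", reachable=False),
    Finding("F-2", "medium", reachable=True),
    Finding("F-3", "high", reachable=True),
]
ordered_ids = [f.id for f in prioritise(findings)]
print(ordered_ids)  # ['F-3', 'F-2', 'F-1']
```

The unreachable critical finding drops to the bottom of the queue. It is not discarded, so it can be revisited if the code path later becomes reachable.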
Practitioner guidance
- Treat AI-generated code as untrusted input: Require review and security scanning for AI-generated changes before merge, especially in authentication, authorization, secrets, and data-handling code paths.
- Verify dependency provenance before installation: Pin versions, confirm the package exists, and reject unfamiliar imports until they are independently validated against the source registry.
- Prioritise reachable findings over raw severity: Use reachability analysis and runtime context to sort fixes by whether the vulnerable code actually executes in production.
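The dependency guidance above could be enforced with a pre-merge check along these lines. This is a sketch under assumptions: APPROVED_PACKAGES is a hypothetical allowlist that a team maintains after validating packages against the registry, the input is assumed to be simple requirements-style lines, and fastjson-utils is a made-up name used purely for illustration. It does not query any registry itself.

```python
import re

# Hypothetical allowlist of packages the team has already validated against the registry.
APPROVED_PACKAGES = {"requests", "flask"}

# Matches an exact pin such as "requests==2.32.3"; anything else is treated as unpinned.
PINNED = re.compile(r"^([A-Za-z0-9][A-Za-z0-9._-]*)==([A-Za-z0-9][A-Za-z0-9.!+_-]*)$")


def review_requirements(lines, approved=APPROVED_PACKAGES):
    """Return (line, reason) pairs for entries that should block a merge."""
    problems = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop trailing comments and whitespace
        if not line:
            continue
        match = PINNED.match(line)
        if match is None:
            problems.append((line, "not pinned to an exact version"))
            continue
        # Normalise the name so "Flask", "flask", and "fl_ask"-style variants compare consistently.
        name = re.sub(r"[-_.]+", "-", match.group(1)).lower()
        if name not in approved:
            problems.append((line, "package not independently validated"))
    return problems


# Illustrative input: one clean pin, one version range, one unfamiliar package.
sample = [
    "requests==2.32.3",
    "flask>=3.0",
    "fastjson-utils==1.0.0  # suggested by an assistant",
    "",
]
problems = review_requirements(sample)
for line, reason in problems:
    print(f"{line}: {reason}")
```

An allowlist keeps the check offline and deterministic. Confirming that a package actually exists and has a trusted history still requires a separate lookup against the source registry, which this sketch leaves out.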
What's in the full article
Orca Security's full article covers the operational detail this post intentionally leaves for the source:
- Comparative notes on AI code security vendors and how their workflows differ in practice.
- Detailed capability breakdowns across SAST, SCA, secrets detection, PR review, and remediation.
- Product-level discussion of code-to-cloud prioritisation and runtime exposure mapping.
- Selection criteria for teams choosing between developer-first, enterprise AppSec, and cloud-native platforms.
👉 Read Orca Security's analysis of AI code security solutions in 2026 →
AI code security and reachability: are your controls keeping up?
Explore further
AI code security is now an identity governance problem, not just an AppSec problem. AI-generated code does not merely introduce flaws. It creates new trust decisions about who or what authored a dependency, whether a secret is still live, and whether a workload should ever execute the code path that was generated. That pushes the topic directly into NHI governance because secrets, tokens, and workload identities are now part of the same control plane as source review. Practitioners should treat generated code as an identity-bearing artifact, not just a development convenience.
A few things that frame the scale:
- A USENIX Security 2025 study of 576,000 AI-generated code samples found that hallucinated packages appeared in at least 5.2% of commercial-model outputs and 21.7% of open-source-model outputs, according to LLMjacking: How Attackers Hijack AI Using Compromised NHIs.
- The same research observed more than 205,000 unique fake package names, which is why dependency trust now needs machine verification as well as human review.
A question worth separating out:
Q: How do organisations know if their AI code security controls are actually working?
A: Look for fewer reachable findings reaching production, faster revocation of committed secrets, and lower rates of unverified dependencies in pull requests. If scans are producing volume but not changing which issues are blocked before deployment, the programme is generating noise rather than risk reduction.
👉 Read our full editorial: AI code security now hinges on reachability, not severity