TL;DR: Large language models still recommend nonexistent packages at material rates, with 24.2% hallucinations in GPT-4, 22.2% in GPT-3.5, 64.5% in Gemini, and 29.1% in Cohere across 47,803 how-to prompts, according to Lasso Security research. The risk is not just bad answers, but a poisoned dependency path that security and engineering teams must validate before code reaches production.
NHIMG editorial — based on content published by Lasso Security: Diving Deeper into AI Package Hallucinations
By the numbers:
- GPT-4: 24.2% hallucinations, with 19.6% repetitiveness.
- Gemini: 64.5% hallucinations, with 14% repetitiveness.
- Within three months, the fake, empty test package received more than 30k authentic downloads.
Questions worth separating out
Q: What is the biggest risk when developers rely on LLMs for package recommendations?
A: The biggest risk is that the model invents a package that sounds legitimate, and the developer treats it as real. The attacker does not need to break the model to exploit that output: publishing a package under the hallucinated name is enough.
Q: How can security teams tell whether AI-generated package suggestions are being trusted too much?
A: Look for invented dependency names in code, tickets, build scripts, and chat threads, then trace whether they were ever validated against a registry or approved source.
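One way to start that trace is to compare the dependency names in a manifest against the list the team has already vetted. The following is a minimal sketch, assuming a Python project with requirements-style files; the `APPROVED` set, the function names, and the sample package name `fast_json_utils` are hypothetical illustrations, not part of Lasso Security's research.

```python
import re

# Hypothetical allowlist of dependencies the team has already vetted.
APPROVED = {"requests", "numpy", "flask"}

# Captures the leading package name of a requirements-style line.
NAME_RE = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def normalize(name: str) -> str:
    """Normalize a name as PEP 503 does: lowercase, runs of -, _, . become '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def unvetted_dependencies(requirements_text: str, approved: set[str]) -> list[str]:
    """Return normalized names that appear in the text but not in the allowlist."""
    approved_norm = {normalize(a) for a in approved}
    flagged: list[str] = []
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        match = NAME_RE.match(line)
        if match:
            name = normalize(match.group(1))
            if name not in approved_norm and name not in flagged:
                flagged.append(name)
    return flagged


if __name__ == "__main__":
    sample = "requests==2.31.0\nfast_json_utils>=1.0  # suggested in chat\nNumPy\n"
    print(unvetted_dependencies(sample, APPROVED))
```

A flagged name is not proof of a hallucination. It only marks a dependency that still needs the registry and ownership checks before anyone installs it.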
Practitioner guidance
- Verify every AI-suggested dependency against a trusted registry: Check package existence, maintainer identity, release history, and repository ownership before accepting any model-recommended dependency into code, documentation, or a ticket.
- Add a provenance gate before installation: Block builds unless the package comes from an approved source, has a verifiable signature or checksum, and matches an internal allowlist for the language ecosystem in use.
- Train developers to cross-check model output: Make AI-generated package names part of secure coding review.
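The first two items can be combined into a single pre-install check. The sketch below is one possible shape for such a gate, not a definitive implementation: the `Artifact` type, the `gate` function, the allowlist layout (package name mapped to an expected SHA-256 digest), and the injected `exists_in_registry` callable are all assumptions made for illustration. In a real pipeline that callable might query the registry's API (for PyPI, the JSON endpoint at `https://pypi.org/pypi/<name>/json` returns 404 for unknown projects). Existence alone proves little, since an attacker may already have registered the hallucinated name.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Artifact:
    name: str       # package name requested by the build
    source: str     # index URL the archive was downloaded from
    content: bytes  # raw bytes of the downloaded archive


def gate(
    artifact: Artifact,
    allowlist: dict[str, str],                  # hypothetical: name -> expected SHA-256 hex digest
    approved_sources: set[str],                 # hypothetical: trusted index URLs
    exists_in_registry: Callable[[str], bool],  # injected registry lookup
) -> list[str]:
    """Return the reasons to block the artifact; an empty list means allow."""
    reasons: list[str] = []
    if not exists_in_registry(artifact.name):
        reasons.append("not found in registry")
    if artifact.source not in approved_sources:
        reasons.append("unapproved source")
    expected = allowlist.get(artifact.name)
    if expected is None:
        reasons.append("not on allowlist")
    elif hashlib.sha256(artifact.content).hexdigest() != expected:
        reasons.append("checksum mismatch")
    return reasons
```

Returning every failed check, rather than stopping at the first, gives reviewers the full picture when a build is blocked.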
What's in the full report
Lasso Security's full research covers the operational detail this post intentionally leaves for the source:
- The full experimental setup across 47,803 how-to questions and four LLMs, including language-by-language methodology.
- The per-model breakdown of hallucinated package rates and repetitiveness across GPT-4, GPT-3.5, Gemini, and Cohere.
- The cross-model intersection analysis showing which hallucinated packages recurred across multiple systems.
- The case-study notes on the fake package test and the GitHub repository search that surfaced real-world adoption.
Read Lasso Security's research on AI package hallucinations and supply-chain risk.