What is the biggest risk when developers rely on LLMs for package recommendations?

Why This Matters for Security Teams

LLM package recommendations look helpful because they compress research time, but they also blur the line between a real dependency and a plausible-sounding invention. In software supply chains, that matters immediately: a developer can copy a package name from an answer, install it, and extend trust to code that was never validated for existence, ownership, or provenance. The risk is not just broken builds; it is dependency confusion, typosquatting exposure, and silent introduction of malicious code.

NHIMG research on The State of Secrets in AppSec shows how often security assumptions break down once sensitive workflows move into everyday developer activity. Current guidance from the OWASP Agentic AI Top 10 treats hallucinated outputs as a trust problem rather than a prompt-quality problem, because the damage occurs at the point of downstream action. In practice, many security teams discover package risk only after a build pipeline has already consumed a false recommendation and widened the blast radius.

How It Works in Practice

The failure mode is simple. A developer asks an LLM for a library to solve a task, and the model returns a package name that sounds credible. If the team treats that answer as authoritative, the next step is often a direct install, an internal wiki update, or a code example copied into production work. None of those steps verify whether the package exists, whether the publisher is legitimate, or whether the package has an abuse history.
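As a minimal sketch of the first missing check, the snippet below asks the registry whether a suggested name exists at all. It assumes the dependency is a Python package and uses the public PyPI JSON endpoint; the function name `pypi_lookup` and the injectable `fetch` parameter are illustrative choices, not part of any tool mentioned here. A successful lookup only shows that the name is registered. It says nothing about publisher legitimacy or abuse history, which still need separate review.

```python
import json
import urllib.error
import urllib.parse
import urllib.request


def pypi_lookup(name, fetch=urllib.request.urlopen):
    """Return PyPI metadata for `name`, or None if no such project is registered.

    `fetch` is injectable (an assumption of this sketch) so the check can be
    exercised without network access.
    """
    url = f"https://pypi.org/pypi/{urllib.parse.quote(name, safe='')}/json"
    try:
        with fetch(url, timeout=10) as response:
            return json.load(response)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            # The registry has no project under this name.
            return None
        # Any other HTTP error is not evidence either way, so surface it.
        raise
```

A `None` result is a strong signal that the model invented the name. A non-`None` result should be treated as the start of verification, not the end.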

Good practice is to force a verification step before any AI-suggested dependency is used. That usually means checking the package registry directly, confirming the maintainer namespace, reviewing download history, and validating provenance through signed releases or internal allowlists. Where possible, organisations should pair this with policy gates in CI so that unapproved dependencies cannot enter the software bill of materials without review. For broader governance, the NIST AI Risk Management Framework and the CSA MAESTRO agentic AI threat modeling framework both support the idea that AI outputs need downstream controls, not just model-side safeguards.
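The CI policy gate described above can be sketched as an allowlist check over a requirements file. This is an illustration under stated assumptions: the project uses pip-style requirement lines, the allowlist is a plain set of names, and the helpers `normalize`, `requirement_names`, and `unapproved` are hypothetical names invented for this example. The parser is deliberately simplified and is not a full requirement-specifier parser.

```python
import re


def normalize(name):
    """Normalize a project name the way PEP 503 does, so Foo_Bar matches foo-bar."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_names(text):
    """Extract project names from pip-style requirement lines (simplified)."""
    names = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(normalize(match.group(0)))
    return names


def unapproved(text, allowlist):
    """Return every dependency in `text` that is not on the allowlist."""
    allowed = {normalize(name) for name in allowlist}
    return [name for name in requirement_names(text) if name not in allowed]


if __name__ == "__main__":
    # Hypothetical requirements content; the second name stands in for an
    # AI-suggested package that nobody has reviewed.
    sample = "requests==2.31.0\nfast-json-utilz>=1.0\n"
    print(unapproved(sample, {"requests"}))
```

In a pipeline, a non-empty result would fail the build and route the new dependency to review. Failing closed keeps an unreviewed name out of the software bill of materials even when the parser does not fully understand the line.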

NHIMG has documented adjacent supply-chain abuse in the LiteLLM PyPI package breach, which is a reminder that package trust is an identity and provenance problem as much as a code quality problem. Teams should also align detection with package intelligence from the registry itself and with internal software admission controls. These controls tend to break down in fast-moving experimentation environments, where developers are allowed to bypass normal review paths and AI suggestions are copied straight into production work.

Common Variations and Edge Cases

Tighter dependency controls often increase developer friction, requiring organisations to balance speed against supply-chain assurance. That tradeoff is real, especially in prototyping teams that use LLMs as a first-pass research tool. Best practice is still evolving, and there is not yet a universal standard for when an AI-suggested dependency may be trusted without manual review.

Edge cases matter. A package may exist but be abandoned, recently transferred, or renamed in a way that makes the LLM answer misleading even if the name is technically valid. Internal packages are another exception: if a team mirrors dependencies or uses private registries, the model may recommend a public package that is functionally similar but operationally wrong. In regulated environments, the safer pattern is to treat the LLM as a suggestion engine only, then require human verification against registry metadata, maintainer identity, and approved source policy. NHIMG’s OWASP NHI Top 10 coverage and its Analysis of Claude Code Security both reinforce a simple point: automation accelerates mistakes as quickly as it accelerates work if trust checks are missing.
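The edge cases above can be turned into review flags once registry metadata has been collected. The sketch below assumes a hypothetical metadata dictionary with `source`, `last_release`, and `owner_changed` fields; these field names, the source labels, and both thresholds are illustrative assumptions, not values taken from any registry API or standard.

```python
from datetime import datetime, timedelta, timezone

# Illustrative thresholds only; each organisation would set its own.
STALE_AFTER = timedelta(days=730)
RECENT_TRANSFER = timedelta(days=90)


def review_flags(meta, approved_sources, now):
    """Return reasons a package that exists should still go to human review."""
    flags = []
    # A public package may be operationally wrong if policy requires a mirror.
    if meta.get("source") not in approved_sources:
        flags.append("unapproved-source")
    # No release date, or a very old one, suggests the project is abandoned.
    last_release = meta.get("last_release")
    if last_release is None or now - last_release > STALE_AFTER:
        flags.append("possibly-abandoned")
    # A recent ownership change means maintainer identity needs rechecking.
    owner_changed = meta.get("owner_changed")
    if owner_changed is not None and now - owner_changed < RECENT_TRANSFER:
        flags.append("recent-ownership-change")
    return flags


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # Hypothetical metadata for a suggested package.
    meta = {"source": "public", "last_release": None, "owner_changed": None}
    print(review_flags(meta, {"internal-mirror"}, now))
```

The flags are inputs to a human decision, not a verdict. A technically valid name can still be the wrong dependency, which is why the result feeds review rather than replacing it.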

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and OWASP Non-Human Identity Top 10 address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	N/A	Hallucinated package names are a trust and downstream-action risk in agentic workflows.
OWASP Non-Human Identity Top 10	NHI-03	Package trust depends on identity, provenance, and secret hygiene.
NIST AI RMF		AI RMF applies to governing downstream risks from model output misuse.

Add human verification and policy gates around AI-generated dependency recommendations.
