What breaks when unvetted skills are allowed into an agent marketplace?

The trust model breaks first. A user installs a skill expecting a utility feature, but the skill can become a delivery path for data exfiltration, malicious commands, or credential theft. Once the agent accepts the skill, the attacker inherits the agent’s permissions and can operate through a legitimate identity channel.

Why This Matters for Security Teams

An agent marketplace changes the trust boundary from code installation to runtime authority. When unvetted skills are allowed in, the issue is not just whether the skill is malicious, but whether it can inherit an agent’s permissions and act through a legitimate identity channel. That makes data loss, command injection, and credential theft more likely than with a normal app store model, because the agent is already authorised to do useful work.

Current guidance from the OWASP Agentic AI Top 10 and the NIST AI Risk Management Framework treats this as a governance and runtime control problem, not a static software review problem. NHI Management Group has repeatedly shown how exposed identities and secrets amplify blast radius, including the finding in the Ultimate Guide to NHIs that 92% of organisations expose NHIs to third parties. In practice, many security teams discover marketplace abuse only after a skill has already chained into sensitive tools and moved data out through an approved workflow.

How It Works in Practice

A safe marketplace needs to treat every skill as an untrusted workload until it proves what it is, what it can do, and under what context it may act. That means the control plane must evaluate authorisation at runtime, not just during install. Static RBAC alone is too blunt for autonomous systems because an agent’s tool use is goal-driven and variable. Best practice is evolving toward intent-based checks, short-lived credentials, and workload identity, especially where skills can call other tools on the agent’s behalf.

In practical terms, teams should combine policy-as-code with ephemeral access:

Issue JIT credentials per task and revoke them automatically when the task ends.
Bind each skill to a workload identity, such as SPIFFE-based identities or OIDC tokens, so the platform knows what is acting.
Limit each skill to narrowly scoped tool grants, not broad agent permissions.
Evaluate every privileged action against context, including destination, data type, and user intent.
Log skill provenance, prompt inputs, tool calls, and data egress for forensics.
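The controls above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: `SkillBroker`, `TaskCredential`, the `egress_policy` mapping, and the tool and host names are all hypothetical, and a real platform would back them with a workload identity system and a policy engine rather than an in-memory dictionary.

```python
import secrets
import time
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TaskCredential:
    """Short-lived, task-scoped credential (hypothetical shape)."""
    token: str
    skill_id: str
    allowed_tools: frozenset
    expires_at: float


@dataclass
class SkillBroker:
    """Issues JIT credentials and evaluates every tool call at runtime."""
    # Assumed policy format: tool name -> set of permitted destinations.
    egress_policy: dict
    ttl_seconds: float = 300.0
    audit_log: list = field(default_factory=list)
    _active: dict = field(default_factory=dict)

    def issue(self, skill_id, tools, now=None):
        """Mint a credential limited to the named tools for one task."""
        now = time.time() if now is None else now
        cred = TaskCredential(
            token=secrets.token_urlsafe(16),
            skill_id=skill_id,
            allowed_tools=frozenset(tools),
            expires_at=now + self.ttl_seconds,
        )
        self._active[cred.token] = cred
        return cred

    def revoke(self, cred):
        """Call when the task ends so the credential stops working."""
        self._active.pop(cred.token, None)

    def authorize(self, cred, tool, destination, now=None):
        """Check liveness, tool grant, and destination; log the decision."""
        now = time.time() if now is None else now
        live = self._active.get(cred.token)
        if live is None:
            allowed, reason = False, "revoked_or_unknown"
        elif now >= live.expires_at:
            allowed, reason = False, "expired"
        elif tool not in live.allowed_tools:
            allowed, reason = False, "tool_not_granted"
        elif destination not in self.egress_policy.get(tool, set()):
            allowed, reason = False, "destination_not_allowed"
        else:
            allowed, reason = True, "ok"
        # Every decision is recorded, including denials, for forensics.
        self.audit_log.append({
            "skill": cred.skill_id, "tool": tool,
            "destination": destination, "allowed": allowed,
            "reason": reason, "at": now,
        })
        return allowed
```

The design choice worth noting is that the grant lives on the credential, not on the agent: a skill that was issued only `http_get` cannot reach a shell tool even if the host agent can, and revoking the credential at task end removes the access without touching the agent's own permissions.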

The operational lesson is reinforced by NHIMG research on the OWASP NHI Top 10, which highlights how agentic applications can turn trusted integrations into attack paths. The same pattern appears in the AI LLM hijack breach analysis, where the compromise was not the model alone but the permissions attached to the workflow. These controls tend to break down when marketplace skills can self-chain into other tools across loosely segmented tenants, because the platform then loses a reliable boundary for intent, provenance, and revocation.

Common Variations and Edge Cases

Tighter skill approval often increases friction, so organisations must balance developer velocity against the risk of delegated execution. That tradeoff becomes sharper in marketplaces that support third-party plugins, internal power-user skills, or rapid experimentation by business teams. There is no universal standard for this yet, but current guidance suggests treating high-impact skills differently from low-risk utilities.

Three edge cases matter most. First, a benign skill can become unsafe after an update, so approval should be version-specific rather than permanent. Second, a skill that handles sensitive data may need a separate approval path even if its code looks harmless, because the real risk is runtime access, not static code quality. Third, agent-to-agent workflows can create privilege stitching, where each individual skill is limited but the chain collectively exposes sensitive systems.
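Two of these edge cases lend themselves to a short sketch: pinning approval to an exact artifact rather than a skill name, and checking the combined capabilities of a chain rather than each skill in isolation. The code below is illustrative only; `ApprovalRegistry`, `chain_violations`, and the capability names are assumptions introduced here, and the list of forbidden combinations would come from an organisation's own risk review.

```python
import hashlib


def artifact_digest(artifact: bytes) -> str:
    """Content hash that identifies one exact version of a skill."""
    return hashlib.sha256(artifact).hexdigest()


class ApprovalRegistry:
    """Records approval per (skill, digest), so an update is unapproved by default."""

    def __init__(self):
        self._approved = set()

    def approve(self, skill_id: str, artifact: bytes) -> None:
        self._approved.add((skill_id, artifact_digest(artifact)))

    def is_approved(self, skill_id: str, artifact: bytes) -> bool:
        return (skill_id, artifact_digest(artifact)) in self._approved


def chain_violations(chain_grants, forbidden_combinations):
    """Return forbidden capability sets that the whole chain can reach.

    chain_grants: list of capability sets, one per skill in the chain.
    forbidden_combinations: iterable of frozensets that must never be
    jointly available (hypothetical policy input).
    """
    combined = set().union(*chain_grants)
    return [combo for combo in forbidden_combinations if combo <= combined]
```

For example, a chain where one skill holds `read_secrets` and another holds `external_http` would be flagged against a forbidden pair of those two capabilities, even though neither skill alone violates the policy.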

For teams building governance around this, the NIST AI Risk Management Framework and the CSA MAESTRO agentic AI threat modeling framework both support layered review, monitoring, and accountability. The practical rule is simple: if a skill can touch secrets, external systems, or identity boundaries, it should be treated as an agentic supply-chain risk, not a harmless marketplace add-on.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	A2	Unvetted skills are a supply-chain and tool-use risk in agentic apps.
CSA MAESTRO		MAESTRO maps trust boundaries, workflow abuse, and agent chain risk.
NIST AI RMF		AI RMF guides governance for unpredictable autonomous behaviour and impact.

Model marketplace skills as untrusted components and add provenance, approval, and monitoring controls.
