Subscribe to the Non-Human & AI Identity Journal

Promptware

Promptware is malicious content designed to make an AI system carry out harmful actions through instructions instead of code. In practice, it abuses the model’s obedience to context, policy text, or local project prompts to redirect legitimate tools toward theft, exfiltration, or unauthorized changes.

Expanded Definition

Promptware is a form of malicious content that targets an AI system’s instruction-following behavior rather than its code path. It may appear in prompts, context windows, embedded documents, policy text, chat history, or local project instructions that the model treats as authoritative. The goal is to redirect legitimate execution toward harmful outcomes such as data theft, exfiltration, privilege misuse, or unauthorized changes. For security teams, the relevant boundary is not whether the content “looks like code,” but whether the model treats it as actionable instruction.

Definitions vary across vendors, but in NHI and agentic AI security the term usually covers both direct prompt injection and indirect instruction poisoning when the malicious content influences an autonomous agent’s tool use. That makes promptware closely related to the broader AI control plane, where an agent can access files, APIs, ticketing systems, or cloud resources after interpreting hostile text as a trusted directive. Guidance is still evolving, so teams should treat the term operationally rather than as a narrow malware category. See the NIST Cybersecurity Framework 2.0 for the governance lens that applies to these risks.

The most common misapplication is treating promptware as a harmless content problem, which occurs when defenders monitor only user input and ignore trusted files, retrieved documents, and agent memory.

Examples and Use Cases

Implementing promptware defenses rigorously often introduces friction in automation, requiring organisations to weigh agent productivity against stricter input validation and tool gating.

  • A poisoned support document instructs a helpdesk agent to reveal tokens or export customer records when the document is retrieved during a workflow.
  • A malicious prompt hidden in a project README changes an coding agent’s behavior so it edits policy files or pushes unsafe configuration changes.
  • An attacker places instruction text in a ticket or email that a service agent reads, causing the agent to call internal tools outside its intended scope.
  • A vendor-integrated chatbot receives hidden directives inside a knowledge base article and uses its permissions to query systems it should not touch.
  • During incident response, analysts compare the behavior to patterns discussed in the Ultimate Guide to NHIs, which highlights how overprivileged non-human identities expand blast radius when compromised.

Promptware is often discussed alongside NIST Cybersecurity Framework 2.0 because the practical response is not just content filtering, but governance over who can influence the agent, which tools it can invoke, and what data it can trust. It also appears in NHI reviews when a service account or API-driven agent is allowed to execute instructions from low-trust sources.

Why It Matters in NHI Security

Promptware matters because many AI agents operate with real credentials, delegated access, and broad execution authority. When hostile instructions succeed, the model may not “break” in the traditional sense. It simply behaves as designed, but against the organisation’s intent. That makes promptware especially dangerous in environments where an agent can read secrets, trigger workflows, modify records, or call external services on behalf of a service account.

NHI Management Group has found that 96% of organisations store secrets outside of secrets managers in vulnerable locations including code, config files, and CI/CD tools, which increases the damage potential if promptware steers an agent toward those locations. Combined with the fact that 80% of identity breaches involved compromised non-human identities such as service accounts and API keys, promptware becomes a practical identity-risk issue rather than only an AI safety issue. The governance response is to apply least privilege, strict context boundaries, tool allowlists, and auditability to every agent path that can touch NHI-controlled assets, using controls consistent with Ultimate Guide to NHIs guidance.

Organisations typically encounter promptware only after an agent has already leaked data, changed configuration, or executed an unauthorised action, at which point promptware becomes operationally unavoidable to address.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and OWASP Non-Human Identity Top 10 address the attack and risk surface, while NIST CSF 2.0 set the governance and control requirements practitioners need to meet.

Framework Control / Reference Relevance
OWASP Agentic AI Top 10 Covers prompt injection and agent abuse patterns that define promptware.
OWASP Non-Human Identity Top 10 NHI-07 Promptware becomes an NHI risk when agents misuse service credentials or secrets.
NIST CSF 2.0 PR.AC-3 Identity and access management controls reduce harm from agent-driven misuse.

Constrain agent instructions, validate untrusted context, and restrict tool execution paths.