Who is accountable when an employee leaks a secret into an AI prompt?

Why This Matters for Security Teams

An employee who leaks a secret into an AI prompt is not, first and foremost, creating an AI problem. The core issue is whether the organisation treated the prompt as a governed data path, with clear ownership, detection, and response. That maps directly to the broader NHI exposure patterns described in resources such as the Guide to the Secret Sprawl Challenge, where unmanaged secrets move faster than teams can contain them.

Current guidance suggests accountability should follow the controls that should have prevented the disclosure: acceptable-use policy, data classification, identity governance, and secret handling workflows. The model is only the receiving system. The human is the actor, but the organisation is accountable for whether the environment made risky disclosure likely or detectable. In the real world, teams usually discover this only after the secret has already been copied, reused, or exposed in downstream logs.

How It Works in Practice

Accountability is usually shared across three layers: the employee, the manager or system owner, and the security or governance function that defined the control. The employee is accountable for violating policy. The organisation is accountable for whether policy existed, was usable, and was enforced. That distinction matters because prompt-based leakage often starts as an ordinary workflow action, not a deliberate exfiltration event.

Practically, mature organisations treat prompt handling like a sensitive data control point. They classify which secrets may never enter prompts, which may be masked, and which require approval. They log prompt submission decisions, monitor for secret-like patterns, and route high-risk events into incident response. This aligns with the direction of the OWASP Non-Human Identity Top 10, which emphasises that identity and credential exposure must be managed as a lifecycle problem, not a one-time policy statement.

Use data classification to mark secrets that are prohibited in prompts.

Apply DLP or prompt inspection controls before content leaves the endpoint or tenant.

Record who submitted the prompt, what controls were bypassed, and what response was returned.

Rotate or revoke the secret immediately if it may have been exposed.

Review whether AI tools are approved channels for the data type involved.
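The inspection and recording steps listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production DLP control: the names `SECRET_PATTERNS`, `PromptDecision`, and `inspect_prompt` are hypothetical, and the three regular expressions are only examples of widely recognised secret formats. A real deployment would rely on a maintained detection ruleset and the organisation's own logging pipeline.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timezone

# Illustrative patterns only (assumption): a real control would use a
# maintained ruleset rather than three hand-written expressions.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_pat": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


@dataclass
class PromptDecision:
    """Audit record: who submitted the prompt, to which tool, and the outcome."""
    user: str
    tool: str
    action: str            # "allow" or "block"
    findings: list[str]    # names of the patterns that matched
    masked_prompt: str     # prompt text with matches redacted, safe to log
    timestamp: str         # UTC, ISO 8601


def inspect_prompt(user: str, tool: str, prompt: str) -> PromptDecision:
    """Check a prompt for secret-like patterns before it leaves the tenant."""
    findings: list[str] = []
    masked = prompt
    for name, pattern in SECRET_PATTERNS.items():
        if pattern.search(masked):
            findings.append(name)
            # Redact the match so the audit log never stores the secret itself.
            masked = pattern.sub(f"[REDACTED:{name}]", masked)
    action = "block" if findings else "allow"
    return PromptDecision(
        user=user,
        tool=tool,
        action=action,
        findings=findings,
        masked_prompt=masked,
        timestamp=datetime.now(timezone.utc).isoformat(),
    )
```

A blocked decision would then be routed into incident response, and any matched credential rotated or revoked, since pattern matching cannot prove that a secret was never exposed elsewhere.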

The operational burden is real: Akeyless reports that the average time to mitigate a leaked secret is 36 hours, which shows how a simple prompt mistake can become a sustained exposure problem. Organisations that ignore this often see the same failure pattern described in NHIMG’s The 52 NHI breaches Report and in broader credential abuse research such as the Anthropic AI-orchestrated cyber espionage campaign report. These controls tend to break down when employees can paste into consumer AI tools from unmanaged endpoints, because the organisation loses both visibility and enforcement.

Common Variations and Edge Cases

Tighter prompt controls often increase friction for legitimate work, requiring organisations to balance productivity against leakage risk. That tradeoff is especially sharp in engineering, customer support, and legal review, where users may need to interact with sensitive material under time pressure.

There is no universal standard for this yet. Current guidance suggests that accountability shifts based on whether the prompt path was approved, whether the secret was masked, and whether the tool was sanctioned for that data class. If an employee bypasses policy, the employee bears direct conduct accountability, but the organisation still owns the governance gap if detection, training, or technical guardrails were weak.

Edge cases matter. An API key pasted into a public chatbot is different from a redacted token fragment submitted to an internal assistant with retention disabled. The former is usually a clear policy breach. The latter may indicate a control design failure if the organisation had not implemented masking, restricted tool access, or defined retention and logging rules. This is why the safest answer is not “blame the model” but “trace the control failure across identity, data handling, and AI use policy.”
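The tracing approach described above can be expressed as a small triage function. This is a hypothetical sketch, not an established standard: the `PromptEvent` fields and the wording of each finding are assumptions chosen to mirror the factors named in this section (approved prompt path, masking, tool sanctioned for the data class, and the strength of guardrails).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptEvent:
    """Facts gathered about one disclosure (hypothetical field names)."""
    tool_sanctioned_for_data_class: bool
    prompt_path_approved: bool
    secret_masked: bool
    guardrails_in_place: bool  # detection, training, or technical controls existed


def trace_control_failure(event: PromptEvent) -> list[str]:
    """Return the layers where a control failed, instead of a single culprit."""
    findings: list[str] = []
    if not (event.tool_sanctioned_for_data_class and event.prompt_path_approved):
        findings.append("conduct: an unapproved channel was used for this data class")
    if not event.secret_masked:
        findings.append("data handling: the secret entered the prompt unmasked")
    if not event.guardrails_in_place:
        findings.append("governance: detection, training, or guardrails were weak")
    return findings
```

An event can produce several findings at once, which reflects the point that employee conduct and organisational governance gaps are assessed together rather than as alternatives.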

Best practice is evolving, but the operational pattern is consistent: if the organisation cannot show who approved the AI tool, what data was allowed, and how leaked secrets are revoked, accountability will rest with the organisation’s governance model, not with the model itself.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Non-Human Identity Top 10 and OWASP Agentic AI Top 10 address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Non-Human Identity Top 10	NHI-01	Secret leakage is an identity exposure problem, not just a prompt problem.
OWASP Agentic AI Top 10	AI-03	AI prompt handling needs controls for unsafe data flow and tool use.
NIST AI RMF		Accountability for AI use falls under governance, mapping, and monitoring functions.

Restrict sensitive prompts, inspect content, and block risky agent or assistant actions at runtime.

