Who is accountable when an AI agent exfiltrates secrets through a support workflow?

Accountability sits with the team that designed the privilege boundary and the data path, not with the model itself. If the workflow allowed a privileged agent to read sensitive data and write it into a customer-visible channel, the control failure was architectural. Governance, logging, and containment must be owned by the programme that exposed the path.

Why This Matters for Security Teams

When an AI agent can read a secret, interpret a customer issue, and place that secret into a support reply, the failure is not “the model misbehaved.” It is a boundary design problem. Accountability follows the team that approved the workflow, the data classification, and the privilege path. The OWASP Agentic AI Top 10 and the NIST AI Risk Management Framework both point to the same practical issue: autonomous systems require controls around what they can access, when they can access it, and where outputs can flow.

For NHI and secrets governance, the important question is whether the agent had standing authority to cross a trust boundary that humans would never be allowed to cross manually. That includes support tooling, ticketing systems, chat interfaces, and any workflow that can transform backend data into customer-visible text. NHIMG research on The State of Secrets in AppSec shows how often organisations overestimate their control maturity while secrets remain exposed for weeks. In practice, many security teams discover accountability gaps only after a sensitive value has already been copied into an external channel, not while the workflow is being designed.

How It Works in Practice

The right accountability model starts with the privilege boundary, not the model label. If the agent is an operational actor, it should be treated as a non-human identity with tightly scoped, task-specific access. That means the workflow owner must define what the agent may read, what it may write, and which fields are never allowed to traverse into support responses. Where possible, secrets should be isolated behind a brokered service so the agent never receives raw credentials in the first place.
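The brokered-service idea can be sketched as follows. This is a minimal illustration, not a reference design: the SecretBroker class, its method names, and the handle format are all hypothetical, and a real deployment would sit in front of an actual vault and backend. The point it shows is that the agent only ever holds an opaque handle, while the raw credential stays inside the broker.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class SecretBroker:
    """Hypothetical broker that holds raw credentials and acts on the agent's behalf."""

    _vault: dict = field(default_factory=dict)    # secret name -> raw value
    _handles: dict = field(default_factory=dict)  # opaque handle -> secret name

    def store(self, name: str, value: str) -> None:
        """Called by the workflow owner, never by the agent."""
        self._vault[name] = value

    def issue_handle(self, name: str) -> str:
        """Give the agent an opaque reference instead of the credential."""
        if name not in self._vault:
            raise KeyError(f"unknown secret: {name}")
        handle = "h_" + secrets.token_hex(8)
        self._handles[handle] = name
        return handle

    def call_backend(self, handle: str, request: str) -> str:
        """Use the raw secret inside the broker and return only the result."""
        raw = self._vault[self._handles[handle]]
        # A real broker would authenticate to the backend with `raw` here.
        authenticated = len(raw) > 0
        return f"backend handled {request!r} (authenticated={authenticated})"


broker = SecretBroker()
broker.store("billing_api_key", "sk-example-123")   # illustrative value only
handle = broker.issue_handle("billing_api_key")     # this is all the agent sees
print(broker.call_backend(handle, "lookup invoice 42"))
```

Because the agent's context contains only the handle and the backend result, there is no raw credential for it to copy into a support reply.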

Practitioners increasingly combine workload identity, just-in-time access, and real-time policy evaluation. SPIFFE and SPIRE are commonly used to prove what the workload is, while policy engines such as OPA or Cedar can decide at runtime whether a given support action is allowed. That runtime decision should account for ticket type, customer tier, data sensitivity, and whether the output channel is internal or external. This is more effective than static role assignment because the agent’s behaviour is goal-driven and can branch in ways a human analyst would not predict.
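A runtime decision of this kind can be sketched in plain Python. This is a stand-in for a policy engine, not OPA or Cedar syntax, and every label, rule, and name below (SupportAction, evaluate, the tier and sensitivity values) is an assumption made for illustration. It shows a deny-by-default check over the four inputs named above.

```python
from dataclasses import dataclass
from typing import NamedTuple


class Decision(NamedTuple):
    allowed: bool
    reason: str


@dataclass(frozen=True)
class SupportAction:
    ticket_type: str       # e.g. "billing", "incident" (hypothetical labels)
    customer_tier: str     # e.g. "standard", "enterprise"
    data_sensitivity: str  # "public", "internal", or "secret"
    channel: str           # "internal" or "external"


SENSITIVITY_RANK = {"public": 0, "internal": 1, "secret": 2}


def evaluate(action: SupportAction) -> Decision:
    """Decide at request time whether the agent's action may proceed."""
    rank = SENSITIVITY_RANK.get(action.data_sensitivity)
    if rank is None or action.channel not in ("internal", "external"):
        return Decision(False, "unrecognised label, denied by default")
    if rank == SENSITIVITY_RANK["secret"]:
        return Decision(False, "secrets are never released to a support channel")
    if action.channel == "external":
        if rank > SENSITIVITY_RANK["public"]:
            return Decision(False, "only public data may reach an external channel")
        # Illustrative rule only: the tier restriction is an assumed business policy.
        if action.ticket_type == "incident" and action.customer_tier != "enterprise":
            return Decision(False, "external incident replies limited to enterprise tier")
    return Decision(True, "allowed")


print(evaluate(SupportAction("billing", "standard", "internal", "external")))
```

The same request can be allowed on an internal channel and denied on an external one, which a static role assignment cannot express.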

  • Issue ephemeral credentials per task, then revoke them immediately after completion.
  • Log the input, tool call, and output path so exfiltration can be traced to a workflow owner.
  • Separate redaction, summarisation, and customer reply steps so raw secrets never reach the final channel.
  • Use policy-as-code to block any action that would copy secrets into a support-visible surface.
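The first three controls in the list above can be sketched together. This is a simplified illustration under stated assumptions: the in-memory ACTIVE_TOKENS set and AUDIT_LOG list stand in for a real credential service and audit store, the secret pattern is a made-up example, and the summarisation step is a placeholder truncation rather than a model call.

```python
import hashlib
import re
import secrets
from contextlib import contextmanager

ACTIVE_TOKENS = set()  # stand-in for a real credential service
AUDIT_LOG = []         # stand-in for an append-only audit store

# Hypothetical pattern; a real deployment would use its own secret detectors.
SECRET_PATTERN = re.compile(r"\b(?:sk|tok)-[A-Za-z0-9]{8,}\b")


@contextmanager
def task_credential(task_id):
    """Issue a credential for one task and revoke it when the task ends."""
    token = secrets.token_urlsafe(16)
    ACTIVE_TOKENS.add(token)
    try:
        yield token
    finally:
        ACTIVE_TOKENS.discard(token)  # revoked even if the task fails


def redact(text):
    """Redaction runs as its own step, before any drafting happens."""
    return SECRET_PATTERN.sub("[REDACTED]", text)


def draft_reply(task_id, owner, backend_context):
    with task_credential(task_id) as token:
        # A real tool call would present `token` to the backend here.
        redacted = redact(backend_context)       # step 1: redaction
        summary = redacted[:200]                 # step 2: summarisation (placeholder)
        reply = f"Update on your case: {summary}"  # step 3: customer reply
        AUDIT_LOG.append({
            "task_id": task_id,
            "owner": owner,  # the workflow owner accountable for this path
            # Hash rather than store the input, so the log cannot leak it.
            "input_sha256": hashlib.sha256(backend_context.encode()).hexdigest(),
            "tool_call": "draft_reply",
            "output_path": "external_support_reply",
        })
    return reply


print(draft_reply("T-1", "support-platform", "Key sk-abc12345XYZ failed auth"))
```

Logging a hash of the input instead of the input itself is a design choice made here so the audit trail does not become a second copy of the secret. Teams that need the full input for forensics would store it in a restricted location instead.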

NHIMG’s Analysis of Claude Code Security and the OWASP NHI Top 10 both reinforce that agentic systems fail when static access assumptions meet dynamic tool use. These controls tend to break down when a support platform allows direct copy-paste from privileged backend context into a customer-facing conversation, because the data path collapses the intended containment model.

Common Variations and Edge Cases

Tighter support workflow controls often increase operational friction, requiring organisations to balance faster case resolution against stronger containment. That tradeoff is real, especially when teams want agents to summarise incidents, draft responses, or assist tier-one support without adding manual review at every step.

There is no universal standard for this yet, but current guidance suggests that the most defensible model is layered. A low-risk customer service assistant may only see sanitised context, while a privileged incident-response agent might access fuller telemetry inside an internal-only environment. The accountability then shifts by role: platform engineering owns the technical boundary, the product team owns the customer channel design, and security owns policy enforcement and monitoring. If the workflow crosses regulated data, legal and privacy owners must also approve the disclosure path.
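One way to picture the layered model is an allow-list of context fields per agent role. The sketch below is illustrative only: the role names, field names, and CONTEXT_VIEWS mapping are assumptions, not part of any standard.

```python
# Hypothetical allow-lists: each agent role sees only the fields listed for it.
CONTEXT_VIEWS = {
    "customer_service_assistant": {"ticket_id", "product", "status"},
    "incident_response_agent": {"ticket_id", "product", "status", "error_trace", "host"},
}


def build_context(role, record):
    """Return only the fields the role may see; unknown roles see nothing."""
    allowed = CONTEXT_VIEWS.get(role, set())
    return {key: value for key, value in record.items() if key in allowed}


record = {
    "ticket_id": "T-1",
    "product": "billing",
    "status": "open",
    "error_trace": "auth failure at gateway",
    "host": "internal-host-01",
    "api_key": "sk-example-123",  # listed for no role, so it is never passed on
}

print(build_context("customer_service_assistant", record))
print(build_context("incident_response_agent", record))
```

An allow-list fails closed: a new backend field stays invisible to every agent until an owner explicitly adds it to a view.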

Edge cases arise when agents chain tools across systems, such as pulling from a knowledge base, opening a ticket, and posting a reply through a separate integration. That creates multiple exfiltration points, and each one needs an explicit policy decision. NHIMG’s Guide to the Secret Sprawl Challenge shows why distributed secrets handling magnifies this problem, while the CSA MAESTRO agentic AI threat modeling framework and NIST AI Risk Management Framework both support continuous governance rather than one-time approval. The hard edge case is any environment where the support workflow itself is allowed to output unredacted backend data to external recipients, because then accountability is shared across design, operations, and oversight.
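The requirement that each hop in a tool chain gets its own policy decision can be sketched as below. The tool names and sensitivity ceilings in TOOL_POLICY are hypothetical, and the chain is represented as a simple list of (tool, sensitivity) pairs rather than real integrations.

```python
RANK = {"public": 0, "internal": 1, "secret": 2}

# Hypothetical per-tool ceilings: the most sensitive data each tool may handle.
TOOL_POLICY = {
    "kb_search": "internal",
    "open_ticket": "internal",
    "post_reply": "public",  # customer-visible, so only public data
}


def check_hop(tool, sensitivity):
    """Deny by default: unknown tools and unknown labels are refused."""
    ceiling = TOOL_POLICY.get(tool)
    if ceiling is None or sensitivity not in RANK:
        return False
    return RANK[sensitivity] <= RANK[ceiling]


def run_chain(hops):
    """Evaluate every hop and stop the chain at the first denial."""
    decisions = []
    for tool, sensitivity in hops:
        allowed = check_hop(tool, sensitivity)
        decisions.append((tool, sensitivity, allowed))
        if not allowed:
            break
    return decisions


chain = [("kb_search", "internal"), ("open_ticket", "internal"), ("post_reply", "internal")]
for decision in run_chain(chain):
    print(decision)
```

In this example the first two hops pass and the chain stops at post_reply, because internal data would otherwise reach a customer-visible surface. The recorded decisions also give each hop a traceable outcome for later review.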

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework | Control / Reference | Relevance
OWASP Agentic AI Top 10 | N/A | Agentic workflows need runtime guardrails for autonomous actions and data release.
CSA MAESTRO | N/A | MAESTRO addresses threat modeling for agentic workflows and support-channel abuse.
NIST AI RMF | | AI RMF governs accountability, measurement, and continuous risk treatment for AI systems.

Define per-tool policies and block any agent action that can expose secrets to external channels.