
Why does sensitive information disclosure become an identity problem in LLM systems?

Sensitive information disclosure becomes an identity problem because the model can only expose what its connected accounts and routes allow it to access. If the workflow has broad permissions, the model can return or leak far more than intended. Narrow permissions, short-lived credentials, and per-route controls reduce the chance that a single request turns into a large-scale data exposure.

Why This Becomes an Identity Problem

Sensitive information disclosure is not just a content-safety issue. It becomes an identity problem when the LLM is operating through connected accounts, scoped tokens, shared service identities, and tool routes that determine what data can be reached, assembled, and returned. If those identities are broad, the model can expose more than the user should ever see. That risk is now common enough to show up in agentic deployments, where SailPoint’s AI Agents: The New Attack Surface report found 31% of organisations reported AI agents inappropriately sharing sensitive data and 23% revealing access credentials.

The practical failure mode is simple: the model does not need to “steal” data in the human sense. It only needs permission to retrieve, transform, or relay it. That means the security question shifts from prompt content to identity design, route segregation, and least privilege at the workload level. The OWASP Agentic AI Top 10 and the NIST AI Risk Management Framework both point toward controlling what the system can access, not just what it can say. In practice, many teams discover the leak only after a broad integration has already exposed sensitive records through a routine response path.

How It Works in Practice

The core control objective is to make every data path identity-bound, narrowly scoped, and easy to revoke. An LLM should not inherit a catch-all enterprise account. It should authenticate as a workload identity with explicit routes for specific tasks, backed by short-lived credentials and policy checks at request time. This is where non-human identity (NHI) design matters: the principles described in the Ultimate Guide to NHIs apply to AI systems because the model’s access is only as safe as the secret, token, or delegated identity behind it.

In operational terms, effective patterns usually include:

  • Per-route identity separation, so retrieval, summarisation, ticketing, and outbound messaging do not share one credential.
  • Just-in-time credentials with short TTLs, so tokens expire after the task completes or the session ends.
  • Policy-as-code at runtime using context such as user, route, dataset sensitivity, and purpose of use.
  • Redaction and data minimisation before the model ever sees sensitive fields.
  • Logging that records which identity accessed which source, so disclosure can be traced after the fact.
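The first three patterns above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the route names, scope strings, sensitivity levels, and the functions issue_credential and authorize are all hypothetical, and a real deployment would obtain tokens from a workload identity provider rather than generate them locally.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Optional

# Hypothetical mapping: each route gets its own narrow scope set,
# so retrieval, ticketing, and messaging never share a credential.
ROUTE_SCOPES = {
    "retrieval": {"kb:read"},
    "ticketing": {"tickets:write"},
    "outbound_messaging": {"mail:send"},
}

# Hypothetical ceiling on dataset sensitivity per route
# (0 = public, 3 = restricted).
ROUTE_MAX_SENSITIVITY = {"retrieval": 2, "ticketing": 1, "outbound_messaging": 0}


@dataclass(frozen=True)
class RouteCredential:
    route: str
    scopes: frozenset
    token: str
    expires_at: float

    def is_valid(self, now: float) -> bool:
        # The credential is unusable at or after its expiry time.
        return now < self.expires_at


def issue_credential(route: str, ttl_seconds: int = 300,
                     now: Optional[float] = None) -> RouteCredential:
    """Issue a short-lived credential bound to exactly one route."""
    if route not in ROUTE_SCOPES:
        raise KeyError(f"unknown route: {route}")
    now = time.time() if now is None else now
    return RouteCredential(
        route=route,
        scopes=frozenset(ROUTE_SCOPES[route]),
        token=secrets.token_urlsafe(32),
        expires_at=now + ttl_seconds,
    )


def authorize(cred: RouteCredential, required_scope: str,
              dataset_sensitivity: int, now: Optional[float] = None) -> bool:
    """Request-time policy check: expiry, scope, then dataset sensitivity."""
    now = time.time() if now is None else now
    if not cred.is_valid(now):
        return False
    if required_scope not in cred.scopes:
        return False
    return dataset_sensitivity <= ROUTE_MAX_SENSITIVITY[cred.route]
```

In this sketch a retrieval credential cannot send mail, cannot read a dataset above its sensitivity ceiling, and stops working once its TTL elapses, so a single over-eager request has a bounded blast radius.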

Current guidance suggests using workload identity primitives rather than static shared secrets wherever possible, including cryptographic identities and ephemeral tokens. That aligns with implementation thinking in the CSA MAESTRO agentic AI threat modeling framework, which treats tool access and data access as part of the system’s attack surface. When organisations fail to do this, the model may retrieve a sensitive object once and then surface it repeatedly across follow-on prompts, summaries, and downstream integrations. These controls tend to break down when one shared backend identity powers many tools because the model can reuse broad access across unrelated workflows.
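One way to limit repeated surfacing of a sensitive object is to strip it before it enters the model's context, and to record which identity touched which source. The sketch below assumes a hypothetical set of sensitive field names and a simple email pattern; production redaction would normally rely on a maintained classification or DLP service rather than a hand-written list.

```python
import re
from datetime import datetime, timezone

# Hypothetical field names treated as sensitive in this example.
SENSITIVE_FIELDS = {"ssn", "api_key", "password"}

# Simple illustrative pattern; it will not catch every valid email address.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_record(record: dict) -> dict:
    """Return a copy with sensitive fields dropped and emails masked."""
    cleaned = {}
    for key, value in record.items():
        if key in SENSITIVE_FIELDS:
            continue  # data minimisation: the model never sees this field
        if isinstance(value, str):
            value = EMAIL_RE.sub("[REDACTED_EMAIL]", value)
        cleaned[key] = value
    return cleaned


def audit_entry(identity: str, route: str, source: str, fields) -> dict:
    """Build a log entry tying an identity and route to the data it read."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "identity": identity,
        "route": route,
        "source": source,
        "fields": sorted(fields),
    }
```

Calling redact_record before prompt assembly, and audit_entry with the keys that survived redaction, gives a trail of what each identity actually passed to the model.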

Common Variations and Edge Cases

Tighter data access controls often increase integration overhead, requiring organisations to balance operational speed against the risk of overexposure. That tradeoff is most visible in environments with legacy APIs, shared service accounts, or analytics pipelines that were never designed for task-specific identity. Best practice is evolving, and there is no universal standard for this yet, but the direction is clear: separate the identities that fetch data from the identities that present it.

Some edge cases need special handling. In retrieval-augmented generation, the risk is not only the final answer but the documents selected for context. In multi-agent workflows, one agent’s overbroad access can become another agent’s exfiltration path. In regulated environments, even a harmless-looking summary can become a disclosure if it combines fields that were never meant to be joined. NHIMG research on the AI agents attack surface and the OWASP NHI Top 10 both reinforce that identity boundaries must hold across prompts, tools, and data routes.
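Two of these edge cases lend themselves to a short illustration: filtering retrieval candidates against the requesting user's entitlements before they become context, and rejecting summaries that would join fields not meant to appear together. The Document type, group names, and forbidden combinations below are assumptions made for the example, not part of any cited framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str
    allowed_groups: frozenset  # groups permitted to read this document


def select_context(candidates, user_groups: frozenset, limit: int = 3):
    """Keep only documents the requesting user could read directly."""
    allowed = [d for d in candidates if d.allowed_groups & user_groups]
    return allowed[:limit]


# Hypothetical field combinations that must never appear in one response.
FORBIDDEN_COMBINATIONS = [
    frozenset({"name", "diagnosis"}),
    frozenset({"employee_id", "salary"}),
]


def violates_join_policy(fields: set) -> bool:
    """True if the output would combine fields that must stay separate."""
    return any(combo <= fields for combo in FORBIDDEN_COMBINATIONS)
```

The filter runs on the user's groups, not the retrieval service's own access, which is the point: the service identity may be able to fetch everything, but only entitled documents should reach the prompt.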

Exceptions also appear when teams rely on human consent as a proxy for system permission. That works poorly for autonomous workflows because the agent may chain calls, cache outputs, or reuse context in ways the human never intended. For that reason, disclosure controls should be enforced at the identity and route layer first, with prompts and content filters as supporting controls, not primary safeguards.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework | Control / Reference | Relevance
OWASP Agentic AI Top 10 | LLM-03 | Covers data leakage through agent tool use and prompt flows.
CSA MAESTRO | T1 | Maps agent identity and tool access to runtime threat modeling.
NIST AI RMF | (not specified) | Addresses governance and measurement of AI disclosure risk.

Model each agent route, secret, and tool as a separate trust boundary with explicit controls.