RAG access control gaps expose a new identity governance problem

By NHI Mgmt Group Editorial Team | Published 2025-07-02 | Domain: Best Practices | Source: Lasso Security

TL;DR: Retrieval-augmented generation can surface unauthorized material when vector search is not constrained by identity and document-level permissions, according to Lasso Security. The practical issue is not hallucination alone but who can retrieve what, which makes access control and context filtering central to enterprise AI governance.

At a glance

What this is: This analysis shows that RAG systems can expose sensitive internal content if retrieval is not tied to user and document permissions.

Why it matters: IAM, IGA, and security teams must govern LLM access paths with the same discipline they apply to shared files, service accounts, and other non-human identities.


Context

Retrieval-augmented generation, or RAG, is an access control problem as much as it is an LLM quality problem. When a model can retrieve from broad internal indexes, the question becomes which users are allowed to surface which content, not just whether the answer is accurate. For IAM and governance teams, that puts permissions, metadata, and retrieval boundaries directly into the identity design.

The weakness is familiar: broad access in shared storage becomes more dangerous once a conversational interface can find and repackage hidden material. RAG can amplify existing permission debt by making buried files easier to discover and easier to misuse. That is why RAG governance belongs alongside non-human identity, access reviews, and context-aware policy design, not in a separate AI silo.

Key questions

Q: How should security teams control access in RAG applications?

A: Security teams should control access in RAG applications by treating retrieval as an authorisation step, not just a search function. The model should only receive context from documents the requester is permitted to see, and those permissions should be enforced before generation. Shared indices, stale metadata, and broad storage permissions all increase the chance of disclosure.

Q: Why do RAG systems create data exposure risk even without prompt injection?

A: RAG systems create data exposure risk even without prompt injection because the retrieval engine can surface legitimate but unauthorized content. If indexing is broad and permissions are weak, a normal query may retrieve sensitive documents that the user should never discover through ordinary navigation. The risk comes from access design, not just adversarial prompts.

Q: What do teams get wrong about document-level access control for AI search?

A: Teams often assume document-level access control is enough once metadata filters are in place. In practice, those filters fail when metadata is incomplete, outdated, or inconsistently mapped across sources. The result is a false sense of control, especially where documents mix public and restricted material in the same corpus.
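One way to limit the damage from incomplete or outdated metadata is to fail closed: a document with missing or stale permission metadata is simply not retrievable. The sketch below is illustrative only. The field names `allowed_groups` and `acl_synced_at` and the 30-day freshness window are assumptions, not details from the Lasso Security article.

```python
from datetime import datetime, timedelta, timezone

MAX_METADATA_AGE = timedelta(days=30)  # assumed freshness window, not from the source


def is_retrievable(doc_meta, user_groups, now):
    """Default-deny check: any gap in the permission metadata blocks retrieval."""
    allowed = doc_meta.get("allowed_groups")   # hypothetical field name
    synced_at = doc_meta.get("acl_synced_at")  # hypothetical field name
    if not allowed or synced_at is None:
        return False  # incomplete metadata: deny rather than guess
    if now - synced_at > MAX_METADATA_AGE:
        return False  # stale metadata: deny until the ACL is re-synced
    return bool(set(allowed) & set(user_groups))


NOW = datetime(2025, 7, 2, tzinfo=timezone.utc)
fresh = {"allowed_groups": ["finance"], "acl_synced_at": NOW - timedelta(days=2)}
stale = {"allowed_groups": ["finance"], "acl_synced_at": NOW - timedelta(days=90)}
untagged = {"acl_synced_at": NOW}

decisions = {
    "fresh": is_retrievable(fresh, {"finance"}, NOW),
    "stale": is_retrievable(stale, {"finance"}, NOW),
    "untagged": is_retrievable(untagged, {"finance"}, NOW),
}
```

Only the fresh, fully tagged document is released here. The trade-off is availability: a fail-closed filter hides documents until their metadata is repaired, which is usually the safer failure for restricted material.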

Q: Who is accountable when a RAG system reveals restricted internal content?

A: Accountability usually sits with the organisation operating the retrieval pipeline, not with the model itself. Security, IAM, data governance, and application owners all share responsibility for how content is indexed, filtered, and presented. If the system exposes restricted material, the failure is in policy design and entitlement enforcement.

Technical breakdown

Why RAG retrieval bypasses traditional document discovery

RAG systems embed documents into vectors, then retrieve the nearest matches for a query using similarity scoring. That means the retrieval layer does not naturally understand business intent or access scope. If the index contains broadly searchable material, the model can pull content a user would never have found through ordinary file navigation. The core risk is not the model inventing facts. It is the retrieval path exposing legitimate data to the wrong principal because the search step is blind to entitlement boundaries.

Practical implication: treat retrieval as an authorisation point, not just a search function.
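To make the entitlement blindness concrete, here is a minimal sketch using a toy in-memory index and hand-written three-dimensional vectors. The `Chunk` type, the `allowed_groups` field, and the `retrieve` function are hypothetical names chosen for illustration; a real deployment would use an embedding model and a vector database.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    doc_id: str
    embedding: tuple[float, ...]
    allowed_groups: frozenset[str]  # hypothetical permission metadata


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(index, query_vec, k, user_groups=None):
    """Return the k nearest chunks. When user_groups is given, chunks the
    caller is not entitled to are dropped before ranking."""
    candidates = index
    if user_groups is not None:
        candidates = [c for c in index if c.allowed_groups & user_groups]
    ranked = sorted(candidates, key=lambda c: cosine(c.embedding, query_vec), reverse=True)
    return ranked[:k]


INDEX = [
    Chunk("handbook", (0.9, 0.1, 0.0), frozenset({"all-staff"})),
    Chunk("salary-bands", (0.1, 0.9, 0.1), frozenset({"hr"})),
    Chunk("office-map", (0.0, 0.2, 0.9), frozenset({"all-staff"})),
]

query = (0.2, 0.8, 0.1)  # toy embedding of a pay-related question
blind = [c.doc_id for c in retrieve(INDEX, query, k=1)]
scoped = [c.doc_id for c in retrieve(INDEX, query, k=1, user_groups=frozenset({"all-staff"}))]
```

With no identity supplied, the nearest match is the HR-only `salary-bands` chunk, because similarity scoring knows nothing about who is asking. Passing the caller's groups removes that chunk from the candidate set before anything is ranked or sent to the model.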

Separate instances versus document-level access control

The article describes two common approaches. Separate instances isolate data by role or function, which reduces cross-contamination but increases operational overhead and duplication risk. Document-level access control uses metadata to filter results per user or role, which is more granular but harder to maintain at scale. Both approaches can fail if permissions drift or metadata falls out of date. In practice, RAG security depends on whether the retrieval architecture can enforce policy before context reaches the model.

Practical implication: validate whether permission checks happen before retrieval, not after generation.
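The two approaches can be compared in a small sketch. Keyword matching stands in for vector similarity to keep the example short, and all document names, roles, and function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    doc_id: str
    text: str
    allowed_roles: set = field(default_factory=set)  # hypothetical permission metadata


HANDBOOK = Doc("handbook", "leave policy for all staff", {"hr", "engineering"})
PAYROLL = Doc("payroll", "salary policy and pay bands", {"hr"})
RUNBOOK = Doc("runbook", "deployment policy for services", {"engineering"})

# Approach 1: separate instances. Each role has its own index, so shared
# documents such as the handbook are duplicated across instances.
INDEXES_BY_ROLE = {
    "hr": [HANDBOOK, PAYROLL],
    "engineering": [HANDBOOK, RUNBOOK],
}

# Approach 2: one shared index, filtered by per-document metadata.
SHARED_INDEX = [HANDBOOK, PAYROLL, RUNBOOK]


def search_separate(role, term):
    """Search only the instance provisioned for this role."""
    return [d.doc_id for d in INDEXES_BY_ROLE.get(role, []) if term in d.text]


def search_filtered(role, term):
    """Apply the metadata filter first, then match within what is permitted."""
    permitted = [d for d in SHARED_INDEX if role in d.allowed_roles]
    return [d.doc_id for d in permitted if term in d.text]
```

Both functions return the same results while the metadata is accurate. They diverge when it drifts: a wrong entry in `allowed_roles` leaks immediately through `search_filtered`, whereas the separate-instance design leaks only if a document is copied into the wrong index.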

Context-based access control in GenAI workflows

Context-based access control (CBAC) extends the control plane beyond static document permissions by evaluating the request and response context together. That matters in RAG because a user’s question, role, and expected behaviour can reveal whether the retrieved content should be disclosed. This is especially relevant when files contain mixed sensitivity, such as one document holding both public and restricted material. CBAC is an attempt to make context a governance signal, not just a prompt engineering concern.

Practical implication: define policy rules that combine identity, request context, and content sensitivity before response delivery.
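A rule of this kind might look like the following sketch, which decides release per passage so that a mixed-sensitivity document can be partially disclosed. The roles, the policy table, and the idea of a declared `purpose` on the request are assumptions made for illustration, not a description of any vendor's CBAC implementation.

```python
from dataclasses import dataclass
from enum import IntEnum


class Sensitivity(IntEnum):
    PUBLIC = 0
    INTERNAL = 1
    RESTRICTED = 2


@dataclass(frozen=True)
class Request:
    user_role: str
    purpose: str  # declared or inferred purpose of the question (assumption)


@dataclass(frozen=True)
class Passage:
    text: str
    sensitivity: Sensitivity
    business_function: str


# Hypothetical policy table: role -> (sensitivity ceiling, permitted functions)
POLICY = {
    "hr-analyst": (Sensitivity.RESTRICTED, {"hr", "general"}),
    "engineer": (Sensitivity.INTERNAL, {"engineering", "general"}),
}


def releasable(request, passage):
    """Combine identity, request context, and content sensitivity."""
    if passage.sensitivity == Sensitivity.PUBLIC:
        return True
    rule = POLICY.get(request.user_role)
    if rule is None:
        return False  # unknown roles only ever see public passages
    ceiling, functions = rule
    return (
        passage.sensitivity <= ceiling
        and passage.business_function in functions
        and request.purpose == passage.business_function
    )


def filter_context(request, passages):
    """Return only the passage texts that may be placed in the model context."""
    return [p.text for p in passages if releasable(request, p)]


MIXED_DOC = [
    Passage("Office opens at 9:00.", Sensitivity.PUBLIC, "general"),
    Passage("Band 5 pay range is confidential.", Sensitivity.RESTRICTED, "hr"),
]
```

Here an HR analyst asking an HR question receives both passages, while the same analyst asking a general question, or an engineer asking anything, receives only the public one. The purpose check is what separates this from a static role filter.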

Threat narrative

Attacker objective: Obtain sensitive internal data through model-assisted retrieval without needing to defeat the underlying application directly.

Entry occurs when a user submits a normal-looking question into a RAG application that searches broadly across internal content. The request itself is not malicious, but it reaches documents that were never meant to be equally visible to every principal.
Credential or permission abuse occurs when the retrieval layer ignores or weakly enforces document-level access rules, allowing the model to surface material from broadly shared or poorly tagged sources. The content is exposed through search behaviour rather than direct file access.
Impact follows when the system presents sensitive internal material back to the user in generated form, making restricted knowledge easier to consume, copy, or redistribute. The attacker objective is unauthorized disclosure of internal information through AI-mediated retrieval.

Moltbook AI agent keys breach — Moltbook breach exposed 1.5M AI agent keys.
AI LLM hijack breach — attackers used stolen AWS access keys to hijack Anthropic LLM models on Bedrock.

Read our 52 NHI Breaches Analysis report for a comprehensive view of breaches impacting Non-Human Identities including AI Agents.

NHI Mgmt Group analysis

RAG access control is really identity governance for retrieval paths. The article is not only about LLM quality; it is about who can cause sensitive content to reappear through a query. That means the policy boundary sits between the user, the index, and the model response. Practitioners should treat retrieval as governed access, not as a neutral backend function.

Document-level permissions fail when permission metadata becomes the control plane. Once access decisions depend on metadata that is incomplete, stale, or inconsistently applied, the model can surface content that ordinary storage permissions would have hidden. This is a classic governance failure mode: the entitlement exists on paper, but the retrieval layer no longer respects it. The implication is that access reviews must include AI retrieval surfaces, not just repositories.
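An access review that covers the retrieval surface could include a reconciliation step like the sketch below, which compares the permissions recorded in the index against the current permissions at the source. The function name and the dictionary shape (document ID mapped to a set of groups) are assumptions for illustration.

```python
def find_acl_drift(source_acls, index_acls):
    """Report index entries that could disclose more than the source allows.

    Both arguments map a document ID to the set of groups permitted to read it.
    """
    findings = {}
    for doc_id, indexed_groups in index_acls.items():
        current_groups = source_acls.get(doc_id)
        if current_groups is None:
            # The document no longer exists at the source but is still searchable.
            findings[doc_id] = "orphaned: still indexed after removal at source"
        elif indexed_groups - current_groups:
            extra = ", ".join(sorted(indexed_groups - current_groups))
            findings[doc_id] = "over-permissive: " + extra
    return findings


source = {"a": {"hr"}, "b": {"hr", "finance"}}
index = {"a": {"hr", "all-staff"}, "b": {"hr"}, "c": {"all-staff"}}
drift = find_acl_drift(source, index)
```

In this example document `a` is flagged because the index still grants `all-staff`, and document `c` is flagged as orphaned. Document `b` is not flagged: the index is narrower than the source, which affects availability rather than disclosure.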

Context-based access control is a named concept worth sharpening for GenAI governance. The article points to a control model that evaluates the user’s request, expected behaviour, and response context together. That is more than a content filter, because it turns conversation context into an access signal. For security teams, the lesson is that GenAI governance now needs policy decisions that understand intent, role, and sensitivity before context is returned.

Broad shared-drive permissions become more dangerous once RAG can discover them automatically. The article correctly identifies an old access problem that becomes newly exploitable through conversational retrieval. A buried file with broad permissions is no longer low-risk just because users do not know where it lives. Practitioners should recognise that AI search collapses the protection value of obscurity and exposes latent entitlement debt.

RAG security belongs in the same governance stack as NHI and lifecycle controls. The control problem is not limited to human users. Retrieval services, embeddings pipelines, and data connectors all create non-human access paths that need ownership, review, and scope constraints. The field should stop treating RAG as an AI feature and start treating it as an identity-sensitive data plane.

From our research:
80% of organisations report their AI agents have already performed actions beyond their intended scope, including accessing unauthorised systems (39%), inappropriately sharing sensitive data (31%), and revealing access credentials (23%), according to the AI Agents: The New Attack Surface report.
48% of organisations say they cannot track and audit the data their AI agents access, leaving a compliance and investigation blind spot.
That governance gap makes the OWASP Agentic AI Top 10 a useful next lens when retrieval systems begin to behave like active agents.

What this signals

Context-aware retrieval is becoming a governance requirement, not an optimisation choice. As conversational systems sit closer to enterprise knowledge stores, teams need policy that understands request context, sensitivity, and entitlement before any response is assembled. The practical signal is that access review programmes must expand from repositories to AI retrieval surfaces, including indexes, connectors, and generated outputs.

RAG exposes permission debt already present in shared drives and broad internal stores. The article shows how a buried file with permissive access becomes easier to find once a model can search semantically across the corpus. With 48% of organisations unable to track and audit the data their AI agents access, according to the AI Agents: The New Attack Surface report, the blind spot is not theoretical. Teams should prepare for retrieval controls to become part of their identity governance baseline.

Retrieval-based disclosure is a non-human identity problem as much as an AI problem. Connectors, indexes, and embedding pipelines act on behalf of users and should be governed as machine access paths. That means service ownership, entitlement scope, and lifecycle controls need to extend into GenAI infrastructure, particularly where the same corpus serves multiple business functions. For practitioners, the next step is to align RAG governance with the OWASP Non-Human Identity Top 10 and existing IAM controls.
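As a rough illustration of what governing these machine access paths could involve, the sketch below records each connector, index, or pipeline as an inventory entry and flags missing ownership, wildcard scopes, and overdue reviews. The record fields, scope strings, and 90-day review interval are all assumptions, not requirements taken from the source or from OWASP.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

REVIEW_INTERVAL_DAYS = 90  # assumed review cadence


@dataclass(frozen=True)
class MachineIdentity:
    name: str
    kind: str                      # e.g. "connector", "index", "embedding-pipeline"
    owner: Optional[str]           # accountable team or person
    scopes: frozenset              # entitlement strings (hypothetical format)
    last_reviewed: Optional[date]


def governance_gaps(identity, today):
    """List the ownership, scope, and lifecycle gaps for one machine identity."""
    gaps = []
    if not identity.owner:
        gaps.append("no owner")
    if any(scope.endswith("*") for scope in identity.scopes):
        gaps.append("wildcard scope")
    overdue = (
        identity.last_reviewed is None
        or (today - identity.last_reviewed).days > REVIEW_INTERVAL_DAYS
    )
    if overdue:
        gaps.append("review overdue")
    return gaps


TODAY = date(2025, 7, 2)
drive_connector = MachineIdentity("drive-connector", "connector", None, frozenset({"sites:*"}), None)
kb_pipeline = MachineIdentity(
    "kb-embedder", "embedding-pipeline", "data-platform", frozenset({"kb:read"}), date(2025, 6, 1)
)
report = {i.name: governance_gaps(i, TODAY) for i in (drive_connector, kb_pipeline)}
```

The connector with no owner, a wildcard scope, and no review history is flagged on all three counts, while the narrowly scoped, recently reviewed pipeline produces an empty list.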

For practitioners

Map retrieval paths to entitlements: Inventory every data source, vector index, and connector used by the RAG stack, then verify which identities can retrieve from each one. Treat the retrieval layer as an access boundary and require explicit policy checks before context is passed to the model.
Separate sensitive corpora by policy tier: Segment finance, HR, legal, and general knowledge into distinct retrieval scopes when the same index cannot reliably enforce metadata-based filtering. Use this to reduce cross-domain exposure where document-level controls are difficult to maintain.
Attach sensitivity metadata at ingestion: Classify documents during indexing with role, business function, and sensitivity markers that remain available to the retrieval engine. Reconcile those markers during periodic access reviews so stale metadata does not become a hidden exposure path.
Test query paths for unauthorized disclosure: Run positive and negative tests that ask for restricted topics from user roles that should not see them, then inspect whether the retrieval layer filters them out before generation. Validate both direct access and mixed-content documents with partial sensitivity.
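The positive and negative tests in the last item can be sketched as a small harness that runs against whatever retrieval function the stack exposes. Everything here is hypothetical: the corpus, the role names, and the two toy retrieval functions exist only to show the harness catching a leak. Keyword matching stands in for similarity search.

```python
CORPUS = [
    # (chunk ID, text, roles allowed to retrieve it)
    ("handbook#leave", "annual leave and pay dates", {"hr", "engineering"}),
    ("handbook#exec-pay", "executive pay bands", {"hr"}),
]


def filtered_retrieve(role, query):
    """Toy retrieval that enforces role metadata before matching."""
    return [cid for cid, text, roles in CORPUS if role in roles and query in text]


def unfiltered_retrieve(role, query):
    """Toy retrieval that ignores entitlements (the failure mode under test)."""
    return [cid for cid, text, _ in CORPUS if query in text]


def run_disclosure_tests(retrieve, cases):
    """Return one failure record per case that misses or leaks a chunk."""
    failures = []
    for case in cases:
        returned = set(retrieve(case["role"], case["query"]))
        missing = set(case.get("must_return", ())) - returned
        leaked = set(case.get("must_not_return", ())) & returned
        if missing or leaked:
            failures.append(
                {"case": case["name"], "missing": sorted(missing), "leaked": sorted(leaked)}
            )
    return failures


CASES = [
    {"name": "hr-positive", "role": "hr", "query": "pay",
     "must_return": ["handbook#exec-pay"]},
    {"name": "eng-mixed-content", "role": "engineering", "query": "pay",
     "must_return": ["handbook#leave"], "must_not_return": ["handbook#exec-pay"]},
]

ok = run_disclosure_tests(filtered_retrieve, CASES)
bad = run_disclosure_tests(unfiltered_retrieve, CASES)
```

The second case covers a mixed-content document: the engineer should still receive the leave passage but never the executive pay passage. The harness inspects retrieval output directly, so a leak is caught before any generation step could paraphrase it.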

Key takeaways

RAG turns access control into a retrieval problem, because the model can surface sensitive material the user should not normally discover.
The evidence points to familiar governance debt, not a new class of data inaccuracy, because broad permissions and stale metadata are what make disclosure possible.
Practitioners should treat retrieval paths, indexes, and connectors as governed identity surfaces and apply entitlement checks before generation.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

The OWASP Non-Human Identity Top 10 addresses the attack and risk surface, while NIST CSF 2.0 and NIST Zero Trust (SP 800-207) set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Non-Human Identity Top 10	NHI-03	RAG retrieval depends on controlling non-human access paths and credentialed connectors.
NIST CSF 2.0	PR.AA-05 (PR.AC-4 in CSF 1.1)	Access permissions must restrict what users can retrieve through AI search.
NIST Zero Trust (SP 800-207)	Section 2.1 (zero trust tenets)	RAG should verify context and identity before releasing sensitive internal content.

Extend access reviews to AI retrieval surfaces and verify entitlement filters before generation.

Key terms

Retrieval-Augmented Generation: A pattern that lets an LLM pull external documents or data at query time and use them as context for its answer. In security terms, it creates a governed retrieval path that can expose sensitive material if identity, permission, and metadata checks are not enforced before the model sees the content.
Context-Based Access Control: An access model that evaluates the request, requester, and response context before releasing information. In GenAI systems, it is used to decide whether a query is appropriate for a user’s role, intent, and sensitivity boundary, rather than relying only on static document permissions.
Vector Database: A database that stores embeddings, which are numeric representations of documents or content used for similarity search. In RAG, it becomes part of the identity-sensitive data plane because whatever is indexed there can be retrieved and repackaged by the model unless access controls are applied at query time.
Permission Metadata: Structured labels attached to documents or records that describe who may access them and under what conditions. In AI retrieval systems, stale or incomplete permission metadata becomes a governance failure because the search layer may treat it as the source of truth for disclosure decisions.


This post draws on content published by Lasso Security: Riding the RAG Trail: Access, Permissions and Context. Read the original.

NHIMG Editorial Note
Published by the NHIMG editorial team on 2025-07-02.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org