TL;DR: Externalized authorization pushes access checks into the retrieval layer so semantic search and RAG pipelines only surface documents a user is permitted to see, according to Cerbos. That shift matters because similarity is not authorization, and post-retrieval filtering can still leak snippets, waste search budget, and expose confidential data before the final gate.
At a glance
What this is: A Cerbos engineering guide showing how externalized authorization can turn policy plans into ChromaDB filters so RAG systems enforce access at retrieval time.
Why it matters: IAM teams now have to govern not just who can access data, but which AI retrieval paths are allowed to surface it across NHI, autonomous, and human workflows.
Context
Externalized authorization means access control is evaluated outside the application, using central policies instead of hard-coded checks. In RAG and vector search systems, that matters because semantic relevance can pull back material a user is not entitled to see unless the retrieval layer is filtered first.
For IAM and NHI programmes, this is the same governance problem in a new place: policy must travel with the request path, whether the actor is a human user, a service account, or an AI workflow that retrieves data on demand. The question is no longer only who can log in, but what the search layer can expose.
This pattern is especially relevant when applications rely on metadata filters, query planning, and runtime policy evaluation to keep unauthorized content out of the LLM context window. That is why externalized authorization is now part of identity governance, not just application design.
Key questions
Q: How should security teams enforce authorization in RAG retrieval pipelines?
A: Security teams should enforce authorization at the retrieval layer, before documents are returned to the application or LLM. That means translating identity and resource policies into database or vector-store filters so unauthorized content never enters the search result set, prompt context, or summary path.
Q: Why do vector databases complicate access control for AI applications?
A: Vector databases complicate access control because they optimize for similarity, not entitlement. A document can be highly relevant to a query and still be unauthorized for the requesting principal, so relevance-based retrieval creates a data exposure path unless policy is applied during search.
Q: What breaks when post-retrieval filtering is used for confidential content?
A: Post-retrieval filtering breaks because the system has already spent search budget and may already have exposed snippets, rankings, or partial context. Even if the final answer is filtered, the retrieval step can still leak enough information for an unauthorized user or model to infer sensitive data.
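The difference between the two orderings can be sketched with a toy corpus. This is a minimal illustration, not Cerbos or ChromaDB code: the `Chunk` type, the document names, the scores, and the single `department` attribute are all hypothetical, and a plain sort stands in for a similarity search.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    doc_id: str
    department: str
    score: float  # similarity to the query; higher means closer


# Hypothetical corpus, already scored against one query.
CORPUS = [
    Chunk("hr-salaries", "hr", 0.95),
    Chunk("hr-reviews", "hr", 0.93),
    Chunk("eng-runbook", "engineering", 0.80),
    Chunk("eng-onboarding", "engineering", 0.72),
    Chunk("eng-style-guide", "engineering", 0.65),
]


def top_k(chunks, k):
    """Stand-in for a vector search: keep the k highest-scoring chunks."""
    return sorted(chunks, key=lambda c: c.score, reverse=True)[:k]


def post_filter(corpus, allowed_department, k):
    """Rank first, check entitlement afterwards."""
    retrieved = top_k(corpus, k)  # unauthorized chunks are already in memory
    visible = [c for c in retrieved if c.department == allowed_department]
    return retrieved, visible


def pre_filter(corpus, allowed_department, k):
    """Restrict the candidate set first, then rank."""
    candidates = [c for c in corpus if c.department == allowed_department]
    return top_k(candidates, k)


retrieved, visible = post_filter(CORPUS, "engineering", k=3)
authorized = pre_filter(CORPUS, "engineering", k=3)

print([c.doc_id for c in retrieved])   # what the application handled
print([c.doc_id for c in visible])     # what survived the late check
print([c.doc_id for c in authorized])  # what an early filter returns
```

With the late check, two of the three retrieved slots are spent on HR documents the engineering user may not see, and those chunks have already passed through the application. With the early filter, all three slots go to authorized content.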
Q: How do policy plans help control access in AI retrieval systems?
A: Policy plans let the authorization engine express the exact conditions under which access is allowed, then translate those conditions into datastore-native filters. That gives teams a consistent way to enforce policy across applications without duplicating access logic in every retrieval workflow.
Technical breakdown
PlanResources and query planning in retrieval authorization
Cerbos uses PlanResources to partially evaluate policy before the application performs a database or vector search. The result is a query plan, essentially an abstract representation of the conditions under which access is allowed. That plan can then be translated into a native filter for the target datastore. The important shift is that authorization is no longer a separate after-the-fact check. It becomes part of query construction, so the system only asks the database or vector store for rows or chunks the principal is allowed to see.
Practical implication: Practitioners should move authorization logic into the retrieval path rather than relying on application-side filtering after results come back.
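One way to picture how a plan feeds query construction is a small dispatch on the plan's outcome. The sketch below assumes a plan can be unconditionally allowed, unconditionally denied, or conditional; `PlanKind`, `QueryPlan`, `retrieval_decision`, and `demo_translate` are simplified stand-ins invented for illustration, not Cerbos SDK types, and the condition shape is hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional, Tuple


class PlanKind(Enum):
    ALWAYS_ALLOWED = "always_allowed"
    ALWAYS_DENIED = "always_denied"
    CONDITIONAL = "conditional"


@dataclass
class QueryPlan:
    kind: PlanKind
    condition: Optional[dict] = None  # only meaningful for CONDITIONAL plans


def retrieval_decision(
    plan: QueryPlan, translate: Callable[[dict], dict]
) -> Tuple[bool, Optional[dict]]:
    """Return (should_search, where_filter) for one principal and action."""
    if plan.kind is PlanKind.ALWAYS_DENIED:
        return False, None  # skip the search entirely
    if plan.kind is PlanKind.ALWAYS_ALLOWED:
        return True, None  # search without an extra filter
    if plan.condition is None:
        # Fail closed: a conditional plan with nothing to enforce is an error.
        raise ValueError("conditional plan is missing its condition")
    return True, translate(plan.condition)


def demo_translate(condition: dict) -> dict:
    """Hypothetical translator for a single equality condition."""
    return {condition["field"]: {"$eq": condition["value"]}}


plan = QueryPlan(PlanKind.CONDITIONAL, {"field": "department", "value": "engineering"})
should_search, where = retrieval_decision(plan, demo_translate)
print(should_search, where)
```

The design point is that the denied case never reaches the datastore at all, and the conditional case cannot run without a filter.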
ChromaDB metadata filters and semantic search authorization
Vector search ranks content by similarity, not entitlement. In a RAG pipeline, that means semantically close documents can be retrieved even when they contain confidential material. Cerbos addresses that by translating policy output into ChromaDB Where filters, which are applied during the search itself. The retrieval engine only considers documents matching the allowed metadata conditions, so unauthorized content never reaches the LLM prompt. This is especially important because snippets, summaries, and embeddings can still leak context even when final output filtering is attempted later.
Practical implication: Security teams should enforce authorization at the metadata query layer for vector stores used in RAG systems.
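To make the effect of a metadata filter concrete, the sketch below evaluates a Chroma-style `where` dictionary against flat metadata in plain Python. The operator names (`$and`, `$or`, `$eq`, `$lte`, and so on) follow ChromaDB's filter syntax, but `matches` is a local stand-in written for illustration, not ChromaDB's implementation, and the documents and field names are hypothetical. Treating a missing field as a non-match is a simplifying assumption.

```python
import operator

_COMPARATORS = {
    "$eq": operator.eq,
    "$ne": operator.ne,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
    "$in": lambda value, allowed: value in allowed,
    "$nin": lambda value, denied: value not in denied,
}


def matches(metadata, where):
    """Evaluate a Chroma-style where filter against one flat metadata dict."""
    [(key, spec)] = where.items()  # each filter node has exactly one key
    if key == "$and":
        return all(matches(metadata, clause) for clause in spec)
    if key == "$or":
        return any(matches(metadata, clause) for clause in spec)
    if key not in metadata:
        return False  # simplification: a missing field never matches
    if not isinstance(spec, dict):
        return metadata[key] == spec  # shorthand {"field": value} means $eq
    [(op, operand)] = spec.items()
    return _COMPARATORS[op](metadata[key], operand)


# Hypothetical filter: engineering documents at clearance level 2 or lower.
where = {
    "$and": [
        {"department": {"$eq": "engineering"}},
        {"clearance": {"$lte": 2}},
    ]
}

docs = {
    "eng-runbook": {"department": "engineering", "clearance": 1},
    "eng-incident": {"department": "engineering", "clearance": 3},
    "hr-salaries": {"department": "hr", "clearance": 2},
}

searchable = sorted(doc_id for doc_id, meta in docs.items() if matches(meta, where))
print(searchable)
```

In a real pipeline the same dictionary would be passed as the `where` argument of the collection query, so the similarity ranking only ever runs over the documents that pass the filter.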
Policy operators, De Morgan inversion, and filter limitations
The adapter maps Cerbos policy operators to ChromaDB filter syntax, including logical and, or, and comparison operators such as eq, lt, and gte. When a policy includes negation, the adapter inverts the expression using De Morgan's law because ChromaDB does not support a native not operator. Some policy constructs, such as string helpers and collection operators, are not supported because ChromaDB metadata is flat and scalar. That means policy design and datastore capability have to be aligned, or the adapter will fail with a descriptive error rather than silently weakening enforcement.
Practical implication: Teams need to validate policy expressiveness against datastore filter capabilities before depending on retrieval-time authorization.
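The negation handling described above can be sketched as a recursive translator that pushes `not` down to the leaves. The condition tree used here (dictionaries with `op`, `args`, `field`, and `value` keys) is a hypothetical simplification rather than the Cerbos plan format, and `to_where` and `UnsupportedOperator` are illustrative names, not the adapter's API. The comparison inversions also assume the field is present on every document.

```python
class UnsupportedOperator(ValueError):
    """Raised instead of emitting a weaker filter."""


_TO_CHROMA = {"eq": "$eq", "ne": "$ne", "lt": "$lt", "lte": "$lte", "gt": "$gt", "gte": "$gte"}
_NEGATED = {"eq": "ne", "ne": "eq", "lt": "gte", "gte": "lt", "gt": "lte", "lte": "gt"}
_FLIPPED = {"and": "or", "or": "and"}


def to_where(node, negate=False):
    """Translate a condition tree into a where filter without using 'not'."""
    op = node["op"]
    if op == "not":
        # Double negation cancels; otherwise carry the negation downwards.
        return to_where(node["args"][0], not negate)
    if op in _FLIPPED:
        # De Morgan: not (a and b) == (not a) or (not b), and vice versa.
        logical = _FLIPPED[op] if negate else op
        return {f"${logical}": [to_where(arg, negate) for arg in node["args"]]}
    if op in _TO_CHROMA:
        comparison = _NEGATED[op] if negate else op
        return {node["field"]: {_TO_CHROMA[comparison]: node["value"]}}
    raise UnsupportedOperator(f"operator {op!r} has no where-filter equivalent")


# not (department == "hr" or clearance > 2)
condition = {
    "op": "not",
    "args": [
        {
            "op": "or",
            "args": [
                {"op": "eq", "field": "department", "value": "hr"},
                {"op": "gt", "field": "clearance", "value": 2},
            ],
        }
    ],
}

where = to_where(condition)
print(where)
```

An operator outside the mapping, such as a hypothetical `startsWith` string helper, raises `UnsupportedOperator` rather than being dropped, which mirrors the fail-with-an-error behaviour the guide describes.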
Breaches seen in the wild
- ASP.NET machine keys RCE attack — 3,000+ exposed ASP.NET machine keys enabled remote code execution.
- Schneider Electric credentials breach — exposed credentials gave attackers access to Schneider Electric Jira, exfiltrating 40GB.
Read our 52 NHI Breaches Analysis report for a comprehensive view of breaches impacting Non-Human Identities including AI Agents.
NHI Mgmt Group analysis
Externalized authorization is the right control pattern when retrieval is the exposure point. Once AI systems search internal content by semantic similarity, the authorization decision has to happen before retrieval, not after it. Post-filtering is too late because the system may already have exposed snippets, rankings, or prompt context. The practitioner conclusion is that retrieval-layer policy enforcement belongs in the identity control plane, not in application cleanup.
Similarity-based search creates an access problem, not just a relevance problem. RAG pipelines assume the most relevant document is safe to surface, but relevance and entitlement are different control questions. That mismatch produces a governance gap where confidential material can be discoverable by proximity alone. Practitioners should treat vector search as an authorization surface, not a neutral indexing layer.
Query-plan authorization is a strong bridge between IAM and AI application control. This is where policy-as-code, policy evaluation, and datastore filters intersect in a way traditional application checks cannot replicate. Cerbos's model shows how identity policy can be projected into retrieval systems without duplicating access logic in application code. The field implication is that modern IAM now has to govern machine-mediated information discovery as well as login and API access.
Retrieval-layer filtering reduces the blast radius of non-human identity access. Service accounts, app backends, and AI retrieval workflows often operate with broad read permissions that are hard to police at the application boundary. When those actors query data stores directly or through orchestration layers, the enforcement point has to be as close to the data as possible. The practitioner conclusion is that NHI governance and AI retrieval governance are converging on the same control requirement: deny unauthorized data before it is ever assembled into context.
Policy capability and datastore capability must be designed together. If the authorization language can express conditions that the vector store cannot natively enforce, the adapter becomes the compatibility checkpoint. That is not a feature detail; it is a governance dependency. Practitioners should treat unsupported operators as a design-time control issue, because a silent mismatch would create an authorization gap.
From our research:
- The average estimated time to remediate a leaked secret is 27 days, despite 75% of organisations expressing strong confidence in their secrets management capabilities, according to The State of Secrets in AppSec.
- 43% of security professionals are concerned about AI systems learning and reproducing sensitive information patterns from codebases, according to The State of Secrets in AppSec.
- For practitioners moving from policy design to implementation, Ultimate Guide to NHIs, Lifecycle Processes for Managing NHIs helps connect access governance, rotation, and offboarding across non-human identities.
What this signals
Policy-enforced retrieval is becoming a baseline control for AI systems that touch sensitive data. As RAG adoption grows, the control question shifts from model accuracy to what the search layer is allowed to reveal. Teams that already manage service-account and workload access through policy should extend the same discipline to vector stores and retrieval APIs, especially where internal documents or code are involved.
Externalized authorization is also an NHI governance issue. Retrieval pipelines are often operated by non-human identities with broad read privileges, which means the access path itself can become the exposure path. That is why programmes built around the Ultimate Guide to NHIs, Lifecycle Processes for Managing NHIs need to include AI retrieval workloads in lifecycle reviews, entitlement checks, and logging.
Because AI systems can learn from whatever they are allowed to retrieve, the governance gap is no longer limited to data exfiltration. The better frame is persistent exposure through machine-mediated discovery, which makes entitlement scope and metadata design part of security architecture rather than implementation detail.
For practitioners
- Map RAG retrieval paths to explicit authorization policies. Identify every point where embeddings, metadata, or similarity search can surface internal content, then bind those paths to policy-as-code rather than application-side post-filtering. This should cover human users, service accounts, and AI workflows that call retrieval APIs.
- Enforce filters before documents enter the LLM context. Configure vector stores and adapters so the search layer applies access conditions during retrieval, not after ranking or summarisation. If the document never reaches the prompt, the LLM cannot leak it in snippets or answers.
- Validate policy syntax against datastore filter support. Test which operators your retrieval layer can actually enforce, including negation, membership, and string matching. If the datastore cannot represent a policy condition cleanly, redesign the rule or choose a different enforcement point.
- Classify retrieval pipelines as identity-controlled systems. Add RAG pipelines to IAM and NHI governance reviews, including entitlement scope, query logging, and policy change control. Treat vector search as an access path that can reveal sensitive data even when the application never returns it directly.
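The validation step in the list above lends itself to a design-time check that could run in CI. The sketch below walks a set of policy conditions and reports every operator the retrieval layer is assumed not to enforce. The `SUPPORTED` set, the policy names, and the condition shape (dictionaries with `op` and optional `args`) are all hypothetical; the real list should come from the adapter's documentation.

```python
# Assumed operator support; replace with the adapter's documented list.
SUPPORTED = frozenset({"and", "or", "not", "eq", "ne", "lt", "lte", "gt", "gte"})


def unsupported_operators(node):
    """Collect every operator in a condition tree that is not in SUPPORTED."""
    found = set()
    if node["op"] not in SUPPORTED:
        found.add(node["op"])
    for child in node.get("args", []):
        found |= unsupported_operators(child)
    return found


def audit(policies):
    """Map each policy name to its unsupported operators, omitting clean ones."""
    report = {}
    for name, condition in policies.items():
        problems = unsupported_operators(condition)
        if problems:
            report[name] = sorted(problems)
    return report


# Hypothetical policy conditions.
POLICIES = {
    "doc-read-by-department": {"op": "eq", "field": "department", "value": "engineering"},
    "doc-read-by-title-prefix": {
        "op": "and",
        "args": [
            {"op": "startsWith", "field": "title", "value": "Public"},
            {"op": "lte", "field": "clearance", "value": 2},
        ],
    },
}

report = audit(POLICIES)
print(report)
```

A non-empty report is the signal to redesign the rule or move its enforcement point before the policy reaches production.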
Key takeaways
- RAG systems create an authorization gap when they retrieve by similarity while access is governed by entitlement.
- Externalized authorization closes that gap by pushing policy into the retrieval path before the LLM ever sees the data.
- Teams should treat vector search, metadata filters, and retrieval APIs as identity-controlled access paths that require the same governance discipline as other NHI workloads.
Standards & Framework Alignment
This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.
OWASP Non-Human Identity Top 10 addresses the attack and risk surface, while NIST CSF 2.0 and NIST Zero Trust (SP 800-207) set the governance and control requirements practitioners need to meet.
| Framework | Control / Reference | Relevance |
|---|---|---|
| OWASP Non-Human Identity Top 10 | NHI-03 | Retrieval workflows depend on controlling credentialed access to sensitive data paths. |
| NIST CSF 2.0 | PR.AA-05 | Access permissions must follow policy into the retrieval layer, not stop at the app boundary. |
| NIST Zero Trust (SP 800-207) | Section 2.1, tenets 4 and 6 | Zero Trust requires continuous enforcement of access conditions at each data access decision. |
Bind AI retrieval and service-account access to least privilege and verify policy enforcement at query time.
Key terms
- Externalized Authorization: An access control model where policy decisions are made outside the application and enforced by a dedicated engine. It keeps authorization logic central, testable, and reusable across systems, which is especially useful when the same data must be governed across APIs, search layers, and AI retrieval workflows.
- Query Plan: A structured representation of the conditions under which a request is allowed to proceed. In practice, the plan can be translated into datastore-native filters so the access decision is enforced during retrieval rather than after the data has already been exposed to the application or model.
- Retrieval Layer: The stage in an AI application where documents, embeddings, or chunks are selected for use by the model. This layer is security-sensitive because it determines what information can enter the prompt or output path, making it a control point for both authorization and data leakage.
- Metadata Filter: A datastore constraint that limits search results based on indexed attributes such as department, clearance, or document type. For AI retrieval, metadata filters are often the practical mechanism that turns policy into enforcement before unauthorized content reaches the LLM context window.
Deepen your knowledge
Externalized authorization for RAG and vector search is covered in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are extending IAM into AI retrieval paths, it is a practical place to build the governance foundations.
This post draws on content published by Cerbos: externalized authorization for ChromaDB retrieval and RAG. Read the original.
Published by the NHIMG editorial team on 2026-02-16.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org