RAG-based AI agents: where do access controls break first?

NHI Mgmt Group

(@nhi-mgmt-group)

Member Moderator

Joined: 1 year ago

Posts: 12212

Topic starter 24/06/2026 9:13 pm

TL;DR: RAG-powered AI agents can surface sensitive internal data, leak confidential material, or be manipulated through prompt and context injection when permission checks are missing, according to Cerbos. The security problem is not the model alone but the trust boundary around retrieval, authorization, and downstream response generation.

NHIMG editorial — based on content published by Cerbos: authorization-aware data filtering for RAG-based AI agents

By the numbers:

80% of organisations report their AI agents have already performed actions beyond their intended scope, including accessing unauthorised systems, inappropriately sharing sensitive data, and revealing access credentials, according to the AI Agents: The New Attack Surface report.

Questions worth separating out

Q: How should security teams implement access control for RAG-based AI agents?

A: They should enforce authorization before retrieval, not after generation.

Q: Why do RAG-based assistants create more risk than a normal search tool?

A: Because they do more than return matching records: they assemble those records into a generated answer that can expose sensitive context, merge fragments, or amplify poisoned data.

Q: What do security teams get wrong about prompt injection in AI assistants?

A: They often treat prompt injection as a model safety issue alone, when it is also a trust issue in the content pipeline.

Practitioner guidance

Enforce retrieval-time authorization checks: Apply policy before documents, rows, or API responses are injected into the prompt so the model never sees data the user cannot access.
Classify the data sources behind every agent: Inventory which repositories, APIs, and knowledge bases feed each AI assistant, then map those sources to the same access rules used elsewhere in the product.
Filter retrieved content before prompt assembly: Treat document fragments, search snippets, and vector results as untrusted input and remove content that could influence unsafe or out-of-scope responses.
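
The first and third points above can be illustrated with a minimal Python sketch. It is not Cerbos' API or the article's implementation: the `User` and `Document` types, the `is_allowed` rule (role, department, and region must all match), and the `retrieve_authorized` and `build_prompt` helpers are all hypothetical names chosen for illustration. The only point it tries to show is ordering: the policy check runs on retrieved candidates before anything is placed in the prompt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    # Hypothetical identity attributes; a real system would get these from its IdP.
    user_id: str
    roles: frozenset
    department: str
    region: str


@dataclass(frozen=True)
class Document:
    # Hypothetical access metadata stored alongside each retrievable chunk.
    doc_id: str
    text: str
    allowed_roles: frozenset
    department: str
    region: str


def is_allowed(user: User, doc: Document) -> bool:
    """Assumed policy: the user needs a matching role, department, and region."""
    return (
        bool(user.roles & doc.allowed_roles)
        and user.department == doc.department
        and user.region == doc.region
    )


def retrieve_authorized(user: User, candidates: list) -> list:
    """Drop every candidate the user may not read, before prompt assembly."""
    return [doc for doc in candidates if is_allowed(user, doc)]


def build_prompt(question: str, docs: list) -> str:
    """Assemble the prompt from already-filtered documents only."""
    context = "\n".join(f"[{doc.doc_id}] {doc.text}" for doc in docs)
    return f"Context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    alice = User("alice", frozenset({"analyst"}), "finance", "eu")
    candidates = [
        Document("d1", "Q3 finance summary", frozenset({"analyst"}), "finance", "eu"),
        Document("d2", "HR salary bands", frozenset({"hr"}), "hr", "eu"),
        Document("d3", "US finance forecast", frozenset({"analyst"}), "finance", "us"),
    ]
    allowed = retrieve_authorized(alice, candidates)
    print(build_prompt("What changed in Q3?", allowed))
```

In this sketch only `d1` reaches the prompt, so the model never sees the HR or US records. In practice the `is_allowed` decision would likely be delegated to a central policy service so the same rules apply across apps, APIs, and agents.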

What's in the full article

Cerbos' full article covers the implementation detail this post intentionally leaves at the strategy level:

Step-by-step policy flow from user request to filtered retrieval to generated answer.
Concrete examples of role, department, and region-based authorization filters for AI assistants.
Architecture patterns for centralizing access control across apps, APIs, and AI agents.
Operational discussion of auditability, compliance, and response logging in RAG workflows.

👉 Read Cerbos' analysis of authorization-aware access control for RAG AI agents →


Mr NHI

(@mr-nhi)

Member Moderator

Joined: 2 months ago

Posts: 11787

25/06/2026 6:19 am

Authorization-aware retrieval is the new control plane for AI assistants. RAG systems collapse the old separation between application access and answer generation because the model can only be as safe as the data it is allowed to retrieve. That makes policy enforcement before retrieval the decisive control, not a nice-to-have filter. Practitioners should treat retrieval as an identity decision, not a search function.

A few things that frame the scale:

80% of organisations report their AI agents have already performed actions beyond their intended scope, including accessing unauthorised systems, inappropriately sharing sensitive data, and revealing access credentials, according to the AI Agents: The New Attack Surface report.
Only 52% of companies can track and audit the data their AI agents access, leaving 48% with a complete blind spot for compliance and breach investigation.

A question worth separating out:

Q: How can organisations tell whether AI agent permissions are actually working?

A: They should test whether the assistant can retrieve restricted records, whether policy filters are applied before prompt construction, and whether response logs prove the final answer stayed within the user's entitlement. If any of those checks fail, the control is cosmetic rather than effective.
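
As a rough illustration of those three checks, the sketch below audits a single logged request. Everything in it is an assumption made for the example: the `AuditRecord` fields, the idea that logs record document IDs at each stage, and the `check_record` helper are hypothetical and not taken from any particular product or from the cited reports.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditRecord:
    # Hypothetical log entry for one assistant request.
    user_id: str
    entitled_doc_ids: frozenset   # what policy says this user may read
    retrieved_doc_ids: frozenset  # what the retriever returned
    prompt_doc_ids: frozenset     # what was placed in the prompt
    cited_doc_ids: frozenset      # what the logged answer drew on


def check_record(rec: AuditRecord) -> list:
    """Return a list of failed checks; an empty list means all three passed."""
    failures = []
    # Check 1: could the assistant retrieve restricted records?
    if not rec.retrieved_doc_ids <= rec.entitled_doc_ids:
        failures.append("retrieval returned restricted records")
    # Check 2: were policy filters applied before prompt construction?
    if not rec.prompt_doc_ids <= rec.entitled_doc_ids:
        failures.append("prompt contained content outside the user's entitlement")
    # Check 3: do the logs show the answer stayed within entitled prompt content?
    if not rec.cited_doc_ids <= (rec.prompt_doc_ids & rec.entitled_doc_ids):
        failures.append("answer drew on content outside the user's entitlement")
    return failures


if __name__ == "__main__":
    leaky = AuditRecord(
        user_id="alice",
        entitled_doc_ids=frozenset({"d1"}),
        retrieved_doc_ids=frozenset({"d1", "d2"}),
        prompt_doc_ids=frozenset({"d1", "d2"}),
        cited_doc_ids=frozenset({"d2"}),
    )
    for failure in check_record(leaky):
        print("FAIL:", failure)
```

Running checks like these over real request logs presupposes that the logs capture what was retrieved and what reached the prompt, which is the tracking capability the 52% figure above refers to.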

👉 Read our full editorial: RAG-based AI agent access control is now an identity issue

