TL;DR: RAG security extends beyond model output quality because retrieval pipelines can expose documents, vector stores, logs, and context to prompt injection, data leakage, and excessive access, according to Lasso Security. The real issue is that existing IAM and data controls were not designed to govern what language models retrieve, trust, and transform at runtime.
NHIMG editorial — based on content published by Lasso Security: RAG security risks and mitigation strategies
By the numbers:
- 72% of organisations have experienced, or suspect they have experienced, a breach of non-human identities (46% confirmed, 26% suspected).
- When AWS credentials are exposed publicly, attackers attempt access within an average of 17 minutes, and as quickly as 9 minutes in some cases.
Questions worth separating out
Q: How should security teams control access in a RAG pipeline?
A: Security teams should control access at the retrieval layer, not only at the application layer.
Q: Why do RAG systems create more data exposure risk than standard chatbots?
A: RAG systems can expose more data because they connect live repositories, embeddings, and documents directly into the generation process.
Q: What breaks when prompt injection reaches a RAG retrieval layer?
A: When prompt injection reaches the retrieval layer, the system can treat malicious text as trusted context because semantic similarity is mistaken for legitimacy.
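The first and third answers above can be illustrated with a minimal sketch of retrieval-layer access control. It assumes each chunk carries a hypothetical `allowed_roles` field alongside its similarity score; the `Chunk` type and the `authorized_top_k` function are illustrative names, not part of any particular vector database or of Lasso Security's guidance. Authorisation is decided before ranking, so a high similarity score alone never admits a chunk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str
    score: float              # similarity score returned by the vector search
    allowed_roles: frozenset  # hypothetical ACL metadata stored with the chunk


def authorized_top_k(candidates, user_roles, k=3):
    """Drop chunks the caller may not read, then rank the rest by similarity."""
    roles = frozenset(user_roles)
    permitted = [c for c in candidates if c.allowed_roles & roles]
    return sorted(permitted, key=lambda c: c.score, reverse=True)[:k]


candidates = [
    Chunk("Q3 payroll export", 0.97, frozenset({"hr"})),
    Chunk("Public onboarding guide", 0.81, frozenset({"hr", "engineering"})),
    Chunk("Deployment runbook", 0.74, frozenset({"engineering"})),
]

results = authorized_top_k(candidates, user_roles={"engineering"})
print([c.text for c in results])
# ['Public onboarding guide', 'Deployment runbook']
```

The payroll chunk has the highest score but is excluded because the caller holds no role that permits it. In a real deployment the same filter would more likely be pushed down into the vector store query than applied afterwards in application code.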
Practitioner guidance
- Restrict retrieval sources by context: Apply context-based access control to the knowledge bases and document sets feeding RAG so retrieval only occurs for approved users, roles, and request contexts.
- Lock down vector-store identities: Inventory the service accounts, API keys, and tokens that can read or write vector databases, then remove unnecessary write access and monitor for standing privilege.
- Add provenance checks before generation: Tag source documents with ownership, sensitivity, and freshness metadata so the pipeline can exclude stale or untrusted content before it reaches the prompt window.
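As a rough illustration of the provenance check in the last bullet, the sketch below filters documents on ownership, sensitivity, and freshness before they can reach the prompt window. The `SourceDoc` fields, the owner allowlist, the sensitivity labels, and the 180-day window are all assumptions chosen for the example, not values taken from the source article.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class SourceDoc:
    doc_id: str
    owner: str          # empty string means no recorded owner
    sensitivity: str    # hypothetical labels: "public", "internal", "restricted"
    last_reviewed: date


TRUSTED_OWNERS = {"security-team", "platform-docs"}  # assumed allowlist
ALLOWED_SENSITIVITY = {"public", "internal"}         # assumed policy
MAX_AGE = timedelta(days=180)                        # assumed freshness window


def passes_provenance(doc, today):
    """True only if the document has a trusted owner, an allowed label, and a recent review."""
    return (
        doc.owner in TRUSTED_OWNERS
        and doc.sensitivity in ALLOWED_SENSITIVITY
        and today - doc.last_reviewed <= MAX_AGE
    )


today = date(2025, 1, 1)
docs = [
    SourceDoc("runbook", "platform-docs", "internal", date(2024, 11, 1)),
    SourceDoc("old-policy", "security-team", "internal", date(2023, 1, 1)),
    SourceDoc("pasted-note", "", "public", date(2024, 12, 1)),
    SourceDoc("key-inventory", "security-team", "restricted", date(2024, 12, 1)),
]

eligible = [d.doc_id for d in docs if passes_provenance(d, today)]
print(eligible)
# ['runbook']
```

Each rejected document fails a different check (stale, unowned, too sensitive), which is the point of tagging all three attributes rather than relying on one.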
What's in the full article
Lasso Security's full article covers the operational detail this post intentionally leaves for the source:
- Step-by-step mitigation guidance for RAG retrieval, storage, and generation stages
- Examples of prompt injection and vector database risk scenarios in AI workflows
- Specific hardening measures for context-based access control and output validation
- Implementation detail on encryption, key management, and monitoring controls
👉 Read Lasso Security's analysis of RAG security risks and mitigation strategies →
RAG security and the governance gap teams are missing
Explore further
RAG security is really an identity problem disguised as a model problem. Retrieval systems only work safely when the identities that populate the knowledge base, access the vector store, and feed the prompt are tightly bounded. If those identities are over-privileged, the model becomes a fast amplifier for existing access mistakes rather than a new source of intelligence. Practitioners should treat retrieval paths as governed access paths, not as neutral plumbing.
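One way to make "tightly bounded" concrete is to compare what each identity has been granted on the vector store with what its workload needs. The sketch below is a hypothetical inventory audit: the `Identity` type, the permission names, and the example identities are invented for illustration, and in practice the `granted` set would come from the vector store's or cloud provider's own access listings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Identity:
    name: str
    kind: str            # e.g. "service_account", "api_key", "token"
    granted: frozenset   # permissions currently held on the vector store
    required: frozenset  # permissions the workload actually needs


def excess_permissions(identities):
    """Map each over-privileged identity to the permissions it could lose."""
    return {
        i.name: sorted(i.granted - i.required)
        for i in identities
        if i.granted - i.required
    }


inventory = [
    Identity("ingest-worker", "service_account",
             frozenset({"read", "write"}), frozenset({"read", "write"})),
    Identity("chat-api", "api_key",
             frozenset({"read", "write"}), frozenset({"read"})),
    Identity("eval-notebook", "token",
             frozenset({"read", "write", "admin"}), frozenset({"read"})),
]

findings = excess_permissions(inventory)
print(findings)
# {'chat-api': ['write'], 'eval-notebook': ['admin', 'write']}
```

Only the ingestion worker legitimately needs write access here; the other two identities could poison or alter the knowledge base if compromised, which is the amplification risk described above.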
A few things that frame the scale:
- 72% of organisations have experienced, or suspect they have experienced, a breach of non-human identities (46% confirmed, 26% suspected), according to The 2024 ESG Report: Managing Non-Human Identities.
- Enterprises that have experienced a compromised NHI averaged 2.7 separate incidents in the past 12 months, according to the same report.
A question worth separating out:
Q: How do security teams decide whether to use validation or retrieval controls first?
A: Security teams should prioritise retrieval controls first because validation cannot reliably fix unsafe context after it has entered the prompt window. Output checks are still useful, but they work best as a second layer. If the pipeline can retrieve sensitive or malicious content in the first place, the downstream validator is already working against a compromised input set.
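The ordering argument can be shown with a small sketch. It assumes a deliberately simple output validator that redacts strings shaped like AWS access key IDs, and a stand-in `answer_from` function that just echoes its context in place of a real model; both are illustrative assumptions. The key value is the placeholder used in AWS documentation, not a real credential.

```python
import re

# Pattern for strings shaped like long-term AWS access key IDs ("AKIA" + 16 characters).
AWS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")


def validate_output(answer: str) -> str:
    """Second layer: redact anything that looks like an AWS access key ID."""
    return AWS_KEY_ID.sub("[REDACTED]", answer)


def answer_from(context_chunks):
    """Stand-in for the model: echoes whatever context it was given."""
    return " ".join(context_chunks)


public = ["Deployments run every Tuesday."]
restricted = [
    "Deploy key: AKIAIOSFODNN7EXAMPLE",
    "The payroll export lists every salary.",
]

# Validation only: restricted chunks were retrieved, so the validator is the last defence.
leaky = validate_output(answer_from(public + restricted))

# Retrieval control first: restricted chunks never enter the prompt window.
safe = validate_output(answer_from(public))

print(leaky)
print(safe)
```

In the first case the validator removes the key but passes the payroll sentence, because it can only catch the patterns it was written for. In the second case there is nothing sensitive left for it to miss.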
👉 Read our full editorial: RAG security exposes the identity gap in retrieval pipelines