Retrieval-augmented generation is a pattern where an AI model pulls external information before generating output. The security challenge is that access rules can weaken when data is chunked, embedded, cached, or reused, so source permissions may not automatically follow the content into the model's context.
Expanded Definition
Retrieval-augmented generation, or RAG, combines model output with retrieved source material so an agent or application can answer using fresher, more specific context. In NHI and IAM programs, the security question is not only what the model says, but which identities can retrieve, cache, or reuse the underlying data.
Definitions vary across vendors because some describe RAG as a prompt-time pattern while others treat it as an application architecture. For security teams, the practical boundary is simpler: once content is chunked, embedded, indexed, or cached, the original access control decisions may no longer travel with the data. That makes RAG closely related to zero trust thinking, as described in NIST Cybersecurity Framework 2.0 and the access discipline covered in Ultimate Guide to NHIs.
The most common misapplication is treating retrieval permissions as equivalent to answer permissions, which occurs when teams index sensitive sources without preserving document-level entitlements at query time.
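As a minimal sketch of what preserving document-level entitlements at query time could look like, the example below filters retrieved chunks against the caller's current groups before anything reaches the model's context. The `Chunk` and `Caller` types, the `allowed_groups` field, and the group names are hypothetical and chosen only for illustration; a real deployment would resolve entitlements from its own IAM or source repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str
    source_id: str
    allowed_groups: frozenset  # entitlements copied from the source document at index time


@dataclass(frozen=True)
class Caller:
    identity: str
    groups: frozenset  # groups the caller's NHI holds at query time


def authorized_chunks(candidates, caller):
    """Keep only chunks whose source entitlements overlap the caller's groups."""
    return [c for c in candidates if c.allowed_groups & caller.groups]


# Hypothetical retrieval results: one broadly readable, one finance-only.
candidates = [
    Chunk("Expense policy: receipts are required ...", "policy-001", frozenset({"all-staff"})),
    Chunk("Q3 payroll adjustments ...", "fin-204", frozenset({"finance"})),
]
support_bot = Caller("svc-support-agent", frozenset({"all-staff"}))

context = authorized_chunks(candidates, support_bot)
print([c.source_id for c in context])  # → ['policy-001']
```

The design choice here is that the filter runs on every query, so a chunk being present in the index never implies that a given identity may see it.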
Examples and Use Cases
Rigorous RAG implementations often add latency, permission-mapping overhead, and audit complexity, so organisations must weigh answer quality against control fidelity.
- An internal support agent retrieves policy documents, but only after the caller’s NHI is verified against the same RBAC rules that protect the source repository.
- A finance assistant uses cached embeddings from approved procedures, while the platform checks whether the embedded source still matches the requesting service account’s scope.
- A software engineering copilot pulls from runbooks and incident notes, with JIT access limiting which agents can query sensitive outage records.
- A procurement workflow uses retrieval over contract clauses, but redaction and source filtering prevent an agent from surfacing vendor data outside its business unit.
- A knowledge assistant indexes data from multiple systems, then enforces query-time policy so one identity cannot infer content merely because it was embedded earlier.
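The cached-embedding and query-time policy cases above could be approximated as follows. This is a sketch under stated assumptions: each cached chunk records the version of the source ACL it was cached under, and some lookup (here a plain dictionary, purely hypothetical) reports the current ACL version per source. Chunks whose ACL has changed or whose source is gone are dropped rather than reused.

```python
from dataclasses import dataclass


@dataclass
class CachedChunk:
    text: str
    source_id: str
    acl_version: int  # version of the source ACL when the chunk was cached


def fresh_chunks(cache, current_acl_versions):
    """Return cached chunks whose ACL snapshot still matches the source.

    Chunks whose source is missing or whose ACL version changed are dropped,
    so they must be re-fetched and re-authorized instead of being reused.
    """
    fresh = []
    for chunk in cache:
        if current_acl_versions.get(chunk.source_id) == chunk.acl_version:
            fresh.append(chunk)
    return fresh


cache = [
    CachedChunk("Approved refund procedure ...", "proc-17", acl_version=3),
    CachedChunk("Outage notes for incident 88 ...", "inc-88", acl_version=1),
]
# Hypothetical snapshot of current ACL versions; inc-88 changed after caching.
current = {"proc-17": 3, "inc-88": 2}

usable = fresh_chunks(cache, current)
print([c.source_id for c in usable])  # → ['proc-17']
```

Comparing a version number is only one possible staleness signal; a time-to-live or an event-driven invalidation would serve the same purpose.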
These patterns align with the governance focus in Ultimate Guide to NHIs and the risk-based control model in NIST Cybersecurity Framework 2.0, where identity, access, and data handling are treated as linked control surfaces rather than separate problems.
Why It Matters in NHI Security
RAG matters because it can silently widen access. When retrieval layers ingest secrets, tickets, chats, or operational runbooks, an agent may surface information that its NHI should never have seen directly. That risk becomes sharper when secrets live outside dedicated managers, when service accounts are overprivileged, or when reused chunks persist after access should have expired.
NHIMG research shows that 97% of NHIs carry excessive privileges, increasing unauthorised access and broadening the attack surface, which makes retrieval governance especially important in AI-connected environments. The same research also shows that 96% of organisations store secrets outside of secrets managers in vulnerable locations including code, config files, and CI/CD tools, a pattern that can feed retrieval systems with sensitive content that was never meant for broad reuse. The governance lens in Ultimate Guide to NHIs and the control expectations in NIST Cybersecurity Framework 2.0 both point to the same requirement: know which identity can retrieve what, when, and under which policy.
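One way to reduce the chance that secrets in code or config files end up in a retrieval index is to redact likely secrets before chunking and embedding. The sketch below is illustrative only: the three patterns are assumptions that cover a few common shapes (an AWS-style access key ID, a PEM private key header, and simple key/value assignments), and a dedicated secret scanner would normally replace them.

```python
import re

# Illustrative patterns only; not an exhaustive or production-grade rule set.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),  # shape of an AWS access key ID
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),  # PEM private key header
    re.compile(r"(?i)\b(password|secret|token|api_key)\s*[=:]\s*\S+"),  # key/value assignments
]


def redact_secrets(text):
    """Replace likely secrets with a placeholder and return (clean_text, hit_count)."""
    hits = 0
    for pattern in SECRET_PATTERNS:
        text, n = pattern.subn("[REDACTED]", text)
        hits += n
    return text, hits


doc = "deploy config: api_key = abc123XYZ\nregion: eu-west-1"
clean, hits = redact_secrets(doc)
print(hits)   # → 1
print(clean)  # first line becomes "deploy config: [REDACTED]"
```

A non-zero hit count can also be used as a signal to quarantine the document for review instead of indexing it at all.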
Organisations typically discover RAG risk only after an agent exposes restricted content in a production response, at which point retrieval governance can no longer be deferred.
Standards & Framework Alignment
This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.
The OWASP Non-Human Identity Top 10 addresses the attack and risk surface, while NIST CSF 2.0 and NIST Zero Trust (SP 800-207) set the governance and control requirements practitioners need to meet.
| Framework | Control / Reference | Relevance |
|---|---|---|
| OWASP Non-Human Identity Top 10 | NHI-02 | Covers identity and secret handling failures that RAG can amplify. |
| NIST CSF 2.0 | PR.AA-05 (PR.AC-4 in CSF 1.1) | Maps to least-privilege access management for retrieved content. |
| NIST Zero Trust (SP 800-207) | Not specified | Zero Trust requires continuous verification across retrieval paths and data access. |
Audit retrieval sources, embeddings, and caches to prevent secret sprawl and overexposed NHIs.
Reviewed and updated by the NHIMG editorial team on May 26, 2026.
NHI Mgmt Group — the #1 independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org