Use a vault for third-party OAuth tokens and static API keys, then retrieve credentials only when a tool call needs them. That keeps raw secrets out of the model context and out of application code. The goal is not just storage, but controlled retrieval with narrow scope and short lifetime.
Why This Matters for Security Teams
Agent workflows reduce secret exposure only when raw credentials stay out of prompts, logs, and source code. The problem is not just storage location. Autonomous tools can request access repeatedly, chain actions across systems, and surface tokens in unexpected places unless retrieval is tightly controlled at runtime. NHI Mgmt Group notes that 96% of organisations store secrets outside of secrets managers in vulnerable locations including code, config files, and CI/CD tools, which makes agentic workflows especially brittle. See the Guide to the Secret Sprawl Challenge and the OWASP Non-Human Identity Top 10 for the core risk pattern.
Security teams often assume a vault solves the issue by itself, but the exposure path usually comes from how the agent reaches the secret, how long the token lives, and who can observe the retrieval path. Secrets that are broadly scoped or long-lived can still be copied into traces, memory, cache layers, or downstream tool calls. In practice, many security teams discover secret sprawl only after an agent has already reused a credential in an unintended workflow, not through deliberate secret governance.
How It Works in Practice
The practical pattern is to make the agent ask for access at the moment of need, not to preload credentials into the model context. A vault or broker issues a short-lived token, or unwraps a stored third-party OAuth refresh token only for the exact tool call that needs it. That reduces the blast radius if the agent misbehaves, because the secret is narrow in scope, ephemeral, and revocable. This approach aligns with current guidance in the NIST AI Risk Management Framework and the CSA MAESTRO agentic AI threat modelling framework, which both emphasise runtime governance for high-variance AI behaviour.
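As a minimal sketch of retrieval at the moment of need, the tool runner below fetches a credential only while executing a call and returns a result that contains no secret material. The names `ToolCall`, `fake_vault_fetch`, and `run_tool` are hypothetical and stand in for whatever vault client and tool layer a team actually uses; the outbound API call is left as a comment.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class ToolCall:
    """What the model emits: a tool name and arguments, never a credential."""
    tool: str
    args: Dict[str, str]


def fake_vault_fetch(tool: str) -> str:
    """Hypothetical stand-in for a vault or broker lookup."""
    return f"short-lived-token-for-{tool}"


def run_tool(call: ToolCall, fetch: Callable[[str], str]) -> Dict[str, str]:
    """Fetch the credential at call time and return a secret-free result."""
    token = fetch(call.tool)  # retrieved only now, outside the prompt pipeline
    # A real implementation would send `token` to the target API here.
    authenticated = bool(token)
    # Only non-sensitive fields flow back into the model context.
    return {"tool": call.tool, "status": "ok" if authenticated else "denied"}


result = run_tool(ToolCall("crm.lookup", {"customer_id": "42"}), fake_vault_fetch)
print(result)
```

The point of the design is that the token exists only as a local variable inside `run_tool`, so neither the tool request nor the tool result gives the model anything to leak.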
Operationally, teams should treat secrets as controlled capabilities rather than static configuration. That means:
- Use workload identity for the agent, so the vault can authenticate the agent instance before issuing anything.
- Prefer short TTLs and per-task issuance over reusable long-lived API keys.
- Bind secret release to policy checks such as tool name, target system, environment, and user approval state.
- Keep retrieval outside the prompt pipeline so the model never sees raw secrets unless absolutely unavoidable.
- Log access events, not secret values, and rotate or revoke immediately after the task completes.
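The controls above can be sketched as a small broker function. This is an illustration under stated assumptions, not a reference implementation: the `ALLOWED` policy set, the `Request` and `Lease` types, and the 60-second default TTL are all hypothetical, and workload identity verification is assumed to have happened before `issue` is called.

```python
import logging
import secrets
import time
from dataclasses import dataclass
from typing import Optional

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("broker")

# Hypothetical policy: which (tool, target, environment) tuples may get a token.
ALLOWED = {("crm.lookup", "crm-api", "prod")}


@dataclass(frozen=True)
class Request:
    workload_id: str    # identity of the agent instance, verified upstream
    tool: str
    target: str
    environment: str
    user_approved: bool


@dataclass(frozen=True)
class Lease:
    token: str
    expires_at: float


def issue(req: Request, ttl_seconds: int = 60,
          now: Optional[float] = None) -> Optional[Lease]:
    """Release a short-lived token only if every policy check passes."""
    now = time.time() if now is None else now
    allowed = ((req.tool, req.target, req.environment) in ALLOWED
               and req.user_approved)
    # Log the access event and the decision, never the token value.
    log.info("workload=%s tool=%s target=%s allowed=%s",
             req.workload_id, req.tool, req.target, allowed)
    if not allowed:
        return None
    return Lease(token=secrets.token_urlsafe(32), expires_at=now + ttl_seconds)


def is_valid(lease: Lease, now: Optional[float] = None) -> bool:
    """A lease is usable only before its expiry time."""
    now = time.time() if now is None else now
    return now < lease.expires_at
```

A caller would request a lease per task, use it for the single tool call, and discard it; expiry acts as a backstop if explicit revocation is missed.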
The same logic applies to secrets used in CI/CD-backed agents or multi-step assistants that call external services. The more the workflow chains tools, the more valuable it is to use runtime policy checks and scoped access rather than static entitlements. The 52 NHI Breaches Analysis and the OWASP Agentic AI Top 10 both reinforce that secret handling failures often become identity failures. These controls tend to break down when agents operate across multiple runtimes with weak workload identity and no central broker, because the secret then gets copied into local caches, debug traces, or ad hoc integration code.
Common Variations and Edge Cases
Tighter secret controls often increase latency and operational overhead, so organisations have to balance runtime safety against developer convenience. That tradeoff is real, especially for agent systems that make frequent low-value calls. Best practice is still evolving and there is no universal standard yet: some teams cache ephemeral tokens for a short window, while others fetch on every tool invocation to minimise exposure. The right answer depends on the sensitivity of the target system and the tolerance for failure.
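The two caching strategies can be expressed with one small wrapper, shown here as a sketch. `TokenCache` and its `fetch` callback are hypothetical names; setting `max_age` to 0 gives fetch-on-every-invocation behaviour, while a small positive value gives the short caching window. The cache window is assumed to be no longer than the token's own lifetime.

```python
import time
from typing import Callable, Dict, Tuple


class TokenCache:
    """Caches one token per tool for at most `max_age` seconds; 0 disables caching."""

    def __init__(self, fetch: Callable[[str], str], max_age: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._max_age = max_age
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}

    def get(self, tool: str) -> str:
        now = self._clock()
        entry = self._entries.get(tool)
        # Reuse the cached token only while it is inside the window.
        if entry is not None and now - entry[1] < self._max_age:
            return entry[0]
        token = self._fetch(tool)
        if self._max_age > 0:
            self._entries[tool] = (token, now)
        return token
```

A high-sensitivity target might use `TokenCache(fetch, max_age=0)`, while a chatty low-risk integration might accept a window of a few seconds to cut broker round trips.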
Edge cases usually appear when the workflow must support third-party OAuth, human escalation, or offline execution. In those cases, use the smallest possible token scope and separate the human approval path from the agent’s runtime credentials. If the agent needs to act on behalf of a user, keep the user grant distinct from the agent’s own workload identity, and never merge them into one long-lived credential. NHI Mgmt Group’s Ultimate Guide to NHIs is useful context here, especially where over-privilege and poor rotation make secret leakage far more damaging. In highly distributed agent pipelines, this guidance breaks down when secrets must traverse disconnected services that cannot enforce a shared policy engine, because consistency of retrieval and revocation is then hard to guarantee.
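One way to keep the user grant distinct from the agent's workload identity is to model them as separate objects and require both to pass before any delegated action runs. The sketch below assumes hypothetical `WorkloadIdentity` and `UserGrant` types and a `may_act` check; a real deployment would back these with its identity provider and OAuth server.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class WorkloadIdentity:
    """Who the agent is and which tools it may call at all."""
    agent_id: str
    allowed_tools: FrozenSet[str]


@dataclass(frozen=True)
class UserGrant:
    """What a specific user has delegated, and until when."""
    user_id: str
    scopes: FrozenSet[str]
    expires_at: float


def may_act(agent: WorkloadIdentity, grant: UserGrant,
            tool: str, required_scope: str, now: float) -> bool:
    """Both checks must pass; neither credential implies the other."""
    agent_ok = tool in agent.allowed_tools
    grant_ok = required_scope in grant.scopes and now < grant.expires_at
    return agent_ok and grant_ok
```

Because the two objects are never merged, revoking the user grant leaves the agent's own identity intact, and narrowing the agent's tool list does not touch any user's delegation.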
Standards & Framework Alignment
This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.
OWASP Non-Human Identity Top 10, OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.
| Framework | Control / Reference | Relevance |
|---|---|---|
| OWASP Non-Human Identity Top 10 | NHI-03 | Secret exposure is reduced by rotation, scoping, and short-lived NHI credentials. |
| OWASP Agentic AI Top 10 | A2 | Agent workflows need runtime controls that stop prompt and tool-call secret leakage. |
| CSA MAESTRO | M-AC-2 | MAESTRO addresses policy-driven access for autonomous agent actions and secrets use. |
| NIST AI RMF | | AI RMF supports governance for dynamic AI behaviour and controlled access paths. |
Use NHI-03 to rotate and scope credentials so agents never rely on static long-lived secrets.
Reviewed and updated by the NHIMG editorial team on June 12, 2026.