
What should security teams evaluate before using compound AI systems in production?

By NHI Mgmt Group Editorial Team Updated June 7, 2026 Domain: Agentic AI & Autonomous Identity

Security teams should evaluate how the system decides which model to call, what credentials each step uses, and whether fallback paths are visible and auditable. They should also confirm that policy changes cannot silently expand access or change data flow. If the routing layer is opaque, the governance model is incomplete.

Why This Matters for Security Teams

Compound AI systems add a routing and orchestration layer between the user, the model, and the tools that act on the system’s behalf. That extra layer changes the security question from “is the model safe” to “can every step be explained, bounded, and revoked.” Current guidance suggests teams should treat the router, prompt chain, tool permissions, and fallback logic as part of the trust boundary, not as an implementation detail.

This is where many programmes underestimate risk. A system can appear compliant at the model layer while quietly expanding data access through a retriever, plugin, or backup model with broader permissions. The governance problem is often not the primary model, but the path taken when the primary model fails or defers. NIST’s Cybersecurity Framework 2.0 is useful here because it forces attention on governance, asset visibility, and control outcomes rather than isolated components.

NHIMG research on the State of Secrets in AppSec shows why this matters operationally: 43% of security professionals are concerned about AI systems learning and reproducing sensitive information patterns from codebases. In practice, many security teams encounter hidden privilege expansion only after a fallback path or connector has already exposed data, rather than through intentional review.

How It Works in Practice

Before production, security teams should map the full decision path of the compound system. That means identifying which component selects the model, which component retrieves context, which component calls tools, and which identity is used at each hop. For compound systems, the practical security primitive is not just the model account. It is the workload identity, the secrets boundary, and the policy engine that authorises each action at runtime.
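One way to make that mapping concrete is to record each hop as data and query it. The following is a minimal sketch, not a prescribed method: the `Hop` structure, the step names, and identities such as `svc-shared` are hypothetical and exist only for illustration.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    step: str          # what happens at this hop, e.g. "route" or "tool_call"
    component: str     # which component performs the step
    identity: str      # workload identity used at this hop (hypothetical names)
    scopes: frozenset  # permissions that identity holds


def reused_identities(path: list[Hop]) -> dict[str, list[str]]:
    """Return every identity that is shared by more than one component."""
    components_by_identity = defaultdict(set)
    for hop in path:
        components_by_identity[hop.identity].add(hop.component)
    return {
        identity: sorted(components)
        for identity, components in components_by_identity.items()
        if len(components) > 1
    }


# Illustrative decision path: the retriever and the ticket tool share an identity.
path = [
    Hop("route", "router", "svc-router", frozenset({"models:select"})),
    Hop("retrieve", "retriever", "svc-shared", frozenset({"docs:read"})),
    Hop("tool_call", "ticket-tool", "svc-shared", frozenset({"tickets:write"})),
]

shared = reused_identities(path)
```

Here `shared` flags `svc-shared` as used by both the retriever and the ticket tool, which is the kind of finding a pre-production review would want to surface before an incident does.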

At minimum, review these control points:

  • Model routing logic and whether it can be overridden by prompts, confidence thresholds, or fallback conditions.
  • Per-step credentials, especially whether a downstream tool receives broader access than the initial request requires.
  • Data flow constraints, including what content may be cached, logged, embedded, or passed into retrieval layers.
  • Policy evaluation at request time, ideally with explicit deny rules for high-risk actions and data classes.
  • Auditable fallback paths, so a secondary model or human escalation does not silently change the trust profile.
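The policy-evaluation control point above can be sketched as a deny-first check performed on every request. This is an illustrative sketch under stated assumptions: the rule set, the identity names, and the data classes are hypothetical, and a production system would more likely delegate this decision to a dedicated policy engine.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    identity: str    # workload identity making the call (hypothetical names)
    action: str      # e.g. "read", "write", "delete"
    data_class: str  # classification of the data being touched


# Hypothetical explicit deny rules for high-risk actions and data classes.
DENY_RULES = [
    lambda r: r.data_class == "secrets",
    lambda r: r.action == "delete" and r.data_class == "customer_pii",
]

# Hypothetical allow list; anything not listed falls through to default deny.
ALLOW = {
    ("svc-retriever", "read", "public_docs"),
    ("svc-ticket-tool", "write", "tickets"),
}


def authorise(req: Request) -> tuple[bool, str]:
    """Evaluate explicit denies first, then the allow list, then default deny."""
    for index, rule in enumerate(DENY_RULES):
        if rule(req):
            return False, f"explicit deny (rule {index})"
    if (req.identity, req.action, req.data_class) in ALLOW:
        return True, "allowed"
    return False, "default deny"
```

Returning a reason alongside the decision is a deliberate choice in this sketch: it gives the audit trail something to record for every allow and every deny.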

For implementation, current best practice is evolving toward context-aware authorisation and short-lived secrets, because static role mappings do not describe what a routed agentic pipeline may do in a given moment. The Ultimate Guide to NHIs frames this as an identity problem as much as a model problem: if each step cannot be tied to a distinct non-human identity and purpose, auditability weakens fast. This lines up with the NIST Cybersecurity Framework 2.0 emphasis on governed access, traceability, and monitored execution.
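To illustrate what a short-lived, purpose-bound credential might look like, the sketch below issues an HMAC-signed token tied to one identity and one purpose, with an expiry. It is an assumption-laden illustration rather than a recommended token format: the signing key, claim names, and default TTL are hypothetical, and a real deployment would normally rely on an established secrets manager or workload identity system.

```python
import base64
import hashlib
import hmac
import json
import time

# Assumption: demo-only key. A real key would come from a secrets manager.
SIGNING_KEY = b"demo-only-key"


def _sign(body: bytes) -> str:
    return hmac.new(SIGNING_KEY, body, hashlib.sha256).hexdigest()


def issue_token(identity, purpose, ttl_seconds=60, now=None):
    """Issue a token bound to one identity and one purpose, with an expiry."""
    now = time.time() if now is None else now
    claims = {"sub": identity, "purpose": purpose, "exp": now + ttl_seconds}
    body = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    return f"{body.decode()}.{_sign(body)}"


def verify_token(token, purpose, now=None):
    """Accept the token only if it is untampered, unexpired, and purpose-matched."""
    now = time.time() if now is None else now
    body, _, signature = token.rpartition(".")
    if not hmac.compare_digest(signature, _sign(body.encode())):
        return False
    claims = json.loads(base64.urlsafe_b64decode(body))
    return claims["purpose"] == purpose and now < claims["exp"]


# Usage: a token issued for "docs:read" at t=1000 with a 60-second lifetime.
token = issue_token("svc-retriever", "docs:read", ttl_seconds=60, now=1000.0)
```

In this sketch, the token verifies for `docs:read` while it is still within its lifetime, and is rejected once it has expired or when a different purpose is presented.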

These controls tend to break down when routing is driven by opaque vendor logic, or when a single service account is reused across multiple tools and environments. In both cases, the resulting access graph becomes impossible to reason about during incident response.

Common Variations and Edge Cases

Tighter routing controls often increase operational overhead, requiring organisations to balance model flexibility against auditability and change-management friction. That tradeoff becomes sharper in environments that use multiple providers, retrieval-augmented generation, or dynamic tool selection, where the “right” path may depend on context that changes minute by minute.

There is no universal standard for this yet, but current guidance suggests treating fallback behavior as a first-class risk. If a cheaper model, safer model, or human escalation path has different permissions, the security team must test whether the system preserves least privilege when it switches. The same is true when policy changes are pushed centrally: a silent rule update that expands connector access is a governance event, not a routine configuration tweak.
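Both checks in this paragraph reduce to set comparisons, which makes them straightforward to automate in a pre-deployment test. The sketch below assumes permissions can be expressed as flat sets of scope strings; the scope and connector names are hypothetical.

```python
from __future__ import annotations


def excess_scopes(primary: set[str], fallback: set[str]) -> set[str]:
    """Scopes the fallback path holds that the primary path does not."""
    return fallback - primary


def expanded_access(
    old_policy: dict[str, set[str]], new_policy: dict[str, set[str]]
) -> dict[str, set[str]]:
    """Per connector, the scopes a policy update would newly grant."""
    grown = {}
    for connector, scopes in new_policy.items():
        added = scopes - old_policy.get(connector, set())
        if added:
            grown[connector] = added
    return grown


# Hypothetical fallback model that can also write tickets.
primary = {"docs:read"}
fallback = {"docs:read", "tickets:write"}
extra = excess_scopes(primary, fallback)

# Hypothetical central policy push that widens one connector.
old = {"crm": {"contacts:read"}, "wiki": {"pages:read"}}
new = {"crm": {"contacts:read", "contacts:export"}, "wiki": {"pages:read"}}
grown = expanded_access(old, new)
```

A non-empty `extra` means the fallback does not preserve least privilege, and a non-empty `grown` means the policy change should be handled as a governance event and reviewed before rollout.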

Edge cases also include systems that look benign in isolation but become risky when chained with other agents, external memory, or shared secrets managers. In those cases, use The State of Secrets in AppSec to benchmark secrets hygiene, and treat the DeepSeek breach as a reminder that exposed secrets can turn AI infrastructure into an immediate attack path. Best practice is evolving, but the evaluation principle is stable: if the routing layer can alter privilege, data flow, or fallback behavior without a visible control record, the system is not ready for production.
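As one possible reading of a "visible control record", the sketch below keeps an append-only, hash-chained log of routing and policy changes so that later tampering is detectable. It is an illustration only: the event fields are hypothetical, and real systems would usually store such records in a dedicated audit service.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first record


def _digest(event, prev_hash):
    payload = json.dumps({"event": event, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


def append_record(log, event):
    """Append an event, chaining it to the hash of the previous record."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"event": event, "prev": prev_hash, "hash": _digest(event, prev_hash)}
    log.append(entry)
    return entry


def verify_log(log):
    """Recompute the chain and report whether every record is intact."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(entry["event"], prev_hash):
            return False
        prev_hash = entry["hash"]
    return True


# Hypothetical change events for a routing layer.
log = []
append_record(log, {"change": "fallback_model", "from": "model-a", "to": "model-b"})
append_record(log, {"change": "connector_scope", "connector": "crm", "added": "contacts:export"})
intact_before = verify_log(log)

# Editing a past record breaks the chain.
log[0]["event"]["to"] = "model-c"
intact_after = verify_log(log)
```

In this sketch, `intact_before` is true and `intact_after` is false, because the edited record no longer matches its stored hash.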

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework | Control / Reference | Relevance
OWASP Agentic AI Top 10 | A05 | Compound AI routing and tool use create agentic abuse paths and hidden privilege expansion.
CSA MAESTRO | M2 | MAESTRO addresses governance for multi-step AI workflows with dynamic access and orchestration risks.
NIST AI RMF | | AI RMF helps evaluate governance, transparency, and risk controls across compound AI systems.

Inventory every tool call, route, and fallback path, then block any step that can escalate authority unexpectedly.

NHIMG Editorial Note
Reviewed and updated by the NHIMG editorial team on June 7, 2026.