
What breaks when agent-to-agent workflows are left ungoverned?

The organisation loses control over how operational behaviour spreads. Agents can copy shortcuts, inherit unsafe patterns, and amplify weak permissions across multiple systems. Without governance, a small instruction change can turn into repeated actions in places no human has reviewed, which is why trust boundaries must include inter-agent channels.

Why This Matters for Security Teams

Ungoverned agent-to-agent workflows break security differently from ordinary application sprawl: the risk is not just access, but propagation. One agent can inherit a shortcut, reuse a tool chain, or repeat an unsafe instruction across systems faster than a human reviewer can notice. That makes trust boundaries between agents as important as the boundary around the original workload. The OWASP Agentic AI Top 10 and the NIST AI Risk Management Framework both point to the same issue: autonomous systems need runtime control, not just setup-time approval.

This is where NHI governance becomes operational rather than theoretical. Agents are not static users, so they should not be managed as if they were. Their identities, secrets, and authorisations must be treated as short-lived and context-sensitive, especially when one agent can trigger another. NHIMG research on the OWASP NHI Top 10 shows how quickly weak identity hygiene turns into systemic exposure when automation is chained together. In practice, many security teams encounter inter-agent abuse only after an automated workflow has already repeated the mistake at scale, rather than through intentional testing.

How It Works in Practice

Left ungoverned, agent-to-agent workflows usually fail in three places: delegated authority, secret sharing, and uncontrolled action chaining. A planning agent may call a task agent, which then calls a retrieval or execution agent, and each hop expands the blast radius unless policy is evaluated at every step. Static RBAC is weak here because the next action is not always predictable in advance. Better practice is evolving toward intent-based authorisation, where the runtime decision checks what the agent is trying to do, what data it is touching, and whether the receiving agent is allowed to act on that context.
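As a minimal sketch of per-hop evaluation, the Python below checks every hop in a chain against a small policy table keyed on the receiving agent and the intended action, with the data label as the third input. The agent names, action names, data labels, and the `POLICY` table are all hypothetical and chosen only for illustration; a real deployment would delegate the decision to a policy engine rather than an in-memory dictionary.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class HopRequest:
    caller: str      # agent making the call
    receiver: str    # agent being asked to act
    action: str      # what the caller wants done (the "intent")
    data_label: str  # sensitivity label of the data involved


# Hypothetical policy: (receiver, action) -> data labels the receiver may act on.
# Anything not listed is denied by default.
POLICY = {
    ("task-agent", "summarise"): frozenset({"public", "internal"}),
    ("exec-agent", "read"): frozenset({"public"}),
}


def authorize_hop(req: HopRequest) -> bool:
    """Decide one hop from its intent, its data, and the receiving agent."""
    allowed_labels = POLICY.get((req.receiver, req.action), frozenset())
    return req.data_label in allowed_labels


def first_denied_hop(chain: Sequence[HopRequest]) -> Optional[int]:
    """Evaluate policy at every hop; return the index of the first denial."""
    for index, hop in enumerate(chain):
        if not authorize_hop(hop):
            return index
    return None


chain = [
    HopRequest("planner", "task-agent", "summarise", "internal"),
    HopRequest("task-agent", "exec-agent", "read", "internal"),
]
denied_at = first_denied_hop(chain)  # second hop touches data it may not read
```

The point of the design is that approval of the first hop says nothing about the second: each call is decided on its own inputs.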

That usually means combining workload identity with ephemeral credentials. Cryptographic workload identity, such as SPIFFE or OIDC-backed service identity, proves what the agent is, while JIT-issued credentials limit what it can do and for how long. Secrets should be short-lived, scoped to a task, and revoked automatically when the task ends. Policy-as-code engines such as OPA or Cedar can enforce this at request time, while logs and traces capture inter-agent delegation for later review. The CSA MAESTRO agentic AI threat modelling framework and NHIMG’s Lifecycle Processes for Managing NHIs both reinforce this runtime-first model. This is the same operational lesson visible in the AI LLM hijack breach pattern: once one automated step is trusted by default, the next step often inherits that trust without review.
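The credential side of this can be sketched with the standard library alone. The `CredentialBroker` class below is a hypothetical, in-memory stand-in for a real secrets manager: it issues a random token scoped to one task, refuses it after a time-to-live, and drops every token for a task when that task ends. The scope strings and default TTL are assumptions for illustration only.

```python
import secrets
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskCredential:
    token: str
    agent_id: str
    task_id: str
    scope: frozenset   # actions this credential permits
    expires_at: float  # clock value after which the token is dead


class CredentialBroker:
    """Hypothetical in-memory issuer of short-lived, task-scoped tokens."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so expiry can be tested
        self._live = {}      # token -> TaskCredential

    def issue(self, agent_id, task_id, scope, ttl_seconds=60.0):
        cred = TaskCredential(
            token=secrets.token_urlsafe(32),
            agent_id=agent_id,
            task_id=task_id,
            scope=frozenset(scope),
            expires_at=self._clock() + ttl_seconds,
        )
        self._live[cred.token] = cred
        return cred

    def is_valid(self, token, action):
        cred = self._live.get(token)
        if cred is None:
            return False
        if self._clock() >= cred.expires_at:
            del self._live[token]  # expired tokens are purged on sight
            return False
        return action in cred.scope

    def revoke_task(self, task_id):
        """Drop every credential for a finished task; return how many."""
        stale = [t for t, c in self._live.items() if c.task_id == task_id]
        for token in stale:
            del self._live[token]
        return len(stale)
```

A caller would issue a credential when a task starts, pass only the token to the downstream agent, and call `revoke_task` in the task's cleanup path so that nothing outlives the work it was issued for.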

  • Evaluate authorisation at each hop, not just at session start.
  • Issue short-lived credentials per task, then revoke them automatically.
  • Treat agent-to-agent messages as privileged control traffic, not ordinary app traffic.
  • Record delegation chains so investigators can reconstruct how an instruction spread.
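
The last control in the list, recording delegation chains, can be illustrated with a small parent-linked log. This is a sketch under stated assumptions: the `DelegationLog` class, its field names, and the example agents are hypothetical, and a production system would write to tamper-evident storage rather than a dictionary.

```python
import itertools
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Delegation:
    hop_id: int
    parent_id: Optional[int]  # hop that caused this one, None for the root
    caller: str
    receiver: str
    instruction: str


class DelegationLog:
    """Hypothetical append-only record of who delegated what to whom."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._hops = {}

    def record(self, caller, receiver, instruction, parent_id=None):
        if parent_id is not None and parent_id not in self._hops:
            raise KeyError(f"unknown parent hop {parent_id}")
        hop = Delegation(next(self._ids), parent_id, caller, receiver, instruction)
        self._hops[hop.hop_id] = hop
        return hop.hop_id

    def trace(self, hop_id):
        """Walk parent links back to the root and return the chain in order."""
        chain = []
        current = hop_id
        while current is not None:
            hop = self._hops[current]
            chain.append(hop)
            current = hop.parent_id
        chain.reverse()
        return chain


log = DelegationLog()
root = log.record("planner", "task-agent", "triage ticket")
mid = log.record("task-agent", "retrieval-agent", "fetch history", parent_id=root)
leaf = log.record("retrieval-agent", "exec-agent", "close ticket", parent_id=mid)
path = [hop.receiver for hop in log.trace(leaf)]
```

Given any action an investigator is worried about, `trace` returns the full path of callers that led to it, starting from the original instruction.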

These controls tend to break down in high-churn multi-agent pipelines because ownership changes quickly and policy drift outpaces manual review.

Common Variations and Edge Cases

Tighter inter-agent control often increases latency and integration overhead, so organisations must balance containment against the need for fast orchestration. There is no universal standard for this yet, especially in systems where agents collaborate across vendors or across loosely coupled internal platforms. That uncertainty is why current guidance suggests using runtime policy checks for anything that can write, deploy, spend, or exfiltrate data, while allowing lower-risk coordination to remain narrowly scoped.
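
One way to picture that trade-off is a gate that only pays for a runtime policy check on high-risk actions. In the sketch below the verb sets are assumptions made for illustration (with `export` standing in for anything that could exfiltrate data), and `runtime_check` is a hypothetical callable representing a call out to a policy engine.

```python
from typing import Callable

# Hypothetical verb tiers; a real system would derive these from policy.
HIGH_RISK_VERBS = frozenset({"write", "deploy", "spend", "export"})
LOW_RISK_ALLOWLIST = frozenset({"status", "summarise", "route"})


def gate(action_verb: str, runtime_check: Callable[[], bool]) -> str:
    """Return "allow" or "deny" for one inter-agent action."""
    if action_verb in HIGH_RISK_VERBS:
        # High-risk actions always take the slower runtime decision.
        return "allow" if runtime_check() else "deny"
    if action_verb in LOW_RISK_ALLOWLIST:
        # Narrowly scoped coordination skips the check to save latency.
        return "allow"
    # Anything unclassified is denied rather than guessed at.
    return "deny"
```

The latency cost is confined to the high-risk tier, and an unrecognised verb fails closed instead of falling through to the fast path.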

Two edge cases deserve attention. First, agents that merely summarise or route requests can still become policy bypasses if they forward unsafe instructions into a more privileged workflow. Second, agents that operate in partially trusted ecosystems can copy patterns from neighbouring agents and amplify a flawed permission model without any single obvious violation. NHIMG’s Top 10 NHI Issues and the NIST Cybersecurity Framework 2.0 both support stronger visibility, but they do not remove the need for human-defined trust boundaries between agents. Current best practice is to assume that any agent capable of calling another agent can also become a distribution point for unsafe behaviour unless that channel is explicitly governed.
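
The first edge case can be made concrete with a simple attenuation rule: a forwarded instruction is judged by the least trusted agent that handled it, not by the agent that delivered it. This is one possible approach rather than an established standard, and the trust levels, agent names, and actions below are hypothetical.

```python
# Hypothetical trust levels per agent and required levels per action.
TRUST_LEVEL = {"external-intake": 0, "router-agent": 1, "deploy-agent": 2}
REQUIRED_LEVEL = {"summarise": 0, "restart-service": 2}


def effective_level(handlers):
    """Lowest trust level among all agents that touched the instruction.

    Unknown agents count as -1, which is below every required level.
    """
    return min(TRUST_LEVEL.get(agent, -1) for agent in handlers)


def accept_forwarded(handlers, action):
    """Accept only if every handler in the chain could have asked directly."""
    if not handlers:
        return False  # no provenance, no authority
    required = REQUIRED_LEVEL.get(action, float("inf"))  # unknown action: deny
    return effective_level(handlers) >= required
```

Under this rule a router cannot launder a low-trust request into a privileged one: `accept_forwarded(["external-intake", "router-agent"], "restart-service")` is rejected even though the router itself is an internal agent.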

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework | Control / Reference | Relevance
OWASP Agentic AI Top 10 | A2 | Agent chaining and unsafe delegation are core agentic AI risks.
CSA MAESTRO | TRUST-2 | MAESTRO addresses trust boundaries and runtime control for agent workflows.
NIST AI RMF | - | AI RMF governance applies to accountability and monitoring of autonomous agents.

Model each inter-agent hop as a policy gate with scoped, revocable authority.