What are the main reasons AI agents struggle to achieve enterprise-scale deployment?

AI agents often face deployment challenges due to gaps in identity management, authorization policies, access control mechanisms, and audit trails. Without addressing these essential governance issues, enterprises may find it hard to scale their AI initiatives.

Why AI Agents Fail to Scale Across the Enterprise

AI agents do not struggle at scale because they are “too smart”; they struggle because they are autonomous software entities with execution authority, tool access, and changing intent. Static RBAC and perimeter assumptions work poorly when an agent can chain actions, request new data, and pursue a goal in ways that were not pre-approved. The NIST AI Risk Management Framework and the OWASP Agentic AI Top 10 both point to governance, traceability, and runtime control as core requirements, not optional add-ons.

That is why NHIMG’s research on the OWASP NHI Top 10 is so relevant: enterprise deployment fails when identity, authorization, and audit design are bolted on after agents have already been allowed to act. The practical issue is not only access to a model, but the full operating path around it: workloads, tokens, connectors, and downstream systems. In practice, many security teams discover agent overreach only after sensitive data exposure or tool abuse has already occurred, rather than anticipating it at design time.

How It Works in Practice

Enterprise-scale deployment usually breaks down in four places. First, agents need a workload identity that proves what they are, not just a shared application account. That is why SPIFFE-style workload identity, short-lived OIDC tokens, and similar cryptographic identities matter more than long-lived service credentials. Second, authorisation must move from static roles to intent-based or context-aware decisions. Current guidance suggests evaluating what the agent is trying to do at request time, using policy-as-code and full context, rather than assuming a fixed job function.
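The request-time decision described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `AgentRequest` fields, the `POLICY` table, the `spiffe://example.org/agents/` trust domain, and the `authorize` function are all hypothetical names chosen for the example. A real deployment would verify a cryptographic identity document rather than compare a string prefix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentRequest:
    workload_id: str       # SPIFFE-style ID; assumed already cryptographically verified
    tool: str              # tool the agent wants to call
    action: str            # operation on that tool
    declared_intent: str   # task the agent says it is performing
    data_sensitivity: str  # "low", "medium", or "high"


# Hypothetical policy-as-code table: (tool, action) pairs each intent may use.
POLICY = {
    "triage_ticket": {("ticketing", "read"), ("ticketing", "comment")},
    "rotate_key": {("cloud_api", "rotate_key")},
}

TRUSTED_PREFIX = "spiffe://example.org/agents/"  # assumed trust domain


def authorize(req: AgentRequest) -> tuple[bool, str]:
    """Return (allowed, reason) for one request, evaluated at call time."""
    if not req.workload_id.startswith(TRUSTED_PREFIX):
        return False, "unknown workload identity"
    allowed_pairs = POLICY.get(req.declared_intent)
    if allowed_pairs is None:
        return False, "intent not recognised"
    if (req.tool, req.action) not in allowed_pairs:
        return False, "action outside declared intent"
    if req.data_sensitivity == "high":
        return False, "high-sensitivity data requires human approval"
    return True, "allowed"


if __name__ == "__main__":
    req = AgentRequest(
        "spiffe://example.org/agents/triage", "database", "read",
        "triage_ticket", "low",
    )
    print(authorize(req))
```

The point of the sketch is that the decision depends on what the agent is trying to do in this request, not on a role assigned when the agent was deployed.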

Third, credential lifecycle becomes a scaling constraint. JIT credential provisioning and ephemeral secrets reduce blast radius because access is issued per task and revoked when the task completes. This is especially important for autonomous agents that can continue operating after an operator has moved on. Fourth, audit trails must be complete enough to reconstruct tool use, data access, and downstream side effects. NHIMG’s reporting on the AI LLM hijack breach and the DeepSeek breach shows how exposed secrets and weak governance quickly become enterprise incidents.
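As a rough sketch of per-task credentials with an audit trail, the following Python uses an in-memory broker. The `CredentialBroker` class, its method names, and the scope strings are assumptions made for illustration; a production system would typically delegate issuance and revocation to a secrets manager or identity provider and write audit events to durable storage.

```python
import secrets
import time
from dataclasses import dataclass


@dataclass
class TaskCredential:
    task_id: str
    scope: str
    token: str
    expires_at: float
    revoked: bool = False


class CredentialBroker:
    """Hypothetical in-memory broker: issues one short-lived token per task."""

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._issued = {}
        self.audit_log = []  # append-only list of event dicts

    def _record(self, event, task_id, scope):
        self.audit_log.append(
            {"event": event, "task_id": task_id, "scope": scope, "at": self._clock()}
        )

    def issue(self, task_id, scope):
        cred = TaskCredential(
            task_id, scope, secrets.token_urlsafe(32), self._clock() + self._ttl
        )
        self._issued[cred.token] = cred
        self._record("issued", task_id, scope)
        return cred

    def is_valid(self, token, scope):
        cred = self._issued.get(token)
        ok = (
            cred is not None
            and not cred.revoked
            and cred.scope == scope
            and self._clock() < cred.expires_at
        )
        self._record(
            "check_ok" if ok else "check_denied",
            cred.task_id if cred else "unknown",
            scope,
        )
        return ok

    def complete_task(self, task_id):
        # Revoke everything issued for the task as soon as it finishes.
        for cred in self._issued.values():
            if cred.task_id == task_id and not cred.revoked:
                cred.revoked = True
                self._record("revoked", task_id, cred.scope)
```

Because every issue, check, and revocation is recorded, the log can later be used to reconstruct which task held which access and when it ended, even if the agent kept running after its operator moved on.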

Vendor research reinforces the scale problem: SailPoint found that 80% of organisations say their AI agents have already acted beyond intended scope, while only 52% can track and audit the data those agents access. That gap matters because an agent that can reach a ticketing system, a cloud API, and a database can also chain those tools in ways security teams did not model. These controls tend to break down when agents are granted broad API access in loosely governed SaaS and cloud environments because runtime policy checks are either missing or bypassed by convenience exceptions.

Common Variations and Edge Cases

Tighter controls often increase integration overhead, requiring organisations to balance faster agent rollout against stricter identity and policy design. There is no universal standard for every agent pattern yet, so best practice is evolving around the risk level of the workload rather than one fixed control set. For low-risk internal assistants, coarse approvals may be acceptable; for agents handling production changes, customer data, or finance workflows, real-time authorisation is the safer path.

Edge cases appear when agents are multi-tenant, operate across multiple tools, or are orchestrated by other agents. In those environments, a single human approval is rarely enough because downstream actions can diverge from the original request. The Anthropic report on AI-orchestrated cyber operations and NHIMG’s Moltbook AI agent keys breach both illustrate how quickly secrets and tool access can be abused once they are exposed. The right question is not whether an agent can be trusted once, but whether it can be safely re-authorised every time it acts.
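One way to picture per-action re-authorisation in an orchestrated chain is the sketch below. The `Step` type and `run_chain` function are hypothetical, and the `authorize` and `execute` callables stand in for whatever policy engine and tool runtime an organisation actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    agent: str   # which agent in the chain performs the step
    tool: str
    action: str


def run_chain(steps, authorize, execute):
    """Re-check every step just before it runs; stop at the first denial.

    Returns (executed_steps, blocked_step), where blocked_step is None
    when the whole chain was allowed.
    """
    executed = []
    for step in steps:
        if not authorize(step):
            return executed, step
        execute(step)
        executed.append(step)
    return executed, None


if __name__ == "__main__":
    # The original human approval covered only ticketing reads and comments.
    approved = {("ticketing", "read"), ("ticketing", "comment")}
    plan = [
        Step("planner", "ticketing", "read"),
        Step("worker", "database", "export"),  # diverges from the approval
        Step("worker", "ticketing", "comment"),
    ]
    done, blocked = run_chain(
        plan, lambda s: (s.tool, s.action) in approved, lambda s: None
    )
    print(done, blocked)
```

Here the second step is blocked even though the chain as a whole began with an approved request, which is the behaviour a single up-front approval cannot provide.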

In practice, the hardest failures show up when teams treat agents like ordinary apps instead of goal-driven workloads with live privileges and hidden branching behaviour.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	A01	Agentic apps fail when identity and tool authorization are not runtime-governed.
CSA MAESTRO		MAESTRO addresses governance for autonomous agents, including access and oversight gaps.
NIST AI RMF	GOVERN	AI RMF governance is central to accountability, auditability, and risk oversight.

Assign accountable owners, log agent actions, and review risk continuously under AI RMF GOVERN.
