Why do AI pilots create so many identity and access control problems?

Why This Matters for Security Teams

AI pilots usually begin as speed-first exercises, but identity control failures appear immediately because pilots lean on shared service accounts, copied API keys, and broad environment-level access. That pattern is exactly what the OWASP Non-Human Identity Top 10 warns against: once a workload can act autonomously, human-style assumptions about fixed access paths stop holding. NHIMG research shows the same problem at scale in production: the Ultimate Guide to NHIs reports that 97% of NHIs carry excessive privileges and only 5.7% of organisations have full visibility into service accounts.

The practical risk is not just secret leakage. AI pilots often create a chain of weak identity signals: no unique workload identity, no per-action authorization, and no trustworthy audit trail tying a prompt, tool call, or file access back to a specific agent instance. That makes it hard to prove policy compliance or contain blast radius when the pilot crosses from test data into live systems. In practice, many security teams encounter this only after a pilot has already been promoted and the first production access review exposes how little is actually controlled.

How It Works in Practice

The core issue is that pilots usually treat the AI agent like a user with a temporary login, when it behaves more like an autonomous workload with evolving intent. For that reason, static RBAC alone is usually too blunt. Current guidance suggests combining workload identity, runtime policy checks, and SPIFFE-style cryptographic identity so the system can verify what the agent is, what task it is attempting, and whether the request fits policy right now.

A safer operating model looks like this:

Issue a unique workload identity for each agent or task, rather than reusing a shared secret.

Use just-in-time, short-lived credentials that expire when the task ends.

Evaluate authorisation at request time using policy-as-code, not a fixed pilot approval matrix.

Log tool calls, data access, and secret retrieval with enough context to reconstruct agent behaviour.

Separate read, write, and execution permissions so one successful prompt cannot unlock an entire environment.
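The operating model above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the names `TaskCredential`, `issue_credential`, `authorize`, and `AUDIT_LOG`, the `resource:action` scope format, and the 300-second default lifetime are all hypothetical, and a real deployment would delegate issuance and policy evaluation to a workload identity system and a policy engine rather than in-process code.

```python
import secrets
import time
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TaskCredential:
    """Hypothetical short-lived credential bound to one agent instance and task."""
    agent_id: str         # unique per agent instance, never shared
    task_id: str          # the single task this credential was issued for
    scopes: frozenset     # explicit "resource:action" grants, e.g. "tickets:read"
    expires_at: float     # absolute expiry time (seconds since epoch)
    token: str = field(default_factory=lambda: secrets.token_urlsafe(32))


# In-memory audit trail for illustration only; assume an append-only store in practice.
AUDIT_LOG = []


def issue_credential(agent_name, scopes, ttl_seconds=300, now=None):
    """Issue a fresh identity and credential for one task (just-in-time, short-lived)."""
    now = time.time() if now is None else now
    return TaskCredential(
        agent_id=f"{agent_name}-{uuid.uuid4()}",
        task_id=str(uuid.uuid4()),
        scopes=frozenset(scopes),
        expires_at=now + ttl_seconds,
    )


def authorize(cred, resource, action, now=None):
    """Evaluate the request at call time and record the decision with its context."""
    now = time.time() if now is None else now
    if now >= cred.expires_at:
        allowed, reason = False, "credential expired"
    elif f"{resource}:{action}" not in cred.scopes:
        # Read, write, and execute are separate grants; holding one implies no other.
        allowed, reason = False, "scope not granted"
    else:
        allowed, reason = True, "ok"
    AUDIT_LOG.append({
        "time": now,
        "agent_id": cred.agent_id,
        "task_id": cred.task_id,
        "resource": resource,
        "action": action,
        "allowed": allowed,
        "reason": reason,
    })
    return allowed
```

In this sketch every decision, including each denial, lands in the audit trail with the agent and task identifiers attached, which is what makes it possible to reconstruct agent behaviour afterwards. A caller would issue a credential when a task starts, call `authorize` before each tool call, and discard the credential when the task ends.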

This matters because AI pilots often chain tools in ways humans do not anticipate. An agent that can read a file, call an API, and trigger another workflow can accidentally or deliberately expand its own access path. The NHI lifecycle guidance in Ultimate Guide to NHIs — Key Challenges and Risks is relevant here, but pilots add a runtime decision layer that traditional account governance does not cover. Best practice is evolving, not settled, around how much autonomy to permit before a human approval step is required.

These controls tend to break down when pilots are embedded inside legacy CI/CD pipelines or shared SaaS tenants, because the environment cannot reliably distinguish one agent instance from another and the surrounding tools were built for human sessions.

Common Variations and Edge Cases

Tighter identity control often increases operational overhead, requiring organisations to balance pilot speed against auditability and revocation discipline. Some teams try to keep the pilot “lightweight” by using a single developer token across multiple agents or by extending token lifetimes for convenience. That usually creates the opposite of what the pilot intended: more access, less traceability, and harder rollback.

There is no universal standard for this yet, but a few edge cases are becoming clearer. Multi-agent systems need separate identities for each agent and for each high-risk tool path, otherwise one compromised planner can inherit the authority of the whole chain. Human-in-the-loop approvals also help, but only if they are tied to a specific runtime action, not a generic approval for the project. For sensitive environments, PCI DSS v4.0 reinforces the need to constrain access to cardholder data systems, which becomes especially important when an AI pilot is allowed to inspect logs, tickets, or customer records. NHIMG’s 52 NHI Breaches Analysis also shows that weak non-human identity handling is rarely a theoretical problem; it is usually discovered after credentials, automation, or third-party access have already been abused.
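One way to tie a human approval to a specific runtime action, rather than to the project as a whole, is to have the approver sign a digest of the exact agent, tool, and arguments. The sketch below assumes a shared HMAC key held by the approval service; the function names, the key handling, and the JSON canonicalisation are illustrative assumptions, not an established standard.

```python
import hashlib
import hmac
import json


def action_digest(agent_id, tool, arguments):
    """Hash the exact action: which agent, which tool, which arguments."""
    canonical = json.dumps(
        {"agent_id": agent_id, "tool": tool, "arguments": arguments},
        sort_keys=True,              # key order must not change the digest
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def sign_approval(approver_key, digest):
    """Approver side: sign the digest of the one action being approved."""
    return hmac.new(approver_key, digest.encode("utf-8"), hashlib.sha256).hexdigest()


def approval_covers(approver_key, signature, agent_id, tool, arguments):
    """Enforcement side: accept only if the approval matches this exact action."""
    expected = sign_approval(approver_key, action_digest(agent_id, tool, arguments))
    return hmac.compare_digest(expected, signature)
```

Because the signature covers the agent identifier as well as the tool and arguments, an approval granted to one agent in a chain cannot be replayed by another agent, and changing any argument (for example, raising a record limit) invalidates it.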

For pilots that touch regulated data, the right question is not whether the model is accurate, but whether every action it takes can be attributed, limited, and revoked before the next task begins.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	A2	Agentic systems need runtime controls for autonomous tool use and identity sprawl.
CSA MAESTRO	I1	MAESTRO addresses identity and access control for autonomous multi-agent workflows.
NIST AI RMF		AI RMF governance is needed to manage accountability and runtime risk in pilots.

Assign per-agent identities and enforce request-time policy checks before any tool call.

