Why do LLM applications create governance problems for IAM and security teams?

Why This Matters for Security Teams

LLM applications change the governance problem because the model is no longer just producing text. Once it can call tools, query internal systems, or trigger workflows, untrusted prompts can become authorised actions. That collapses the old separation between application security, identity governance, and runtime policy. Security teams are then forced to answer not only “who can log in” but “what can this model do right now, with this context, and for how long?”

This is why current guidance increasingly points to agent and workload controls rather than human-centric IAM alone, as reflected in the OWASP Agentic AI Top 10 and the NIST AI Risk Management Framework. NHIMG research shows the operational gap is already visible: according to the AI Agents: The New Attack Surface report, only 52% of companies can track and audit the data their AI agents access, leaving the other 48% with a complete blind spot for compliance and breach investigation.

In practice, many security teams encounter uncontrolled agent behaviour only after a prompt, tool call, or data leak has already occurred, rather than through intentional design review.

How It Works in Practice

Governance failures usually begin when an LLM application inherits privileges from the surrounding service account instead of receiving task-scoped authority. That model works poorly because the application’s behaviour is not fixed. The same agent may summarise a ticket, query a customer record, open a file, or initiate an API call depending on the prompt and state. Static RBAC cannot express that variability well, especially when the question is whether a specific action is safe at this moment.

A more practical pattern is to treat the agent as a workload identity and evaluate permissions at request time. That means using cryptographic workload identity to prove what the agent is, then issuing short-lived, just-in-time credentials for the specific task it is about to perform. Where possible, policy should be context-aware and evaluated in real time, expressed as policy-as-code rather than as pre-defined allow lists. The NIST AI Risk Management Framework supports this kind of lifecycle-based control, while the CSA MAESTRO agentic AI threat modeling framework helps teams map tool access, memory, and action chains.
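As a rough illustration of request-time evaluation, the sketch below decides on each action from its context instead of from a static role. Everything in it is an assumption made for the example: the ActionRequest fields, the tool name crm_lookup, the data classes, and the three rules are hypothetical, and the agent identity is assumed to have been verified cryptographically before this function is called.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionRequest:
    """One action an agent wants to take (all field names are hypothetical)."""
    agent_id: str                 # workload identity, assumed verified upstream
    tool: str                     # tool the agent wants to call
    mode: str                     # "read" or "write"
    data_class: str               # e.g. "public", "internal", "customer"
    human_approved: bool = False  # set by an approval step, if one exists


def evaluate(request: ActionRequest) -> tuple[bool, str]:
    """Return (allowed, reason) for this request, decided at call time."""
    if request.mode not in {"read", "write"}:
        return False, "unknown mode"
    # Illustrative rule: write-capable actions need an explicit approval.
    if request.mode == "write" and not request.human_approved:
        return False, "write actions require approval"
    # Illustrative rule: customer data is reachable through one tool only.
    if request.data_class == "customer" and request.tool != "crm_lookup":
        return False, "customer data is only reachable through crm_lookup"
    return True, "allowed"
```

In a real deployment these rules would more likely live in a dedicated policy engine than in application code. The point of the sketch is only that the decision takes the action, the data, and the approval state as inputs on every request.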

Issue task-bound tokens with short TTLs instead of long-lived API keys.

Separate read-only retrieval from write-capable actions.

Log prompt, tool call, and output events as one control plane.

Revoke credentials automatically when the task ends or the context changes.
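The four controls above can be sketched with a small, standard-library-only token scheme. This is a minimal illustration, not a production design: the function names, the claim names, the scope strings such as tickets:read, and the in-memory signing key and revocation set are all assumptions made for the example. A real system would more likely rely on an established token format and a managed key service.

```python
import base64
import hashlib
import hmac
import json
import secrets
import time

# Hypothetical in-process key and revocation store, for illustration only.
SIGNING_KEY = secrets.token_bytes(32)
_revoked: set[str] = set()


def issue_token(agent_id, task_id, scope, ttl_seconds=60, now=None):
    """Mint a token bound to one task, with explicit scopes and a short TTL."""
    now = time.time() if now is None else now
    claims = {"sub": agent_id, "task": task_id, "scope": scope,
              "exp": now + ttl_seconds, "jti": secrets.token_hex(8)}
    payload = base64.urlsafe_b64encode(json.dumps(claims).encode()).decode()
    sig = hmac.new(SIGNING_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return payload + "." + sig


def check_token(token, task_id, needed_scope, now=None):
    """Accept only an unexpired, unrevoked token for this task and scope."""
    now = time.time() if now is None else now
    payload, _, sig = token.partition(".")
    expected = hmac.new(SIGNING_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return False
    claims = json.loads(base64.urlsafe_b64decode(payload))
    if claims["jti"] in _revoked or now >= claims["exp"]:
        return False
    return claims["task"] == task_id and needed_scope in claims["scope"]


def revoke(token):
    """Invalidate a token early, e.g. when the task ends or context changes."""
    payload = token.partition(".")[0]
    _revoked.add(json.loads(base64.urlsafe_b64decode(payload))["jti"])
```

Keeping read and write as separate scope strings means a retrieval task never holds a credential that a write-capable endpoint would accept, and calling revoke when the task completes closes the window before the TTL would.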

NHIMG’s OWASP NHI Top 10 is useful here because many of the same failure modes reappear as over-privilege, missing rotation, and weak visibility. These controls tend to break down when agents chain tools across multiple systems because context is lost between the model, the orchestrator, and downstream APIs.
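One way to reduce that loss of context is to carry a single trace identifier through the prompt, each tool call, and the output, and to forward it to downstream APIs. The sketch below is illustrative only: the AuditTrail class, the event fields, and the header names X-Trace-Id and X-Task-Id are hypothetical and not part of any framework named here.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class AuditTrail:
    """Collects prompt, tool-call, and output events under one trace ID."""
    task_id: str
    trace_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    events: list = field(default_factory=list)

    def record(self, kind: str, detail: dict) -> dict:
        """Append one event; every event carries the same trace and task IDs."""
        event = {"trace_id": self.trace_id, "task_id": self.task_id,
                 "seq": len(self.events), "kind": kind, "detail": detail}
        self.events.append(event)
        return event

    def headers(self) -> dict:
        """Headers an orchestrator could forward downstream (names assumed)."""
        return {"X-Trace-Id": self.trace_id, "X-Task-Id": self.task_id}


# Usage: one trail per task, shared by the model wrapper and the orchestrator.
trail = AuditTrail(task_id="task-42")
trail.record("prompt", {"chars": 120})
trail.record("tool_call", {"tool": "crm_lookup", "mode": "read"})
trail.record("output", {"chars": 300})
```

If downstream services log the forwarded identifiers, an investigator can join their records with the agent-side events instead of reconstructing the chain by timestamp.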

Common Variations and Edge Cases

Tighter control often increases latency and operational overhead, so teams have to balance blast-radius reduction against developer friction and user experience. That tradeoff becomes sharper in high-volume environments, where per-request policy checks, secret minting, and audit logging can slow down workflows if they are bolted on late.

There is no universal standard for this yet, but current guidance suggests three recurring edge cases. First, retrieval-only LLM apps still become governance issues if retrieved content can influence downstream decisions or if output is copied into privileged systems. Second, multi-agent pipelines are harder than single-agent assistants because each hop expands the trust boundary and increases the chance of privilege chaining. Third, human approval steps do not fully solve the problem if the agent already has the ability to stage actions, collect secrets, or prepare harmful payloads before approval.

NHIMG’s The State of Non-Human Identity Security highlights why this matters operationally: only 15% of organisations are highly confident in their ability to secure NHIs. That confidence gap is exactly what shows up when LLM applications are treated as ordinary software rather than as autonomous, policy-sensitive workloads. The safest path is to align IAM, application security, and AI governance around runtime decisioning, not static entitlement reviews.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	A1	Agent prompt and tool abuse are central to LLM governance failures.
CSA MAESTRO	M1	MAESTRO addresses agent threat modeling across tools, memory, and actions.
NIST AI RMF	GOVERN	AI RMF governance is needed to assign accountability for agent behaviour.

Model each agent workflow and bound every tool chain with explicit trust decisions.
