Why do AI models create more security risk than traditional applications?

AI models create more risk because they can be manipulated through prompts, poisoned data, and connected APIs, not just through code defects. Their behaviour also changes with context, which means access, data provenance, and runtime monitoring matter as much as static hardening.

Why This Matters for Security Teams

Traditional applications usually fail in bounded, testable ways: a vulnerability exists, a patch is applied, and the risk surface is reduced. AI models are different because they can be steered at runtime through prompts, training data, retrieval sources, and connected tools. That means security teams are no longer defending only code quality, but also model behaviour, data provenance, and the trust placed in every API the model can invoke. NIST’s Cybersecurity Framework 2.0 is useful here because it frames governance, identification, protection, detection, response, and recovery as operational functions rather than a one-time build activity.

NHIMG’s OWASP NHI Top 10 highlights a related reality: once an AI system has meaningful execution authority, the identity and access model becomes part of the attack surface, not just a support control. The same pattern shows up in the Ultimate Guide to NHIs, where weak lifecycle control and over-privilege repeatedly turn machine identities into high-impact breach paths. In practice, many security teams encounter AI risk only after a model has already exposed data, misused a tool, or chained a privilege into a wider incident, rather than through intentional testing.

How It Works in Practice

Security teams should treat AI models as dynamic decision systems, not static code artifacts. A traditional application can be hardened with secure defaults and fixed entitlements, but an AI model may change output depending on the prompt, retrieved content, conversation state, or tool availability. That is why current guidance suggests combining model safeguards with workload identity, runtime policy, and short-lived access rather than relying on perimeter controls alone.

In practice, this means the model should not hold standing access to sensitive systems. Instead, the model or agent should authenticate as a workload, receive just-in-time credentials for a specific task, and lose those credentials automatically when the task ends. This is where ephemeral secrets, OIDC-based workload identity, and standards such as SPIFFE become important. The identity primitive is not the prompt; it is the cryptographic proof of what the workload is allowed to do at that moment.
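The just-in-time pattern can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not a SPIFFE or OIDC implementation: the shared signing key (SIGNING_KEY) and the helpers issue_task_token and verify_task_token are hypothetical, and an HMAC stands in for the cryptographic proof an identity provider would normally supply. It shows a credential bound to one workload, one action, and a short expiry.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

# Hypothetical demo key. A real deployment would rely on a workload identity
# provider (for example SPIFFE or an OIDC issuer), not a shared secret.
SIGNING_KEY = b"demo-only-key"


def issue_task_token(workload_id: str, action: str, ttl_seconds: int,
                     now: Optional[float] = None) -> str:
    """Issue a signed token bound to one workload, one action, and a short lifetime."""
    issued_at = time.time() if now is None else now
    claims = {"sub": workload_id, "action": action, "exp": issued_at + ttl_seconds}
    payload = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    signature = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return f"{payload.decode()}.{signature}"


def verify_task_token(token: str, action: str, now: Optional[float] = None) -> bool:
    """Accept the token only if the signature, action, and expiry all check out."""
    current = time.time() if now is None else now
    try:
        payload, signature = token.rsplit(".", 1)
    except ValueError:
        return False
    expected = hmac.new(SIGNING_KEY, payload.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how much of the signature matched.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return False
    claims = json.loads(base64.urlsafe_b64decode(payload))
    # The token is useless for any other action and expires on its own.
    return claims["action"] == action and current < claims["exp"]
```

Because the expiry is part of the signed claims, the credential lapses without a separate revocation call. Explicit revocation on task completion would still need a server-side record, which this sketch omits.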

  • Use runtime authorisation for each request, not only pre-approved roles.
  • Issue short-lived tokens for narrowly scoped actions and revoke them on completion.
  • Log prompt inputs, retrieved documents, tool calls, and output paths for traceability.
  • Separate model access to data from direct write access to downstream systems.
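The controls in the list above can be combined in a small per-request check. This is a sketch only: ToolRequest, RuntimeAuthorizer, the policy table, and the agent and tool names are all hypothetical, and a production system would evaluate policy in a dedicated engine rather than an in-memory dictionary. It shows each call being authorised at request time, every decision being logged with a pointer back to the triggering prompt, and read access being kept separate from write access.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class ToolRequest:
    agent_id: str
    tool: str
    operation: str  # "read" or "write"
    prompt_id: str  # reference to the logged prompt that triggered the call


@dataclass
class RuntimeAuthorizer:
    # Hypothetical policy table: (agent_id, tool) -> operations allowed right now.
    policy: Dict[Tuple[str, str], Set[str]]
    audit_log: List[dict] = field(default_factory=list)

    def authorize(self, request: ToolRequest) -> bool:
        """Evaluate one request against current policy and record the decision."""
        allowed = self.policy.get((request.agent_id, request.tool), set())
        decision = request.operation in allowed
        # Denials are logged too, so a blocked call can still be traced later.
        self.audit_log.append({
            "agent": request.agent_id,
            "tool": request.tool,
            "operation": request.operation,
            "prompt_id": request.prompt_id,
            "allowed": decision,
        })
        return decision


# Read access to the CRM is granted; write access is deliberately absent.
authorizer = RuntimeAuthorizer(policy={("support-agent", "crm"): {"read"}})
read_ok = authorizer.authorize(ToolRequest("support-agent", "crm", "read", "prompt-001"))
write_ok = authorizer.authorize(ToolRequest("support-agent", "crm", "write", "prompt-002"))
```

Here the read is permitted and the write is refused, and both decisions land in audit_log with the prompt reference attached.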

NHIMG’s Top 10 NHI Issues is a useful reminder that credential rotation and monitoring matter because machine identities are often the easiest route to lateral movement once they are compromised. For implementation detail, SPIFFE provides a clear model for workload identity, while NIST SP 800-207 supports the Zero Trust principle of continuous verification. These controls tend to break down when a model is wired into legacy systems with shared service accounts and no task-level scoping, because the model can reuse broad privileges faster than teams can inspect each call.

Common Variations and Edge Cases

Tighter controls often increase orchestration overhead, so organisations have to balance speed against containment. That tradeoff is real in AI environments, especially when teams want low-latency responses, shared retrieval layers, or multi-agent workflows that coordinate across several tools.

Best practice is evolving for agentic systems, and there is no universal standard for this yet. Some environments can tolerate a human-in-the-loop approval step for high-risk actions, while others need fully automated policy evaluation because humans cannot intervene quickly enough. NIST AI RMF guidance is helpful here, but practitioners should also look to the Ultimate Guide to NHIs for the operational reality of over-privileged machine access and fragmented secrets control. The State of Secrets in AppSec is also relevant because AI systems often inherit secret sprawl, and sensitive patterns can be learned or reproduced if data boundaries are weak.
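The choice between human approval and automated evaluation can be expressed as a routing rule. The sketch below is illustrative only: the thresholds, the Decision enum, and route_action are hypothetical, and the decision to fail closed when no reviewer can respond in time is an assumption rather than an established standard.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    HUMAN_REVIEW = "human_review"


# Hypothetical values; real ones would come from an organisation's risk policy.
HIGH_RISK_THRESHOLD = 0.7
HUMAN_REVIEW_LATENCY_SECONDS = 300


def route_action(risk_score: float, deadline_seconds: float) -> Decision:
    """Pick a control path for an agent action based on risk and time available."""
    if risk_score < HIGH_RISK_THRESHOLD:
        return Decision.ALLOW
    if deadline_seconds >= HUMAN_REVIEW_LATENCY_SECONDS:
        return Decision.HUMAN_REVIEW
    # High risk and no time for a reviewer: fail closed rather than proceed unreviewed.
    return Decision.DENY
```

An environment that cannot tolerate blanket denial could replace the final branch with a stricter automated policy check; the point is that the fallback is decided in advance, not improvised at runtime.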

Edge cases usually appear in multi-tenant models, retrieval-augmented generation, or agent chains where one component can escalate the next. In those setups, static RBAC alone is too coarse, and coarse logging surfaces problems too late to contain them. Security teams should treat the environment as high-variance and assume the model will explore paths that were not explicitly scripted.
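One way to stop a component in an agent chain from escalating the next is to attenuate scopes at every hop. The following is a minimal sketch with hypothetical scope names and a hypothetical helper, delegate_scopes: each downstream agent receives only the intersection of what it asks for and what its caller already holds.

```python
from typing import FrozenSet


def delegate_scopes(parent_scopes: FrozenSet[str],
                    requested_scopes: FrozenSet[str]) -> FrozenSet[str]:
    """Grant a downstream agent only scopes the caller already holds."""
    return parent_scopes & requested_scopes


# Hypothetical three-step chain: orchestrator -> retriever -> summariser.
orchestrator = frozenset({"tickets:read", "tickets:write", "kb:read"})
retriever = delegate_scopes(orchestrator, frozenset({"kb:read", "billing:write"}))
summariser = delegate_scopes(retriever, frozenset({"kb:read", "tickets:write"}))
```

In this chain the retriever's request for billing:write is dropped because the orchestrator never held it, and the summariser cannot regain tickets:write because the retriever did not carry it forward. Privileges can only narrow along the chain.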

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and CSA MAESTRO address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework | Control / Reference | Relevance
OWASP Agentic AI Top 10 | A2 | Prompts, tools, and agent actions expand the attack surface beyond code defects.
CSA MAESTRO | GOV-02 | Agentic systems need governance for dynamic access and execution authority.
NIST AI RMF |  | AI RMF covers governance and monitoring for unpredictable model behaviour.

Inventory agent inputs, tool permissions, and output handling, then test for abuse paths at runtime.