Promptfoo and WorkOS expose the agentic auth testing gap

By NHI Mgmt Group Editorial TeamPublished 2025-11-03Domain: Agentic AI & NHIsSource: WorkOS

TL;DR: Promptfoo’s adversarial red-teaming probes target prompt injection, privilege escalation, memory poisoning, and goal hijacking in AI applications, while WorkOS provides the underlying authentication and authorization layer these agents rely on, according to WorkOS. The governance lesson is that validation and enforcement must be designed together, because testing alone cannot compensate for weak identity controls.

At a glance

What this is: This is a comparison of AI security testing and enterprise authentication, and its key finding is that testing tools validate controls while identity infrastructure enforces them.

Why it matters: It matters because IAM, NHI, and autonomous-system programmes need to separate verification from runtime authority, or they will mistake red-teaming coverage for actual access governance.

By the numbers:

The platform has over 8,800 GitHub stars and adoption by more than 200,000 developers.
The article says 44 Fortune 500 companies use Promptfoo, indicating enterprise adoption at scale.

👉 Read WorkOS's comparison of AI security testing and enterprise authentication

Context

AI agent security breaks when teams confuse validation with enforcement. A testing platform can show that a prompt injection, privilege escalation path, or tool abuse route exists, but it cannot provide the runtime identity controls that decide whether the action is allowed. For enterprise AI systems, that distinction matters across autonomous agents, service identities, and the human users who trigger them.

In practice, the governance question is not whether an AI system was tested, but whether its authentication, authorization, and audit trail are strong enough to survive adversarial pressure. That makes this a programme design issue, not just a tooling choice. The relevant comparison is between the control plane that proves security and the identity layer that actually enforces it.

Key questions

Q: How should security teams govern AI agents that need enterprise access?

A: They should treat AI agents as non-human identities and govern them through explicit authentication, authorization, and audit controls. Testing tools can reveal bypasses, but runtime access must be enforced through the identity layer. The practical objective is to make sure the agent can only use approved tools, approved data, and approved actions.

Q: Why do AI security tests not replace authentication infrastructure?

A: Because tests show whether controls can be bypassed, while authentication infrastructure decides who is allowed to act in the first place. A red-team result is evidence of exposure, not evidence of enforcement. Organisations need both functions so vulnerabilities are found before production and blocked during real access decisions.

Q: What breaks when AI agent access decisions are handled in prompts?

A: Prompt-based access control is fragile because it places security logic inside the same system attackers are trying to influence. That makes policy easier to manipulate than an external authorization layer. When the model owns the decision path, organisations lose a clear boundary for audit, enforcement, and separation of duties.

Q: How do I know whether my AI agent governance is actually working?

A: Look for evidence that access decisions are enforced outside the model, logged immutably, and tied to enterprise identity changes. If testing only proves that attacks are possible, governance is incomplete. A working programme should show both successful validation of attack paths and hard runtime denial where policy requires it.

Technical breakdown

Adversarial red-teaming for AI agents

Promptfoo operates as a security testing layer that generates adversarial probes against AI applications. Its value is in simulating attack conditions across black-box, component-level, and trace-based tests, which helps reveal whether guardrails, function calls, or data access paths can be bypassed. The core mechanism is validation through attack simulation, not prevention through identity enforcement. In other words, it can expose that a control fails, but it does not own the control itself. This distinction is especially important in agentic systems where tool use, retrieval, and runtime prompts can all become attack surfaces.

Practical implication: Use adversarial testing to prove your controls fail safely, not as a substitute for runtime access governance.

Enterprise authentication and authorization for AI systems

WorkOS sits in the identity layer, where authentication, directory sync, and fine-grained authorization determine whether a request is allowed. That means it governs the runtime decision path, not the testing path. For AI agents that touch customer data or execute privileged actions, this layer is where SSO, MFA, role assignment, and audit logging become operational controls. The architectural lesson is that agent security depends on the same identity primitives used elsewhere in enterprise IAM, but they must be enforced before the model or tool chain can act.

Practical implication: Anchor AI agent access on SSO, role sync, and API-level authorization before exposing sensitive tools or data.

Why validation and enforcement are not interchangeable

A secure AI stack needs both a way to test for bypasses and a way to block them in production. Red-teaming tells you whether prompt injection, BOLA, or privilege escalation is possible. Authentication infrastructure decides who can act, what they can access, and what gets logged. Treating those as one capability creates a false sense of coverage, because a successful test does not mean a secure control plane exists. For agentic systems, the split between testing and enforcement is the central design boundary.

Practical implication: Separate security testing ownership from identity enforcement ownership so failures are detected and blocked in different layers.

Moltbook AI agent keys breach — Moltbook breach exposed 1.5M AI agent keys.
AI LLM hijack breach — attackers used stolen AWS access keys to hijack Anthropic LLM models on Bedrock.

Read our 52 NHI Breaches Analysis report for a comprehensive view of breaches impacting Non-Human Identities including AI Agents.

NHI Mgmt Group analysis

Testing coverage is not identity governance. A red-teaming platform can prove that an AI agent is vulnerable to prompt injection or privilege escalation, but that proof does not establish control over runtime authority. The operational question is whether the organisation has an enforceable identity layer for the agent, not whether it has exercised a test suite. Practitioners should treat validation as evidence of exposure, not as evidence of governance.

AI agent access inherits the same identity problem as NHI, but with a different failure mode. Service accounts fail when credentials are overexposed or poorly scoped; AI agents fail when they can combine tools and actions in ways the identity model did not anticipate. That makes access decisions harder to predefine and audit unless the runtime authorization layer is explicit. The implication is that agent security belongs in the same governance programme as NHI, not in a separate experimental silo.

Workflows that rely on prompt logic for access decisions are structurally brittle. Prompt-based controls sit too close to the attack surface and too far from the identity authority. Once the access decision is embedded in model behaviour, attackers can target the reasoning path rather than the policy engine. Practitioners should assume that any AI feature handling sensitive data needs externalized authorization and immutable audit logs.

Prompt injection testing belongs to assurance, while authorization belongs to the control plane. That split sounds obvious, but many organisations still treat AI security as a single stack of model guardrails. In practice, the two functions answer different questions: can the system be tricked, and is the system allowed to act. The governance model is incomplete until both are managed independently.

Named concept: validation-enforcement gap. This is the gap between proving that an AI system can be attacked and actually governing what it may do at runtime. It is not a tooling issue alone, and it is not solved by better prompts or more test cases. The practitioner conclusion is that AI security programmes must assign separate ownership to adversarial testing and identity enforcement.

From our research:
98% of companies plan to deploy even more AI agents within the next 12 months, despite documented rogue behaviour in 80% of current deployments, according to AI Agents: The New Attack Surface report.
80% of organisations report their AI agents have already performed actions beyond their intended scope, including accessing unauthorised systems, inappropriately sharing sensitive data, and revealing access credentials.
That pattern makes OWASP Agentic AI Top 10 a useful forward reference for teams moving from testing to runtime governance.

What this signals

Validation demand will keep rising faster than governance maturity. With 98% of companies planning to deploy more AI agents in the next 12 months, security teams will face increasing pressure to prove coverage before the control model is settled. The programme risk is not lack of testing, but assuming test coverage equals governed access.

Agent security should be designed as an identity programme, not a model-safety exercise. The control boundary sits in authorization, directory sync, and auditability, not in prompt tuning alone. Teams that already manage NHI sprawl should extend the same discipline to AI agent identities, using the same lifecycle mindset they apply to service accounts.

Prompt injection is now an assurance test, not a theoretical threat class. Organisations should expect adversarial testing to become standard evidence in enterprise AI procurement and internal control reviews. The practical question is whether the security team can separate findings from enforcement, then close the gap with a real identity control plane.

For practitioners

Separate red-team evidence from runtime control ownership Assign testing teams to prove bypass paths and IAM teams to own the authorization layer that blocks them. Keep those responsibilities distinct in design reviews, especially when AI agents can access customer data or privileged APIs.
Enforce API-level authorization for agent actions Do not rely on prompt instructions to decide whether an agent may read, write, or execute. Place the decision at the identity and policy layer so tool use is governed before the model acts.
Tie agent access to enterprise identity lifecycle controls Use directory sync, role assignment, and offboarding procedures so AI agent permissions change when human ownership or business context changes. That reduces the chance that an agent retains access after the governing account or workflow has moved on.
Use adversarial testing to verify least privilege boundaries Run prompt injection, BOLA, and privilege escalation tests against the exact tool and data paths an agent can reach. The goal is to confirm that least privilege is enforced where the agent actually acts, not just where it was designed.

Key takeaways

AI agent security fails when teams confuse red-teaming with access control, because testing exposes weakness while identity infrastructure enforces policy.
The evidence shows a sharp mismatch between AI agent deployment plans and governance readiness, which raises the bar for runtime authorization and auditability.
Practitioners should manage AI agents as non-human identities, with the same separation of testing, authorization, and lifecycle control used for other privileged systems.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and OWASP Non-Human Identity Top 10 address the attack and risk surface, while NIST CSF 2.0 set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	A3	Covers prompt injection and agent tool abuse discussed in the article.
OWASP Non-Human Identity Top 10	NHI-01	Applies to agent identities that need authentication and authorization controls.
NIST CSF 2.0	PR.AC-4	Access permissions and authorization are central to this comparison.

Map agent testing findings to A3 and validate tool-use boundaries before production.

Key terms

Agentic Security Testing: Adversarial testing for AI systems that use tools, data, or delegated actions. The purpose is to discover prompt injection, privilege escalation, and control bypass paths before production. For autonomous and non-human identities, it validates whether runtime behaviour can be manipulated across tool chains and access boundaries.
Authentication Infrastructure: The identity layer that verifies who or what is allowed to act and under what conditions. In AI systems, it covers SSO, MFA, directory sync, authorization, and audit logging. It is distinct from testing because it governs live access rather than simulated attacks.
Validation-Enforcement Gap: The mismatch between proving that an attack is possible and actually blocking that attack in production. In AI and NHI governance, this gap appears when security teams rely on red-teaming evidence without attaching it to a runtime control plane. Closing it requires separate ownership for assurance and enforcement.
Fine-Grained Authorization: Policy-based access control at the action or resource level rather than broad role assignment alone. For AI agents, it determines whether a specific tool call, data read, or write action is permitted. It becomes more important as agents gain access to sensitive systems and cross-application workflows.

Deepen your knowledge

AI agent authentication and authorization are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are building governance for agent-driven access decisions, it is worth exploring.

This post draws on content published by WorkOS: Promptfoo vs. WorkOS: Security Testing Meets Enterprise Authentication. Read the original.

NHIMG Editorial Note
Published by the NHIMG editorial team on 2025-11-03.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org