Subscribe to the Non-Human & AI Identity Journal

Notifications
Clear all

AI agent safety testing: what it means for IAM and access control


(@nhi-mgmt-group)
Member Moderator
Joined: 1 year ago
Posts: 9016
Topic starter  

TL;DR: Haize Labs’ automated red-teaming platform targets prompt injection, goal misalignment, hallucination, and other behavioral failures in LLMs and AI agents, while also reporting a $100M post-money valuation and 38x faster attack generation with 4x less GPU memory use. The real lesson is that safety testing and enterprise authentication solve different problems: one checks behaviour, the other governs access.

NHIMG editorial — based on content published by WorkOS: Haize Labs: AI Safety Testing Haize Labs for AI Agent Security: Features, Pricing, and Alternatives

By the numbers:

Questions worth separating out

Q: How should security teams govern AI agents that can produce unsafe outputs after login?

A: Security teams should govern AI agents with two separate controls: identity access and behavioural assurance.

Q: Why do AI agents create risk even when identity controls are in place?

A: AI agents create risk because identity controls only validate the requester, not the runtime behaviour of the model or agent.

Q: What do security teams get wrong about AI safety testing?

A: The common mistake is treating AI safety testing as if it were just another security scan.

Practitioner guidance

  • Separate access approval from behaviour approval Require distinct sign-off for enterprise identity controls and for AI safety validation.
  • Add multi-turn red-teaming to release gates Test chained prompts, prompt injection, and goal drift before any model or agent reaches production.
  • Tie monitoring to known failure modes Instrument production AI systems so alerts map back to the exact behaviours uncovered in pre-production testing, such as leakage, hallucination, or unsafe action selection.

What's in the full article

WorkOS' full article covers the operational detail this post intentionally leaves for the source:

  • Comparative feature-by-feature breakdown of Haize Labs products such as Cascade, ACG, Verdict, Monitor, and Robustify.
  • Implementation and pricing detail for enterprise buyers evaluating AI safety testing alongside authentication infrastructure.
  • Vendor positioning that distinguishes behavioural safety validation from enterprise identity and access management in more depth.
  • Examples of where WorkOS fits for SSO, SCIM, and admin workflows in production AI applications.

👉 Read WorkOS' analysis of Haize Labs AI safety testing for production AI →

AI agent safety testing: what it means for IAM and access control?

Explore further

View Full Forum →  |  NHI Foundation Course →



   
Quote
(@mr-nhi)
Member Moderator
Joined: 2 months ago
Posts: 8472
 

Behavioural safety testing is becoming a parallel control to IAM, not a replacement for it. The article makes clear that the vendor sits in the validation layer, not the identity layer. That matters because enterprises are now building AI systems where the access decision may be correct while the runtime behaviour is still unsafe. Practitioners should stop treating access governance and behavioural assurance as the same control family.

A few things that frame the scale:

  • When AWS credentials are exposed publicly, attackers attempt access within an average of 17 minutes and as quickly as 9 minutes in some cases, according to LLMjacking: How Attackers Hijack AI Using Compromised NHIs.
  • DeepSeek accidentally embedded over 11,000 secrets in its training data and left a database exposed online, revealing more than one million sensitive records including chat histories, backend credentials, and API keys.

A question worth separating out:

Q: What is the difference between enterprise authentication and AI safety validation?

A: Enterprise authentication proves identity and controls entry. AI safety validation proves the model or agent behaves acceptably once entry has already been granted. Authentication supports trust at the boundary, while safety validation supports trust in the runtime behaviour inside that boundary. Mature programmes need both, not one as a substitute for the other.

👉 Read our full editorial: AI agent safety testing exposes the limits of enterprise auth



   
ReplyQuote
Share: