Subscribe to the Non-Human & AI Identity Journal

Notifications
Clear all

AI red teaming and GenAI guardrails: what IAM teams should watch


(@nhi-mgmt-group)
Member Moderator
Joined: 1 year ago
Posts: 9059
Topic starter  

TL;DR: AI red teaming shifts security testing from static infrastructure to the model, prompt, plugin, and agent layers, uncovering prompt injection, data leakage, and broken access control risks in real-world GenAI use, according to Lasso Security. The underlying issue is that traditional security assumes stable system behaviour, while AI behaviour changes with context, inputs, and chained tool calls.

NHIMG editorial — based on content published by Lasso Security: What is Red Teaming in AI? Types, Components & Best Practices

Questions worth separating out

Q: What breaks when AI red teaming is not part of GenAI governance?

A: Without AI red teaming, organisations usually discover failures only after a model has already exposed data, bypassed a guardrail, or triggered an unsafe API call.

Q: Why do GenAI systems complicate identity and access control?

A: GenAI systems complicate identity and access control because they can turn a single user request into a chain of delegated actions across plugins, APIs, and service accounts.

Q: How do security teams know if AI red teaming is working?

A: AI red teaming is working when testing finds real prompt injection paths, over-scoped integrations, and policy gaps before attackers do, and when fixes are re-tested successfully after model or workflow changes.

Practitioner guidance

  • Map every AI workflow to its delegated authority chain. Document which prompts, plugins, APIs, service accounts, and downstream systems can be reached from each GenAI application, then identify where authority expands beyond the original use case.
  • Test for prompt injection across all untrusted inputs. Run adversarial tests against uploaded files, retrieved documents, chat inputs, and API responses to verify whether hidden instructions can override system behaviour.
  • Review plugin and API permissions as identity controls. Treat AI-connected integrations as privileged access paths and inspect token reuse, scope breadth, and call chaining.

What's in the full article

Lasso Security's full blog post covers the operational detail this post intentionally leaves for the source:

  • Hands-on examples of adversarial prompt patterns and jailbreak techniques used in red-team exercises
  • Tool-specific testing approaches across Microsoft PyRIT, Meta Purple Llama, and other evaluation frameworks
  • Step-by-step guidance on tracking findings, validating remediation, and repeating tests after changes
  • Practical examples of how autonomous workflows, copilots, and integrations behave under attack

👉 Read Lasso Security's guide to AI red teaming types, components, and best practices →

AI red teaming and GenAI guardrails: what IAM teams should watch?

Explore further

View Full Forum →  |  NHI Foundation Course →



   
Quote
(@mr-nhi)
Member Moderator
Joined: 2 months ago
Posts: 8437
 

AI red teaming is really identity testing for delegated machine behaviour. The article describes a world where models, copilots, plugins, and agent workflows can trigger actions beyond the intent of the original prompt. That is not just application security, because the useful unit of analysis is the delegated authority chain behind the model. Practitioners should read this as a governance signal that AI systems are now exercising identity-like behaviour in production.

A few things that frame the scale:

  • 85% of organisations lack full visibility into third-party vendors connected via OAuth apps, according to The State of Non-Human Identity Security.
  • Lack of credential rotation is cited as the top cause of NHI-related attacks by 45% of organisations, followed by inadequate monitoring and logging at 37% and over-privileged accounts at 37%.

A question worth separating out:

Q: Should organisations treat AI plugins like privileged access?

A: Yes. AI plugins and connected APIs can act on behalf of the model, so they should be governed as privileged access paths with narrow scopes, explicit approval boundaries, and continuous review. If a plugin can reach customer records or production systems, it belongs in the same governance conversation as PAM and NHI controls.

👉 Read our full editorial: AI red teaming exposes where GenAI guardrails fail in practice



   
ReplyQuote
(@mr-nhi)
Member Moderator
Joined: 2 months ago
Posts: 8437
 

AI red teaming is really identity testing for delegated machine behaviour. The article describes a world where models, copilots, plugins, and agent workflows can trigger actions beyond the intent of the original prompt. That is not just application security, because the useful unit of analysis is the delegated authority chain behind the model. Practitioners should read this as a governance signal that AI systems are now exercising identity-like behaviour in production.

A few things that frame the scale:

  • 85% of organisations lack full visibility into third-party vendors connected via OAuth apps, according to The State of Non-Human Identity Security.
  • Lack of credential rotation is cited as the top cause of NHI-related attacks by 45% of organisations, followed by inadequate monitoring and logging at 37% and over-privileged accounts at 37%.

A question worth separating out:

Q: Should organisations treat AI plugins like privileged access?

A: Yes. AI plugins and connected APIs can act on behalf of the model, so they should be governed as privileged access paths with narrow scopes, explicit approval boundaries, and continuous review. If a plugin can reach customer records or production systems, it belongs in the same governance conversation as PAM and NHI controls.

👉 Read our full editorial: AI red teaming exposes where GenAI guardrails fail in practice



   
ReplyQuote
Share: