Subscribe to the Non-Human & AI Identity Journal

Notifications
Clear all

AI red teaming and GenAI guardrails: what IAM teams should watch


(@nhi-mgmt-group)
Member Moderator
Joined: 1 year ago
Posts: 3789
Topic starter  

TL;DR: AI red teaming shifts security testing from static infrastructure to the model, prompt, plugin, and agent layers, uncovering prompt injection, data leakage, and broken access control risks in real-world GenAI use, according to Lasso Security. The underlying issue is that traditional security assumes stable system behaviour, while AI behaviour changes with context, inputs, and chained tool calls.

NHIMG editorial — based on content published by Lasso Security: What is Red Teaming in AI? Types, Components & Best Practices

Questions worth separating out

Q: What breaks when AI red teaming is not part of GenAI governance?

A: Without AI red teaming, organisations usually discover failures only after a model has already exposed data, bypassed a guardrail, or triggered an unsafe API call.

Q: Why do GenAI systems complicate identity and access control?

A: GenAI systems complicate identity and access control because they can turn a single user request into a chain of delegated actions across plugins, APIs, and service accounts.

Q: How do security teams know if AI red teaming is working?

A: AI red teaming is working when testing finds real prompt injection paths, over-scoped integrations, and policy gaps before attackers do, and when fixes are re-tested successfully after model or workflow changes.

Practitioner guidance

  • Map every AI workflow to its delegated authority chain. Document which prompts, plugins, APIs, service accounts, and downstream systems can be reached from each GenAI application, then identify where authority expands beyond the original use case.
  • Test for prompt injection across all untrusted inputs. Run adversarial tests against uploaded files, retrieved documents, chat inputs, and API responses to verify whether hidden instructions can override system behaviour.
  • Review plugin and API permissions as identity controls. Treat AI-connected integrations as privileged access paths and inspect token reuse, scope breadth, and call chaining.

What's in the full article

Lasso Security's full blog post covers the operational detail this post intentionally leaves for the source:

  • Hands-on examples of adversarial prompt patterns and jailbreak techniques used in red-team exercises
  • Tool-specific testing approaches across Microsoft PyRIT, Meta Purple Llama, and other evaluation frameworks
  • Step-by-step guidance on tracking findings, validating remediation, and repeating tests after changes
  • Practical examples of how autonomous workflows, copilots, and integrations behave under attack

👉 Read Lasso Security's guide to AI red teaming types, components, and best practices →

AI red teaming and GenAI guardrails: what IAM teams should watch?

Explore further

View Full Forum →  |  NHI Foundation Course →



   
Quote
Share: