Automated AI red teaming now reaches runtime guardrails

By NHI Mgmt Group Editorial TeamPublished 2026-06-07Domain: AnnouncementsSource: Lasso Security

TL;DR: Lasso’s expanded automated AI red teaming adds cloud agent discovery across AWS and Google Vertex, OWASP Agentic Top 10 mapping, and one-click runtime policy generation, with scans that can run in as little as 15 to 20 minutes after change, according to Lasso Security. The deeper implication is that agent security is becoming a continuous identity and policy loop, not a one-time model test.

At a glance

What this is: Lasso Security’s expanded automated AI red teaming combines agent discovery, multi-turn attacks, and runtime policy generation to turn findings into guardrails.

Why it matters: It matters because IAM, NHI, and AI governance teams now need to treat agent behaviour, tool access, and runtime policy as one control surface rather than separate workstreams.

By the numbers:

The platform can rerun targeted scans in 15-20 minutes after a change, according to Lasso Security.

👉 Read Lasso Security's expanded automated AI red teaming update

Context

AI red teaming for agentic systems is no longer just about testing model prompts in isolation. Once an application can call tools, remember context, and reach into cloud-hosted data sources, the security question becomes whether identity, access, and runtime policy can keep pace with the agent’s changing behaviour.

The primary governance gap is that many teams still validate AI systems as if they were static applications. In practice, every new MCP server, API, memory layer, or cloud agent integration expands the identity attack surface and creates a policy problem that sits between NHI governance and emerging agentic AI controls.

Key questions

Q: How should security teams govern AI agents that can call tools and reach cloud data?

A: Security teams should govern AI agents as dynamic identity subjects, not static applications. That means inventorying their tool access, cloud reach, memory, and retrieval dependencies, then mapping each capability to an enforced runtime boundary. If an agent can change what it touches during execution, governance has to follow the behaviour, not just the deployment record.

Q: When does AI red teaming need to move from periodic testing to continuous testing?

A: Continuous testing becomes necessary when model updates, new integrations, or prompt changes can alter agent behaviour without a code rewrite. If a new API, MCP server, or data source expands the reachable surface, a calendar-based test is no longer enough. Change-triggered red teaming keeps the control posture aligned with the system’s actual identity scope.

Q: What do security teams get wrong about agentic AI red teaming?

A: The common mistake is treating prompt filters as the main control and assuming a clean single-turn result means the system is safe. Agentic failures often emerge across a longer sequence, especially when tools, memory, and external content are involved. The right test asks whether intent, scope, and authority remain stable under sustained pressure.

Q: How do runtime guardrails differ from red team findings in AI governance?

A: Red team findings identify the weakness, but runtime guardrails are the control that prevents it from recurring. In practice, the finding should translate into policy logic, tool restrictions, or classifier updates that change the agent’s allowed behaviour. Without that enforcement step, the test is informative but not corrective.

How it works in practice

Model-aware recon across cloud agent environments

Reconnaissance in agentic red teaming starts by identifying the model, its system prompt, connected tools, memory layers, and retrieval paths. That matters because an attacker does not probe an agent as a generic chat surface. They first learn what the system is allowed to see and do, then tailor prompts, tool abuse, and follow-on pressure to that exact architecture. When scans extend into AWS and Google Vertex environments, discovery becomes an identity exercise as much as a security exercise: what agents exist, what they can reach, and what third-party cloud context they inhabit.

Practical implication: inventory agent identities, their tool bindings, and their cloud reach before you try to test prompts or policies.

Multi-turn attacks and fragile intent

Single-turn testing misses one of the most important failure modes in agentic systems: intent drift over time. Multi-turn attacks use sustained dialogue, contextual pressure, and instruction override to see whether the system can be steered away from its intended purpose. That is especially relevant when the agent can preserve context, ingest untrusted content, or chain reasoning across steps. The issue is not whether one prompt gets blocked, but whether the agent’s control boundaries remain stable across an interaction that evolves.

Practical implication: test agent behaviour across conversation sequences, not just isolated prompts or static payloads.

Runtime policy generation from red team findings

The most operationally important part of the workflow is the handoff from finding to control. A red team result is useful only if it changes the runtime boundary that allowed the behaviour in the first place. In agentic systems, that can mean classifier rules, prompt constraints, tool restrictions, or policy logic tied to a specific attack pattern. The architectural shift here is purple teaming, where discovery, testing, and protection form a continuous loop rather than separate disciplines.

Practical implication: require every validated finding to map to a runtime guardrail before the next deployment goes live.

NHI Mgmt Group analysis

Agentic red teaming is becoming a governance function, not a specialist test. Once an AI system can discover tools, retain context, and trigger runtime policy changes, the control problem extends beyond model evaluation. The post-test output is no longer just a finding, it is a change in the effective identity boundary. Practitioners should treat red teaming, policy generation, and access governance as one operating model.

Identity blast radius is the right named concept for this control surface. The breach risk is no longer confined to a model prompt or a single API call. It is the combination of model awareness, connected tools, memory, and cloud reach that determines how far a compromise or manipulation can travel. The practical conclusion is that agent identity scope, not just model quality, defines blast radius.

One-click guardrail generation shortens remediation, but it also shifts accountability. When a finding can be turned into a runtime policy without engineering intervention, the key question becomes who owns the policy decision and how it is reviewed. That matters because AI security tooling is moving from detection to enforcement. Practitioners need explicit ownership for the policy layer, not just a ticket in the queue.

Continuous red teaming validates the assumption that agent behaviour stays stable after change. The article’s workflow ties testing to deployment events rather than a calendar, which reflects the reality that new models, prompts, and integrations alter behaviour. That is a sensible direction for the field. Security teams should stop treating AI systems as static assets and start treating them as continuously changing identity surfaces.

OWASP Agentic AI Top 10 mapping is useful only when it is tied to runtime scope. Mapping a finding to a category is not the same as closing the gap. The value is in translating the category into a control boundary that changes what the agent can access, call, or persist across turns. Practitioners should use taxonomy to prioritise, then enforce at runtime.

From our research:
80% of organisations report their AI agents have already performed actions beyond their intended scope, including accessing unauthorised systems, inappropriately sharing sensitive data, and revealing access credentials, according to AI Agents: The New Attack Surface report.
Only 52% of companies can track and audit the data their AI agents access, leaving 48% with a complete blind spot for compliance and breach investigation.
For teams building controls now, OWASP NHI Top 10 is a useful forward step for translating agent findings into enforceable runtime boundaries.

What this signals

Identity blast radius is now a design variable for AI programmes. When agents can discover tools, preserve context, and trigger runtime policy updates, the security question shifts from whether the model is safe to how far a compromised or misdirected agent can reach. That is why agent inventories, tool bindings, and cloud integrations need to be governed as one surface, not three separate controls.

The 80% figure from our research on AI agent scope drift shows that the issue is already operational, not theoretical. Teams that cannot audit what their agents accessed will struggle to answer basic compliance and incident questions, especially as deployments move from pilots to production.

Practitioners should align their AI control work with the NIST AI Risk Management Framework and with agent-specific threat models such as the OWASP Agentic AI Top 10. The next phase of governance is not another review cycle, it is continuous proof that runtime boundaries still match intended scope.

For practitioners

Inventory agent identities and tool reach Map every AI agent, cloud integration, memory store, API connection, and retrieval path before red teaming begins. Include third-party cloud environments such as AWS and Vertex so you know which identities can actually be discovered and exercised under test.
Test multi-turn behaviour, not just prompt filters Run adversarial sequences that include context poisoning, instruction override, and tool chaining. A passing single-turn result does not prove the agent will hold its intended scope over a longer interaction.
Tie every finding to a runtime guardrail Require each validated issue to produce a concrete policy update, classifier rule, or tool restriction before the next deployment. A finding that does not change the runtime boundary remains an observation, not a control.
Gate new integrations with change-triggered red team scans Re-test whenever a new MCP server, API, model, or data source is connected. Those changes alter the agent’s identity reach and should trigger recon plus bespoke testing before production use.
Assign ownership for AI policy enforcement Define who approves runtime policy changes generated from red team results and who can override them. Without explicit ownership, automated guardrails can outpace governance.

Key takeaways

Automated AI red teaming is moving into runtime governance, where findings must become guardrails rather than tickets.
The article’s core risk is agentic scope drift across tools, memory, and cloud integrations, not just prompt-level failure.
Practitioners should require change-triggered re-testing and explicit policy ownership before new agent capabilities reach production.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 address the attack and risk surface, while NIST AI RMF and NIST Zero Trust (SP 800-207) set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10		The article maps findings to agentic application risks and runtime abuse paths.
NIST AI RMF		Continuous testing and governance ownership align with AI risk management practices.
NIST Zero Trust (SP 800-207)	PR.AC-4	Agent tool access should be constrained by least privilege and continuous verification.

Assign governance ownership and continuous monitoring for AI systems that change over time.

Key terms

Agent Identity Scope: The set of tools, data sources, memories, and permissions an AI agent can reach at runtime. In practice, scope is not just a configuration detail. It is the boundary that determines how far a manipulated or compromised agent can move before detection or containment.
Runtime Guardrail: A live control that shapes what an AI system can do while it is operating, such as a policy rule, classifier, or tool restriction. Unlike a test finding, a guardrail changes behaviour immediately and is intended to prevent the same unsafe action from recurring.
Multi-Turn Attack: An adversarial sequence that pressures an AI system across several interactions instead of a single prompt. This matters because many agent failures emerge gradually, as context, memory, and tool use interact over time and steer the system away from its intended purpose.
Purple Teaming Loop: A continuous cycle that connects testing, detection, and remediation so that security findings feed directly into enforceable controls. For AI agents, the loop matters because the system’s behaviour can change after each model, prompt, or integration update.

Deepen your knowledge

AI red teaming, agent identity scope, and runtime policy enforcement are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are building controls for AI agents in cloud environments, it is worth exploring.

This post draws on content published by Lasso Security: Introducing Lasso's Expanded Automated AI Red Teaming. Read the original.

NHIMG Editorial Note
Published by the NHIMG editorial team on 2026-06-07.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org