Notifications

Clear all

AI security rule-based guardrails: what practitioners need to know

Last Post

RSS

NHI Mgmt Group

(@nhi-mgmt-group)

Member Moderator

Joined: 1 year ago

Posts: 12387

Topic starter 05/07/2026 6:56 pm

TL;DR: AI security teams are repeating the pre-AlexNet mistake by relying on static filters, manual test cases, and pattern matching against systems that evolve faster than handcrafted rules, according to Lakera. The real risk is that yesterday’s control model cannot generalise to dynamic AI behaviour, so governance must move from exception-chasing to adaptive assurance.

NHIMG editorial — based on content published by Lakera: What the AI Past Teaches Us About the Future of AI Security

Questions worth separating out

Q: How should security teams defend AI systems that change behaviour quickly?

A: Security teams should combine policy with adaptive detection, continuous testing, and behavioural monitoring.

Q: Why do static guardrails fail in AI security?

A: Static guardrails fail because they are built to recognise known patterns in a system that attackers can vary endlessly.

Q: What do organisations get wrong about AI security controls?

A: Organisations often assume that more rules automatically mean more security.

Practitioner guidance

Replace static prompt rules with behavioural detection Measure whether your AI security controls can detect rephrased abuse, multi-step prompt variation, and context shifting instead of only matching known bad strings.
Review where AI systems can affect tools and data Map every model or agent path that can trigger tool use, retrieve sensitive data, or influence downstream actions, then classify those paths as governance boundaries.
Build feedback loops into security validation Use continuous red-team findings, false-positive review, and attack variation testing to keep security controls aligned with a changing AI attack surface.

What's in the full article

Lakera's full article covers the broader AI security argument and source commentary this post intentionally leaves at the strategic level:

The article's detailed analogy between rule-based computer vision and modern AI security
Mateo Rojas-Carulla's full argument on why static guardrails create a whack-a-mole cycle
The closing commentary on why AI security needs to become AI-native rather than rule-first

👉 Read Lakera's analysis of why AI security needs AI-native defences →

AI security rule-based guardrails: what practitioners need to know?

Explore further

View Full Forum → | NHI Foundation Course →

Quote

Topic Tags

Mr NHI

(@mr-nhi)

Member Moderator

Joined: 3 months ago

Posts: 11961

05/07/2026 7:19 pm

Static AI security rules are a temporary control, not a durable governance model. The article is right to frame prompt filters and manual test cases as useful but incomplete. They can absorb known patterns, but they do not hold up when inputs are reworded, recombined, or chained across sessions. The field should treat this as a boundary problem: controls that depend on memorising attacks will always lag attackers who can vary the expression of the same behaviour. Practitioners should assume that rule coverage will decay faster than teams can refresh it.

A few things that frame the scale:

85% of organisations lack full visibility into third-party vendors connected via OAuth apps, according to The State of Non-Human Identity Security.
Only 1.5 out of 10 organisations are highly confident in their ability to secure NHIs, compared to nearly 1 in 4 for securing human identities, according to The State of Non-Human Identity Security.

A question worth separating out:

Q: How should identity teams think about AI systems that can take actions?

A: Identity teams should treat action-capable AI as part of the governance boundary, not just as a content generator. Once a model can influence tools, retrieve data, or trigger workflows, access control, monitoring, and review become shared concerns across IAM, PAM, and NHI programmes. That requires a single operating model for runtime behaviour.

👉 Read our full editorial: AI security is repeating the old rule-based failure pattern

ReplyQuote

Forum Statistics

11 Forums

13.6 K Topics

26.1 K Posts

34 Online

135 Members

Latest Post: LLM security and AI-driven crime: what security teams must change Our newest member: Alex Recent Posts Unread Posts Tags

Forum Icons: Forum contains no unread posts Forum contains unread posts

Topic Icons: Not Replied Replied Active Hot Sticky Unapproved Solved Private Closed

#1 Authority in NHI Education, Research and Advisory, empowering organizations to tackle the critical risks posed by Non-Human Identities (NHIs), including AI Agents.

Get in Touch

Quick Links

FAQ

NHI 101 Articles

Legal & Policies