TL;DR: GenAI policy management remains hard because teams must set risk thresholds for novel behaviours across dynamic inputs and outputs, according to Lakera’s product update. Opinionated starting policies reduce blank-slate effort, but they also make policy design a governance decision, not just a tuning exercise.
At a glance
What this is: This product update argues that GenAI security should start from pre-built policies, with a single sensitivity setting and gradual tuning as teams mature.
Why it matters: IAM, NHI, and AI governance teams now need defensible defaults for autonomous or semi-autonomous application behaviour, not just ad hoc guardrails.
By the numbers:
- Only 44% of developers are reported to follow security best practices for secrets management, exposing a significant developer behaviour gap.
- The average estimated time to remediate a leaked secret is 27 days, despite 75% of organisations expressing strong confidence in their secrets management capabilities.
- When AWS credentials are exposed publicly, attackers attempt access within an average of 17 minutes and as quickly as 9 minutes in some cases.
👉 Read Lakera's update on opinionated GenAI policy management and default controls
Context
GenAI policy management is the control layer that decides what an application can do, what it should block, and how much uncertainty the team is willing to tolerate. The governance gap is that many teams are trying to secure behaviour that is still evolving, while the policy surface itself is too broad to configure from scratch every time.
For IAM and security leaders, the question is no longer whether to add guardrails, but how to set a usable default that matches the use case and risk appetite. That is especially relevant where GenAI systems sit between human users, secrets, and downstream tools, because poor policy design can create either false confidence or unnecessary friction.
Key questions
Q: How should security teams roll out GenAI policy controls without blocking too much?
A: Start with a default policy in logging mode, then move to limited enforcement only after you have observed the application’s normal behaviour. Apply use-case-specific baselines, clear exception criteria, and change control for any threshold adjustment. That keeps the rollout practical while preserving auditability and avoiding blanket controls that are hard to operate.
Q: When does a single sensitivity setting become too simplistic for GenAI governance?
A: A single setting becomes too simplistic when the same threshold is being used across materially different applications, such as customer-facing chatbots and internal assistants. At that point, the setting hides risk differences instead of managing them. Teams should only keep one threshold when the use case, data sensitivity, and tolerance for false positives are genuinely aligned.
Q: What do teams get wrong about pre-built GenAI policies?
A: They often assume a pre-built policy is the end state rather than a starting point. In reality, a vetted policy should reduce deployment friction while still being reviewed, tuned, and documented against business risk. If teams do not assign ownership for exceptions and escalation, the policy becomes a static default rather than a governed control.
Q: Who should own policy decisions for GenAI applications?
A: Ownership should sit with the team responsible for the risk appetite, enforcement criteria, and exception handling, usually across security, platform, and application leadership. GenAI policy is not just a technical setting. It is a governance decision that affects how fast teams can deploy, how much they can trust the system, and how they respond when behaviour changes.
Technical breakdown
Why GenAI policy defaults matter more than bespoke tuning
GenAI applications are difficult to secure because their behaviour depends on prompts, context, downstream tools, and model output, all of which can vary at runtime. A default policy is not just a convenience feature. It is an opinionated control baseline that tells the system what to watch, what to flag, and where to enforce. In practice, the value is in reducing the gap between deployment and meaningful protection. The risk is that a weak default can become the de facto standard across teams, even when the use case demands stronger controls.
Practical implication: standardise policy baselines by use case so teams do not invent controls project by project.
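As a rough illustration of what a use-case baseline could look like in code, the sketch below keeps one shared record per use case and refuses to guess for unmapped ones. Everything in it is an assumption made for illustration: the `PolicyBaseline` fields, the guardrail names, the 1 to 4 sensitivity scale, and the use-case keys are hypothetical and do not describe Lakera's product or API.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    LOG = "log"      # observe and record only
    BLOCK = "block"  # enforce by refusing the request


@dataclass(frozen=True)
class PolicyBaseline:
    name: str
    guardrails: tuple[str, ...]  # hypothetical guardrail identifiers
    sensitivity: int             # hypothetical scale: 1 (lenient) to 4 (strict)
    mode: Mode


# Hypothetical baselines keyed by use case; the values are placeholders,
# not recommendations.
BASELINES = {
    "public_chatbot": PolicyBaseline(
        "public_chatbot",
        ("prompt_defense", "content_safety", "data_leakage"),
        3,
        Mode.LOG,
    ),
    "internal_assistant": PolicyBaseline(
        "internal_assistant",
        ("prompt_defense", "data_leakage"),
        2,
        Mode.LOG,
    ),
}


def baseline_for(use_case: str) -> PolicyBaseline:
    """Return the shared baseline, failing loudly for unmapped use cases."""
    try:
        return BASELINES[use_case]
    except KeyError:
        raise ValueError(f"no approved baseline for use case {use_case!r}") from None
```

Raising an error for an unknown use case, rather than falling back to a global default, is a deliberate choice in this sketch: it forces a new application to be mapped to a reviewed baseline before it ships.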
Sensitivity thresholds and the false-positive trade-off
A single sensitivity setting is a governance shortcut, but it still represents a real control decision. Lower sensitivity reduces friction but lets more risky behaviour through, while higher sensitivity increases coverage at the cost of more blocking or alert noise. The technical issue is not whether a setting exists, but whether teams understand what behavioural patterns it is meant to catch and what they are willing to miss. Without that clarity, sensitivity tuning becomes reactive. Mature programmes treat threshold selection as part of policy design, not as a post-deployment cleanup exercise.
Practical implication: define threshold ownership and review criteria before enabling enforcement in production.
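The trade-off can be made concrete with a small sketch. It assumes a detector that returns a risk score between 0 and 1 and a hypothetical mapping from four sensitivity levels to score cut-offs; the cut-off values and the labelled sample are invented for illustration and are not taken from Lakera's update.

```python
# Hypothetical mapping: higher sensitivity means a lower score cut-off,
# so more interactions are flagged.
SCORE_CUTOFF = {1: 0.9, 2: 0.7, 3: 0.5, 4: 0.3}


def trade_off(labelled, sensitivity):
    """Count false positives and missed detections at one sensitivity level.

    labelled is a list of (risk_score, is_actually_harmful) pairs.
    """
    cutoff = SCORE_CUTOFF[sensitivity]
    false_positives = sum(
        1 for score, harmful in labelled if score >= cutoff and not harmful
    )
    missed = sum(
        1 for score, harmful in labelled if score < cutoff and harmful
    )
    return false_positives, missed


# Invented review sample: (detector score, reviewer verdict).
SAMPLE = [
    (0.95, True),
    (0.75, False),
    (0.60, True),
    (0.40, False),
    (0.35, True),
    (0.10, False),
]

for level in sorted(SCORE_CUTOFF):
    print(level, trade_off(SAMPLE, level))
# prints:
# 1 (0, 2)
# 2 (1, 2)
# 3 (1, 1)
# 4 (2, 0)
```

On this sample, moving from level 1 to level 4 removes every missed detection but doubles the false positives compared with level 2, which is the kind of evidence a threshold owner would want before approving a change.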
From logging mode to enforcement: how policy maturity usually evolves
Most security programmes do not move straight to blocking. They start with observation, then introduce limited enforcement, and only later refine templates, exceptions, and custom guardrails. That progression is sensible because GenAI use cases are still being learned in production. The architectural point is that the policy engine must support both visibility and control without requiring a full redesign each time the use case changes. If the policy model is rigid, teams either over-block or silently accept risk. Flexibility matters, but only if it preserves auditability.
Practical implication: build a staged rollout path from logging to enforcement and keep audit trails intact at each step.
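A staged rollout with an intact audit trail might be sketched as follows. The stage names, the `PolicyRollout` class, and the rule that limited enforcement blocks only high-severity violations are assumptions chosen for illustration, not a description of any vendor's policy engine.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical stage names, ordered from observation to full enforcement.
STAGES = ("log", "limited_block", "block")


@dataclass
class PolicyRollout:
    use_case: str
    stage: str = "log"
    audit_log: list = field(default_factory=list)

    def _record(self, event: str, **details) -> None:
        """Append a timestamped entry so every decision and change is traceable."""
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "use_case": self.use_case,
            "stage": self.stage,
            "event": event,
            **details,
        })

    def decide(self, request_id: str, violation: bool,
               high_severity: bool = False) -> str:
        """Return 'allow' or 'block' for one request and log the outcome."""
        if not violation or self.stage == "log":
            action = "allow"   # logging mode never blocks
        elif self.stage == "limited_block":
            action = "block" if high_severity else "allow"
        else:
            action = "block"
        self._record("decision", request_id=request_id,
                     violation=violation, action=action)
        return action

    def promote(self, approved_by: str) -> None:
        """Move to the next stage; the approver is recorded in the trail."""
        index = STAGES.index(self.stage)
        if index == len(STAGES) - 1:
            raise ValueError("already at full enforcement")
        previous = self.stage
        self.stage = STAGES[index + 1]
        self._record("promotion", previous=previous, approved_by=approved_by)
```

Because violations are logged even when they are allowed, the observation stage produces the evidence needed to justify each later promotion.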
NHI Mgmt Group analysis
Opinionated policy baselines are becoming the real control plane for GenAI. Blank-slate policy design does not scale when applications are being deployed faster than teams can manually assess every prompt path and output pattern. The shift is from configuration as a one-off exercise to governance as a reusable starting point. Practitioners should treat policy baselines as an operating model decision, not a UI convenience.
A single sensitivity setting is useful only if the organisation has already defined acceptable risk by use case. A single global threshold can simplify operations, but it also concentrates judgement into one setting. That makes the policy decision more visible, not less. The practical conclusion is that security teams need use-case-specific tolerance levels before they can claim the control is meaningful.
Control maturity gap: teams often mistake deployment speed for governance maturity when they adopt pre-built policies without defining escalation criteria. The article’s central message is that faster rollout is valuable only when the team can explain what triggers a tighter policy, a custom guardrail, or a shift from logging to blocking. Practitioners should ensure the policy lifecycle is documented as carefully as the model lifecycle.
GenAI policy management now sits at the intersection of IAM, application security, and operational risk. That means security teams can no longer leave it to a single product owner or a single engineering team. The right question is who owns the risk appetite, who tunes enforcement, and who can override the default when the business use case changes. Practitioners should assign that accountability explicitly.
From our research:
- The average estimated time to remediate a leaked secret is 27 days, despite 75% of organisations expressing strong confidence in their secrets management capabilities, according to The State of Secrets in AppSec.
- Only 44% of developers are reported to follow security best practices for secrets management, exposing a significant developer behaviour gap, according to the same research.
- The broader lifecycle lesson is already captured in the NHI Lifecycle Management Guide, which helps teams move from ad hoc handling to governed rotation and offboarding.
What this signals
Control maturity will matter more than feature count. As GenAI policy management becomes more opinionated, teams will be judged on whether they can explain their defaults, not just whether they can switch features on. The governance pattern is converging with NIST Cybersecurity Framework 2.0 thinking: define, protect, detect, and iterate with evidence.
Policy defaults will start to function like identity controls for AI-facing systems. That means security teams should expect the same questions they already face in IAM and NHI programmes: who owns the policy, who can change it, and what records prove the control was actually applied. With 27 days still the average estimated time to remediate a leaked secret, slow governance cycles are no longer just inefficient; they prolong exposure.
The practical signal for practitioners is clear. If GenAI policy changes cannot be tracked, reviewed, and tied to a business use case, then the programme is still operating like a prototype rather than a control system.
For practitioners
- Define policy baselines by use case: Map public-facing, internal-facing, prompt-defense-only, and content-safety scenarios to distinct enforcement defaults before rollout. Avoid using a single template across all GenAI applications, especially where data sensitivity and user trust differ.
- Set threshold ownership before enforcement: Document who approves sensitivity changes, who reviews false positives, and what evidence is required before a policy moves from logging to blocking. That prevents threshold drift from becoming an untracked operational decision.
- Stage rollout from observation to control: Start in logging mode, move to limited blocking, then expand only after you have observed repeatable behaviour and stable exceptions. Keep audit trails and change records for every policy adjustment.
- Separate guardrail tuning from model deployment: Treat policy updates as a governed change stream rather than an informal prompt-tweaking task. Use change control for templates, sensitivity levels, and exceptions so security and product teams can review the same evidence.
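One way to make threshold ownership and change control checkable is to validate every sensitivity change before it is applied. The sketch below is a minimal example under stated assumptions: the `ThresholdChange` record, its field names, and the rule that a requester cannot approve their own change are hypothetical choices, not requirements drawn from the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThresholdChange:
    use_case: str
    old_sensitivity: int
    new_sensitivity: int
    requested_by: str
    approved_by: str
    evidence: str  # for example, a reference to a false-positive review


def validate_change(change: ThresholdChange) -> list[str]:
    """Return the reasons a change is not ready; an empty list means it can proceed."""
    problems = []
    if change.new_sensitivity == change.old_sensitivity:
        problems.append("no change in sensitivity")
    if not change.approved_by:
        problems.append("missing approver")
    elif change.approved_by == change.requested_by:
        # Assumed separation-of-duties rule for this sketch.
        problems.append("requester cannot self-approve")
    if not change.evidence.strip():
        problems.append("missing evidence")
    return problems
```

For example, a change from level 2 to level 3 requested and approved by the same person, with no evidence attached, would return two problems and stay unapplied until both are resolved.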
Key takeaways
- GenAI policy management is shifting toward reusable control baselines because blank-slate configuration does not scale across dynamic applications.
- A single sensitivity setting can simplify operations, but it only works when the organisation has already defined use-case-specific risk tolerance.
- Security teams should treat policy rollout as a governed lifecycle, with ownership, exception handling, and audit trails built in from the start.
Standards & Framework Alignment
This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.
The OWASP Agentic AI Top 10 addresses the attack and risk surface, while NIST AI RMF and NIST CSF 2.0 set the governance and control requirements practitioners need to meet.
| Framework | Control / Reference | Relevance |
|---|---|---|
| OWASP Agentic AI Top 10 | AG-03 | Policy defaults and guardrails map to controlling agent behaviour at runtime. |
| NIST AI RMF | GOVERN 1 | The article is about risk appetite and governance for AI system behaviour. |
| NIST CSF 2.0 | PR.AA-05 | Policy enforcement and access control logic are part of runtime protection for GenAI apps. |
Define default guardrails for GenAI systems and review them before enabling broader tool access.
Key terms
- GenAI policy management: GenAI policy management is the process of defining what a generative AI application should allow, flag, or block based on risk and use case. It combines behavioural controls, thresholds, and exception handling so teams can govern outputs and tool use without manually tuning every interaction.
- Sensitivity setting: A sensitivity setting is a control parameter that changes how aggressively a policy flags or blocks behaviour. In GenAI environments, it is effectively a risk-tolerance dial, balancing user friction against the chance of missing harmful prompts, outputs, or manipulative interactions.
- Policy baseline: A policy baseline is the default control set an organisation applies before customising for a specific deployment. It gives teams a repeatable starting point for enforcement, logging, and review, which is especially useful when many GenAI use cases share similar risk patterns.
What's in the full article
Lakera's full product update covers the operational detail this post intentionally leaves for the source:
- The exact five one-click policy templates and the use cases each one is intended to support.
- The sensitivity guidance for moving from L1 to L4 and the rationale behind each threshold.
- The progression from logging mode to blocked enforcement and how Lakera describes the rollout path.
- The advanced settings workflow for templated guardrails, custom behaviour, and self-hosted deployment.
Deepen your knowledge
NHI governance, agentic AI identity, and machine identity lifecycle are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are responsible for identity security strategy or NHI governance in your organisation, it is worth exploring.
Published by the NHIMG editorial team on 2025-08-27.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org