TL;DR: California’s new AI laws take effect on January 1, 2026 and require companion and healthcare-focused systems to prevent self-harm content, avoid misleading medical authority claims, and intervene in live conversations, according to Lakera. The shift is from policy intent to runtime control, where governance must hold up under user interaction, not just documentation.
At a glance
What this is: California is moving AI governance from policy statements to runtime enforcement for user-facing systems that can influence vulnerable users or imply medical authority.
Why it matters: IAM, NHI, and autonomous AI programme owners need to treat live conversational behaviour as a governed identity surface, not just an application feature, because enforcement now targets what the system does in-session.
Context
California’s new AI laws focus on a simple governance problem: what happens when a deployed AI system is already interacting with a person and the conversation changes direction. The core issue is runtime control, not model training or architectural choice.
For IAM and security teams, that creates a familiar identity lesson. When a system speaks with users in real time, it behaves like an operating actor with permissions, boundaries, and escalation paths that must be enforced at the moment of use. Static policy documents are no longer enough if they cannot shape live behaviour.
Lakera’s article reflects the current regulatory turn: lawmakers are not banning AI assistants, but forcing teams to prove that guardrails work under pressure. That makes production governance, exception handling, and auditability the practical centre of gravity for both NHI and autonomous AI programmes.
Key questions
Q: How should security teams govern user-facing AI that can change tone in live conversations?
A: They should treat the conversation itself as a governed control surface. That means defining runtime policies for disclosure, risky content, and escalation, then enforcing those rules before the output reaches the user. If the system can alter tone, guidance, or authority based on session context, the governance model must be able to intercept behaviour in session, not only approve it in advance.
Q: Why do companion chatbots create compliance risk even when they do not claim to be human?
A: Because users respond to tone, persistence, and conversational memory, not just explicit identity claims. A chatbot can still create dependence or perceived trust if it stays present, remembers context, and speaks with emotional continuity. Compliance risk rises when the system fails to interrupt harmful conversations or keep disclosure visible throughout the interaction.
Q: What do security teams get wrong about AI systems that sound like clinicians?
A: They focus on whether the system explicitly says it is a doctor, but that is only part of the problem. Users can still infer medical authority from phrasing, confidence, and design cues. The right control question is whether the system can be mistaken for licensed expertise in practice, and whether outputs are blocked before that happens.
Q: Who is accountable when an AI guardrail fails in production?
A: Accountability sits with the operator that deployed the system and the team responsible for its live control design. If the guardrail did not trigger, the issue is not just model behaviour, but governance failure. California’s approach makes that distinction sharper by focusing on observed system behaviour rather than written intent.
Technical breakdown
Runtime guardrails for conversational AI
California’s approach treats a chatbot as a live decision surface, not a static content generator. The law is concerned with what the system says when a user is in front of it, which means the control point is the response path, not the model file. Runtime guardrails are policies that inspect prompts, outputs, and conversation state before a reply is delivered. In practice, this shifts governance from design-time review to session-time enforcement, where unsafe, misleading, or emotionally risky outputs can be blocked or redirected.
Practical implication: teams need response interception and policy enforcement at inference time, not just pre-launch review.
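The interception pattern described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a reference implementation: the rule names, the keyword markers, the fallback message, and the `enforce` function are all hypothetical, and a production system would likely replace the keyword checks with a trained classifier.

```python
import re
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List, Optional


class Action(Enum):
    ALLOW = "allow"
    BLOCK = "block"
    REWRITE = "rewrite"


@dataclass
class Verdict:
    action: Action
    text: str             # the text that will actually be delivered
    rule: Optional[str]   # name of the rule that fired, if any


# A rule inspects (user_prompt, draft_reply) and returns a Verdict or None.
Rule = Callable[[str, str], Optional[Verdict]]

# Hypothetical fallback wording, for illustration only.
SAFE_FALLBACK = "I can't help with that here. If you are in danger, please contact local emergency services."


def self_harm_rule(prompt: str, draft: str) -> Optional[Verdict]:
    # Assumption: simple keyword markers stand in for a real risk classifier.
    markers = ("hurt myself", "end my life")
    if any(m in prompt.lower() for m in markers):
        return Verdict(Action.BLOCK, SAFE_FALLBACK, "self_harm")
    return None


def medical_authority_rule(prompt: str, draft: str) -> Optional[Verdict]:
    # Assumption: one hard-coded phrase stands in for a broader authority check.
    pattern = re.compile(r"\bas your doctor,?\s*", re.IGNORECASE)
    if pattern.search(draft):
        rewritten = pattern.sub("", draft) + " (I am an AI, not a licensed clinician.)"
        return Verdict(Action.REWRITE, rewritten, "implied_medical_authority")
    return None


RULES: List[Rule] = [self_harm_rule, medical_authority_rule]


def enforce(prompt: str, draft: str, rules: List[Rule]) -> Verdict:
    """Run every rule on the draft reply before it is delivered; first match wins."""
    for rule in rules:
        verdict = rule(prompt, draft)
        if verdict is not None:
            return verdict
    return Verdict(Action.ALLOW, draft, None)
```

The design point is that `enforce` sits on the response path: the model's draft never reaches the user unless it comes back as the `text` of a verdict.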
Companion AI disclosure and intervention logic
The companion chatbot rules are aimed at systems that can create perceived relationship, dependence, or trust over time. Disclosure is not a one-time banner in this model. It has to persist through the interaction, and the system must change behaviour when the conversation becomes dangerous, including self-harm scenarios. That requires state-aware conversation logic, documented escalation pathways, and observable triggers that can be audited after the fact. The mechanism is less about classification and more about enforcing behavioural boundaries in-session.
Practical implication: define escalation triggers, published response protocols, and audit events for high-risk conversational branches.
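A rough sketch of state-aware disclosure and escalation logic follows. Everything here is an assumption made for illustration: the disclosure cadence of every third turn, the marker phrases, the message wording, and the `Session` and `respond` names are hypothetical and are not taken from the statute or from Lakera's analysis.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical wording and cadence, chosen only to make the sketch concrete.
DISCLOSURE = "Reminder: you are talking to an AI, not a person."
CRISIS_REPLY = "It sounds like you are going through a lot. Please reach out to a crisis line or local emergency services."
DISCLOSURE_EVERY_N_TURNS = 3
RISK_MARKERS = ("hurt myself", "end my life")  # placeholder for a real classifier


@dataclass
class Session:
    turns: int = 0
    escalated: bool = False
    audit: List[str] = field(default_factory=list)  # observable triggers for later review


def respond(session: Session, user_msg: str, draft_reply: str) -> str:
    """Apply session-level rules to a draft reply before it is delivered."""
    session.turns += 1

    # Escalation trigger: a risky message moves the session into crisis mode.
    if any(m in user_msg.lower() for m in RISK_MARKERS):
        session.escalated = True
        session.audit.append(f"turn={session.turns} trigger=self_harm action=crisis_reply")

    # Assumption: once escalated, the session stays in crisis mode until reset elsewhere.
    if session.escalated:
        return CRISIS_REPLY

    # Persistent disclosure: shown on the first turn and then on a fixed cadence.
    if session.turns == 1 or session.turns % DISCLOSURE_EVERY_N_TURNS == 0:
        session.audit.append(f"turn={session.turns} trigger=cadence action=disclosure")
        return f"{DISCLOSURE}\n{draft_reply}"

    return draft_reply
```

Keeping the escalation flag on the session, rather than evaluating each message in isolation, is what lets the system hold its changed behaviour after the conversation turns dangerous.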
Why medical-sounding AI creates governance risk
AB 489 addresses the gap between factual output and implied authority. A system can avoid claiming it is a doctor and still sound clinical enough to mislead users into treating it as one. That makes language, presentation, and context part of the control surface. For governance teams, the technical challenge is to identify when phrasing, tone, or UI cues imply licensed expertise, then suppress or rewrite those outputs before they reach the user. This is a behavioural control problem, not a model accuracy problem.
Practical implication: review prompts, templates, and UI language for implied expertise, not only explicit medical claims.
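One way to start that review is a simple scan of prompt templates and interface copy for authority cues. The sketch below assumes a hypothetical pattern list and hypothetical template names; real coverage would need a much broader vocabulary and human review, since tone and design cues cannot be caught by pattern matching alone.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical patterns for illustration; not an exhaustive or authoritative list.
AUTHORITY_PATTERNS = {
    "explicit_title": re.compile(r"\b(?:dr|doctor|physician|clinician|nurse)\b", re.IGNORECASE),
    "clinical_voice": re.compile(r"\b(?:i diagnose|my diagnosis|i prescribe)\b", re.IGNORECASE),
}


def audit_templates(templates: Dict[str, str]) -> List[Tuple[str, str, str]]:
    """Return (template_name, pattern_label, matched_text) for every authority cue found."""
    findings = []
    for name, text in templates.items():
        for label, pattern in AUTHORITY_PATTERNS.items():
            for match in pattern.finditer(text):
                findings.append((name, label, match.group(0)))
    return findings


# Example templates, invented for this sketch.
EXAMPLE_TEMPLATES = {
    "greeting": "Hi, I'm Dr. Ada, your care assistant.",
    "advice": "My diagnosis is mild dehydration.",
    "neutral": "I am an AI assistant and cannot give medical advice.",
}

for finding in audit_templates(EXAMPLE_TEMPLATES):
    print(finding)
```

A scan like this only flags candidates. Whether a flagged phrase actually implies licensed expertise in context is still a judgement for the governance team.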
Breaches seen in the wild
- ASP.NET machine keys RCE attack — 3,000+ exposed ASP.NET machine keys enabled remote code execution.
- Codefinger AWS S3 ransomware attack — Codefinger used compromised AWS credentials to encrypt S3 buckets via SSE-C.
Read our 52 NHI Breaches Analysis report for a comprehensive view of breaches impacting Non-Human Identities including AI Agents.
NHI Mgmt Group analysis
Runtime policy enforcement is becoming the real control plane for user-facing AI. California’s laws make it clear that governance is no longer judged by what teams wrote down before deployment. It is judged by whether the system changes behaviour when a user conversation crosses a safety or authority boundary. For IAM and NHI programmes, that is the same structural problem as any identity control that must act at runtime rather than at approval time.
Conversation-state awareness is now a governance requirement, not a product feature. The law’s expectations around repeated disclosure, self-harm intervention, and misleading medical language all depend on the system remembering what kind of interaction is under way. That pushes teams toward controls that understand session context, trigger conditions, and escalation states. Practitioners should treat that as an identity and access boundary issue, because the system is deciding what it may express in the moment.
Guardrail failure now creates regulatory exposure in the same way access failure creates security exposure. If a deployed system can still behave like a human companion or a clinician without constraint, then the governance model has failed at the point of use. That failure mode is especially relevant to autonomous AI programmes, where live behaviour matters more than declared intent. Teams should rethink whether their current controls can actually constrain production-time decisions.
Behavioural compliance is the new audit surface for AI systems that talk to users. California is effectively requiring evidence that controls work in live interactions, not just in policy reviews. That shifts the centre of gravity toward logging, exception handling, and post-event traceability. Practitioners should expect the most defensible programmes to be the ones that can prove intervention happened when the conversation changed shape.
Named concept: runtime guardrail accountability. This is the point at which governance moves from approving what an AI system is allowed to be to proving what it is allowed to do in-session. The implication for practitioners is that auditability, enforcement, and escalation design become inseparable from the identity surface itself.
From our research:
- The average estimated time to remediate a leaked secret is 27 days, despite 75% of organisations expressing strong confidence in their secrets management capabilities, according to The State of Secrets in AppSec.
- Only 44% of developers are reported to follow security best practices for secrets management, exposing a significant developer behaviour gap.
- For deeper lifecycle control, read the NHI Lifecycle Management Guide and connect runtime policy with offboarding, rotation, and access review discipline.
What this signals
Runtime guardrail accountability: the industry is moving toward evidence that production controls actually shape AI behaviour, not just that policy exists on paper. For teams building user-facing AI, that means logging intervention events, retaining conversation context, and proving that escalation paths work when the interaction turns risky. The governance bar is now closer to access enforcement than documentation review.
California’s model also reinforces a broader programme lesson: once an AI system can speak continuously with users, it starts to resemble a governed identity with behavioural constraints. That matters to NHI and autonomous AI teams because the control surface is the live interaction, not the model artefact. The strongest programmes will align runtime policy with frameworks like NIST Cybersecurity Framework 2.0 and back it with session-level traceability.
For practitioners
- Define runtime response controls: Map every user-facing AI flow to a response policy that can block, rewrite, or route outputs before they are delivered. Focus on self-harm, health guidance, and any interaction that could create false authority.
- Log intervention events: Record when a guardrail fires, what condition triggered it, and what response the system took. Those event logs are the evidence trail for compliance reviews and incident investigations.
- Review implied-authority language: Audit prompts, templates, and user interface copy for phrases, titles, or visual cues that could make AI outputs feel clinician-guided or human-authored. Remove subtle authority signals, not just explicit claims.
Key takeaways
- California’s AI laws move user-facing AI governance into the production layer, where live behaviour must be constrained in real time.
- The main risk is not model capability alone, but the failure to control disclosure, implied authority, and harmful interaction paths during a live session.
- Teams that cannot prove runtime intervention, escalation, and auditability will struggle to defend their AI governance posture under the new rules.
Standards & Framework Alignment
This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.
NIST CSF 2.0, NIST AI RMF, and NIST SP 800-63 provide the governance and control reference points practitioners can map these requirements to.
| Framework | Control / Reference | Relevance |
|---|---|---|
| NIST CSF 2.0 | PR.DS-5 | Runtime guardrails support data and output protection in live AI interactions. |
| NIST AI RMF | | California’s rules focus on governed AI behaviour in production. |
| NIST SP 800-63 | | Identity assurance matters when AI impersonates or implies human expertise. Align user-facing AI disclosure and trust cues with digital identity assurance principles. |
Key terms
- Runtime guardrail: A runtime guardrail is a control that evaluates AI output during live execution and can block, rewrite, or route the response before the user sees it. It shifts governance from design-time approval to session-time enforcement, which is essential when behaviour changes based on user context.
- Implied authority: Implied authority is the sense that an AI system knows or is licensed to act as an expert, even when it has not said so explicitly. It can come from tone, wording, interface design, or persistence, and it creates governance risk because users may trust the output more than they should.
- Conversation-state awareness: Conversation-state awareness is the ability to recognise what kind of interaction is happening and apply different rules based on that state. In governed AI, it is what lets a system detect escalation, preserve context, and switch responses when a conversation becomes high risk.
- Behavioural compliance: Behavioural compliance is proof that a deployed system actually behaves according to policy when users interact with it. It matters because written rules alone do not demonstrate that interventions, disclosures, and safety responses occur at the right moment in production.
Deepen your knowledge
California’s AI laws and runtime guardrails are covered in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are building production controls for user-facing AI, this is directly relevant to your governance model.
This post draws on content published by Lakera: California’s AI Laws Are About to Meet Reality. Read the original.
Published by the NHIMG editorial team on 2026-01-01.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org