By NHI Mgmt Group Editorial Team | Published 2026-06-12 | Domain: Agentic AI & NHIs | Source: Zenity

TL;DR: Claude agents can move from prompt to tool use, code change, and business action fast enough that logs alone miss setup-layer risk, runtime abuse, and downstream impact, according to Zenity. Identity and access programmes now need lifecycle visibility, posture checks, and inline enforcement for agent behaviour, not just after-the-fact review.


At a glance

What this is: Zenity’s analysis of why Claude agent security needs full-lifecycle visibility, posture management, and runtime enforcement across code, chat, and enterprise workflows.

Why it matters: Practitioners governing NHIs, autonomous systems, and human access all need controls that follow identity behaviour from configuration through execution, not just through logging.



Context

Claude agents are not passive applications. They can read context, call tools, modify code, and trigger downstream business actions, which means the security boundary is no longer the login event but the full agent lifecycle.

That changes identity governance for non-human identities and agentic AI alike. If an organisation can only see what the agent returned, it cannot reliably judge what the agent was allowed to do, what extensions shaped its behaviour, or whether the execution path stayed within policy.


Key questions

Q: How should security teams govern Claude agents that can change code and data?

A: They should treat Claude as an identity-bearing execution surface, not just an application. Governance needs to cover configuration, connected extensions, prompts, tool use, and downstream outputs such as commits or business actions. If teams only review logs after execution, they will miss setup-layer compromise and behaviour drift that occurred before the alert ever fired.

Q: Why do agent controls need to start before the first prompt?

A: Because hostile behaviour can be introduced in the setup layer through MCP servers, plugins, skills, hooks, or misconfiguration. By the time a session starts, the agent may already be working from a compromised trust base. Pre-session assessment is therefore essential for preventing unsafe actions rather than merely documenting them.

Q: What do security teams get wrong about agent session logs?

A: They often assume logs are enough to explain risk. In practice, logs show the event sequence but not always the extension posture, hidden instructions, or cumulative impact that shaped the session. The useful control is reconstructed context that connects prompts, tools, commands, and downstream artefacts into one evidence chain.

Q: How do organisations decide whether an AI agent action was acceptable?

A: They should evaluate the full session against the stated task, the approved tool set, and the downstream effect. If the agent’s cumulative actions exceed the intended work or reach sensitive systems without clear policy basis, the action was not acceptable even if individual steps looked routine in isolation.
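A minimal sketch of that evaluation, assuming a session has already been reduced to a list of (tool, path) pairs. The tool names, the path-prefix notion of task scope, and the function name evaluate_session are all hypothetical illustrations, not part of Zenity's product or any Claude API.

```python
import posixpath


def evaluate_session(approved_tools, allowed_paths, actions):
    """Return a list of policy violations for one session.

    approved_tools: set of tool names approved for the task (hypothetical names)
    allowed_paths:  directory prefixes the stated task is expected to touch
    actions:        list of (tool, path) pairs observed in the session
    """
    violations = []
    for tool, path in actions:
        if tool not in approved_tools:
            violations.append(f"unapproved tool: {tool}")
        # Normalise first so "src/../.env" cannot pass as being inside "src".
        normalised = posixpath.normpath(path)
        in_scope = any(
            normalised == prefix or normalised.startswith(prefix.rstrip("/") + "/")
            for prefix in allowed_paths
        )
        if not in_scope:
            violations.append(f"out-of-scope path: {normalised}")
    return violations


# Hypothetical session: the stated task only covers files under src/.
session_actions = [
    ("edit_file", "src/app.py"),
    ("run_command", "src/../.env"),
]
violations = evaluate_session({"edit_file", "read_file"}, ["src"], session_actions)
```

An empty list means the session stayed inside the approved tool set and task scope. A real evaluation would also need to weigh downstream effect, which this sketch does not attempt.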


Technical breakdown

Why logs are not enough for Claude agent governance

Traditional logging tells you what happened after the fact, but agent systems can be influenced before a session begins and can act across multiple tools during execution. For Claude Code, Cowork, and Chat, the meaningful control surface includes configuration, connected extensions, prompts, tool calls, and downstream artefacts such as code commits or business actions. That is why visibility has to be reconstructed as a chain, not treated as isolated events. Without that chain, security teams see evidence without context and cannot separate expected behaviour from manipulated execution.

Practical implication: correlate sessions, tools, and downstream outputs before relying on logs as evidence.
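One way to picture that correlation is a small sketch that groups raw events by session, orders them by time, and flags downstream artefacts with no preceding tool call or command in the same session. The Event shape, the kind labels, and the commit identifiers below are assumptions made for illustration; real telemetry will differ.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    session_id: str
    timestamp: float  # seconds since epoch
    kind: str         # "prompt", "tool_call", "command" or "artefact" (hypothetical labels)
    detail: str


def build_evidence_chains(events):
    """Group events by session and sort each group chronologically."""
    chains = defaultdict(list)
    for event in events:
        chains[event.session_id].append(event)
    return {sid: sorted(evs, key=lambda e: e.timestamp) for sid, evs in chains.items()}


def unexplained_artefacts(chain):
    """Return artefacts that appear before any tool call or command in the chain."""
    seen_action = False
    orphans = []
    for event in chain:
        if event.kind in ("tool_call", "command"):
            seen_action = True
        elif event.kind == "artefact" and not seen_action:
            orphans.append(event)
    return orphans


# Hypothetical events, deliberately out of order to mimic log ingestion.
events = [
    Event("s1", 3.0, "artefact", "commit abc123"),
    Event("s1", 1.0, "prompt", "fix failing test"),
    Event("s1", 2.0, "tool_call", "edit_file tests/test_api.py"),
    Event("s2", 1.0, "artefact", "commit def456"),
]
chains = build_evidence_chains(events)
```

Here session s1 yields a complete prompt, tool call, artefact chain, while the commit in s2 has no recorded action behind it and would warrant investigation.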

How setup-layer risk changes MCP server and plugin governance

The setup layer is where many agent risks are introduced. A malicious or overly permissive MCP server, plugin, skill, or hook can shape what the agent sees and what actions it is capable of taking before the first prompt is processed. That makes pre-session assessment essential, because the dangerous state may already exist when runtime monitoring starts. Governance has to cover source trust, permission scope, and hidden behaviour in connected components, not just the agent application itself.

Practical implication: inventory and review MCP servers, plugins, and skills before enabling them for enterprise use.
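As an illustration only, an inventory record and a pre-session review check might look like the sketch below. The trusted source name, the high-risk scope labels, and the example extensions are hypothetical; an organisation would substitute its own registry and permission vocabulary.

```python
from dataclasses import dataclass

# Hypothetical policy values; replace with the organisation's own.
TRUSTED_SOURCES = {"internal-registry"}
HIGH_RISK_SCOPES = {"shell_exec", "secrets_read", "file_write"}


@dataclass(frozen=True)
class Extension:
    name: str
    kind: str          # e.g. "mcp_server", "plugin", "skill", "hook"
    source: str
    owner: str
    scopes: frozenset  # permission labels granted to the component


def review_extension(ext):
    """Return human-readable findings; an empty list means no issue was found."""
    findings = []
    if ext.source not in TRUSTED_SOURCES:
        findings.append(f"untrusted source: {ext.source}")
    if not ext.owner:
        findings.append("no recorded owner")
    risky = sorted(ext.scopes & HIGH_RISK_SCOPES)
    if risky:
        findings.append("high-risk scopes: " + ", ".join(risky))
    return findings


# Hypothetical inventory entries.
ext_ok = Extension("docs-search", "mcp_server", "internal-registry",
                   "platform-team", frozenset({"file_read"}))
ext_bad = Extension("quick-deploy", "plugin", "public-marketplace",
                    "", frozenset({"shell_exec", "file_write"}))
```

A check like this cannot detect hidden behaviour inside a component; it only makes source, ownership, and permission scope reviewable before the extension is enabled.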

Why runtime enforcement must understand intent and cumulative impact

Agent actions are not risky only when a single event looks malicious. A safe-seeming sequence can become unsafe when the cumulative impact exceeds the task the user intended. Runtime controls therefore need to evaluate prompt content, tool use, command sequences, and behaviour drift together. That is especially important when agents can reach source code, secrets stores, or production systems through ordinary workstreams. If the control layer cannot interpret the session as a sequence, it will miss abuse that only becomes visible across the whole chain.

Practical implication: enforce inline blocking when cumulative session impact exceeds the stated task.
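The idea of cumulative impact can be sketched as a per-session budget: each action carries a weight, and an action is refused once the running total would exceed what the task justifies. The weights, the budget, and the action names are invented for illustration; the source does not describe how Zenity scores impact.

```python
class SessionGuard:
    """Tracks cumulative impact for one session and blocks once a budget is exceeded."""

    def __init__(self, budget, weights):
        self.budget = budget
        self.weights = weights
        # Unknown actions are charged at the highest known weight (conservative default).
        self.default_cost = max(weights.values())
        self.spent = 0

    def authorise(self, action):
        """Return True and charge the budget, or return False without charging."""
        cost = self.weights.get(action, self.default_cost)
        if self.spent + cost > self.budget:
            return False
        self.spent += cost
        return True


# Hypothetical weights: no single action below exceeds the budget on its own.
WEIGHTS = {"read_file": 1, "edit_file": 3, "run_command": 5, "read_secret": 10}
guard = SessionGuard(budget=10, weights=WEIGHTS)
decisions = [guard.authorise(a)
             for a in ["read_file", "edit_file", "run_command", "edit_file"]]
```

The fourth action is refused even though an edit looks routine in isolation, because the session has already consumed most of its budget. A production control would need far richer signals than a static weight table, including prompt content and behaviour drift.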


Threat narrative

Attacker objective: Use trusted agent activity to turn normal work into code tampering, credential exposure, or unsafe downstream changes without immediate detection.

  1. Entry occurs when a developer or user enables a seemingly legitimate MCP server, plugin, or hook that can influence Claude’s session before the task starts.
  2. Escalation follows when indirect prompt injection or hidden extension behaviour steers the agent toward inspecting environment variables, changing scripts, or touching sensitive repository content.
  3. Impact occurs when the agent’s output, shaped by manipulated instructions, passes into code review, exposes secrets, or changes production-adjacent systems.
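The escalation step above mentions agents being steered toward inspecting environment variables. A crude heuristic for spotting that in shell commands might look like the sketch below. The reason labels and the regular expression are assumptions, and the check will produce false positives (for example, legitimate uses of env to set a variable for one command).

```python
import re
import shlex

# Matches shell variable references whose name suggests a credential.
SECRET_VAR = re.compile(r"\$\{?\w*(KEY|TOKEN|SECRET|PASSWORD)\w*\}?")


def flag_command(command):
    """Return a list of reasons a shell command looks like credential discovery."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes and similar; surface it rather than guess.
        return ["unparseable command"]
    reasons = []
    if tokens and tokens[0] in {"env", "printenv"}:
        reasons.append("environment dump")
    if any(t == ".env" or t.endswith("/.env") for t in tokens):
        reasons.append("dotenv file access")
    if SECRET_VAR.search(command):
        reasons.append("secret-like variable reference")
    return reasons
```

For example, flag_command("cat .env") returns a dotenv finding, while flag_command("ls -la src") returns an empty list. Pattern matching of this kind is easy to evade, so it is better treated as one input to session-level analysis than as a control in its own right.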



NHI Mgmt Group analysis

Claude agent security is now a full lifecycle governance problem, not a logging problem. The article shows that context, extension posture, runtime behaviour, and downstream artefacts all matter at once. That means security teams need to govern the agent path from configuration to execution, not just review the output after the fact. The practitioner conclusion is clear: agent visibility without lifecycle control is incomplete governance.

Setup-layer trust is the new failure plane for agentic systems. A Claude session can be compromised before the first prompt if an MCP server, skill, plugin, or hook is hostile or over-permissive. That is a control gap in the setup layer, not a runtime-only problem. Practitioners should treat connected components as part of the identity surface, because the agent inherits their trust and permissions immediately.

Full context is becoming the minimum viable control for autonomous work. Claude Code can influence commits, Cowork can shape decisions through enterprise context, and Chat can interact with sensitive information. Once agent activity can produce durable business effects, the governance model must connect prompts, tools, and downstream outcomes into one record. The practitioner takeaway is that isolated alerts will not support audit, investigation, or policy enforcement.

Runtime enforcement must classify behaviour, not just detect indicators. The article’s emphasis on destructive actions, credential exposure, and memory manipulation shows that the risk is behavioural drift across a session. That requires a governance model that can stop unsafe execution in context, not merely flag suspicious text. The implication for teams is to align AI security operations with action-based policy, not content review alone.


What this signals

Claude agent governance is starting to resemble a privileged identity programme with software execution attached. The immediate programme impact is that security teams need evidence across posture, runtime, and downstream artefacts, not just a posture scan or SIEM alert. With 72% of organisations already reporting or suspecting an NHI breach, per The 2024 ESG Report: Managing Non-Human Identities, the governance baseline is already strained.

Full-lifecycle context is the operational differentiator. Teams that can trace a Claude session from extension inventory through prompt, tool use, and commit correlation will be better placed to support audit and incident review. That is especially important where agent behaviour touches code, secrets, or production systems.

Runtime policy needs to move closer to the decision point. If enforcement only happens after the session completes, the programme has already accepted too much risk. Security teams should prioritise controls that can stop unsafe action before repository impact or business execution occurs.


For practitioners

  • Inventory all Claude-connected extensions: Catalogue MCP servers, skills, plugins, hooks, and local configuration scopes before allowing enterprise rollout. Record source, owner, permission scope, and whether each component can influence prompts, file access, or tool execution.
  • Correlate agent sessions to downstream artefacts: Link Claude activity to pull requests, commits, file changes, and business actions so investigators can reconstruct intent and sequence. Use that evidence to distinguish approved automation from manipulated or out-of-scope behaviour.
  • Block risky execution before repository impact: Apply inline controls that stop destructive actions, credential exposure, or suspicious command activity before the agent reaches source code, secrets stores, or production systems. Treat prevention as the primary control, not alerting after the event.
  • Review configuration posture across every scope: Assess managed, user, project, and local Claude settings for drift from policy and for combinations that expand agent privilege. Revalidate posture whenever extensions change or the agent is connected to new enterprise systems.
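The posture review in the last point can be sketched as a comparison of each settings scope against a deny policy, plus a worst-case union of grants across scopes to surface combinations that expand privilege. The allowed_tools key, the capability labels, and the settings values are hypothetical and do not reflect Claude's actual settings schema or its precedence rules.

```python
# Hypothetical capabilities that policy says no scope should grant.
POLICY_DENIED = {"shell_exec", "secrets_read"}


def posture_drift(settings_by_scope):
    """Return {scope: sorted denied capabilities granted in that scope}."""
    drift = {}
    for scope, settings in settings_by_scope.items():
        granted = set(settings.get("allowed_tools", []))
        offending = sorted(granted & POLICY_DENIED)
        if offending:
            drift[scope] = offending
    return drift


def effective_grants(settings_by_scope):
    """Worst-case view: the union of everything granted in any scope."""
    union = set()
    for settings in settings_by_scope.values():
        union.update(settings.get("allowed_tools", []))
    return sorted(union)


# Hypothetical settings for the four scopes named above.
settings = {
    "managed": {"allowed_tools": ["read_file"]},
    "user": {"allowed_tools": ["read_file", "shell_exec"]},
    "project": {"allowed_tools": ["edit_file"]},
    "local": {"allowed_tools": ["secrets_read", "shell_exec"]},
}
drift = posture_drift(settings)
```

Taking the union deliberately ignores how scopes override one another, so it over-reports. That is a reasonable starting point for review, but an accurate effective-permission view would have to model the real precedence order.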

Key takeaways

  • Claude agent security cannot be reduced to logging because setup-layer trust and runtime behaviour both shape risk.
  • Configuration, extensions, prompts, tools, and downstream outputs must be governed as one execution chain.
  • Teams that enforce policy before repository or system impact will be better positioned than teams that investigate after the fact.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and OWASP Non-Human Identity Top 10 address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework | Control / Reference | Relevance
OWASP Agentic AI Top 10 | AGENTIC-07 | Agent extension and tool abuse are central risks in this article.
OWASP Non-Human Identity Top 10 | NHI-03 | Configuration and secret exposure in agent setups map to NHI lifecycle risk.
NIST AI RMF | (none listed) | The article is about governance and runtime oversight for AI-enabled behaviour.

Apply GOVERN and MAP to assign ownership, controls, and escalation paths for agent risk.


Key terms

  • Agentic Identity Surface: The agentic identity surface is the full set of controls, permissions, and connected components that shape what an AI agent can see and do. It includes configuration, extensions, prompts, tools, and downstream artefacts, because each can influence behaviour and risk.
  • Setup-layer Risk: Setup-layer risk is exposure introduced before an AI agent begins a session. It usually comes from plugins, MCP servers, hooks, or misconfiguration that alter trust, permissions, or hidden instructions, making runtime monitoring too late to prevent the initial compromise.
  • Execution Chain: The execution chain is the sequence from prompt to tool use to resulting system or business action. It matters because agent risk is often only visible when the whole chain is analysed together, rather than when each step is judged in isolation.

Deepen your knowledge

NHI governance, agentic AI identity, and machine identity security are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are responsible for identity security strategy or governance maturity, it is worth exploring.

This post draws on content published by Zenity: Claude's Agents Are Already Running Across Your Enterprise. Now Security Teams Can Catch Up. Read the original.

NHIMG Editorial Note
Published by the NHIMG editorial team on 2026-06-12.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org