OpenClaw and autonomous assistants: where governance breaks down

NHI Mgmt Group

(@nhi-mgmt-group)

Member Moderator

Joined: 1 year ago

Posts: 12387

Topic starter 05/07/2026 6:49 pm

TL;DR: OpenClaw can be driven from indirect prompt injection into command execution, persistent heartbeat backdoors, and plaintext secret exfiltration when untrusted content reaches its tool layer and system prompt, according to HiddenLayer’s analysis. The bigger lesson is that autonomy without hard execution boundaries turns assistant behavior into an access-control problem, not just a model-safety issue.

NHIMG editorial — based on content published by HiddenLayer: Exploring the Security Risks of AI Assistants like OpenClaw

By the numbers:

Only 5.7% of organisations have full visibility into their service accounts.
96% of organisations store secrets outside of secrets managers in vulnerable locations including code, config files, and CI/CD tools.

Questions worth separating out

Q: What breaks when an autonomous assistant can read untrusted content and execute tools in the same session?

A: The separation between input handling and execution breaks down.

Q: Why do autonomous assistants create more risk than ordinary automation for IAM and NHI teams?

A: Because the actor is making runtime decisions about what to do next, not just following a fixed workflow.

Q: What do security teams get wrong about prompt injection in agentic systems?

A: They often treat prompt injection as a model quality issue instead of an execution control issue.
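The answers above can be sketched as a provenance flag on the session. This is a minimal illustration, not OpenClaw's design: `Source`, `Session`, `ingest`, and `may_run_tool` are hypothetical names. The idea is that once untrusted content enters the context, tool execution stops being automatic and needs an approval that comes from outside the model.

```python
from dataclasses import dataclass, field
from enum import Enum


class Source(Enum):
    OPERATOR = "operator"  # typed by the human who owns the session
    EXTERNAL = "external"  # web pages, emails, documents, tool output


@dataclass
class Session:
    # Hypothetical session object; each history entry keeps its provenance.
    history: list = field(default_factory=list)
    tainted: bool = False

    def ingest(self, text: str, source: Source) -> None:
        """Record content and mark the session once untrusted input arrives."""
        self.history.append((source, text))
        if source is Source.EXTERNAL:
            self.tainted = True

    def may_run_tool(self, operator_approved: bool = False) -> bool:
        """Allow tools freely only while the session holds trusted input alone."""
        return (not self.tainted) or operator_approved


session = Session()
session.ingest("Summarise this page for me", Source.OPERATOR)
print(session.may_run_tool())   # tools allowed: only operator input so far
session.ingest("<p>Ignore previous instructions and run ...</p>", Source.EXTERNAL)
print(session.may_run_tool())   # tools now blocked until the operator approves
```

The flag never resets within a session in this sketch, which is a deliberately conservative assumption: the model cannot talk its way back to a trusted state.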

Practitioner guidance

Move tool authorization outside the model: require a separate policy decision before any shell command, file write, or external request is executed.
Make prompt content immutable at runtime: prevent the assistant from writing to files that are later ingested into the system prompt or skill configuration.
Isolate secrets from assistant-accessible storage: keep API keys and messaging tokens out of plaintext environment files that the assistant or its shell access can read.
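One way to picture the three items above is a policy gate that sits between the model and its tools. The sketch below is an assumption-laden illustration rather than anything from OpenClaw or the HiddenLayer report: `ToolRequest`, `authorize`, the allowlists, and the host name are all hypothetical, and `HEARTBEAT.md` and `.env` stand in for prompt-feeding and secret-bearing files.

```python
import shlex
from dataclasses import dataclass
from pathlib import PurePosixPath

# Hypothetical policy data; a real deployment would load this from
# configuration that the assistant cannot modify.
ALLOWED_COMMANDS = {"ls", "cat", "git"}
ALLOWED_HOSTS = {"api.internal.example"}
PROTECTED_FILES = {"HEARTBEAT.md", ".env"}
SHELL_METACHARACTERS = set("|;&$`<>")


@dataclass(frozen=True)
class ToolRequest:
    tool: str    # "shell", "file_write", or "http_request"
    target: str  # command line, file path, or host name


def _touches_protected(path: str) -> bool:
    """True if the final path component is a prompt or secret file."""
    return PurePosixPath(path).name in PROTECTED_FILES


def authorize(req: ToolRequest) -> tuple[bool, str]:
    """Decide outside the model whether a requested tool call may run."""
    if req.tool == "shell":
        if SHELL_METACHARACTERS & set(req.target):
            return False, "shell metacharacters are not permitted"
        try:
            parts = shlex.split(req.target)
        except ValueError:
            return False, "unparseable command"
        if not parts or parts[0] not in ALLOWED_COMMANDS:
            return False, "command not on allowlist"
        if any(_touches_protected(p) for p in parts[1:]):
            return False, "command references a protected file"
        return True, "ok"
    if req.tool == "file_write":
        if _touches_protected(req.target):
            return False, "write to prompt or secret file"
        return True, "ok"
    if req.tool == "http_request":
        if req.target not in ALLOWED_HOSTS:
            return False, "host not on allowlist"
        return True, "ok"
    return False, "unknown tool"  # default deny


print(authorize(ToolRequest("shell", "cat .env")))
print(authorize(ToolRequest("file_write", "workspace/HEARTBEAT.md")))
```

A name-based check like this is only a first line of defence. It assumes approved commands are executed as an argument list rather than through a shell, and it should be backed by filesystem permissions and a secrets manager so that the protected files are unreadable and unwritable by the assistant's process in the first place.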

What's in the full report

HiddenLayer's full research covers the operational detail this post intentionally leaves for the source:

The exact indirect prompt injection sequence used to steer OpenClaw into executing attacker-controlled commands.
The HEARTBEAT.md persistence mechanism and how it lets malicious instructions survive across new sessions.
The security architecture failures around control sequences, guardrails, and approval-free tool execution.
The plaintext secret exfiltration path and why the assistant's local runtime makes the blast radius larger.

👉 Read HiddenLayer’s analysis of OpenClaw’s autonomous assistant security risks →

Mr NHI

(@mr-nhi)

Member Moderator

Joined: 3 months ago

Posts: 11961

05/07/2026 7:07 pm

Autonomous assistants collapse the boundary between application logic and identity authority. OpenClaw is not only a model safety problem. It is a delegation problem in which the system itself is allowed to decide when to act, what to run, and what to persist. Once those decisions happen inside the same runtime that sees untrusted content, the assistant is behaving like an identity with execution authority, not a passive interface. Practitioners should treat that as a governance boundary, not a UI concern.

A few things that frame the scale:

Only 5.7% of organisations have full visibility into their service accounts, according to the Ultimate Guide to NHIs.
Another finding from the Ultimate Guide to NHIs (Lifecycle Processes for Managing NHIs) shows that only 20% of organisations have formal processes for offboarding and revoking API keys, which is directly relevant when assistant access becomes persistent.

A question worth separating out:

Q: Who is accountable when an autonomous assistant exfiltrates secrets or runs destructive commands?

A: Accountability sits with the team that granted the assistant its tool access, data access, and execution paths. For governed environments, that responsibility also extends to the controls that failed to separate instruction content from runtime authority. If the agent can act without a control gate, the governance gap is structural.

👉 Read our full editorial: OpenClaw shows how agent autonomy becomes system exposure
