Runtime protection against Kaiji malware needs stronger drift control

By NHI Mgmt Group Editorial Team | Published 2025-11-18 | Domain: Governance & Risk | Source: Aqua Security

TL;DR: Kaiji has evolved from a straightforward Linux and IoT threat into malware that uses persistence, fileless execution, and system tampering to stay hidden after compromise, according to Aqua Security. That makes runtime enforcement, drift prevention, and tamper-aware detection more important than simple post-infection cleanup.

At a glance

What this is: Aqua Security’s guidance on using runtime protection to detect and block Kaiji malware’s persistence, fileless execution, and container drift tactics.

Why it matters: Workload, container, and server security teams need controls that can stop malware after initial execution, not only before deployment.

Context

Kaiji is a persistence-focused malware strain that targets Linux-based servers and IoT devices, then uses hiding techniques to remain resident after compromise. The operational problem is not just initial infection, but the attacker’s ability to survive reboots, obscure process visibility, and evade basic administrative checks.

For container and workload teams, that makes runtime protection a governance issue as much as a detection issue. If enforcement is limited to pre-deployment hygiene or isolated endpoint checks, malware that mutates its behaviour at runtime can keep its foothold long enough to complicate containment and recovery.

Key questions

Q: How should teams stop malware that hides itself after initial execution?

A: Teams should combine runtime monitoring with enforcement that can block suspicious execution paths, unexpected persistence entries, and drift from the approved workload state. Cleanup after compromise is not enough if the malware can relaunch after reboot or conceal itself from standard inspection tools. Runtime control has to interrupt the behaviour that keeps the threat alive.

Q: Why do containerised workloads need drift prevention for malware defence?

A: Containerised workloads need drift prevention because a workload can be approved at build time and still behave maliciously at runtime. Malware that uses fileless execution or hidden startup logic may never appear in the image scan, so the security team needs controls that compare live behaviour against the expected state.

Q: What do security teams get wrong about persistence in Linux malware cases?

A: They often focus on removing visible processes and overlook the restart mechanism that restores the malware later. If the persistence path survives reboot, the incident is not contained. Security teams need to look for startup entries, tampered tools, and hidden launch points, not just active processes.

Q: What should teams do when runtime protection can only observe suspicious activity?

A: They should use observation for tuning, but move high-confidence malicious patterns into enforcement. If a threat family is built to survive and hide, passive visibility can arrive too late. The goal is to stop execution before the malware finishes setting up persistence and concealment.

Technical breakdown

How Kaiji persistence survives ordinary cleanup

Kaiji persists by planting itself in commonplace startup locations and arranging for automatic execution after reboot. That means cleanup efforts that only remove visible processes can miss the mechanism that restores the malware on the next startup. The article also describes command interception, where the malware alters what administrators see when they inspect running programs or network connections. This is not just concealment. It is active manipulation of the administrative view, which makes trust in basic inspection tools unreliable once the host is compromised.

Practical implication: defenders need runtime inspection that can expose startup persistence and tampered system commands, not just file scans.
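
The startup-persistence check described above can be sketched as a baseline comparison. This is a minimal illustration, not Aqua's implementation: the directory list in `STARTUP_DIRS` and the function names `snapshot` and `diff_snapshots` are assumptions made for this example, and a real host may use other startup locations. Because Kaiji is reported to tamper with what administrators see, a check like this is only meaningful when it runs from a trusted context and compares against a baseline taken before compromise.

```python
import hashlib
import os

# Assumed, non-exhaustive list of startup locations on a typical Linux host.
STARTUP_DIRS = ["/etc/systemd/system", "/etc/init.d", "/etc/cron.d", "/etc/profile.d"]


def snapshot(dirs):
    """Map every file or symlink under the given directories to a fingerprint."""
    entries = {}
    for root in dirs:
        # os.walk does not follow symlinked directories and skips missing roots.
        for dirpath, dirnames, filenames in os.walk(root):
            for name in dirnames + filenames:
                path = os.path.join(dirpath, name)
                if os.path.islink(path):
                    # Record where the link points instead of hashing the target.
                    entries[path] = "symlink:" + os.readlink(path)
                elif os.path.isfile(path):
                    try:
                        with open(path, "rb") as fh:
                            entries[path] = hashlib.sha256(fh.read()).hexdigest()
                    except OSError:
                        entries[path] = "unreadable"
    return entries


def diff_snapshots(baseline, current):
    """Report startup entries that were added, removed, or changed."""
    shared = set(baseline) & set(current)
    return {
        "added": sorted(set(current) - set(baseline)),
        "removed": sorted(set(baseline) - set(current)),
        "changed": sorted(p for p in shared if baseline[p] != current[p]),
    }
```

Calling `snapshot(STARTUP_DIRS)` on a known-good host and storing the result gives the baseline; any later `added` or `changed` entry is a candidate persistence path to investigate before declaring the host clean.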

Fileless execution and container drift at runtime

Fileless execution reduces the number of on-disk artefacts defenders can use to spot malicious behaviour, while container drift creates a mismatch between the expected workload state and the runtime state. In a container environment, drift matters because a workload can appear compliant on paper while behaving differently in memory or at execution time. Aqua’s guidance centres on blocking both patterns during runtime rather than assuming immutable deployment controls are sufficient. The technical issue is that the malicious behaviour occurs after the image has passed build-time checks, so the real control point is live execution.

Practical implication: enforce runtime policy for execution behaviour, then treat drift as an active containment signal, not a cosmetic alert.
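
One common heuristic for spotting fileless execution on Linux is to look at where each process's `/proc/<pid>/exe` link points. The sketch below is an assumption-laden illustration rather than the detection logic of any product: it flags executables backed by an anonymous in-memory file (`memfd`) or by a file that has since been deleted from disk. The function names are hypothetical, and "deleted" targets also occur legitimately, for example after a package upgrade, so results are leads for investigation and not verdicts.

```python
import os


def classify_exe_target(target):
    """Classify the link target of /proc/<pid>/exe."""
    # memfd-backed executables appear as "/memfd:<name> (deleted)".
    if target.startswith("/memfd:"):
        return "memfd"
    # The kernel appends " (deleted)" when the backing file was unlinked.
    if target.endswith(" (deleted)"):
        return "deleted"
    return "on-disk"


def scan_processes(proc_root="/proc"):
    """Return (pid, kind, target) for processes without a normal on-disk binary."""
    findings = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            target = os.readlink(os.path.join(proc_root, entry, "exe"))
        except OSError:
            # Kernel threads, exited processes, or insufficient privileges.
            continue
        kind = classify_exe_target(target)
        if kind != "on-disk":
            findings.append((int(entry), kind, target))
    return findings
```

Running `scan_processes()` with sufficient privileges on a Linux host lists candidate processes. In an enforcing runtime platform, the equivalent decision would be made at execution time rather than by polling after the fact.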

Audit versus enforce modes in runtime policy

The article distinguishes between audit mode and enforce mode for runtime policies. Audit mode observes suspicious behaviour, which is useful for tuning and visibility. Enforce mode actively blocks malicious activity such as fileless execution and drift. This distinction matters because visibility alone does not stop a malware family designed to persist and hide. Runtime protection becomes materially stronger when policies can move from observation to interruption at the point of execution, rather than waiting for a later investigation cycle.

Practical implication: move high-confidence policies into enforcement where the workload risk justifies active interruption.
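
The difference between the two modes can be shown with a small decision function. This is a hypothetical model for illustration only: the `Policy` fields, behaviour labels, and return values are assumptions and do not reflect Aqua's policy schema.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    AUDIT = "audit"      # observe and report only
    ENFORCE = "enforce"  # actively block the behaviour


@dataclass(frozen=True)
class Policy:
    name: str
    behaviour: str  # hypothetical label, e.g. "fileless_exec" or "drift"
    mode: Mode


def decide(behaviour, policies):
    """Return "block", "alert", or "allow" for an observed behaviour."""
    matched = [p for p in policies if p.behaviour == behaviour]
    if any(p.mode is Mode.ENFORCE for p in matched):
        return "block"  # at least one enforcing policy covers this behaviour
    if matched:
        return "alert"  # only audit policies matched: record it, do not stop it
    return "allow"
```

With a fileless-execution policy in enforce mode and a drift policy still in audit mode, `decide("fileless_exec", policies)` yields "block" while `decide("drift", policies)` yields "alert". Promoting a policy is then a one-field change once tuning shows it is high confidence.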

Threat narrative

Attacker objective: Maintain durable, hidden access on compromised Linux or IoT systems so the malware can survive reboots and resist cleanup.

Entry occurs when Kaiji reaches Linux-based servers or IoT devices and gains an initial foothold that allows it to place persistence artefacts on the host.
Escalation comes from startup wiring, command interception, and hidden execution paths that let the malware stay resident and evade normal administrator visibility.
Impact is sustained compromise, with the malware remaining planted long enough to hinder cleanup, complicate detection, and extend attacker control over the workload.

NHI Mgmt Group analysis

Persistence is the real control failure here, not initial execution. Kaiji’s behaviour shows that workload security cannot stop at preventing the first malicious action. Once a host accepts a persistence mechanism, the attacker is no longer gambling on a single run, but on repeated re-entry after reboot and on the defender’s inability to see the planted artefact. Practitioners should treat persistence as the control boundary that determines whether an incident becomes a cleanup event or a lasting compromise.

Runtime drift is a governance problem because the approved state is no longer the operating state. Container security programmes often assume that image validation and pre-deployment checks define the security posture. Kaiji’s fileless and drift-based behaviour breaks that assumption by shifting the malicious logic into live execution. The implication is that workload governance must distinguish between declared state and observed state, because the latter is where the attack actually lives.

Visible administration is not trustworthy when malware can rewrite what the system reports about itself. The article’s description of command interception matters because it turns ordinary inspection into a false source of confidence. When malware can hide files, processes, or connections from standard checks, control ownership shifts from human review to instrumented runtime enforcement. Practitioners should stop assuming that a clean command-line snapshot equals a clean host.

Runtime enforcement should be treated as a containment layer, not a tuning preference. Audit mode has value for calibration, but Kaiji’s persistence profile means that passive detection may arrive after the attacker has already established repeated footholds. That makes enforcement semantics part of the security design, especially for workloads where one hidden restart path can reintroduce the threat. The practical conclusion is to reserve enforcement for the highest-confidence behaviours that represent confirmed malicious execution patterns.

Container drift is the named concept that best captures this threat pattern. Drift here is the gap between the workload that was approved and the workload that actually runs, including fileless activity, hidden startup logic, and tampered inspection output. Kaiji exploits that gap by surviving inside the runtime rather than the build artefact. Practitioners should measure security at execution time, because that is where the control failure becomes observable.
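
The approved-versus-observed gap can be made concrete with a short sketch. It assumes two inputs that a real platform would have to supply: a mapping of executable paths to digests recorded at build time, and a stream of (path, digest) pairs observed when binaries execute. Both data shapes and the name `find_drift` are hypothetical.

```python
def find_drift(approved, executions):
    """Compare observed executions against the approved build-time baseline.

    approved:   dict mapping executable path -> digest recorded at build time
    executions: iterable of (path, digest) pairs observed at runtime
    """
    findings = []
    for path, digest in executions:
        expected = approved.get(path)
        if expected is None:
            # The binary was never part of the approved workload.
            findings.append((path, "not in approved baseline"))
        elif expected != digest:
            # Same path, different content: modified after approval.
            findings.append((path, "digest differs from baseline"))
    return findings
```

An empty result means every observed execution matched the approved state. Any finding marks the point where declared state and operating state have diverged, which is the signal the analysis above treats as a containment trigger.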

From our research:
1 in 4 organisations are already investing in dedicated NHI security capabilities, with an additional 60% planning to do so within the next twelve months, according to The State of Non-Human Identity Security.
Lack of credential rotation is cited as the top cause of NHI-related attacks by 45% of organisations, followed by inadequate monitoring and logging at 37%, according to Astrix Security & CSA.
For teams building runtime and lifecycle controls, the practical next step is to pair visibility with governance using Ultimate Guide to NHIs - Key Challenges and Risks.

What this signals

Container drift is becoming a useful governance concept for teams that need to separate approved state from observed state. In practice, malware that survives through startup entries or fileless execution forces security teams to measure the runtime, not the artefact. That makes runtime enforcement and behavioural baselines part of workload governance, not just response tooling.

With 45% of organisations citing lack of credential rotation as the top cause of NHI-related attacks, the broader lesson is that persistence problems rarely stay confined to one control layer. Teams should expect attackers to exploit whatever remains stable long enough to re-establish access, which is why runtime and lifecycle controls need to work together.

Workload security programmes should now assume that visible inspection can be manipulated. If the host can hide its own files, processes, or connections, then policy decisions must depend on instrumented enforcement and trusted telemetry rather than operator observation alone.

For practitioners

Block fileless execution at runtime: Move from detection-only posture to policy enforcement for execution paths that never write clear artefacts to disk, especially on shared Linux workloads and exposed container nodes.
Treat startup persistence as an active indicator: Alert on unexpected startup entries, scheduled tasks, and service registrations that can relaunch a payload after reboot, then validate whether the process tree matches the approved workload.
Use drift prevention to compare approved and observed state: Compare the running workload against its expected baseline and investigate any mismatch in binaries, startup behaviour, or command visibility as a potential compromise path.
Promote high-confidence runtime policies to enforce mode: Keep audit mode for tuning, but move confirmed malicious patterns into enforcement so the platform can interrupt execution before the malware completes persistence setup.
Instrument command tampering checks on hosts and containers: Validate that standard administrative commands still report accurately, because malware that hides files or processes can make routine inspection look clean when it is not.
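
The last check in the list can be approximated by cross-checking two views of the process table. The sketch below is a hedged illustration with hypothetical function names: it compares the PIDs visible in `/proc` with the PIDs reported by `ps`, and reports any that `ps` omits. It would only catch tampering that affects the `ps` binary or its output. Interception at a lower layer could hide processes from both views, so a trusted, out-of-band sensor remains the more reliable source.

```python
import os
import subprocess


def pids_from_proc(proc_root="/proc"):
    """PIDs visible as numeric directories under /proc."""
    return {int(e) for e in os.listdir(proc_root) if e.isdigit()}


def pids_from_ps_output(text):
    """Parse the output of `ps -e -o pid=` (one PID per line, no header)."""
    return {int(tok) for tok in text.split() if tok.isdigit()}


def hidden_from_ps(proc_pids, ps_pids):
    """PIDs that exist in /proc but that ps did not report."""
    return sorted(proc_pids - ps_pids)


def check_ps_consistency():
    before = pids_from_proc()
    out = subprocess.run(
        ["ps", "-e", "-o", "pid="], capture_output=True, text=True, check=True
    ).stdout
    after = pids_from_proc()
    # Only consider PIDs alive both before and after ps ran, which filters
    # out short-lived processes (including ps itself) and reduces race noise.
    stable = before & after
    return hidden_from_ps(stable, pids_from_ps_output(out))
```

A non-empty result from `check_ps_consistency()` does not prove compromise, but it is a concrete reason to stop trusting the host's own tools and to switch to instrumented telemetry.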

Key takeaways

Kaiji is a persistence problem as much as a malware problem, because its main value to attackers is surviving cleanup and reboot.
Runtime drift, fileless execution, and command tampering show why build-time checks alone cannot govern live workload behaviour.
Teams that want to contain threats like Kaiji need enforcement-level runtime controls, not just post-incident visibility.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

The OWASP Non-Human Identity Top 10 addresses the attack and risk surface, while NIST CSF 2.0 and NIST Zero Trust (SP 800-207) set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Non-Human Identity Top 10	NHI-03	Runtime persistence and hidden execution expose gaps in secret and identity control.
NIST CSF 2.0	DE.CM-09	Continuous monitoring is needed to spot tampering and drift at runtime.
NIST Zero Trust (SP 800-207)	PR.AC-4	Least-privilege and continuous verification help limit what compromised workloads can do.

Limit runtime permissions and verify workload behaviour continuously rather than at deploy time.

Key terms

Runtime Protection: Runtime protection is the set of controls that monitor and intervene while a workload is executing. It looks at live behaviour, not just approved code or images, so defenders can block malicious actions such as fileless execution, tampering, or persistence setup as they happen.
Container Drift: Container drift is the gap between the workload state that was approved and the state that actually runs. In practice, drift can include unexpected binaries, altered startup behaviour, or hidden execution paths, and it matters because build-time trust does not guarantee runtime trust.
Persistence Mechanism: A persistence mechanism is any technique that allows malware to survive interruption and return after reboot or process termination. In Linux and container environments, that can include startup scripts, service registrations, or other launch paths that recreate the malicious foothold.
Fileless Execution: Fileless execution is malicious activity that runs without leaving a conventional on-disk payload for scanners to inspect. It reduces obvious forensic artefacts and shifts detection toward behavioural controls, memory inspection, and runtime enforcement rather than file-based hygiene alone.

Deepen your knowledge

NHI governance, agentic AI identity, and machine identity lifecycle are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are responsible for identity security strategy or NHI governance in your organisation, it is worth exploring.

This post draws on content published by Aqua Security: How to Set Up Runtime Protection Against Malware Like Kaiji.

NHIMG Editorial Note
Published by the NHIMG editorial team on 2025-11-18.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org