
How should security teams detect phishing emails that hide behaviour behind HTML and JavaScript?

Use behaviour-aware inspection that renders the message and any attachment in a sandbox, waits for delayed redirects, observes iframe loading, and deobfuscates scripts. Static scanning alone misses attacks that only become malicious after a timeout or browser interaction. Security teams should combine sandboxing, URL analysis, and sender context so the decision is based on what the message does, not only what it contains.

Why This Matters for Security Teams

Phishing is no longer limited to a visible link or a bad attachment. Attackers increasingly hide the real payload behind HTML, JavaScript, iframe chaining, delayed redirects, and browser-triggered actions, which means a message can look harmless at delivery time and still become malicious moments later. That is why behaviour-aware inspection matters more than pattern matching alone, especially when a campaign is designed to evade static signatures. Guidance from the NIST Cybersecurity Framework 2.0 supports risk-based detection, but email teams still need message-level execution analysis to catch delayed logic. NHIMG’s research on the Top 10 NHI Issues shows the same operational pattern across identity abuse: defenders miss threats when they inspect what something is rather than what it does. In practice, many security teams encounter the malicious behaviour only after a user opens the message and the browser has already followed the hidden path.

How It Works in Practice

Behaviour-aware detection works by treating the email as a small execution environment, not just a file to scan. The inspection stack should parse HTML, unpack obfuscation, expand shortened or nested URLs, and render content in a sandbox that can observe delayed execution. If JavaScript is present, the engine should evaluate script flow long enough to catch timeout-based redirects, DOM changes, and iframe injection. If the message depends on user interaction, the sandbox needs to simulate clicks, hover events, or form submissions where policy allows.
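As a rough illustration of the parsing step described above, the following Python sketch collects static hints that a message may act after delivery: iframes, meta refresh tags, remote scripts, and inline scripts that pair a timer with a navigation. The class name, function name, and regular expressions are assumptions made for this example, not part of any product, and the patterns are deliberately simple. A match is only a reason to send the message on to sandboxed rendering, not a verdict.

```python
import re
from html.parser import HTMLParser

# Illustrative heuristics only: a timer call and a navigation assignment or call.
DELAY_RE = re.compile(r"\bset(?:Timeout|Interval)\s*\(")
NAV_RE = re.compile(
    r"\b(?:window\.|document\.)?location(?:\.href)?\s*="
    r"|location\.(?:replace|assign)\s*\("
)


class BehaviourHintParser(HTMLParser):
    """Collects static hints that a message may change behaviour after delivery."""

    def __init__(self):
        super().__init__()
        self.findings = []          # list of (kind, detail) tuples
        self._in_script = False
        self._script_chunks = []

    def handle_starttag(self, tag, attrs):
        attr = {name: (value or "") for name, value in attrs}
        if tag == "iframe":
            self.findings.append(("iframe", attr.get("src", "")))
        elif tag == "meta" and attr.get("http-equiv", "").lower() == "refresh":
            self.findings.append(("meta_refresh", attr.get("content", "")))
        elif tag == "script":
            self._in_script = True
            self._script_chunks = []
            if attr.get("src"):
                self.findings.append(("remote_script", attr["src"]))

    def handle_data(self, data):
        # Script bodies can arrive in several chunks, so buffer them.
        if self._in_script:
            self._script_chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_script:
            self._in_script = False
            code = "".join(self._script_chunks)
            # Flag only scripts that combine a timer with a navigation.
            if DELAY_RE.search(code) and NAV_RE.search(code):
                self.findings.append(("delayed_redirect", code.strip()[:80]))


def static_hints(html):
    """Return the (kind, detail) hints found in an HTML message body."""
    parser = BehaviourHintParser()
    parser.feed(html)
    parser.close()
    return parser.findings
```

Plain formatting markup produces no findings here, which keeps harmless HTML out of the expensive sandbox path. Obfuscated scripts will not match these patterns, which is why the text above still calls for deobfuscation and rendering.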

Security teams usually get better results when they combine several signals:

  • Static parsing of HTML structure and suspicious tags
  • Script deobfuscation and URL chain expansion
  • Sandboxed rendering with timeout observation
  • Sender reputation, domain age, and authentication context
  • Disposition rules that weigh behaviour, not only content
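One of the signals listed above, URL chain expansion, can be partly done offline when a link carries its real destination inside a query parameter. The sketch below unwraps such nested URLs using only the Python standard library. The function name and the depth limit are assumptions for this example. It does not resolve shortened links, which would require following HTTP redirects from an isolated environment.

```python
from urllib.parse import parse_qsl, urlsplit


def expand_nested_urls(url, max_depth=5):
    """Return the chain of URLs found by unwrapping URL-valued query parameters.

    The first element is the original URL; each later element is a URL that
    was embedded in the query string of the previous one.
    """
    chain = [url]
    current = url
    for _ in range(max_depth):
        nested = None
        # parse_qsl percent-decodes each value once.
        for _name, value in parse_qsl(urlsplit(current).query):
            if value.lower().startswith(("http://", "https://")):
                nested = value
                break
        # Stop when nothing is embedded or when the chain loops back on itself.
        if nested is None or nested in chain:
            break
        chain.append(nested)
        current = nested
    return chain
```

Each URL in the returned chain can then be checked against reputation and domain-age data, so a trusted wrapper domain does not hide an untrusted final destination.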

This approach aligns with the operational direction described in Ultimate Guide to NHIs — Key Challenges and Risks, where hidden control paths and delayed misuse are a recurring concern. It also fits the NIST CSF emphasis on continuous monitoring, because the decision point moves from delivery time to runtime evidence. For teams building policy and triage logic, the key is to separate harmless HTML formatting from executable intent, then score the message on observed behaviour, not on the presence of JavaScript alone. These controls tend to break down in high-volume mail gateways that enforce very short analysis windows, because delayed redirects and post-load script execution never have time to reveal themselves.
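The idea of scoring on observed behaviour rather than on the presence of JavaScript can be sketched as follows. The field names, weights, and thresholds are hypothetical values chosen for illustration and would need tuning against real mail flow. The point of the sketch is only that `has_javascript` contributes nothing by itself, while observed actions and sender context do.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """Signals gathered for one message (all field names are illustrative)."""
    has_javascript: bool = False
    delayed_redirect: bool = False            # seen during sandboxed rendering
    iframe_injected: bool = False             # iframe added to the DOM at runtime
    credential_form_on_landing: bool = False  # final page asks for credentials
    sender_authenticated: bool = True         # e.g. SPF/DKIM/DMARC checks passed
    domain_age_days: int = 3650


def behaviour_score(obs):
    """Return a 0-100 score built from behaviour and context (assumed weights)."""
    score = 0
    if obs.delayed_redirect:
        score += 40
    if obs.iframe_injected:
        score += 25
    if obs.credential_form_on_landing:
        score += 30
    if not obs.sender_authenticated:
        score += 15
    if obs.domain_age_days < 30:
        score += 10
    # obs.has_javascript is intentionally not scored: script presence alone
    # does not separate executable intent from harmless formatting.
    return min(score, 100)


def disposition(score, review_at=30, quarantine_at=60):
    """Map a score to an action using assumed thresholds."""
    if score >= quarantine_at:
        return "quarantine"
    if score >= review_at:
        return "review"
    return "deliver"
```

For example, a message that merely contains a script scores 0 and is delivered, while one whose script redirects after a delay to a credential form scores 70 and is quarantined under these assumed weights.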

Common Variations and Edge Cases

Tighter behavioural analysis often increases latency and false-positive review volume, so organisations have to balance deeper inspection against mail throughput and user friction. The tradeoff can be tuned rather than treated as an all-or-nothing choice, for example by reserving the deepest inspection for messages that already look risky.

Some campaigns use simple HTML with no JavaScript but rely on remote images, CSS tricks, or link wrapping to trigger tracking and redirection. Others place the malicious logic in a second-stage page, so the email itself looks clean while the destination becomes hostile after one or two browser events. In those cases, inspection should extend beyond the message body to the final landing page and any embedded resources. The LLMjacking: How Attackers Hijack AI Using Compromised NHIs research is a useful reminder that attackers often chain small abuses into larger compromise paths, which is exactly why layered analysis matters. There is no universal standard for how long a sandbox should wait for delayed behaviour, but best practice is evolving toward adaptive timing based on risk scoring, domain trust, and campaign history. Teams also need to account for mobile clients and mail apps that do not render the same way as desktop browsers, because detection can vary sharply by client. When filtering is tuned only for obvious script signatures, campaigns that rely on delayed browser behaviour or remote content loading will still pass through.
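Adaptive timing of the kind mentioned above could look something like the sketch below, which derives a sandbox observation window from a risk score, domain trust, and campaign history. The function name, the multipliers, and the default base and ceiling are assumptions for this example, since the text notes there is no universal standard for these values.

```python
def sandbox_wait_seconds(risk_score, domain_trusted, seen_in_campaign,
                         base=5.0, ceiling=60.0):
    """Pick how long to observe a message for delayed behaviour.

    risk_score is expected in [0.0, 1.0]; every constant here is an assumed
    starting point, not a recommended value.
    """
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError("risk_score must be between 0.0 and 1.0")
    # Scale the window from 1x to 5x the base as risk increases.
    wait = base * (1.0 + 4.0 * risk_score)
    # Indicators tied to a known campaign earn a longer observation.
    if seen_in_campaign:
        wait *= 2.0
    # Trusted sending domains are observed for a shorter time.
    if domain_trusted:
        wait *= 0.5
    # Cap the window so mail throughput stays predictable.
    return min(wait, ceiling)
```

The ceiling keeps worst-case latency bounded on busy gateways, while the risk multiplier spends the longer windows on the messages most likely to hide delayed redirects.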

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and OWASP Non-Human Identity Top 10 address the attack and risk surface, while NIST AI RMF sets the governance and control requirements practitioners need to meet.

Framework | Control / Reference | Relevance
OWASP Agentic AI Top 10 | L-06 | Behaviour-based evaluation mirrors runtime analysis against hidden execution paths.
OWASP Non-Human Identity Top 10 | NHI-05 | Covers abuse of hidden credentials and delivery paths that evade static review.
NIST AI RMF | | Risk-based monitoring supports decisions based on observed behaviour and context.

Inspect rendered email behaviour and script actions at runtime, not just static HTML strings.