Claude.ai prompt injection exposed a silent data exfiltration chain

By NHI Mgmt Group Editorial Team | Published 2026-05-27 | Domain: Agentic AI & NHIs | Source: Oasis Security

TL;DR: Oasis Security’s Claudy Day research found three Claude.ai flaws that chained invisible prompt injection, file upload abuse, and an open redirect into silent data exfiltration from default sessions, with a broader blast radius when integrations are enabled. The core failure is that AI agent governance still assumes prompts reflect what users intended, rather than attacker-shaped execution paths.

At a glance

What this is: Oasis Security’s Claudy Day research shows that three Claude.ai flaws could be chained into silent prompt injection and data exfiltration from a default session.

Why it matters: IAM teams need to treat AI assistants as governed identities because prompt manipulation, hidden execution paths, and connected integrations can turn a chat session into a data-access channel.

👉 Read Oasis Security's analysis of Claudy Day and Claude.ai prompt injection

Context

Claude.ai is a widely used AI assistant that can accept prompts, remember prior conversation context, and connect to files or APIs through integrations. Claudy Day matters because the security assumption behind that experience is simple: the instructions processed by the assistant are the instructions the user meant to send.

The article shows how that assumption breaks when hidden instructions can be embedded in a pre-filled prompt, then combined with data export and redirect weaknesses. For IAM and security teams, the issue is not just prompt injection in the abstract. It is the governance gap between user intent, agent execution, and connected access paths.

Key questions

Q: How should security teams govern AI assistants that can access files and APIs?

A: Treat each assistant as a non-human identity with explicit owners, least privilege, and a documented lifecycle. Then review every file, API, and memory path it can reach. If a capability is not required for the business task, remove it. Governance only works when the assistant’s identity boundary is narrower than the data it can touch.

Q: Why do AI assistants complicate zero trust and least privilege?

A: Because the assistant can combine context, memory, and tools at runtime in ways that are hard to fully enumerate at provisioning time. Least privilege assumes you can define the needed access in advance. When the system can interpret hidden instructions and choose actions dynamically, that assumption weakens and continuous validation becomes essential.

Q: What do teams get wrong about prompt injection in AI assistants?

A: They treat it as a content safety issue instead of an access issue. Prompt injection becomes dangerous when the assistant can read sensitive history, call APIs, or write files on the user’s behalf. The risk is not only what the prompt says. It is what identity and egress permissions allow the prompt to trigger.

Q: How can organisations reduce the risk of data exfiltration through AI chat sessions?

A: Limit what the assistant can reach, remove unnecessary integrations, and log every high-risk action path. Then test whether a hidden instruction can still cause data export through a permitted service. The goal is to make exfiltration impossible through normal assistant capabilities, not just harder to spot after the fact.

Technical breakdown

Invisible prompt injection through URL parameters

Claude.ai allowed a new chat to be opened with a pre-filled prompt through a URL parameter. The issue was that certain HTML tags could be embedded invisibly in that parameter, so the user saw an ordinary prompt while Claude processed hidden instructions at runtime. That turns a chat link into an instruction-delivery channel, which matters because the visible interface no longer represents the actual execution payload. The risk is strongest where users trust prefilled content from search, shared links, or embedded workflows.

Practical implication: treat pre-filled assistant prompts as untrusted input and block or sanitize hidden markup before users can submit them.
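
As a rough illustration of that check, the sketch below decodes a pre-filled prompt from a chat link and flags markup or invisible characters before anything is shown to the user. The parameter name `q`, the host `assistant.example`, and both function names are assumptions made for this example; the source does not describe Claude.ai's actual parameter or its fix.

```python
import re
import unicodedata
from urllib.parse import parse_qs, urlsplit

# Matches anything that looks like an HTML tag or comment opener.
_TAG_RE = re.compile(r"<\s*/?\s*[A-Za-z!][^>]*>")
# Whitespace controls that are acceptable in a normally typed prompt.
_ALLOWED_CONTROLS = {"\n", "\r", "\t"}


def extract_prefill(url: str, param: str = "q") -> str:
    """Return the decoded pre-filled prompt from a chat link ('' if absent).

    The parameter name "q" is hypothetical.
    """
    values = parse_qs(urlsplit(url).query).get(param, [])
    return values[0] if values else ""


def find_hidden_content(prompt: str) -> list[str]:
    """List the reasons a pre-filled prompt should not be trusted as displayed."""
    reasons = []
    if _TAG_RE.search(prompt):
        reasons.append("html_markup")
    for ch in prompt:
        # Cf = format characters (zero-width, bidi, tag characters); Cc = controls.
        if unicodedata.category(ch) in {"Cf", "Cc"} and ch not in _ALLOWED_CONTROLS:
            reasons.append("invisible_characters")
            break
    return reasons
```

A caller could refuse to pre-fill the chat box whenever `find_hidden_content` returns a non-empty list. Rejecting is deliberately stricter than cleaning, since any cleaning step has to guess what the renderer would have hidden.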

Files API exfiltration from conversation history

The second weakness was abuse of the Anthropic Files API from within the sandboxed session. Although outbound network access was restricted, connections to api.anthropic.com were allowed, so a hidden prompt could instruct Claude to search conversation history, write sensitive content to a file, and upload it to an attacker-controlled account. Mechanically, this is not just data theft. It is agent-mediated exfiltration using permitted platform paths, which makes the abuse harder to detect than a simple outbound connection attempt.

Practical implication: review which APIs an assistant can reach from its sandbox and remove write or upload paths that are not operationally required.
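
One way to picture that review is an egress policy that matches on host, method, and path instead of hostname alone. The sketch below is a minimal illustration under stated assumptions: the `EgressRule` type, the `RULES` policy, and the specific paths are hypothetical choices for this example and do not describe how Anthropic's sandbox is actually configured.

```python
import posixpath
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class EgressRule:
    host: str
    methods: frozenset
    path_prefix: str


# Hypothetical policy: the sandbox may call one inference path but no
# upload path. The paths are illustrative, not taken from the source.
RULES = (
    EgressRule("api.anthropic.com", frozenset({"POST"}), "/v1/messages"),
)


def _path_matches(path: str, prefix: str) -> bool:
    """Match on whole path segments so '/v1/messagesX' does not slip through."""
    prefix = prefix.rstrip("/")
    return path == prefix or path.startswith(prefix + "/")


def is_allowed(method: str, url: str, rules=RULES) -> bool:
    """Allow a request only if host, method, and path all match a rule."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        return False
    host = (parts.hostname or "").lower()
    # Collapse '..' segments before matching; a real proxy would also need
    # to handle percent-encoded segments.
    path = posixpath.normpath(parts.path or "/")
    return any(
        host == rule.host
        and method.upper() in rule.methods
        and _path_matches(path, rule.path_prefix)
        for rule in rules
    )
```

A path rule like this cannot tell whose account receives an upload, which is why the sketch leaves the upload path out of the policy entirely instead of trying to filter it.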

Open redirect chaining with ad-delivered lures

The open redirect on claude.com let any URL under a redirect path forward the visitor to an arbitrary destination. Combined with Google Ads validation by hostname, an attacker could show a trusted claude.com link in search results and silently forward the user to the prompt-injection payload. This is a classic chain where trust in the displayed domain masks the real destination. The technical issue is less about the redirect alone and more about how it amplifies delivery reliability for the injection stage.

Practical implication: eliminate open redirects on identity or AI entry points, and test them as part of the same abuse chain as phishing and prompt injection.
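
A common way to close an open redirect is to validate the target against an allowlist before forwarding. The sketch below shows that general pattern only; the host names, the fallback path, and the function name are placeholders, and nothing here reflects how the claude.com redirect was implemented or fixed.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist of hosts the redirector may forward to.
ALLOWED_REDIRECT_HOSTS = frozenset({"example.com", "www.example.com"})


def safe_redirect_target(target: str, fallback: str = "/") -> str:
    """Return target if it is a same-site path or an allowlisted HTTPS host."""
    # Backslashes and control characters are normalised inconsistently by
    # browsers and URL parsers, so reject them outright.
    if any(ch in target for ch in "\\\r\n\t"):
        return fallback
    parts = urlsplit(target)
    if not parts.scheme and not parts.netloc:
        # Relative target: accept only a single-slash absolute path.
        if target.startswith("/") and not target.startswith("//"):
            return target
        return fallback
    host = (parts.hostname or "").lower()
    if parts.scheme == "https" and host in ALLOWED_REDIRECT_HOSTS:
        return target
    return fallback
```

Falling back to a fixed page when validation fails means a crafted link still lands on the trusted site, so the displayed hostname and the real destination stay the same.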

Threat narrative

Attacker objective: Quietly extract sensitive conversation content and related data from AI assistant sessions without any user-visible sign of compromise.

Entry occurred when a victim clicked a trusted-looking claude.com redirect link that silently forwarded them to a prompt-injection payload.
Credential access and abuse followed when the hidden prompt instructed Claude to search conversation history and use allowed API paths to upload extracted content.
Impact was silent data exfiltration from the assistant session, with wider blast radius when integrations or enterprise connections were present.

Codefinger AWS S3 ransomware attack: attackers used compromised AWS credentials to encrypt S3 buckets via SSE-C.
DeepSeek breach: exposed more than 1 million log lines and sensitive secret keys.

Read our 52 NHI Breaches Analysis report for a comprehensive view of breaches impacting Non-Human Identities including AI Agents.

NHI Mgmt Group analysis

Prompt injection is no longer a UI problem. It is an identity trust problem. Claudy Day shows that the assistant can receive one prompt on screen and a different instruction set at runtime. That breaks the governance assumption that user intent and executed intent are the same thing. The implication is that AI assistant governance must be treated as access governance, not just content filtering.

Hidden instructions create what we call prompt-path trust debt: the gap between the path a user believes they are following and the path the assistant actually executes. In this case, invisible markup, file upload capability, and redirect abuse combined into a single trust failure. Practitioners should recognize that the debt accumulates wherever input, execution, and egress are not independently controlled.

Conversation history becomes a governed data store the moment the assistant can read and summarize it. Claudy Day shows that the assistant can be instructed to inspect prior chats, identify sensitive material, and package it for export. That means privacy, classification, and retention controls must apply to AI memory and conversational context with the same seriousness as any other sensitive repository.

AI assistants should be governed as non-human identities, not as applications with a chat box. The article explicitly describes authentication, credentials, autonomous actions, and integrations. That places the control problem squarely inside NHI governance, where identity, permissions, auditability, and lifecycle management determine whether an assistant can be safely trusted. Practitioners should align AI agent governance with OWASP NHI and zero trust principles.

Blast radius is now determined by connected integrations, not by the assistant alone. The default session already exposes sensitive context, but MCP servers, tools, and enterprise APIs expand the reachable surface dramatically. That is why connected access must be reviewed as a delegation chain, not as a point product feature. The field should treat every attached integration as part of the assistant’s identity boundary.
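
Reviewing connected access as a delegation chain can be sketched as a reachability question over a graph of grants. In the example below, every node name and the graph itself are invented for illustration; the point is only that each attached integration extends what a compromised session can touch.

```python
from collections import deque


def reachable(graph: dict[str, list[str]], start: str) -> set[str]:
    """Return every node reachable from start by following delegation edges."""
    seen: set[str] = set()
    queue = deque(graph.get(start, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        queue.extend(graph.get(node, []))
    return seen


# Hypothetical delegation graph: an edge means "can act on or through".
DELEGATIONS = {
    "assistant": ["chat_history", "mcp_crm"],
    "mcp_crm": ["crm_api"],
    "crm_api": ["customer_records"],
}

blast_radius = reachable(DELEGATIONS, "assistant")
```

Here `blast_radius` contains `customer_records` even though the assistant was never granted that resource directly, which is the sense in which an integration becomes part of the assistant's identity boundary.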

From our research:
85% of organisations lack full visibility into third-party vendors connected via OAuth apps, according to The State of Non-Human Identity Security.
Only 1.5 out of 10 organisations are highly confident in their ability to secure NHIs, compared to nearly 1 in 4 for securing human identities.
Forward pivot: Analysis of Claude Code Security shows why agent governance now has to include connected tools, session memory, and delegated access paths.

What this signals

Prompt-path trust debt: security teams should expect more abuse of entry points where a link, prompt, or shared chat can carry hidden instructions into a governed session. The operational signal is simple. If users can enter data that changes execution without clear validation, the assistant is already outside a stable identity boundary.

As AI assistants gain broader access to files, memory, and enterprise systems, the control question shifts from prompt moderation to delegation control. That is why assistant inventory, integration review, and permission minimization should sit alongside NHI governance and zero trust controls, not inside a separate AI pilot.

The Claudy Day pattern is a reminder that AI assistant risk scales through connected access, not just model behavior. Teams that already struggle to see OAuth-connected services should assume similar blind spots will appear in assistant estates unless ownership, logging, and revocation are built in from the start.

For practitioners

Inventory every AI assistant and connected integration: Map each assistant, MCP server, API, file store, and browser entry point that can influence or extend a session. Classify what data each path can read, write, summarize, or export so shadow AI does not hide inside normal productivity workflows.
Strip hidden-input delivery paths from assistant entry points: Block or sanitize pre-filled prompts, shared links, and any content that can carry invisible instructions into a chat session. Test redirectors, URL parameters, and copied prompts as an abuse chain, not as isolated bugs.
Reduce sandbox egress to only required AI endpoints: Limit the assistant’s ability to call upload or export APIs unless a business process explicitly requires them. Review file-writing and account-to-account transfer paths as exfiltration channels, not just convenience features.
Apply identity governance to AI assistants and agents: Give each assistant an accountable owner, least-privilege access, session logging, and lifecycle offboarding rules. Treat credentials, memory, and tool access as governed entitlements that can be reviewed, revoked, and reassigned.
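
The inventory and governance steps above could start from something as small as the record below. This is a minimal sketch: the field names, capability labels, and the `review` function are assumptions for illustration, not a schema from any product or standard.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AssistantRecord:
    name: str
    owner: Optional[str]  # accountable human or team; None if unassigned
    granted: set[str] = field(default_factory=set)   # capabilities it has
    required: set[str] = field(default_factory=set)  # capabilities the task needs
    session_logging: bool = False


def review(record: AssistantRecord) -> list[str]:
    """Return governance findings for one assistant, in a stable order."""
    findings = []
    if not record.owner:
        findings.append("no accountable owner")
    excess = sorted(record.granted - record.required)
    if excess:
        findings.append("unneeded capabilities: " + ", ".join(excess))
    if not record.session_logging:
        findings.append("session logging disabled")
    return findings
```

For example, a record granted `upload_files` for a task that only needs `read_history` would be flagged for that excess capability, which is the kind of entitlement the steps above suggest revoking.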

Key takeaways

Claudy Day shows that prompt injection becomes a governance failure when hidden instructions can drive assistant behavior beyond what the user sees.
The attack chain mattered because it combined delivery, exfiltration, and redirect abuse into a silent session-level compromise path.
The control that matters most is not a single filter but a narrower identity boundary around AI assistants, their integrations, and their egress paths.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 and OWASP Non-Human Identity Top 10 address the attack and risk surface, while NIST CSF 2.0 sets the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	AGENTIC-2	Prompt injection and tool abuse are direct agentic application risks.
OWASP Non-Human Identity Top 10	NHI-03	The case hinges on overly broad assistant permissions and exfiltration paths.
NIST CSF 2.0	PR.AA-05	Least privilege and controlled access are central to the breach chain.

Sanitize instruction inputs, limit tool access, and validate assistant actions before execution.

Key terms

Prompt injection: Prompt injection is the act of embedding hidden instructions in content that an AI assistant later processes as if they were legitimate user intent. In practice, it turns ordinary input into a control channel for changing the assistant’s behavior, especially when the assistant can read memory, use tools, or reach external systems.
Non-human identity: A non-human identity is any machine or software identity that can authenticate, hold credentials, and access resources independently of a person. For AI assistants, that includes the credentials, permissions, memory access, and delegation paths needed to act at runtime, which makes lifecycle and privilege governance essential.
Open redirect: An open redirect is a web endpoint that forwards a visitor to a different destination without strict validation. In identity and AI attack chains, it is dangerous because it can make a malicious payload look like a trusted domain, increasing the chance that users will follow the link and trigger the next stage of abuse.
Assistant egress path: An assistant egress path is the set of outbound destinations an AI assistant can reach from its session, including APIs, file stores, and connected services. When these paths are broader than the business task requires, they become the easiest route for silent data extraction and session abuse.

Deepen your knowledge

Prompt injection, AI assistant governance, and non-human identity control are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are building controls for AI assistants with memory, tools, and file access, it is worth exploring.

This post draws on content published by Oasis Security: Claudy Day: Chaining Prompt Injection and Data Exfiltration in Claude.ai. Read the original.

NHIMG Editorial Note
Published by the NHIMG editorial team on 2026-05-27.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org