GEO manipulation can push harmful claims into AI answers

By NHI Mgmt Group Editorial Team | Published 2026-06-28 | Domain: Agentic AI & NHIs | Source: Lasso Security

TL;DR: Standard Generative Engine Optimization techniques can increase the chance that a harmful claim appears inside AI-generated answers, even when the attacker only controls a public webpage and has no model access, according to Lasso Security. The implication is that retrieval trust assumptions, not just prompt security, now shape answer integrity and downstream user risk.

At a glance

What this is: An analysis of how GEO can manipulate AI-generated answers by shaping the web content that retrieval-based assistants ingest.

Why it matters: IAM, NHI, and agentic AI programmes now need to treat retrieved content as part of the trust boundary, especially when AI systems act on answers rather than merely display them.

By the numbers:

OpenAI reports more than 800 million weekly ChatGPT users and over 1 million paying business customers.
A Columbia Journalism Review / Tow Center study found leading AI search tools answered more than 60% of test queries incorrectly and fabricated citation links.


Context

Generative Engine Optimization, or GEO, is the practice of shaping web content so AI assistants cite it, repeat it, or rely on it inside generated answers. The governance problem is that retrieval-based systems do not only rank information; they ingest it into the reasoning path that produces the final answer, which means manipulated public content can affect user decisions before any human review happens.

For identity and access teams, the issue is not just misinformation. Retrieval content becomes an upstream trust input for AI agents, copilots, and decision workflows that may already have access to internal data or business actions. That makes the integrity of external web sources part of the identity security boundary, especially when users and systems act on answers without checking the provenance behind them.

The article shows that no account compromise is required for this attack path. A public page, ordinary editorial writing, and standard optimization techniques are enough to bias what the model says, which is a serious warning for programmes that still treat AI answers as downstream of search rather than as a distinct trust surface.

Key questions

Q: How should security teams handle retrieval content that influences AI answers?

A: Treat retrieval content as part of the trust boundary. Assistants can turn a public page into authoritative-sounding guidance, so teams should validate source provenance, corroboration, and citation quality before allowing an answer to feed a workflow, recommendation, or decision. When the system cannot explain why a source was selected, the output should not be trusted.

Q: Why do GEO attacks matter for identity and access programmes?

A: Because AI answers increasingly shape actions, and actions are governed by identity. If an assistant can be influenced by external content, then the issue is not only misinformation but also unsafe delegation, weak provenance, and unverified automation. Identity teams need controls for what the system is allowed to trust, not just what it is allowed to do.

Q: What breaks when AI assistants rely on fluent but unverified web sources?

A: The model can present a harmful claim with the tone and structure of reliable guidance, which lowers user skepticism and increases the chance of bad decisions. Fluent formatting, fake citations, and editorial polish can all act as false trust signals. Without provenance controls, the assistant may treat persuasion as evidence.

Q: Who is accountable when an AI-generated answer causes harm?

A: Accountability usually sits with the organisation that designed the retrieval, validation, and action-gating workflow, not with the model itself. If external content can influence a high-stakes answer, the programme needs clear ownership for source review, escalation, and human override. For regulated decisions, the ability to explain source use becomes part of governance.

Technical breakdown

How GEO changes retrieval trust in AI answers

GEO works by influencing the content that retrieval systems prefer when assembling an answer. Unlike classic SEO, the goal is not just visibility in a results list but citation inside the model’s response. Structured wording, confident tone, lists, and repeated claims can all increase the chance that a page is pulled into the answer context. Once that page is in the retrieval set, the model may paraphrase it, quote it, or treat it as corroborating evidence alongside other sources. Answer integrity therefore depends on source selection, not only prompt safety.

Practical implication: Treat retrieval sources as part of the security boundary and review how AI systems select, rank, and cite external pages.
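
One way to make source selection reviewable is to label every retrieved page with a trust tier before it reaches the answer context. The following is a minimal sketch, not a description of how any particular assistant works: the `RetrievedSource` type, the tier names, and the `TRUSTED_DOMAINS` allowlist are all hypothetical and would need to be defined by each programme.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

# Illustrative allowlist only (an assumption); each programme defines its own.
TRUSTED_DOMAINS = {"nist.gov", "owasp.org"}


@dataclass(frozen=True)
class RetrievedSource:
    url: str
    rank: int  # position in the retrieval results (0 = first)


def registrable_host(url: str) -> str:
    """Return the lowercased hostname without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def trust_tier(source: RetrievedSource) -> str:
    """Label a source 'approved' only if its host is an allowlisted domain or a subdomain of one."""
    host = registrable_host(source.url)
    if any(host == d or host.endswith("." + d) for d in TRUSTED_DOMAINS):
        return "approved"
    return "unverified"


def audit_retrieval_set(sources):
    """Return (rank, url, tier) rows ordered by rank so a reviewer can see what fed the answer."""
    ordered = sorted(sources, key=lambda s: s.rank)
    return [(s.rank, s.url, trust_tier(s)) for s in ordered]
```

An audit row such as `(0, "https://example-blog.test/cure", "unverified")` at the top of the retrieval set is the kind of signal a reviewer would want surfaced before the answer is trusted.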

Why fabricated authority signals are effective in GEO attacks

The attack pattern depends on familiar persuasion cues. Fake statistics, invented quotations, fabricated citations, and editorial-style endorsements make a page look more credible to both retrieval engines and downstream summarizers. The content does not need hidden instructions or malware. It only needs to resemble trustworthy reporting closely enough that the model extracts it as if it were relevant evidence. That is why content quality signals can be abused at scale. Retrieval filters should not rely on surface credibility markers alone, because those markers are easy to manufacture.

Practical implication: Validate source provenance and corroboration instead of assuming fluent, structured content is trustworthy.
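
As an illustration of corroboration, a claim can be held back unless it is supported by pages from more than one distinct host. This is a rough sketch under stated assumptions: the function names and the `min_domains` threshold are hypothetical, and hostname-level independence is a weak proxy, since an attacker can publish the same claim on several domains.

```python
from urllib.parse import urlparse


def independent_domains(urls):
    """Collect the distinct hostnames (minus a leading 'www.') behind a list of URLs."""
    hosts = set()
    for url in urls:
        host = (urlparse(url).hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]
        if host:
            hosts.add(host)
    return hosts


def is_corroborated(supporting_urls, min_domains=2):
    """Treat a claim as corroborated only if enough distinct hosts support it."""
    return len(independent_domains(supporting_urls)) >= min_domains
```

Two URLs from the same site count once, so a single page repeated under different paths does not satisfy the check. Ownership or editorial independence would still have to be verified by other means.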

Why AI assistants are vulnerable to harmful answer shaping

AI assistants change the user’s verification model. In search, the user sees multiple links and can compare sources. In GEO-driven answers, the system chooses the narrative and often compresses source diversity into a single response. That makes harmful claims easier to deliver, especially when the user trusts the model’s tone more than the underlying evidence. The risk is amplified in health, finance, and other high-consequence domains where users are primed to act on the answer. The more an assistant can influence action, the more tightly source integrity and answer provenance need to be governed.

Practical implication: Require visible source provenance and human validation for any AI answer that can influence decisions or actions.
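
A gate of this kind can be sketched as a small policy check that sits between answer generation and any action. Everything named here is an assumption for illustration: the `Answer` fields, the `HIGH_CONSEQUENCE` set, and the reason strings are hypothetical, and a real deployment would wire the approval flag to an actual review step.

```python
from dataclasses import dataclass, field

# Assumed set of high-consequence domains; extend to match local policy.
HIGH_CONSEQUENCE = {"health", "finance"}


@dataclass
class Answer:
    text: str
    domain: str
    source_urls: list = field(default_factory=list)
    human_approved: bool = False


def may_trigger_action(answer: Answer):
    """Return (allowed, reason). Block answers with no provenance or missing human validation."""
    if not answer.source_urls:
        return False, "no visible source provenance"
    if answer.domain in HIGH_CONSEQUENCE and not answer.human_approved:
        return False, "human validation required"
    return True, "ok"
```

Keeping the check separate from the model call means the decision to act can be logged and audited independently of how the answer was produced.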

Threat narrative

Attacker objective: The attacker wants a fabricated or harmful claim to appear inside AI-generated answers and influence user decisions without compromising any internal system.

Entry occurs when an attacker publishes a public webpage containing a harmful claim and optimizes it for Generative Engine Optimization so retrieval systems are more likely to ingest it.
Escalation happens when the page is selected as a retrieved source and the model incorporates the claim into a fluent, confident answer that appears authoritative to the user.
Impact is realised when the user acts on the AI-generated answer, treating the manipulated claim as credible guidance in a high-stakes domain.

DeepSeek breach: exposed 1M+ log lines and sensitive secret keys.
Schneider Electric credentials breach: exposed credentials gave attackers access to Schneider Electric Jira, from which they exfiltrated 40GB.

Read our 52 NHI Breaches Analysis report for a comprehensive view of breaches impacting Non-Human Identities including AI Agents.

NHI Mgmt Group analysis

Retrieval content is now a trust boundary, not just a ranking surface. GEO turns public web pages into inputs that can shape AI answers before any human user sees the underlying sources. That breaks the older assumption that search optimization only affects discoverability, not decision quality. For identity programmes, the implication is that answer provenance must be treated as part of the control environment, especially where agents or assistants can trigger action.

Source credibility signals can be manufactured faster than governance can inspect them. The article’s fake statistics, fabricated citations, and editorial-style endorsements show that fluent content can imitate authority without any account compromise. This is a control gap in content trust, but also a broader programme issue because AI systems often over-weight polished structure. Practitioners need to recognise that human-readable trust signals are not equivalent to verified provenance.

Human-agent trust exploitation is the core failure mode here. OWASP ASI09 and LLM09 are directly relevant because the attack succeeds when users and systems trust model output more than source quality. The named concept is answer-level trust poisoning: content is engineered so that the AI, not the attacker’s site, becomes the delivery vehicle for the false claim. That changes the governance conversation from site security to answer integrity.

AI agents make this pattern more dangerous because they can act on the answer, not just read it. A human can pause, compare sources, and challenge the result. An autonomous or semi-autonomous workflow may route the answer into a tool, workflow, or decision process immediately. That means retrieval trust, source validation, and action gating need to be considered together, not as separate problems.

The practical boundary is provenance, not publication volume. The article shows that a single well-placed page can influence many outputs if the retrieval layer trusts it. That is why governance teams should focus on how sources enter the answer path, how they are corroborated, and whether the model can explain why a source was used. The right response is to govern answer inputs as rigorously as access to systems.

From our research:
85% of organisations lack full visibility into third-party vendors connected via OAuth apps, according to the State of Non-Human Identity Security.
That same research found only 1.5 out of 10 organisations are highly confident in their ability to secure NHIs, which shows how quickly trust gaps become governance gaps in machine-facing systems.
For a practical next step, review Ultimate Guide to NHIs, Key Research and Survey Results for the wider identity risk picture behind external access and delegated trust.

What this signals

Answer integrity will become a governance control, not a content-quality concern. As assistants move from passive search to decision support, the security question shifts from whether a page is visible to whether the model should trust it at all. Programmes that already struggle with delegated access and third-party visibility should expect the same blind spots to appear in retrieval-driven AI workflows.

Answer-level trust poisoning: GEO shows that a single public page can influence many outputs if the retrieval layer treats it as authoritative. That means AI governance teams need inventory, provenance, and escalation paths for external knowledge sources, just as IAM teams need lifecycle controls for third-party access.

The operational signal to watch is whether your assistants can explain source selection in a way a practitioner can audit. If they cannot, the organisation is already relying on hidden trust decisions. That is where controls from NIST SP 800-63 Digital Identity Guidelines and source-validation discipline start to matter in AI-adjacent workflows.

For practitioners

Map the retrieval trust boundary: Identify every external source class your assistants can ingest, then classify which of those sources can influence high-stakes decisions without human review. Pay special attention to pages that are optimized for citation, structured summaries, or answer extraction.
Add provenance checks before answer reuse: Require the assistant to surface source identity, retrieval order, and corroborating evidence before any answer is reused in a workflow, ticket, or case note. If the source cannot be verified, the answer should be treated as untrusted.
Gate action on high-consequence outputs: For health, finance, access, or operational decisions, separate answer generation from action execution so a human or policy check occurs before the output can trigger a tool call or business decision.
Test for answer-level manipulation: Red-team the retrieval path with fabricated but plausible pages, then measure whether the model repeats the claim, cites the page, or uses it to justify a recommendation. Include structured pages, listicles, and Q&A formats in the test set.
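
The measurement step in the last item can be sketched as a small scoring function over answers collected during a red-team run. This is only an illustration: `ObservedAnswer`, `score_manipulation`, and the metric names are hypothetical, the answers are assumed to have been captured from the assistant under test by other means, and exact substring matching will miss paraphrased repetitions of the planted claim.

```python
from dataclasses import dataclass


@dataclass
class ObservedAnswer:
    text: str         # the assistant's answer, captured during the test
    cited_urls: list  # URLs the assistant cited for that answer


def normalise(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences do not hide a match."""
    return " ".join(text.lower().split())


def score_manipulation(answers, planted_claim: str, planted_url: str):
    """Return the share of answers that repeat the planted claim and the share that cite the planted page."""
    total = len(answers)
    if total == 0:
        return {"repeat_rate": 0.0, "citation_rate": 0.0}
    claim = normalise(planted_claim)
    repeated = sum(claim in normalise(a.text) for a in answers)
    cited = sum(planted_url in a.cited_urls for a in answers)
    return {"repeat_rate": repeated / total, "citation_rate": cited / total}
```

Tracking the two rates separately is a deliberate choice: an answer that repeats the claim without citing the page is harder for a user to trace back, so it may deserve more attention than the citation rate alone suggests.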

Key takeaways

GEO turns public web pages into answer-shaping inputs, which means AI reliability now depends on source trust as much as model quality.
The strongest attacks do not need hidden prompts or system access, only fluent content that looks authoritative enough for retrieval to reuse.
Practitioners should govern source provenance, corroboration, and action gating before AI answers are allowed to influence decisions.

Standards & Framework Alignment

This section maps relevant standards and security frameworks to the operational risks and controls described in this guidance.

OWASP Agentic AI Top 10 addresses the attack and risk surface, while NIST AI RMF and NIST CSF 2.0 set the governance and control requirements practitioners need to meet.

Framework	Control / Reference	Relevance
OWASP Agentic AI Top 10	A5	Retrieval trust and answer manipulation are core agentic AI risks.
NIST AI RMF		Governance and trustworthiness apply directly to AI-assisted decision paths.
NIST CSF 2.0	PR.AA-01	Identity-aware access and provenance controls support trusted AI operations.

Map AI source handling to governance processes and require auditable provenance for external inputs.

Key terms

Generative Engine Optimization: Generative Engine Optimization is the practice of shaping content so AI systems are more likely to cite, summarise, or rely on it in a generated answer. In security terms, it is an influence technique aimed at the retrieval layer, where content structure and credibility cues affect what the model treats as usable evidence.
Retrieval Trust Boundary: The retrieval trust boundary is the point at which external content becomes part of the evidence an AI system uses to produce an answer. It matters because the model may treat public web pages, internal documents, and approved sources as interchangeable unless governance rules define which inputs are trusted and why.
Answer-Level Trust Poisoning: Answer-level trust poisoning is the deliberate shaping of source material so an AI system outputs a false or harmful claim with convincing authority. The attack does not need model access. It relies on the assistant trusting manipulated retrieval content enough to present it as credible guidance.
Human-Agent Trust Exploitation: Human-Agent Trust Exploitation is the failure mode where people place too much confidence in AI output and act on it without sufficient verification. The risk rises when the system sounds authoritative, when the source trail is hidden, and when the answer affects decisions that carry real-world impact.

Deepen your knowledge

NHI governance, agentic AI identity, and machine identity lifecycle are core topics in our NHI Foundation Level course, the industry's only accredited NHI security programme. If you are responsible for identity security strategy or governance in your organisation, it is worth exploring.

This post draws on content published by Lasso Security: Exploiting GEO to Push Harmful Claims into AI-Generated Answers. Read the original.

NHIMG Editorial Note
Published by the NHIMG editorial team on 2026-06-28.
NHI Mgmt Group — the independent authority on Non-Human Identity, IAM, and Agentic AI security. nhimg.org