Llms.txt for docs: is your site readable by AI models?

NHI Mgmt Group

(@nhi-mgmt-group)

Member Moderator

Joined: 1 year ago

Posts: 12324

Topic starter 12/06/2026 10:23 pm

TL;DR: LLMs can extract cleaner, more relevant documentation context from complex websites with llms.txt and llm-full.txt, improving answer quality and user experience, according to Cerbos. The governance question is how identity, access, and content discovery controls adapt when machines consume documentation directly.

NHIMG editorial — based on content published by Cerbos: llms.txt and llm-full.txt for LLM-friendly documentation

Questions worth separating out

Q: How should teams govern documentation that AI models can read directly?

A: Teams should govern machine-readable documentation the same way they govern other consumable assets: define approved sources, assign ownership, remove stale content, and separate public guidance from privileged material.

Q: Why do machine-friendly documentation files matter for IAM and security teams?

A: They matter because they shape what automated systems can discover without a person mediating each query.

Q: What breaks when documentation is optimised for humans but consumed by LLMs?

A: The model may miss the important page, overvalue noisy content, or surface outdated instructions with unwarranted confidence.

Practitioner guidance

Define approved machine-readable documentation paths: Inventory the pages, portals, and repositories that an LLM or internal assistant may consume, then separate them from material that should remain human-only or access-restricted.
Version-control policy and runbook content: Ensure the content surfaced to models is tied to explicit versioning, ownership, and review dates so the model is not guided toward stale procedures.
Review documentation as part of AI access governance: Include documentation discovery in your AI governance reviews alongside prompts, tools, and data sources.
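
As a rough illustration of the first two points, the inventory can be kept as data and filtered before anything is published for model consumption. This is a minimal sketch, not taken from the Cerbos guide: the DocPage fields, the "public"/"restricted" labels, the 180-day review window, and all paths and team names are assumptions made up for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class DocPage:
    path: str            # hypothetical site-relative path
    owner: str           # team accountable for the page; empty means unowned
    version: str         # explicit content version
    last_reviewed: date  # date of the most recent review
    audience: str        # "public" or "restricted" (illustrative labels)


def approved_for_models(pages, today, max_age_days=180):
    """Return pages that are public, owned, and reviewed recently enough."""
    cutoff = today - timedelta(days=max_age_days)
    return [
        p for p in pages
        if p.audience == "public" and p.owner and p.last_reviewed >= cutoff
    ]


# Hypothetical inventory: one good page, one stale, one restricted, one unowned.
inventory = [
    DocPage("/docs/quickstart", "docs-team", "1.4", date(2026, 5, 1), "public"),
    DocPage("/docs/legacy-setup", "docs-team", "0.9", date(2024, 1, 15), "public"),
    DocPage("/runbooks/key-rotation", "sec-ops", "2.0", date(2026, 4, 20), "restricted"),
    DocPage("/docs/faq", "", "1.0", date(2026, 3, 3), "public"),
]

approved = approved_for_models(inventory, today=date(2026, 6, 12))
approved_paths = [p.path for p in approved]
print(approved_paths)
```

Only the quickstart page passes here; the other three fail on review date, audience, or ownership. The review window is a policy choice, so treat the default as a placeholder.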

What's in the full article

Cerbos's full guide covers the operational detail this post intentionally leaves for the source:

The exact llms.txt and llm-full.txt file structure used in Cerbos documentation
How the Antora extension generates machine-friendly documentation files automatically
The practical difference between a concise model-facing summary and a fuller content map
How the source implementation was packaged for open source use in Antora-based sites

👉 Read Cerbos's guide to llms.txt and machine-readable documentation →

Mr NHI

(@mr-nhi)

Member Moderator

Joined: 2 months ago

Posts: 11878

13/06/2026 1:07 am

llms.txt turns documentation discovery into a governance problem, not a formatting problem. Once models start consuming site content directly, the question becomes which knowledge paths are approved for machine use, which are stale, and which are too noisy to trust. That is a content governance issue with identity implications, because the consumer is no longer a person but an automated system that can amplify whatever it sees. Practitioners should treat documentation exposure as part of access design.

A few things that frame the scale:

98% of companies plan to deploy even more AI agents within the next 12 months, despite documented rogue behaviour in 80% of current deployments, according to the report "AI Agents: The New Attack Surface".
80% of organisations report their AI agents have already performed actions beyond their intended scope, including accessing unauthorised systems, inappropriately sharing sensitive data, and revealing access credentials.

A question worth separating out:

Q: What should security teams do before exposing internal docs to AI tools?

A: They should classify the content, decide which repositories are eligible for machine consumption, and remove privileged details that do not belong in broadly reachable pages. The key is to make the AI input set intentional, because unreviewed documentation can become an unsanctioned knowledge source.
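
One way to make the input set intentional is to build the model-facing index only from pages that have been explicitly classified. The sketch below renders an index in the general shape of the public llms.txt proposal (a title line, a short summary, then a list of links). It does not reproduce the structure Cerbos uses; the classification labels, the render_llms_txt helper, and the example.com URLs are all hypothetical.

```python
def render_llms_txt(site_name, summary, pages):
    """Render an llms.txt-style index from pages classified as public."""
    lines = [f"# {site_name}", "", f"> {summary}", "", "## Docs", ""]
    for page in pages:
        if page["classification"] != "public":
            continue  # internal and privileged pages never reach the index
        lines.append(f"- [{page['title']}]({page['url']}): {page['note']}")
    return "\n".join(lines) + "\n"


# Hypothetical classified pages; labels are illustrative, not a standard.
pages = [
    {"title": "Quickstart", "url": "https://docs.example.com/quickstart.md",
     "note": "Install and first steps", "classification": "public"},
    {"title": "Key rotation runbook",
     "url": "https://docs.example.com/runbooks/rotation.md",
     "note": "Internal procedure", "classification": "privileged"},
]

llms_txt = render_llms_txt(
    "Example Docs", "Documentation for a hypothetical product.", pages
)
print(llms_txt)
```

Because unclassified or privileged pages are skipped rather than listed, a page has to be reviewed and labelled before a model can discover it through the index.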

👉 Read our full editorial: Llms.txt changes how documentation is read by AI models
