llms.txt explained: the AI-era site map
Last updated: May 18, 2026
TL;DR
/llms.txt is a single Markdown file at your domain root that hands AI assistants a curated map of your most important pages, in the order you want them read. It does not replace robots.txt or sitemap.xml — it complements them. Adoption is still early, the cost is near-zero, and the format is strict but tiny. Add one, keep it short, point it at your best pages.
What llms.txt actually is
llms.txt is a community proposal (llmstxt.org) for a Markdown file served at https://yourdomain.com/llms.txt. Where robots.txt says which bots may crawl and sitemap.xml lists every URL for indexing, llms.txt does something neither does: it gives a language model a short, human-curated tour of the pages that actually matter, with one-line context on each. Think of it as the README you would hand an analyst who has thirty seconds to understand your site.
The format (it is strict, and small)
A valid file is plain Markdown with a specific shape. Parsers are unforgiving about the first two elements, so get them right:
- An H1 on the first non-blank line — your site or product name.
- A blockquote summary directly under it — one or two sentences on what the site is and who it is for.
- Optional free-text paragraphs giving extra context (no headings here).
- One or more H2 sections (e.g. ## Docs, ## Product), each a bulleted list of links in the form - [Title](https://url): one-line description.
Common mistakes our auditor flags: no H1 on the first non-blank line, multiple H1s, an H2 section with no link bullets, or headings used before the first section. Keep it boring and exact.
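The shape described above can be sketched as a small checker. This is a minimal illustration, not our auditor's implementation: the function name check_llms_txt, the SAMPLE file, and the example.com URLs are all hypothetical, and the rules are only the ones listed in this section.

```python
import re

# Hypothetical sample file that follows the shape described in this section.
SAMPLE = """\
# Example Co

> Example Co makes widgets for small teams. This file lists our key pages.

Optional context paragraph with no headings.

## Docs

- [Quickstart](https://example.com/docs/quickstart): Install and run in five minutes
- [API reference](https://example.com/docs/api): Every endpoint, with examples

## Product

- [Pricing](https://example.com/pricing): Plans and limits
"""

# A bullet of the form "- [Title](https://url)", optionally followed by a description.
LINK_BULLET = re.compile(r"^[-*]\s+\[[^\]]+\]\([^)\s]+\)")


def check_llms_txt(text: str) -> list[str]:
    """Return a list of problems; an empty list means the shape looks right."""
    problems = []
    # Ignore blank lines so "first non-blank line" is simply content[0].
    content = [ln.rstrip() for ln in text.splitlines() if ln.strip()]

    if not content or not content[0].startswith("# "):
        problems.append("first non-blank line is not an H1")
    if sum(1 for ln in content if ln.startswith("# ")) > 1:
        problems.append("more than one H1")
    if len(content) < 2 or not content[1].startswith(">"):
        problems.append("no blockquote summary directly under the H1")

    # Walk the file, counting link bullets under each H2 section.
    sections: list[list] = []  # [section name, link count] pairs
    for ln in content:
        if ln.startswith("## "):
            sections.append([ln[3:].strip(), 0])
        elif re.match(r"^#{3,6}\s", ln) and not sections:
            problems.append(f"heading before the first section: {ln!r}")
        elif sections and LINK_BULLET.match(ln):
            sections[-1][1] += 1

    if not sections:
        problems.append("no H2 sections")
    for name, count in sections:
        if count == 0:
            problems.append(f"section {name!r} has no link bullets")
    return problems


if __name__ == "__main__":
    print(check_llms_txt(SAMPLE) or "shape looks OK")
```

The checker treats any line starting with "# " as an H1, so a file that embeds fenced code containing such lines would need a smarter parser.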
Where it goes
Domain root, exactly /llms.txt, served as text/plain or text/markdown, HTTP 200, no auth wall. Subpaths and meta tags do not count. Some sites also publish a fuller /llms-full.txt with expanded content; that is optional, and only worth doing if you genuinely have the depth to justify it.
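A quick way to confirm the serving rules is to request the file and inspect the status code and Content-Type header. The sketch below uses only the Python standard library; the function names, the User-Agent string, and the timeout are assumptions chosen for illustration.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen

ALLOWED_TYPES = {"text/plain", "text/markdown"}


def check_response(status: int, content_type: str) -> list[str]:
    """Compare a status code and Content-Type header against the serving rules."""
    problems = []
    if status != 200:
        problems.append(f"expected HTTP 200, got {status}")
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type not in ALLOWED_TYPES:
        problems.append(f"unexpected Content-Type: {media_type or 'missing'}")
    return problems


def check_url(domain: str, timeout: float = 10.0) -> list[str]:
    """Fetch https://<domain>/llms.txt and report serving problems.

    Note: urlopen follows redirects, so this does not detect a file that
    only resolves after a redirect. Network failures raise URLError.
    """
    url = f"https://{domain}/llms.txt"
    req = Request(url, headers={"User-Agent": "llms-txt-check/0.1"})  # hypothetical UA
    try:
        with urlopen(req, timeout=timeout) as resp:
            return check_response(resp.status, resp.headers.get("Content-Type", ""))
    except HTTPError as err:
        # 4xx/5xx responses (including 401/403 auth walls) land here.
        return check_response(err.code, err.headers.get("Content-Type", ""))
```

Calling check_url("yourdomain.com") returns an empty list when the file is reachable with an accepted content type.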
How it differs from robots.txt and sitemap.xml
They solve different problems and you want all three. robots.txt is access control — see our companion post on robots.txt for AI crawlers. sitemap.xml is exhaustive discovery for indexers. llms.txt is editorial: a small, ranked, described shortlist for a model that will only ever skim. A page can be in your sitemap, allowed in robots.txt, and still be invisible to assistants because nothing told them it was the page that matters. That is the gap llms.txt fills.
Is it worth adding in 2026?
Honest answer: no major vendor enforces llms.txt, and adoption is still early. But the cost is a few minutes for a file that never changes much, and the downside is zero. As assistants increasingly lean on curated signals over brute crawling, a clean llms.txt is cheap insurance with real upside. It is also a forcing function — writing the one-line descriptions makes you decide what your best pages actually are, which is the same discipline that gets you cited by AI assistants in the first place.
Generate it, then verify it
You can write llms.txt by hand from your sitemap in fifteen minutes, or generate a spec-compliant draft and hand-tune it. Either way, verify the result the way an AI crawler sees it: run a free audit, and it checks the file's presence, the H1/blockquote shape, and that every section has real links (the same rules above). Pair it with IndexNow so changes propagate fast, and see the plans if you want the generator and monitoring.
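If you would rather script the draft than type it, rendering the file from a hand-picked shortlist takes a few lines. This is a rough sketch, not our generator: render_llms_txt, its parameters, and the example.com entries are hypothetical, and it assumes a single-line summary. The curation (which pages, in which order, with which descriptions) still has to come from you.

```python
def render_llms_txt(title: str, summary: str, sections: dict) -> str:
    """Render an llms.txt draft.

    sections maps a section name to a list of (link_title, url, description)
    tuples; dict order becomes section order in the file.
    """
    lines = [f"# {title}", "", f"> {summary}", ""]
    for name, links in sections.items():
        lines.append(f"## {name}")
        lines.append("")
        for link_title, url, description in links:
            lines.append(f"- [{link_title}]({url}): {description}")
        lines.append("")
    # Collapse trailing blank lines to a single final newline.
    return "\n".join(lines).rstrip("\n") + "\n"


if __name__ == "__main__":
    draft = render_llms_txt(
        "Example Co",
        "Widgets for small teams.",
        {
            "Docs": [
                ("Quickstart", "https://example.com/docs/quickstart",
                 "Install and run in five minutes"),
            ],
        },
    )
    print(draft)
```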
FAQ
Will llms.txt hurt my SEO?
No. It is an additional file, not a directive that changes how search engines crawl or rank you. Worst case a crawler ignores it; best case assistants get a cleaner picture of your site.
How many links should it have?
Few enough to stay curated — typically 2–4 sections of 3–10 links. It is a shortlist, not a second sitemap. If everything is important, nothing is.
Does it need updating often?
Only when your best pages change. It is intentionally low-maintenance — that is the point. Re-audit after major site changes to make sure the links still resolve.
Want this checked on your domain? Run a free audit or take the 2-minute readiness quiz.