
llms.txt Guide

What llms.txt is, what it can and cannot do, and how to create a useful file without treating it as magic SEO.

What llms.txt is

llms.txt is a proposed plain-text file that gives AI tools a curated map of a website's most useful resources. It usually lives at /llms.txt and links to documentation, guides, product references, policies, and other pages that explain the site clearly.

It is best understood as guidance, not a command: robots.txt controls crawler permissions, sitemap.xml lists canonical URLs for discovery, and llms.txt tries to summarize what matters most for language-model consumption.

What llms.txt can help with

For sites with deep documentation or editorial libraries, llms.txt can make the best resources easier to identify. Instead of forcing a tool to infer your important pages from navigation alone, you can list the pages that represent your strongest explanations.

A useful llms.txt file is curated. It should not be a dump of every URL. Group resources by topic, use descriptive link text, and prefer canonical, maintained guides over thin posts or temporary pages.

What llms.txt cannot do

llms.txt does not guarantee crawling, indexing, ranking, citation, or inclusion in AI answers. Many AI systems simply ignore it, and search engines do not treat it as a replacement for technical SEO fundamentals.

Do not use llms.txt as an excuse to leave pages uncrawlable, duplicated, or thin. If your public routes return an empty shell, a beautiful llms.txt file will not solve the underlying problem.

A simple example

A practical file might include:

# Web Traffic Agents

Independent editorial publication about traffic intelligence, SEO, AI search visibility, crawler behavior, analytics, content strategy, and conversion.

## Core guides
- AI Search Visibility: https://webtrafficagents.com/ai-search-visibility/
- AI Crawler Optimization: https://webtrafficagents.com/ai-crawler-optimization/
- Technical SEO Audit: https://webtrafficagents.com/technical-seo-audit/
- Crawler Analytics: https://webtrafficagents.com/crawler-analytics/

## Policies
- About: https://webtrafficagents.com/about/
- Editorial Policy: https://webtrafficagents.com/editorial-policy/

That example is short, stable, and editorially meaningful. It tells a tool what the site is about and where the strongest explanations live.
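Part of what makes the format useful is how trivial it is to consume. The sketch below parses a file like the one above into sections and links; the parsing rules are an assumption based on the informal markdown convention shown, not a formal specification:

```python
import re

def parse_llms_txt(text):
    """Parse llms.txt-style text into {section: [(label, url), ...]}.

    Assumes the informal convention above: '## ' section headings
    followed by '- Label: URL' link lines. Not a formal spec.
    """
    sections = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("## "):
            current = line[3:]
            sections[current] = []
        elif line.startswith("- ") and current is not None:
            match = re.match(r"- (.+?):\s*(https?://\S+)", line)
            if match:
                sections[current].append((match.group(1), match.group(2)))
    return sections

example = """# Web Traffic Agents

## Core guides
- AI Search Visibility: https://webtrafficagents.com/ai-search-visibility/

## Policies
- About: https://webtrafficagents.com/about/
"""

print(parse_llms_txt(example))
```

Because the file is this easy to parse, anything malformed or bloated in it is also easy for a consuming tool to notice, which is one more argument for keeping it short.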

Implementation checklist

  • Create /llms.txt as a static text file if your platform supports it.
  • Link only canonical public URLs.
  • Keep descriptions factual and concise.
  • Update it when pillar pages change.
  • Do not list private, duplicate, draft, or low-quality URLs.
  • Keep sitemap.xml and robots.txt correct first.
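Several checklist items can be linted mechanically before publishing. The sketch below applies three of them, canonical https only, no duplicates, and no obviously non-public paths, to a candidate link list; the "suspect path" patterns are illustrative assumptions, not a standard:

```python
def lint_llms_urls(urls):
    """Return (url, problem) pairs for checklist violations.

    Checks are illustrative: https-only, no duplicates, and no
    paths that look private or temporary (pattern list is an
    assumption, not part of any spec).
    """
    problems = []
    seen = set()
    suspect = ("/draft", "/private", "/tmp", "/staging", "?preview=")
    for url in urls:
        if not url.startswith("https://"):
            problems.append((url, "not canonical https"))
        if url in seen:
            problems.append((url, "duplicate entry"))
        seen.add(url)
        if any(token in url for token in suspect):
            problems.append((url, "looks non-public"))
    return problems

urls = [
    "https://webtrafficagents.com/ai-search-visibility/",
    "http://webtrafficagents.com/about/",
    "https://webtrafficagents.com/ai-search-visibility/",
    "https://webtrafficagents.com/draft/new-guide/",
]
for url, problem in lint_llms_urls(urls):
    print(f"{problem}: {url}")
```

Running a lint like this as part of a publish step keeps the file honest without requiring anyone to eyeball every line on every change.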

How to choose what belongs in llms.txt

The file should represent the best explanation of your site, not the largest possible URL list. Start with durable resources: pillar guides, documentation, reference pages, topic hubs, editorial standards, and support pages that explain how the organization works. Leave out thin articles, tag archives, search results, landing pages with little context, and temporary campaign pages.

Write descriptions as if a busy researcher were using the file. "Technical SEO Audit" is more useful than "Page 3." Group links by topic so a tool or person can understand the map quickly. When a guide is replaced, update llms.txt at the same time you update internal links and sitemap.xml.

For editorial sites, the strongest pattern is a concise overview, then core guides, then topic indexes, then policies. That mirrors how a human would learn whether the site is a credible source.

llms.txt and governance

Because llms.txt is public, review it with the same care as navigation. Do not include private documentation, unpublished URLs, client-only resources, or anything you would not want crawlers and competitors to notice. If legal or licensing concerns affect AI access, coordinate llms.txt with robots.txt and terms pages instead of letting each file tell a different story.

Track ownership. Someone should be responsible for updating the file when pillar pages launch, categories change, or a major guide is retired. An outdated llms.txt can send tools to weak or redirected pages, which defeats the purpose of curation.

How llms.txt fits with other files

Think of robots.txt, sitemap.xml, and llms.txt as three different signals: robots.txt says what compliant crawlers may request, sitemap.xml says which canonical URLs you want discovered, and llms.txt says which resources best explain the site. They should agree, but they should not duplicate each other mechanically.

If a URL is blocked in robots.txt, do not feature it in llms.txt. If a guide is important enough to list in llms.txt, it probably belongs in internal links and the sitemap too. If a page is temporary, thin, or not meant to represent the publication, leave it out.
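The "do not feature blocked URLs" rule can be checked offline with Python's standard urllib.robotparser, feeding it the robots.txt rules directly rather than fetching them; the rules and URLs below are hypothetical:

```python
from urllib.robotparser import RobotFileParser

def blocked_in_robots(robots_lines, urls, agent="*"):
    """Return the llms.txt URLs that robots.txt disallows for the agent."""
    parser = RobotFileParser()
    parser.parse(robots_lines)  # parse rule lines directly, no network fetch
    return [url for url in urls if not parser.can_fetch(agent, url)]

# Hypothetical robots.txt rules and candidate llms.txt links.
robots = [
    "User-agent: *",
    "Disallow: /private/",
]
candidates = [
    "https://webtrafficagents.com/ai-search-visibility/",
    "https://webtrafficagents.com/private/client-notes/",
]
print(blocked_in_robots(robots, candidates))
```

Any URL this returns either needs to come out of llms.txt or needs a deliberate robots.txt change, so the two files stop telling different stories.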

Review cadence

Review llms.txt quarterly or whenever the editorial map changes. Check every URL for a 200 response, canonical consistency, current copy, and clear page purpose. Remove guides that have been superseded and add new pillars only after they are polished enough to represent the site.
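One way to make that quarterly review mechanical is to record the HTTP status of each listed URL, from whatever crawler or monitoring tool you already run, and flag anything that is no longer a clean 200. The status data below is invented for illustration:

```python
def review_report(statuses):
    """Given {url: http_status}, return the URLs needing attention.

    Anything other than a direct 200 (redirects, 404s, server
    errors) is flagged for replacement or removal from llms.txt.
    """
    return {url: status for url, status in statuses.items() if status != 200}

# Invented status data for illustration.
statuses = {
    "https://webtrafficagents.com/ai-search-visibility/": 200,
    "https://webtrafficagents.com/technical-seo-audit/": 301,
    "https://webtrafficagents.com/old-guide/": 404,
}
print(review_report(statuses))
```

Redirects deserve the same scrutiny as errors here: a 301 in llms.txt means the file is pointing at a URL you have already decided is no longer canonical.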

llms.txt may become more useful over time, but it should sit on top of good publishing hygiene. Treat it like a curated reading list for machines and humans, not a secret lever.

Practical examples

  • List your best maintained guides and documentation pages instead of every URL on the site.
  • Group pages by topic so AI tools can understand the editorial map quickly.
  • Keep llms.txt aligned with sitemap.xml but do not use it as a replacement.

FAQ

What is llms.txt?

llms.txt is a proposed text file that summarizes important site resources for AI systems and tools. It is not an official ranking standard.

Does llms.txt replace sitemap.xml?

No. Sitemap.xml remains the standard way to list canonical URLs for search discovery. llms.txt is best treated as optional guidance.

Should every site have llms.txt?

Sites with documentation, editorial libraries, or technical resources may benefit. Small brochure sites should fix normal crawlability first.
