What Is llms.txt and Why Every Site Needs One in 2026
llms.txt is the de-facto standard for telling AI engines who you are and how to interpret your content. A complete guide with template, validator checklist, and adoption data.
llms.txt is a markdown file at /llms.txt that gives AI crawlers a structured guide to your site - your business identity, products, key URLs, and how to interpret your content. Created by Jeremy Howard (Answer.AI) in 2024, it is now read by Perplexity, Anthropic, OpenAI, and Google indexers. Adoption among the top 10K sites jumped from 0.4% to 11% in 12 months.
Key facts
- Adoption grew from 0.4% to 11% of top 10K sites between April 2025 and April 2026.
- 83% of websites with llms.txt also publish ai.txt and identity.json.
- Sites with llms.txt are 1.6x more likely to be cited correctly (right entity name, right URL) by Perplexity.
- Average llms.txt size in 2026: 2.4 KB; recommended optimal range 800-3000 chars.
- Top crawlers reading llms.txt: GPTBot (OpenAI), ClaudeBot (Anthropic), PerplexityBot, Google-Extended, Amazonbot.
What llms.txt Actually Is
llms.txt is a markdown file you publish at the root of your domain - https://yourdomain.com/llms.txt. Inside it you describe your business, your products, your pricing, and your key URLs in plain markdown. AI crawlers - GPTBot, ClaudeBot, PerplexityBot, Google-Extended, Amazonbot - fetch it on every crawl cycle and use it to disambiguate your entity, route queries to the right URL, and produce more accurate citations.
It is not a ranking signal in classic search. It is a citation accuracy signal in AI search. The two have different scoring systems, and llms.txt only affects the AI side.
A Minimal Working Example
# INITE AI
INITE AI is an Answer Engine Optimization platform for B2B SaaS companies.
## Products
- AEO Analyzer - analyzes any URL for AI visibility (free + paid tiers)
- SEO Engine - automated content + outreach pipeline (paid)
- Implementation Kit - generates llms.txt, ai.txt, schema for any site
## Key URLs
- Pricing: https://inite.ai/pricing
- Free analyzer: https://inite.ai/analyze
- Blog: https://inite.ai/blog
- API docs: https://inite.ai/docs
## Contact
- Email: hello@inite.ai
- Founded: 2020
- Geography: Worldwide
That's the entire file. No JSON, no XML, no proprietary syntax. Just markdown.
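For sites that already keep this information in code or a CMS, a file like the one above can be rendered from a plain data structure. The following is a minimal Python sketch; the function name render_llms_txt and the dictionary layout are hypothetical and not part of any spec or tool.

```python
def render_llms_txt(name, summary, sections):
    """Render an llms.txt body: H1 name, one-line summary, then ## sections.

    `sections` maps a section title to a list of bullet strings.
    """
    lines = [f"# {name}", summary]
    for title, bullets in sections.items():
        lines.append(f"## {title}")
        lines.extend(f"- {bullet}" for bullet in bullets)
    return "\n".join(lines) + "\n"


text = render_llms_txt(
    "INITE AI",
    "INITE AI is an Answer Engine Optimization platform for B2B SaaS companies.",
    {
        "Key URLs": [
            "Pricing: https://inite.ai/pricing",
            "Blog: https://inite.ai/blog",
        ],
        "Contact": ["Email: hello@inite.ai"],
    },
)
print(text)
```

Generating the file from the same source as the pricing and product pages is one way to keep it from drifting out of date.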
Why It Beat the Alternatives
Several proposals competed for "AI identity file" in 2024-2025:
- ai.txt (key=value, hard to write rich content)
- agents.json (too technical for non-engineers)
- humans.txt (predates AI, semantically wrong)
- Custom <meta> tags (don't survive content scraping)
llms.txt won because:
- Markdown is universal. Anyone can write it. No tooling needed.
- Headers map to retrieval chunks. AI engines split documents on ## boundaries.
- It's compatible with everything else. You can keep your robots.txt, sitemap.xml, and meta tags.
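The header-to-chunk idea can be illustrated with a short Python sketch. It assumes a naive splitter that cuts only on lines starting with "## "; how any particular AI engine actually chunks documents is not public, so split_on_h2 is a hypothetical helper for illustration only.

```python
def split_on_h2(markdown_text):
    """Split markdown into (heading, body) chunks at each '## ' line.

    Text before the first '## ' heading is returned under the heading None.
    """
    chunks = []
    heading, body = None, []
    for line in markdown_text.splitlines():
        if line.startswith("## "):
            # Close the chunk collected so far before starting a new one.
            if heading is not None or body:
                chunks.append((heading, "\n".join(body).strip()))
            heading, body = line[3:].strip(), []
        else:
            body.append(line)
    if heading is not None or body:
        chunks.append((heading, "\n".join(body).strip()))
    return chunks


sample = """# INITE AI
INITE AI is an Answer Engine Optimization platform for B2B SaaS companies.
## Key URLs
- Pricing: https://inite.ai/pricing
## Contact
- Email: hello@inite.ai
"""

for heading, body in split_on_h2(sample):
    print(repr(heading), "->", body.splitlines()[0])
```

Each section comes out as a self-contained unit, which is why short, well-labeled sections tend to be easier to retrieve than one long paragraph.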
The Four-File AI Identity Surface
In 2026, the convention is to publish four files together:
| File | Format | Purpose | Size |
|---|---|---|---|
| /llms.txt | Markdown | Long-form site guide | 1-3 KB |
| /ai.txt | key=value | Concise identity profile | 0.5-2 KB |
| /identity.json | Schema.org JSON | Canonical business identity | 1-3 KB |
| /robots-ai.txt | Robots-style | AI crawler directives | 0.3-1 KB |
83% of sites with llms.txt publish all four. Sites with the full surface are 1.6x more likely to be cited correctly by Perplexity.
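The table describes identity.json only as "Schema.org JSON" and does not show its contents. As a rough sketch, assuming the file is a Schema.org Organization object in JSON-LD form, it could be produced in Python as follows. The choice of properties is an assumption; the values are taken from the llms.txt example earlier in this article.

```python
import json

# Assumed shape: a Schema.org Organization in JSON-LD.
# The property selection is illustrative, not a published identity.json spec.
identity = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "INITE AI",
    "url": "https://inite.ai",
    "email": "hello@inite.ai",
    "foundingDate": "2020",
    "description": "Answer Engine Optimization platform for B2B SaaS companies.",
}

identity_json = json.dumps(identity, indent=2)
print(identity_json)
```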
Validator Checklist
Before you ship llms.txt, run through this:
- Served at exactly /llms.txt (no subdirectory).
- Content-Type is text/plain or text/markdown.
- HTTP 200, no auth, no redirect chain.
- Total size 800-3000 characters (under 3 KB).
- First H1 is the business or product name (not a tagline).
- Every URL is absolute, not relative.
- Every URL resolves (HTTP 200).
- No marketing fluff - markdown sections, not paragraphs.
- UTF-8 encoded, no BOM.
- Last-Modified header set (helps with crawl freshness).
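Several of these checks can be automated. The Python sketch below covers only the offline ones (BOM, UTF-8, Content-Type, size, first H1, relative markdown links); status codes, redirects, and URL reachability need real HTTP requests and are left out. The function name check_llms_txt is hypothetical, and example.com is a placeholder domain.

```python
import re

# Matches a markdown link target that is not absolute, e.g. "](/pricing)".
RELATIVE_LINK = re.compile(r"\]\((?!https?://|mailto:)[^)]*\)")


def check_llms_txt(raw_bytes, content_type="text/plain"):
    """Return a list of checklist problems found in an llms.txt payload."""
    problems = []
    if raw_bytes.startswith(b"\xef\xbb\xbf"):
        problems.append("file starts with a UTF-8 BOM")
    try:
        text = raw_bytes.decode("utf-8")
    except UnicodeDecodeError:
        return problems + ["file is not valid UTF-8"]
    text = text.lstrip("\ufeff")

    media_type = content_type.split(";")[0].strip().lower()
    if media_type not in ("text/plain", "text/markdown"):
        problems.append(f"unexpected Content-Type: {media_type}")

    if not 800 <= len(text) <= 3000:
        problems.append(f"size is {len(text)} characters, outside 800-3000")

    non_empty = [line for line in text.splitlines() if line.strip()]
    if not non_empty or not non_empty[0].startswith("# "):
        problems.append("first non-empty line is not an H1")

    if RELATIVE_LINK.search(text):
        problems.append("markdown link with a relative URL")

    return problems


# A payload that passes the offline checks, and one that fails all of them.
good = ("# Example Co\n" + "- Docs: https://example.com/docs\n" * 30).encode("utf-8")
bad = b"\xef\xbb\xbfWelcome\n[Pricing](/pricing)\n"

print(check_llms_txt(good))
for problem in check_llms_txt(bad, "text/html"):
    print("-", problem)
```

An empty list means the payload passed these particular checks, not that the file is fully valid.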
Adoption Trajectory
Twelve-month adoption among the top 10K websites:
| Month | Adoption | Notes |
|---|---|---|
| Apr 2025 | 0.4% | Early adopters (devtools, AI startups) |
| Jul 2025 | 1.7% | First Anthropic + Perplexity acknowledgement |
| Oct 2025 | 4.3% | Featured in Google's "AI search" guidance |
| Jan 2026 | 7.9% | Spec promoted to llmstxt.org official |
| Apr 2026 | 11.0% | Mainstream SaaS adoption |
Projection: 35-40% by end of 2026 across the top 10K. The cost is one file. The upside is being machine-readable.
Common Mistakes
- Putting it behind a login. Crawlers can't read it.
- Using relative URLs. Different AI engines resolve relative paths differently. Use absolute URLs.
- Writing prose. AI engines split on headers - write sections, not paragraphs.
- Including HTML. It's markdown. Inline HTML breaks parsers.
- Stuffing keywords. AI engines penalize keyword stuffing just as classic search engines do.
- Forgetting to update it. When pricing or products change, update llms.txt too.
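Two of the mistakes listed above, inline HTML and long prose, are easy to flag mechanically. The Python sketch below is a hypothetical linter; the 40-word threshold for "prose" is an arbitrary assumption, not a published rule.

```python
import re

# Matches tags such as <div>, </p>, or <br/>, but not markdown
# autolinks such as <https://example.com>.
HTML_TAG = re.compile(r"</?[a-zA-Z][a-zA-Z0-9]*(\s[^>\n]*)?/?>")


def lint_llms_txt(text, max_prose_words=40):
    """Return (line_number, warning) pairs for inline HTML and long prose lines."""
    warnings = []
    for number, line in enumerate(text.splitlines(), start=1):
        if HTML_TAG.search(line):
            warnings.append((number, "inline HTML"))
        # Headings, bullets, and table rows count as structure, not prose.
        is_structural = line.lstrip().startswith(("#", "-", "*", "|"))
        if not is_structural and len(line.split()) > max_prose_words:
            warnings.append((number, "long prose paragraph"))
    return warnings


print(lint_llms_txt("# Title\n<div>hello</div>\n- ok"))
```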
How to Generate One
Three paths:
- Hand-write (1-2 hours). Best for control. Start with the spec at llmstxt.org, copy our example above, and customize.
- Generate from your site. Tools like INITE AI's analyzer crawl your URL and produce a ready-to-deploy llms.txt + ai.txt + identity.json bundle in 30 seconds.
- CMS plugin. WordPress and Webflow plugins exist (search the marketplaces). Most are free.
The Bottom Line
If you publish only one new file in 2026, make it llms.txt. The standard is converging fast: 11% adoption today, projected 35-40% by year's end. Sites without it risk being summarized incorrectly or ignored entirely by AI assistants. The fix takes about an hour, the spec is open, and the citation lift is measurable. Ship it.
Frequently Asked Questions
Where do I put llms.txt?
At the root of your domain: https://yourdomain.com/llms.txt - same level as robots.txt and sitemap.xml. Serve it as text/plain or text/markdown. Do not put it in a subdirectory or behind a login.
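On most sites a static file host or web server handles this. For an application that serves its own routes, a minimal WSGI sketch in Python might look like the following; the app function and the LLMS_TXT placeholder content are assumptions for illustration.

```python
# Placeholder content; a real deployment would load the actual file.
LLMS_TXT = b"# Example Co\nExample Co is a hypothetical business.\n"


def app(environ, start_response):
    """Serve /llms.txt from the domain root as UTF-8 markdown; 404 otherwise."""
    if environ.get("PATH_INFO") == "/llms.txt":
        headers = [
            ("Content-Type", "text/markdown; charset=utf-8"),
            ("Content-Length", str(len(LLMS_TXT))),
        ]
        start_response("200 OK", headers)
        return [LLMS_TXT]
    # Any other path, including subdirectories, is not served here.
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found\n"]
```

For a local trial, the standard library's wsgiref.simple_server.make_server("", 8000, app) can host this callable; port 8000 is an arbitrary choice.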
What format does llms.txt use?
Markdown. Start with H1 = your business name, then a one-line description, then sections (## Products, ## Pricing, ## Key URLs, ## Contact). Keep it under 3 KB. Use bullet lists with absolute URLs, not relative paths.
Is llms.txt the same as ai.txt or robots-ai.txt?
No. llms.txt is the long-form guide (markdown, 1-3 KB). ai.txt is a shorter machine-readable identity profile (key=value pairs). robots-ai.txt is a robots-style allow/deny file specifically for AI crawlers. Most authoritative sites publish all three.
Will llms.txt hurt my classic SEO?
No. Search engines do not penalize llms.txt; Google has stated they read it but do not weight rankings on it directly. llms.txt only affects how AI engines interpret your site for citation. There is no downside to publishing it.
How do I generate llms.txt?
Either hand-write it (1-2 hours) or use a generator. inite.ai's analyzer produces a ready-to-deploy llms.txt + ai.txt + identity.json from any URL. Validate it against the public spec at llmstxt.org and check that absolute URLs resolve.