
What Is llms.txt and Why Every Site Needs One in 2026
llms.txt is the de-facto standard for telling AI engines who you are and how to interpret your content. A complete guide with template, validator checklist, and adoption data.
What llms.txt Actually Is
llms.txt is a markdown file you publish at the root of your domain - https://yourdomain.com/llms.txt. Inside it you describe your business, your products, your pricing, and your key URLs in plain markdown. AI crawlers - GPTBot, ClaudeBot, PerplexityBot, Google-Extended, Amazonbot - fetch it on every crawl cycle and use it to disambiguate your entity, route queries to the right URL, and produce more accurate citations.
It is not a ranking signal in classic search. It is a citation accuracy signal in AI search. The two have different scoring systems, and llms.txt only affects the AI side.
A Minimal Working Example
# INITE AI
INITE AI is an Answer Engine Optimization platform for B2B SaaS companies.
## Products
- AEO Analyzer - analyzes any URL for AI visibility (free + paid tiers)
- SEO Engine - automated content + outreach pipeline (paid)
- Implementation Kit - generates llms.txt, ai.txt, schema for any site
## Key URLs
- Pricing: https://inite.ai/en/pricing
- Free analyzer: https://inite.ai/en/analyze
- Blog: https://inite.ai/en/blog
## Contact
- Email: hello@inite.ai
- Founded: 2020
- Geography: Worldwide
That's the entire spec. No JSON, no XML, no proprietary syntax. Just markdown.
Why It Beat the Alternatives
Several proposals competed for "AI identity file" in 2024-2025:
- ai.txt (key=value, hard to write rich content)
- agents.json (too technical for non-engineers)
- humans.txt (predates AI, semantically wrong)
- Custom <meta> tags (don't survive content scraping)
llms.txt won because:
- Markdown is universal. Anyone can write it. No tooling needed.
- Headers map to retrieval chunks. AI engines split documents on ## boundaries.
- It's compatible with everything else. You can keep your robots.txt, sitemap.xml, and meta tags.
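To make the header-to-chunk point concrete, here is a minimal sketch of splitting a markdown document on ## boundaries. How any particular AI engine actually chunks documents is not public, so treat `split_on_h2` as a hypothetical helper written for illustration, not as any engine's real implementation.

```python
import re


def split_on_h2(markdown: str) -> dict[str, str]:
    """Split markdown into chunks keyed by their '## ' heading.

    Text before the first '## ' heading is stored under the key ''.
    Illustrative only: real retrieval pipelines may chunk differently.
    """
    chunks: dict[str, list[str]] = {"": []}
    current = ""
    for line in markdown.splitlines():
        match = re.match(r"^##\s+(.*\S)\s*$", line)
        if match:
            # A new H2 heading starts a new chunk.
            current = match.group(1)
            chunks.setdefault(current, [])
        else:
            chunks[current].append(line)
    return {title: "\n".join(body).strip() for title, body in chunks.items()}


sample = """# INITE AI
INITE AI is an Answer Engine Optimization platform for B2B SaaS companies.
## Products
- AEO Analyzer - analyzes any URL for AI visibility (free + paid tiers)
## Contact
- Email: hello@inite.ai
"""

chunks = split_on_h2(sample)
for title, body in chunks.items():
    print(repr(title), "->", len(body), "characters")
```

Each section becomes a self-contained unit under its own heading, which is why short, well-labeled sections are likely to retrieve more cleanly than long paragraphs.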
The Four-File AI Identity Surface
In 2026, the convention is to publish four files together:
| File | Format | Purpose | Size |
|---|---|---|---|
| /llms.txt | Markdown | Long-form site guide | 1-3 KB |
| /ai.txt | key=value | Concise identity profile | 0.5-2 KB |
| /identity.json | Schema.org JSON | Canonical business identity | 1-3 KB |
| /robots-ai.txt | Robots-style | AI crawler directives | 0.3-1 KB |
83% of sites with llms.txt publish all four. Sites with the full surface are 1.6x more likely to be cited correctly by Perplexity.
Validator Checklist
Before you ship llms.txt, run through this:
- Served at exactly /llms.txt (no subdirectory).
- Content-Type is text/plain or text/markdown.
- HTTP 200, no auth, no redirect chain.
- Total size 800-3000 characters (under 3 KB).
- First H1 is the business or product name (not a tagline).
- Every URL is absolute, not relative.
- Every URL resolves (HTTP 200).
- No marketing fluff - markdown sections, not paragraphs.
- UTF-8 encoded, no BOM.
- Last-Modified header set (helps with crawl freshness).
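Several of these checks can be automated without touching the network. The sketch below covers only the offline items (BOM, encoding, size, first H1, absolute link targets, inline HTML). The function name `validate_llms_txt` and the exact regular expressions are assumptions made for illustration. The HTTP status, Content-Type, Last-Modified, and URL-resolution checks need real requests and are left out.

```python
import re


def validate_llms_txt(raw: bytes) -> list[str]:
    """Return a list of problems; an empty list means the offline checks passed."""
    problems: list[str] = []

    if raw.startswith(b"\xef\xbb\xbf"):
        problems.append("file starts with a UTF-8 BOM")
    try:
        # utf-8-sig strips a leading BOM so the later checks see clean text.
        text = raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        return problems + ["file is not valid UTF-8"]

    if not 800 <= len(text) <= 3000:
        problems.append(f"size is {len(text)} characters, expected 800-3000")

    lines = [line for line in text.splitlines() if line.strip()]
    if not lines or not re.match(r"^# \S", lines[0]):
        problems.append("first non-empty line is not an H1 heading")

    # Markdown link targets that do not start with http:// or https://
    for target in re.findall(r"\]\(([^)\s]+)\)", text):
        if not re.match(r"^https?://", target):
            problems.append(f"relative or non-HTTP link target: {target}")

    # HTML tags such as <b> or <a href="...">, but not <https://...> autolinks.
    if re.search(r"</?[A-Za-z][A-Za-z0-9]*(\s[^>]*)?/?>", text):
        problems.append("inline HTML detected")

    return problems


good = ("# Example Co\n" + "- Docs: https://example.com/docs\n" * 30).encode("utf-8")
bad = b"\xef\xbb\xbf" + b"Tagline first\n[Pricing](/pricing)\n<b>hi</b>\n"

print(validate_llms_txt(good))
for problem in validate_llms_txt(bad):
    print("-", problem)
```

Returning a list of messages rather than raising on the first failure lets you fix everything in one pass before deploying.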
Adoption Trajectory
Twelve-month adoption among the top 10K websites:
| Month | Adoption | Notes |
|---|---|---|
| Apr 2025 | 0.4% | Early adopters (devtools, AI startups) |
| Jul 2025 | 1.7% | First Anthropic + Perplexity acknowledgement |
| Oct 2025 | 4.3% | Featured in Google's "AI search" guidance |
| Jan 2026 | 7.9% | Spec promoted to llmstxt.org official |
| Apr 2026 | 11.0% | Mainstream SaaS adoption |
Projection: 35-40% by end of 2026 across the top 10K. The cost is one file. The upside is being machine-readable.
Common Mistakes
- Putting it behind a login. Crawlers can't read it.
- Using relative URLs. Different AI engines resolve relative paths differently. Use absolute URLs.
- Writing prose. AI engines split on headers - write sections, not paragraphs.
- Including HTML. It's markdown. Inline HTML breaks parsers.
- Stuffing keywords. Engines penalize keyword density just like classic SEO.
- Forgetting to update it. When pricing or products change, update llms.txt too.
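The relative-URL mistake is easy to demonstrate with Python's standard `urllib.parse.urljoin`. This is a small sketch: the two base URLs are hypothetical stand-ins for whatever a consumer happens to treat as the base, and `resolve_all` is an illustrative helper. The same relative link resolves to different pages depending on the base, while an absolute link does not.

```python
from urllib.parse import urljoin


def resolve_all(bases: list[str], link: str) -> set[str]:
    """Resolve one link against several candidate base URLs."""
    return {urljoin(base, link) for base in bases}


# Hypothetical bases: the file's own URL, and a page that referenced it.
bases = [
    "https://yourdomain.com/llms.txt",
    "https://yourdomain.com/en/blog/post",
]

relative_results = resolve_all(bases, "pricing")
absolute_results = resolve_all(bases, "https://yourdomain.com/en/pricing")

print(sorted(relative_results))  # two different URLs
print(sorted(absolute_results))  # exactly one URL
```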
How to Generate One
Three paths:
Hand-write (1-2 hours). Best for control. Start with the spec at llmstxt.org, copy our example above, and customize.
Generate from your site. Tools like INITE AI's analyzer crawl your URL and produce a ready-to-deploy llms.txt + ai.txt + identity.json bundle in 30 seconds.
CMS plugin. WordPress and Webflow plugins exist (search the marketplaces). Most are free.
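If you keep your business details as structured data, the generate path can be as small as a template function. The following is a minimal sketch, not how INITE AI's analyzer or any CMS plugin works. `render_llms_txt` and its input shape are hypothetical, and the output follows the layout of the example earlier in this article.

```python
def render_llms_txt(name: str, summary: str, sections: dict[str, list[str]]) -> str:
    """Render an H1, a one-line summary, and '## ' sections with bullet items."""
    lines = [f"# {name}", summary]
    for title, items in sections.items():
        lines.append(f"## {title}")
        lines.extend(f"- {item}" for item in items)
    return "\n".join(lines) + "\n"


doc = render_llms_txt(
    "INITE AI",
    "INITE AI is an Answer Engine Optimization platform for B2B SaaS companies.",
    {
        "Key URLs": [
            "Pricing: https://inite.ai/en/pricing",
            "Blog: https://inite.ai/en/blog",
        ],
        "Contact": ["Email: hello@inite.ai"],
    },
)
print(doc)
```

Generating the file from one source of truth also helps with the "forgetting to update it" mistake: when pricing or products change, you can re-render the file as part of the deploy.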
The Bottom Line
If you publish only one new file in 2026, make it llms.txt. The standard is converging fast: 11% adoption today, projected 35-40% by year's end. Sites without it are summarized incorrectly or ignored entirely by AI assistants. The fix takes one hour, the spec is open, and the citation lift is measurable. Ship it.
Frequently Asked Questions
Where do I put llms.txt?
At the root of your domain: https://yourdomain.com/llms.txt - same level as robots.txt and sitemap.xml. Serve it as text/plain or text/markdown. Do not put it in a subdirectory or behind a login.
What format does llms.txt use?
Markdown. Start with H1 = your business name, then a one-line description, then sections (## Products, ## Pricing, ## Key URLs, ## Contact). Keep it under 3 KB. Use bullet lists with absolute URLs, not relative paths.
Is llms.txt the same as ai.txt or robots-ai.txt?
No. llms.txt is the long-form guide (markdown, 1-3 KB). ai.txt is a shorter machine-readable identity profile (key=value pairs). robots-ai.txt is a robots-style allow/deny file specifically for AI crawlers. Most authoritative sites publish all three.
Will llms.txt hurt my classic SEO?
No. Search engines do not penalize llms.txt; Google has stated they read it but do not weight rankings on it directly. llms.txt only affects how AI engines interpret your site for citation. There is no downside to publishing it.
How do I generate llms.txt?
Either hand-write it (1-2 hours) or use a generator. inite.ai's analyzer produces a ready-to-deploy llms.txt + ai.txt + identity.json from any URL. Validate it against the public spec at llmstxt.org and check that absolute URLs resolve.