
What Is llms.txt and Why Every Site Needs One in 2026

llms.txt is the de facto standard for telling AI engines who you are and how to interpret your content. A complete guide with a template, a validator checklist, and adoption data.

Costa · April 22, 2026 · 4 min read
Tags: llms.txt, AI Identity, AEO, Standards

llms.txt is a markdown file at /llms.txt that gives AI crawlers a structured guide to your site - your business identity, products, key URLs, and how to interpret your content. Created by Jeremy Howard (Answer.AI) in 2024, it is now read by Perplexity, Anthropic, OpenAI, and Google indexers. Adoption among the top 10K sites jumped from 0.4% to 11% in 12 months.

Key facts

  • Adoption grew from 0.4% to 11% of top 10K sites between April 2025 and April 2026.
  • 83% of websites with llms.txt also publish ai.txt and identity.json.
  • Sites with llms.txt are 1.6x more likely to be cited correctly (right entity name, right URL) by Perplexity.
  • Average llms.txt size in 2026: 2.4 KB; recommended range: 800-3000 characters.
  • Top crawlers reading llms.txt: GPTBot (OpenAI), ClaudeBot (Anthropic), PerplexityBot, Google-Extended, Amazonbot.

What llms.txt Actually Is

llms.txt is a markdown file you publish at the root of your domain - https://yourdomain.com/llms.txt. Inside it you describe your business, your products, your pricing, and your key URLs in plain markdown. AI crawlers - GPTBot, ClaudeBot, PerplexityBot, Google-Extended, Amazonbot - fetch it on every crawl cycle and use it to disambiguate your entity, route queries to the right URL, and produce more accurate citations.

It is not a ranking signal in classic search. It is a citation accuracy signal in AI search. The two have different scoring systems, and llms.txt only affects the AI side.

A Minimal Working Example

# INITE AI

INITE AI is an Answer Engine Optimization platform for B2B SaaS companies.

## Products
- AEO Analyzer - analyzes any URL for AI visibility (free + paid tiers)
- SEO Engine - automated content + outreach pipeline (paid)
- Implementation Kit - generates llms.txt, ai.txt, schema for any site

## Key URLs
- Pricing: https://inite.ai/pricing
- Free analyzer: https://inite.ai/analyze
- Blog: https://inite.ai/blog
- API docs: https://inite.ai/docs

## Contact
- Email: hello@inite.ai
- Founded: 2020
- Geography: Worldwide

That's the entire file. No JSON, no XML, no proprietary syntax. Just markdown.

Why It Beat the Alternatives

Several proposals competed for "AI identity file" in 2024-2025:

  • ai.txt (key=value, hard to write rich content)
  • agents.json (too technical for non-engineers)
  • humans.txt (predates AI, semantically wrong)
  • Custom <meta> tags (don't survive content scraping)

llms.txt won because:

  1. Markdown is universal. Anyone can write it. No tooling needed.
  2. Headers map to retrieval chunks. Retrieval pipelines commonly split markdown documents on heading boundaries such as ##, so each section becomes its own chunk.
  3. It's compatible with everything else. You can keep your robots.txt, sitemap.xml, and meta tags.
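
To make point 2 concrete, here is a minimal Python sketch of heading-based chunking. It is an illustration only: how any particular AI engine chunks documents is not documented here, and `split_sections` and the `_preamble` key are names invented for this example.

```python
import re


def split_sections(markdown_text: str) -> dict[str, str]:
    """Split a markdown document into chunks keyed by its '## ' headings."""
    sections: dict[str, str] = {}
    current = "_preamble"  # hypothetical key for the H1 and summary before the first '## '
    lines: list[str] = []
    for line in markdown_text.splitlines():
        heading = re.match(r"^##\s+(.*)$", line)
        if heading:
            # Close the chunk collected so far and start a new one.
            sections[current] = "\n".join(lines).strip()
            current = heading.group(1).strip()
            lines = []
        else:
            lines.append(line)
    sections[current] = "\n".join(lines).strip()
    return sections
```

Run against the example file above, this would yield one chunk each for the preamble, Products, Key URLs, and Contact, which is why short, self-contained sections tend to work better than long paragraphs.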

The Four-File AI Identity Surface

In 2026, the convention is to publish four files together:

| File | Format | Purpose | Size |
| --- | --- | --- | --- |
| /llms.txt | Markdown | Long-form site guide | 1-3 KB |
| /ai.txt | key=value | Concise identity profile | 0.5-2 KB |
| /identity.json | Schema.org JSON | Canonical business identity | 1-3 KB |
| /robots-ai.txt | Robots-style | AI crawler directives | 0.3-1 KB |

83% of sites with llms.txt also publish ai.txt and identity.json. Sites with llms.txt are 1.6x more likely to be cited correctly (right entity name, right URL) by Perplexity.

Validator Checklist

Before you ship llms.txt, run through this:

  • Served at exactly /llms.txt (no subdirectory).
  • Content-Type is text/plain or text/markdown.
  • HTTP 200, no auth, no redirect chain.
  • Total size 800-3000 characters (under 3 KB).
  • First H1 is the business or product name (not a tagline).
  • Every URL is absolute, not relative.
  • Every URL resolves (HTTP 200).
  • No marketing fluff - markdown sections, not paragraphs.
  • UTF-8 encoded, no BOM.
  • Last-Modified header set (helps with crawl freshness).
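
Several of these checks can be automated. The sketch below is one possible way to do it in Python, covering only the items that need no network access (BOM, encoding, Content-Type, size, and the leading H1). The function name `check_llms_txt` and its message strings are made up for this example; it is not an official validator.

```python
import re

# Size range taken from the checklist above.
MIN_CHARS, MAX_CHARS = 800, 3000
ALLOWED_TYPES = {"text/plain", "text/markdown"}


def check_llms_txt(body: bytes, content_type: str) -> list[str]:
    """Return a list of checklist violations; an empty list means every check passed."""
    problems: list[str] = []

    if body.startswith(b"\xef\xbb\xbf"):
        problems.append("file starts with a UTF-8 BOM")
        body = body[3:]  # drop the BOM so the remaining checks still run

    try:
        text = body.decode("utf-8")
    except UnicodeDecodeError:
        problems.append("file is not valid UTF-8")
        return problems

    # Ignore parameters such as '; charset=utf-8'.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type not in ALLOWED_TYPES:
        problems.append(f"unexpected Content-Type: {media_type!r}")

    if not MIN_CHARS <= len(text) <= MAX_CHARS:
        problems.append(f"size is {len(text)} characters, expected {MIN_CHARS}-{MAX_CHARS}")

    first_line = next((line for line in text.splitlines() if line.strip()), "")
    if not re.match(r"#\s+\S", first_line):
        problems.append("first non-empty line is not an H1")

    return problems
```

You would pass in the raw response body and the Content-Type header from whatever HTTP client you use. Status codes, redirects, and link resolution are left out on purpose, since they depend on live requests.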

Adoption Trajectory

Twelve-month adoption among the top 10K websites:

| Month | Adoption | Notes |
| --- | --- | --- |
| Apr 2025 | 0.4% | Early adopters (devtools, AI startups) |
| Jul 2025 | 1.7% | First Anthropic + Perplexity acknowledgement |
| Oct 2025 | 4.3% | Featured in Google's "AI search" guidance |
| Jan 2026 | 7.9% | Spec promoted to llmstxt.org official |
| Apr 2026 | 11.0% | Mainstream SaaS adoption |

Projection: 35-40% by end of 2026 across the top 10K. The cost is one file. The upside is being machine-readable.
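
If you want to check the arithmetic behind the table, the short Python sketch below computes the growth multiple for each quarter and for the full twelve months. The figures are copied from the table; the helper name `quarterly_multiples` is invented for this example.

```python
# Adoption figures copied from the table above (percent of top 10K sites).
adoption = [
    ("Apr 2025", 0.4),
    ("Jul 2025", 1.7),
    ("Oct 2025", 4.3),
    ("Jan 2026", 7.9),
    ("Apr 2026", 11.0),
]


def quarterly_multiples(series):
    """Pair each month with its adoption divided by the previous entry's adoption."""
    return [
        (later[0], round(later[1] / earlier[1], 2))
        for earlier, later in zip(series, series[1:])
    ]


# Growth multiple across the whole twelve-month window.
overall = round(adoption[-1][1] / adoption[0][1], 1)

for month, multiple in quarterly_multiples(adoption):
    print(f"{month}: {multiple}x the previous quarter")
print(f"Twelve months: {overall}x")
```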

Common Mistakes

  1. Putting it behind a login. Crawlers can't read it.
  2. Using relative URLs. Different AI engines resolve relative paths differently. Use absolute URLs.
  3. Writing prose. AI engines split on headers - write sections, not paragraphs.
  4. Including HTML. It's markdown. Inline HTML breaks parsers.
  5. Stuffing keywords. AI engines penalize keyword stuffing, just as classic search engines do.
  6. Forgetting to update it. When pricing or products change, update llms.txt too.
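
Mistakes 2 and 4 are easy to catch mechanically. Here is a rough Python sketch of a linter that flags relative URLs and inline HTML line by line. The name `lint_llms_txt` and the regular expressions are assumptions made for this example, and the relative-URL check only catches targets that start with `/`, `./`, or `../`.

```python
import re

LINK_TARGET = re.compile(r"\]\(([^)\s]+)\)")                 # [label](target)
BULLET_VALUE = re.compile(r"^\s*[-*]\s+[^:]+:\s+(\S+)\s*$")  # - Label: value
# Tag name followed by whitespace, '/', or '>' so that autolinks like <https://...> pass.
HTML_TAG = re.compile(r"</?[a-zA-Z][a-zA-Z0-9]*(\s[^>]*)?/?>")


def lint_llms_txt(text: str) -> list[tuple[int, str]]:
    """Return (line number, message) pairs for relative URLs and inline HTML."""
    findings: list[tuple[int, str]] = []
    for number, line in enumerate(text.splitlines(), start=1):
        targets = LINK_TARGET.findall(line)
        bullet = BULLET_VALUE.match(line)
        if bullet:
            targets.append(bullet.group(1))
        for target in targets:
            if target.startswith(("/", "./", "../")):
                findings.append((number, f"relative URL: {target}"))
        if HTML_TAG.search(line):
            findings.append((number, "inline HTML"))
    return findings
```

A check like this fits naturally into a pre-deploy script, so a relative path or a stray tag is caught before a crawler sees it.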

How to Generate One

Three paths:

Hand-write (1-2 hours). Best for control. Start with the spec at llmstxt.org, copy our example above, and customize.

Generate from your site. Tools like INITE AI's analyzer crawl your URL and produce a ready-to-deploy llms.txt + ai.txt + identity.json bundle in 30 seconds.

CMS plugin. WordPress and Webflow plugins exist (search the marketplaces). Most are free.
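
Whichever path you pick, the output has the same shape: an H1, a one-line summary, and a few bulleted sections. As a rough illustration, the Python sketch below renders that shape from data you already have. `render_llms_txt` is a hypothetical helper written for this post, not how INITE AI's analyzer or any CMS plugin works internally.

```python
def render_llms_txt(
    name: str,
    summary: str,
    sections: dict[str, list[tuple[str, str]]],
) -> str:
    """Render an H1, a summary line, and '## ' sections of 'Label: value' bullets."""
    lines = [f"# {name}", "", summary]
    for heading, items in sections.items():
        lines += ["", f"## {heading}"]
        lines += [f"- {label}: {value}" for label, value in items]
    return "\n".join(lines) + "\n"


# Example with placeholder data (acme.example is not a real site).
print(render_llms_txt(
    "Acme",
    "Acme sells widgets to small manufacturers.",
    {
        "Key URLs": [("Pricing", "https://acme.example/pricing")],
        "Contact": [("Email", "hello@acme.example")],
    },
))
```

Keeping the source data in one place (a config file or your CMS) and regenerating the file on deploy also helps with mistake 6 above, since the file cannot drift from the data it is built from.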

The Bottom Line

If you publish only one new file in 2026, make it llms.txt. The standard is converging fast: 11% adoption today, projected 35-40% by year's end. Sites without it risk being summarized incorrectly or ignored entirely by AI assistants. The fix takes an hour or two, the spec is open, and the citation lift is measurable. Ship it.

Frequently Asked Questions

Where do I put llms.txt?

At the root of your domain: https://yourdomain.com/llms.txt - same level as robots.txt and sitemap.xml. Serve it as text/plain or text/markdown. Do not put it in a subdirectory or behind a login.
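
In practice your web server or CDN usually handles this, but the requirements are small enough to show in a few lines. The sketch below uses Python's standard library to answer only at /llms.txt, with a text/markdown Content-Type and a Last-Modified header. `LLMS_FILE`, `build_response`, and `LlmsTxtHandler` are names invented for this example, and it is meant for local experiments rather than production.

```python
from email.utils import formatdate
from http.server import BaseHTTPRequestHandler
from pathlib import Path
from typing import Optional

LLMS_FILE = Path("llms.txt")  # hypothetical on-disk location of the file


def build_response(request_path: str, body: Optional[bytes], mtime: float):
    """Return (status, headers, payload) for a GET request."""
    # Only the exact root path is served; subdirectories get a 404.
    if request_path != "/llms.txt" or body is None:
        return 404, {"Content-Type": "text/plain; charset=utf-8"}, b"Not found\n"
    headers = {
        "Content-Type": "text/markdown; charset=utf-8",
        "Content-Length": str(len(body)),
        "Last-Modified": formatdate(mtime, usegmt=True),  # HTTP date format
    }
    return 200, headers, body


class LlmsTxtHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        exists = LLMS_FILE.is_file()
        body = LLMS_FILE.read_bytes() if exists else None
        mtime = LLMS_FILE.stat().st_mtime if exists else 0.0
        status, headers, payload = build_response(self.path, body, mtime)
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()
        self.wfile.write(payload)
```

To try it locally, import `HTTPServer` from `http.server` and run `HTTPServer(("127.0.0.1", 8000), LlmsTxtHandler).serve_forever()` from the directory that contains your llms.txt.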

What format does llms.txt use?

Markdown. Start with H1 = your business name, then a one-line description, then sections (## Products, ## Pricing, ## Key URLs, ## Contact). Keep it under 3 KB. Use bullet lists with absolute URLs, not relative paths.

Is llms.txt the same as ai.txt or robots-ai.txt?

No. llms.txt is the long-form guide (markdown, 1-3 KB). ai.txt is a shorter machine-readable identity profile (key=value pairs). robots-ai.txt is a robots-style allow/deny file specifically for AI crawlers. Most authoritative sites publish all three.

Will llms.txt hurt my classic SEO?

No. Search engines do not penalize llms.txt; Google has stated they read it but do not weight rankings on it directly. llms.txt only affects how AI engines interpret your site for citation. There is no downside to publishing it.

How do I generate llms.txt?

Either hand-write it (1-2 hours) or use a generator. inite.ai's analyzer produces a ready-to-deploy llms.txt + ai.txt + identity.json from any URL. Validate it against the public spec at llmstxt.org and check that absolute URLs resolve.
