The Complete Guide to llms.txt: The New robots.txt for the AI Era
2025/12/12

What is llms.txt? Is it useful for SEO? Will Google use it? This comprehensive guide covers the technical specification, real-world examples, and industry debates around this emerging AI standard.

1. Is 10 Minutes Worth It?

llms.txt is a Markdown file placed in your website's root directory, telling AI models which content matters most.

Creating it takes only 10 minutes.

The question is: Does it work?

Google says no. John Mueller says no AI service uses it. Yet at least one site owner's server logs show an OpenAI crawler fetching it every 15 minutes.

Semrush warns that a site without llms.txt risks being misrepresented by AI. Anthropic, Cloudflare, and Zapier have all deployed it.

On one side, explicit denial. On the other, quiet action.

This article will tell you: what llms.txt is, who uses it, where the controversy lies, and what you should do.


2. What is llms.txt

Figure: robots.txt (block access), sitemap.xml (full index), llms.txt (priority guide).

2.1 Core Concept

llms.txt is a Markdown file placed in your website's root directory.

Location: https://yoursite.com/llms.txt
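
As a small illustration of that location rule, the sketch below derives the root-level llms.txt URL from any page URL on the same site. The function name `llms_txt_url` is made up for this example, and it assumes the file sits at the domain root as described above.

```python
from urllib.parse import urlsplit, urlunsplit


def llms_txt_url(page_url: str, filename: str = "llms.txt") -> str:
    """Return the root-level llms.txt URL for the site that hosts page_url."""
    parts = urlsplit(page_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"expected an absolute URL, got {page_url!r}")
    # Keep scheme and host, replace the path, drop query string and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/" + filename, "", ""))


print(llms_txt_url("https://yoursite.com/blog/some-post?ref=home"))
```

Passing `filename="llms-full.txt"` gives the location of the longer companion file in the same way.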

Its purpose is to tell AI models: where to find the most important content on your website.

Traditional websites have two standard files:

  • robots.txt: Tells search engine crawlers "which pages NOT to crawl"
  • sitemap.xml: Tells search engines "what pages exist on the site"

llms.txt serves a different purpose. It doesn't tell AI "where not to go," but rather "where to look first."

2.2 Why This File is Needed

Jeremy Howard explained the background in his proposal:

"Large language models increasingly rely on website information, but face a critical limitation: context windows are too small to handle most websites in their entirety. Converting complex HTML pages with navigation, ads, and JavaScript into LLM-friendly plain text is both difficult and imprecise."

In other words, AI models face three problems when "reading" web pages:

  1. Limited context windows: Even the most powerful models can't process an entire website at once
  2. Messy HTML: Ads, navigation bars, and JavaScript code mixed together make it hard for AI to extract core content
  3. No priority system: AI doesn't know which pages are important and which are secondary

llms.txt aims to solve these three problems.

2.3 Technical Specification

The llms.txt format is remarkably simple, using standard Markdown syntax:

# Project Name

> Brief project introduction (one or two sentences)

Important notes and background information

## Documentation
- [Quick Start](https://example.com/docs/quickstart.md): Essential tutorial for beginners
- [API Reference](https://example.com/docs/api.md): Complete API documentation

## Examples
- [Complete Demo](https://example.com/examples/demo.md): A full application example

## Optional
- [Advanced Features](https://example.com/docs/advanced.md): Optional advanced content

The file structure contains four parts:

  1. H1 header: Project or website name (required)
  2. Blockquote: One or two sentence summary (recommended)
  3. Body paragraphs: Additional explanatory information (optional)
  4. H2 sections + link lists: Content links organized by category (the core content)

Important note: The Optional H2 header has special meaning. Links under it are "optional" and can be skipped when AI needs a shorter context.
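
To make the structure concrete, here is a minimal parsing sketch in Python. It is not a reference implementation of the specification: the `LlmsTxt` class, `parse_llms_txt`, and `links_for_context` are names invented for this example, and the parser only understands the four parts listed above plus the special Optional section.

```python
import re
from dataclasses import dataclass, field

# Matches "- [Title](url): optional description"
LINK_RE = re.compile(
    r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<desc>.*))?$"
)


@dataclass
class LlmsTxt:
    title: str = ""
    summary: str = ""
    notes: list = field(default_factory=list)
    sections: dict = field(default_factory=dict)  # H2 name -> [(title, url, desc)]


def parse_llms_txt(text: str) -> LlmsTxt:
    doc = LlmsTxt()
    current = None  # name of the H2 section being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("## "):
            current = line[3:].strip()
            doc.sections[current] = []
        elif line.startswith("# ") and not doc.title:
            doc.title = line[2:].strip()
        elif line.startswith(">") and current is None:
            doc.summary = (doc.summary + " " + line[1:].strip()).strip()
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                doc.sections[current].append(
                    (match["title"], match["url"], match["desc"] or "")
                )
        else:
            doc.notes.append(line)  # free-form body text before the first H2
    return doc


def links_for_context(doc: LlmsTxt, include_optional: bool = True) -> list:
    """Flatten all links; drop the Optional section when context is tight."""
    return [
        link
        for name, links in doc.sections.items()
        if include_optional or name.lower() != "optional"
        for link in links
    ]


SAMPLE = """# Project Name

> Brief project introduction (one or two sentences)

Important notes and background information

## Documentation
- [Quick Start](https://example.com/docs/quickstart.md): Essential tutorial for beginners

## Optional
- [Advanced Features](https://example.com/docs/advanced.md): Optional advanced content
"""

doc = parse_llms_txt(SAMPLE)
print(doc.title, len(links_for_context(doc, include_optional=False)))
```

Calling `links_for_context(doc, include_optional=False)` is how a consumer with a short context window could skip the secondary links.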

2.4 Two File Types

The standard actually includes two files:

| File | Purpose |
| --- | --- |
| llms.txt | Concise version with links and descriptions only |
| llms-full.txt | Complete version with full text content |

llms.txt is the navigation map; llms-full.txt is the complete content.

Some websites provide only llms.txt, others provide both. The choice depends on your content volume and use case.
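
The relationship between the two files can be sketched as follows: given the pages that llms.txt links to, a script inlines their text into one long document. The layout used here (an H2 per page plus a "Source:" line) is an assumption made for this example, not something the proposal prescribes, and `build_llms_full` and its `fetch` parameter are hypothetical names.

```python
from typing import Callable, Iterable, Tuple


def build_llms_full(
    title: str,
    summary: str,
    pages: Iterable[Tuple[str, str]],
    fetch: Callable[[str], str],
) -> str:
    """Concatenate the Markdown of every linked page into one document.

    pages yields (page_title, url) pairs; fetch(url) returns that page's text.
    """
    parts = [f"# {title}", "", f"> {summary}", ""]
    for page_title, url in pages:
        body = fetch(url).strip()
        parts += [f"## {page_title}", "", f"Source: {url}", "", body, ""]
    return "\n".join(parts).rstrip() + "\n"


# Stand-in for real HTTP fetching so the sketch runs offline.
fake_site = {"https://example.com/docs/quickstart.md": "Install the package.\n"}

full_text = build_llms_full(
    "Project Name",
    "Brief project introduction",
    [("Quick Start", "https://example.com/docs/quickstart.md")],
    fake_site.__getitem__,
)
print(full_text)
```

Injecting `fetch` keeps the function independent of any particular HTTP client or file layout.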


3. Real Examples: Who's Using llms.txt

3.1 Official Example: FastHTML

Jeremy Howard's own FastHTML project serves as the canonical example:

# FastHTML

> FastHTML is a Python library that brings together Starlette, Uvicorn, HTMX, and fastcore's FT "FastTags" for creating server-rendered hypermedia applications.

Important notes:
- Although parts of its API are inspired by FastAPI, it is NOT compatible with FastAPI syntax
- FastHTML is compatible with JS-native web components and vanilla JS libraries, but not with React, Vue, or Svelte

## Docs
- [FastHTML Quick Start](https://fastht.ml/docs/tutorials/quickstart_for_web_devs.html.md): Overview of FastHTML core features
- [HTMX Reference](https://github.com/bigskysoftware/htmx/blob/master/www/content/reference.md): Complete HTMX attributes, CSS classes, events reference

## Examples
- [Todo App](https://github.com/AnswerDotAI/fasthtml/blob/main/examples/adv_app.py): Detailed walkthrough of a complete CRUD app

## Optional
- [Starlette Documentation](https://gist.githubusercontent.com/.../starlette-sml.md): Subset of Starlette docs useful for FastHTML development

This example demonstrates several key points:

  • Clear hierarchical structure
  • Each link has a description
  • Appropriate use of Optional to distinguish primary from secondary content

3.2 Major Company Adoption

The following companies have deployed llms.txt:

Anthropic (Creator of Claude)

  • URL: https://docs.anthropic.com/llms.txt
  • Content: API documentation, Prompt libraries, SDK references

Cloudflare

  • Organized by service: AI Gateway, Workers, Pages, etc.

Perplexity

  • URL: https://docs.perplexity.ai/llms-full.txt

Zapier

  • URL: https://docs.zapier.com/llms-full.txt
  • Focused on API endpoints and automation workflows

3.3 Turning Point: November 2024

Adoption of llms.txt surged in late 2024.

In November 2024, documentation platform Mintlify announced: automatic llms.txt generation for all documentation sites hosted on their platform.

This meant thousands of documentation sites instantly supported llms.txt, including Anthropic's and Cursor's documentation.

This is a classic case of "platform-driven adoption."


4. Industry Debate: Useful or Useless

Against:

  • Google: Not supported
  • John Mueller: No AI uses it
  • SEJ: Security risks exist
  • Reddit: Might be marketing hype

For:

  • Jeremy Howard: Standard proposal
  • Anthropic/Cloudflare: Adopted
  • Logs show: OpenAI is crawling
  • Minimal cost: Worth trying

4.1 Google's Position: Explicit Rejection

Gary Illyes, an analyst on the Google Search team, explicitly stated at the 2025 Search Central Deep Dive event:

"Google doesn't support llms.txt, and has no plans to support it."

His colleague John Mueller was even more direct on Reddit:

"AFAIK, none of the AI services have said they're using llms.txt. You can tell from your server logs that they don't even check for it."

Google's official recommendation: If you want to rank in AI Overviews, just do normal SEO.

No need for GEO, no need for LLMO, no need for llms.txt.

4.2 SEO Tools' Position: Aggressive Promotion

Interestingly, while AI platforms don't support it, SEO tools are heavily promoting llms.txt.

Semrush's audit feature warns:

"If your site lacks a clear llms.txt file, it risks being misrepresented by AI systems."

Rank Math's description is even more exaggerated:

"When AI chatbots try to summarize or answer questions about your site, they won't guess. They'll refer to the curated version you've provided."

Neither claim matches reality: no AI platform has publicly committed to reading llms.txt.

4.3 The Truth: No Official Support, But Crawling Activity

SEO expert Ray Martinez shared his server log analysis:

"OpenAI crawls my llms.txt file roughly every 15 minutes."

GEO monitoring company Profound also reported that crawlers from Microsoft, OpenAI, and others are indeed fetching and indexing llms.txt files.

What does this mean?

AI companies haven't publicly committed to using llms.txt, but their systems may be quietly experimenting.

4.4 Why AI Platforms Might Choose Not to Use It

Search Engine Journal's Roger Montti offered a blunt assessment:

"llms.txt is inherently untrustworthy."

The reason is simple: llms.txt can be completely different from webpage content.

An unethical SEO could add content to llms.txt that doesn't exist on the webpage, specifically to deceive AI.

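A 2024 research paper, "Adversarial Search Engine Optimization for Large Language Models," showed that this kind of manipulation works in practice. It studied crafted webpage content rather than llms.txt itself, but the same risk applies to any file a site owner controls: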

"Attackers can trick LLMs into promoting their content over competitors. We verified the effectiveness of this attack on production LLM search engines like Bing and Perplexity."

Because llms.txt can diverge from the HTML that users actually see, AI platforms have little reason to trust it without verification.
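
One way a consumer (or a cautious site owner) could check for this kind of mismatch is to compare each llms.txt description against the text of the page it points to. The sketch below is a deliberately crude word-overlap heuristic, offered only as an illustration: `description_support` is a hypothetical helper, the four-letter cutoff is an arbitrary choice, and nothing suggests any AI platform works this way.

```python
import re


def _content_words(text: str) -> set:
    # Lowercase alphanumeric tokens; skip very short words such as "the" or "api".
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if len(w) > 3}


def description_support(description: str, page_text: str) -> float:
    """Fraction of the description's content words that also occur on the page."""
    wanted = _content_words(description)
    if not wanted:
        return 1.0  # nothing to verify
    return len(wanted & _content_words(page_text)) / len(wanted)


page = "This page is the complete API documentation for version 2."
print(description_support("Complete API documentation", page))
print(description_support("Cheap flights and casino bonuses", page))
```

A low score does not prove bad intent, but it flags entries whose description has drifted away from the page.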


5. llms.txt vs robots.txt vs sitemap.xml

The differences between these three need to be clear:

| File | Target | Function |
| --- | --- | --- |
| robots.txt | Search engine crawlers | Tells crawlers which pages NOT to access |
| sitemap.xml | Search engine crawlers | Lists ALL indexable pages on the site |
| llms.txt | AI models | Tells AI which content to prioritize |

Key differences:

  1. robots.txt is "block," llms.txt is "recommend"
  2. sitemap.xml is "comprehensive," llms.txt is "curated"
  3. robots.txt targets crawlers, llms.txt targets AI at inference time

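Another important difference is maturity, not enforcement. Both files are advisory: robots.txt is a long-established convention (formalized as RFC 9309 in 2022) that well-behaved crawlers follow voluntarily, while llms.txt is a recent proposal that no major AI platform has committed to honoring.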
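
The "block" versus "recommend" distinction can be shown with Python's standard `urllib.robotparser`. In this sketch, llms.txt supplies a list of recommended URLs and robots.txt acts as a veto over them. Combining the two files this way is an assumption of the example, as are the `allowed_recommendations` helper and the `ExampleBot` user agent.

```python
from urllib.robotparser import RobotFileParser


def allowed_recommendations(recommended_urls, robots_txt, user_agent="ExampleBot"):
    """Keep only the recommended URLs that robots.txt does not disallow."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return [url for url in recommended_urls if rules.can_fetch(user_agent, url)]


ROBOTS_TXT = """User-agent: *
Disallow: /private/
"""

# URLs as they might be listed in llms.txt, most important first.
RECOMMENDED = [
    "https://yoursite.com/docs/quickstart",
    "https://yoursite.com/private/report",
]

print(allowed_recommendations(RECOMMENDED, ROBOTS_TXT))
```

robots.txt answers a yes/no question per URL, while llms.txt only contributes the ordered shortlist.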


6. How to Create llms.txt

6.1 Manual Creation

The simplest method is to manually write a Markdown file.

Steps:

  1. Create a file named llms.txt
  2. Fill in content according to the specification format
  3. Upload to your website's root directory

Template example:

# Your Website Name

> One sentence describing what your website does.

## Core Content
- [Homepage](https://yoursite.com/): Main website page
- [About Us](https://yoursite.com/about): Company introduction
- [Products](https://yoursite.com/products): All products

## Blog
- [Latest Posts](https://yoursite.com/blog/latest): Recently published content

## Optional
- [Terms of Service](https://yoursite.com/terms): Usage terms
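
If you would rather keep the link list in code than edit the file by hand, a short script can render the same template. This is a minimal sketch: `render_llms_txt` is a name invented here, and the site data simply mirrors the template above.

```python
def render_llms_txt(title: str, summary: str, sections: dict) -> str:
    """Render an llms.txt document.

    sections maps an H2 heading to a list of (name, url, description) tuples.
    """
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        for name, url, description in links:
            lines.append(f"- [{name}]({url}): {description}")
        lines.append("")  # blank line between sections
    return "\n".join(lines).rstrip() + "\n"


site_sections = {
    "Core Content": [
        ("Homepage", "https://yoursite.com/", "Main website page"),
        ("About Us", "https://yoursite.com/about", "Company introduction"),
    ],
    "Optional": [
        ("Terms of Service", "https://yoursite.com/terms", "Usage terms"),
    ],
}

llms_txt = render_llms_txt(
    "Your Website Name",
    "One sentence describing what your website does.",
    site_sections,
)
print(llms_txt)
```

Writing the result to disk is then one line, for example `pathlib.Path("llms.txt").write_text(llms_txt, encoding="utf-8")`, followed by uploading the file to the site root.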

6.2 Using Generator Tools

Firecrawl

Firecrawl is the most well-known llms.txt generation tool.

API access:

http://llmstxt.firecrawl.dev/{YOUR_URL}
http://llmstxt.firecrawl.dev/{YOUR_URL}/full

It crawls your website, uses GPT-4o-mini to extract key information, and generates llms.txt and llms-full.txt.
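
Assuming the endpoint pattern quoted above still holds, calling it from Python could look like the sketch below. Only the URL pattern comes from this article. Whether `{YOUR_URL}` should include the scheme, what the service returns, and whether it requires an API key are all unverified assumptions, and `generator_url` and `fetch_generated` are hypothetical helper names.

```python
from urllib.request import urlopen

BASE = "http://llmstxt.firecrawl.dev"  # endpoint as quoted in this article


def generator_url(site_url: str, full: bool = False) -> str:
    """Build the generator URL for a site; full=True targets llms-full.txt."""
    url = f"{BASE}/{site_url.strip('/')}"
    return url + "/full" if full else url


def fetch_generated(site_url: str, full: bool = False, timeout: float = 60.0) -> str:
    """Request the generated file (performs a network call; untested assumption)."""
    with urlopen(generator_url(site_url, full), timeout=timeout) as response:
        return response.read().decode("utf-8")


print(generator_url("yoursite.com"))
print(generator_url("yoursite.com", full=True))
```

Review whatever comes back before publishing it, since the summaries are machine-generated.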

WordPress Plugins

If you use WordPress, several plugins are available:

  1. Website LLMs.txt: 3000+ downloads, integrates with Yoast and Rank Math
  2. LLMs.txt and LLMs-Full.txt Generator: Automatically generates both files

These plugins will:

  • Automatically scan your posts and pages
  • Exclude content set to noindex
  • Periodically update the files
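
The noindex exclusion step can be sketched in a few lines of Python. This is not the plugins' actual code (they run inside WordPress), just an illustration of the idea: detect a robots meta tag containing "noindex" and leave those pages out. The names `is_noindex` and `indexable_pages` are invented for the example.

```python
from html.parser import HTMLParser


class _RobotsMetaParser(HTMLParser):
    """Records whether a <meta name="robots"> tag contains "noindex"."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        if (
            attributes.get("name", "").lower() == "robots"
            and "noindex" in attributes.get("content", "").lower()
        ):
            self.noindex = True


def is_noindex(html: str) -> bool:
    parser = _RobotsMetaParser()
    parser.feed(html)
    return parser.noindex


def indexable_pages(pages: dict) -> list:
    """pages maps URL -> HTML; return the URLs that may appear in llms.txt."""
    return [url for url, html in pages.items() if not is_noindex(html)]


pages = {
    "https://yoursite.com/about": "<html><head><title>About</title></head></html>",
    "https://yoursite.com/draft": '<html><head><meta name="robots" content="noindex, nofollow"></head></html>',
}
print(indexable_pages(pages))
```

A page can also be marked noindex through an HTTP header, which this HTML-only check would miss.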

6.3 Documentation Platform Native Support

The following platforms can generate llms.txt, either natively or through a plugin:

  • Mintlify: Automatically generates for all documentation sites
  • VitePress: Via vitepress-plugin-llms plugin
  • Docusaurus: Via docusaurus-plugin-llms plugin
  • Drupal: Via LLM Support module

7. Implementation Recommendations

7.1 Should You Deploy llms.txt?

My recommendation: You can deploy it, but don't expect miracles.

Reasons:

  1. Minimal cost: Creating a Markdown file doesn't take much time
  2. No downside: Even if AI doesn't use it, it won't affect your website
  3. Potentially useful: Though not officially acknowledged, logs show crawling activity
  4. Content organization byproduct: Creating llms.txt forces you to organize your site structure

7.2 What NOT to Do

  1. Don't cheat in llms.txt: Adding content that doesn't exist on the page is short-sighted
  2. Don't replace traditional SEO: Google explicitly said to focus on basic SEO
  3. Don't expect ranking improvements: Currently no evidence llms.txt improves AI search rankings

7.3 Best Practices

  1. Content consistency: Descriptions in llms.txt should accurately reflect webpage content
  2. Regular updates: Update llms.txt when adding important new content
  3. Use Optional wisely: Put secondary content in the Optional section
  4. Monitor logs: Observe which AI crawlers are accessing your llms.txt
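
For the log-monitoring step, a short script can count requests for llms.txt per user agent. The sketch below assumes access logs in the common "combined" format used by default in many nginx and Apache setups, so adjust the regular expression if your format differs. The sample lines and the `ExampleAIBot` user agent are invented for illustration and are not real traffic.

```python
import re
from collections import Counter

# Combined log format: ... "METHOD path protocol" status bytes "referer" "user-agent"
LOG_RE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def llms_txt_hits(log_lines, paths=("/llms.txt", "/llms-full.txt")) -> Counter:
    """Count requests for llms.txt files, grouped by user-agent string."""
    hits = Counter()
    for line in log_lines:
        match = LOG_RE.search(line)
        if match and match["path"].split("?")[0] in paths:
            hits[match["agent"]] += 1
    return hits


SAMPLE_LOG = [
    '203.0.113.7 - - [12/Dec/2025:10:00:00 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "ExampleAIBot/1.0"',
    '203.0.113.7 - - [12/Dec/2025:10:15:00 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "ExampleAIBot/1.0"',
    '198.51.100.4 - - [12/Dec/2025:10:20:00 +0000] "GET /llms-full.txt?v=2 HTTP/1.1" 200 9000 "-" "Mozilla/5.0 (hypothetical browser)"',
    '198.51.100.4 - - [12/Dec/2025:10:21:00 +0000] "GET /index.html HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (hypothetical browser)"',
]

for agent, count in llms_txt_hits(SAMPLE_LOG).most_common():
    print(count, agent)
```

On a real server you would pass an open log file instead of `SAMPLE_LOG`, since the function accepts any iterable of lines.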

7.4 Google's Recommendation

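Google's John Mueller has suggested serving llms.txt with a noindex directive.

Why? Because llms.txt is for AI, not for users. If Google indexes it, searchers might land on this technical file instead of the real page, creating a poor experience.

Implementation: llms.txt is not HTML, so it cannot carry a meta tag. Send the directive as an HTTP response header instead: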

X-Robots-Tag: noindex
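
How you set that header depends on your server. As one illustration, here is a minimal WSGI application in Python that serves llms.txt with the header attached. The `make_app` helper is hypothetical, and the `text/plain` content type is an assumption of this sketch. Most sites would instead add the header in their web server or CDN configuration.

```python
def make_app(llms_txt: str):
    """Return a WSGI app that serves llms_txt at /llms.txt with a noindex header."""
    body = llms_txt.encode("utf-8")

    def app(environ, start_response):
        if environ.get("PATH_INFO") == "/llms.txt":
            headers = [
                ("Content-Type", "text/plain; charset=utf-8"),
                ("Content-Length", str(len(body))),
                ("X-Robots-Tag", "noindex"),  # keep the file out of search results
            ]
            start_response("200 OK", headers)
            return [body]
        start_response("404 Not Found", [("Content-Type", "text/plain; charset=utf-8")])
        return [b"not found"]

    return app


app = make_app("# Your Website Name\n\n> One sentence describing what your website does.\n")
```

For a quick local check, the standard library can run it: `wsgiref.simple_server.make_server("", 8000, app).serve_forever()`, then inspect the response headers with your browser's developer tools.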

8. The Future of llms.txt

8.1 Will It Become a Standard?

Currently, llms.txt is far from being a "standard."

robots.txt dates back to 1994 and was not formalized as an IETF standard (RFC 9309) until 2022. llms.txt is just getting started.

Key variables:

  • Official AI platform support: If OpenAI or Google announces support, adoption will explode
  • Alternative solutions: Other AI optimization standards may emerge
  • Content security issues: If llms.txt is widely abused, AI platforms may simply ignore it

8.2 The Bigger Picture

llms.txt is just the tip of the iceberg.

The real trend is: Content discovery is shifting from search engines to AI assistants.

Users increasingly ask ChatGPT, Claude, and Perplexity directly, rather than Google.

What does this mean for content creators?

  1. Structured content matters more: AI understands well-structured content more easily
  2. Professional depth matters more: AI prioritizes authoritative sources
  3. Direct answers matter more: AI prefers content that directly answers questions

llms.txt is a small tool for adapting to this trend. The file itself isn't important. What's important is the mindset shift it represents:

We're no longer writing content just for human readers. We also need to optimize for AI readers.


9. Summary

llms.txt is a standard proposal introduced by Jeremy Howard in September 2024.

What it is: A Markdown file that tells AI models where to find the most important content on a website.

Who's using it: Anthropic, Cloudflare, Zapier, and thousands of documentation sites hosted on Mintlify.

The controversy: Google explicitly doesn't support it and major AI platforms haven't officially endorsed it, yet some server logs show their crawlers quietly fetching it.

My recommendation: Spend 10 minutes creating one, but don't expect it to boost rankings. Focus on traditional SEO.

Finally, remember: llms.txt itself isn't important. What's important is understanding how AI consumes content, and optimizing your content strategy accordingly.


References

  1. llms.txt Official Specification
  2. Jeremy Howard's Proposal Announcement
  3. Search Engine Land: Meet llms.txt
  4. Search Engine Journal: LLMs.txt For AI SEO
  5. Google says normal SEO works for AI Overviews
  6. Firecrawl llms.txt Generator
  7. llms.txt Directory
  8. Jeremy Howard's Original Tweet
Author: Su Jiang
