Just as robots.txt became essential the moment search engines started crawling the web, LLMs.txt is becoming essential as AI crawlers index the web for large language model training and real-time retrieval. It is a simple Markdown-formatted text file placed at your domain root that explicitly guides AI systems to your best content and away from content you don't want indexed.
What Is LLMs.txt?
LLMs.txt is an open standard proposed by Answer.AI (Jeremy Howard et al.) in 2024 and rapidly adopted by AI-aware organizations. The file lives at yoursite.com/llms.txt and contains:
- A brief description of your site and its purpose
- A curated list of your most important URLs with one-line descriptions
- Optional sections organizing content by type or topic
- An optional llms-full.txt link for systems that want to ingest complete page content
Unlike robots.txt (which uses allow/disallow directives), LLMs.txt is a positive guide - it tells AI systems what to prioritize, not just what to avoid.
Why LLMs.txt Matters for AEO
AI crawlers face the same challenge traditional crawlers did: a site may have thousands of pages but only a fraction represent high-quality, authoritative content. Without guidance, AI crawlers must infer quality from signals that may not capture your best work. LLMs.txt solves this by letting you explicitly signal which pages are your canonical, authoritative content.
Websites with LLMs.txt files report faster AI crawl coverage of their priority pages compared to sites without one, as AI crawlers can immediately identify and prioritize the linked content.
The file also helps prevent AI engines from indexing low-quality legacy content, draft pages, or thin content that could dilute your overall authority signal.
LLMs.txt File Format
The file uses Markdown syntax. Here is a minimal working example:
# AI Rank Lab
> AI Rank Lab is a SaaS platform for optimizing website visibility in
> AI search engines (ChatGPT, Gemini, Perplexity, Claude) through AEO
> and GEO analysis, content optimization, and citation tracking.
## Core Documentation
- [AEO Complete Guide](/blog/what-is-aeo-complete-guide-answer-engine-optimization-2026): Comprehensive guide to Answer Engine Optimization
- [GEO vs SEO 2026](/blog/geo-vs-seo-2026-what-changed-what-stays-what-matters): Comparison of GEO and SEO strategies
## Product Documentation
- [Dashboard Overview](/docs/dashboard): How to use the AI Rank Lab dashboard
- [Citation Tracking](/docs/citation-tracking): Setting up citation monitoring
## Optional: Full Content
- [llms-full.txt](https://airanklab.com/llms-full.txt): Complete page content for full ingestion
Step-by-Step Setup
1. Create the file: Create a plain text file named llms.txt using the Markdown format above.
2. Select your URLs: Choose your 10–30 most authoritative and representative pages - pillar posts, key product pages, core documentation.
3. Write descriptions: Add a concise, accurate one-line description for each URL explaining what the page covers.
4. Upload to root: Place the file at yoursite.com/llms.txt (not a subdirectory).
5. Verify access: Visit yoursite.com/llms.txt in a browser to confirm it's publicly accessible with Content-Type: text/plain.
6. Keep it updated: Update the file whenever you publish new high-priority content or retire old content.
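Steps 1 and 5 can be partially automated. The sketch below is a hypothetical helper (standard library only, not part of any official tooling) that checks an llms.txt body for the expected Markdown skeleton, with an optional live fetch that also verifies the Content-Type header:

```python
import urllib.request

def validate_llms_txt(text: str) -> list[str]:
    """Return a list of problems found in an llms.txt body (empty list = looks OK)."""
    problems = []
    lines = [l for l in text.splitlines() if l.strip()]
    if not lines or not lines[0].startswith("# "):
        problems.append("missing H1 site name on the first line")
    if not any(l.startswith("> ") for l in lines):
        problems.append("missing blockquote site description")
    if not any(l.startswith("- [") for l in lines):
        problems.append("no '- [title](url): description' link entries")
    return problems

def check_live(url: str) -> list[str]:
    """Fetch a deployed llms.txt and validate it (requires network access)."""
    with urllib.request.urlopen(url) as resp:
        problems = validate_llms_txt(resp.read().decode("utf-8"))
        ctype = resp.headers.get_content_type()
        if ctype != "text/plain":
            problems.append(f"served as {ctype}, expected text/plain")
        return problems

sample = "# AI Rank Lab\n> A SaaS platform for AEO analysis.\n## Core Documentation\n- [AEO Guide](/blog/aeo): Guide to AEO"
print(validate_llms_txt(sample))  # → []
```

Run `check_live("https://yoursite.com/llms.txt")` after upload; an empty list means the file is reachable, correctly typed, and structurally sound.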
Platform Support
| AI System | LLMs.txt Support | Status |
|---|---|---|
| ChatGPT (OpenAI) | GPTBot respects it | Supported |
| Claude (Anthropic) | ClaudeBot reads it | Supported |
| Perplexity | PerplexityBot reads it | Supported |
| Google Gemini | Google-Extended partially | Partial |
| Meta AI | Not confirmed | Emerging |
LLMs.txt vs robots.txt vs sitemap.xml: The Full Picture
| File | Purpose | Format | Primary Audience |
|---|---|---|---|
| robots.txt | Block/allow crawler access | Plain text directives | All crawlers |
| sitemap.xml | List all indexable URLs for comprehensiveness | XML | Search engine crawlers |
| llms.txt | Curate and prioritize best content for AI | Markdown | AI language model crawlers |
| llms-full.txt | Full text content of key pages for direct AI ingestion | Markdown with full content | AI systems needing full context |
Advanced LLMs.txt Strategies
Section Organization for Large Sites
For sites with more than 50 high-quality pages, organize your LLMs.txt into thematic sections. AI crawlers process the file top-to-bottom - place your most authoritative content first:
# Your Site Name
> Brief site description (1–2 sentences max)
## Most Important Content
- [URL 1](link): Description of most authoritative page
- [URL 2](link): Description of second most important page
## By Topic: Category A
- [URL 3](link): Description
- [URL 4](link): Description
## By Topic: Category B
- [URL 5](link): Description
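On larger sites it can help to generate this sectioned layout from a page inventory rather than maintain it by hand. A minimal sketch (the function name and data shape are hypothetical, not part of the LLMs.txt standard):

```python
def build_llms_txt(site: str, description: str,
                   sections: dict[str, list[tuple[str, str, str]]]) -> str:
    """Render an llms.txt body; `sections` maps a heading to (title, url, desc) rows."""
    out = [f"# {site}", f"> {description}", ""]
    # dicts preserve insertion order, so list the most authoritative section first
    for heading, pages in sections.items():
        out.append(f"## {heading}")
        for title, url, desc in pages:
            out.append(f"- [{title}]({url}): {desc}")
        out.append("")
    return "\n".join(out).rstrip() + "\n"

doc = build_llms_txt(
    "Your Site Name",
    "Brief site description (1-2 sentences max).",
    {
        "Most Important Content": [("URL 1", "/guide", "Most authoritative page")],
        "By Topic: Category A": [("URL 3", "/a/3", "Description")],
    },
)
print(doc)
```

Regenerating the file from your CMS or content inventory on each deploy also solves the "never updating it" problem described later.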
The llms-full.txt Companion File
The llms-full.txt file is an optional companion that contains the full Markdown text of your key pages. This allows AI systems to ingest your complete content without making individual page requests. It is most valuable for:
- Documentation sites where AI tools like Cursor and GitHub Copilot query your docs
- Sites with React/Next.js or JavaScript-heavy pages that may be challenging for AI crawlers to render
- High-priority content that you want AI systems to have in complete context
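Assembling llms-full.txt is essentially concatenation of your key pages' Markdown. A minimal sketch, assuming you already have each page's full Markdown text (the function and separator convention are illustrative, not mandated by the standard):

```python
def build_llms_full(pages: list[tuple[str, str]]) -> str:
    """Join (title, full_markdown) pairs into one llms-full.txt body,
    separating pages with a horizontal rule so boundaries stay clear."""
    sections = [f"# {title}\n\n{body.strip()}" for title, body in pages]
    return "\n\n---\n\n".join(sections) + "\n"

full = build_llms_full([
    ("AEO Complete Guide", "Answer Engine Optimization is the practice of..."),
    ("Citation Tracking", "To set up citation monitoring, first..."),
])
```

Because the output is pre-rendered Markdown, AI systems get the complete content even when the live pages require JavaScript to render.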
Known AI Crawlers and Their LLMs.txt Behavior
| Crawler | User Agent | LLMs.txt Support | robots.txt Respected |
|---|---|---|---|
| GPTBot (OpenAI) | GPTBot | Yes | Yes |
| ClaudeBot (Anthropic) | ClaudeBot | Yes | Yes |
| PerplexityBot | PerplexityBot | Yes | Yes |
| Google-Extended | Google-Extended | Partial | Yes |
| OAI-SearchBot (ChatGPT Search) | OAI-SearchBot | Yes | Yes |
| Applebot-Extended | Applebot-Extended | Emerging | Yes |
Measuring LLMs.txt Effectiveness
To measure whether your LLMs.txt is working:
- Server log analysis: Check for requests to /llms.txt from AI crawler user agents (GPTBot, ClaudeBot, PerplexityBot).
- Subsequent page visits: After LLMs.txt requests, check whether the listed URLs receive crawler visits - this indicates the file is being followed.
- Citation rate change: Track AI citation rates for your LLMs.txt-listed pages before and after deployment; expect a 4–8 week lag before effects appear.
- AI Rank Lab monitoring: Use AI Rank Lab's crawler activity dashboard to visualize which AI crawlers are accessing which pages.
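The log-analysis step can be sketched in a few lines. This hypothetical helper tallies /llms.txt requests per AI crawler from combined-format access log lines (the sample log entries are fabricated for illustration):

```python
AI_CRAWLERS = ("GPTBot", "ClaudeBot", "PerplexityBot", "OAI-SearchBot", "Google-Extended")

def ai_crawler_hits(log_lines, path="/llms.txt"):
    """Count requests for `path` per AI crawler user agent in access log lines."""
    hits = {}
    for line in log_lines:
        if f"GET {path} " not in line:
            continue  # only count requests for the target path
        for bot in AI_CRAWLERS:
            if bot in line:
                hits[bot] = hits.get(bot, 0) + 1
    return hits

sample_log = [
    '1.2.3.4 - - [10/Jan/2026:08:00:01 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "GPTBot/1.1"',
    '5.6.7.8 - - [10/Jan/2026:08:05:12 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "ClaudeBot/1.0"',
    '9.9.9.9 - - [10/Jan/2026:08:06:00 +0000] "GET /blog/aeo HTTP/1.1" 200 9000 "-" "Mozilla/5.0"',
]
print(ai_crawler_hits(sample_log))  # → {'GPTBot': 1, 'ClaudeBot': 1}
```

Run the same function with `path="/your-priority-page"` to check the "subsequent page visits" signal.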
Common LLMs.txt Mistakes
- Placing it in a subdirectory: The file must be at yoursite.com/llms.txt, not yoursite.com/docs/llms.txt.
- Including low-quality content: Only list your best pages - including thin or outdated content dilutes the signal quality.
- Never updating it: LLMs.txt should be updated when you publish new high-priority content or deprecate old content.
- Wrong Content-Type: Ensure the file is served as text/plain, not text/html.
- Blocking LLMs.txt in robots.txt: Accidentally blocking AI crawlers from reading the LLMs.txt file itself defeats its purpose.
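The last mistake is easy to catch with Python's standard robots.txt parser. A quick check, assuming a hypothetical robots.txt like the one below:

```python
from urllib import robotparser

# Example robots.txt content (illustrative only)
robots_txt = """\
User-agent: GPTBot
Disallow: /drafts/

User-agent: *
Disallow: /admin/
"""

rp = robotparser.RobotFileParser()
rp.parse(robots_txt.splitlines())

# llms.txt must remain fetchable by AI crawlers for the file to do its job
print(rp.can_fetch("GPTBot", "https://yoursite.com/llms.txt"))   # → True
print(rp.can_fetch("GPTBot", "https://yoursite.com/drafts/x"))   # → False
```

If the first check ever returns False, your robots.txt is blocking the very crawlers LLMs.txt is written for.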
Key Takeaways
- LLMs.txt was proposed by Jeremy Howard of Answer.AI in 2024 and is now supported by GPTBot, ClaudeBot, and PerplexityBot
- Unlike robots.txt (which restricts access) or sitemap.xml (which lists all pages), LLMs.txt is a curated positive guide for AI systems
- Setup takes under 30 minutes; maintenance requires occasional updates when publishing new priority content
- Measure effectiveness by monitoring AI crawler server logs and tracking citation rate changes for listed URLs
- The llms-full.txt companion file is especially valuable for documentation sites and JavaScript-heavy pages
For a detailed implementation tutorial, see our step-by-step LLMs.txt creation guide. Manage and monitor your LLMs.txt impact with AI Rank Lab.
Frequently Asked Questions
- What is LLMs.txt?
- Does LLMs.txt actually improve AI citations?
- Is LLMs.txt an official standard?
- What should I include in my LLMs.txt file?
- Where exactly should I place the LLMs.txt file?
- How is LLMs.txt different from a sitemap?
Written by
Devanshu
AI Search Optimization Expert