---
title: "LLMs.txt: Why AI Crawlers Ignore It (2025 Audit)"
date: "2025-08-11"
author: "Flavio Longato"
tags: ["llms.txt"]
categories: ["GEO"]
url: "https://www.longato.ch/llms-recommendation-2025-august/"
---

**Updated: June 2026** · A new article has been published on the subject of llms.txt and extends my earlier write-up.

This analysis reviews how llms.txt files are actually used by LLM crawlers.

How the analysis was performed: I audited 30 days of raw CDN logs for **1,000 Adobe Experience Manager domains** to see who actually requests the file. The results were, frankly, brutal.
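
The tally behind an audit like this can be sketched in a few lines of Python. This is a minimal sketch, not the tooling used for the audit: it assumes logs in the common Apache/Nginx "combined" format (real CDN log schemas usually differ), and the `LOG_PATTERN`, `llms_txt_share`, and sample lines are hypothetical.

```python
import re
from collections import Counter

# Assumption: "combined" access-log format. Adapt the pattern to your CDN's schema.
LOG_PATTERN = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def llms_txt_share(lines):
    """Return {user_agent: percent of /llms.txt requests}, largest share first."""
    hits = Counter()
    for line in lines:
        match = LOG_PATTERN.search(line)
        if not match:
            continue  # skip lines that do not fit the assumed format
        path = match.group("path").split("?", 1)[0]  # ignore query strings
        if path.lower() == "/llms.txt":
            hits[match.group("ua")] += 1
    total = sum(hits.values())
    if total == 0:
        return {}
    return {ua: round(100 * n / total, 1) for ua, n in hits.most_common()}


# Made-up sample lines, for illustration only.
SAMPLE = [
    '203.0.113.5 - - [01/Aug/2025:10:00:00 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
    '203.0.113.5 - - [01/Aug/2025:10:05:00 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
    '203.0.113.5 - - [01/Aug/2025:10:10:00 +0000] "GET /llms.txt?x=1 HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
    '198.51.100.7 - - [01/Aug/2025:11:00:00 +0000] "GET /llms.txt HTTP/1.1" 404 0 "-" "OtherBot/2.0"',
    '198.51.100.7 - - [01/Aug/2025:11:01:00 +0000] "GET /index.html HTTP/1.1" 200 900 "-" "OtherBot/2.0"',
]

print(llms_txt_share(SAMPLE))
```

Grouping by the raw user-agent string keeps the sketch simple; a real audit would likely normalise versions and verify bots by IP range, since user-agent strings can be spoofed.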

Findings of the LLMs.txt audit:
-------------------------------

- **LLM-specific bots stayed away.** No GPTBot, ClaudeBot, PerplexityBot, or similar were seen at all.
- **Google still probes everything.** Its desktop crawler accounted for 95% of all hits.
- **Bing is curious but inconsistent.** Only seven requests, all concentrated on one domain out of 1,000.
- **OpenAI’s search bot was minimal.** Ten calls from `OpenAIBotSearch`. GPTBot itself was absent.
- **SEO tools inflated the logs.** Tools like Semrush Mobile and SiteAudit caused many hits, unrelated to LLMs.
 
| Rank | User-agent | Share of all llms.txt hits |
|------|------------|----------------------------|
| 1 | **GoogleBotDesktop** | 94.9% |
| 2 | OpenAIBotSearch | 1.1% |
| 3 | ScanPire | 0.8% |
| 4 | BingBot | 0.8% |
| … | *Eight other bots* | <1% each |

Why Aren’t LLMs going to the llms.txt file?
-------------------------------------------

1. **The spec is still unofficial.** No LLM lab has committed to honoring it yet.
2. **Most training uses pre-built datasets.** Models are trained on corpora such as Common Crawl or books, not on live fetches.
3. **Robots.txt already covers them.** Major labs honor standard tokens like `GPTBot` and `ClaudeBot`.
4. **It’s not cost-effective.** Probing `llms.txt` on every domain wastes crawl budget.
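
Point 3 can be checked locally before you deploy a robots.txt change. The following is a minimal sketch using Python's standard-library `urllib.robotparser`; the `ROBOTS_TXT` rules and the `allowed` helper are hypothetical examples, not a recommended policy.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, for illustration only.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /private/

User-agent: *
Allow: /
"""


def allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


for agent in ("GPTBot", "ClaudeBot", "Googlebot"):
    print(agent, allowed(ROBOTS_TXT, agent, "https://example.com/blog/"))
```

Note that this only tells you what a well-behaved crawler should do with the rules; whether a given bot honors them is something only your logs can confirm.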
 
What are my recommendations for site owners in relation to llms.txt?
-------------------------------------------------------------------

This really depends on how difficult the llms.txt file is to implement. If it is relatively easy to create, go for it. If it requires a large amount of resources, I’d recommend holding back until we clearly see benefits.

> For example, this domain uses the llms.txt file at https://www.longato.ch/llms.txt because it was easy to implement.

- **Use robots.txt instead.** It’s the only widely respected barrier today.
- **Watch your logs.** Use tools like Grafana or BigQuery to detect AI crawlers directly.
  - Remember: if you use a CDN, you need to look at the CDN’s own logs.
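
Whatever tool you use, the detection step comes down to matching user-agent strings against known crawler tokens. Here is a minimal sketch in Python, assuming you have already extracted the user-agent column from your logs; the token list only covers the bots named in this article, and the function names and sample strings are hypothetical.

```python
from collections import Counter

# Tokens named in this article; extend with whatever your own logs show.
AI_CRAWLER_TOKENS = ("GPTBot", "ClaudeBot", "PerplexityBot")


def classify_user_agent(user_agent):
    """Return the matching AI crawler token, or None for everything else."""
    ua = user_agent.lower()
    for token in AI_CRAWLER_TOKENS:
        if token.lower() in ua:
            return token
    return None


def count_ai_crawlers(user_agents):
    """Count requests per AI crawler token across a list of user-agent strings."""
    counts = Counter()
    for ua in user_agents:
        token = classify_user_agent(ua)
        if token:
            counts[token] += 1
    return dict(counts)


# Illustrative user-agent strings, not copied from real logs.
SAMPLE_USER_AGENTS = [
    "Mozilla/5.0 (compatible; GPTBot/1.0)",
    "Mozilla/5.0 (compatible; gptbot/1.1)",
    "Mozilla/5.0 (compatible; ClaudeBot/1.0)",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/126.0",
]

print(count_ai_crawlers(SAMPLE_USER_AGENTS))
```

Because user-agent strings are trivial to spoof, treat these counts as a first signal and cross-check against the IP ranges each provider publishes.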
 
What Might Change Soon for LLMs.txt
-----------------------------------

As of now (August 2025), no major LLM provider has made an announcement about llms.txt.

| Provider | Current stance on llms.txt | Signal to watch |
|----------|----------------------------|-----------------|
| OpenAI | No support announced | GPTBot documentation updates |
| Google / Gemini | Monitors but uses Google-Extended | Revisions to Google-Extended policy |
| Microsoft / Copilot | Silent | BingBlog crawler updates |
| Meta | No mention | Meta crawler policy changes |
| Anthropic | No mention | ClaudeBot UA policy |

Is there any external validation of my findings?
-------------------------------------------------

| Date | Key development | Who said / did it | Take-away |
|------|-----------------|-------------------|-----------|
| **17 Jun 2025** | “FWIW **no AI system currently uses llms.txt**.” | John Mueller, Google, on Bluesky | Google confirms zero support and no immediate plans. ([Search Engine Roundtable](https://www.seroundtable.com/google-ai-llms-txt-39607.html)) |
| **19 Jun 2025** | ScaleMath publishes an adoption-tracker deep-dive. | Independent analysts | Finds early enthusiasm among dev-doc sites but **no proof of LLM consumption**. ([ScaleMath](https://scalemath.com/blog/llms-txt/)) |
| **02 Jul 2025** | PPC Land headline: “llms.txt adoption stalls as major AI platforms ignore proposed standard”. | Industry press | OpenAI, Google, Anthropic still not honoring the file. ([PPC Land](https://ppc.land/llms-txt-adoption-stalls-as-major-ai-platforms-ignore-proposed-standard/?utm_source=chatgpt.com)) |
| **22 Jul 2025** | Mueller advises **adding `X-Robots-Tag: noindex`** to llms.txt to avoid clutter in Google results. | Google | Tactical hygiene tip; doesn’t affect crawling behaviour. ([Stan Ventures](https://www.stanventures.com/news/noindex-llms-txt-google-recommendation-3674/)) |
| **24 Jul 2025** | Logs show **OpenAI’s crawler fetching llms.txt every ~15 min** on some sites. Google’s Gary Illyes repeats “we won’t support it.” | Search Engine Roundtable | Anecdotal evidence OpenAI is *testing* discovery, not an official endorsement. ([Search Engine Roundtable](https://www.seroundtable.com/openai-crawling-llms-txt-files-39811.html)) |
| **Late Jul 2025** | Server-log studies detect sporadic hits from other AI bots but no sustained utilisation. | ArcherEdu analytics | Suggests experiments, not production use. ([archeredu.com](https://www.archeredu.com/hemj/are-llms-txt-files-being-implemented-across-the-web/?utm_source=chatgpt.com)) |

Where to Go from Here
---------------------

- **Automate deployment** of llms.txt across all properties using your CMS or server configuration.
- **Audit quarterly.** LLM behavior evolves fast—track what’s changed.
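
If you do roll the file out through server configuration, the `X-Robots-Tag: noindex` tip from the 22 Jul 2025 entry above can be applied at the same layer. Below is a minimal sketch using Python's standard-library WSGI helpers. The `app` and `fetch` names, the placeholder file body, and the `text/plain` content type are assumptions for illustration; in practice you would set the header in your CDN or web server configuration instead.

```python
from wsgiref.util import setup_testing_defaults

# Placeholder file body, for illustration only.
LLMS_TXT = b"# Example Site\n\n> Placeholder summary.\n"


def app(environ, start_response):
    """Serve /llms.txt with a noindex header; everything else is a 404."""
    if environ.get("PATH_INFO") == "/llms.txt":
        headers = [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("X-Robots-Tag", "noindex"),  # keeps the file out of search results
            ("Content-Length", str(len(LLMS_TXT))),
        ]
        start_response("200 OK", headers)
        return [LLMS_TXT]
    start_response("404 Not Found", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"Not found\n"]


def fetch(path):
    """Call the app in-process and return (status, headers dict, body)."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body


status, headers, _ = fetch("/llms.txt")
print(status, headers["X-Robots-Tag"])
```

The in-process `fetch` helper makes the header easy to assert in a deployment test, which fits the quarterly audit suggested above.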
 
---

**Bottom line:** `llms.txt` is a good idea in theory, but today’s bots don’t read it. Until adoption improves, your best defense remains `robots.txt` and a clear content policy backed by logs.

FAQ: Understanding llms.txt
---------------------------

### What is llms.txt and who proposed it?

`llms.txt` is a proposed text file format that website owners can place at the root of their domain, for example https://example.com/llms.txt. The goal is to help LLMs discover and index a site’s content.

> Large language models increasingly rely on website information, but face a critical limitation: context windows are too small to handle most websites in their entirety. Converting complex HTML pages with navigation, ads, and JavaScript into LLM-friendly plain text is both difficult and imprecise.
>
> Source: https://llmstxt.org/

In addition, Markdown (`.md`) files are used to provide raw-text versions of pages, which lets LLM bots crawl and read the content faster. This is especially important for JavaScript-heavy, client-side-rendered sites.
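
For readers who want to see what such a file looks like, the layout proposed at llmstxt.org is plain Markdown: an H1 title, a blockquote summary, then H2 sections containing link lists. Here is a minimal sketch of a generator in Python; the `build_llms_txt` function, the site name, and the example.com URLs are all hypothetical.

```python
def build_llms_txt(title, summary, sections):
    """Build llms.txt content.

    sections maps a heading to a list of (name, url, note) tuples;
    an empty note is omitted from the link line.
    """
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, note in links:
            entry = f"- [{name}]({url})"
            if note:
                entry += f": {note}"
            lines.append(entry)
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical site structure, for illustration only.
EXAMPLE = build_llms_txt(
    "Example Site",
    "Placeholder summary of the site.",
    {
        "Docs": [
            ("Getting started", "https://example.com/start.md", "setup guide"),
            ("FAQ", "https://example.com/faq.md", ""),
        ]
    },
)

print(EXAMPLE)
```

Generating the file from your CMS's page inventory like this is what keeps the implementation cost low, which is the deciding factor in the recommendation above.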

### Why is the recommendation wrong?

While well-meaning, the recommendation to add `llms.txt` overestimates its real-world effect. As shown in the log analysis above, **none of the major LLM crawlers** (OpenAI’s GPTBot, Anthropic’s ClaudeBot, PerplexityBot, Meta’s crawler, etc.) currently request the `llms.txt` file. Apart from a handful of calls from OpenAI’s search bot, only traditional search crawlers like GoogleBot or BingBot made any contact, and not for training purposes.

So while it may feel proactive, adding `llms.txt` today does almost nothing.

Continue the conversation:
--------------------------

- [Reddit: r/SEO discussion thread](https://www.reddit.com/r/SEO/comments/1moss0s/llmstxt_why_almost_every_ai_crawler_ignores_it_as/)