
ChatGPT Referral Traffic Increased ~60% Per Site: What I Found Across Three Analytics Sources

What changed, in one sentence

Over the last few weeks I have observed an increase in referral traffic coming from ChatGPT to websites: users who run a prompt, see a response, and then click a link that takes them to a brand’s site. Once normalised by the number of active sites, the lift converges on roughly 60% per site across three independent measurement sources.

This article lays out exactly what I measured, how I measured it, where the sources agree and disagree, and what I think it means for brands. I have kept the distinction between what the data shows and what I am inferring explicit throughout.

The headline numbers

The first thing to be clear about is that I am reporting a per-site daily traffic metric, not a raw total. The raw totals were distorted by changes in the customer base during the period (more on that below), so the per-site figure is the cleaner signal.

Per-site daily traffic change, April vs May 2026

Source	Per-site daily traffic change
Adobe Analytics	+60%
GA4	+97%
Optel	+61%

Comparison period: April vs May 2026. Metric: per-site daily pageviews, normalised by active sites. Source: internal Adobe analysis, June 2026 deck, slide 2.

The two figures I would anchor on are the Adobe Analytics and Optel numbers, which sit at +60% and +61%. GA4 is higher at +97%. I treat the aggregate, weighted across sources, as approximately a 60% improvement per site. That is the number I am most comfortable defending.

Why I report per-site and not raw totals

This is the part that matters most for anyone trying to reproduce or sanity-check the result, so I want to be precise about it.

Between April and May 2026, the customer base behind each source changed. More customers were onboarded into Adobe Analytics and GA4, while the active site count dropped in Optel. That movement inflated the raw totals for Adobe Analytics and GA4, and deflated them for Optel. If you look at raw total pageview change alone, you get a misleading picture. Normalising by active site count removes that distortion.

Raw total vs per-site change, by source

Source	Sites/day change	Raw total PV change	PV per site per day change
Adobe Analytics	+28%	+105%	+60%
GA4	+52%	+200%	+97%
Optel	−20%	+28%	+61%

Comparison period: April to May 2026. Metric: percentage change in daily average pageviews. Source: internal Adobe analysis, June 2026 deck, slide 4.

How to read this table

Take Optel as the clearest example. Its raw total pageviews rose only +28%, which looks modest. But its active site count fell by 20% over the same window. Once you divide pageviews by the number of sites actually reporting, the per-site figure is +61%. The raw number understated the real per-site lift because there were fewer sites generating it.

The opposite happened with GA4 and Adobe Analytics: their site counts grew (+52% and +28% respectively), so part of the raw total increase (+200% and +105%) was simply more sites being measured, not more traffic per site. Stripping that out leaves +97% and +60% per site.
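The normalisation described above can be sketched in a few lines. This is a minimal illustration, assuming the per-site change is simply the raw growth factor divided by the site-count growth factor; the function name is mine, and the inputs are the rounded percentages from the table, so the results land within about a point of the published per-site figures rather than matching them exactly.

```python
def per_site_change(raw_total_change: float, site_count_change: float) -> float:
    """Return the per-site change implied by a raw total change and a site-count change.

    Inputs and result are fractional changes, so 0.28 means +28%.
    """
    return (1 + raw_total_change) / (1 + site_count_change) - 1


# Rounded inputs from the table above: (raw total PV change, sites/day change).
sources = {
    "Adobe Analytics": (1.05, 0.28),
    "GA4": (2.00, 0.52),
    "Optel": (0.28, -0.20),
}

for name, (raw, sites) in sources.items():
    print(f"{name}: {per_site_change(raw, sites):+.1%}")
```

The Optel row shows why the direction of the correction flips: a shrinking site count makes the divisor smaller than 1, so the per-site figure ends up higher than the raw total.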

The finding

Despite three different customer counts and three different tracking methodologies, the per-site signal converges. Adobe Analytics and Optel land within one percentage point of each other (+60% and +61%), and the aggregate sits around 60%. I consider convergence across independent tools to be the strongest evidence here, because it is unlikely that three separate measurement systems would produce the same artefact for unrelated reasons.

Caveat I want to flag: GA4’s +97% is meaningfully higher than the other two. The deck presents the convergence as “~60% per-site growth,” and I stand by that as the conservative read, but I am not able, from the data provided, to fully explain why GA4 runs higher. It may relate to how GA4 attributes referral sessions versus the other two systems. I am noting this as an open item rather than resolving it.

The uplift was a step-change, not gradual growth

The shape of the increase matters as much as its size. This was not a slow climb. It was a step-change that happened within a single week.

I looked at weekly OpenAI referral pageviews from Optel for January through May 2026. I used Optel for this view specifically because it gave me the largest sample and the longest history.

Weekly OpenAI referral pageviews from Optel (illustrative, indexed)

The chart in the deck plots weekly values on a relative scale from roughly 1.1 to 3.0. The table below reproduces the approximate weekly pattern read from that chart. These are indexed/relative values as shown in the source visualisation, not absolute counts.

Week commencing	Approx. relative value	Phase
Jan 5	1.7	Baseline
Jan 12	1.8	Baseline
Jan 19	1.9	Baseline
Jan 26	1.9	Baseline
Feb 2	1.85	Baseline
Feb 9	1.8	Baseline
Feb 16	1.75	Baseline
Feb 23	1.85	Baseline
Mar 2	1.75	Baseline
Mar 9	2.0	Baseline
Mar 16	2.05	Baseline
Mar 23	2.25	Baseline
Mar 30	2.1	Baseline
Apr 6	2.25	Baseline
Apr 13	2.25	Baseline
Apr 20	1.75	Dip (unexplained)
Apr 27	1.1	Dip (unexplained)
May 4	2.0	Transition
May 11	2.9	Step-change
May 18	3.0	Step-change

Source: internal Adobe analysis, June 2026 deck, slide 3. Values are approximate, read from the source chart, and indexed rather than absolute.
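As a rough cross-check against the per-site figures, the indexed values in the table can be compared directly. This is a sketch using only the approximate values read from the chart, so the ratios are indicative rather than exact, and the variable names are mine.

```python
from statistics import mean

# Approximate indexed values read from the table above (not absolute counts).
april_weeks = {"Apr 6": 2.25, "Apr 13": 2.25, "Apr 20": 1.75, "Apr 27": 1.1}
may_step_weeks = {"May 11": 2.9, "May 18": 3.0}

april_avg = mean(april_weeks.values())
step_avg = mean(may_step_weeks.values())
may_11_ratio = may_step_weeks["May 11"] / april_avg

print(f"April weekly average: {april_avg:.2f}")
print(f"May 11 vs April average: {may_11_ratio:.2f}x")
print(f"Step-change weeks vs April average: {step_avg / april_avg:.2f}x")
```

On these values the step-change weeks sit at roughly 1.6 times the April weekly average, which is in line with the ~60% per-site lift. Note that the two dip weeks pull the April average down, so the ratio against the pre-dip baseline would be smaller.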

What the chart shows

Three things stand out, and they match the key insights called out in the deck:

A stable baseline from January through mid-April. In absolute terms the deck describes this baseline as roughly 1.7M to 2.3M referral pageviews per week. The click-through rate from responses, and the referral volume, stayed more or less flat across this window.
An unexplained dip in late April. The weeks of April 20 and April 27 show a decrease. I do not have an explanation for this in the data. My honest read, stated as opinion and not fact, is that this looks like experimentation on OpenAI’s side during that window. I cannot confirm that.
A step-change in May. Pageviews jumped to roughly 2.9 to 3.0 on the chart’s scale in the weeks commencing May 11 and May 18, about 1.6 times the average April week in the table above (roughly 1.84). This was a step-change, not gradual growth.

A dating discrepancy I need to flag

There is an inconsistency in my own source materials about exactly when the step-change began, and I would rather surface it than paper over it:

Source	Date attributed to the change
Deck, slide 3 (key insight)	“May 4+” step-change
Video walkthrough	Increase clearly visible “after May eleventh”
Deck, slide 7 (URL Inspector tooltip)	“May 7, 2026”

These three references point to the same event but cite slightly different start dates within early-to-mid May. The weekly granularity of the data is part of the reason: a weekly bar for the week commencing May 4 can capture a change that began mid-week. I am confident the step-change occurred in early-to-mid May 2026 and sustained through late May. I am not able to pin it to a single day from the data I have, and I would not claim a precise date.
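The weekly-granularity point can be made concrete. The sketch below assumes weeks start on Monday, which matches the week-commencing dates in the table for 2026, and it uses May 11 as a stand-in for the video's "after May eleventh"; the helper name is mine.

```python
from datetime import date, timedelta


def week_commencing(day: date) -> date:
    """Return the Monday that starts the week containing `day`."""
    return day - timedelta(days=day.weekday())


# The three dates cited across the source materials.
cited = {
    "Deck, slide 3": date(2026, 5, 4),
    "Deck, slide 7 tooltip": date(2026, 5, 7),
    "Video walkthrough (stand-in date)": date(2026, 5, 11),
}

for source, day in cited.items():
    print(f"{source}: {day} falls in week commencing {week_commencing(day)}")
```

A change that began on May 7 would be recorded in the bar for the week commencing May 4, but it would only affect part of that week, which is consistent with that bar reading as a transition and the following week showing the full step.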

Why the traffic increased: brand mentions became clickable links

The natural question is whether users simply started clicking citations more often at random. Based on what I observed, that is not what happened. The mechanism is a product behaviour change in ChatGPT.

Previously, when your brand was cited in a ChatGPT response, you would get your brand name in bold but with no link. The user had to go and search for you separately. Now, ChatGPT is adding a hyperlink to the brand mention itself, pointing to the brand’s official website.

Before vs after

	Before (plain text mention)	After (hyperlinked mention, May 2026)
Answer text	Brand name appears, bolded	Same answer text, brand name now linked
Link behaviour	No link generated; user must search separately	Brand mention is a clickable link to the brand’s home page
Referral result	0 referral sessions	+60% to +97% referral sessions per site

Source: internal Adobe analysis, June 2026 deck, slide 5. Example brand used in the deck illustration: a brand mentioned in an enterprise content management answer.

The way I describe it: this is the same dynamic as the blue links in a Google search results page. When the mention is clickable, the user is far more likely to click through. The change is not limited to formal citations; it now extends to the brand mention in the body of the answer. That is the structural reason the per-site referral numbers moved.

What this means for brands

Here are my conclusions. I have separated each into the finding, the evidence behind it, and the implication, so you can weigh them independently.

Finding 1: ChatGPT is shifting from a brand channel to a traffic channel

Evidence: Before May 2026, unlinked brand mentions in ChatGPT generated awareness but produced 0 referral sessions of their own (deck, slide 5). Hyperlinked mentions now convert that brand presence into measurable referral traffic, observed across three sources (deck, slides 2 and 4).
Implication: Presence in ChatGPT answers is no longer only a visibility play. It can now be tracked and reported as a referral acquisition channel. This is structurally new behaviour, not a continuation of the prior pattern.

Finding 2: The signal is robust and confirmed across three independent tools

Evidence: Adobe Analytics, GA4, and Optel all show the same underlying per-site lift once normalised by active site count, converging around 60% (deck, slide 4). The step-change occurred in early-to-mid May and was sustained through late May (deck, slide 3). The sample spans thousands of websites.
Implication: This is not a quirk of one dataset or one customer. The convergence across independent methodologies, on a large site sample, is what gives me confidence in reporting it.
Caveat: GA4’s per-site figure (+97%) sits above the other two (+60% and +61%). I report ~60% as the conservative aggregate and flag the GA4 gap as unexplained from the available data.

Finding 3: Most brand mentions are still unlinked, so this is the beginning, not the peak

Evidence: Hyperlinked mentions currently remain a minority of total brand mentions, per the broader analysis referenced in the deck. The ~60% per-site growth therefore reflects a partial rollout (deck, slide 6).
Implication: If hyperlinking continues to broaden across more mentions, referral volume should continue to grow, potentially beyond 60%. I want to be careful here: this is a forward-looking inference based on the direction of the rollout, not a measured result. I am stating it as an expectation, not a fact.

Tracking this in Adobe LLM Optimizer

Because of these traffic increases, we added a feature to Adobe LLM Optimizer inside the URL Inspector. It surfaces Referral Hits from LLMs, drawn from Optel and CDN logs, so that when an event like this release happens on a given date, you can see right away what it was and why it affected your site.

In the product, the May event is annotated directly on the timeline with the note that ChatGPT referral traffic to brand websites increased approximately 60% because ChatGPT started automatically linking brand names in its responses to the brands’ official websites. The goal of the feature is to connect a movement in your referral traffic to the underlying cause without you having to reconstruct it manually.

Disclosure repeated for clarity: this is an Adobe product I work on.

Practical steps you can take

Following the spirit of ending on something actionable, here is what I would check if I were responsible for a brand’s LLM visibility right now:

Segment your referral traffic by LLM source. If you have not already isolated ChatGPT and other LLM referrers in your analytics, do that first. You cannot manage what you are not measuring separately.
Compare April against May 2026 on a per-site or per-property basis. Look at the per-site or per-property figure, not the raw total, especially if your tracked-property count changed during the period. The Optel example above shows how a raw total can hide the real per-site movement.
Check whether your brand mentions in ChatGPT are linked. Run representative prompts for your category and observe whether your brand name appears as a clickable link or as plain bolded text. Linked mentions are still a minority, so there is likely headroom.
Annotate the early-to-mid May 2026 step-change in your own reporting. If you see a jump in LLM referral traffic in that window, this rollout is the most probable explanation based on the data here.
Re-check periodically. Because hyperlinking appears to be a partial, ongoing rollout, the picture is likely to keep changing. Treat this as a moving baseline rather than a settled number.
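For the first step, segmenting by LLM source, a minimal sketch might look like the following. The referrer hosts in the map are my assumptions for illustration and are not taken from the analysis; check them against the referrers you actually see in your own analytics before relying on them.

```python
from collections import Counter
from typing import Optional
from urllib.parse import urlparse

# Assumed referrer hosts, for illustration only; confirm them in your own data.
LLM_REFERRER_HOSTS = {
    "chatgpt.com": "ChatGPT",
    "chat.openai.com": "ChatGPT",
    "perplexity.ai": "Perplexity",
}


def llm_source(referrer: str) -> Optional[str]:
    """Return the LLM label for a referrer URL, or None if it is not a known LLM host."""
    host = (urlparse(referrer).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return LLM_REFERRER_HOSTS.get(host)


# Hypothetical session referrers.
referrers = [
    "https://chatgpt.com/",
    "https://www.perplexity.ai/search?q=widgets",
    "https://www.google.com/",
    "",
]

by_source = Counter(llm_source(r) or "Non-LLM / direct" for r in referrers)
print(by_source)
```

Keeping the host map as data rather than logic makes it easy to extend as new LLM referrers appear.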

What I am not claiming

In keeping with separating fact from opinion:

I am not claiming a precise start date for the change. The data supports early-to-mid May 2026, and my own sources disagree on the exact day.
I am not claiming the GA4 +97% figure is the right headline. I report ~60% as the conservative aggregate and treat the GA4 gap as unexplained.
I am not claiming referral volume will definitely exceed 60% in future. That is a forward-looking expectation based on the rollout being partial, not a measurement.
I am not disclosing customer identities, and the figures are normalised and aggregated rather than tied to any single named site.

Summary

Across Adobe Analytics, GA4, and Optel, I observed an increase in ChatGPT referral traffic in May 2026 that converges on roughly 60% per site once normalised by active site count. The increase was a step-change in early-to-mid May, not gradual growth, and the mechanism behind it is ChatGPT turning brand mentions into clickable hyperlinks rather than plain text. Because most brand mentions are still unlinked, I expect this channel to keep developing. I have flagged the open questions, the dating discrepancy, and the limits of what the data supports so that anyone can challenge or reproduce the analysis.

Methodology: comparison of per-site daily pageviews, April vs May 2026, normalised by active site count, across three measurement sources (Adobe Analytics, GA4, Optel). Weekly trend view drawn from Optel, January to May 2026, chosen for the largest sample and longest history. Sample spans thousands of websites. Figures are aggregated and normalised; no individual customers are identified. Source materials: internal Adobe analysis deck “ChatGPT referral traffic increase” (June 2026) and accompanying video walkthrough.

If you spot an error in this analysis, please get in touch via longato.ch and I will investigate and publish a visible correction once verified.

Disclosure. I work at Adobe on the LLM Optimizer team. The data, screenshots, and product references in this article come from my work there. All views expressed are my own and do not represent those of my employer. This article is based on an internal analysis I presented in June 2026 (deck: ChatGPT referral traffic increase, Adobe LLM Optimizer) and the accompanying walkthrough video. Where I infer beyond the data, I say so. Customer names are not disclosed.

June 8, 2026

LLMs.txt – What You Need to Know: The Largest Audit to Date from Adobe AEM

Published: June 2026 · longato.ch

Companion piece: this article updates and extends my earlier write-up, llms.txt: my recommendation, August 2025.

The six findings you can quote

“Create llms.txt because it is cheap and Google is now looking at it, not because it will get you cited in ChatGPT today.”

“Across 22,494 recorded requests to /llms.txt over a 30-day window, agents that are verifiably large language models accounted for 258 hits, which is 1.1% of all traffic to the file.”

“The single biggest change since my August 2025 audit is Googlebot. It is now the largest named crawler hitting /llms.txt, with 1,219 recorded requests.”

“92.2% of all /llms.txt traffic came from agents that are neither mainstream search engines nor verifiable LLMs. The file’s main audience today is SEO tooling, monitoring services, and AI-readiness auditors inspecting the file, not models consuming it.”

“OpenAI’s user-facing and search agents, OAI-SearchBot and ChatGPT-User, generated 209 hits. Even with GPTBot’s 33 training-crawler hits added, OpenAI totals 242 hits across 69 hosts. That is the totality of OpenAI’s interest in /llms.txt in this dataset.”

“In a direct referrer analysis I found zero requests anywhere in the logs, search bots included, that carried /llms.txt as their referrer. Whatever crawlers do after reading the file, they do not arrive at other URLs from it in any way the logs can see.”

What changed since August 2025

My August 2025 analysis examined the same question on the same kind of footprint. The qualitative shift over the intervening period is best shown side by side.

August 2025 against June 2026

Dimension	August 2025 (prior analysis)	June 2026 (this audit)	Direction of change
Googlebot hitting `/llms.txt`	Not a meaningful presence	1,219 hits, the largest named crawler at the file	Major increase
Verifiable LLM hits to `/llms.txt`	Negligible	258 hits, 1.1% of all traffic	Still negligible as a share
OpenAI-specific interest	Minimal	209 hits from OAI-SearchBot and ChatGPT-User, about 69 hosts	Slightly up, still tiny
Dominant traffic source	Already non-LLM	Other / unverified tooling at 92.2%	The bucket has grown and professionalised
Self-labelled audit and readiness bots	Emerging	60.1% of all traffic	New, large category
Referrals originating from `llms.txt`	None observed	Still none observed	Unchanged
Crawler entry point	Homepage-led	Homepage-led	Unchanged

Sources: my prior published analysis from August 2025 for the “before” column, and Datasets C and D plus the referrer analysis for the “after” column.

The most material change is Googlebot’s arrival at /llms.txt in volume. This is consistent with a wider observation in the SEO community. Martina Raissle has noted publicly on LinkedIn that Google has begun including llms.txt in its Lighthouse checks, which is itself a signal that the file is at least on Google’s radar.

I want to be careful about what this does and does not prove. Googlebot fetching a URL is not proof that the content is used for ranking, AI Overviews, or AI Mode. A fetch is a fetch. But it is a clear change from a year ago, and combined with the Lighthouse inclusion, it is the first concrete sign from a major provider that llms.txt is being looked at rather than ignored. I weight this as worth acting on cheaply, not as proven to work, and my recommendation below reflects that.

My recommendation

This is my professional judgement, grounded in the data above.

Recommendation summary

#	Recommendation	Supporting evidence	Confidence
1	Create the `llms.txt` file	Googlebot is now the largest named crawler at the file, 1,219 hits; Google has added it to Lighthouse checks	Moderate
2	Treat it as low-effort insurance, not a growth lever	Generating the file is cheap; the return is asymmetric if providers do begin to use it	High, on the cost logic
3	Do not expect it to move LLM brand visibility or citations today	Verifiable LLMs account for 1.1% of hits; no referrer trail exists	High
4	Keep investing in homepage strength and internal linking	Crawlers enter via the homepage and follow links	High
5	Watch Google AI Mode and AI Overviews specifically	Google’s fetching plus Lighthouse inclusion is the only mover in a year; impact there is plausible but unproven	Low, speculative

In plain terms: create the file, because Google is now hitting it, and that alone changes the calculus from a year ago. The effort is minimal, so the return on investment is favourable if the providers do in fact consume it; you are buying a cheap option on an uncertain upside. Will it move LLM brand visibility or citations? Probably not, not yet. The traditional consumer LLMs such as ChatGPT are not meaningfully using the file on this evidence, and the honest answer is that the consumption simply is not there at the scale that would move citations. Will it affect Google’s AI Mode? Maybe. Google is the one provider showing changed behaviour. I would not bet the strategy on it, but I would not ignore it either.

What is llms.txt?

llms.txt is a proposed Markdown file placed at the root of a domain, for example https://example.com/llms.txt. The llmstxt.org proposal frames it as a curated, machine-readable map: a short summary of the site plus a hand-picked list of the most important pages, often with companion .md versions of those pages, so that a large language model can find and ingest the high-value content without crawling the entire site or fighting through navigation, scripts, and boilerplate. The analogy its proponents draw is to robots.txt and sitemap.xml: a small, conventional file at a predictable path that machines can rely on. The crucial difference is that robots.txt and sitemap.xml are honoured by documented, identifiable crawlers, whereas llms.txt only delivers value if the LLM providers choose to read it. Whether they do is precisely the question this audit set out to answer with logs rather than opinion.
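To make the "curated map" idea concrete, here is a small sketch that pulls the listed links out of an llms.txt body. The sample content is hypothetical and only loosely follows the llmstxt.org layout (a title, a short summary, then sections of Markdown links); the regular expression is a simplification of Markdown link syntax, not a full parser.

```python
import re

# Hypothetical llms.txt content, loosely following the llmstxt.org proposal:
# an H1 title, a blockquote summary, then H2 sections of Markdown links.
SAMPLE = """\
# Example Co

> Example Co sells widgets. This file lists our most useful pages.

## Docs

- [Getting started](https://example.com/docs/start.md): setup in five minutes
- [API reference](https://example.com/docs/api.md)
"""

# Simplified Markdown link pattern: [label](url)
LINK = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")


def listed_links(text: str) -> list:
    """Return (label, url) pairs for every Markdown link in the text."""
    return LINK.findall(text)


for label, url in listed_links(SAMPLE):
    print(f"{label} -> {url}")
```

These extracted URLs are exactly the ones a model would be expected to fetch if it consumed the file as intended, which is what the referrer analysis later in this audit looks for.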

Why I ran this LLMs.txt audit

Two pressures converged.

The first was a recurring question from customers. I was being asked, on a roughly weekly cadence, whether llms.txt was actually being used, and whether it was worth the effort of generating and maintaining. That is a fair question, and it deserves a data-backed answer rather than a shrug.

The second was the state of the GEO and AEO conversation. The generative-engine-optimisation and answer-engine-optimisation community has been circulating a lot of confident, contradictory, and frequently unsourced claims about llms.txt: that the major models definitely read it, that it definitely boosts citations, or conversely that it is completely ignored. Both extremes tend to be asserted without server logs to back them. The only responsible move was to look at what bots actually do at the file, at scale.

This is, to my knowledge, the largest single llms.txt server-log and crawl audit conducted to date by number of distinct domains and by volume of bot traffic examined. The domains analysed are real customer sites hosted on Adobe Experience Manager, and they include some of the world’s largest websites, which is what makes the bot behaviour observed here representative rather than anecdotal.

“Most public claims about llms.txt are made without real analysis. This audit is my attempt to replace assertion with measurement, at the largest domain scale I am aware of.”

Methodology, scope, and caveats

Here is the setup in full so that the findings can be challenged or replicated.

Working with a server log file analysis tool, plus a large-scale crawl of /llms.txt paths, I assembled four datasets:

Dataset	Purpose	Rows	Key fields
A, domain scope log	Which hosts received bot traffic, and how many distinct bots and agents each saw	6,122 hosts	`origin_host`, `hits`, `distinct_bots`, `distinct_agents`, `first_seen`, `last_seen`
B, llms.txt existence crawl	Whether `/llms.txt` actually resolves on each host, and what it returns	5,553 crawl rows (4,819 distinct URLs, 4,685 distinct hosts)	`Address`, `Status Code`, `Content Type`, `Word Count`, `Size (Bytes)`, `Crawl Timestamp`
C, llms.txt hits by host and agent	Every recorded request to `/llms.txt`, split by host and full user-agent string	6,749 rows	`Host`, `request_user_agent`, `hits`
D, llms.txt hits by agent type	The same hit volume, pre-classified by agent family	237 rows	`User Agent Type`, `User Agent Name`, `Full User Agent`, `Hits`

The hit data in Datasets C and D covers a 30-day window. The crawl in Dataset B carries crawl timestamps dated 29 May 2026.

The four questions I set out to answer were:

How many domains have a live llms.txt file?
When an LLM reads llms.txt, does it then crawl the .md files it lists?
How are LLMs actually finding the pages they crawl?
Are there any referrals coming from llms.txt?

A few caveats, stated openly:

User-agent strings are self-declared. Any bot can claim to be anything. I classify “verifiable LLM” conservatively, counting only agents that match the documented user agents of known model providers such as OpenAI, Anthropic, Perplexity, and You.com. Hits in the “Other / unverified” bucket may include real AI activity behind generic strings, but I will not count what I cannot verify.

Datasets C and D contain no per-event timestamp column. The 30-day window is the query window the data was extracted under; it is not re-derivable from inside the files.

Dataset A’s first_seen and last_seen values span a short capture interval, about five minutes on 28 May 2026, which tells me these are sampling markers from one extract rather than the full 30-day span. I therefore use Dataset A only for structural facts such as host counts and bot diversity per host, and never to infer time-based volume.

The tables below are summary tables. I am not releasing the raw logs. The figures are reproducible in principle by anyone running the same crawl and the same log query.
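The conservative "verifiable LLM" rule from the first caveat can be sketched as a token match on the user-agent string. The agent names come from the tables in this audit; the matching logic and the sample strings are my illustration, not the classifier actually used, and a token match inherits the same weakness noted above, because user-agent strings are self-declared.

```python
# Agent tokens named in this audit, mapped to their operator.
VERIFIABLE_LLM_TOKENS = {
    "OAI-SearchBot": "OpenAI",
    "ChatGPT-User": "OpenAI",
    "GPTBot": "OpenAI",
    "ClaudeBot": "Anthropic",
    "PerplexityBot": "Perplexity",
    "Perplexity-User": "Perplexity",
    "YouBot": "You.com",
}


def classify_agent(user_agent: str) -> str:
    """Return the operator for a known LLM agent token, else 'Other / unverified'."""
    ua = user_agent.lower()
    for token, operator in VERIFIABLE_LLM_TOKENS.items():
        if token.lower() in ua:
            return operator
    return "Other / unverified"


# Illustrative user-agent strings, not copied from the logs.
samples = [
    "Mozilla/5.0 (compatible; GPTBot/1.2; +https://openai.com/gptbot)",
    "Mozilla/5.0 (compatible; ClaudeBot/1.0)",
    "ExampleSiteAuditBot/1.0",
]

for ua in samples:
    print(f"{classify_agent(ua)}: {ua}")
```

Anything that does not match a documented token falls into the unverified bucket by default, which is the conservative direction: it can undercount real AI activity but never overcounts it.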

How many domains actually have an llms.txt file?

This is where precision matters most, because “has an llms.txt” is not a single thing. A request to /llms.txt can return a real Markdown file, a redirect, a 404, a soft-200 HTML page, or an empty 200. I broke Dataset B down by HTTP status.

HTTP status of /llms.txt across 5,553 crawl rows (4,685 distinct hosts)

Status code	Meaning	Crawl rows	Share of rows
404	Not found (no file)	4,270	76.9%
301	Permanent redirect	606	10.9%
200	OK (file served)	175	3.2%
403	Forbidden	174	3.1%
302	Temporary redirect	149	2.7%
0	No response or connection failure	90	1.6%
401	Unauthorised	47	0.8%
406	Not acceptable	28	0.5%
(blank)	No status captured	12	0.2%
410	Gone	1	under 0.1%
307	Temporary redirect	1	under 0.1%
Total		5,553	100%

Source: Dataset B, Status Code column. The row count includes 734 duplicate URLs, which I deduplicated before counting hosts.

A 200 response is necessary but not sufficient to call something a real llms.txt. Many 200s are HTML catch-all pages or empty bodies. So I tightened the definition in two further steps.

How many of the 200 responses are genuinely an llms.txt file?

Definition (progressively stricter)	Distinct hosts	Share of 4,685 probed	Share of 6,122 scope-file hosts
Any HTTP 200 at `/llms.txt`	137	2.92%	2.24%
200 and `Content-Type: text/plain`	111	2.37%	1.81%
200 and word count above zero	20	0.43%	0.33%

Source: Dataset B, Status Code plus Content Type plus Word Count columns.

Depending on how strictly you define “has a working llms.txt”, the answer ranges from 137 hosts for any 200, down to 111 hosts for files served as plain text, and as low as 20 hosts for plain-text files with actual measurable content. The 23 responses that returned a 200 with an HTML content type are almost certainly not real llms.txt files at all.
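The progressively stricter definitions can be expressed as a single tiering function over the Dataset B columns (Status Code, Content Type, Word Count). This is a sketch under the assumption that each tier is a subset of the previous one; the function name, tier labels, and sample rows are hypothetical.

```python
def llms_txt_tier(status_code: int, content_type: str, word_count: int) -> str:
    """Place one crawl result into the strictest tier it satisfies."""
    if status_code != 200:
        return "no file"
    # Content types can carry parameters such as "; charset=utf-8".
    if not content_type.lower().startswith("text/plain"):
        return "200, not plain text"
    if word_count <= 0:
        return "200, plain text, empty"
    return "200, plain text, non-empty"


# Hypothetical crawl rows: (status code, content type, word count).
rows = [
    (404, "text/html", 0),
    (200, "text/html; charset=utf-8", 350),
    (200, "text/plain", 0),
    (200, "text/plain; charset=utf-8", 42),
]

for row in rows:
    print(row, "->", llms_txt_tier(*row))
```

Note that the soft-200 HTML row is rejected even though it has a high word count, which is why word count alone is not a safe test for a real file.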

“Of 4,685 domains probed, only 137 returned a 200 at /llms.txt. Tighten the definition to plain text with real content and the number collapses to 20. Adoption is not just low, much of the apparent adoption is hollow.”

Data-quality notes for the existence crawl

Issue	Detail	How I handled it
Duplicate URLs	5,553 rows but 4,819 distinct addresses, so 734 duplicate rows	Deduplicated to distinct hosts before counting
Soft-200 HTML	23 of 175 200-responses were `text/html`, not a text file	Excluded from the strict definitions
Empty 200s	155 of 175 200-responses had a word count of zero	Reported separately and flagged as likely empty or placeholder
Word-count range on real files	The 20 non-empty files ran from 2 to 69 words	Reported; even the “real” files are extremely short

A word count between 2 and 69 on the files that do have content tells me most of these are minimal stubs, a title and a couple of links, rather than the rich, curated index the llmstxt.org proposal envisions. Adoption is shallow on both axes: few sites have the file, and few of those have populated it meaningfully.

Do LLMs crawl the .md files, and are there any referrals from llms.txt?

These two questions share one answer, and it comes from a direct analysis of the referrer field in the logs.

I did not find a single request anywhere in the server logs whose referrer was a /llms.txt URL. This held across all bot types, search engines included, not only LLM agents.

There are two possible explanations, and the logs alone cannot distinguish between them. Either the bots do not crawl immediately: they may read llms.txt, archive or queue what they find, and crawl later in a separate session that carries no referrer linking back to the file. Or the referrer is simply not preserved: bots may crawl the listed .md files but not populate the referrer header with the llms.txt URL.

Either way, the practical consequence is the same. There is no observable evidence in the logs that llms.txt is functioning as a crawl-routing hub. If llms.txt were doing the job its proposal describes, feeding models a list of URLs that they then fetch, I would expect to see at least some referrer trail. I see none.

How are LLMs actually finding pages to crawl?

From the same referrer analysis: when bot requests did carry a referrer, it was, in the overwhelming majority of cases, the homepage of the domain.

The behavioural picture is that crawlers, including AI crawlers, predominantly enter a site at the homepage and discover the rest of the site by following links from there, exactly as classical web crawlers always have. They are not, on this evidence, entering via llms.txt and fanning out from its curated list. The homepage and its internal linking remain the primary discovery surface. This is a strong argument that the fundamentals of crawlability and internal linking still matter far more than a curated llms.txt for getting your content seen.
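The referrer analysis behind both findings can be sketched as a simple bucketing of the referrer field. The bucket names and sample values below are hypothetical, and the real log schema may differ; the point is only to show what "a referrer trail from llms.txt" would look like if it existed.

```python
from collections import Counter
from typing import Optional
from urllib.parse import urlparse


def referrer_kind(referrer: Optional[str]) -> str:
    """Bucket a referrer as 'none', 'llms.txt', 'homepage', or 'other'."""
    if not referrer:
        return "none"
    path = urlparse(referrer).path.lower()
    if path.rstrip("/") == "/llms.txt":
        return "llms.txt"
    if path in ("", "/"):
        return "homepage"
    return "other"


# Hypothetical referrer values from bot requests.
referrers = [
    "https://example.com/",
    "https://example.com",
    "https://example.com/products/widgets",
    None,
]

counts = Counter(referrer_kind(r) for r in referrers)
print(counts)
print("Requests referred from llms.txt:", counts["llms.txt"])
```

A result of zero in the llms.txt bucket is what this audit observed; as noted above, that is compatible with deferred crawling or dropped referrer headers, so it shows an absence of evidence rather than proof of non-use.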

“On the referrer evidence, AI crawlers behave like classical crawlers. They enter at the homepage and follow links. llms.txt is not the front door.”

Who is actually hitting llms.txt? The 22,494-hit breakdown

This is the heart of the audit. Dataset D pre-classifies every recorded hit by agent family, and Dataset C lets me verify that classification against the raw user-agent strings. The two reconcile to the same total, 22,494 against 22,493, a one-hit difference from how the two extracts were generated.

/llms.txt hits by agent type, 30-day window

User-agent type	Hits	Share
Other / unverified	20,746	92.2%
Search engine	1,434	6.4%
LLM / AI (verifiable)	258	1.1%
SEO / crawlers (declared)	36	0.2%
Dataset / training	13	0.1%
Social / preview	7	under 0.1%
Total	22,494	100%

Source: Dataset D, User Agent Type by Hits.
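The shares in the table follow directly from the hit counts. A short sketch, using only the figures above, shows the calculation; the variable names are mine.

```python
# Hit counts by agent type, copied from the table above.
hits_by_type = {
    "Other / unverified": 20746,
    "Search engine": 1434,
    "LLM / AI (verifiable)": 258,
    "SEO / crawlers (declared)": 36,
    "Dataset / training": 13,
    "Social / preview": 7,
}

total = sum(hits_by_type.values())
print(f"Total hits: {total}")

for agent_type, hits in hits_by_type.items():
    print(f"{agent_type}: {hits / total:.1%}")
```

Recomputing the total from the rows is a quick way to confirm that the categories are exhaustive and do not double-count.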

Hits by named agent (the agents that are identifiable)

Named agent	Operator family	Hits
Googlebot	Search engine	1,219
OAI-SearchBot	OpenAI	153
BaiduSpider	Search engine	127
ChatGPT-User	OpenAI	56
Amazonbot	E-commerce / AI	38
Bingbot	Search engine	36
GPTBot	OpenAI (training)	33
AhrefsBot	SEO tool	28
Applebot	Search / AI	13
Bytespider	ByteDance	12
ClaudeBot	Anthropic	10
SemrushBot	SEO tool	6
Facebook External Hit	Social preview	5
PerplexityBot	Perplexity	4
Meta ExternalAgent	Meta	2
Perplexity-User	Perplexity	1
YouBot	You.com	1
CCBot	Common Crawl	1

Source: Dataset D, User Agent Name by Hits, excluding the “Unknown” aggregate of 20,746.

The verifiable LLM/AI agents in full

LLM/AI agent	Hits
OAI-SearchBot (OpenAI)	153
ChatGPT-User (OpenAI)	56
GPTBot (OpenAI training)	33
ClaudeBot (Anthropic)	10
PerplexityBot (Perplexity)	4
Perplexity-User (Perplexity)	1
YouBot (You.com)	1
Total verifiable LLM/AI	258

Source: Dataset D, User Agent Type = LLM / AI.

“Strip out the search engines and the unverifiable bots, and the entire verifiable-LLM interest in llms.txt, across a 30-day window on thousands of domains, amounts to 258 requests. Anthropic, Perplexity, and You.com combined: 16.”

What is the 92% actually made of?

The unverified bulk deserves scrutiny rather than a dismissive label. Using Dataset C’s raw user-agent strings, I found that it is dominated by a long tail of self-described tooling: site-statistics bots, monitoring bots, SEO site-audit crawlers, and a striking number of agents whose own user-agent strings advertise that they exist to audit or check llms.txt and AI-readiness.

Composition of /llms.txt traffic by operator family (raw-string classification)

Operator family	Hits	Share	Distinct hosts touched
Other / unverified (tooling, monitors, auditors)	20,772	92.3%	3,134
Google	1,227	5.5%	319
OpenAI	242	1.1%	69
Baidu	127	0.6%	36
Amazon	38	0.2%	12
Microsoft / Bing	35	0.2%	20
Apple	13	0.1%	13
ByteDance	12	0.1%	5
Anthropic	12	0.1%	11
Meta	8	under 0.1%	4
Perplexity	5	under 0.1%	5
You.com	1	under 0.1%	1
Common Crawl	1	under 0.1%	1

Source: Dataset C, full user-agent strings classified by operator. Minor differences from the agent-type table reflect the raw-string method counting AdsBot-Google and similar agents under their parent family.

Two concentration facts stand out. The top ten user-agent strings alone accounted for 17,569 of 22,493 hits, which is 78.1% of all traffic to the file. And agents whose user-agent string self-labels with terms such as audit, monitor, readiness, llms.txt, crawler, GEO, or research represented 105 distinct agents and 13,508 hits, which is 60.1% of all traffic.
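Both concentration measures are simple to compute once hits are aggregated per user-agent string. The following is a minimal sketch, assuming a mapping from raw user-agent string to hit count; the sample strings are shortened and invented for illustration, and plain substring matching is an assumption about how self-labelling could be detected, not a description of the tool used in the audit.

```python
from collections import Counter

# Terms quoted in the text above; matching is a case-insensitive substring test.
SELF_LABEL_TERMS = ("audit", "monitor", "readiness", "llms.txt", "crawler", "geo", "research")


def top_n_share(hits_by_agent, n=10):
    """Percentage of all hits made by the n busiest user-agent strings."""
    total = sum(hits_by_agent.values())
    top = sum(count for _, count in hits_by_agent.most_common(n))
    return 100 * top / total


def self_labelled(hits_by_agent, terms=SELF_LABEL_TERMS):
    """Return (distinct agents, hits) for agents whose string contains any term."""
    matched = {
        ua: count
        for ua, count in hits_by_agent.items()
        if any(term in ua.lower() for term in terms)
    }
    return len(matched), sum(matched.values())


# Invented sample data, not taken from Dataset C.
sample = Counter({
    "ExampleSiteAudit/1.0 (+llms.txt readiness check)": 50,
    "ExampleMonitor/2.3": 30,
    "Googlebot/2.1": 15,
    "GPTBot/1.0": 5,
})

print(top_n_share(sample, n=2))  # share held by the two busiest agents
print(self_labelled(sample))     # (distinct agents, hits)
```

Short terms such as "geo" can match unrelated words (for example "geolocation"), so a real classification would need a manual review pass on top of this.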

“60% of all traffic to llms.txt came from agents that openly describe themselves as auditors, monitors, or readiness-checkers. The file’s biggest use case right now is being inspected to see whether it exists, a self-referential market rather than consumption by models.”

This is the most under-reported reality of llms.txt in mid-2026. Raw hit counts on the file are rising, and it is tempting to read that as LLMs adopting it. The composition says otherwise. A large share of the traffic is the GEO ecosystem checking itself: tools verifying that a customer has the file, monitors polling for changes, readiness-scanners selling the idea that the file matters. That activity is real, but it is not evidence that any model is using the file to answer questions.

Host-level reality check

Beyond raw hits, I cross-referenced which hosts have a real file against which hosts received any /llms.txt traffic.

Hosts: file presence against received traffic

Measure	Count
Hosts returning 200 at `/llms.txt` (www-normalised)	130
Hosts that received at least one `/llms.txt` request (www-normalised)	2,649
Hosts that both have a file and received a hit	80
Hosts that have a file but recorded no hit	50
Distinct hosts receiving any `/llms.txt` hit (raw)	3,236

Source: Datasets B and C, joined on www-normalised host.

Two things stand out. First, the vast majority of hosts that received a /llms.txt request do not even have the file: only 80 of the 2,649 www-normalised hosts that were hit return a 200, so bots and tools are probing for it speculatively and hitting 404s. Second, of the hosts that do have a real file, more than a third, 50 of 130, saw no recorded hit at all in the window. Presence and attention are only loosely coupled.
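The join behind that table can be sketched with plain set operations. This is a minimal sketch, assuming two lists of hostnames (one from the crawl, one from the logs); the normalisation rule shown, lower-casing and dropping a leading "www.", is an assumption about what "www-normalised" means here.

```python
def normalise_host(host):
    """Lower-case a hostname, drop a trailing dot and a single leading 'www.'."""
    host = host.strip().lower().rstrip(".")
    return host[4:] if host.startswith("www.") else host


def presence_vs_traffic(hosts_with_file, hosts_with_hits):
    """Compare hosts that serve the file with hosts that received a request for it."""
    have_file = {normalise_host(h) for h in hosts_with_file}
    got_hit = {normalise_host(h) for h in hosts_with_hits}
    return {
        "have_file": len(have_file),
        "got_hit": len(got_hit),
        "file_and_hit": len(have_file & got_hit),
        "file_no_hit": len(have_file - got_hit),
    }


# Invented hostnames for illustration.
summary = presence_vs_traffic(
    hosts_with_file=["www.a.example", "b.example"],
    hosts_with_hits=["a.example", "WWW.c.example", "c.example"],
)
print(summary)
```

Normalising before the join matters: without it, www.a.example and a.example would count as two hosts and the overlap would be understated.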

Limitations and an invitation to challenge

Here is where this audit stops short.

User agents are self-declared, so the 92.2% Other bucket could hide real AI activity behind generic strings. I have deliberately under-counted LLM activity rather than over-claim it. The hit datasets carry no per-event timestamps, so the 30-day window is the extraction window rather than a field I can re-derive. Fetched does not mean used: nothing in server logs can prove that any provider used llms.txt content in a model output, because logs show requests, not downstream use. This is a snapshot, a single 30-day window compared qualitatively to a prior one, not a continuous time series. And referrer behaviour is provider-dependent, so the absence of a referrer trail is strong evidence of no observable routing rather than absolute proof that no provider ever crawls from the file.

If you can replicate, extend, or contradict any of this with your own logs, I want to hear about it. I will investigate and publish a visible correction if anything here proves wrong.

Frequently asked questions

How many websites actually have an llms.txt file? In this audit, of 4,685 domains probed, 137 returned a working 200 response at /llms.txt, which is about 2.9%. If you require the file to be served as plain text the number is 111, and if you require it to contain real content it drops to 20.

What percentage of websites have llms.txt? On this AEM-hosted sample, between 0.4% and 2.9% depending on how strictly you define a working file. The headline figure of 2.9% counts any 200 response; the strict figure of 0.4% counts only plain-text files with measurable content.

Do large language models actually read llms.txt? Rarely, on this evidence. Verifiable LLM agents accounted for 258 of 22,494 requests to the file, which is 1.1% of all traffic, over a 30-day window across thousands of domains.

Does ChatGPT use llms.txt? OpenAI’s search and user agents, OAI-SearchBot and ChatGPT-User, made 209 requests across roughly 69 hosts. That is real but tiny, and there is no evidence in the logs that the file drives any onward crawling.

Does Google use llms.txt? Googlebot is now the single largest named crawler hitting the file, with 1,219 requests. Google has also begun including llms.txt in Lighthouse checks. A fetch is not proof of use in ranking or AI features, but it is a clear change from a year ago.

Does Gemini or Google AI Mode use llms.txt? I cannot confirm this from the data. What I can confirm is that Googlebot is fetching the file. Whether that content feeds AI Mode or AI Overviews is plausible but unproven on these logs.

Does Claude use llms.txt? Anthropic’s ClaudeBot made 10 requests to the file across the entire dataset. That is negligible.

Does Perplexity use llms.txt? Perplexity’s agents made 5 requests in total, PerplexityBot and Perplexity-User combined. That is negligible.

Is llms.txt worth creating in 2026? My view is yes, but as cheap insurance rather than a growth lever. It costs little to create, Google is now hitting it, and the upside is asymmetric if providers begin to consume it. Do not expect it to move LLM citations today.

Will llms.txt improve my rankings? There is no evidence in this data that it does. Crawlers enter via the homepage and follow internal links. Classical crawlability and internal linking remain far more important.

Will llms.txt get my brand cited in AI answers? Probably not at present. The models that drive consumer AI answers are barely touching the file, and there is no observable crawl activity downstream of it.

Do LLMs crawl the .md files listed in llms.txt? There is no evidence that they do so directly from the file. I found zero requests whose referrer was an llms.txt URL, so either crawlers do not crawl immediately after reading it, or they do not preserve the referrer.

How do LLMs and AI crawlers find pages to crawl? Predominantly via the homepage. When requests carried a referrer it was almost always the domain homepage, indicating crawlers enter there and follow internal links, exactly as classical crawlers do.

Should llms.txt be plain text or HTML? Plain text. In this audit, 23 of the 175 responses that returned a 200 status were served as HTML, and those are almost certainly catch-all pages rather than real llms.txt files. A real file should return text/plain.

Why do so many llms.txt requests return a 404? Because most sites do not have the file. In this crawl, 76.9% of probed URLs returned a 404. Many bots and tools probe for /llms.txt speculatively and simply hit a missing file.

What bots hit llms.txt the most? The largest single sources are unverified tooling and monitoring bots, followed by Googlebot as the largest named crawler. The top ten user-agent strings alone made up 78.1% of all traffic to the file.

Are most llms.txt hits really from AI models? No. 92.2% of traffic came from agents that are neither mainstream search engines nor verifiable LLMs, largely SEO tools, monitors, and AI-readiness auditors. Only 1.1% came from verifiable LLMs.

What is an llms.txt auditor bot? It is a crawler, often from a GEO or SEO tool, whose purpose is to check whether a site has an llms.txt file and report on it. In this dataset, agents that self-label as auditors, monitors, or readiness-checkers accounted for 60.1% of all traffic to the file.

Does having an llms.txt file guarantee bots will read it? No. Of the 130 hosts with a real file, 50 recorded no hit at all in the window. Presence and attention are only loosely coupled.

How big should an llms.txt file be? The proposal envisions a curated index, but in practice the files that had content in this audit were very short, between 2 and 69 words, suggesting most are minimal stubs. Aim for a genuinely useful, curated list of your most important pages rather than a token file.

Is llms.txt the same as robots.txt or sitemap.xml? It is similar in concept, a small conventional file at a predictable path, but different in standing. robots.txt and sitemap.xml are honoured by documented crawlers, whereas llms.txt only delivers value if model providers choose to read it, and on this evidence most do not yet.

Did anything change with llms.txt between 2025 and 2026? The biggest change is Google. Googlebot went from a non-presence to the largest named crawler at the file, and Google added it to Lighthouse. Everything else stayed roughly the same: verifiable LLM usage remained negligible, and no referrer trail from the file appeared.

Is this the largest llms.txt study? To my knowledge, yes, by number of distinct domains and by volume of bot traffic examined. The data comes from real customer domains hosted on Adobe Experience Manager, including some of the world’s largest websites.

Where does the data in this analysis come from? From server-log and crawl data across customer domains hosted on Adobe Experience Manager, analysed with a server log file analysis tool over a 30-day window, with a companion crawl of /llms.txt paths dated 29 May 2026.

How was the data anonymised? No customer, brand, or third-party vendor names appear anywhere in this article. Every identifier has been removed and replaced with a neutral category label, and only aggregate summary figures are published.

Can I reproduce these findings myself? Yes, in principle. Crawl /llms.txt across your domain set and record status, content type, and word count; query 30 days of server logs for requests to /llms.txt grouped by host and user-agent string; classify user agents conservatively; and separately query the referrer field for any request whose referrer is /llms.txt.
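The referrer step in that list can be sketched as follows. This is a minimal sketch, assuming each log record has already been parsed into a dictionary with a "referrer" key; that field name and the "-" placeholder for an empty referrer are assumptions about the log format, not details from the original datasets.

```python
from urllib.parse import urlsplit


def referred_from_llms_txt(referrer):
    """True when the referrer URL's path is exactly /llms.txt."""
    if not referrer or referrer == "-":
        return False
    return urlsplit(referrer).path.lower() == "/llms.txt"


def count_llms_txt_referrals(records):
    """Count parsed log records whose referrer points at an llms.txt file."""
    return sum(1 for r in records if referred_from_llms_txt(r.get("referrer", "")))


# Invented records for illustration.
records = [
    {"referrer": "https://example.com/"},
    {"referrer": "-"},
    {"referrer": "https://example.com/llms.txt"},
    {},
]
print(count_llms_txt_referrals(records))
```

A count of zero on real logs would match the finding reported here, with the same caveat: a crawler that does not send a referrer leaves no trail to count.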

What is the single most important takeaway? That raw hit counts on llms.txt are misleading. Most of the traffic is the GEO ecosystem checking itself, not models consuming the file. Create the file because it is cheap and Google is now looking at it, but keep your real investment in homepage strength and internal linking.

A note on the data and on disclosure. The findings in this article come from server-log and crawl data across customer domains hosted on Adobe Experience Manager (AEM). I analysed this data directly using a server log file analysis tool. I work in this field, and all views expressed here are my own and do not represent those of my employer. No customer, brand, or third-party vendor names appear anywhere in this article. Every identifier has been removed and replaced with a neutral category label.

Written by Flavio Longato and published June 2026 on longato.ch. All views my own and not those of my employer. Companion analysis: llms.txt, my recommendation, August 2025. Spotted an error? Get in touch via longato.ch and I will publish a visible correction.

June 1, 2026

What Is LLM Crawling and Why Does It Matter?

Large language models now crawl websites much like search engines do. But many site owners have no idea their pages are invisible to these systems. If your content cannot be read by an LLM, you lose a growing source of traffic and citations.

I have spent years working on technical SEO, and I can tell you that the overlap between search engine optimisation and LLM readability is huge. The same foundations that help Google read your site also help ChatGPT, Perplexity, and other AI tools find and reference your content. Yet there are key differences that catch people off guard.

How LLMs Crawl and Process Web Content

LLM crawling follows a familiar pattern. A bot visits your site, fetches your pages, and reads the content. In traditional SEO, we talk about crawling, indexing, and ranking. With LLMs, the steps are crawling, tokenisation, and rendering. The bot arrives, collects the text, breaks it into tokens, and stores it for later use in responses.

If a page cannot be crawled or read, no AI system will use it as a source. That means no citations, no referrals, and no visibility in AI-generated answers. This is a real problem for businesses that rely on organic discovery. According to Google’s crawler documentation, the basic principles of making content accessible to bots have not changed much. But LLMs add a few new wrinkles.

Common Technical Blockers

Several technical issues stop LLMs from seeing your content. The most common one is robots.txt. When LLMs first appeared around 2023 and 2024, many website owners blocked AI crawlers out of fear. They worried that models would absorb their content without giving credit. Now it is 2026, and that stance is counterproductive. More people use LLMs every day. Blocking these bots means you opt out of a real traffic channel.

Another blocker that surprised many site owners was CDN default settings. Cloudflare, for example, started blocking AI crawlers by default for new customers in July 2025. If you use a CDN, check your bot management settings. You might be blocking AI crawlers without knowing it. In your server logs or monitoring tools, this shows up as a string of 403 or 404 errors for known LLM user agents.

Other blockers include:
- Inconsistent canonical tags that waste crawl budget
- URL parameters creating duplicate pages
- Content behind logins or paywalls
- Heavy interstitials that block the page content

These are familiar problems in SEO. But with LLMs, the tolerance is even lower. A search engine might still manage to parse a messy page. An LLM bot often will not bother. As Search Engine Journal explains, crawl budget matters for every type of bot, not just Googlebot.

Why JavaScript Rendering Is the Biggest Problem

Here is my contrarian take: the single biggest barrier to LLM visibility is not robots.txt or CDN settings. It is client-side JavaScript rendering. Most people in the SEO world stopped worrying about JavaScript a couple of years ago because Google got very good at rendering it. That gave everyone a false sense of security.

LLMs do not render JavaScript the way Google does. When an LLM bot visits a page, it typically reads the raw HTML without executing scripts. If your content loads through React, Angular, Vue, or any other client-side framework, the bot may see an empty shell. I have personally audited sites where only 70 to 75 percent of the page content was visible to LLM crawlers. That leaves a quarter or more of the page missing.

From my own experience building and managing websites early in my career, I know how painful it is to fix rendering issues at the infrastructure level. You need developer resources, time, and tickets that sit in a backlog for months. Server-side rendering or static site generation is the proper fix, but it is slow to implement. Edge rendering solutions offer a faster workaround. They pre-render your pages and serve the full HTML to LLM bots, pushing visibility from partial to complete.

How to Check Your LLM Visibility

You should not guess whether LLMs can see your content. Test it. One practical method is to compare the word count of a fully rendered page (what a human browser sees) against what an LLM bot receives (the raw HTML response). A large gap means you have a rendering problem.
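The word-count comparison can be sketched with the Python standard library alone. This is a minimal sketch, assuming you already have two HTML strings for the same URL: the raw server response and the rendered DOM captured separately (for example saved from a browser after scripts have run). The function names are illustrative, and the sketch only counts words; it does not fetch or render anything itself.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Counts words in text nodes, ignoring script, style and similar elements."""

    SKIP = {"script", "style", "noscript", "template"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.words = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.words += len(data.split())


def word_count(html):
    """Number of whitespace-separated words in the visible text of an HTML string."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.words


def hidden_share(raw_html, rendered_html):
    """Percentage of rendered words that are absent from the raw HTML response."""
    rendered = word_count(rendered_html)
    if rendered == 0:
        return 0.0
    return max(0.0, 100 * (rendered - word_count(raw_html)) / rendered)
```

A result close to zero suggests the raw response already carries the content; a large value points to client-side rendering. Word counts are a coarse proxy, so two pages with similar counts can still differ in which sections are present.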

Browser extensions and specialised tools can automate this comparison. They highlight exactly which sections of your page are invisible to AI crawlers. This gives you hard data to bring to your development team. Instead of saying “we think there is a problem,” you can say “42 percent of our product page content is hidden from LLM bots, and here is the proof.”

You should also review your robots.txt file and check for any directives that block known LLM user agents like GPTBot, ClaudeBot, or PerplexityBot. A quick audit of your CDN settings is equally important.
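The robots.txt part of that review can be sketched with Python's built-in parser. This is a minimal sketch, assuming the robots.txt content has already been downloaded into a string; the agent list contains only the three names mentioned above, and the helper name is illustrative.

```python
from urllib.robotparser import RobotFileParser

# The three user agents named in the text; extend as needed.
LLM_AGENTS = ("GPTBot", "ClaudeBot", "PerplexityBot")


def blocked_agents(robots_txt, url="/", agents=LLM_AGENTS):
    """Return the agents that the given robots.txt content disallows for url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


# Invented robots.txt content for illustration.
example = "\n".join([
    "User-agent: GPTBot",
    "Disallow: /",
    "",
    "User-agent: *",
    "Allow: /",
])
print(blocked_agents(example))
```

This only reflects what robots.txt declares. A CDN or firewall rule can still block the same agents at the network level, which is why the server-log check remains necessary.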

Looking Ahead

LLM crawling is not a passing trend. It is becoming a standard part of how people find information online. The sites that treat LLM readability as a first-class concern today will have a clear advantage as AI-driven search grows. Those that ignore it will watch their content disappear from an increasingly important channel.

The good news is that most fixes are straightforward. Unblock AI crawlers in your robots.txt, check your CDN, and address JavaScript rendering gaps. These are not exotic tasks. They are the same kind of technical hygiene that good SEO has always demanded. The difference now is that the audience includes machines that summarise, cite, and recommend your content to millions of users.

February 5, 2026
What Is the Difference Between AI Mentions and Citations?

If you have been paying attention to how AI tools like ChatGPT or Google Gemini respond to user queries, you have probably noticed that some brands appear in the text while others get a clickable link at the bottom. These are two very different things. One is a mention. The other is a citation. And the distinction matters more than most marketers realise.

I have spent the past year studying how large language models reference brands and websites in their outputs. What I have found is that many SEO professionals conflate mentions and citations, treating them as interchangeable. They are not. Understanding the gap between the two is essential if you want your brand to show up properly in AI-generated answers.

What Counts as a Mention in AI Answers

A mention happens when an AI model includes your brand name or product name in the body of its response. For example, if you ask ChatGPT “how to edit PDFs,” it might write something like “Adobe Acrobat is a popular tool for editing PDF files.” That is a mention. Adobe and Acrobat appear in the text, but there is no link pointing back to Adobe’s website.

Mentions come from the model’s training data. The AI has processed billions of web pages and learned associations between brands and topics. It knows that Adobe is connected to PDF editing because that relationship appeared thousands of times across the data it was trained on. The model is not fetching this information live from the web. It is recalling patterns from its training.

This is an important point. A mention does not mean the AI visited your website or verified your content. It simply means your brand is strongly associated with a given topic in the model’s learned knowledge. You could have zero indexable pages and still get mentioned if your brand is well-known enough.

How Citations Work Differently

A citation is something else entirely. It occurs when the AI links to your page as a supporting source for its answer. This typically happens through retrieval-augmented generation (RAG), where the model actively searches the web or a defined index to pull in fresh information before composing its response.

When a system like Microsoft Copilot (formerly Bing Chat) or Google’s AI Overviews performs a live search, it retrieves web pages, extracts relevant information, and then weaves that into its answer. The pages it pulled from get listed as citations, usually with clickable links. This is a much stronger signal than a mention because it means the AI treated your content as evidence.

Think of it this way. A mention says “this brand exists and is relevant.” A citation says “this specific page helped me answer the question.” The difference in value is significant for anyone thinking about generative engine optimisation.

Why Most Marketers Get This Wrong

Here is my contrarian take. Most of the current discourse around “AI SEO” focuses too heavily on getting mentioned. People celebrate when ChatGPT name-drops their brand. But a mention without a citation is a bit like being talked about at a party without anyone knowing your address. It builds awareness, sure. But it does not drive traffic or prove authority in the way a citation does.

I have seen brands with strong mentions but almost no citations. Their names appear in AI answers, yet the models never link back to their actual content. This usually happens when a brand is famous but its web pages are not structured well enough to be retrieved by RAG systems. The opposite also exists: smaller, well-optimised sites earn citations despite having lower brand recognition.

The practical lesson here is that optimising for citations requires a different approach than optimising for mentions. Mentions grow from brand awareness and PR. Citations grow from having well-structured, authoritative, and schema-marked content that RAG systems can easily retrieve and verify.

What This Means for Your Strategy

If you are serious about showing up in AI-generated results, you need to work on both fronts. For mentions, focus on building genuine brand authority across the web. Get covered by reputable publications. Build a strong presence on platforms that LLMs are trained on. This is long-term brand building.

For citations, the work is more technical. Make sure your pages are crawlable, fast, and clearly structured. Use proper headings. Include factual, verifiable claims. According to Google’s own E-E-A-T framework, content that demonstrates first-hand experience and expertise is more likely to be deemed trustworthy. RAG systems appear to follow similar logic when selecting which sources to cite.

From my own testing, pages that answer specific questions clearly and concisely tend to earn more citations than long, rambling guides. The AI is looking for evidence, not filler. Give it a clean answer it can point to.

The brands that will win in this new era are the ones that understand both signals and treat them as complementary. Mentions build the top of funnel. Citations build the trust. Get both right, and you are well positioned regardless of how AI search evolves from here.

January 23, 2026
What Is AI Visibility Score and How Do You Measure It

If you have been working on getting your brand visible inside AI-generated answers, you have probably come across the term “visibility score.” It sounds straightforward, but the reality is messier than most people expect. I have spent a fair amount of time testing different AI visibility tools, and I want to share what I have learned about what this metric actually means and which supporting numbers you should watch alongside it.

What a Visibility Score Actually Tells You

A visibility score is an aggregate metric. It rolls up several signals into a single number that represents how often and how prominently your brand appears across a set of AI prompts. The inputs typically include whether you were mentioned, whether a citation pointed back to your site, where in the answer your brand appeared, and the sentiment of the mention.

The trouble is that every tool calculates it differently. There is no universal standard. LLM Optimizer, for instance, weights mentions, citations, URL presence, position (first, second, third, fourth), and sentiment into a composite figure. A brand that gets mentioned first with a positive tone and a citation link scores far higher than one that appears third with no citation and a neutral tone. Other platforms may skip sentiment entirely or weight position differently.

This lack of standardisation is something I think the industry needs to address quickly. If you compare your score across two different tools, you might get wildly different numbers for the same set of prompts. That makes benchmarking against competitors tricky unless everyone agrees on one platform.
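To make the idea of a composite concrete, here is a purely hypothetical scoring sketch. The weights, the position credits, and the sentiment credits are all invented for illustration; they are not LLM Optimizer's formula or any other vendor's, and citation and URL presence are folded into one signal for brevity.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical weights, chosen only so that a perfect result sums to 100.
WEIGHTS = {"mention": 30, "citation": 30, "position": 20, "sentiment": 20}
POSITION_CREDIT = {1: 1.0, 2: 0.75, 3: 0.5, 4: 0.25}
SENTIMENT_CREDIT = {"positive": 1.0, "neutral": 0.5, "negative": 0.0}


@dataclass
class PromptResult:
    mentioned: bool
    cited: bool
    position: Optional[int]  # 1 = first brand in the answer, None = absent
    sentiment: str           # "positive", "neutral" or "negative"


def visibility_score(result):
    """Combine the four signals into a 0-100 score using the hypothetical weights."""
    if not result.mentioned and not result.cited:
        return 0.0
    return float(
        WEIGHTS["mention"] * result.mentioned
        + WEIGHTS["citation"] * result.cited
        + WEIGHTS["position"] * POSITION_CREDIT.get(result.position, 0.0)
        + WEIGHTS["sentiment"] * SENTIMENT_CREDIT.get(result.sentiment, 0.0)
    )


best = visibility_score(PromptResult(True, True, 1, "positive"))
weaker = visibility_score(PromptResult(True, False, 3, "neutral"))
print(best, weaker)
```

Changing any one weight shifts every score, which is exactly why the same prompt set can produce different numbers in different tools.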

A Real-World Example of How Scores Break Down

Let me walk through a practical case. Take the prompt “how to make the perfect espresso shot.” In LLM Optimizer, a brand tracking that prompt might see a visibility score of around 22. Why so low? Because the brand was mentioned but had no citation link. The sentiment was neutral, not negative, which helps, but the absence of a URL pointing back to the site drags the score down considerably.

The ideal scenario would be a mention in the first position, a direct citation to your website, and positive sentiment. That combination pushes you towards a score of 100. In my experience, very few brands consistently hit that ceiling across a broad prompt set. The ones that do tend to have strong topical authority and structured data that AI models find easy to reference. According to research from Search Engine Land, brands that invest in entity-based SEO tend to perform better in AI-generated results precisely because large language models favour well-structured, authoritative sources.

Why Visibility Score Alone Is Not Enough

Here is where I hold a view that goes against the grain. Many marketers treat visibility score as the single north-star metric for AI search performance. I think that is a mistake. The score is too broad to act on directly. If your visibility score drops by ten points this week, what exactly do you fix? The number itself does not tell you.

You need to pair it with more granular metrics. Brand mentions over time show you whether your presence is growing or shrinking. Citation tracking tells you if AI models are actually linking back to your content. Agentic traffic and referral data from tools like Google Analytics reveal whether those AI mentions translate into real visits. Without these supporting signals, you are flying blind with a single number that could move for a dozen different reasons.

I have been doing SEO and digital marketing for over fifteen years, and every time a new “single metric” emerges, teams fixate on it at the expense of nuance. Visibility score is useful for board-level reporting, but the actual optimisation work happens when you drill into the components beneath it.

Do Not Forget AI Features in Traditional Search

One detail that often gets overlooked is that AI features inside traditional search results, such as Google’s AI Overviews, are frequently counted as part of your overall search performance reports. This means your visibility score and your standard SEO metrics are not entirely separate worlds. If you are tracking performance in Google Search Console, some of those impressions may already include AI-generated snippets.

The practical takeaway is that you need to blend your AI visibility data with your existing search analytics. Looking at either in isolation gives you an incomplete picture. A high visibility score in ChatGPT or Perplexity means little if those mentions never convert into site traffic, and a dip in organic impressions might partly be explained by shifts in AI feature placement rather than a ranking penalty.

Picking the Right Metrics for Your Situation

If I had to recommend a starting dashboard for AI visibility, it would include four things: the aggregate visibility score for trend monitoring, citation count with URLs to see which pages AI models prefer, sentiment breakdown to catch reputation issues early, and referral traffic from AI sources to measure actual business impact.

Start with those four and expand as your understanding deepens. The tools are evolving quickly and standardisation will come eventually. Until then, pick one platform, learn its methodology inside out, and resist the temptation to chase a perfect score. The brands that win in AI search will be the ones that understand what sits behind the number, not just the number itself.

January 23, 2026
How to Map Prompts to Personas for Better LLM Visibility

Most businesses treat their audience as one big group when optimising for large language model visibility. They write a single set of prompts, test them broadly and call it a day. The trouble is, averaging your visibility across an entire audience hides the gaps where you are invisible to the people who matter most. Mapping prompts to specific personas is the fix, and it is simpler than you might think.

Why One-Size-Fits-All Prompting Falls Short

When I first started testing how brands appear inside AI-generated answers, I made the same mistake everyone else does. I wrote prompts from my own point of view and assumed the results spoke for the whole market. They did not. A procurement director searching for manufacturing software asks questions nothing like those a graduate engineer would type. Their vocabulary differs, their intent differs and the depth of answer they expect differs. If you only test with generic prompts, you will see a comfortable average that masks real blind spots.

Research from the Search Engine Land guide on GEO confirms that generative engine optimisation requires thinking about user intent at a granular level. Generic content may rank, but it rarely gets cited when an LLM assembles a tailored response for a specific user need.

What Persona-Based Prompt Mapping Actually Means

Persona-based prompt mapping means grouping your test prompts by a real user type. Not a fictional marketing avatar with a name and a stock photo, but a practical profile built on genuine differences in intent, language and expectations. Think of categories like these:
- Decision makers who need ROI figures and comparisons.
- Practitioners who want step-by-step technical detail.
- Beginners who ask broad, exploratory questions.
- Troubleshooters who arrive with a specific problem to solve.

Each group phrases questions differently and expects a different shape of answer. A decision maker might prompt an LLM with “best enterprise CRM for mid-market manufacturers,” while a practitioner asks “how to configure lead scoring rules in HubSpot.” Testing both tells you where your content actually performs and where it vanishes.

How I Build Persona Prompt Clusters

Inside LLM Optimizer, the workflow I recommend starts with listing your ideal customer profiles. For each profile, brainstorm the questions that person would realistically put to ChatGPT, Gemini or Perplexity. Group those questions into topic clusters, then run them as tracked prompts.

Here is a contrarian take that might raise eyebrows: I believe most SEO professionals over-invest in keyword volume data and under-invest in prompt diversity. Volume tells you what people typed into Google last month. Prompt mapping tells you what people will ask an AI model tomorrow. The two data sets overlap, but they are not the same, and the gap is growing as conversational search behaviour evolves. A study published by researchers at IIT Delhi and Princeton showed that GEO tactics like authoritative language and citation inclusion boosted visibility in generative engines by up to 40 percent, but only when the content matched the query intent closely.

Once your clusters are running, compare visibility scores across personas. You will almost certainly find that your brand shows up well for one audience segment and poorly for another. That gap is your opportunity.
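The comparison itself is a simple group-and-average. This is a minimal sketch, assuming each tracked prompt has been exported as a (persona, prompt, score) tuple; that export shape and the scores below are invented for illustration, and only the first two prompts are taken from the example earlier in this article.

```python
from collections import defaultdict
from statistics import mean


def visibility_by_persona(results):
    """Average visibility score per persona from (persona, prompt, score) tuples."""
    grouped = defaultdict(list)
    for persona, _prompt, score in results:
        grouped[persona].append(score)
    return {persona: mean(scores) for persona, scores in grouped.items()}


# Invented scores for illustration.
results = [
    ("decision maker", "best enterprise CRM for mid-market manufacturers", 80),
    ("practitioner", "how to configure lead scoring rules in HubSpot", 10),
    ("decision maker", "hypothetical second decision-maker prompt", 60),
    ("practitioner", "hypothetical second practitioner prompt", 30),
]

per_persona = visibility_by_persona(results)
overall = mean(score for _, _, score in results)
print(per_persona, overall)
```

With these invented numbers the overall average sits in the middle while one persona scores far below it, which is the blind spot a single averaged figure hides.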

Filling the Gaps Your Data Reveals

After identifying weak spots, the content work becomes targeted. If decision makers see your brand but beginners do not, you likely lack introductory explainer content. If troubleshooters find you but practitioners do not, your how-to guides may need more technical depth. This is where first-hand experience matters. I have spent the past two years auditing LLM outputs for clients across manufacturing, SaaS and professional services, and the pattern repeats: brands that write for a single reader profile leave entire personas on the table.

The Google helpful content guidelines stress demonstrating experience and expertise. That principle applies just as strongly to LLM visibility. Models trained partly on web content inherit the same quality signals. If your page reads like it was written by someone who has genuinely done the work, it stands a better chance of being surfaced in an AI-generated answer.

Where This Is Heading

Persona-based prompt mapping is not a one-off audit. As LLMs update their training data and refine how they select sources, the prompts that matter will shift too. I run my clusters on a rolling monthly cycle so that changes surface quickly. The brands that build this habit now will have a structural advantage as AI-driven search grows. Those still relying on a single averaged visibility score will keep wondering why their traffic from generative engines stays flat.

Start small. Pick two or three personas, write ten prompts for each and track the results for a month. The data will speak for itself, and you will never go back to treating your audience as a single block again.
January 23, 2026

What Is AI Brand Monitoring and Why Does It Matter

I have spent the past year watching how large language models talk about brands, products and services. What I have found is both fascinating and slightly unsettling. AI systems do not pull from a single, frozen database. They update, they re-crawl, and they change their answers without warning. If you are not keeping an eye on what they say about you, you are flying blind.

Why AI Answers About Your Brand Keep Shifting

Most people assume that once an AI gives a correct answer about their company, the job is done. That is wrong. Models get retrained. The web changes daily. Even a small tweak to a user’s prompt can produce a wildly different output. I have seen cases where a brand was cited accurately on Monday and dropped entirely by Thursday. Three weeks later it reappeared. This is not a bug; it is how these systems work.

Think of AI brand monitoring as quality assurance for external narratives. You already monitor your Google rankings, your social mentions and your review scores; this is simply the next layer. According to Gartner’s overview of generative AI, these models are reshaping how consumers discover and evaluate products. If you ignore that channel, someone else will fill the gap with information you cannot control.

The Real Cost of Incorrect AI Responses

Here is where my experience diverges from the usual optimism. Many marketers treat AI visibility as a nice-to-have. I would argue it is closer to a reputational risk. I have personally encountered third-party websites carrying outdated or flat-out wrong product descriptions. When an LLM picks up that misinformation and serves it to a potential customer, the damage is real. The customer might buy the wrong product, receive a service that does not match expectations, or simply lose trust in the brand.

Returns, complaints and negative word of mouth all follow. A BrightLocal consumer survey found that the majority of consumers trust online reviews as much as personal recommendations. When that information comes from an AI chatbot, the stakes are even higher because users often treat it as a single authoritative source rather than one result among many.

How Weekly Monitoring Catches Problems Early

Daily checks are available, but from what I have seen, a weekly cadence strikes the right balance between vigilance and practicality. Tools like LLM Optimizer let you track how and when your brand appears in AI-generated answers over time. You get a historical view that shows patterns rather than snapshots.

A weekly review lets your team spot factual errors before they spread. Maybe your website is missing key product specifications. Maybe a competitor comparison on an external site is misleading. Maybe your opening hours changed six months ago and nobody updated the third-party listing. These are exactly the sorts of gaps that LLMs surface, and fixing them improves not just your AI visibility but your overall online accuracy.

I keep a simple checklist: run the monitoring report, flag any new errors or omissions, trace each issue back to its source, and fix it there. Most weeks there is nothing urgent. But when something does slip through, catching it in seven days rather than seven months can save a significant amount of revenue and reputation.

A Contrarian View on Chasing AI Visibility

I should be honest about something. Not every business needs to obsess over AI brand mentions right now. If your customers are not yet using ChatGPT, Gemini or Copilot to research your type of product, pouring resources into LLM optimisation may be premature. The people selling AI monitoring tools have an obvious incentive to tell you otherwise. Start by checking whether AI-generated answers actually appear for queries relevant to your industry. If they do not, focus your energy on the channels that already drive revenue and revisit AI monitoring in six months.

That said, for any brand operating in a space where consumers do turn to AI for recommendations, comparisons or how-to guidance, monitoring is not optional. The information gap between what you publish and what AI tells users will only widen if left unchecked. A study from the Reuters Institute Digital News Report highlights how quickly AI-driven search is changing information discovery habits, and the trend shows no sign of slowing.

Getting Started Without Overcomplicating It

You do not need a massive budget or a dedicated team. Pick two or three prompts that a potential customer might type into an AI chatbot about your brand. Run them yourself across ChatGPT and at least one other model. Note what comes back. Is it accurate? Is your brand mentioned at all? Are competitors positioned more favourably?

Do this once a week for a month. You will quickly see whether the answers are stable or volatile, correct or misleading. From there you can decide whether a paid monitoring tool is worth the investment or whether manual checks are enough for your scale. The important thing is to start looking, because what AI says about your brand is already shaping how people perceive you, whether you are watching or not.

January 23, 2026
How to Improve What AI Says About Your Brand

AI assistants are quickly becoming the first place people turn when researching a product or service. If what ChatGPT, Gemini or Perplexity says about your brand is wrong, outdated or vague, you are losing trust before a prospect ever visits your website. The good news is that you can shape these answers, not by flipping a secret switch, but by improving the information environment that AI models pull from.

Why AI Answers Matter More Than You Think

When someone asks an AI assistant about your business, the model does not make things up from thin air. It synthesises information from web pages, documentation, reviews and third-party mentions. If those sources contain conflicting details, the AI will either pick one at random or hedge with a vague summary. Neither outcome helps your business.

I have seen this first-hand with clients who updated their product line months ago but never revised the copy on their own website. The old specs kept appearing in AI-generated summaries because the model had no reason to prefer the new information over the old. Consistency across every touchpoint is not optional; it is the foundation of accurate AI representation.

Make Your Website the Clearest Source of Truth

The single most effective step is to turn your own site into the most authoritative, up-to-date reference for everything about your brand. That means reviewing every page for outdated claims, conflicting prices, retired features and broken links. If your About page says one thing and your FAQ says another, an AI model has no reliable way to decide which is correct.

Start with the basics. Make sure product descriptions, service offerings and company details match across every page. Add supporting evidence wherever you can: methodology notes, data points, case studies and structured documentation. According to Google’s structured data guidelines, well-organised markup helps crawlers understand content faster and more accurately. The same principle applies to the large language models that now draw on your pages.

One thing many guides skip is crawlability. If important pages sit behind JavaScript tabs, login walls or lazy-loading scripts that block bots, AI systems simply will not see the content. Check your robots.txt and make sure the pages you care about most are fully accessible.
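
The robots.txt part of that check can be scripted with Python's standard library. The sketch below parses robots.txt content and reports which AI crawlers it blocks for a given URL. The list of user-agent tokens is an assumption and should be adjusted to the crawlers you care about; the example robots.txt content is invented.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens to test (an assumed list; extend as needed).
AI_AGENTS = ["GPTBot", "PerplexityBot", "ClaudeBot"]

def blocked_agents(robots_txt, url, agents=AI_AGENTS):
    """Return the agents that robots_txt disallows from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]

# Example robots.txt content, invented for illustration.
example = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""
print(blocked_agents(example, "https://example.com/products/"))
```

This only covers robots.txt rules. Content hidden behind JavaScript, login walls or CDN rules needs separate testing.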

Align Third-Party Sources With Your Message

Your website alone is not enough. AI models weigh third-party mentions heavily because independent sources signal credibility. If a well-known review site describes your service differently from how you describe it yourself, the AI may favour the external version.

Audit what others say about you. Search for your brand on major directories, review platforms and industry publications. Where the information is wrong, reach out and request corrections. Where it is simply thin, consider contributing guest posts or providing updated media kits that journalists and bloggers can reference. Tools like LLM Optimizer let you inspect which citations AI models are pulling for specific prompts, so you can see exactly where the gaps are.

Here is where I hold a contrarian view: most marketers focus on creating new content to influence AI answers. I believe the higher-return activity is fixing existing content. A single contradictory page on an authoritative domain can override ten blog posts on your own site. Correcting that one page often does more than a month of fresh publishing.

Use Prompt-Based Auditing to Track Progress

You would not run a paid ad campaign without checking the metrics. The same logic applies here. Regularly query AI assistants with the prompts your customers are likely to use. Note what the model says, which sources it cites, and whether the answer has improved since your last check.

In the video above, I walk through a practical example using an espresso machine brand. The company wanted AI assistants to recommend a specific brewing time. By ensuring their own site stated the same figure that appeared on reputable coffee review sites, the AI answer converged on the correct recommendation. It was not instant, but over a few weeks the results shifted noticeably.

Document your findings in a simple spreadsheet: prompt, AI response, cited sources, date. Over time this gives you a clear picture of which changes moved the needle and which did not. Search Engine Land’s guide on influencing AI answers offers a useful framework for structuring this kind of audit.
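
If a spreadsheet feels too manual, the same four columns can be kept in a CSV file. The following is a minimal sketch under that assumption; the field names, the file name in the comment and the sample values are all made up.

```python
import csv
import io
from datetime import date

FIELDS = ["date", "prompt", "ai_response", "cited_sources"]

def append_audit_row(handle, prompt, ai_response, cited_sources, when=None):
    """Write one audit observation as a CSV row to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    if handle.tell() == 0:
        writer.writeheader()  # header only when the file is still empty
    writer.writerow({
        "date": (when or date.today()).isoformat(),
        "prompt": prompt,
        "ai_response": ai_response,
        "cited_sources": "; ".join(cited_sources),
    })

# In-memory demo; in practice pass open("audit.csv", "a", newline="").
buffer = io.StringIO()
append_audit_row(
    buffer,
    "recommended espresso brewing time",
    "example answer text",
    ["https://example.com/brew-guide"],
    when=date(2026, 1, 23),
)
print(buffer.getvalue())
```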

What This Means Going Forward

AI-generated answers are only going to become more prominent. As models improve and more people rely on them for purchase decisions, the brands that maintain clean, consistent and well-sourced information will have a structural advantage. Those that ignore this shift risk being misrepresented in the very conversations that drive buying decisions.

The work is not glamorous. It is auditing old pages, emailing webmasters and updating product specs. But it is the kind of steady, evidence-based effort that compounds over time. Start with your own site, expand to third-party sources and measure the results. The brands that treat AI accuracy as an ongoing discipline, rather than a one-off project, will be the ones that earn the most accurate and favourable mentions in the months ahead.

January 23, 2026
What Are AI Citations and Why They Can Be Wrong

Most people assume that when an AI assistant provides a link, it must be real. After all, the tool searched the web and found a source, so the citation should be trustworthy. The truth is far less reassuring. AI citations are a mixture of genuine references and fabricated URLs, and the difference between the two is not always obvious.

In this article, I explain what AI citations actually are, how AI assistants decide when to fetch outside sources, and why you should verify every link before trusting it.

How AI Assistants Choose Between Memory and Search

An AI assistant can respond in two fundamentally different ways. The first is answering from its training data, the vast body of text it was exposed to during training. When you ask something general, such as how to edit a PDF, the model often has enough stored knowledge to produce a useful answer without looking anything up. The second approach involves a retrieval step. The model searches the web or pulls documents from an index, then writes an answer grounded in those documents.

I have tested this myself many times. A question like “how do I edit a PDF” typically gets answered from memory. But a time-sensitive question like “what is the weather in Zurich today” forces the model to search, because its training data cannot possibly contain today’s forecast. The decision between these two paths is not random. It depends on whether the model judges the query to require fresh or external information.

What surprises many users is that the model does not always get this judgment right. Sometimes it answers from memory when a search would have been more accurate. Other times it searches unnecessarily. This inconsistency is part of why AI-generated citations can be unreliable, and it is something most providers are still working to improve.

What AI Citations Actually Are

Citations in AI assistants appear as clickable links or small reference boxes alongside the generated text. In tools like ChatGPT, they often show up as numbered grey boxes that you can click to visit the source. When the assistant performs a retrieval step, these citations point to the web pages or documents it consulted. They serve a similar purpose to footnotes in academic writing: they tell you where the information supposedly came from.

However, there is a critical distinction between citations produced after a genuine search and links that the model generates from memory. According to OpenAI’s documentation on browsing, ChatGPT uses a browsing tool to fetch real-time information. When the browsing tool is active, the citations are grounded in actual retrieved pages. When it is not, any URLs in the response come from the model’s training data, and those links may no longer exist or may never have existed at all.

This is the core problem. The visual presentation of a citation looks identical whether the link is real or invented. There is no label that says “this one was actually retrieved” versus “this one was generated from patterns in the training data.”

Why AI Citations Are Often Wrong

Here is where my view parts from the optimists. I believe the citation problem in AI assistants is more serious than most people acknowledge. The term for invented references is “hallucination,” and it affects URLs just as much as it affects factual claims. A model might generate a URL that looks plausible, follows the correct domain structure, and even includes a realistic page slug, yet leads to a 404 error when you actually click it.

I have seen this repeatedly in my own server logs. Hallucinated URLs from AI tools generate real HTTP requests that hit real websites and return 404 responses. If you run a website and notice a sudden increase in 404 errors with oddly specific paths, AI-generated links could be the cause. A study published on arXiv confirmed that large language models frequently produce non-existent references, especially when generating academic citations.
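
A rough way to look for this pattern in your own logs is sketched below. It assumes the common "combined" access log format and treats a handful of referrer or query-string substrings as hints of an AI-originated click. Both the regular expression and the hint list are assumptions to adapt, and the sample log lines are invented.

```python
import re
from collections import Counter

# Matches the request, status and referrer of a "combined" format log line (assumed).
LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)"'
)
# Substrings treated as hints of an AI-originated click (assumed heuristics).
AI_HINTS = ("chatgpt.com", "perplexity.ai")

def ai_404_paths(log_lines):
    """Count 404 paths whose request URL or referrer carries an AI hint."""
    counts = Counter()
    for line in log_lines:
        match = LINE.search(line)
        if not match or match["status"] != "404":
            continue
        haystack = match["path"] + " " + match["referrer"]
        if any(hint in haystack for hint in AI_HINTS):
            counts[match["path"]] += 1
    return counts

# Invented log lines for illustration.
sample_log = [
    '203.0.113.5 - - [23/Jan/2026:10:00:00 +0000] "GET /guides/espresso-timing HTTP/1.1" 404 512 "https://chatgpt.com/" "Mozilla/5.0"',
    '203.0.113.6 - - [23/Jan/2026:10:01:00 +0000] "GET /pricing HTTP/1.1" 200 2048 "https://chatgpt.com/" "Mozilla/5.0"',
    '203.0.113.7 - - [23/Jan/2026:10:02:00 +0000] "GET /old-page HTTP/1.1" 404 512 "-" "Mozilla/5.0"',
]
print(ai_404_paths(sample_log))
```

A path that appears here repeatedly is a candidate for a redirect to the page the model probably meant.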

The risk is not just inconvenience. If you are researching a medical question, a legal issue, or a financial decision, a hallucinated citation can lend false authority to bad information. The link looks credible. The surrounding text reads confidently. But the source does not exist.

How to Verify AI Citations Before Trusting Them

The practical solution is straightforward: open every citation in a new tab before you rely on it. If the page loads and contains the information the AI referenced, you can have some confidence in that particular claim. If you get a 404 or the page content does not match the AI’s summary, discard it.
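
For longer answers with many links, the first half of that check (does the page load at all) can be automated. The sketch below uses only the standard library; the user-agent string and the three result labels are arbitrary choices. It does not check whether the page content matches the AI's summary, which still needs a human read, and some servers reject HEAD requests, so a "broken" result is worth confirming by hand.

```python
import urllib.error
import urllib.request

def fetch_status(url, timeout=10):
    """Return the HTTP status for url, or None if the request fails outright."""
    request = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": "citation-check/0.1"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # for example 404 for a hallucinated path
    except OSError:
        return None  # DNS failure, refused connection or timeout

def check_citations(urls, fetch=fetch_status):
    """Map each cited URL to 'ok', 'broken' or 'unreachable'."""
    report = {}
    for url in urls:
        status = fetch(url)
        if status is None:
            report[url] = "unreachable"
        elif 200 <= status < 400:
            report[url] = "ok"
        else:
            report[url] = "broken"
    return report

# Demo with a stub instead of real network calls.
stub = {"https://example.com/real": 200, "https://example.com/invented": 404}
print(check_citations(stub, fetch=stub.get))
```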

Beyond manual checking, look for signals that a retrieval step actually happened. In Google’s Gemini, for instance, you can sometimes see a “search” indicator that confirms the model queried the web. If no such indicator is present, treat any links with extra caution. I also recommend cross-referencing important claims with a traditional search engine. It takes an extra minute, but it can save you from citing a source that does not exist.

Some users assume that paid tiers or newer models are immune to this problem. They are not. While retrieval-augmented generation has improved citation accuracy, no current system guarantees that every link is valid. Trusting AI citations blindly is a habit worth breaking now, before it costs you credibility.

Where AI Citations Are Heading

The next generation of AI tools will likely separate retrieved citations from generated ones more clearly. I expect we will see explicit labelling, confidence scores, and perhaps even automated link-checking built into the response pipeline. Some early experiments with attributed question answering at Google Research point in this direction.

Until those safeguards arrive, the responsibility sits with the user. Every AI citation is a claim, not a fact. Treat it accordingly, verify before you share, and remember that a confident-sounding answer with a broken link is worse than no answer at all.

January 23, 2026
What Is Sentiment in AI Answers and Why Does It Matter

When you ask ChatGPT or Google’s AI Overview a question, the words it chooses carry emotional weight. That emotional direction, whether positive, negative or neutral, is what we call sentiment. Most people never think about it, but sentiment in AI-generated answers quietly shapes how users feel about brands, products and even medical advice.

What Sentiment Actually Means in AI Responses

Sentiment is the emotional tone embedded in language. In a traditional search result, you click through to a webpage and form your own opinion from the content you read. With AI answers, the model has already done that work for you. It has synthesised sources, picked specific words and delivered a response that leans in a particular emotional direction.

This matters because the AI’s word choices influence perception at scale. If an AI assistant describes a brand as “reliable and well-regarded,” that is a positive sentiment signal. If it says a product “has faced criticism for quality issues,” that is negative. The user did not visit any website. They simply absorbed the AI’s framing as fact.

I have spent the past year building and refining sentiment tracking inside our LLM Optimizer tool, and the patterns we see are striking. The same brand can shift from mostly negative AI mentions to positive ones over a matter of weeks, depending on what new content the models ingest.

Where Sentiment Analysis Gets It Wrong

Here is my contrarian take: most off-the-shelf sentiment analysis is not good enough for AI answer monitoring. Standard NLP classifiers were trained on product reviews and social media posts. They struggle badly with the nuanced, synthesised language that large language models produce.

We hit this problem early on. Take the query “best protein for weight loss.” The word “loss” is typically flagged as negative by basic sentiment models. But in a health and fitness context, weight loss is the desired outcome. It is entirely positive. We saw the same issue with pharmaceutical queries where terms like “drug,” “side effects” and “withdrawal” kept being scored as negative even when the AI answer was recommending a product favourably.

Sarcasm is another blind spot. If an AI response says something like “sure, if you enjoy waiting three weeks for delivery, this is the brand for you,” a naive classifier might score that as positive because of the word “enjoy.” According to research from Stanford’s NLP group, sarcasm detection remains one of the hardest unsolved problems in sentiment analysis, and AI-generated text adds another layer of complexity.

Domain-specific language trips things up constantly. You need classifiers that understand industry context, not just generic positive and negative word lists.
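
The "weight loss" example can be reproduced with a toy word-list scorer. This is a deliberately naive sketch, not how our classifier or any production model works: the lexicon, the polarity values and the override table are all invented to show the failure and one possible correction.

```python
# Tiny generic lexicon (illustrative; real classifiers are far larger).
GENERIC = {"best": 1, "reliable": 1, "loss": -1, "withdrawal": -1, "criticism": -1}

# Domain phrases whose polarity replaces the word-level score (assumed values).
HEALTH_OVERRIDES = {"weight loss": 1}

def score(text, overrides=None):
    """Sum word polarities, letting domain phrases replace their component words."""
    text = text.lower()
    total = 0
    for phrase, polarity in (overrides or {}).items():
        total += polarity * text.count(phrase)
        text = text.replace(phrase, " ")  # stop the words being scored again
    total += sum(GENERIC.get(word.strip(".,!?"), 0) for word in text.split())
    return total

query = "best protein for weight loss"
print(score(query))                    # generic lexicon: "best" and "loss" cancel out
print(score(query, HEALTH_OVERRIDES))  # with the health-domain override
```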

Why the Same Prompt Can Produce Different Sentiment

One thing that surprises people is how inconsistent AI sentiment can be for identical queries. I have tested the same prompt on consecutive days and received answers with noticeably different tones. Sometimes the response is enthusiastic and recommending. Other times it is cautious and hedging.

There are a few reasons for this. Large language models have a degree of randomness built into their generation process through temperature settings. Personalisation also plays a role. If the model has context about you from previous interactions, it may adjust its tone accordingly. And as models get updated with fresh training data, the underlying sentiment towards a topic can shift entirely.

This variability is exactly why point-in-time sentiment checks are not enough. You need to track sentiment over time to see real trends rather than reacting to a single snapshot.

How I Track Sentiment for Brands in Practice

In our tool, we monitor sentiment across AI platforms on an ongoing basis. For each brand we track, the system logs whether individual AI responses are positive, neutral or negative. Over weeks and months, this builds into a trend line that tells a clear story.
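
A trend line of that kind can be approximated from the logged labels alone. The sketch below assumes each response is stored with a date and one of three labels, maps the labels to +1, 0 and -1, and averages them per ISO week. The scoring scheme is an assumption for illustration, not a description of the tool's internals, and the observations are made up.

```python
from collections import defaultdict
from datetime import date

SCORES = {"positive": 1, "neutral": 0, "negative": -1}

def weekly_trend(observations):
    """observations: (date, label) pairs.
    Returns [(iso_year, iso_week, mean_score)] in chronological order,
    where mean_score runs from -1 (all negative) to 1 (all positive)."""
    buckets = defaultdict(list)
    for day, label in observations:
        year, week, _ = day.isocalendar()
        buckets[(year, week)].append(SCORES[label])
    return [(y, w, sum(v) / len(v)) for (y, w), v in sorted(buckets.items())]

# Made-up observations for illustration.
sample_obs = [
    (date(2026, 1, 5), "negative"),
    (date(2026, 1, 6), "negative"),
    (date(2026, 1, 12), "neutral"),
    (date(2026, 1, 13), "positive"),
]
for year, week, mean in weekly_trend(sample_obs):
    print(f"{year}-W{week:02d}: {mean:+.2f}")
```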

For one client in the nutrition space, we watched their sentiment score climb from mostly red (negative) to predominantly green (positive) over about six weeks. The shift correlated directly with a content strategy we had implemented: publishing more expert-authored articles, earning mentions on authoritative health sites and ensuring consistent brand messaging across platforms that AI models tend to reference.

The breakdown at the prompt level is just as useful. You can see exactly which queries trigger negative sentiment and work backwards to understand why. Often it comes down to a single problematic source that the AI keeps citing, or outdated information that still lingers in the model’s training data.

What This Means for Your Brand Going Forward

AI answers are becoming a primary information channel for millions of users. The sentiment those answers carry about your brand is not something you can afford to ignore. Unlike traditional search where you control your own page’s messaging, AI responses are generated from a mix of sources you may not even know about.

My recommendation is simple. Start monitoring how AI models talk about you. Look beyond just whether you are mentioned and examine the emotional tone of those mentions. Build a content strategy that feeds positive, accurate, expert-backed information into the ecosystem that these models draw from.

Sentiment in AI answers is still a young field, and the tools for measuring it are improving rapidly. The brands that pay attention to this now will have a significant advantage as AI-generated answers become the default way people discover and evaluate products and services. The question is not whether AI sentiment matters. It is whether you are measuring it yet.

January 23, 2026
What Is Agentic Traffic in SEO?

AI agents are now browsing websites on behalf of users. They click buttons, fill in forms, and pull content back to chatbots like ChatGPT and Perplexity. This new wave of automated visits is called agentic traffic, and most site owners have no idea it is happening.

I have spent the last few months tracking how these AI agents interact with client websites. What I found surprised me. Many sites are accidentally blocking or confusing the very bots that could send them qualified visitors. In this article I will explain what agentic traffic is, why it matters, and how you can optimise for it.

How Agentic Traffic Differs from Traditional Crawlers

Traditional crawlers like Googlebot visit your pages, read the HTML, and index the content. They do not interact with the page. Agentic traffic is different. AI agents actively browse your site. They render JavaScript, attempt to click elements, and even try to submit forms.

Think of it this way. Googlebot is a reader. An AI agent is a user. It behaves more like a real person would, except it makes decisions based on page structure rather than visual cues. If your site has confusing overlays, misplaced buttons, or poorly labelled form fields, the agent will struggle. And when it struggles, it leaves.

Search Engine Land’s coverage of agentic SEO confirms that this shift is already well under way. The agents visiting your site today come from ChatGPT, Perplexity, and a growing list of AI-powered browsers.

Why Most Websites Fail the Agentic Test

Here is a stat that caught my attention. When I audited around a hundred websites for agentic compatibility, roughly 95% had issues with form submissions. The AI agents could not fill in and submit forms properly. That is not a small problem if your business depends on lead generation.

The most common issues I see are:
- CDN or firewall rules blocking AI user agents entirely
- Heavy client-side rendering that hides content from bots
- Overlays and pop-ups that confuse agent navigation
- Form fields without clear labels or accessible markup

On one client site, the AI agents could only see 49% of the page content. The rest was locked behind JavaScript rendering that the bots could not process. Half the page was invisible. That is a huge missed opportunity, and the client had no idea until we measured it.
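
A figure like that can be estimated by comparing the HTML the server returns with the HTML after a browser has rendered it. The sketch below is one crude way to do so, assuming you already have both versions as strings: it reports the share of rendered words that also appear in the raw response. It is not necessarily the method behind the 49% figure, and the sample pages are invented.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text, ignoring <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.words.extend(data.split())

def visible_words(html):
    parser = TextExtractor()
    parser.feed(html)
    return parser.words

def visible_share(raw_html, rendered_html):
    """Share of the rendered page's words that already exist in the raw HTML."""
    rendered = visible_words(rendered_html)
    raw = set(visible_words(raw_html))
    if not rendered:
        return 0.0
    return sum(word in raw for word in rendered) / len(rendered)

# Invented pages: the raw response lacks the paragraph that JavaScript adds.
raw = "<html><body><h1>Espresso Guide</h1><div id='app'></div><script>render()</script></body></html>"
rendered = "<html><body><h1>Espresso Guide</h1><div id='app'><p>Brew time matters</p></div></body></html>"
print(f"{visible_share(raw, rendered):.0%} of rendered words are in the raw HTML")
```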

Tracking and Measuring Agentic Traffic

You cannot optimise what you do not measure. The first step is to identify which AI agents visit your site and whether they succeed. I track metrics like hit counts, success rates per agent, and the percentage of content visible to non-JavaScript renderers.

For example, on one project I monitored 2,600 agentic hits over four weeks. ChatGPT’s agent had a 73% success rate. That means 27% of the time it failed to retrieve what it needed. By drilling into the failing URLs, I found specific pages where the CDN was blocking requests. A simple configuration change fixed it overnight.
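
The per-agent numbers above come down to simple counting once the log entries are parsed. A minimal sketch follows, assuming each hit is already reduced to an agent name, a URL and an HTTP status, and that any 2xx or 3xx response counts as a success. The agent names and entries are made up.

```python
from collections import Counter, defaultdict

def summarise_agent_hits(hits):
    """hits: (agent, url, status) tuples taken from server logs.
    A 2xx or 3xx status counts as a success (an assumed definition)."""
    totals = Counter()
    successes = Counter()
    failing = defaultdict(Counter)
    for agent, url, status in hits:
        totals[agent] += 1
        if 200 <= status < 400:
            successes[agent] += 1
        else:
            failing[agent][url] += 1
    return {
        agent: {
            "hits": totals[agent],
            "success_rate": successes[agent] / totals[agent],
            "failing_urls": failing[agent].most_common(),
        }
        for agent in totals
    }

# Made-up log entries for illustration.
sample_hits = [
    ("ChatGPT-User", "/pricing", 403),
    ("ChatGPT-User", "/", 200),
    ("ChatGPT-User", "/blog/geo", 200),
    ("ChatGPT-User", "/contact", 200),
    ("PerplexityBot", "/", 200),
]
report = summarise_agent_hits(sample_hits)
print(report["ChatGPT-User"])
```

Sorting the failing URLs by count puts the pages most worth investigating, such as those blocked at the CDN, at the top.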

Google’s own documentation on crawlers and user agents is a good starting point for understanding bot identification. But keep in mind that agentic bots behave differently from traditional search crawlers, so your monitoring needs to go further.

How to Optimise Your Site for AI Agents

My honest take: most of the advice floating around about AI optimisation focuses on prompt engineering and content formatting. That matters, but it misses the bigger problem. If the agent cannot even access your page or interact with it, your content is irrelevant. Fix the plumbing before you worry about the words.

Here is what I recommend based on my own testing:
- Audit your server logs for AI user agents. Check if they get 200 responses or errors.
- Test your pages with JavaScript disabled. If critical content disappears, you have a rendering problem.
- Remove or defer overlays that appear before the main content loads.
- Ensure all form fields have proper labels and that forms work without JavaScript where possible.
- Review your CDN and WAF rules. Many default configurations block legitimate AI agents.

The Search Engine Journal technical SEO guide covers many of these fundamentals, though their advice is geared towards traditional bots. The principles of clean structure, accessible markup, and fast server responses apply even more to agentic visitors.

Where Agentic Traffic Is Heading

We are still in the early days. Right now, AI agents mostly retrieve and summarise content. But the next generation of agent browsers will do much more. They will complete purchases, book appointments, and fill in applications on behalf of users. If your site is not ready for that, you will lose conversions to competitors whose sites are.

I expect agentic traffic to become a standard metric in SEO reporting within the next year. The sites that start tracking and optimising for it now will have a clear advantage. Those that ignore it will wonder why their traffic numbers look fine but their conversions keep dropping.

The shift from passive crawling to active browsing changes what it means to have a well-optimised website. Start measuring your agentic traffic today. Find the gaps. Fix the access issues. Your future visitors, both human and artificial, will thank you.

January 23, 2026
What Is GEO Measurement and Why Is It So Hard?

Most marketers talk about Generative Engine Optimisation as though it were just SEO with a new hat. But when you try to measure GEO performance, you quickly discover a problem: the numbers shift every time you look at them. AI responses are non-deterministic, region-dependent and shaped by personalisation. That makes reliable measurement genuinely difficult.

I have spent the past year tracking how brands appear inside AI-generated answers. In that time I have watched the same prompt return different brand mentions on Monday and Tuesday, from the same model, in the same location. If you are investing in GEO, you need to understand exactly where measurement stands today and where the gaps still sit.

Why AI Responses Keep Changing

Large language models are non-deterministic by design. Run the same prompt twice and you can get different wording, different sources cited and different brands mentioned. This is not a bug. It is how probabilistic text generation works. Temperature settings, model updates and cached context all influence output.

For traditional search, you could pull rank-tracking data and trust it for a week. With GEO, a single data point tells you very little. You need repeated samples across time, regions and user states to build a picture you can act on. Google’s own documentation on AI principles acknowledges that model behaviour varies with context. That variation flows straight into your measurement challenge.
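
To see why a handful of runs says so little, it helps to put an uncertainty range around the mention rate. The sketch below uses the Wilson score interval, a standard choice for proportions, and assumes each run is an independent sample, which is a simplification for AI responses. The sample counts are made up.

```python
import math

def wilson_interval(mentions, samples, z=1.96):
    """Approximate 95% Wilson score interval for the true mention rate."""
    if samples == 0:
        raise ValueError("need at least one sample")
    p = mentions / samples
    denom = 1 + z * z / samples
    centre = (p + z * z / (2 * samples)) / denom
    half = z * math.sqrt(p * (1 - p) / samples + z * z / (4 * samples ** 2)) / denom
    return centre - half, centre + half

# The same 60% observed mention rate, from 5 runs and from 200 runs.
for mentions, samples in [(3, 5), (120, 200)]:
    low, high = wilson_interval(mentions, samples)
    print(f"{mentions}/{samples}: {low:.2f} to {high:.2f}")
```

With five runs the plausible range spans most of the scale; with two hundred it narrows enough to compare one period against another.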

Region and Personalisation Add Noise

Location matters more than most people realise. A prompt about “best project management tools” will surface different brands in Spain than in the United States. The model draws on regional training data, local popularity signals and language cues. If your measurement tool does not simulate queries from the correct region, your data is misleading from the start.

Personalisation adds another layer. When a user is logged into ChatGPT or Gemini, their conversation history and preferences shape the response. That means two users asking the same question can see completely different brand recommendations. Measuring “your” visibility in AI answers is therefore always an approximation. The best you can do is control for region, strip out personalisation where possible and sample frequently.

Here is my contrarian take: most GEO tools overstate their accuracy because they run a handful of prompts from a single location and call it visibility data. That is not measurement. That is a screenshot. Real measurement requires hundreds of prompt variations, multiple regions and longitudinal tracking. If a vendor cannot explain their sampling methodology, treat their numbers with scepticism.

The Brand Detection Problem

One issue that caught me off guard early on was brand detection. It sounds simple: scan the AI response for your brand name and count mentions. But many brands share names with common words. Think “Apple” the fruit versus Apple the company, or “Teams” the Microsoft product versus teams of people.

When I first tested detection scripts against real AI outputs, false positives were everywhere. A response about workplace collaboration would mention “teams” five times without once referring to Microsoft Teams. You need entity disambiguation, not string matching. Tools like those reviewed by Search Engine Journal are starting to address this, but it remains a weak spot across the industry.

The solution involves building custom detection layers that understand context. You look at surrounding words, the prompt category and the typical entities that appear together. It is slow, manual work. But without it, your visibility score is fiction.
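
As a rough illustration of the "surrounding words" idea, the sketch below counts a brand hit only when the capitalised name has a product cue within a few words of it. The cue list and window size are invented, and this is a crude heuristic rather than real entity disambiguation; a production layer would need far more context than this.

```python
import re

# Context cues that suggest the product rather than the common noun (assumed list).
PRODUCT_CUES = {"microsoft", "slack", "zoom", "app", "channel", "licence", "install"}

def count_brand_mentions(text, brand="Teams", cues=PRODUCT_CUES, window=5):
    """Count capitalised brand hits that have a product cue within `window` words."""
    words = re.findall(r"[A-Za-z']+", text)
    count = 0
    for i, word in enumerate(words):
        if word != brand:  # case-sensitive, so lowercase "teams" is skipped
            continue
        nearby = words[max(0, i - window): i + window + 1]
        if any(w.lower() in cues for w in nearby):
            count += 1
    return count

generic = "Good teams share goals. Teams that meet weekly stay aligned, and remote teams need clear roles."
product = "Microsoft Teams integrates with Outlook, and the Teams app supports channel threads."
print(count_brand_mentions(generic), count_brand_mentions(product))
```

Plain string matching would count three hits in the first sentence; the context check counts none there and two in the second.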

What You Can Measure Today

Despite these challenges, useful measurement is possible. Here is what works reasonably well right now:

- Brand mention frequency across a large sample of prompts related to your category.
- Sentiment analysis of how AI models describe your brand when they do mention it.
- Share of voice compared to competitors within specific prompt clusters.
- Regional differences in brand visibility across key markets.

These metrics give you directional insight. They tell you whether your GEO efforts are moving the needle. They do not yet give you the precision of a Google Search Console click report. That gap is real, and pretending otherwise helps nobody.
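Two of these metrics, mention frequency and share of voice, are simple to compute once brand detection has been done. The sketch below assumes each sampled response has already been reduced to the set of brands detected in it. The brand names, the sample data and the `share_of_voice` helper are hypothetical.

```python
from collections import Counter


def share_of_voice(responses, brands):
    """Compute mention frequency and share of voice.

    responses: list of sets, each holding the brands detected in one sampled answer.
    Returns (frequency, sov), both dicts keyed by brand.
    """
    counts = Counter()
    for detected in responses:
        for brand in brands:
            if brand in detected:
                counts[brand] += 1
    total_mentions = sum(counts.values())
    n = len(responses)
    # Frequency: share of sampled answers that mention the brand at all.
    frequency = {b: (counts[b] / n if n else 0.0) for b in brands}
    # Share of voice: the brand's mentions as a share of all tracked-brand mentions.
    sov = {b: (counts[b] / total_mentions if total_mentions else 0.0) for b in brands}
    return frequency, sov


# Hypothetical sample: five answers within one prompt cluster.
sample = [{"Asana", "Trello"}, {"Asana"}, {"Monday"}, set(), set()]
frequency, sov = share_of_voice(sample, ["Asana", "Trello", "Monday"])
print(frequency["Asana"], sov["Asana"])  # 0.4 0.5
```

The two numbers answer different questions. Frequency falls when the model stops naming any brand at all, while share of voice only moves when the balance between competitors changes.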

Tools built specifically for GEO tracking, such as those explored in Moz’s research on AI search, are improving fast. Sampling methods are getting smarter. Regional simulation is more accurate. Brand detection is catching up. But we are still in the early innings.

Where GEO Measurement Goes From Here

The trajectory is clear. As more organisations invest in GEO, the demand for reliable measurement will push tooling forward. I expect three shifts over the next twelve months. First, API-level access to AI platforms will allow real-time sampling at scale. Second, standardised metrics will emerge so that brands can benchmark across tools. Third, personalisation modelling will let you estimate visibility for different audience segments, not just a generic “average user.”

My experience working with early GEO data for clients across multiple sectors has taught me one thing above all: patience matters more than precision right now. Track trends, not snapshots. Compare quarters, not days. Build your measurement practice today so that when the tools catch up, you already have baseline data to measure against.
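"Compare quarters, not days" can be sketched as a simple aggregation step. The example assumes you already hold one visibility score per sampling day, expressed as a percentage; the dates, the scores and the `quarterly_means` helper are invented for illustration.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def quarterly_means(daily_scores):
    """Average daily scores into (year, quarter) buckets.

    daily_scores: dict mapping a date to a visibility score (percent).
    """
    buckets = defaultdict(list)
    for day, score in daily_scores.items():
        quarter = (day.month - 1) // 3 + 1
        buckets[(day.year, quarter)].append(score)
    return {key: mean(values) for key, values in sorted(buckets.items())}


# Hypothetical daily scores: noisy day to day, clearer once grouped by quarter.
scores = {
    date(2025, 10, 1): 10.0,
    date(2025, 11, 15): 30.0,
    date(2026, 1, 10): 25.0,
    date(2026, 2, 20): 35.0,
}
print(quarterly_means(scores))  # {(2025, 4): 20.0, (2026, 1): 30.0}
```

In this toy data the score drops between two consecutive readings (30 to 25) while the quarterly average rises from 20 to 30, which is the kind of noise that snapshot comparisons mistake for a trend.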

GEO measurement is messy, imperfect and improving fast. The brands that accept that reality and invest anyway will be the ones with a head start when the data finally sharpens.
January 23, 2026
