Every few months, a new hype emerges that promises to change marketing forever.
This time it's prompt trackers: tools that show you how often your brand is mentioned in ChatGPT, Claude or Perplexity.
Sounds good at first. Who wouldn't want to know that?
The problem is that the technology behind it is rubbish – at least if you want to take this data seriously as a basis for decision-making.
Why Prompt Trackers Don't Work
Let's start with the most obvious problem:
AI is non-deterministic.
In plain language, this means that if you ask ChatGPT the same question twice, you will not reliably get the same answer. Sometimes there are nuances, sometimes completely different emphases.
For statisticians, this is a nightmare.
In order to make reasonably reliable statements, you would have to repeat each individual prompt very often – not just once or five times, but dozens or hundreds of times, depending on the desired accuracy.
In practice, this means:
Even with just a few hundred runs per prompt, costs can quickly reach high double-digit or triple-digit figures. Per prompt. Per model.
Anyone who wants to track 30, 40 or 50 prompts across multiple models will quickly end up with five-figure monthly costs – for data that is still highly noisy.
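The noise problem can be quantified. A brand mention in a response is a yes/no outcome per run, so estimating the mention rate is a binomial proportion estimate, and the standard sample-size formula tells you how many repeats you need. A minimal sketch (the per-call price is an illustrative assumption, not a real tariff):

```python
from math import ceil

def runs_needed(margin: float, z: float = 1.96) -> int:
    """Worst-case number of repeated runs so the observed mention rate
    is within +/- margin at ~95% confidence.
    Normal approximation: n = z^2 * p(1-p) / margin^2, with p = 0.5."""
    p = 0.5  # worst-case variance
    return ceil((z ** 2) * p * (1 - p) / margin ** 2)

# Pinning a mention rate down to +/- 5 percentage points already takes ~385 runs:
print(runs_needed(0.05))  # -> 385

# Illustrative cost for one full tracking pass, at an ASSUMED $0.02 per call:
calls = 50 * 3 * runs_needed(0.05)  # 50 prompts x 3 models x 385 runs
print(calls * 0.02)  # -> 1155.0 (per pass; daily passes put you well into five figures a month)
```

Tightening the margin makes it worse fast: halving the error bar roughly quadruples the required runs.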

API is not the same as ChatGPT
Most prompt trackers use APIs from OpenAI, Anthropic or other providers. This is understandable – anything else would be a clear violation of the terms of use.
The catch:
The API is not what users see on ChatGPT.com.
The web interface works with proprietary system prompts, internal tools, additional context enrichment and regularly changing control mechanisms. These levels are neither visible nor reproducible from the outside.
What does that mean specifically?
You measure responses with a prompt tracker from a technical setup that differs fundamentally from the real user experience.
Or, in other words:
You analyse data from a parallel universe – and wonder why it doesn't match reality.
The query fan-out problem
Another point that is rarely discussed:
When users ask complex questions, the AI does not process them internally as a single search query. Instead, parts are broken down, rephrased and evaluated in parallel – depending on the model, prompt and context.
A question like
"What are the best wireless headphones under £200?"
is internally expanded into several thematic search directions, such as:
- "Wireless headphones under £200"
- "Bluetooth headphones test"
- "Headphone comparison"
So you are tracking a prompt that is never the operational basis for the response in this form.
The AI does its own thing – and your tracking runs alongside it.
What works instead
Instead of model assumptions, some things can be measured quite specifically – without interpretation, without guessing.
- Where AI traffic actually comes from
- What these visitors do on the site
- Which AI content is preferred for linking
- How the proportion of AI develops over time
| Prompt Tracker | AI referral tracking |
|---|---|
| Measures model answers | Measures real user visits |
| Based on API setups | Based on genuine referrers |
| High degree of interpretation | Clear, verifiable data |
| High running costs | Low additional costs |
| Sense of control | Actual basis for decision-making |
The good news:
There is data that can actually be measured. Without estimates. Without interpretation.
AI referral traffic
If someone comes to your website via a link from ChatGPT, Perplexity or another AI, that is a fact.
No model, no simulation – a real visit.
And this is where it gets interesting when you take a closer look.
Breakdown by AI source
Not all AI works the same way:
- ChatGPT primarily uses Bing
- Claude uses Brave
- Perplexity operates its own search logic
Depending on where your content is easily discoverable, visitors will come from a wide variety of AI ecosystems.
If you separate them cleanly, you will quickly see which AI actually delivers relevant traffic, and which is just noise.
Conversion tracking per AI source
Traffic alone is nice.
But the crucial question is: what happens afterwards?
- Do these visitors buy?
- Do they sign up?
- Do they keep reading, or leave immediately?
If AI visitors from one source convert significantly better than others, this is not a feeling – it is a concrete indication of where investments are worthwhile.
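Computing that comparison is a simple aggregation over the visit log. A sketch under an assumed record shape ({"source": ..., "converted": ...}); a real setup would read from your analytics store:

```python
from collections import defaultdict

def conversion_by_source(visits: list[dict]) -> dict[str, float]:
    """Conversion rate per AI source from records like
    {"source": "ChatGPT", "converted": True}."""
    totals: dict[str, int] = defaultdict(int)
    wins: dict[str, int] = defaultdict(int)
    for v in visits:
        totals[v["source"]] += 1
        wins[v["source"]] += v["converted"]
    return {s: wins[s] / totals[s] for s in totals}

visits = [
    {"source": "ChatGPT", "converted": True},
    {"source": "ChatGPT", "converted": False},
    {"source": "Perplexity", "converted": True},
    {"source": "Perplexity", "converted": True},
]
print(conversion_by_source(visits))  # -> {'ChatGPT': 0.5, 'Perplexity': 1.0}
```

The output is directly actionable: a source with few visits but a high conversion rate may deserve more content investment than a high-volume source that converts nobody.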
Landing page analysis
What content is actually linked by AIs?
If advice articles consistently receive AI traffic, but product pages hardly any, then that is no coincidence.
It shows which types of content work for AI searches – and which do not.
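This, too, falls straight out of the visit log: count which landing paths AI-referred visits arrive on. A minimal sketch; the record shape is an assumption:

```python
from collections import Counter

def top_ai_landing_pages(visits: list[dict], n: int = 3) -> list[tuple[str, int]]:
    """Most frequent landing pages among AI-referred visits.
    Expects records like {"ai_source": "ChatGPT", "path": "/guides/headphones"},
    with ai_source set to None for non-AI traffic."""
    paths = Counter(v["path"] for v in visits if v["ai_source"])
    return paths.most_common(n)

visits = [
    {"ai_source": "ChatGPT", "path": "/guides/headphones"},
    {"ai_source": "Perplexity", "path": "/guides/headphones"},
    {"ai_source": None, "path": "/pricing"},
    {"ai_source": "ChatGPT", "path": "/blog/ai-search"},
]
print(top_ai_landing_pages(visits))
# -> [('/guides/headphones', 2), ('/blog/ai-search', 1)]
```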
AI share over time
A simple but underestimated point:
How is the proportion of AI traffic developing over weeks and months?
Is it rising steadily?
Is it stagnating?
Or does it suddenly fall off?
Such changes are early indicators.
If your AI share is falling while it is growing in the market, something is wrong.
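Tracking that share is a one-pass aggregation by ISO week. A sketch under an assumed record shape of (visit_date, is_ai_referral) pairs:

```python
from collections import defaultdict
from datetime import date

def weekly_ai_share(visits: list[tuple[date, bool]]) -> dict[str, float]:
    """Share of AI-referred visits per ISO week.
    Each record is (visit_date, is_ai_referral)."""
    total: dict[str, int] = defaultdict(int)
    ai: dict[str, int] = defaultdict(int)
    for day, is_ai in visits:
        iso = day.isocalendar()
        week = f"{iso.year}-W{iso.week:02d}"
        total[week] += 1
        ai[week] += is_ai
    return {w: ai[w] / total[w] for w in sorted(total)}

visits = [
    (date(2025, 1, 6), True),   # Monday, ISO week 2025-W02
    (date(2025, 1, 7), False),
    (date(2025, 1, 13), True),  # following ISO week
    (date(2025, 1, 14), True),
]
print(weekly_ai_share(visits))  # -> {'2025-W02': 0.5, '2025-W03': 1.0}
```

Plot that series over a few months and the trend the text describes, whether it is steady growth, stagnation or a sudden drop, becomes visible at a glance.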
Conclusion
Prompt trackers sell the feeling of control. Dashboards move, numbers change, everything seems measurable. But under the bonnet, it's mostly noise – expensive and difficult to classify.
Such tools may have their place as a playground for experiments. However, they are not suitable as a basis for real decisions.
What we already do at Trackboxx
We rely on AI referral tracking. Less spectacular, but honest.
You can see which AI sends you traffic – ChatGPT, Claude, Perplexity, Gemini – clearly broken down. You can see which pages are linked. And you can see which of these convert.
Real visitors. Real data. No estimates.

All of this is cookie-free, GDPR-compliant, and doesn't require you to spend thousands of pounds a month.
Try Trackboxx free for 30 days now
No payment information required! No automatic renewal! Your Trackboxx ready to go in 1 minute.