Why AI assistants ignore your site
Last updated: May 22, 2026
TL;DR
If AI assistants never cite you, the cause is almost always one of six concrete, fixable things: blocked crawlers, a CDN silently returning 403s to bots, content that only exists after JavaScript runs, missing structure, thin or vague copy, or no measurement of any of the above. None of it is mysterious and none of it requires “submitting” to anyone. Find which one applies, fix it, re-check.
It is not a ranking problem
A page can rank well on Google and still be invisible to ChatGPT, Claude or Perplexity. AI assistants reach you through their own crawlers, read pages differently than a browser, and decide what to quote based on whether the content is reachable, legible and concrete. So “why am I not cited” is rarely about authority — it is about one of the mechanics below.
1. The crawler is blocked in robots.txt
The most common cause, and the most boring. A default rule or migrated config disallows GPTBot, ClaudeBot or Google-Extended. Fix: explicitly allow the AI bots on public pages — the details are in robots.txt for AI crawlers.
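A quick way to check this without a browser is to parse your robots.txt the way a well-behaved crawler would. The sketch below uses Python's standard urllib.robotparser; the sample rules, the example.com URL and the helper name blocked_agents are illustrative assumptions, not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Tokens named in this article; extend the list for other crawlers you care about.
AI_AGENTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def blocked_agents(robots_txt: str, url: str, agents=AI_AGENTS) -> list[str]:
    """Return the agents that the given robots.txt text disallows for url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


# Hypothetical robots.txt: a leftover rule blocks GPTBot, everyone else is allowed.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

print(blocked_agents(SAMPLE, "https://example.com/pricing"))  # ['GPTBot']
```

To run it against a live site, fetch the text of /robots.txt yourself and pass it in. Keeping the parsing separate from the download makes the check easy to run in CI.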
2. Your CDN blocks bots before robots.txt is read
Cloudflare's Bot Fight Mode, a WAF rule, or a “block AI scrapers” toggle returns 403/429 to AI user-agents at the edge, so your robots.txt never gets a vote. Fix: allowlist the AI user-agents at the CDN, or, if you would rather monetize them, gate them with Bot Paywall instead of blocking them outright.
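One rough way to spot an edge block is to request the same URL with a browser-like user-agent and with each AI user-agent, then compare status codes. This is a minimal sketch using only the standard library. The user-agent strings are simplified assumptions (real crawlers send longer strings), and the function names are hypothetical.

```python
import urllib.error
import urllib.request

# Simplified user-agent strings, assumed for illustration only.
USER_AGENTS = {
    "browser": "Mozilla/5.0",
    "GPTBot": "GPTBot",
    "ClaudeBot": "ClaudeBot",
    "PerplexityBot": "PerplexityBot",
}


def fetch_status(url: str, user_agent: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code the server sends to this user-agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # 403, 429 and similar arrive as HTTPError


def edge_blocked(statuses: dict[str, int], baseline: str = "browser") -> list[str]:
    """Agents that get 403/429 while the baseline request succeeds."""
    if statuses.get(baseline) != 200:
        return []  # the page is failing for everyone, so this is not a bot rule
    return sorted(
        agent
        for agent, status in statuses.items()
        if agent != baseline and status in (403, 429)
    )


def audit(url: str) -> list[str]:
    """Fetch url once per user-agent and report which agents look blocked."""
    statuses = {name: fetch_status(url, ua) for name, ua in USER_AGENTS.items()}
    return edge_blocked(statuses)
```

Calling audit("https://example.com/") returns the agents that appear blocked. Treat the result as a hint, not proof: some CDNs also check the source IP, so a spoofed user-agent from your laptop may be handled differently from the real crawler.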
3. Your content only exists after JavaScript runs
Most AI crawlers fetch raw HTML and do not wait for a framework to hydrate. If your headline and body appear only after client-side JS, the bot sees an empty shell. Fix: server-render or pre-render the content you want quoted. Quick test: view-source the page. If the text is not there, AI crawlers never see it and assistants cannot quote it.
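The view-source test can be automated. The sketch below checks whether a phrase appears in the text of the raw HTML, ignoring anything inside script and style tags, which is roughly what a crawler that does not run JavaScript would see. The class and function names and the “Acme Widgets” sample pages are hypothetical.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.chunks.append(data)


def phrase_in_raw_html(html: str, phrase: str) -> bool:
    """True if phrase occurs in the non-script text of the unrendered HTML."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    text = " ".join(" ".join(parser.chunks).split()).lower()
    return " ".join(phrase.split()).lower() in text


# Made-up pages: one server-rendered, one that only fills in the text via JS.
SERVER_RENDERED = "<html><body><h1>Acme Widgets</h1><p>We ship widgets in 2 days.</p></body></html>"
JS_SHELL = '<html><body><div id="root"></div><script>render("We ship widgets in 2 days.")</script></body></html>'

print(phrase_in_raw_html(SERVER_RENDERED, "We ship widgets in 2 days"))  # True
print(phrase_in_raw_html(JS_SHELL, "We ship widgets in 2 days"))  # False
```

Pick a sentence from your headline or first paragraph as the phrase. If it is missing from the raw HTML of a page you want quoted, that page is a candidate for server-side rendering or pre-rendering.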
4. There is no structure to extract
No schema.org JSON-LD, a tangled heading outline, no descriptive title. Models turn prose into attributable facts through structure; without it, your content is a wall of text they cannot safely cite. Fix: add Organization/Article/FAQPage JSON-LD, a single clear <h1>, a sequential heading outline, and a curated llms.txt.
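As an illustration of the JSON-LD part of that fix, the sketch below builds a minimal schema.org Article block and wraps it in the script tag that goes in the page head. The field selection is a small assumed subset, and the URL and publisher name are placeholders; check schema.org for the properties that fit your page.

```python
import json


def article_json_ld(headline: str, description: str, url: str,
                    publisher_name: str, date_modified: str) -> str:
    """Return a <script> tag containing a minimal schema.org Article object."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "description": description,
        "url": url,
        "dateModified": date_modified,
        "publisher": {"@type": "Organization", "name": publisher_name},
    }
    # Escape "</" so a value can never close the surrounding script tag early.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Placeholder values; only the headline and date come from this article.
print(article_json_ld(
    headline="Why AI assistants ignore your site",
    description="Six fixable reasons AI assistants do not cite a site.",
    url="https://example.com/blog/why-ai-assistants-ignore-your-site",
    publisher_name="Example Co",
    date_modified="2026-05-22",
))
```

Generating the block from the same data that renders the page keeps the headline and dates in the markup and the JSON-LD from drifting apart.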
5. The copy is thin or abstract
A page that is mostly imagery, slogans and marketing abstraction gives a model nothing concrete to lift. Fix: lead with the answer, name your brand and what it does in the first paragraph, and include real specifics — names, numbers, steps — in plain text. This is the same discipline behind getting cited by AI assistants.
6. You are not measuring it
You cannot fix what you cannot see. Most of the five issues above are invisible from a normal browser; they only show up when you fetch the page the way a bot does. Fix: audit per-agent on a standing cadence, especially after CDN, framework or theme changes, which break AI crawlability far more often than people expect.
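A standing cadence is only useful if each run is compared with the previous one. A minimal sketch of that comparison, assuming you store one HTTP status per agent per run; the function name and the sample snapshots are made up for illustration.

```python
def regressions(previous: dict[str, int], current: dict[str, int]) -> dict[str, tuple[int, int]]:
    """Agents that returned 200 in the previous run but not in the current one."""
    return {
        agent: (previous[agent], status)
        for agent, status in current.items()
        if previous.get(agent) == 200 and status != 200
    }


# Hypothetical snapshots taken before and after a CDN change.
before = {"GPTBot": 200, "ClaudeBot": 200, "PerplexityBot": 200}
after = {"GPTBot": 200, "ClaudeBot": 403, "PerplexityBot": 200}

print(regressions(before, after))  # {'ClaudeBot': (200, 403)}
```

Running a check like this after every deploy turns a silent edge block into an alert.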
The fastest path to a diagnosis
Instead of guessing which of the six applies, run a free audit: it fetches your page as GPTBot, ClaudeBot, PerplexityBot and Google-Extended, post-CDN, and returns a 0–100 score with the exact failing checks ranked by severity. Prefer a directional read without a URL? Take the 2-minute readiness quiz. When you are ready to monitor it continuously, the plans add scheduled re-runs and regression alerts.
FAQ
Is there an API to submit my site to ChatGPT?
No. There is no “register with ChatGPT” endpoint. Anyone selling that is selling snake oil. The only durable lever is reachable, legible, useful content.
How long until changes show up in assistants?
Recrawl cadence varies by vendor — days to weeks. Pinging IndexNow speeds up Bing/Copilot; for the rest, fix it once and it compounds as they recrawl.
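For reference, an IndexNow ping is a single JSON POST. The sketch below only builds the request. It assumes the shared api.indexnow.org endpoint and a key file hosted at the site root; the host, key and URL in the example are placeholders you would replace with your own.

```python
import json
import urllib.request

INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"


def build_indexnow_request(host: str, key: str, urls: list[str]) -> urllib.request.Request:
    """Build (but do not send) an IndexNow submission for the given URLs."""
    payload = {
        "host": host,
        "key": key,
        # Assumes the key file lives at the site root as <key>.txt.
        "keyLocation": f"https://{host}/{key}.txt",
        "urlList": list(urls),
    }
    return urllib.request.Request(
        INDEXNOW_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


# Placeholder host, key and URL.
request = build_indexnow_request(
    "example.com", "0123456789abcdef", ["https://example.com/pricing"]
)
print(request.get_method(), request.full_url)
```

Send it with urllib.request.urlopen(request) once the key file is live on your domain. Without that file the submission is rejected.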
Which of the six is most common?
Crawl-level blocks (reasons 1 and 2) and JS-only content (reason 3). They are also the highest-impact to fix — an audit tells you which applies in seconds.
Find out which reason is yours: run a free audit or browse the blog for the deep dives.