AI Citation Tracking: What It Is, Why It Matters, and What 274 Brands Just Told Us

AI search is a citation economy now. Here is what AI citation tracking actually means, what 274 brands and 390,000 logged citations taught us, and how to start showing up.

Mac MacDonald | 14 min read

AI citation tracking is the practice of measuring how often, where, and how your brand appears when an AI assistant answers a question. It is the closest thing AI search has to a ranking, and most brands are flying completely blind on it.

We just finished a 70-day analysis of 274 brands across 12 industries, scanned across the six major AI platforms (ChatGPT, Claude, Gemini, Google AI Overviews, Microsoft Copilot, and Perplexity), and logged roughly 390,000 citations along the way. The data is sobering, useful, and changes the way you should think about AI visibility work.

TL;DR

  1. 68% of relevant prompts return zero mentions of an untreated SMB brand across all six major AI platforms. That is the modal outcome, not a worst case.

  2. Reddit is cited for 96% of brands. YouTube for 94%. Two community-content platforms run the AI citation substrate, not corporate blogs or homepages.

  3. ChatGPT and Claude expose more than half of what they read only as "fan-out" sources, not as citations. If your tool only counts cited URLs, you are missing at least half the picture on the two largest chat platforms.

  4. Search-grounded platforms (AIO, Copilot, Perplexity) cite earned third-party content at up to 10x the rate of chat-grounded platforms (ChatGPT, Claude, Gemini): Copilot sits at 20.3%, Claude at 2.0%. Each platform's earned-citation share moved by less than half a percentage point across two analytical cuts ten days apart.

  5. The visibility ceiling between industries differs by 10x or more. Food & Beverage brands get mentioned in roughly half of relevant prompts. Real Estate brands get mentioned in roughly 1 in 20.

What "AI citation tracking" actually means

When an AI answers a question, four different things can happen to your brand inside that response. Most teams treat them as one signal. They are not.

Citation. Your brand name appears with a URL backing it, in a clear answer-relevant position. This is the gold standard. The reader can see the source, click through, and trust the claim.

Reference. Your brand is named in the response and a domain is mentioned alongside it, but the citation is not cleanly linked. Still a positive mention, but the trust signal is softer.

Passing mention. Your brand appears in the response conversationally, with no source attribution. The AI knows you exist. The user has no way to verify.

Fan-out source. The platform lists your URL as something it read while researching the answer, but did not surface as a primary citation. This signal exists on some platforms and not others, which we'll get to.

A real AI citation tracking system measures all four. Most tools on the market measure only the first one, sometimes only the first two. The brands we analyzed showed wildly different shapes depending on which signal you pay attention to, which is the whole point of why this matters.

Counting only "citations" is like counting only your first-page Google rankings and calling it traffic. It is a real number. It is also a fraction of the picture.
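To make the four signals concrete, here is a minimal sketch of how a single brand sighting might be sorted into one of them. The field names and the precedence order are assumptions made for illustration; they are not a description of how any particular tool classifies mentions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Signal(Enum):
    CITATION = "citation"
    REFERENCE = "reference"
    PASSING_MENTION = "passing mention"
    FAN_OUT = "fan-out source"


@dataclass
class Observation:
    """One brand sighting in one AI response (hypothetical field names)."""
    named_in_answer: bool   # brand name appears in the visible answer
    linked_url: bool        # a clickable URL backs the mention
    domain_mentioned: bool  # a domain is named but not cleanly linked
    in_fanout_list: bool    # brand URL is listed among sources the platform read


def classify(obs: Observation) -> Optional[Signal]:
    """Return the strongest signal the observation qualifies for."""
    if obs.named_in_answer and obs.linked_url:
        return Signal.CITATION
    if obs.named_in_answer and obs.domain_mentioned:
        return Signal.REFERENCE
    if obs.named_in_answer:
        return Signal.PASSING_MENTION
    if obs.in_fanout_list:
        return Signal.FAN_OUT
    return None  # the brand did not appear at all


# A brand that was read during research but never named in the answer.
print(classify(Observation(False, False, False, True)).value)  # fan-out source
```

Keeping the four outcomes as separate values, instead of a single "mentioned" flag, is what lets the later per-signal comparisons be made at all.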

Why most SMBs are flying blind: the 68% zero-floor

Here is the single most important number from our analysis: when a small or mid-sized business has not done any AI search optimization, 68% of relevant prompts return zero mentions of that brand across the entire six-platform AI surface.

We pulled this from a clean cohort of 2,471 prompts that were scanned across all six platforms. Of those prompts:

  • 1,691 (68.4%) returned zero mentions anywhere

  • 196 (7.9%) returned exactly one platform mentioning the brand

  • Only 54 (2.2%) returned a mention on all six platforms

That is the baseline. Not the bottom of the distribution. The center of it.
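For readers who want to compute the same number on their own prompt log, here is a rough sketch. The data layout (a mapping from each prompt to the set of platforms that named the brand) and the sample prompts are assumptions made for illustration; the final loop simply reproduces the percentages above from the reported counts.

```python
from collections import Counter


def platform_count_distribution(mentions_by_prompt):
    """Count prompts by how many platforms mentioned the brand.

    mentions_by_prompt maps a prompt to the set of platforms that named the brand.
    """
    return Counter(len(platforms) for platforms in mentions_by_prompt.values())


def zero_floor(mentions_by_prompt):
    """Share of prompts with no mention on any platform."""
    dist = platform_count_distribution(mentions_by_prompt)
    return dist[0] / len(mentions_by_prompt)


# Tiny made-up sample, for illustration only.
sample = {
    "best specialty coffee roasters": {"Google AIO", "Gemini"},
    "coffee subscription for offices": set(),
    "single-origin espresso beans": set(),
    "decaf that tastes good": {"Perplexity"},
}
print(f"zero-floor: {zero_floor(sample):.0%}")  # zero-floor: 50%

# Reproducing the reported shares from the reported counts.
total = 2471
for label, count in [("zero platforms", 1691), ("one platform", 196), ("all six", 54)]:
    print(f"{label}: {count / total:.1%}")
```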

This matters because the entire AI search marketing industry talks about "increasing your visibility score" without telling you what the starting line looks like. The starting line is silence. Two out of three relevant questions, on all six AI platforms, returning nothing about your brand. That is what a typical SMB looks like to AI today.

The number is also moving in the wrong direction as more brands enter the cohort. On May 15 it was 58%. Ten days later, with 31% more brands in the dataset (most of them newly onboarded), it climbed to 68%. The brands that arrive without prior AI search work pull the floor down with them.

There is one piece of suggestive (not conclusive) news worth flagging. Across our active customer base, brands using AI Sightline at the Pro tier with daily scan cadence and active optimization tools currently average roughly 2.4x the visibility score of the untreated baseline. This is correlational, not causal. Brands who pick a paying plan also tend to invest elsewhere. We are watching it carefully because it is the first datapoint suggesting that systematic optimization moves the floor, but we are not ready to make a stronger claim yet.

What we can say with confidence: if you are not measuring your zero-floor, you do not know if any AI visibility work you are doing is moving it.

The substrate the AI is actually reading

When AI platforms pick which third-party domains to cite, the top of the list is not what most teams expect. Across all 274 brands and 12 industries we analyzed, two domains dominate to a degree that is genuinely surprising:

Rank | Domain           | Brand coverage
-----|------------------|------------------------
1    | reddit.com       | 263 of 274 brands (96%)
2    | youtube.com      | 258 of 274 brands (94%)
3    | facebook.com     | 216 of 274 brands (79%)
4    | en.wikipedia.org | 199 of 274 brands (73%)
5    | linkedin.com     | 194 of 274 brands (71%)

Reddit is cited for 96% of brands. YouTube is cited for 94%. Two community-content platforms are the universal AI citation substrate. This pattern holds across consumer industries (Food, Hospitality, Retail) and regulated ones (Healthcare, Finance, Industrial) alike.
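Brand coverage counts brands, not citation volume: a domain scores once per brand, no matter how many times it was cited for that brand. A minimal sketch of that calculation, assuming a flat log of (brand, cited domain) pairs with made-up brand names:

```python
from collections import defaultdict


def brand_coverage(citations, total_brands):
    """Return {domain: share of brands with at least one citation to it}.

    citations is an iterable of (brand, cited_domain) pairs.
    """
    brands_by_domain = defaultdict(set)
    for brand, domain in citations:
        brands_by_domain[domain].add(brand)
    return {d: len(b) / total_brands for d, b in brands_by_domain.items()}


# Hypothetical log for a three-brand cohort.
log = [
    ("brand-a", "reddit.com"),
    ("brand-a", "reddit.com"),   # repeat citations do not raise coverage
    ("brand-b", "reddit.com"),
    ("brand-b", "youtube.com"),
    ("brand-c", "reddit.com"),
]
coverage = brand_coverage(log, total_brands=3)
for domain, share in sorted(coverage.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{domain}: {share:.0%}")
```

This is why a domain can rank high on coverage while contributing relatively few primary citations per brand.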

Why? Reddit and YouTube are now formal AI training data. Reddit signed a $60M/year content licensing deal with Google in February 2024 (CBS News), and similar arrangements exist with other major AI labs. According to Columbia Journalism Review's analysis, between August 2024 and June 2025, Reddit was the most-cited domain by Google AI Overviews and Perplexity, and the second-most cited by ChatGPT (CJR). Our data, from a completely independent 274-brand sample, points the same way.

Now here is the strange part. Wikipedia is in the top 5 by brand coverage, but 84.5% of Wikipedia "citations" are actually fan-out sources, not primary citations. The AI reads Wikipedia constantly while researching your category. It rarely surfaces Wikipedia as the named source in the visible answer.

This inverts a strategy a lot of brand teams have been chasing for years. The assumption was "if I can get into Wikipedia, AI will cite me." The data says "if you get into Wikipedia, AI will read Wikipedia about you and then cite something else."

The substrate has moved. The AI citation layer runs on community content, not enterprise content. The "post on your blog and wait" SEO strategy was already weakening before AI. This data suggests it is nearly defunct for citation acquisition.

The fan-out iceberg most tools ignore

ChatGPT exposes about 2.8 fan-out sources for every source it cites in the answer. Claude exposes about 1.2. Combined, the two largest chat platforms have surfaced more than 76,000 fan-out URLs in our dataset, sources their models read but chose not to put in front of the user.

This pattern held nearly perfectly across two analytical cuts ten days apart, with a 31% larger brand cohort the second time. It is not a snapshot artifact. It is the structural state of AI search transparency in 2026.

The practical consequence is uncomfortable. If your visibility tool only counts cited URLs, then on ChatGPT and Claude you are counting only the tip of the iceberg. Adding fan-out sources to cited sources, the set of pages the AI actually read is roughly 2.2 times (Claude) to 3.8 times (ChatGPT) the size of the cited set your tool is showing you. A "citation count" on these platforms structurally undercounts reading attention by more than half.
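The arithmetic behind that undercount is short enough to sketch. It assumes fan-out sources and cited sources do not overlap, which follows from the definition of fan-out used earlier but is still an assumption about how each platform reports its lists.

```python
def cited_share(fanout_per_cited: float) -> float:
    """Share of exposed sources that were cited, given fan-out sources per cited source."""
    return 1.0 / (1.0 + fanout_per_cited)


# Fan-out sources per cited source, as reported in this section.
FANOUT_RATIO = {"ChatGPT": 2.8, "Claude": 1.2}

for platform, ratio in FANOUT_RATIO.items():
    share = cited_share(ratio)
    print(
        f"{platform}: {share:.0%} cited, {1 - share:.0%} fan-out only, "
        f"{1 + ratio:.1f}x the cited count in total"
    )
```

On these figures, a citation-only count sees about a quarter of what ChatGPT exposes and a little under half of what Claude exposes.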

For SEO and content teams: fan-out is the closest thing you have to "what topics were on the AI's research desk when it answered about my category." A brand that gets cited 5 times and shows up in 15 fan-out lists is a brand the AI is paying attention to, even if it did not earn the final citation slot. That is actionable signal.

For anyone shopping AI visibility tools right now: ask the vendor whether they capture fan-out separately, on which platforms, and where it shows up in the product. If the answer is "we count citations," you know what they are missing.

The 10:1 gap between search-grounded and chat-grounded platforms

The clearest dichotomy in the data is between platforms that ground answers in live web search and platforms that rely on training data plus optional tool use. We call them search-grounded and chat-grounded. They behave like two completely different systems.

The cleanest evidence is the share of total citations that come from earned third-party content (press coverage, review sites, directory listings, news articles, anything the brand does not directly own):

Platform   | Earned citation share | Style
-----------|-----------------------|----------------
Copilot    | 20.3%                 | Search-grounded
Google AIO | 20.2%                 | Search-grounded
Perplexity | 7.8%                  | Search-grounded
Gemini     | 4.6%                  | Chat-grounded
ChatGPT    | 3.8%                  | Chat-grounded
Claude     | 2.0%                  | Chat-grounded

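The two leading search-grounded platforms, Copilot and Google AIO, cite earned content at roughly 4 to 10 times the rate of the chat-grounded platforms (about 20% versus 2.0% to 4.6%). The ratio between Copilot and Claude is almost exactly 10:1. Perplexity, at 7.8%, sits between the two groups.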

This held across two cuts of the data ten days apart. Every platform's earned-citation share moved by less than half a percentage point. The 10:1 gap is structural, not provisional.
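Because the size of the gap depends on which two platforms you compare, it can help to look at every search-grounded versus chat-grounded pair. A small sketch using the shares from the table above (the grouping constants simply mirror the Style column):

```python
# Earned-citation share by platform, in percent, copied from the table above.
EARNED_SHARE = {
    "Copilot": 20.3, "Google AIO": 20.2, "Perplexity": 7.8,
    "Gemini": 4.6, "ChatGPT": 3.8, "Claude": 2.0,
}
SEARCH_GROUNDED = ("Copilot", "Google AIO", "Perplexity")
CHAT_GROUNDED = ("Gemini", "ChatGPT", "Claude")


def pairwise_ratios():
    """Ratio of earned share for every (search-grounded, chat-grounded) pair."""
    return {
        (s, c): EARNED_SHARE[s] / EARNED_SHARE[c]
        for s in SEARCH_GROUNDED
        for c in CHAT_GROUNDED
    }


ratios = pairwise_ratios()
for (s, c), r in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{s} vs {c}: {r:.1f}x")
```

The widest pair is Copilot versus Claude at about 10x; the narrowest is Perplexity versus Gemini at under 2x.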

What this means: a single "AI visibility score" hides 80% of the information you need. Two brands with identical composite scores can have wildly different platform shapes. Optimizing for ChatGPT and optimizing for AI Overviews are not the same job and do not respond to the same work.

Practical translation:

  • To get cited on chat-grounded platforms (ChatGPT, Claude, Gemini): show up in the substrate the model was trained on. Reddit, YouTube, Quora, G2, Wikipedia (as research input), high-authority blogs that get scraped.

  • To get cited on search-grounded platforms (Copilot, AIO, Perplexity): earn fresh press coverage, get listed in vertical-relevant directories, build strong on-page SEO for the queries the AI sub-searches, and (for local) build local citations.

A brand doing only the first set ranks on chat platforms and underperforms on search platforms. A brand doing only the second wins on AIO and Copilot and never breaks through on ChatGPT or Claude. The brands winning on both are doing both, deliberately.

Industries don't start at the same line

If you take one thing from this section, take this: your industry's structural visibility ceiling is largely set before you do anything.

Per-platform mention rate (the percent of prompts where the AI named at least one of the cohort brands) by industry:

Industry              | Best platform | Best-platform mention rate | Worst platform | Worst-platform mention rate
----------------------|---------------|----------------------------|----------------|----------------------------
Food & Beverage       | Google AIO    | 49.2%                      | ChatGPT        | 37.8%
Financial Services    | Google AIO    | 30.6%                      | Claude         | 5.6%
Hospitality & Travel  | Gemini        | 28.0%                      | Claude         | 10.4%
Marketing Agencies    | Claude        | 25.3%                      | Perplexity     | 10.6%
Retail & E-commerce   | Google AIO    | 24.7%                      | ChatGPT        | 11.7%
Industrial & Energy   | Google AIO    | 16.9%                      | ChatGPT        | 1.7%
Healthcare & Wellness | Gemini        | 13.8%                      | ChatGPT        | 3.1%
B2B SaaS & Tech       | Gemini        | 13.5%                      | ChatGPT        | 10.8%
Real Estate           | Google AIO    | 5.0%                       | ChatGPT        | 1.1%

Food & Beverage brands get mentioned in roughly half of relevant AI responses. Real Estate brands get mentioned in roughly 1 in 20. The mention-rate ceiling between industries differs by a factor of 10 or more.

The reasons are structural, not strategic:

  • Food & Beverage wins because the answer space is short and concrete. "Best wine subscription," "specialty coffee roasters," "non-alcoholic cocktail brands" all force the AI to name brands to answer.

  • Real Estate loses because the answer space defaults to advice. "How do I evaluate a brokerage" produces educational content, not brand lists. Even when brands surface, they tend to be marketplaces like Zillow, not local brokerages.

  • Industrial flips the script across platforms. ChatGPT mentions industrial brands in 1.7% of prompts, but Copilot, AIO, and Perplexity sit in the mid-teens, with AIO highest at 16.9%. Search-grounded AI finds brands that chat-grounded AI never trained on.
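One way to see which industries are most platform-dependent is to divide each industry's best-platform rate by its worst-platform rate. The sketch below does that with the figures from the table above; the ratio is an illustrative measure chosen here, not a metric from the report.

```python
# (best-platform rate, worst-platform rate) in percent, copied from the table above.
MENTION_RATES = {
    "Food & Beverage": (49.2, 37.8),
    "Financial Services": (30.6, 5.6),
    "Hospitality & Travel": (28.0, 10.4),
    "Marketing Agencies": (25.3, 10.6),
    "Retail & E-commerce": (24.7, 11.7),
    "Industrial & Energy": (16.9, 1.7),
    "Healthcare & Wellness": (13.8, 3.1),
    "B2B SaaS & Tech": (13.5, 10.8),
    "Real Estate": (5.0, 1.1),
}


def platform_spread(rates):
    """Best-to-worst platform ratio per industry, widest gap first."""
    spread = {name: best / worst for name, (best, worst) in rates.items()}
    return sorted(spread.items(), key=lambda item: item[1], reverse=True)


for industry, ratio in platform_spread(MENTION_RATES):
    print(f"{industry}: {ratio:.1f}x")
```

Industrial & Energy shows the widest gap between its best and worst platform (close to 10x), while Food & Beverage and B2B SaaS & Tech are nearly flat across platforms.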

And then there is the Marketing Agency paradox. Agencies should, in theory, be the best-positioned industry for AI search visibility (they live in the search ecosystem). The reality:

  • 25 of 38 agencies in our cohort (66%) score zero on composite visibility.

  • A handful of names dominate, and most of them are not agencies. Two products (Semrush, Ahrefs) and one directory (Clutch) account for nearly half of all marketing-related citations.

  • The category is a strict power law.

Some industries are easy mode for AI visibility and some are hard mode. Setting realistic per-industry expectations is half the strategy. Promising a Real Estate brokerage that they will be "cited number one on ChatGPT" is dishonest. Helping them go from invisible to one of the named commercial brands cited alongside Zillow is achievable, and valuable.

How AI Sightline tracks citations and fan-outs


When you create a brand in AI Sightline, here is what happens.

1. Onboarding scan, in under three minutes. You give us your domain and your industry. We pull your homepage, suggest a starter set of prompts that real buyers in your category actually ask, let you edit them, and run an initial scan across the AI platforms your tier covers. You see your first citation map and your first zero-floor number in the same session you signed up. No 30-day onboarding period. No "schedule a kickoff call."

2. Daily scans, every platform you are subscribed to. From day two onward, your prompts run automatically on a daily cadence (Pro and above) or weekly cadence (Starter). Every response is logged. Every URL is extracted. Every mention is classified as citation, reference, passing mention, or fan-out source. Nothing is averaged into a single black-box number.

3. Full citation extraction across all four signal types. Citations get a URL, a role, and a category (community, earned third-party, owned, infrastructure, government, retailer, social, etc.). Fan-outs are captured separately on the platforms that expose them, so you can see what the AI read but did not put in the answer. This is the surface most tools quietly skip. It is one of the most important signals in the dataset.

4. Per-platform breakdown, not a single score. You see how you are doing on ChatGPT, Claude, Gemini, AIO, Copilot, and Perplexity separately. You see which platforms cite you and which only reference you. You see where your competitors are winning and where you are. The composite visibility score is there for trending; the per-platform view is where the strategy lives.

5. Share of voice, by query and by competitor. For every prompt you track, you see who is being cited alongside you, who is being cited instead of you, and how that has moved over time. This is the closest thing AI search has to a SERP, and it is where most optimization decisions actually get made.
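As a rough illustration of the share-of-voice idea, the sketch below uses one common definition: a brand's share of all brand citations logged for a prompt, counting each brand at most once per response. This definition and the brand names are assumptions for the example, not a statement of how AI Sightline computes the figure.

```python
from collections import Counter


def share_of_voice(cited_brands_per_response):
    """Return {brand: share of all brand citations} for one prompt.

    cited_brands_per_response has one entry per logged response,
    each entry being the list of brands cited in that response.
    """
    counts = Counter(
        brand for response in cited_brands_per_response for brand in set(response)
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {brand: n / total for brand, n in counts.most_common()}


# Made-up responses to one prompt across three scans.
responses = [
    ["Brand A", "Brand B"],
    ["Brand B"],
    ["Brand B", "Brand A", "Brand C"],
]
sov = share_of_voice(responses)
for brand, share in sov.items():
    print(f"{brand}: {share:.0%}")
```

Tracking the same calculation over successive scans is what shows whether a competitor is gaining or losing ground on a given prompt.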

The shortest path to all of this is the free tier. Three prompts, two AI platforms, real data on your zero-floor in 30 seconds. No credit card.

What we do with the data to help you show up more

Tracking is the first half. Acting on what the data says is the second half, and this is where most AI visibility tools wave their hands.

Here is what AI Sightline does with the citation, fan-out, and share-of-voice data we capture on your behalf:

Recommendations engine. Every brand on the Pro tier and above gets a continuously refreshed action list pulled directly from the gaps in your citation profile. Recommendations come in four categories: content gaps (topics your competitors are getting cited for and you are not), schema fixes (structured data improvements that increase the chance AI surfaces your owned content), substrate participation (specific places, such as subreddits and directories, that are missing your brand for queries you should win), and competitive displacement (specific prompts where one competitor is winning and the citation gap is closable).

Each recommendation comes with a "do this exact thing" instruction and an expected impact estimate. You complete it, mark it done, and we track whether the next scan reflects the lift. The recommendations engine is not a content idea generator. It is an action queue.

Content audit and content generation. On Pro and above, we audit your owned content (homepage, product pages, blog posts) for AI-readiness: schema completeness, citation-style answer formatting, fan-out friendliness, alt text, internal linking patterns. Then we surface "rewrite this paragraph this way" suggestions and, on higher tiers, generate AI-optimized drafts you can edit and publish.

Schema suggestions, automatically generated. Structured data is one of the few things you fully control that meaningfully affects how AI surfaces your owned content. AI Sightline reads your pages and proposes the JSON-LD blocks that match the answer shapes AI platforms actually cite from. On Business and above, we generate the schema for you.
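For readers who have not worked with JSON-LD, here is a minimal sketch of what a generated block can look like. It builds a schema.org Organization object with Python's standard json module; the brand name and URLs are hypothetical placeholders, and this is not the output format of any specific tool.

```python
import json


def organization_jsonld(name, url, same_as):
    """Build a minimal schema.org Organization object as a plain dict."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "sameAs": list(same_as),  # official profiles on other sites
    }


# Hypothetical brand and profile URLs, for illustration only.
block = organization_jsonld(
    name="Example Roasters",
    url="https://www.example.com",
    same_as=[
        "https://www.youtube.com/@example",
        "https://www.linkedin.com/company/example",
    ],
)

# JSON-LD is embedded in a page inside a script tag of this type.
print('<script type="application/ld+json">')
print(json.dumps(block, indent=2))
print("</script>")
```

Real pages usually need more specific types (Product, FAQPage, LocalBusiness) and more properties; the point here is only the overall shape.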

Topic authority map. On Pro and above, you get a category map showing where you are building authority, where you are flat, and which adjacent topics could open new citation surface. This is where you decide what to write next based on what the AI is actually willing to cite.

MCP access on Pro and above. Every visibility signal we track is also queryable from an AI agent. Your team can ask Claude or ChatGPT "how did our Perplexity citation rate move this week" and get a real answer from real data. Most competitors are dashboard-only. We think the future of this category is dashboards plus agents, and we have both.

The point of the data is not to make you stare at a chart every Monday. The point is to compress the 68% zero-floor. Every recommendation, every audit, every schema suggestion is a tool to do that. It is the closest thing the AI search world has to "here is what to do tomorrow morning."

What to do tomorrow morning if you take nothing else from this

If the post stops being useful right here, take these six actions and you will be ahead of 80% of your category.

  1. Be on Reddit. For real, not as a marketing project. Reddit is cited for 96% of brands in our dataset. Find the subreddits where your category gets discussed. Show up as a real participant. Answer questions. Let other people mention your brand naturally. If no relevant subreddit exists, that itself is a signal.

  2. Audit which AI platform actually matches your buyer's behavior. Optimize for the platforms your buyers are actually using. Do not assume "all AI" is one surface. They disagree more than they agree.

  3. Get your brand into the directories that match your vertical. Clutch and G2 for B2B services. Yelp and Angi for local. Capterra for SaaS. TripAdvisor and food publications for hospitality. Find the 3 to 5 most-cited directories in your industry and get a complete, active listing on each.

  4. Track citations and fan-out, not just a single visibility score. The per-platform shape is where the strategy lives. A composite score hides 80% of the information.

  5. If you are in a regulated vertical, set realistic expectations. Government and association domains will keep the top citation slots. Aim to be one of the named commercial brands cited alongside them, not to displace them.

  6. Start measuring your zero-floor. You cannot improve a number you cannot see. Even a manual sample of 20 to 30 buyer-realistic prompts run across all six platforms will give you a defensible baseline.

Start tracking free

This entire post was built on the same kind of data we capture for every brand we track. Citations, fan-outs, share of voice across all six major AI platforms, every day.

If you want to see what your own brand looks like in that view, the free tier gives you three prompts on two AI platforms, no credit card required. You can run a starter scan in under three minutes and have a real number for your zero-floor before you finish a coffee.

Start tracking your AI citations free.

Show up where AI answers.


This analysis is based on AI Sightline's May 2026 State of AI Search for SMBs report, covering 274 brands across 12 industries, ~85,000 AI responses, and roughly 390,000 logged citations across 70 days. Detailed methodology and per-industry tables available on request at aisightline@gmail.com.

Mac MacDonald
Founder, AI Sightline

Solo founder building AI visibility monitoring. Ships weekly. No venture capital, a lot of opinions about where AI search is going.