What Sources Do LLMs Cite? Research Study 2026

The Citation Question Every Content Team Should Be Asking

When ChatGPT answers a question, it is making thousands of micro-decisions about which sources to trust, cite, and paraphrase. Those decisions are not random - they follow patterns that are measurable and, to a significant degree, predictable. If you understand what predicts LLM citation, you can engineer content that is more likely to be chosen.

We ran 10,000 queries across ChatGPT (GPT-4o), Perplexity, Claude 3.5 Sonnet, and Gemini Advanced, tracking which domains were cited, how often, and in what contexts. We then analyzed those cited domains against 14 different website and content attributes to understand what predicts citation. Some findings confirmed existing AEO theory. Others were genuinely surprising.

Study Design

10,000 queries were distributed equally across four query categories:

Informational: "What is [concept]?" style queries (2,500 queries)
Comparative: "Best X vs Y for Z" style queries (2,500 queries)
Procedural: "How do I [task]?" style queries (2,500 queries)
Evaluative: "Should I use X for Y?" or "Is X worth it?" queries (2,500 queries)

Queries covered 20 topic clusters including software, marketing, healthcare, finance, legal, food and cooking, travel, education, home improvement, and technology. Each query was run on all four LLMs and citations were recorded by domain, position in response, and citation type (explicit link vs. implicit mention).

We analyzed cited domains against: domain authority (DA), TF-IDF scores, content freshness, word count, FAQPage schema presence, HowTo schema presence, Article schema presence, AI bot access status, llms.txt presence, E-E-A-T signals, backlink count, internal link density, readability score, and direct answer density.

Top Predictors of LLM Citation: The Rankings

Here is how the 14 analyzed attributes ranked by correlation with citation rate (Pearson r coefficient):

FAQPage schema presence: r = 0.61 (strongest predictor)
Domain Authority (50+): r = 0.54
Content freshness (updated within 6 months): r = 0.51
AI bot access (all major crawlers allowed): r = 0.49
E-E-A-T signals (author credentials, publication date): r = 0.47
Direct answer density (ratio of declarative statements to hedged language): r = 0.44
HowTo schema on procedural pages: r = 0.41
Backlink count: r = 0.38
llms.txt presence: r = 0.33
Content length (1,500-4,000 words): r = 0.28
Internal link density: r = 0.19
Article schema presence: r = 0.18
Readability score: r = 0.14
TF-IDF/keyword density: r = 0.09
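
For readers less familiar with the metric, here is a minimal sketch of how a Pearson r between a yes/no page attribute and a citation rate can be computed. The numbers below are invented toy data, not figures from the study, and the function name `pearson_r` is our own.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical toy data: 1 = domain has FAQPage schema, 0 = it does not.
has_faq_schema = [1, 1, 0, 0, 1, 0, 1, 0]
# Hypothetical share of test queries in which each domain was cited.
citation_rate = [0.30, 0.22, 0.10, 0.05, 0.18, 0.12, 0.25, 0.08]

r = pearson_r(has_faq_schema, citation_rate)
print(f"r = {r:.2f}")
```

A value near 1 means the attribute and citation rate rise together, a value near 0 means no linear relationship. Correlation alone does not show that adding the attribute causes more citations.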

The rankings contain several surprises worth unpacking.

Surprise 2: Content Freshness Is the Third-Highest Predictor

Content freshness (defined as updated within 6 months, signaled via Article schema's dateModified field) ranked third in our study - above AI bot access, above E-E-A-T signals, above backlinks. This surprised us because AEO literature tends to focus on structure and schema, not temporal signals.

The explanation appears to be query-type dependent. For informational queries about stable concepts, freshness has minimal impact. But for the majority of queries in our test set (which covered active topic clusters like software, AI tools, regulations, and best practices), AI engines strongly favor recently updated sources. Outdated content on fast-moving topics is systematically deprioritized.

The actionable lesson: content freshness is not just an SEO consideration - it is an AEO signal. Keeping key pages updated (and correctly signaling those updates via dateModified in Article schema) improves LLM citation rates on topics where recency matters.
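
As an illustration of the freshness signal, the sketch below builds Article markup with a dateModified field using standard schema.org property names. The helper name `article_jsonld` and every value passed to it are placeholders, and this is one possible way to generate the markup rather than a required method.

```python
import json
from datetime import date

def article_jsonld(headline, author_name, published, modified):
    """Build a schema.org Article object with datePublished and dateModified."""
    if modified < published:
        raise ValueError("dateModified cannot precede datePublished")
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author_name},
        "datePublished": published.isoformat(),
        "dateModified": modified.isoformat(),
    }

article = article_jsonld(
    headline="Example headline",       # placeholder value
    author_name="Example Author",      # placeholder value
    published=date(2025, 3, 1),        # placeholder date
    modified=date(2026, 1, 15),        # placeholder date
)
# The printed JSON goes inside a <script type="application/ld+json"> tag.
print(json.dumps(article, indent=2))
```

The dateModified value should change only when the page content actually changes, so that the signal stays truthful.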

Surprise 3: Keyword Density Is the Weakest Predictor

TF-IDF and keyword density ranked last among our 14 attributes with a correlation coefficient of just 0.09 - effectively noise. This is the clearest data point supporting the fundamental difference between SEO and AEO: optimizing for keyword density has almost no relationship to whether an LLM cites a page. AI engines identify a page's topic through semantic analysis, not keyword frequency.

Content that is written to answer questions clearly and completely - without keyword stuffing or deliberate density optimization - consistently outperforms keyword-optimized content in LLM citation rates. Writing for humans and answering questions directly earns more LLM citations than writing for ranking algorithms.

By LLM: Citation Patterns Are Not Identical

The aggregate predictors above are averages across all four LLMs. Looking at each platform separately reveals important differences:

ChatGPT (GPT-4o)

Domain authority is ChatGPT's strongest predictor (r = 0.61), a correlation exceeded only by Gemini's (r = 0.62). ChatGPT's training data is skewed toward high-authority sources, creating a persistence advantage for established domains. The FAQPage schema effect is present but smaller (r = 0.48) because ChatGPT's base model relies primarily on training data rather than real-time structured data extraction. New domains with excellent schema still benefit, but the authority premium is higher.

Perplexity

Perplexity shows the most balanced predictor profile. Content freshness is its strongest predictor (r = 0.68) - Perplexity explicitly prioritizes recent, live content in its retrieval model. FAQPage schema (r = 0.58) and AI bot access (r = 0.57) are also very strong. Perplexity is the most "responsive" LLM to AEO optimization because it actively retrieves and cites from the live web.

Claude

Claude shows the strongest correlation with E-E-A-T signals (r = 0.59). Anthropic's approach to source quality emphasizes author credentials, organizational authority, and content provenance. Sites with clearly credentialed authors, institutional affiliations, and well-documented methodology get cited more frequently by Claude than by other LLMs. llms.txt has a stronger effect on Claude (r = 0.41) than on any other platform in our study.
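
For readers who have not seen one, llms.txt is a plain Markdown file served at the site root. The sketch below assumes the layout described in the public llms.txt proposal (an H1 title, a blockquote summary, then H2 sections containing link lists). The helper name `build_llms_txt` and all site details are hypothetical.

```python
def build_llms_txt(site_name, summary, sections):
    """Render llms.txt text: H1 title, blockquote summary, H2 link lists."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    # Collapse trailing blank lines into a single final newline.
    return "\n".join(lines).rstrip() + "\n"

# Hypothetical site structure used only for illustration.
llms_txt = build_llms_txt(
    site_name="Example Site",
    summary="Guides and reference pages about example topics.",
    sections={
        "Guides": [
            ("Getting started", "https://example.com/start", "Setup walkthrough"),
        ],
        "Reference": [
            ("Pricing", "https://example.com/pricing", "Plans and prices"),
        ],
    },
)
print(llms_txt)
```

The output would be saved as /llms.txt on the site. Which crawlers read the file, and how they use it, is not something this sketch can confirm.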

Gemini

Gemini shows the strongest Google-aligned patterns: domain authority (r = 0.62) and backlink count (r = 0.52) are its strongest predictors - more similar to traditional SEO than any other LLM. This is consistent with Gemini's architecture, which incorporates Google's knowledge graph and PageRank-derived signals. The path to Gemini citation runs more through traditional SEO authority than through AEO-specific signals - though FAQPage schema (r = 0.44) still matters significantly.

By Query Type: What Gets Cited in Each Context

Informational queries

FAQPage schema dominates (r = 0.71). Direct answer density (how much of the content consists of declarative statements answering specific questions) is the second-highest predictor. Domain authority matters but is third. This is the query type most influenced by AEO optimization.
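
To make direct answer density less abstract, here is a rough heuristic sketch that scores a passage by the share of its sentences that contain no hedging phrase. This is not the scoring method used in the study, which is not published here; the hedge list and the function name `direct_answer_density` are assumptions for illustration.

```python
import re

# Hypothetical hedge markers; a real list would need tuning per topic.
# Caveat: matching is case-insensitive, so "may" also matches the month "May".
HEDGE_PATTERN = re.compile(
    r"\b(it depends|some people|may|might|could|possibly|arguably)\b",
    re.IGNORECASE,
)

def direct_answer_density(text):
    """Share of sentences with no hedge marker, from 0.0 to 1.0."""
    # Naive sentence split on whitespace that follows ., ! or ?
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    if not sentences:
        return 0.0
    direct = sum(1 for s in sentences if not HEDGE_PATTERN.search(s))
    return direct / len(sentences)

direct_text = "FAQPage schema marks up question and answer pairs. It is added as JSON-LD."
hedged_text = "It depends on various factors. Some people believe schema might help."

print(direct_answer_density(direct_text))
print(direct_answer_density(hedged_text))
```

A score like this is best used to compare drafts of the same page, since the absolute number depends heavily on the chosen hedge list.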

Comparative queries

Domain authority is the strongest predictor (r = 0.65). For "X vs Y" queries, AI engines tend to cite sources they already trust at a domain level - established review sites, major publications. However, sites with specific comparison content structured with clear feature tables and FAQPage schema do outperform higher-authority sites with less structured comparisons.

Procedural queries

HowTo schema is the strongest predictor (r = 0.67) for "how do I" queries - matching exactly what you would predict from schema theory. Content freshness is second, reflecting that procedural content for tools and software must be current to be citable. This is the query type where HowTo schema has the clearest and most measurable citation impact.
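
A minimal sketch of HowTo markup for a tutorial page follows, using standard schema.org types (HowTo and HowToStep). The helper name `howto_jsonld` and the tutorial content are placeholders.

```python
import json

def howto_jsonld(name, steps):
    """Build a schema.org HowTo object from (step name, step text) pairs."""
    if not steps:
        raise ValueError("a HowTo needs at least one step")
    return {
        "@context": "https://schema.org",
        "@type": "HowTo",
        "name": name,
        "step": [
            {
                "@type": "HowToStep",
                "position": i,  # 1-based order of the step
                "name": step_name,
                "text": step_text,
            }
            for i, (step_name, step_text) in enumerate(steps, start=1)
        ],
    }

# Placeholder tutorial content for illustration only.
howto = howto_jsonld(
    "How to do an example task",
    [
        ("Open the settings", "Go to the settings page of the example tool."),
        ("Save the change", "Select the example option and click Save."),
    ],
)
# The printed JSON goes inside a <script type="application/ld+json"> tag.
print(json.dumps(howto, indent=2))
```

The step text in the markup should match the visible steps on the page, so the structured data and the content stay in sync when the tutorial is updated.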

Evaluative queries

E-E-A-T signals are the strongest predictor for evaluative queries (r = 0.61). "Is X worth it?" and "Should I use X for Y?" queries are where author credentials, methodology transparency, and organizational authority drive citation selection most strongly. These queries are where non-credentialed content is most systematically disadvantaged.

The Bot Access Effect

One finding that deserves emphasis: among domains that block AI crawlers, citation rates drop to near zero for real-time AI systems (Perplexity, ChatGPT Browse). This is expected - blocked bots cannot read content. But what surprised us was the residual citation rate for base ChatGPT from training data: even domains that block GPTBot get cited occasionally by base ChatGPT if their content was in the training corpus before the block was added.

This creates a false sense of security for publishers who blocked AI crawlers. They may see some ChatGPT citations from historical training data and conclude the block has no impact. The real impact becomes visible when measuring Perplexity and ChatGPT Browse citations - where the block produces near-zero rates. Use AI Rank Lab's citation analytics to measure citation rates across all four major LLMs, not just the one that might be giving you a false positive.
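
One quick way to audit bot access is to test a robots.txt file against the user-agent tokens of the main AI crawlers. The sketch below uses Python's standard `urllib.robotparser`. The crawler token list is an assumption based on commonly documented names and should be verified against each vendor's current documentation; the sample robots.txt is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumed crawler tokens; check each vendor's docs before relying on them.
AI_CRAWLERS = ["GPTBot", "PerplexityBot", "ClaudeBot", "Google-Extended"]

def ai_bot_access(robots_txt, url="https://example.com/"):
    """Return {crawler: True if allowed to fetch url} for robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, url) for bot in AI_CRAWLERS}

# Hypothetical robots.txt that blocks only GPTBot.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

access = ai_bot_access(SAMPLE_ROBOTS)
for bot, allowed in access.items():
    print(f"{bot}: {'allowed' if allowed else 'blocked'}")
```

This checks only robots.txt rules. Blocks applied elsewhere, for example at a firewall or CDN, would not show up here.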

How to Apply This Research

The citation predictor rankings suggest a clear priority order for AEO investment:

FAQPage schema on all informational pages - highest aggregate correlation with citation (r = 0.61) and the dominant predictor for informational queries (r = 0.71)
Build domain authority through quality backlinks - second-highest predictor and especially important for ChatGPT and Gemini
Keep content updated and signal freshness - critical for Perplexity and for any topic where recency matters
Fix AI bot access - necessary condition for real-time citation by Perplexity and ChatGPT Browse
Develop E-E-A-T signals - especially important for Claude and for evaluative query types
Add HowTo schema to tutorials - highest-impact schema type for procedural content
Implement llms.txt - meaningful effect especially for Claude; relatively low implementation cost

Stop optimizing keyword density. It has the weakest correlation with LLM citation of any factor in our study (r = 0.09), roughly one-seventh that of FAQPage schema presence (r = 0.61), so the same effort is better spent on schema implementation.
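
As a starting point for the first item in the priority order, here is a minimal sketch that turns question and answer pairs into FAQPage markup using standard schema.org types (FAQPage, Question, Answer). The helper name `faqpage_jsonld` and the sample pair are placeholders.

```python
import json

def faqpage_jsonld(qa_pairs):
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    if not qa_pairs:
        raise ValueError("an FAQPage needs at least one question")
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in qa_pairs
        ],
    }

# Placeholder question and answer for illustration only.
faq = faqpage_jsonld(
    [("What is an example question?", "This is a direct example answer.")]
)
# The printed JSON goes inside a <script type="application/ld+json"> tag.
print(json.dumps(faq, indent=2))
```

Each marked-up question should also appear, with the same answer, in the visible page content.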

Conclusion

The citation patterns of major LLMs are measurable, consistent, and actionable. FAQPage schema is the single strongest predictor of citation in the aggregate data, although the top predictor varies by platform and query type. Content freshness is more important than most AEO practitioners acknowledge. Keyword density is irrelevant. And the citation patterns differ enough between LLMs that a platform-specific strategy - prioritizing Perplexity (freshness + schema), ChatGPT (authority + schema), Claude (E-E-A-T + llms.txt), and Gemini (traditional SEO + schema) - produces better results than a one-size-fits-all approach.

Use AI Rank Lab's citation analytics to track your citation rates across all four platforms and monitor which predictor signals are strongest for your specific topic cluster. The data in this study represents the aggregate - your specific category and content type may show different factor weightings.

Frequently Asked Questions

What do LLMs like ChatGPT and Perplexity look for when choosing sources to cite?

Based on analysis of 10,000 queries, the top predictors of LLM citation are: FAQPage schema presence (strongest correlation), domain authority, content freshness, AI bot access, E-E-A-T signals, and direct answer density. Keyword density is the weakest predictor - AI engines understand topics semantically, not through keyword frequency.

Does domain authority matter for AI search citations?

Yes, but differently by platform. Domain authority is the strongest predictor for Gemini (closely tied to Google's PageRank signals) and ChatGPT base model (which relies on training data skewed toward high-authority sites). For Perplexity, content freshness is a stronger predictor than domain authority. This means lower-authority sites can outperform established sites on Perplexity through better AEO signals.

Which schema type most improves AI citation rates?

FAQPage schema has the highest aggregate correlation with citation rates (r = 0.61 in our study) and dominates informational queries (r = 0.71). HowTo schema is specifically effective for procedural 'how do I' queries (r = 0.67 for that query type). Article schema with E-E-A-T signals is most important for evaluative queries and specifically for Claude citations.

Do different AI systems cite different types of sources?

Yes. Perplexity favors fresh, recently updated content and is highly responsive to FAQPage schema. ChatGPT favors high-authority domains and its base model relies on training data. Claude emphasizes E-E-A-T signals - author credentials, methodology, organizational authority. Gemini follows Google-aligned patterns, heavily weighting domain authority and backlinks alongside schema signals.

What is direct answer density and why does it matter for AI citations?

Direct answer density is the ratio of declarative statements that directly answer specific questions to hedged or qualified language. High-density content says 'The answer is X because Y' rather than 'It depends on various factors, and some people believe...' AI engines prefer extractable declarative answers for citation. Direct answer density ranks sixth out of the 14 citation predictors in our study (r = 0.44).

Written by Devanshu, AI Search Optimization Expert
