How ChatGPT Actually Chooses Which Brands to Recommend: The Mechanisms
When you ask ChatGPT "what's a good sulfate-free shampoo?" it names 2-3 brands. It's not random. It's pattern-matching on structured content the model can parse with high confidence.
The default story says reviews drive citations. The data tells a different story.
The Three Mechanisms (Reordered)
When ChatGPT decides which brands to surface, three things determine the outcome. The order matters - and it's not the order most AEO advice suggests.
Mechanism 1: Structured-Content Parseability (~50% of citation signal)
AI engines preferentially cite content they can parse with high confidence. FAQ blocks marked with FAQPage schema, comparison tables with named entities, claim-evidence pairs with verifiable sources, product schema (JSON-LD) - these are the formats the model lifts most reliably.
Example: Two brands with similar review depth. Brand A has 5 FAQ blocks with FAQPage schema per product. Brand B has 5 paragraphs of prose covering the same information. Brand A gets cited materially more often for the relevant buyer intents.
The lever: Deploy structured content systematically. This is what RevWay's Storefronts engine produces.
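To make "FAQ blocks marked with FAQPage schema" concrete, here is a minimal sketch in Python that builds FAQPage JSON-LD from question-answer pairs. The helper name `build_faq_jsonld` and the sample questions are hypothetical and only for illustration; this is not RevWay's implementation, just the standard schema.org shape (`FAQPage` > `Question` > `acceptedAnswer`).

```python
import json


def build_faq_jsonld(qa_pairs):
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in qa_pairs
        ],
    }


# Hypothetical product FAQ content, for illustration only.
faq = build_faq_jsonld([
    ("Is this shampoo sulfate-free?", "Yes. The formula contains no sulfates."),
    ("Is it safe for color-treated hair?", "Yes. It is formulated for color-treated hair."),
])

# Serialize for embedding in a <script type="application/ld+json"> tag.
faq_json = json.dumps(faq, indent=2)
print(faq_json)
```

The point of the format is that each question and answer is labeled explicitly, so a parser does not have to infer where one ends and the other begins.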
Mechanism 2: Attribute-Query Match Specificity (~30% of citation signal)
When a buyer asks "sulfate-free shampoo for color-treated hair," the model looks for content that explicitly covers both attributes. Brands whose structured content literally says "sulfate-free" and "color-treated" get matched. Brands whose content says "gentle formula for healthy hair" - same product, different language - don't.
Example: Brand C has a great product, strong reviews, decent press. But product descriptions use marketing language ("gentle," "premium") rather than literal attributes ("sulfate-free," "paraben-free"). Brand C gets cited for vague queries but never for specific attribute queries.
The lever: Restate attributes literally across structured content. RevWay's Storefronts engine generates attribute-explicit content from your product data automatically.
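A rough way to audit your own copy for this gap is a literal-match check. The sketch below is a simplification under stated assumptions: `literal_attribute_coverage` is a hypothetical helper, and plain case-insensitive phrase matching is a stand-in for whatever matching the model actually performs. It only shows which query attributes your content states word for word.

```python
import re


def literal_attribute_coverage(content, query_attributes):
    """Map each attribute to True if it appears literally in the content."""
    text = content.lower()
    return {
        attr: re.search(r"\b" + re.escape(attr.lower()) + r"\b", text) is not None
        for attr in query_attributes
    }


# Attributes taken from the example buyer query in the text.
attributes = ["sulfate-free", "color-treated"]

# Hypothetical product copy: same product, two ways of describing it.
literal_copy = "Sulfate-free shampoo for color-treated hair."
marketing_copy = "Gentle formula for healthy hair."

literal_result = literal_attribute_coverage(literal_copy, attributes)
marketing_result = literal_attribute_coverage(marketing_copy, attributes)

print(literal_result)
print(marketing_result)
```

Under this check the literal copy covers both attributes and the marketing copy covers neither, which mirrors the Brand C situation described above.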
Mechanism 3: Citation Freshness & Refresh Signal (~20% of citation signal)
Brands whose structured content is recently published, recently updated, or recently A/B tested signal "actively maintained." Brands whose content hasn't been touched in 12 months signal "stale" and get discounted.
Pattern we see: Brands that shipped strong structured content once and stopped tend to lose ground as citation patterns rotate. Brands that ship less but refresh monthly and track patterns tend to climb steadily. The model rewards active maintenance over one-time effort.
The lever: Continuous refresh and A/B testing. RevWay tracks citation patterns weekly and refreshes content automatically as patterns shift.
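The 12-month staleness mark can be turned into a simple audit. This is a minimal sketch, assuming you can export a last-modified date per page; the page paths, dates, and the `is_stale` helper are hypothetical, and the 365-day cutoff is only an approximation of the 12 months mentioned above, not a threshold any AI engine publishes.

```python
from datetime import date

STALE_AFTER_DAYS = 365  # approximates the 12-month mark from the text


def is_stale(last_modified, today):
    """Return True if the content is older than the staleness cutoff."""
    return (today - last_modified).days > STALE_AFTER_DAYS


# Hypothetical pages and last-modified dates, for illustration only.
pages = {
    "/products/shampoo": date(2024, 1, 10),
    "/products/conditioner": date(2025, 3, 1),
}
today = date(2025, 6, 1)

stale_pages = [path for path, modified in pages.items() if is_stale(modified, today)]
print(stale_pages)
```

Running a check like this on a schedule is one way to keep the refresh loop from lapsing unnoticed.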
Parseable structure > attribute match > freshness. In that order.
The Citation Signal Stack
Across the three mechanisms, here's our estimate of the relative weight of each specific signal. Structured content sits at the top. Reviews and press are lower than legacy AEO advice suggests.
| Signal | Estimated Weight | What it is |
|---|---|---|
| FAQ blocks with FAQPage schema | ~25% | Q+A pairs the model lifts verbatim |
| Comparison tables with named entities | ~20% | Side-by-side spec / feature, schema-marked |
| Claim-evidence pairs with sources | ~15% | Assertion + verifiable proof, parseable |
| Product schema (JSON-LD) | ~10% | Structural metadata across product pages |
| Reviews mentioning the attribute literally | ~10% | Aggregator content with attribute keywords |
| Expert guide / category round-up mentions | ~8% | Third-party authority signal |
| Refresh / freshness signal | ~7% | Recently updated content carries more weight |
| Press / earned media in category context | ~5% | Authoritative source diversity |
How to read this: The top 4 signals (~70% of the weight) are all things RevWay's Storefronts engine produces directly. The bottom 4 signals (~30%) are mostly third-party authority signals, plus freshness. They contribute at the margin - useful, but not the primary lever.
If you're spending 80% of your AEO effort on review collection and press pitching, you're optimizing for the lower 30% of the signal stack while ignoring the upper 70%.
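As a quick arithmetic check, the 70/30 split follows directly from the estimated weights in the table. The sketch below just sums them; the dictionary and variable names are illustrative, and the numbers are the article's own estimates, not independent measurements.

```python
# Estimated weights (percent) copied from the signal stack table.
signal_weights = {
    "FAQ blocks with FAQPage schema": 25,
    "Comparison tables with named entities": 20,
    "Claim-evidence pairs with sources": 15,
    "Product schema (JSON-LD)": 10,
    "Reviews mentioning the attribute literally": 10,
    "Expert guide / category round-up mentions": 8,
    "Press / earned media in category context": 5,
    "Refresh / freshness signal": 7,
}

# The four structured-content signals at the top of the table.
structured_signals = [
    "FAQ blocks with FAQPage schema",
    "Comparison tables with named entities",
    "Claim-evidence pairs with sources",
    "Product schema (JSON-LD)",
]

structured_share = sum(signal_weights[name] for name in structured_signals)
other_share = sum(signal_weights.values()) - structured_share

print(f"structured content: ~{structured_share}%")
print(f"everything else: ~{other_share}%")
```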
Six Citation Patterns We See
1. The Structural Winner
Brand has clear product-attribute match plus deep structured content (FAQ, tables, claim-evidence, schema) covering the attribute. Cited in 80%+ of relevant queries.
How to replicate: Pick a specific buyer intent. Deploy 5-8 FAQ blocks, one comparison table, claim-evidence pairs - all schema-marked - against that intent.
2. The Sub-Niche Specialist
Brand doesn't have mass appeal but dominates a specific sub-niche. Deep structured content for a narrow attribute combination. Cited consistently in that niche, rarely outside it.
How to replicate: Pick a sub-niche (specific attribute combination, demographic, use-case). Deploy deep structured content there. Accept narrower citation footprint but high consistency.
3. The Underdog with Structure
Smaller brand cited more than market share predicts. Usually: less brand-name recognition but better structured content coverage than the leader.
How to replicate: Brand size doesn't determine citation rate; structured-content coverage does. A focused emerging brand can out-cite a sleepy incumbent.
4. The Declining Leader
Brand was once dominant but is being out-cited. Usually: deployed structured content once, stopped refreshing. Position decayed as competitors caught up and AI behavior shifted.
How to replicate (in reverse): Don't let the refresh loop lapse. Continuous tracking and content updates prevent decline.
5. The Brand-Authority Halo
Brand gets cited for broad categories adjacent to its core products, because overall brand visibility carries over. Halo effect on attribute queries the brand doesn't fully own.
How to replicate: Strong brand presence (press, expert mentions, thought leadership) does help at the broader citation level. Treat this as the secondary layer underneath structured content.
6. The Ghost Brand
Brand has good products but virtually no structured content for its target buyer intents. Either very new, or hasn't deployed yet. Citation rate near zero.
How to replicate (in reverse): Deploy structured content for one buyer intent. First citation typically lands within 4-8 weeks.
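Figures like "cited in 80%+ of relevant queries" imply a citation rate: the share of sampled AI answers for a buyer intent that mention the brand. Here is a minimal sketch of that calculation, assuming you have already collected answer texts for a set of relevant queries. The `citation_rate` helper, the brand names, and the sample answers are all hypothetical, and simple substring matching is a stand-in for whatever tracking method you actually use.

```python
def citation_rate(responses, brand):
    """Fraction of responses that mention the brand (case-insensitive)."""
    if not responses:
        return 0.0
    hits = sum(1 for text in responses if brand.lower() in text.lower())
    return hits / len(responses)


# Hypothetical AI answers collected for one buyer intent.
responses = [
    "Try Acme or Bolt.",
    "Bolt is a good option.",
    "Acme is sulfate-free.",
    "Consider Acme.",
    "Acme and Cedar both work.",
]

acme_rate = citation_rate(responses, "Acme")
print(f"Acme citation rate: {acme_rate:.0%}")
```

Tracking this number per buyer intent over time is what separates a Structural Winner from a Declining Leader in the patterns above.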
Can You Game This?
Short answer: not sustainably.
Synthetic FAQ blocks (auto-generated nonsense Q+A pairs), fake schema markup, content that doesn't map to real product attributes - all of it gets noise-filtered over time. AI engines are getting better at detecting synthetic content patterns.
What compounds: structured content tied to real product attributes, real customer use-cases, real verifiable claims. Continuously refreshed. A/B tested for what actually wins in your category. That's the moat.
Frequently Asked Questions
Why does FAQPage-marked content get cited more than equivalent prose?
Because the parser doesn't have to guess what's a question and what's an answer - the schema tells it explicitly. AI engines preferentially cite content they can lift with high confidence, and FAQ blocks with FAQPage JSON-LD are the lowest-ambiguity format. We see materially higher citation rates for FAQ-marked content vs unstructured prose covering the same information.
If I have 500 reviews mentioning my attribute, why isn't that enough?
Reviews are part of the signal, but they're unstructured prose competing for citations with structured formats. Five FAQ blocks marked with FAQPage schema typically outperform 500 reviews on citation rate for the same buyer intent, because the model parses structured content more reliably.
Does Google ranking still correlate with ChatGPT citations?
Weakly and indirectly. The signals that drove Google ranking (domain authority, backlinks) correlate with broader brand visibility, which correlates with appearing in training data. But ChatGPT specifically cites brands whose content the model can parse - and that's a different lever than domain authority. Brands ranking #1 on Google are sometimes invisible on ChatGPT. Brands ranking #20 are sometimes cited heavily.
Can I game this with synthetic FAQ blocks or fake schema?
Not sustainably. AI engines detect synthetic patterns over time and discount them. The structured content that compounds is content tied to real product attributes, real customer use-cases, real verifiable claims. Synthetic blocks get noise-filtered within 60-90 days.
Will the mechanisms shift as AI models update?
The relative weights shift, but the underlying logic doesn't. Structured content > prose. Explicit attributes > implicit. Verifiable claims > unsourced assertions. Recent content > stale content. The specific percentages move with each model update, which is why RevWay tracks citation patterns continuously and refreshes content as patterns rotate.
The Move
The top 70% of the citation signal stack is structured content. Reviews and press are the lower 30%.
Spend your AEO effort proportionally. The Storefronts engine produces the structured content; the subscription runs the tracking and refresh loop.
See where your brand stands on AI
Your AI Citations Score - live dashboard in 30 minutes. Where ChatGPT cites you, where competitors win, where the opportunities are hiding.