Source conflict in AI Overviews: which page Google trusts when two cited sources state different facts in 2026

**TL;DR** — Across 24 client sites in late April and the first three weeks of May 2026 we audited a question that almost no editorial checklist accounts for: when two or more sources cited in the same AI Overview answer card state different facts on the same point — a different number, a different date, a different definition — which source's value does the composer render, and what decides the winner? Across 5,640 multi-chip cards we flagged 17% as containing a detectable inter-source factual conflict on at least one claim. When a conflict existed, the card rendered a single source's value 73% of the time (it picked a winner) and hedged or presented both values only 27% of the time. The strongest predictor of being the rendered value was corroboration — the value that agreed with the largest number of independent sources (cited and uncited) won 68% of resolved conflicts, regardless of which source held the citation slot. The second predictor was claim specificity — a precise figure carrying a unit and a date won over a rounded or undated figure at 2.4× even when both came from comparably authoritative pages. The third was source-type fit — for a regulatory or definitional claim the official/.gov/.edu source's value won at 3.1×, while for a pricing or product-spec claim the first-party vendor page's value won at 2.8×. Two changes — adding an explicit corroboration anchor (a dated, unit-bearing figure with a named source) to answer-bearing claims, and aligning claim type to the page's source authority — lifted conflict-win rate by 39% on the affected sites over a 30-day follow-up window.

Why we ran this audit

Most of our editorial measurement has treated citation as a binary: were you cited or not. But a growing share of client escalations in 2026 were not about being absent from the card — they were about being present on the card with the wrong number. A client would be cited in the answer, and the answer would render a competitor's figure for the very fact the client's page was supposed to own, leaving the client looking like the source of a number it never published. The question underneath those escalations was one we had no data on: when the composer pulls several sources into one card and those sources disagree, what makes it render one value rather than another, and is that decision something a page can win on the merits or only by outranking everyone else?

The second motivation was defensive. If the composer resolves conflicts by corroboration — by picking the value the most sources agree on — then a page carrying an accurate but unusual figure (a genuinely new statistic, a price that just changed, a definition the field has not caught up to) is structurally disadvantaged: it is right and it is alone, and alone loses to a chorus of slightly-stale agreement. For clients whose whole value proposition is original data, that failure mode is existential, and we needed to know how often it happens and whether anything on the page can offset it.

How we ran the measurement

We audited 24 client sites (9 SaaS, 7 publisher, 5 DTC, 3 B2B services), each with a fixed 200-query basket weighted toward queries that turn on a specific fact (a number, a date, a rate, a definition) rather than on an opinion or a procedure, because those are the queries where inter-source conflict is detectable at all. We captured every multi-chip AI Overview answer card on each query, twice daily, across late April and the first three weeks of May 2026. For each card we extracted the central factual claim the answer asserted, then read each cited source page and recorded the value that page stated for that claim. A card was flagged as a conflict when two or more cited sources stated mutually inconsistent values for the same claim. For each conflict we recorded which cited value the answer card rendered, and we logged, for every value in the conflict, its corroboration count (how many of the cited and top-20-organic sources stated the same value), its specificity (precise-with-unit-and-date, precise-without-date, or rounded/vague), and the source type of the page stating it. The full multi-chip cohort was 5,640 cards; the conflict subset was 959.
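The flagging rule and the corroboration count described above can be sketched in a few lines. This is a minimal illustration, not the tooling used in the audit: the `SourceValue` record, its field names, and the assumption that values have already been normalised to comparable strings are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceValue:
    """One page's stated value for the card's central claim (hypothetical record)."""
    url: str      # page that states the value
    value: str    # value as stated, assumed already normalised for comparison
    cited: bool   # True if the page is a chip on the card, False if top-20 organic only


def is_conflict(sources):
    """A card is a conflict when two or more cited sources state different values."""
    cited_values = {s.value for s in sources if s.cited}
    return len(cited_values) >= 2


def corroboration_counts(sources):
    """Count how many sources (cited plus top-20 organic) state each value."""
    return Counter(s.value for s in sources)


card = [
    SourceValue("https://a.example/report", "3.2%", cited=True),
    SourceValue("https://b.example/blog", "3%", cited=True),
    SourceValue("https://c.example/news", "3%", cited=False),
]
print(is_conflict(card))            # the two cited pages disagree
print(corroboration_counts(card))   # "3%" has two sources, "3.2%" has one
```

Note that only cited sources decide whether a card is a conflict, while uncited organic sources still contribute to each value's corroboration count, which matches the two definitions given in the paragraph above.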

Two normalisation moves matter for the numbers below. We excluded conflicts that were actually staleness artefacts — cases where two sources stated different values because one had been updated and the other had not, and the values were the same metric at two points in time rather than a genuine disagreement; those were 19% of flagged conflicts and behave like the edit-propagation problem rather than like a resolution problem. We also excluded conflicts on YMYL queries (medical, financial, legal), where the composer hedges far more aggressively — it rendered both values or refused to pick on 61% of YMYL conflicts versus 27% across the general cohort — so averaging the two populations would have understated how decisively the composer resolves ordinary factual conflicts.

The shape of the conflict pattern

The flat headline first. Across 5,640 multi-chip cards, 17% contained a detectable inter-source conflict on the central claim. That is higher than our priors — roughly one general-fact card in six is built from sources that do not agree, and the reader sees a single confident answer with no indication that the cited pages disagreed underneath it. When a conflict existed, the composer rendered one source's value 73% of the time and surfaced both values or hedged 27% of the time. The hedge rate climbed with the stakes of the claim: low-stakes facts (a founding year, a typical range) were resolved to a single value 81% of the time, while higher-stakes general facts (a market-size figure, a performance benchmark) were hedged 38% of the time even outside YMYL.
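For readers who want the headline percentages as approximate card counts, the arithmetic is below. It assumes the 73%/27% split applies to the whole 959-card conflict subset; the write-up does not say whether the staleness and YMYL exclusions were applied before or after that count, so the derived figures are rough approximations rather than reported results.

```python
TOTAL_CARDS = 5640      # multi-chip cards in the cohort (from the audit)
CONFLICT_SHARE = 0.17   # share of cards with a detectable conflict (from the audit)
RESOLVED_SHARE = 0.73   # share of conflicts rendered as a single value (from the audit)

conflict_cards = round(TOTAL_CARDS * CONFLICT_SHARE)      # matches the reported 959
resolved_cards = round(conflict_cards * RESOLVED_SHARE)   # derived, approximate
hedged_cards = conflict_cards - resolved_cards            # derived, approximate
cards_per_conflict = 1 / CONFLICT_SHARE                   # "roughly one card in six"

print(conflict_cards, resolved_cards, hedged_cards, round(cards_per_conflict, 1))
```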

The single most important finding was that the rendered value was frequently not the value held by the highest-ranked or first-cited source. In 41% of resolved conflicts the rendered value came from a source that was not in the first citation slot, and in 23% it came from a source cited below a higher-ranked source that stated a different value. Citation order and the value the composer trusts are two different rankings — a page can win the chip-1 slot for relevance and still lose the number to a lower-slotted source the composer found more corroborated or more specific. That decoupling is the whole reason a page can be cited and still be rendered with someone else's figure.

Driver one: corroboration decides most resolved conflicts

The dominant predictor of which value the card rendered was corroboration: the number of independent sources (cited plus top-20 organic) that stated the same value. The value with the highest corroboration count won 68% of resolved conflicts, and the effect was close to monotonic. A value corroborated by five or more sources won 4.6× as often as a competing singleton value, and a value with three or four corroborating sources won 2.9× as often; head-to-head between two singletons, the other drivers took over. The composer behaves, on factual conflicts, like a majority-vote aggregator with a strong prior that the consensus value is correct. That is defensible most of the time, and it is exactly the wrong behaviour when the consensus is stale and one lonely source has the corrected number.
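The majority-vote behaviour can be illustrated with a toy resolver. This is a model of the pattern the audit observed, not the composer's actual algorithm, which is not public; the function name and the choice to return `None` on a tie (to signal that specificity and source-type fit would have to decide) are assumptions made for the sketch.

```python
from collections import Counter


def resolve_by_corroboration(stated_values):
    """Pick the value stated by the most sources; return None on a tie.

    stated_values holds one entry per source (cited or top-20 organic).
    A None result means corroboration alone does not separate the candidates.
    """
    ranked = Counter(stated_values).most_common()
    if not ranked:
        return None
    if len(ranked) == 1:
        return ranked[0][0]
    (top_value, top_count), (_, runner_up_count) = ranked[0], ranked[1]
    if top_count == runner_up_count:
        return None  # e.g. two singletons: the other drivers take over
    return top_value


# A stale consensus outvotes a lonely corrected figure.
print(resolve_by_corroboration(["40%", "40%", "40%", "42.7%"]))
# Two singletons: corroboration cannot decide.
print(resolve_by_corroboration(["40%", "42.7%"]))
```

The first call shows the failure mode discussed above: the lone corrected figure loses on count alone, regardless of whether it is right.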

We ran a structural test on 16 pages across 8 clients, each carrying an accurate figure that was losing conflicts to a more-corroborated competing value. We could not manufacture corroboration we did not have, so we did the honest version: on each page we made the figure traceable — we attached the figure to a named, dated primary source (a regulator filing, a named dataset, the client's own methodology page) directly adjacent to the figure, and we cited that primary source in plain visible text rather than burying it in a footnote. Over the 45 days that followed, 9 of the 16 pages began winning the conflict on at least one target query — not because corroboration count rose, but because the composer appears to weight a singleton value attached to a verifiable named primary source more heavily than a bare singleton, enough to overcome a modest corroboration deficit. The lever for the lonely-but-right page is verifiable provenance, not invented agreement.

Driver two: claim specificity beats rounded agreement

When corroboration counts were close, the deciding factor was specificity — whether the value carried a unit and a date and a method, or was a rounded, undated, context-free number. A precise figure ("3.2% as of Q1 2026, measured across 12 markets") beat a rounded figure ("about 3%") for the rendered slot at 2.4× even when the rounded figure was held by a comparably authoritative page, and at 1.7× even when the rounded figure was slightly more corroborated. The composer appears to read specificity as a freshness-and-rigour signal and to prefer the value that looks measured over the value that looks estimated. A precise number also has the side benefit of being harder for a competing page to match by coincidence, so it tends to be a singleton that nonetheless wins.
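The three specificity tiers used in the audit (precise-with-unit-and-date, precise-without-date, rounded/vague) can be approximated with a rough lint over the answer-bearing sentence. The regular expressions below are illustrative assumptions covering only percentages, a few currency symbols, and year or quarter-year dates; a real content check would need a much broader unit and date vocabulary.

```python
import re

# Hedging words that mark a figure as estimated (illustrative list, an assumption).
HEDGE = re.compile(r"\b(?:about|around|roughly|approximately|nearly|almost)\b", re.IGNORECASE)
# A number carrying a unit: a percentage, or an amount led by a currency symbol.
NUMBER_WITH_UNIT = re.compile(r"\d+(?:\.\d+)?\s?%|[$€£]\s?\d")
# An as-of date: a four-digit year, optionally preceded by a quarter such as "Q1".
DATE = re.compile(r"\b(?:Q[1-4]\s+)?(?:19|20)\d{2}\b")


def specificity(sentence):
    """Classify a sentence into one of the audit's three specificity tiers."""
    if HEDGE.search(sentence) or not NUMBER_WITH_UNIT.search(sentence):
        return "rounded/vague"
    if DATE.search(sentence):
        return "precise-with-unit-and-date"
    return "precise-without-date"


print(specificity("3.2% as of Q1 2026, measured across 12 markets"))
print(specificity("3.2% across 12 markets"))
print(specificity("about 3%"))
```

Run against the two example figures from the paragraph above, the first lands in the top tier and "about 3%" lands in the bottom one, which is the distinction the audit found to matter.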

We ran a structural test on 14 pages across 7 clients. Each page stated its central figure as a bare rounded number in prose. We rewrote each to carry the precise value plus its unit, its as-of date, and a four-to-eight-word method clause, leaving the surrounding sentence otherwise intact. Over the 60 days after the change, the rewritten figure won the rendered slot on its target query for 10 of the 14 pages where it had previously been losing to a competing value, and on 3 of those pages the answer card began rendering the date alongside the figure: the client's precision had become the card's precision. The four pages that did not move were in conflicts dominated by a five-plus corroboration competitor, where specificity was not enough to overcome the corroboration gap, consistent with driver one being the stronger lever.

Driver three: source-type fit to the claim type

The third driver was whether the page's source type matched the kind of claim in dispute. For a regulatory, legal, or definitional claim, the value stated by an official, .gov, .edu, or standards-body page won at 3.1× the rate of the same value on a commercial editorial page. For a pricing, availability, or product-specification claim, the value stated by the first-party vendor page won at 2.8× the rate of the same value on a third-party review or aggregator. For a market-size, adoption, or trend claim, the value stated by a named research or data source won at 2.3× the rate of an editorial restatement. The composer appears to hold an implicit map of which source type is authoritative for which claim type, and to break conflict ties in favour of the source whose type fits the claim — so the same page can win conflicts on the claims it is the natural authority for and lose them on claims outside its lane.

The editorial implication is about lane discipline rather than about more content. We tested it on 11 pages across 6 clients by moving each disputed claim onto the page whose source type fit it — pricing claims consolidated onto the first-party product page rather than scattered across blog posts, regulatory claims moved onto a documentation or compliance page with the appropriate institutional framing, original-data claims kept on the methodology page that owned the dataset. We changed the claims' wording very little; we changed which page carried them. Over the following 45 days, 7 of the 11 consolidated claims began winning conflicts they had previously lost, because the claim was now being asserted by the source type the composer treats as authoritative for it. The lesson is that where a claim lives is part of how credible it reads to the composer.
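A claim-lane check of this kind can be expressed as a lookup from claim type to the source type that the audit found winning for it. The sketch below is hypothetical: the label strings, the `CLAIM_LANES` table, and the shape of the content map are invented for illustration, and the mapping encodes only the three groupings reported under driver three.

```python
# Claim type -> source type the audit found winning conflicts for that claim type.
# Labels are illustrative; the groupings follow the three reported under driver three.
CLAIM_LANES = {
    "regulatory": "official",        # official, .gov, .edu, standards body
    "legal": "official",
    "definitional": "official",
    "pricing": "first_party",        # first-party vendor page
    "availability": "first_party",
    "product_spec": "first_party",
    "market_size": "research",       # named research or data source
    "adoption": "research",
    "trend": "research",
}


def misplaced_claims(content_map):
    """Return (claim_id, current_page_type, expected_page_type) for out-of-lane claims.

    content_map is an iterable of (claim_id, claim_type, page_source_type) tuples.
    Claim types missing from CLAIM_LANES are skipped rather than guessed at.
    """
    misplaced = []
    for claim_id, claim_type, page_type in content_map:
        lane = CLAIM_LANES.get(claim_type)
        if lane is not None and lane != page_type:
            misplaced.append((claim_id, page_type, lane))
    return misplaced


content_map = [
    ("plan-price", "pricing", "editorial"),      # price restated in a blog post
    ("plan-price-home", "pricing", "first_party"),
    ("retention-rule", "regulatory", "editorial"),
]
for row in misplaced_claims(content_map):
    print(row)
```

The output lists the claims that would be candidates for consolidation onto a page of the expected type; it says nothing about wording, in line with the test above changing where claims lived rather than how they were phrased.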

What changed in our content checklist

Three changes. We added a "corroboration anchor" requirement to every answer-bearing factual claim: the figure must appear with its unit, its as-of date, and a named, visible primary source directly adjacent to it, so that a singleton-but-correct value carries the provenance the composer needs to trust it over a more-corroborated stale value. We added a "claim-lane" check to our content map: each disputed or high-value claim is assigned to the page whose source type the composer treats as authoritative for that claim type — pricing to first-party, regulatory to documentation, original data to methodology — rather than being restated loosely across editorial pages where it competes against its own authoritative home. And we changed our reporting: client reports now flag every query where the client is cited but the rendered value is a competitor's, because being cited with the wrong number is a distinct failure mode from not being cited, and it was previously invisible in a citation-count report.
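The reporting change, flagging queries where the client is cited but the card renders someone else's value, reduces to a simple filter over capture records. The record layout below (`query`, `rendered_value`, and a `sources` list with `domain`, `value`, and `cited` keys) is an assumed schema for illustration, not the format of any particular capture tool.

```python
def cited_with_wrong_number(records, client_domain):
    """Return queries where the client is cited but the rendered value is not the client's.

    Each record is a dict with keys "query", "rendered_value" (None when the card
    hedged or rendered no single value), and "sources" (dicts with "domain",
    "value", "cited"). This schema is hypothetical.
    """
    flagged = []
    for record in records:
        client_values = {
            s["value"]
            for s in record["sources"]
            if s["cited"] and s["domain"] == client_domain
        }
        rendered = record["rendered_value"]
        if client_values and rendered is not None and rendered not in client_values:
            flagged.append(record["query"])
    return flagged


records = [
    {
        "query": "average churn rate saas",
        "rendered_value": "5%",
        "sources": [
            {"domain": "client.example", "value": "4.1%", "cited": True},
            {"domain": "rival.example", "value": "5%", "cited": True},
        ],
    },
    {
        "query": "saas churn benchmark",
        "rendered_value": "4.1%",
        "sources": [
            {"domain": "client.example", "value": "4.1%", "cited": True},
        ],
    },
]
print(cited_with_wrong_number(records, "client.example"))
```

A plain citation-count report would score both queries as wins for the client; this filter separates the first, where the client holds a chip but a competitor's figure is rendered.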

We dropped one habit. Through 2025 we had been rounding figures in body copy for readability — "around 40%," "roughly a third" — on the theory that round numbers read more naturally and that the precise figure could live in a chart or a footnote. The audit shows the rounded number in the prose is the one that enters the conflict, and it loses to any competitor carrying the precise value, so the precise, dated, unit-bearing figure now goes in the answer-bearing sentence itself and the rounding, if any, goes in the supporting prose around it. Readability is preserved by the sentence structure, not by blurring the number the composer is going to extract.

01. Attach provenance to every answer-bearing figure. A singleton-but-correct value with a named, dated, visible primary source won 9 of 16 audited conflicts it had been losing; provenance, not invented agreement, is the lever for the lonely-but-right page.
02. Make the number precise, dated, and unit-bearing. Precise figures beat rounded ones for the rendered slot at 2.4×; 10 of 14 audited pages won the value back after the precise version moved into the answer-bearing sentence.
03. Put each claim on the page whose source type fits it. Source-type fit broke conflict ties at 2.3–3.1× by claim type; consolidating claims onto their authoritative page won back 7 of 11 previously-lost conflicts.
04. Report "cited with the wrong number" as its own failure mode. In 41% of resolved conflicts the rendered value came from outside the first citation slot; citation order and trusted value are two different rankings, and a citation-count report hides the gap.

Where this argument breaks

For YMYL queries (medical, financial, legal) the composer hedges far more than it resolves: it rendered both values or declined to pick on 61% of YMYL conflicts versus 27% across the general cohort. The conflict-win levers therefore move the outcome much less there, and the honest editorial goal on YMYL is to be one of the corroborating sources rather than to win a head-to-head. For genuinely contested facts where the field has no consensus, corroboration is not available to anyone and the composer hedges by design; no on-page work makes a contested claim render as settled. For staleness-driven apparent conflicts the right fix is the edit-propagation lever (get the index to recrawl the updated value), not the corroboration or specificity lever, because the underlying values are the same metric at two times rather than a real disagreement. For Chinese-language AI search (文心一言, 元宝, 通义), the resolution behaviour is more corroboration-dominant and less specificity-sensitive. Our parallel Chinese-language audit showed the consensus value winning roughly 76% of resolved conflicts with a weaker specificity effect, so the provenance lever applies but the precise-figure lever is smaller. Our longest follow-up window was 60 days and the cohort was 24 sites; the per-vertical numbers are point estimates. Outside those carve-outs the lesson holds: in 2026 roughly one general-fact card in six is built from sources that disagree, the composer usually picks a single winner by corroboration and specificity rather than by citation order, and a page can be cited and still be rendered with someone else's number unless it carries the provenance, precision, and source-type fit to win the value.
