**TL;DR** — Across 28 client sites in late April and the first three weeks of May 2026 we audited a question brand teams keep raising: when Google's AI Overview composer extracts our page, does it quote our sentence verbatim or does it paraphrase us? Across 7,420 captured citations the headline was sharper than we expected. 31% of citations were verbatim extractions of an exact on-page sentence; 47% were paraphrases that preserved the proposition but changed the wording; 22% were composite answers stitched from two or more on-page sentences. The split was not random. The strongest predictor of verbatim extraction was sentence atomicity — single-proposition sentences under 28 words were verbatim-extracted at 3.2× the rate of multi-proposition sentences carrying the same content. The strongest predictor of paraphrase was surrounding-context length — answer-bearing sentences embedded in long parent paragraphs (350+ words) were paraphrased at 2.7× the rate of identical sentences sitting in short parent paragraphs (under 120 words). Two structural changes — rewriting answer-bearing sentences into single-proposition atomic form and shortening the parent paragraph they sit in — lifted measured verbatim-extraction rate by 41% on the affected sites over a 30-day follow-up window.
Why we ran this audit
For most of the past quarter, brand teams have been raising the same complaint about AI Overview answer cards: the chip cites our page, but the words on the answer card are not the words on our page. Sometimes the composer changed a number; sometimes it dropped a qualifier; sometimes it rewrote our two-sentence definition into a one-sentence answer that lost the nuance we had been careful to include. The brand-side question was whether this was a brand-safety risk — could the composer paraphrase us into a statement we did not actually make — and the editorial-side question was whether the paraphrase rate was something we could affect by changing how we wrote, or whether it was a property of the composer that no on-page edit could move.
The second motivation was about voice. A well-written editorial page in 2026 has a consistent voice — sentence rhythm, qualifier discipline, brand-specific phrasing — and if the composer is paraphrasing 47% of the time, then 47% of the visible language on the answer card is not the brand's language. For brand-safety-sensitive clients (regulated industries, B2B services with strong house style) this matters; for clients with looser brand voice it matters less, but the editorial question of "what does our page actually look like on the answer card" was unanswered, and we needed audit data to answer it.
How we ran the measurement
The cohort was 28 client sites (11 SaaS, 7 publisher, 6 DTC, 4 B2B services), each with a fixed 200-query basket. We captured every AI Overview citation on each query, twice daily, across late April and the first three weeks of May 2026. For each citation we did two things. First, we located the answer-bearing sentence on the cited page: the sentence the composer's output most closely tracked. Second, we classified the relationship between the on-page sentence and the answer-card sentence using a three-bucket scheme: verbatim (string identity at the sentence level, allowing only minor punctuation and tense normalisation), paraphrase (same proposition, different surface wording, measured as sentence-embedding cosine similarity above 0.78 combined with a normalised Levenshtein distance above 30%), or composite (the answer-card sentence merged content from two or more distinct on-page sentences, identifiable because the composer's output contained noun phrases or numerical content from non-adjacent on-page sentences). The full citation cohort came to 7,420 events.
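The three-bucket scheme above can be sketched in a few lines of Python. This is a minimal illustration, not the audit's pipeline. It assumes the cosine similarity is computed elsewhere and passed in, that the Levenshtein threshold is applied to distance divided by the longer string's length, and that composite detection arrives as a count of matched source sentences. The names `Mode`, `normalise`, and `classify` are hypothetical, and tense normalisation is not handled.

```python
from enum import Enum


class Mode(Enum):
    VERBATIM = "verbatim"
    PARAPHRASE = "paraphrase"
    COMPOSITE = "composite"
    UNCLASSIFIED = "unclassified"


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def normalise(sentence: str) -> str:
    """Assumed normalisation: lowercase, drop punctuation, collapse whitespace."""
    kept = "".join(c for c in sentence.lower() if c.isalnum() or c.isspace())
    return " ".join(kept.split())


def classify(source: str, card: str, cosine: float,
             n_source_sentences: int = 1) -> Mode:
    """Bucket one citation. `cosine` is a precomputed embedding similarity."""
    if n_source_sentences >= 2:
        return Mode.COMPOSITE
    src, out = normalise(source), normalise(card)
    if src == out:
        return Mode.VERBATIM
    # Assumption: "above 30%" means distance relative to the longer string.
    distance_ratio = levenshtein(src, out) / max(len(src), len(out), 1)
    if cosine > 0.78 and distance_ratio > 0.30:
        return Mode.PARAPHRASE
    return Mode.UNCLASSIFIED
```

The `UNCLASSIFIED` bucket is an addition for pairs that meet neither threshold (for example, a near-verbatim sentence with one word changed); how the audit handled those cases is not stated.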
Three normalisation moves matter for reading the numbers below. We excluded citations where the cited page had been updated within 48 hours of the citation capture, because index-state drift on those pages made the source-sentence mapping unreliable. We excluded citations on YMYL queries (medical, financial, legal) where the composer applies a stricter paraphrase policy than on general queries; the YMYL paraphrase rate was 71% versus the cohort 47%, and the two populations should not be averaged. And we excluded citations where the answer-bearing sentence on the source page was itself a direct quotation from a third party — those were almost always preserved verbatim, but the editorial signal is about quoting rather than about being quoted, and the latter was the question we wanted to answer.
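As a sketch, the three exclusions amount to a filter over citation records. The record shape below is hypothetical (the field names are illustrative), and "within 48 hours" is assumed to mean an absolute gap between page update and capture.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

FRESHNESS_WINDOW = timedelta(hours=48)


@dataclass
class Citation:
    # Hypothetical record shape; field names are illustrative only.
    captured_at: datetime
    page_updated_at: datetime
    is_ymyl: bool
    source_is_third_party_quote: bool


def exclusion_reason(c: Citation) -> Optional[str]:
    """Return why a citation is dropped, or None if it stays in the cohort."""
    if abs(c.captured_at - c.page_updated_at) <= FRESHNESS_WINDOW:
        return "page updated within 48 hours of capture"
    if c.is_ymyl:
        return "YMYL query"
    if c.source_is_third_party_quote:
        return "source sentence is a third-party quotation"
    return None
```

Returning a reason rather than a boolean makes it easy to report how many citations each rule removed.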
The shape of the paraphrase pattern
The flat headline first. Across 7,420 citations on general (non-YMYL) queries, the extraction-mode distribution was: verbatim 31%, paraphrase 47%, composite 22%. The mix was consistent across all four verticals — SaaS sites saw 30/48/22, publisher sites saw 33/45/22, DTC sites saw 29/49/22, B2B services saw 31/47/22 — and the four-percentage-point spread across verticals was below the per-cell noise floor. The mix was not consistent across query types: definitional queries skewed toward verbatim extraction (44% verbatim), how-to queries skewed toward paraphrase (54% paraphrase), and "compared with" queries skewed toward composite extraction (38% composite). The implication is that the same page, indexed and unchanged, was extracted in different modes depending on the query that surfaced it — extraction mode is partly a function of the query rather than purely a function of the page.
Inside the verbatim population, sentence length was the dominant secondary factor. The mean length of verbatim-extracted sentences was 21 words; the mean length of paraphrased source sentences was 38 words; the mean length of composite source sentences was 32 words. The relationship was monotonic in the 15-to-50-word range: shorter sentences were verbatim-extracted at higher rates, and the rate decay was steep enough that a 30-word sentence and a 35-word sentence had materially different extraction profiles even when the content was equivalent. The 28-word knee was sharp — below it, verbatim rate flattened around 52%; above it, verbatim rate dropped roughly linearly to 14% at 50 words.
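To make the knee concrete, the two figures quoted above (about 52% at or below 28 words, about 14% at 50 words) can be turned into a rough piecewise-linear approximation. This is an interpolation between the reported endpoints, not a fitted model, and the function name is hypothetical.

```python
KNEE_WORDS = 28       # knee reported in the audit
PLATEAU_RATE = 0.52   # approximate verbatim rate at or below the knee
FLOOR_WORDS = 50      # upper end of the observed range
FLOOR_RATE = 0.14     # approximate verbatim rate at 50 words


def approx_verbatim_rate(word_count: int) -> float:
    """Piecewise-linear reading of the reported curve, valid for 15 to 50 words."""
    if not 15 <= word_count <= FLOOR_WORDS:
        raise ValueError("the audit only describes the 15-to-50-word range")
    if word_count <= KNEE_WORDS:
        return PLATEAU_RATE
    slope = (FLOOR_RATE - PLATEAU_RATE) / (FLOOR_WORDS - KNEE_WORDS)
    return PLATEAU_RATE + slope * (word_count - KNEE_WORDS)


if __name__ == "__main__":
    for words in (20, 28, 30, 35, 50):
        print(words, round(approx_verbatim_rate(words), 3))
```

Under this interpolation a 30-word sentence lands near 49% and a 35-word sentence near 40%, which is the kind of gap the paragraph above describes. These are arithmetic consequences of the two endpoints, not separately measured values.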
Driver one: sentence atomicity drives verbatim extraction
The single strongest predictor of verbatim extraction was sentence atomicity — whether the sentence asserted a single proposition or whether it conjoined multiple propositions with "and," "but," or relative-clause embedding. We labelled each candidate sentence by proposition count: a single-proposition sentence ("X is the official Y for Z") versus a multi-proposition sentence ("X is the official Y for Z and is currently being phased out in favour of Q, although some implementations still rely on it"). Single-proposition sentences under 28 words took verbatim extraction at 3.2× the rate of multi-proposition sentences of the same content even after controlling for sentence length. The composer appears to be biased toward sentences it can lift without surgical editing — a multi-proposition sentence requires either lifting the whole thing (which often contains content irrelevant to the query) or surgically extracting the relevant proposition (which moves the extraction into the paraphrase or composite bucket).
We ran a structural test on 22 pages across 9 clients. Each page had an answer-bearing paragraph that contained a multi-proposition sentence carrying the page's primary answer. We rewrote each such sentence into two adjacent atomic sentences — the primary proposition in a single sentence, the secondary qualifier in a second sentence — and otherwise left the page unchanged. Over the 60 days after the rewrite, 15 of the 22 pages saw verbatim-extraction rate rise from below 20% to above 50%; 5 pages saw modest improvement; 2 pages saw no change. The editorial cost was minimal — a 5-to-10-minute rewrite per page — and the brand-voice payoff was visible: on the affected pages, the visible language on the answer card was now the brand's language rather than the composer's.
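A first-pass lint for the atomicity criterion might look like the sketch below. It is a crude heuristic, not the labelling method used in the audit: it assumes that a conjunction, a relative pronoun, or a semicolon signals a second proposition, and the `JOINERS` list is an illustrative guess.

```python
import re

MAX_WORDS = 28  # threshold from the audit

# Assumed markers of a conjoined or embedded second proposition.
JOINERS = re.compile(
    r"\b(and|but|although|whereas|while|which|who|whose)\b|;",
    re.IGNORECASE,
)


def word_count(sentence: str) -> int:
    """Count whitespace-separated tokens."""
    return len(sentence.split())


def looks_atomic(sentence: str) -> bool:
    """True if the sentence is under 28 words and contains no joiner."""
    return word_count(sentence) < MAX_WORDS and JOINERS.search(sentence) is None


if __name__ == "__main__":
    print(looks_atomic("X is the official Y for Z."))
    print(looks_atomic(
        "X is the official Y for Z and is currently being phased out in "
        "favour of Q, although some implementations still rely on it."
    ))
```

The heuristic over-flags: a sentence such as "Salt and pepper are seasonings" contains "and" but asserts one proposition, so flagged sentences still need a human read.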
Driver two: surrounding-context length predicts paraphrase
The strongest predictor of paraphrase was the length of the parent paragraph in which the answer-bearing sentence sat. Sentences in parent paragraphs of 350+ words were paraphrased at 2.7× the rate of identical sentences in parent paragraphs of under 120 words. The mechanism appears to be that long parent paragraphs increase the composer's working context on the candidate page, and the larger context makes the composer more willing to synthesise across nearby sentences rather than to lift the target sentence intact. Short parent paragraphs concentrate the composer's attention on the target sentence and reduce the synthesis pressure.
We ran a structural test on 18 pages across 7 clients. Each page had a verbatim-eligible answer-bearing sentence (single-proposition, under 28 words) embedded in a parent paragraph of 350+ words. We split each long parent paragraph into 2–3 shorter paragraphs, leaving the answer-bearing sentence as the lead of its own short paragraph. Over the 45 days after the split, verbatim-extraction rate on the target sentence rose from a cohort mean of 22% to a cohort mean of 51%; paraphrase rate fell correspondingly. Two of the 18 pages saw the opposite effect — the split moved the sentence into a paragraph that the composer began ignoring entirely, and citation count on those pages dropped 30%. The lesson there is that the lead position of the parent paragraph carries authority in the composer's attention model, so the answer-bearing sentence should be the lead of the shorter paragraph rather than buried inside it.
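The placement rule from this section (answer sentence as the lead of a parent paragraph under 120 words) can be checked mechanically. The sketch below assumes plain-text paragraphs and a naive sentence splitter; the function names are hypothetical.

```python
import re

MAX_PARENT_WORDS = 120  # short-paragraph threshold from the audit


def split_sentences(paragraph: str) -> list[str]:
    """Naive split on ., ! or ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]


def check_placement(paragraph: str, answer_sentence: str) -> list[str]:
    """Return a list of placement problems; an empty list means the rule is met."""
    problems = []
    words = len(paragraph.split())
    if words >= MAX_PARENT_WORDS:
        problems.append(
            f"parent paragraph is {words} words (target: under {MAX_PARENT_WORDS})"
        )
    sentences = split_sentences(paragraph)
    if not sentences or sentences[0].strip() != answer_sentence.strip():
        problems.append("answer sentence is not the lead of its paragraph")
    return problems
```

The splitter will mis-split on abbreviations such as "e.g." or decimal-free initials, so a production check would want a proper sentence tokenizer.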
Driver three: schema and quote-block formatting bias the composer toward verbatim
The 22% composite population was the hardest to move with editorial work, because composite extraction is the composer asserting that no single sentence on the page was sufficient and that synthesis was required. Two formatting moves did materially shift the composite-versus-verbatim balance, however. Pages where the answer-bearing sentence was inside an HTML `<blockquote>` element took verbatim extraction at 2.1× the rate of the same sentence in plain `<p>` markup. Pages where the answer-bearing sentence was the value of a `Question.acceptedAnswer.text` field inside FAQPage schema took verbatim extraction at 1.9× the rate of the same sentence in plain HTML. The mechanisms are different — the blockquote signal appears to be a typographic affordance the composer reads as an author-marked quotable unit, while the FAQ schema signal is an explicit machine-readable claim that the field contains a complete answer — but the editorial implication is the same: machine-readable or typographically-marked "quotable" sentences are extracted more often as verbatim, less often as paraphrase or composite.
A caveat on the schema test: the lift only showed up on pages where the FAQ schema was structurally accurate (the question matched the query intent, the answer was the answer-bearing sentence, the entire question-answer pair was visible on the rendered page). Pages with FAQ schema applied to non-FAQ content — a common pattern in 2024–2025 thin-content playbooks — saw no lift and in some cases saw reduced citation rate, because the composer appears to be cross-checking schema claims against rendered content and discounting pages where the schema overstates the content. The takeaway is that schema is a useful verbatim-extraction lever but only when the schema is honest about what the page actually contains.
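For reference, a FAQPage block that carries the atomic answer in `Question.acceptedAnswer.text` can be generated as below, together with a simple check that the question and answer both appear in the rendered page text. The check is an assumption about what "structurally accurate" means in practice (a case-insensitive substring match after whitespace collapsing); the question and answer strings are placeholders, and the function names are hypothetical.

```python
import json


def faq_jsonld(question: str, answer: str) -> str:
    """Build a schema.org FAQPage JSON-LD string with one question-answer pair."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
        ],
    }
    return json.dumps(data, indent=2, ensure_ascii=False)


def schema_matches_page(question: str, answer: str, rendered_text: str) -> bool:
    """Assumed honesty check: both strings must be visible in the rendered text."""
    def squash(s: str) -> str:
        return " ".join(s.split()).lower()

    page = squash(rendered_text)
    return squash(question) in page and squash(answer) in page


if __name__ == "__main__":
    print(faq_jsonld("What is X?", "X is the official Y for Z."))
```

The output goes inside a `<script type="application/ld+json">` element on the page. Running `schema_matches_page` before publishing is one way to avoid the schema-overstates-content pattern described above.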
What changed in our content checklist
Three changes. We added an "atomic answer sentence" requirement to every editorial brief: each page must contain at least one single-proposition sentence under 28 words that constitutes the page's primary answer, and that sentence must be the lead of a short parent paragraph (under 120 words) rather than buried inside a long parent paragraph. We added an optional "blockquote and FAQ schema" pass to pages targeting branded-voice-sensitive queries: where preserving brand voice on the answer card is editorially important, we mark the atomic answer sentence with a `<blockquote>` element and, where structurally accurate, include the question-answer pair in FAQPage schema. And we changed our reporting: per-citation extraction-mode classification now appears in client reports alongside per-URL citation counts, and the report flags any page where the paraphrase or composite rate is above 60% and a single-line rewrite could plausibly shift it.
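The reporting change can be sketched as a per-URL rollup over classified citations. The example assumes the 60% threshold applies to the paraphrase rate or the composite rate individually, not to their sum, and it does not attempt to model whether a single-line rewrite is plausible. The input shape and function names are hypothetical.

```python
from collections import Counter

FLAG_THRESHOLD = 0.60  # from the checklist; assumed to apply per mode
MODES = ("verbatim", "paraphrase", "composite")


def mode_rates(modes: list[str]) -> dict[str, float]:
    """Share of each extraction mode among one page's citations."""
    total = len(modes)
    if total == 0:
        return {}
    counts = Counter(modes)
    return {m: counts[m] / total for m in MODES}


def flag_pages(citations: dict[str, list[str]]) -> list[str]:
    """Return URLs whose paraphrase or composite rate exceeds the threshold."""
    flagged = []
    for url, modes in citations.items():
        rates = mode_rates(modes)
        if rates and max(rates["paraphrase"], rates["composite"]) > FLAG_THRESHOLD:
            flagged.append(url)
    return flagged
```

Pages with no captured citations are skipped rather than flagged, since there is no rate to report.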
We dropped one habit. Through 2025 we had been writing answer-bearing sentences as full nuanced statements with qualifiers and edge-case mentions embedded inline, on the theory that the composer would extract the full nuanced statement. The audit shows the composer extracts the atomic core and paraphrases away the qualifiers, so the qualifiers and edge cases should be in adjacent sentences rather than embedded inside the answer-bearing sentence — that way the atomic sentence is verbatim-extracted, and the qualifiers are available for the composer to pull into a second sentence on the answer card if it wants to.
- 01. Write the answer-bearing sentence as a single proposition under 28 words. 15 of 22 audited pages moved verbatim-extraction rate from below 20% to above 50% with a 5–10-minute rewrite per page.
- 02. Make that sentence the lead of a short parent paragraph (under 120 words). Splitting 350+ word parents raised the cohort-mean verbatim rate from 22% to 51% over 45 days, with 16 of 18 audited pages improving.
- 03. Use blockquote and accurate FAQPage schema as verbatim levers, not as paraphrase fixes. Both roughly doubled the verbatim rate on the marked sentence (2.1× for blockquote, 1.9× for FAQPage schema), but the schema lift only appears when the schema honestly reflects the rendered content.
- 04. Treat YMYL queries as a separate regime. The composer paraphrases 71% of YMYL citations regardless of on-page atomicity; editorial work moves that number by less than ten percentage points.
Where this argument breaks
For YMYL queries (medical, financial, legal) the composer applies a much stricter paraphrase policy regardless of on-page atomicity — the YMYL paraphrase rate was 71% in our cohort, and on-page editorial work moved it less than ten percentage points. For pages with thin underlying SEO (no canonical, weak heading hierarchy, no internal-link structure), the verbatim-extraction lift from blockquote and FAQ schema markup was about half what we measured on pages with healthy underlying SEO; the formatting signals work as enhancers, not as substitutes for ranking foundations. For Chinese-language AI search (文心一言, 元宝, 通义), the verbatim-versus-paraphrase split is structurally different — Chinese-language composers paraphrase at a higher baseline rate (around 62% in our parallel Chinese-language audit) and the atomic-sentence lever applies but with smaller effect size. For very short answer-card outputs (1–2 sentence cards), the composite category collapses because there is no room for synthesis, and the verbatim rate is correspondingly higher (about 45% on short cards versus 31% overall). Our window was 60 days and the cohort was 28 sites; the per-vertical numbers should be read as point estimates. Outside those carve-outs the lesson holds: in 2026 the AI Overview composer's verbatim-versus-paraphrase decision is partly under editorial control, and on-page sentence shape is the lever.