How we re-write content to win AI Overview citations

We were sitting on positions three and five in the link sidebar for the term, which is a perfectly respectable ranking and the kind of number that would once have meant the work was done. The AI Overview directly above those sidebar links was quoting a competitor on every substantive bullet — a timeline that didn't add up, a list of named photo spots, a confetti rule we'd happened to get backwards on our own page. Our reviews count was 304. Theirs was 3. Their content had structural patterns the AI grabs; ours didn't. Ranking and citation aren't the same scoreboard, and on that day the scoreboard that mattered — the answer above the links, the one the buyer reads first — wasn't reading from us at all.

This piece is the methodology that closed that gap, written up while it's fresh and while we can still point at the before/after on a real page. I'll lay out the seven-step playbook first because it's reusable; the worked example follows underneath, and the audit-table format I used there is the one I'll keep using on the next venue page and the one after that. Some of what's described here is the kind of detail you give away once and stop being differentiated for. We'll decide what to publish once it's been through the redaction pass — for now it's all on the page so the decisions can be made from substance rather than guesswork.

01 · The frame

Why ranking and citation come apart, and why content shape is the lever.

The longer essay on this is "Ranking isn't citation", so I'll only restate the part that matters for this piece: a classic ranking is the output of a retrieval system that orders pages, and an AI Overview citation is the output of a generative system that synthesises an answer from a small handful of grounding sources. Different inputs, different appetites, different winners. The same page can be highly ranked and entirely un-cited because the AI didn't reach for it when it built the summary above the links.

Which sources the AI does reach for is the question we can act on, and the answer is concrete enough to engineer toward. Across the multiple AI Overviews we've audited for clients, the structural patterns that get quoted are remarkably consistent. The model preferentially grounds on content shaped like this: numbered minute-blocks with explicit time labels and bold durations, named-spot cards with consistent heading levels, FAQPage Q&As written with verbatim question phrasing that matches how users actually search, Service/Offer schema with named offerings rather than generic labels, and concise "what's included" bullet lists. Long flowing prose loses to short specific lists every time. Pages with no structured-data layer lose to pages with one. The model is doing a kind of pattern-matching against shapes that are easy to extract and cite, and you can either match those shapes or you can produce content with no on-ramp into the summary.
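
As a rough illustration of the "named offerings rather than generic labels" pattern, the sketch below builds a Service node with one named Offer per package and serialises it as JSON-LD. The package names are placeholders invented for the example, and the helper `build_service_jsonld` is hypothetical; the types and properties themselves (Service, Offer, offers, serviceType, areaServed) are standard schema.org vocabulary.

```python
import json


def build_service_jsonld(service_name, area, offer_names):
    """Return a schema.org Service node whose offers carry specific names."""
    return {
        "@context": "https://schema.org",
        "@type": "Service",
        "name": service_name,
        "serviceType": "Wedding photography",
        "areaServed": area,
        # One Offer per named package, instead of a single generic "Packages" label.
        "offers": [{"@type": "Offer", "name": name} for name in offer_names],
    }


# Hypothetical package names, for illustration only.
service = build_service_jsonld(
    "Belfast City Hall wedding photography",
    "Belfast",
    [
        "1-hour ceremony coverage",
        "2-hour City Hall coverage",
        "3-hour coverage with reception",
    ],
)

# The serialised string is what would sit inside a
# <script type="application/ld+json"> element on the page.
print(json.dumps(service, indent=2))
```

The point of the shape is that each offering becomes its own extractable node with its own name, which is easier to quote than a paragraph that mentions "packages" in passing.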

That's the lever. The ranking layer rewards inbound links and topical authority over time. The citation layer rewards structural readability for a parser that has seconds to decide which page becomes the source. They're not opposed — ranking is still useful, often necessary as a discovery signal — but the work that wins the citation layer is the work most agencies are still treating as a finishing touch.

02 · The playbook

Seven steps, in the order they happen.

Below is the procedure as we run it. The order matters: confirming the term is worth the work comes first because none of the structural restructuring is free; competitor fetch comes before our own audit because we want to know what we're beating before we measure ourselves against an abstraction; the three-surface landing comes before the deploy because drift between HTML, schema and the LLM corpus is the most common reason a re-write doesn't bank.

Confirm the term actually triggers an AI Overview.

Open the query in a logged-out browser, ideally in the geography you're targeting, and check what's above the classic link list. If there's no AI Overview the entire playbook is unnecessary — focus on classic SERP work and move on. If there is one, screenshot the full Overview including every citation badge. Note the "+N" counts next to each chip because those are the additional competitors the AI has fused into the same factual claim, and they need their own row in the audit.

Identify the cited competitors from the chips, not from a vibe.

Click the chips and write the cited URLs down. Don't guess from memory which competitors "probably matter" — the AI is telling you which sources it's reading, and the surprise is almost always who isn't on the list (the longest-established player in the category, often) and who is (a new entrant with a structurally cleaner page). That gap is the whole signal. Treat the cited list as the ground truth for who you're actually competing with on this query, today, in this surface.

Fetch each cited competitor's page directly, including raw source.

Don't read the AI's summary of the competitor; read the competitor. A WebFetch is usually enough, but for client-rendered sites (Squarespace, Wix, anything heavy on JavaScript that resolves content after page load) you'll need the raw HTML so the structured-data blocks and the actual text aren't hidden behind a render. Two things turn up here repeatedly. First, the AI is often synthesising rather than quoting — fusing partial facts from multiple sources into a confident-sounding paragraph that no single page actually said. Second, some of what gets attributed to a "source" is the model filling in arithmetic to make a structure tidy. Both matter for the next step.

Audit four layers, in this order.

Structural patterns: numbered minute-blocks, named-spot cards, FAQPage Q&As with verbatim question phrasing, Service/Offer schema with named offerings, concise "what's included" bullet lists. These are the on-ramps the model uses to decide which page becomes the cite.

Specific facts the AI grabs: the minutes, the counts, the named places, the named rooms, the prices, the rules. These are what end up in the Overview text — make a list of every concrete fact the competitor page surfaces that yours doesn't.

Authority signals: verifiable counts (your number of weddings shot, the number of years trading), enforced rules with named enforcers (the council bans X), primary-source citations the customer can verify themselves. These are the moats — the things competitors can't copy back to your page because the underlying reality is yours, not theirs.

Errors and gaps in their content: arithmetic that doesn't sum, rules quoted wrong, named places that don't exist or are misattributed. Competitor pages are often less rigorous than they look, and the AI quotes them anyway. Catching these is how you become the more reliable source on the same fact.

Match every structural pattern, beat them on grounded facts, skip what doesn't fit voice.

Parity on structural patterns is non-negotiable — if they have a minute-block timeline and you don't, you're invisible on that bullet of the Overview no matter how much social proof you carry. Beating on facts means naming the verifiable thing they can't claim: the count, the rule, the year, the named-source citation. Skip what doesn't fit voice — chasing parity at the cost of originality is the failure mode that produces pages that could be any agency's, with all the trust-signal that implies.

Land changes in three surfaces simultaneously.

The visible HTML is what the human reads. The JSON-LD schema (FAQPage, Service, AggregateRating, Review) is what Google's parsers extract and the AI grounding stack consumes for high-confidence facts. The LLM corpus (llms.txt and llms-full.txt) is what AI agents and on-site chat tools ground on when they want to answer a question about your business without crawling the page. If those three surfaces drift — visible HTML says one thing, schema says another, LLM corpus says a third — the model defaults to whichever source it independently trusts more, and your re-write competes with itself. The build-time gate that prevents drift is the most important single piece of plumbing in this whole playbook.

Validate, deploy, measure.

Pre-deploy: build cleanly, JSON-LD parses on every page, numbers add up where they should. Post-deploy: pass the page through Google's Rich Results Test to confirm the FAQPage and any other rich-eligible schema are picked up without errors. Then wait one to two weeks for the recrawl, re-run the same query in an AI Overview, and check whether the bullet the competitor used to own now cites you. The window between deploy and re-query is where the patience is — most clients want to check it the next day. The recrawl cycle is the actual cadence to plan around.
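
For step 3, once the raw HTML of a cited page is in hand, the structured-data blocks can be pulled out without rendering the page. The following is a minimal sketch using only the Python standard library; `JsonLdExtractor` and `extract_json_ld` are names made up for this example, and fetching the HTML is left out on purpose.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the contents of every <script type="application/ld+json">."""

    def __init__(self):
        super().__init__()
        self.blocks = []     # successfully parsed JSON-LD documents
        self.errors = []     # raw text of blocks that failed to parse
        self._buffer = None  # text chunks while inside a JSON-LD script

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._buffer is not None:
            raw = "".join(self._buffer)
            self._buffer = None
            try:
                self.blocks.append(json.loads(raw))
            except json.JSONDecodeError:
                self.errors.append(raw)


def extract_json_ld(html):
    """Return (parsed_blocks, unparseable_raw_blocks) for an HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.blocks, parser.errors


# Tiny inline fixture standing in for a fetched competitor page.
sample = (
    '<html><head>'
    '<script type="application/ld+json">{"@type": "FAQPage", "mainEntity": []}</script>'
    '<script>var x = 1;</script>'
    '</head><body><p>Visible text</p></body></html>'
)
blocks, errors = extract_json_ld(sample)
print(blocks)
print(errors)
```

Keeping the unparseable blocks rather than discarding them is deliberate: a competitor's broken JSON-LD is itself a finding for the errors-and-gaps layer.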

03 · The worked example

Belfast City Hall wedding photography.

The page is a venue-specific landing for one of our wedding-photography clients — Belfast City Hall, ceremony-only and short-coverage packages, the entire page already rebuilt as static HTML on the framework documented in the existing case study. The classic-search ranking work had landed: two of our URLs were at positions three and five in the link sidebar for the query "city hall wedding," which is a perfectly creditable result for a competitive Belfast term.

The AI Overview above those links told a different story. It carried a tidy numbered timeline ("First 20 mins / Next 30 mins / Next 20 mins / Final 50 mins") attributed to a competitor with three reviews in their structured data, while our page was not surfaced at all. It carried a "Best Places to Photograph at Belfast City Hall" panel naming spots inside the building, attributed to the same competitor. It cited a confetti rule we'd happened to get backwards on our own page: we'd been treating the post-ceremony confetti toss as a feature and selling it accordingly, while the council page the Overview was quoting listed confetti under explicitly prohibited items. We were ranking. We were not the source.

Before

304 reviews across 5 platforms · 700+ weddings shot · positions 3 & 5 in the link sidebar for the query · cited on zero bullets in the AI Overview text.

Goal

Same link positions · same review count · structural patterns and named facts that match what the Overview is grounding on · cited on the bullets the competitor currently owns.

Step 3 in practice — what the competitors actually said.

The cited competitor for the timeline runs on Squarespace, and the raw page source confirmed what the AI was synthesising. Their literal "How a Typical 2 Hour City Hall Wedding Unfolds" section laid out four named blocks: Arrival 20 mins, Ceremony 30 mins, Indoor Portraits 20 mins, Outdoor Portraits 30 mins. Add those up. It comes to 100, not 120, and the page is titled and positioned as a two-hour breakdown. The AI Overview's "Final 50 mins" bullet — the one the model attributed to the competitor — exists nowhere on the page. The AI had taken the four-block structure, noticed the sum problem, fused in the optional "extend your coverage" add-on that the competitor mentions later (Cathedral Quarter, Botanic Gardens), and synthesised a fifty-minute closing block to make the arithmetic land on 120. It then cited the competitor as the source for an arrangement of facts the competitor never published.
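
The arithmetic is quick to check by hand, but writing it down makes the gap explicit. This is a small sketch using only the durations quoted above; the variable names are illustrative. It shows that the numbers are consistent with the reading given here and makes no claim about what the model did internally.

```python
TITLE_MINUTES = 120  # the section is positioned as a two-hour breakdown

# Block durations as published on the competitor's page.
competitor_blocks = {
    "Arrival": 20,
    "Ceremony": 30,
    "Indoor Portraits": 20,
    "Outdoor Portraits": 30,
}

# Block durations as they appeared in the AI Overview.
overview_blocks = [20, 30, 20, 50]

published_total = sum(competitor_blocks.values())
shortfall = TITLE_MINUTES - published_total
overview_total = sum(overview_blocks)

print(f"published total: {published_total} min (short by {shortfall})")
print(f"overview total:  {overview_total} min")

# The Overview's closing block equals the published last block plus the shortfall.
assert overview_blocks[-1] == competitor_blocks["Outdoor Portraits"] + shortfall
```

The published blocks total 100 minutes, 20 short of the title, and the Overview's 50-minute closing block is exactly the published 30 plus that missing 20.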

The second cited competitor was a wedding-photography page on a different stack with much thinner prose: two paragraphs on a two-to-three-hour package, with the lines "Your ceremony can last between 20-30 minutes" and "the staircase is reserved for your photos for 15 minutes after the ceremony, just choose the white one or the marble one." That second sentence was the one that mattered to us: it independently corroborated a named-feature claim ("White Staircase") that we hadn't carried on our page. We'd named only the Grand Marble Staircase because we hadn't been able to verify the White Staircase as a distinct named feature from a single source. With two independent sources naming it, the verification threshold was met and the addition could go in.

The confetti rule was the third strand and the one that turned the audit into a correction rather than just an expansion. Belfast City Council's own published page lists confetti, rice, candles and animals (other than assistance dogs) under explicitly prohibited items. The competitor pages — and a real customer testimonial on one of them — independently corroborated the rule, with the workaround being a bubble send-off on the front steps. Our page was selling the confetti shot in six places, including the meta description, the JSON-LD service description, and the hero lede. We were telling buyers, structurally, that we'd capture something the venue doesn't permit. That's not a content gap; that's a content error, and it's the kind of error that erodes the page's claim to be the authoritative source even if the AI never explicitly contradicted us in the Overview text.

Step 4 in practice — the four-layer audit, applied.

Running our page through the four audit layers produced a short, concrete list of things to change. Writing the findings down in this format made the work tractable; without it the rewrite would have sprawled into a vague "improve the page" exercise rather than a set of named patches.

Structural patterns

Missing: a minute-block timeline section with explicit duration labels and h3 headers per block (the format the AI was grounding on); a named-spots section using the verbatim phrase "best places for wedding photos inside Belfast City Hall" as the h2 (the literal question being parsed); a concise "what's included" bullet list inside the recommended-package aside; the FAQPage mainEntity with the verbatim "best places" question as Q1. We had a granular shot list with time offsets, but it was framed as what we shoot rather than how the 120 minutes structure, and the AI didn't parse it as a timeline.

Specific facts

Missing: the White Staircase as a named feature (now independently corroborated); per-package minute breakdown summing correctly to 120; explicit ceremony-room capacities (~25 and 80–100); the 15-minute interior portrait window with the reason it's enforced; the package-specific differences for 1-hour and 3-hour variants. The Overview's most-cited fact bullets all corresponded to facts we hadn't surfaced on the page.

Authority signals

Held but not named: 700+ weddings shot since 2008; 304 verified reviews across five platforms; firsthand operational knowledge that competitor pages can't fake. Available, unclaimed: the council's enforced 15-minute interior portrait window — a verifiable, primary-source rule with a stated reason ("the next ceremony is queued behind"). Naming the rule and the reason converts a generic fact into a moat the competitor can't replicate without sourcing the same primary document — at which point they're amplifying our authority.

Errors and gaps

The competitor's broken arithmetic: their "2-hour timeline" sums to 100 minutes. We can publish a timeline that sums correctly. Our own error: six places on the page where confetti was sold as a feature, against a venue rule that bans it. Removing the error and turning the correction into an FAQ answer ("Can we have confetti at Belfast City Hall?" — naming the rule, the reason, and two workarounds) converts the biggest single liability on the page into a defensible authority claim.
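
One possible way to keep the audit reusable across venue pages is to hold it as data rather than as notes. The sketch below is an assumption about shape, not the format actually used on this project: the class names, field names and status labels are invented, and the sample rows are condensed from the findings above.

```python
from dataclasses import dataclass, field

LAYERS = (
    "structural patterns",
    "specific facts",
    "authority signals",
    "errors and gaps",
)


@dataclass
class Finding:
    layer: str   # one of LAYERS
    status: str  # e.g. "missing", "held but not named", "error"
    note: str

    def __post_init__(self):
        if self.layer not in LAYERS:
            raise ValueError(f"unknown audit layer: {self.layer!r}")


@dataclass
class PageAudit:
    query: str
    findings: list = field(default_factory=list)

    def add(self, layer, status, note):
        self.findings.append(Finding(layer, status, note))

    def by_layer(self, layer):
        return [f for f in self.findings if f.layer == layer]


audit = PageAudit(query="city hall wedding")
audit.add("structural patterns", "missing", "minute-block timeline with duration labels")
audit.add("specific facts", "missing", "White Staircase as a named feature")
audit.add("authority signals", "held but not named", "700+ weddings shot since 2008")
audit.add("errors and gaps", "error", "confetti sold as a feature in six places")

# Print the audit as a simple table, one row per finding, in layer order.
for layer in LAYERS:
    for finding in audit.by_layer(layer):
        print(f"{layer:20} | {finding.status:18} | {finding.note}")
```

Rejecting unknown layer names at construction time keeps every finding attached to one of the four layers, which is what stops the rewrite drifting back into a general "improve the page" list.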

Steps 5 and 6 in practice: what landed, and where.

The patches went in across the three surfaces in a single coordinated pass. None of the changes are large in isolation; the work was in landing them everywhere simultaneously so the page, the schema and the LLM corpus told the same story.

Surface 1 · Visible HTML

What the human reads.

Six-block timeline summing to 120 (15 + 25 + 15 + 15 + 15 + 35), with the council's 15-minute cap and its reason inside the interior-portraits block. Named-spots section with six cards. Eight-item FAQ with the "best places" question at position one. Per-variant blocks below the total for the 1-hour (cuts off at the Rotunda) and 3-hour (adds an hour at the reception venue) packages. Confetti scrubbed in all six places and replaced with bubble send-off language wherever it had been sold as a feature.

Surface 2 · JSON-LD schema

What Google parses.

FAQPage mainEntity expanded from seven Q&As to eight, with the "best places" question as the first entry and the answer mirroring the visible HTML verbatim. Service description rewritten to remove the confetti reference. WebPage description updated. The cross-page #business node remained the source of truth for AggregateRating (304 / 4.97) and the per-platform Review breakdown we'd already wired in the previous pass.

Surface 3 · LLM corpus

What AI agents ground on.

The City Hall section in llms-full.txt grew from four to six subsections — minute-by-minute timeline as prose, named interior spots, what's included in the 2-hour package, portrait stops between ceremony and reception, 8-item venue-specific FAQ, reception pairings. The corpus answers match the visible FAQ verbatim because the on-site chat tool and the public schema both read from them, and any drift would produce a quote-vs-page conflict.
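
One way to make verbatim agreement between the three surfaces the default rather than a discipline is to render all of them from a single list of questions and answers. This is a minimal sketch under stated assumptions: the answer text is placeholder, the full wording of the first question is guessed around the "best places" phrase, the `Q:`/`A:` corpus layout is invented for the example, and none of it is claimed to be how the site's build actually works.

```python
import json
from html import escape

# Hypothetical Q&A wording; answers are placeholders.
FAQS = [
    (
        "What are the best places for wedding photos inside Belfast City Hall?",
        "Placeholder answer naming the interior spots.",
    ),
    (
        "Can we have confetti at Belfast City Hall?",
        "Placeholder answer naming the rule, the reason and the workarounds.",
    ),
]


def render_html(faqs):
    """Visible HTML: one <details> element per question."""
    return "\n".join(
        f"<details><summary>{escape(q)}</summary><p>{escape(a)}</p></details>"
        for q, a in faqs
    )


def render_jsonld(faqs):
    """JSON-LD: a schema.org FAQPage with the same questions in the same order."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": q,
                "acceptedAnswer": {"@type": "Answer", "text": a},
            }
            for q, a in faqs
        ],
    }


def render_corpus(faqs):
    """LLM corpus: plain-text Q/A pairs (the layout is an assumption)."""
    return "\n\n".join(f"Q: {q}\nA: {a}" for q, a in faqs)


html_out = render_html(FAQS)
jsonld_out = json.dumps(render_jsonld(FAQS), indent=2)
corpus_out = render_corpus(FAQS)
print(corpus_out)
```

With one source list, reordering or rewording a question changes all three outputs at once, so a mismatch can only come from editing a rendered file by hand.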

The validation gate at build time is what enforces this. Every JSON-LD block on every built page is parsed before deploy; the timeline minute-block durations are extracted and summed (the script fails if they don't add to 120); the FAQPage mainEntity length is checked against the visible <details> count to confirm no drift between schema and HTML; the LLM corpus is grep-checked for the verbatim phrase that appears in the FAQ Q1. None of those are sophisticated checks individually, but having them all run on every build means the only failure mode for three-surface drift is a deliberate one.

// pre-deploy validation summary
✓ 37 JSON-LD blocks across 61 built pages parse cleanly
✓ FAQPage mainEntity: 8 questions (matches visible <details> count)
✓ Timeline blocks: [15, 25, 15, 15, 15, 35] sum=120
✓ "best places for wedding photos inside Belfast City Hall" present in h2 + FAQ summary + JSON-LD answer + llms-full.txt
✓ 0 dangling #shots anchor references after Shot list consolidation
✓ Confetti references: 6 → 0 (as a feature); 6 → 6 (in the "not permitted, here's the workaround" FAQ context)
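
A gate of this kind can be small. The sketch below covers three of the checks described above (timeline sum, schema-versus-HTML question count, verbatim phrase on every surface) and is an illustration only, not the script that produced the summary. The function name `check_build`, its parameters and the inline fixtures are all assumptions.

```python
import re


def check_build(html, faq_jsonld, corpus, timeline_minutes, phrase, total=120):
    """Return a list of failure messages; an empty list means the gate passes."""
    failures = []

    # 1. Timeline blocks must sum to the advertised total.
    if sum(timeline_minutes) != total:
        failures.append(f"timeline sums to {sum(timeline_minutes)}, expected {total}")

    # 2. Schema and visible HTML must carry the same number of questions.
    questions = faq_jsonld.get("mainEntity", [])
    details_count = len(re.findall(r"<details\b", html, flags=re.IGNORECASE))
    if len(questions) != details_count:
        failures.append(
            f"FAQPage has {len(questions)} questions, HTML has {details_count} <details>"
        )

    # 3. The verbatim phrase must appear on all three surfaces.
    schema_text = " ".join(
        f"{q.get('name', '')} {q.get('acceptedAnswer', {}).get('text', '')}"
        for q in questions
    )
    surfaces = {"html": html, "json-ld": schema_text, "corpus": corpus}
    for name, text in surfaces.items():
        if phrase.lower() not in text.lower():
            failures.append(f"phrase missing from {name}")

    return failures


# Minimal fixtures standing in for a built page, its schema and the corpus.
PHRASE = "best places for wedding photos inside Belfast City Hall"
demo_html = f"<h2>{PHRASE}</h2><details><summary>{PHRASE}?</summary></details>"
demo_schema = {"mainEntity": [{"name": f"{PHRASE}?", "acceptedAnswer": {"text": "..."}}]}
demo_corpus = f"Q: {PHRASE}?"

ok = check_build(demo_html, demo_schema, demo_corpus, [15, 25, 15, 15, 15, 35], PHRASE)
bad = check_build(demo_html, demo_schema, demo_corpus, [20, 30, 20, 30], PHRASE)
print(ok)   # []
print(bad)  # ['timeline sums to 100, expected 120']
```

In a real build the non-empty list would be turned into a non-zero exit code so the deploy stops; returning messages instead of raising on the first failure means one run reports every drift at once.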

Step 7 in practice — what we're measuring next.

The deploy went in, and the next two checks are scheduled rather than complete. First, the Google Rich Results Test against the live URL once the cache clears — confirming the expanded FAQPage is picked up and that no errors have crept in. Second, the AI Overview re-query in one to two weeks, after Google's recrawl cycle has had a chance to ingest the structural changes. The question isn't "do we now rank better?" — we were already at three and five and rank is the depreciating scoreboard. The question is whether the bullet about the timeline now cites us, whether the bullet about the named photo spots now cites us, and whether the confetti question — now that we own the corrected answer with the council rule and the workarounds — surfaces our page rather than a competitor.

Honest framing: the outcome layer of this case study isn't in yet. The build is shipped, the structure is documented, the patches are real and visible on the page. Whether the AI Overview re-attributes is a measurement we'll add when the recrawl cycle completes, in the same spirit as the existing rebuild case study — nothing claimed that isn't instrumented, and the process is the story until the outcome data lands.

04 · Failure modes

Two things to watch for, named out loud.

Mimicry.

Reading enough competitor pages quietly drags your voice toward theirs. There's a real risk in this playbook of producing content that's been so carefully optimised for structural parity that it could be any agency's, with all the trust-signal that implies. The defence is the third audit layer — authority signals — used as a constant check. If a change you're making would also make sense for the competitor with three reviews, it's not your differentiation; it's their parity. The City Hall page works partly because the 700+ weddings figure and the council's 15-min cap are in our voice, not borrowed. Strip those and the page is anyone's.

Inherited errors.

Copying competitor structure inherits their structural mistakes. The clearest example in this case study is the competitor's timeline summing to 100 minutes despite being titled and positioned as a two-hour breakdown. A naive "match what they're doing" pass would have produced the same broken sum on our page, and we'd be quietly competing on the same flaw. The fourth audit layer — errors and gaps — exists specifically to catch this. The same diligence applies to specific facts: if you can't verify a competitor's claim from a primary source, don't reproduce it just because the AI is citing it. The City Hall page's most defensible single line is that the 15-minute window is enforced because the next ceremony is queued behind. That sentence is true, verifiable from the council site, and sourced from an actual operating photographer. A competitor reproducing it would be quoting us, not the venue.

05 · The moats

What competitors can't copy back.

The work above produces a page that wins on structural parity, but the question worth asking out loud is what stops a competitor from running the same playbook on your page and overtaking you in three weeks. The honest answer is that the structural patterns themselves are not moats. Anyone can publish a minute-block timeline; anyone can add a FAQPage with eight questions; anyone can add the verbatim phrase to their h2. If that's all you've done, you've achieved temporary parity and the next round of the same audit by the same competitor closes the gap.

The moats that hold up are the verifiable, primary-source facts that anchor in reality the competitor can't reach. Three categories of them, in roughly ascending order of how durable they are:

Verifiable counts. The number of weddings you've shot, the year you started, the number of platforms you carry reviews on, the number of named venues in your portfolio. These are checkable against your gallery, your Companies House record, your platform profiles. A competitor with three reviews can claim three hundred, but the public reviews count contradicts them on every platform simultaneously. The 304-vs-3 asymmetry on the Belfast page is a stronger differentiator named explicitly on the page than left implicit.

Enforced rules with named enforcers. The council bans confetti on the grounds, and the published council page confirms it. The same council caps interior portraits at 15 minutes because the next ceremony is queued behind. The cathedral requires a permit for flash photography during evensong. The registrar caps the ceremony at 25 minutes plus paperwork. These are primary-source facts with named institutional enforcers and stated reasons. Competitors can copy the fact, but the moment they do, they're amplifying your authority because the source remains the same primary document and your page got there first.

Firsthand operational knowledge. The shape of the day, the actual minute counts that work, the names of the rooms that matter, the workarounds for the rules, the small disasters and how you handle them. This is the hardest moat for a competitor to forge because forging it requires actually doing the work hundreds of times. It's also the moat that's easiest to leave implicit by accident — most operators have it in their heads and don't bother writing it down. Writing it down explicitly, in the structural shape the AI grounds on, is the entire game.

Match every structural pattern, beat them on what they can't fake, skip what doesn't fit voice. Anyone can copy your structure. Nobody can copy your seven hundred weddings.

06 · What's next

The targets queued behind this one.

The Belfast City Hall page is the first venue page we've taken end-to-end through this playbook. The natural next targets are the other venue pages where the same structural moves apply almost verbatim — Belfast Castle (also council-owned, same permit rule structure), Galgorm (one of the most-searched NI venues, very likely to attract an AI Overview), Merchant Hotel and the rest of the Cathedral Quarter venues. Each of those gets its own audit, its own before/after, and contributes to the broader case for the methodology.

Above the venue-page layer, the same playbook applies to the broad-term landing pages — "Northern Ireland wedding photographer" is a bigger fish with a harder competitive field and a correspondingly bigger payoff if won. That's future work; the venue pages are the natural staircase to climb first because the structural patterns are reusable and the per-page audit cost decreases sharply once the template is in place.

We'll come back to this case study with the outcome data when the recrawl cycle has run. The build is shipped, the page is live, the patches are visible on a real URL. What's left is the measurement layer — Rich Results Test pass, AI Overview re-query, brand-search delta, GA4 referral source segmentation by AI assistant. Same standard as the existing case study work: nothing claimed that isn't instrumented, and the process is the story until the outcome data lands.

Jody Nesbitt

Founder, Folium Studio — County Down, on the web since 1998. Writes about AI visibility, the static web, and what's left of SEO once the blue links stop being where the answer comes from.
