Keyword Clustering That Drives Results: 10 Lessons from Andy Chadwick


Turn huge keyword lists into a small set of pages that rank, avoid cannibalization, and map cleanly to your architecture. These lessons are grounded in Andy Chadwick’s talk for Keyword Insights and real site examples.

1) What clustering solves

Clustering groups similar queries so one well-structured page can answer many variations. The result is fewer pages that each rank better, with far less self-competition. In most niches you need far fewer pages than keywords, which keeps crawl budget and internal links focused on the pages that do the work.

2) SERP vs NLP clustering

There are two main paths. SERP-based clustering checks overlap in Google’s top results and groups terms that share URLs. It mirrors what Google already serves, which means it respects intent splits that NLP might miss. NLP or semantic clustering is cheaper and fast for early passes, but it can blend distinct intents into one bucket.

Approach | Strengths | Risks | When to use
SERP-based | Follows real results, protects intent splits, great for competitive niches | Higher compute cost, needs live SERP checks | Final planning and enterprise migrations
NLP/semantic | Fast, inexpensive, good for first pass | May merge different intents | Exploration and early grouping

Keyword Insights supports SERP-based clustering with adjustable overlap and intent labeling. Try it for strict, intent-aware clusters.
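
To make the SERP-based idea concrete, here is a minimal Python sketch. It is not how Keyword Insights or any specific tool works internally. The overlap measure (shared URLs divided by the size of the smaller result set), the greedy grouping against each cluster's first keyword, the 0.7 threshold, and the sample URLs are all assumptions chosen for illustration.

```python
from typing import Dict, List, Set


def serp_overlap(urls_a: Set[str], urls_b: Set[str]) -> float:
    """Share of URLs two result sets have in common, relative to the smaller set."""
    if not urls_a or not urls_b:
        return 0.0
    return len(urls_a & urls_b) / min(len(urls_a), len(urls_b))


def cluster_by_serp(serps: Dict[str, Set[str]], threshold: float = 0.7) -> List[List[str]]:
    """Greedy grouping: a keyword joins the first cluster whose seed keyword
    (the first member) shares enough top-ranking URLs with it."""
    clusters: List[List[str]] = []
    for keyword, urls in serps.items():
        for cluster in clusters:
            seed = cluster[0]
            if serp_overlap(urls, serps[seed]) >= threshold:
                cluster.append(keyword)
                break
        else:
            # No existing cluster was close enough, so start a new one.
            clusters.append([keyword])
    return clusters


# Hypothetical top results per keyword (made-up URLs).
serps = {
    "homes for sale in california": {"a.com/ca", "b.com/ca", "c.com/ca", "d.com/ca"},
    "houses for sale in california": {"a.com/ca", "b.com/ca", "c.com/ca", "e.com/ca"},
    "california mortgage rates": {"x.com/rates", "y.com/rates", "z.com/rates", "a.com/rates"},
}
clusters = cluster_by_serp(serps)
```

Raising the threshold produces stricter, smaller clusters; lowering it merges more aggressively. The same overlap check can be reused to decide whether two look-alike terms belong on one page.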

3) Enterprise case study: 50M URLs, heavy cannibalization

In the talk, Andy shares a U.S. real-estate site with about 50 million URLs stuck on page 2 because crawl budget was spread thin and near-duplicate pages competed with each other. By finding roughly 70 percent SERP overlap in categories like “homes for sale in California”, “houses for sale in California”, and similar variants, the team merged categories, redirected about 15 million URLs, and saw traffic rise around 110 percent in 3 to 4 months. Watch the talk for context.

Enterprise playbook

  • Export a massive keyword set, run SERP-overlap clustering
  • Group near-duplicates at the category level
  • Pick canonical versions and redirect the rest
  • Fix internal links so equity flows to the canonicals

Risk control

  • Phase redirects in batches, monitor Search Console
  • Keep a change log and revert only if needed
  • Capture old slugs in redirect rules to avoid 404 spikes

4) Small-site consolidation

Smaller sites often carry yearly variants and “near-same” topics. Convert sitemap URLs into their target keywords, cluster them, and merge duplicates like “most-followed Instagram 2022” vs “2023”, or overlapping posts like “best time to post on Instagram”. Keep the evergreen page and redirect the rest to build a durable winner.
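
A quick way to spot yearly variants before running a full clustering pass is to strip years from slugs and group what is left. The sketch below assumes the target keyword lives in the last path segment and that years are four-digit numbers starting with 19 or 20. The example.com URLs are placeholders.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List
from urllib.parse import urlparse

# Four-digit years such as 2022 or 2023 (assumption: years are the only such numbers).
YEAR = re.compile(r"\b(?:19|20)\d{2}\b")


def slug_to_keyword(url: str) -> str:
    """Turn the last path segment into a year-free, lowercase pseudo-keyword."""
    slug = urlparse(url).path.rstrip("/").split("/")[-1]
    words = slug.replace("-", " ").replace("_", " ").lower()
    return " ".join(YEAR.sub("", words).split())


def yearly_variants(urls: Iterable[str]) -> Dict[str, List[str]]:
    """Group URLs that differ only by year; keep groups with more than one URL."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for url in urls:
        groups[slug_to_keyword(url)].append(url)
    return {keyword: members for keyword, members in groups.items() if len(members) > 1}


urls = [
    "https://example.com/blog/most-followed-instagram-2022",
    "https://example.com/blog/most-followed-instagram-2023",
    "https://example.com/blog/best-time-to-post-on-instagram",
]
merge_candidates = yearly_variants(urls)
```

Each group in `merge_candidates` is a candidate for one evergreen page plus redirects. Overlapping topics with different wording still need a clustering pass to surface.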

5) Franchise duplicates

Decentralized teams often publish the same guide under city or region folders, which causes large-scale cannibalization. Clustering reveals stacks of near-identical posts that should become one master with local appendices, proper canonicals, or location parameters. The search signal concentrates and rankings stabilize.

6) Fast keyword-to-URL mapping with “tokenized” slugs

Tokenize your existing URLs into pseudo-keywords and cluster those alongside your keyword list. Give the tokenized items a special marker and very high volume so they always remain visible inside the clusters. This shows which keywords map to existing URLs and which clusters need new pages. Andy highlighted this in a PPC and SEO mapping workflow for Air Wick.

Export sitemap URLs → Tokenize to pseudo-keywords (add a marker like !url_post_slug) → Cluster with your keyword list → See which tokens sit in each cluster → Map keywords to live URLs → Mark clusters that need new pages

Copyable checklist

1) Export sitemap.xml to CSV
2) Convert slugs to readable tokens (replace dashes with spaces)
3) Prefix tokens with a marker like !url_slug_here
4) Set an artificial high “volume” for tokens so they sort to the top
5) Append tokens to your keyword list and run clustering
6) Inside each cluster, match real keywords to your tokenized URL
7) Mark clusters with no token as “new page required”
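
The checklist can be sketched in Python as follows. The exact marker format (`!url_` glued to the readable slug), the volume value, the row layout, and the sample keywords are assumptions for illustration; the clustering step itself is left to whatever tool you use.

```python
from typing import Dict, Iterable, List, Tuple, Union
from urllib.parse import urlparse

URL_MARKER = "!url_"      # marker prefix; the exact format is an assumption
TOKEN_VOLUME = 1_000_000  # artificially high so tokens sort to the top


def tokenize_url(url: str) -> str:
    """Steps 2-3: turn the last path segment into a marked pseudo-keyword."""
    path = urlparse(url).path.strip("/")
    slug = path.split("/")[-1] if path else ""
    return URL_MARKER + slug.replace("-", " ").replace("_", " ").lower()


def build_rows(keywords: Iterable[Tuple[str, int]], urls: Iterable[str]) -> List[dict]:
    """Steps 4-5: append tokens to the keyword list, sorted by volume descending."""
    rows = [{"keyword": kw, "volume": vol, "url": ""} for kw, vol in keywords]
    rows += [{"keyword": tokenize_url(u), "volume": TOKEN_VOLUME, "url": u} for u in urls]
    return sorted(rows, key=lambda row: row["volume"], reverse=True)


def map_clusters(clusters: Dict[str, List[str]]) -> Dict[str, Union[List[str], str]]:
    """Steps 6-7: list the URL tokens in each cluster, or flag the cluster as a gap."""
    mapping: Dict[str, Union[List[str], str]] = {}
    for name, members in clusters.items():
        tokens = [m for m in members if m.startswith(URL_MARKER)]
        mapping[name] = tokens if tokens else "new page required"
    return mapping


# Hypothetical input: one keyword with a made-up volume and one live URL.
rows = build_rows(
    [("best time to post on instagram", 5400)],
    ["https://example.com/blog/best-time-to-post-on-instagram/"],
)

# Hypothetical clustering output, keyed by cluster name.
mapping = map_clusters({
    "posting time": ["best time to post on instagram", "!url_best time to post on instagram"],
    "reels length": ["how long can reels be"],
})
```

Keeping the original URL in each token row makes it easy to join the cluster output back to live pages afterwards.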

7) “Un-merge” terms that look the same but are not

Google often treats similar phrases differently. Plural vs singular or sibling nouns can live in different intent buckets. Examples from the talk include “vaporizer accessories” vs “vaporizer parts” and “skateboard wheel” vs “skateboard wheels”. If SERPs do not overlap, split the cluster and plan separate pages.

8) Find content gaps at scale

Pull a very large keyword universe for your topic, cluster it, then cross-check rankings to filter clusters where your domain does not show up. This surfaces net-new pages without hours of manual filtering. Prioritize by intent and business value, then brief.
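
As a rough sketch of the cross-check, the Python below keeps only clusters in which none of the keywords appear in your own ranking export. The cluster names, keywords, and the idea of a flat list of ranking keywords are assumptions; a real export would usually carry positions and URLs as well.

```python
from typing import Dict, Iterable, List


def content_gaps(
    clusters: Dict[str, List[str]],
    ranking_keywords: Iterable[str],
) -> Dict[str, List[str]]:
    """Return clusters where the domain ranks for none of the member keywords."""
    ranking = {kw.lower() for kw in ranking_keywords}
    return {
        name: keywords
        for name, keywords in clusters.items()
        if not any(kw.lower() in ranking for kw in keywords)
    }


# Hypothetical clustered keyword universe and ranking export.
clusters = {
    "posting time": ["best time to post on instagram", "when to post on instagram"],
    "reels length": ["how long can reels be", "instagram reels max length"],
}
ranking_keywords = ["Best time to post on Instagram"]

gaps = content_gaps(clusters, ranking_keywords)
```

Here `gaps` holds only the "reels length" cluster, which would then go through the intent and business-value prioritization before briefing.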

9) Beat broad authorities with focused pages

Export all keywords a single giant page ranks for, cluster them, and split into multiple tightly focused articles. Andy showed an example of a page ranking for hundreds of terms that broke cleanly into a couple dozen clusters. The smaller, focused articles can outrank the broad piece because each one matches intent and depth better.

10) Zero-volume wins from forums and PAA

Scrape threads on Reddit to find recurring questions, cluster those variants, enrich with People Also Ask prompts, then publish concise answers. A skincare brand used this approach to grow quickly because questions with low reported volume still have strong intent and weak competition. Keyword Insights offers helpers like large Search Console exports that unlock far more queries for clustering. Give it a try.

Templates and quick wins

Consolidation plan

Goal: reduce cannibalization and lift a single canonical
Scope: clusters with 60–80% SERP overlap
Action:
  – Choose canonical page per cluster
  – Redirect near-duplicates (log old -> new)
  – Update internal links to point at the canonical
  – Add a short “what changed” note for users if needed
QA:
  – Check 404s in logs
  – Watch Search Console coverage and queries for the canonical
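
The "log old -> new" step lends itself to a small script. This sketch assumes each cluster is a list of (URL, clicks) pairs and picks the page with the most clicks as the canonical. That selection rule is an illustrative assumption, not a recommendation from the talk, and the URLs and click counts are made up.

```python
import csv
import io
from typing import Dict, List, Tuple


def redirect_log(clusters: Dict[str, List[Tuple[str, int]]]) -> List[dict]:
    """Pick one canonical per cluster and list every other URL as old -> new."""
    rows = []
    for name, pages in clusters.items():
        # Assumption: the page with the most clicks becomes the canonical.
        canonical = max(pages, key=lambda page: page[1])[0]
        for url, _clicks in pages:
            if url != canonical:
                rows.append({"cluster": name, "old": url, "new": canonical})
    return rows


def to_csv(rows: List[dict]) -> str:
    """Serialize the redirect log so it can be kept as a change log."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["cluster", "old", "new"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical cluster with per-URL clicks.
log = redirect_log({
    "most followed instagram": [
        ("/blog/most-followed-instagram-2022", 120),
        ("/blog/most-followed-instagram", 900),
        ("/blog/most-followed-instagram-2023", 340),
    ],
})
log_csv = to_csv(log)
```

The same rows can feed both the redirect rules and the internal-link update, so the two stay in sync.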

Franchise clean-up

Inventory:
  – Export all /city/ and /region/ folders
  – Tokenize to detect duplicates across locations
Decide:
  – One master resource + location appendices, or
  – Location pages with canonical to master when same content
Ship:
  – Canonicals or redirects
  – Location-specific sections kept lightweight and unique

Zero-volume sprint

Sources:
  – Reddit threads, PAA, your support inbox
Steps:
  – Scrape questions, cluster phrasing variants
  – Create short Q&A posts with internal links to pillars
  – Add schema where relevant and a simple table of contents
  – Measure assisted clicks from these posts to core pages

Brief fields per cluster

Objective:
Audience:
Primary intent:
Top queries in cluster:
Entities and definitions:
Sections and examples:
Internal links (hub, spokes, product):
Primary CTA:
Success metric:

FAQ

How strict should my SERP overlap be?

Start moderate so obvious families group, then tighten for competitive topics. If top results share many URLs, one page usually serves the whole set.

Do I always map one cluster to one page?

Most of the time, yes. If the SERP shows two clear intents, split into two pages. Re-check after publishing and adjust links and anchors.

How often should I re-cluster?

Quarterly is a good cadence. Add new queries, merge overlaps, and refresh briefs. Watch cluster-level rank distribution and traffic to catch decay early.

Where do I start if I have no tools?

Begin with sitemap tokenization and a small NLP pass to spot obvious merges. Then move to a SERP-based tool like Keyword Insights for the final plan.

Case study stats and tactics summarized from Andy Chadwick’s talk. Use clusters to plan helpful pages that answer people’s questions and reduce duplicate targeting.