Cohort Analysis for SEO: Beyond Vanity Trends

How acquisition-month cohorts reveal whether your content compounds or quietly decays

Enric Ramos · 12 min read

A site-wide organic traffic chart is the most reassuring lie in SEO. The line goes up, the team feels good, and nobody notices that 70% of the growth is from three articles published two years ago — while everything published in the last six months is flatlining. Cohort analysis is the lens that exposes this. It is also the single technique most in-house SEO teams skip because the spreadsheet work feels tedious.

The promise is simple. Group every landing page by the month it first started receiving organic traffic, then plot each cohort's traffic curve against months since that first traffic. You stop seeing one aggregate line and start seeing dozens of small ones. Some climb steadily for 18 months: those are compounding assets. Some peak at month three and decay: those need a refresh strategy or pruning. Some never get above the noise floor: those are the content investments that didn't pay off, and you can stop pretending they will.

This article is a working recipe, not a theory piece. It covers the specific spreadsheet structure, the GA4 export you need, the pillar-versus-satellite comparison, the freshness signal, and the decisions each pattern should drive.

Why aggregate charts mislead

The basic problem with reporting site-wide organic traffic over time is that you are summing two different stories: the existing content's behavior (which compounds, plateaus, or decays based on its age and relevance) and the new content's contribution (which depends on production cadence and topic choice). The two stories interact in ways that hide one inside the other.

Consider a site that publishes 10 articles a month. Six months in, traffic is up 40%. The team celebrates. Cohort analysis reveals that the 30 articles from months 1-3 are flat, the 30 articles from months 4-6 are slowly climbing, and a single piece from month 2 happened to win a featured snippet that drives 60% of the new traffic. Most of the "40% growth" is one lucky article; much of the remaining work is producing little that is measurable. Without cohort analysis, that diagnosis can take years to surface.

Cohort analysis answers three questions a site-wide chart cannot. Are recently published articles starting to perform on the same curve as older articles, or has the curve flattened? Which content cluster has the steepest compounding curve? When does an article's traffic curve typically peak — and does that match your content refresh cadence?

What a useful SEO cohort actually is

A cohort, in its most useful form for SEO, is a set of landing pages grouped by the month they first received meaningful organic traffic. Not the month they were published — published articles often spend two to four months ramping into the index before their organic curve starts. The "first organic traffic month" is the better grouping variable because it normalizes the indexing lag and lets you compare cohorts on equal footing.

The threshold for "meaningful" is a judgment call, but a useful default is the month the page first receives 10+ organic sessions in a 30-day window. Below that, you are measuring noise. Above that, you have a pattern that's worth tracking forward.

A second axis matters: cohort by content type or cluster. Pillar pages and satellite pages behave differently, and lumping them into one cohort hides the most useful signal. The same is true for evergreen content versus time-sensitive content versus product-page-style content. Decide your content typology before you cohort, not after.

The third axis, used selectively, is freshness — when content was last meaningfully updated. A page published in 2022, refreshed in 2025, behaves more like a 2025 cohort than a 2022 cohort. We'll come back to this.

The data you need from GA4

GA4 makes the export side of cohort analysis straightforward, even if the analysis itself happens in a spreadsheet. The query you want is: monthly organic sessions per landing page, for as many months as you have data.

In GA4 Explore, build a Free-form exploration with: rows = Landing page + query string (or just Landing page); columns = Month (add Year as well if your history spans more than one calendar year, so that January 2024 and January 2025 stay in separate columns); values = Sessions; filter = Session source / medium contains "google / organic" (or, to include every search engine, Session default channel group exactly matches Organic Search). Set the date range to cover all of your available history.

Export the table as CSV. You'll have a wide-format spreadsheet with one row per landing page and one column per month. This is your raw cohort matrix.
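A minimal sketch of loading that export into Python, for readers who prefer a script to a spreadsheet. It assumes a clean CSV with a header row, the landing page in the first column, and one column per month; a real GA4 export may carry extra comment or total rows that need stripping first. The function name `load_matrix` and the sample data are hypothetical.

```python
import csv
import io

def load_matrix(csv_text):
    """Parse a wide CSV into (months, {landing_page: [sessions per month]})."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    months = header[1:]                      # every column after the URL
    matrix = {}
    for row in reader:
        # Keep only rows that look like site paths; this skips blank
        # lines, totals, and the "(other)" bucket.
        if not row or not row[0].startswith("/"):
            continue
        # An empty cell means no recorded sessions that month.
        matrix[row[0]] = [int(cell or 0) for cell in row[1:]]
    return months, matrix

# Hypothetical sample standing in for the exported file.
sample = """Landing page,2024-01,2024-02,2024-03
/guide-a,0,12,40
/guide-b,3,,15
(other),5,4,6
"""

months, matrix = load_matrix(sample)
print(months)
print(matrix)
```

Keeping the month labels as a separate list makes the later "months since first traffic" step a matter of list indexing.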

Two caveats worth knowing. First, GA4 can withhold or collapse low-volume rows: data thresholding hides rows with very few users for privacy reasons, and high-cardinality reports may fold the long tail of landing pages into an "(other)" row. For cohorting, those low-volume rows don't matter; we filter them out anyway. Second, GA4 ends a session after 30 minutes of inactivity by default, so a visitor who returns within that window is counted in the same session. For SEO cohort analysis this is fine, because the definition is the same across cohorts.

Read GA4 for SEO reports for the full set of GA4 reports worth pulling regularly; cohort analysis is one of the eight on that list.

The spreadsheet recipe

Here is the working structure. Columns: landing page URL, content cluster, publish month, first-organic-traffic month (the cohort label), and then one column per month of historical data.

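Step 1 is to compute the first-organic-traffic month for each row. With the four descriptive columns in A through D and the monthly data starting in column E, the formula in Google Sheets is something like =IFERROR(INDEX($E$1:$AZ$1, MATCH(TRUE, ARRAYFORMULA($E2:$AZ2>=10), 0)), "") where row 1 holds your month headers and the threshold is 10 sessions. The IFERROR wrapper leaves the cell blank for pages that never reach the threshold. That fills the cohort column.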
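The same lookup can be sketched in Python. This is an illustration under the article's default threshold of 10 sessions, not a prescribed implementation; `first_traffic_month` and the sample pages are hypothetical names and data.

```python
def first_traffic_month(months, sessions, threshold=10):
    """Return the first month label with sessions >= threshold, else None."""
    for month, count in zip(months, sessions):
        if count >= threshold:
            return month
    return None  # the page never reached the cohort entry threshold

months = ["2024-01", "2024-02", "2024-03", "2024-04"]
pages = {
    "/pillar": [2, 11, 30, 55],
    "/satellite": [0, 0, 14, 20],
    "/dud": [1, 3, 2, 4],
}

# Cohort label per landing page.
cohorts = {url: first_traffic_month(months, s) for url, s in pages.items()}
print(cohorts)
```

Pages that come back as `None` have no cohort yet; they stay out of the curves until they cross the threshold.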

Step 2 is to compute "months since first traffic" for each cell. This converts your absolute-time matrix into a relative-time matrix. A cell that's three months after the cohort month gets value "3"; the cell at the cohort month gets value "0". This is the transformation that lets you compare cohorts directly, regardless of when each was acquired.
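One way to sketch the relative-time transformation in Python is to drop every month before the cohort month, so that list index 0 is the cohort month and index 3 is three months later. The function name `to_relative` and the data are hypothetical.

```python
def to_relative(sessions, threshold=10):
    """Re-index a page's monthly sessions so index 0 is its cohort month."""
    for i, count in enumerate(sessions):
        if count >= threshold:
            return sessions[i:]   # cohort month onward
    return []                     # never reached the threshold: no curve

pages = {
    "/pillar": [2, 11, 30, 55],
    "/satellite": [0, 0, 14, 20],
    "/dud": [1, 3, 2, 4],
}

relative = {url: to_relative(s) for url, s in pages.items()}
print(relative)
```

After this step, `relative["/pillar"][1]` and `relative["/satellite"][1]` both mean "one month after first traffic", even though they fall in different calendar months.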

Step 3 is to pivot or aggregate. Group rows by cohort month, then average (or sum, depending on what you want to see) the relative-month columns. You now have one row per cohort and the average traffic curve from month 0 onward. Plot this and you have the picture.
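A small sketch of the aggregation step, assuming each page already has a cohort label and a relative-time series. It averages only over pages that have data at a given relative month. `cohort_curves` and the sample rows are hypothetical.

```python
from collections import defaultdict

def cohort_curves(rows):
    """rows: list of (cohort_label, relative_sessions).
    Returns {cohort_label: [average sessions per relative month]}."""
    grouped = defaultdict(list)
    for cohort, series in rows:
        grouped[cohort].append(series)

    curves = {}
    for cohort, series_list in grouped.items():
        longest = max(len(s) for s in series_list)
        curve = []
        for m in range(longest):
            # Only pages with data at relative month m contribute.
            values = [s[m] for s in series_list if len(s) > m]
            curve.append(sum(values) / len(values))
        curves[cohort] = curve
    return curves

rows = [
    ("2024-01", [10, 20, 40]),
    ("2024-01", [30, 40, 60]),
    ("2024-02", [12, 18]),
]
curves = cohort_curves(rows)
print(curves)
```

Swapping the average for `sum(values)` gives the total-traffic view instead of the per-page view. Using a `(cohort, cluster)` tuple as the label yields the per-cluster split.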

Step 4 is the layered version: split the pivot by content cluster. Now you have one curve per cluster per cohort, and the differences become visible. Pillars from Q1 climbing for 12+ months. Satellites from Q1 peaking at month 4 and gently decaying. Time-sensitive news content peaking at month 1 and dropping fast.

Step 5 is the freshness overlay. Add a column for "last meaningful refresh date" and re-cohort the refresh as if it were a new acquisition. Pages that got a refresh in 2025 should be part of the 2025 cohort for the months after the refresh, not the 2022 cohort. This is laborious to do by hand, so most teams only apply it to the top 50-100 pages — which is fine, since those are the pages that drive most decisions.
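The re-cohorting can be sketched as splitting a page's series at the refresh month: months before the refresh stay with the original cohort, and months from the refresh onward start a new relative-time series. The "(refresh)" label suffix, the function name `split_at_refresh`, and the data are assumptions for illustration.

```python
def split_at_refresh(months, sessions, cohort_month, refresh_month):
    """Return {cohort_label: relative series}, treating the refresh
    as a new acquisition event."""
    start = months.index(cohort_month)
    cut = months.index(refresh_month)
    return {
        # Original cohort: cohort month up to the month before the refresh.
        cohort_month: sessions[start:cut],
        # Refresh cohort: relative month 0 is the refresh month.
        refresh_month + " (refresh)": sessions[cut:],
    }

months = ["2025-01", "2025-02", "2025-03", "2025-04", "2025-05", "2025-06"]
sessions = [0, 40, 60, 45, 50, 80]

segments = split_at_refresh(months, sessions, "2025-02", "2025-05")
print(segments)
```

Each segment can then be fed into the cohort averaging like any other page.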

What pillar versus satellite cohorts reveal

Pillars and satellites have structurally different traffic curves, and a site that doesn't separate them in cohort analysis is averaging two different patterns into a single misleading line.

A well-targeted pillar page typically shows a slow ramp — month 0 to month 3 builds slowly as the page accumulates authority signals, internal links from satellites, and external backlinks. From month 3 onward, the curve compounds: traffic grows 15-30% month-over-month for 12-18 months, then plateaus at a high baseline. A pillar that's working has a curve that bends upward through the first year.

A satellite, by contrast, has a faster ramp but a lower ceiling. It starts ranking faster (lower competition, more specific intent), peaks somewhere between month 3 and month 6, and then either stabilizes at the peak (an evergreen satellite) or starts a slow decay (a time-bound satellite). A satellite that decays past month 12 is signaling either that the topic has become saturated by competitors or that the content has aged out.

The diagnostic value of separating these two cohorts: if your pillar cohorts are flat and your satellite cohorts are climbing, your topic clusters are upside-down — the satellites are doing the topical authority work the pillars should be doing. If both are flat, you have a content quality or technical SEO problem upstream of the cohort. If pillars climb and satellites decay quickly, your satellite refresh cadence is too slow.
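The three diagnostic patterns above can be written down as a small lookup. This is a rough sketch: the 10% tolerance that separates "flat" from "climbing" or "decaying" is an assumed value, not something the analysis prescribes, and `trend` and `diagnose` are hypothetical names.

```python
def trend(curve, tolerance=0.10):
    """Label a cohort curve as 'climbing', 'decaying', or 'flat'."""
    peak, first, last = max(curve), curve[0], curve[-1]
    if last < peak * (1 - tolerance):
        return "decaying"          # ended well below its peak
    if last > first * (1 + tolerance):
        return "climbing"          # ended well above where it started
    return "flat"

def diagnose(pillar_curve, satellite_curve):
    """Map the pair of trends to the readings described above."""
    pair = (trend(pillar_curve), trend(satellite_curve))
    if pair == ("flat", "climbing"):
        return "clusters upside-down: satellites are doing the pillars' authority work"
    if pair == ("flat", "flat"):
        return "content quality or technical SEO problem upstream"
    if pair == ("climbing", "decaying"):
        return "satellite refresh cadence is too slow"
    return "no pattern from the list above"

print(diagnose([20, 21, 20, 21], [10, 20, 40, 80]))
print(diagnose([10, 30, 60, 90], [10, 50, 60, 30]))
```

A script like this is only a first pass for triage; the curves themselves still deserve a look before any budget decision.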

This pattern is also where the TOFU/MOFU/BOFU framework intersects with cohort analysis. TOFU satellites tend to peak and decay faster (search intent is broad and saturable). BOFU satellites tend to plateau at lower volume but higher conversion rate. If you cohort by funnel stage, you'll see this shape clearly.

The freshness cohort signal

The single most actionable insight cohort analysis surfaces is the freshness signal: at what month does an average article in your portfolio start to decay, and how does refreshing it change the curve?

Build a sub-analysis. Take your top 50 satellites by historical traffic. For each, identify the month its traffic peaked and the first later month its traffic fell below 80% of that peak. The distribution of "months from publish to falling below 80% of peak" is your decay window for that content type. For most evergreen B2B SaaS content, the window is 14-22 months. For e-commerce category content, it's 18-30 months. For news-adjacent content, it can be 2-4 months.
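A sketch of that sub-analysis in Python, assuming each series starts at the publish month (index 0) and using the median as the summary of the distribution. The choice of median, the name `months_to_decay`, and the sample data are assumptions.

```python
from statistics import median

def months_to_decay(series, fraction=0.8):
    """Index of the first month after the peak where traffic falls
    below fraction * peak, or None if it never does."""
    peak_idx = max(range(len(series)), key=series.__getitem__)
    floor = fraction * series[peak_idx]
    for i in range(peak_idx + 1, len(series)):
        if series[i] < floor:
            return i
    return None

# Hypothetical monthly sessions since publish for three satellites.
satellites = {
    "/a": [10, 40, 100, 90, 85, 70, 60],   # peak 100, below 80 at month 5
    "/b": [20, 50, 48, 45],                # has not decayed yet
    "/c": [5, 30, 60, 40],                 # peak 60, below 48 at month 3
}

decays = [months_to_decay(s) for s in satellites.values()]
observed = [d for d in decays if d is not None]
decay_window = median(observed)
print(decays, decay_window)
```

Pages that have not yet decayed are excluded here, which biases the window short; with a long enough history that effect shrinks.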

Once you know your decay window, your content refresh calendar follows naturally: schedule refreshes 2-3 months before the typical decay point. The refresh itself becomes a fresh acquisition event in your cohort matrix, and you can measure its impact by comparing the post-refresh curve against the pre-refresh decay trajectory.

A useful quick check: the day-30 ratio. After a refresh, the page should regain at least 80% of its pre-decay peak within 30 days. If it doesn't, the refresh wasn't substantive enough — usually you changed the date and a few sentences instead of meaningfully updating the analysis. See content pruning for the companion decision: when a page is past saving, removing it can be a net positive for the site's overall topical authority signal.
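The day-30 check described above is simple arithmetic. A minimal sketch, assuming both inputs are 30-day organic session counts; `day30_ratio` and the numbers are hypothetical.

```python
def day30_ratio(post_refresh_sessions, pre_decay_peak_sessions):
    """Share of the pre-decay peak regained in the 30 days after a refresh."""
    if pre_decay_peak_sessions <= 0:
        raise ValueError("peak must be positive")
    return post_refresh_sessions / pre_decay_peak_sessions

# Hypothetical page: peaked at 500 sessions per 30 days before decaying,
# and drew 410 sessions in the 30 days after the refresh.
ratio = day30_ratio(410, 500)
refresh_was_substantive = ratio >= 0.8
print(round(ratio, 2), refresh_was_substantive)
```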

Cohort comparison across acquisition channels

A subtle but powerful extension: cohort the same landing pages by acquisition channel as well as by acquisition month. The same article often performs very differently depending on whether its first traffic burst came from organic, social, paid, or referral.

A page with a strong social acquisition pulse but weak organic build often plateaus quickly — social spikes don't seed enough internal-link weight or backlinks to compound. A page with a slow organic build that gets reinforced by a backlink event two months in often shows a step-change in the curve at the point of the backlink. A page that started with paid traffic and then gradually accumulated organic shows the cleanest compounding curve, because the paid traffic helped Google discover the page and the organic curve takes over once authority is established.

This analysis informs editorial strategy: if your cleanest-compounding cohorts all started with a paid amplification, your ungated content launches should systematically include a small paid push to seed the algorithm. If your worst cohorts all came in via social-only acquisition, social-first content needs a separate measurement framework, not the SEO cohort one.

The decisions cohort analysis should drive

Cohort analysis is only valuable if it changes what you do next. The five decisions it most directly informs:

Content investment allocation. If a content cluster's cohorts compound at 2x the rate of another cluster's, the budget shift is obvious — invest more in the compounding cluster, slow production in the underperforming one. Don't kill the underperforming one outright; it may be doing brand work that doesn't show in conversion data.

Refresh cadence. The decay window from cohort analysis tells you when to refresh, by content type. Hard-code this into your editorial calendar. The instinct to "refresh when traffic drops" is reactive; cohort-driven refresh is proactive and 2-3x more effective.

Pruning decisions. Pages that never crossed the cohort entry threshold (10+ sessions/month after 6+ months) are candidates for content pruning. They aren't doing topical authority work; they may be eating crawl budget and diluting your topic signal.

New cluster validation. When you launch a new content cluster, watch its cohort curves at month 3, 6, and 12 against your established cluster benchmarks. If month-6 cohort traffic is below 30% of established-cluster benchmarks, the cluster isn't working and you stop investing — early, not after a year of sunk-cost continuation.

Topical authority diagnostics. Across all cohorts, the average month-12 traffic per published page is your topical authority indicator. If it's growing year over year, your authority is compounding. If it's flat, you're treading water — producing more content to maintain the same per-page output. This is a slow-moving leading indicator that catches problems six months before site-wide traffic drops show them.
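Three of the decisions above reduce to small checks over the cohort data: pruning candidates, new-cluster checkpoints, and the month-12 authority indicator. The following is a sketch under the thresholds stated above (10 sessions, 6 months, 30% at month 6); all function names and sample numbers are hypothetical.

```python
from collections import defaultdict

def pruning_candidates(pages, threshold=10, min_age_months=6):
    """pages: {url: monthly sessions since publish}. Return URLs with
    enough history that never reached the threshold in any month."""
    return sorted(url for url, s in pages.items()
                  if len(s) >= min_age_months and max(s) < threshold)

def checkpoint_ratios(new_curve, benchmark_curve, checkpoints=(3, 6, 12)):
    """Ratio of a new cluster's cohort curve to the established
    benchmark at each relative month both curves have reached."""
    return {m: new_curve[m] / benchmark_curve[m]
            for m in checkpoints
            if m < len(new_curve) and m < len(benchmark_curve)
            and benchmark_curve[m] > 0}

def month12_average_by_year(pages):
    """pages: list of ('YYYY-MM' cohort month, relative series).
    Average relative-month-12 sessions per page, by cohort year."""
    buckets = defaultdict(list)
    for cohort_month, series in pages:
        if len(series) > 12:                  # page has reached month 12
            buckets[cohort_month[:4]].append(series[12])
    return {year: sum(v) / len(v) for year, v in sorted(buckets.items())}

prune = pruning_candidates({
    "/old-dud": [1, 3, 2, 4, 0, 5, 2],
    "/young": [0, 2, 1],
    "/winner": [2, 11, 30, 55, 60, 70],
})

benchmark = [10, 20, 40, 60, 80, 100, 120, 140, 160, 180, 200, 220, 240]
new_cluster = [10, 12, 15, 18, 20, 25, 30]
ratios = checkpoint_ratios(new_cluster, benchmark)
month6 = ratios.get(6)
keep_investing = month6 is None or month6 >= 0.30

authority = month12_average_by_year([
    ("2022-03", list(range(10, 140, 10))),
    ("2022-09", list(range(5, 70, 5))),
    ("2023-02", list(range(20, 280, 20))),
    ("2024-01", [10, 15]),                    # too young, ignored
])
print(prune, ratios, keep_investing, authority)
```

Note that the authority sketch averages only pages that have a cohort and have reached month 12; counting pages that never got traffic as zeros would be a stricter reading of "per published page".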

Common mistakes to avoid

Cohort analysis is conceptually clean and operationally messy. The mistakes that turn the analysis into noise:

Cohorting by publish date instead of first-traffic date. The publish date is when you hit "publish"; the first-traffic date is when Google decided your page was real enough to send traffic. The lag between the two is meaningful and varies wildly. Cohorting by publish date adds noise that obscures the real curves.

Mixing pillars and satellites in one cohort. Their curves are structurally different. Averaging them produces a curve that doesn't describe either accurately.

Ignoring the seasonality overlay. A cohort acquired in November 2024 and a cohort acquired in March 2025 may have very different "month 6" results because their month 6 falls in different seasonal demand. For cyclical industries, you need to seasonally adjust the relative-time matrix.

Using too short a history. Cohort analysis with under 12 months of data is mostly speculation. The compounding curves don't reveal themselves until month 12-18. Be patient with the methodology.

Conflating refresh with re-publish. Some teams change the publish date when they refresh, which makes the refresh look like new content in the cohort. Refreshes should keep their original publish date and be tracked as a separate refresh event, so the cohort math stays clean.
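For the seasonality mistake above, one simple adjustment is to divide each month's sessions by a demand index for that calendar month before converting to relative time. The sketch below assumes you already have such an index (1.0 = an average month), for example from several years of site-wide data; the index values, the function name `seasonally_adjust`, and the sample series are hypothetical.

```python
def seasonally_adjust(months, sessions, seasonal_index):
    """Divide each month's sessions by the demand index for its
    calendar month. Months missing from the index are left unchanged."""
    adjusted = []
    for label, count in zip(months, sessions):
        calendar_month = int(label.split("-")[1])   # "2024-11" -> 11
        adjusted.append(count / seasonal_index.get(calendar_month, 1.0))
    return adjusted

# Hypothetical index: strong November and December, weak January.
seasonal_index = {11: 1.25, 12: 1.5, 1: 0.5}

months = ["2024-11", "2024-12", "2025-01"]
sessions = [125, 150, 50]

adjusted = seasonally_adjust(months, sessions, seasonal_index)
print(adjusted)
```

In this made-up example the raw series looks like a collapse in January, while the adjusted series is flat: the page held its share of demand and only the demand moved.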

Putting cohort analysis into your monthly cadence

Once the spreadsheet is set up, monthly maintenance takes about an hour. Re-run the GA4 export with the latest month's data, append it as a new column, recompute the relative-month index for the rows that crossed thresholds, and refresh the pivot. The deliverable from each run is a one-paragraph narrative — what the latest cohort is doing relative to its predecessors, and what decision it suggests.

The most useful artifact to keep is a quarterly cohort review. Once every 90 days, walk through every cohort and ask: is this still on its expected trajectory? Where it's not, what's the cause? This is the meeting where most of the editorial calendar's next quarter gets shaped.

For the broader measurement frame this fits into, see the SEO analytics stack pillar. For the report-out side — translating cohort findings to executives — see reporting SEO to non-SEO stakeholders. For the GA4 mechanics that make cohorting fast, GA4 for SEO reports covers the eight reports worth pulling at the same monthly cadence.

The goal of cohort analysis is not to produce a beautiful chart. It is to make decisions you would not have made looking at the aggregate. If a cohort review doesn't change a single thing about next quarter's plan, you either have an extraordinarily well-tuned content engine or you're running the analysis for show. The first is rare; the second is the more common diagnosis.
