# Cross-Channel Attribution

First-click attribution and conversion maturation prediction — two methods that combine to produce click-time revenue you can optimize against.

---

Every attribution model has a bias. Most of them bias in the same direction — toward the channels that touched the user last, not the ones that introduced them. That bias isn't a bug. It's the business model.

## 01 — The Direction Problem

Attribution models disagree on how to distribute credit. But they almost all agree on direction: credit flows toward the bottom of the funnel. The channel that touched the user last gets the most. The channel that introduced them gets the least — or nothing.

This isn't a quirk of one model. It's a pattern across four distinct mechanisms, each independently tilting credit the same way.

### The retargeting machine

It starts the moment someone visits your site. A platform pixel fires. The user is added to retargeting audiences across every ad network with a pixel on your page — Google Display, Meta, TikTok, your DSP — before your exclusion audiences even update.

From here, the credit theft escalates in two stages:

**Click retargeting.** The user sees a retargeting ad, clicks it, and converts. The retargeting platform claims the conversion. The prospecting ad that introduced this person — the only channel that did something no other channel could have done — gets no credit.

> **[Illustration: Retargeting Loop]**
> Vertical step-by-step flow diagram showing how a user discovered via prospecting gets intercepted by retargeting. Six steps animate sequentially: (1) User visits site organically, (2) Platform pixel fires (< 1 sec), (3) User added to retargeting audience, (4) Retargeted ad served (within minutes), (5) User clicks retargeting ad and converts, (6) Platform claims credit. The final step is highlighted to emphasize the false claim.

**Impression retargeting (post-view).** The user doesn't even click. They see an ad — or more likely, they scroll past one — and convert later on their own. The platform claims a "view-through" conversion. The user never engaged. The platform just happened to show an ad to someone who was already going to buy.

Post-view is worse than click retargeting because the bar for claiming credit is zero interaction. A large share of display ad impressions are never actually seen — the user never scrolled to them, or the ad loaded below the fold. Yet platforms count post-view conversions for all of them. Platform-reported conversions routinely sum to several times the actual number. This is not a bug. It's how post-view attribution is designed to work.

> **[Illustration: Post-View Credit Grab]**
> Hourglass fan-out flow diagram showing how view-through attribution inflates conversions. A user visits organically, pixels fire across three platforms (< 1 sec), each platform serves an impression (hours later), the user converts on their own (days later), and all three platforms claim credit. Punchline: "1 conversion — counted 3x."

The attribution window is the tell. Facebook shortened its default post-view window from 28 days to 1 day — not because 1 day is more accurate, but because a longer window attributes nearly everything to social, and that doesn't look realistic. The window was shortened for optics, not accuracy. A calibrated lie.

Here is the question this raises: if you have platform pixels on your website, every returning visitor gets retargeted. What conversion *isn't* post-view retargeting?

Standard attribution often shows retargeting ROAS of 10x or higher. Incrementality testing consistently reveals the true lift is a fraction of that — sometimes negative. The gap exists because retargeting excels at claiming credit for conversions that were already happening.

> **[Illustration: Bias Comparison]**
> Side-by-side horizontal bar charts comparing two attribution biases across six channels (Brand Search, Retargeting, Direct/Organic, Display, Paid Social, YouTube). Panel 1 "Last-Click" shows credit concentrated in Brand Search and Retargeting. Panel 2 "Post-View" shows credit smeared across everything — channels with minimal spend show substantial attribution because everyone scrolls past ads.

### Brand search cannibalization

When someone searches your brand name, they already know you exist. They are navigating to your site, not discovering it. The organic result is right below the ad — same destination, one scroll away. Without the ad, the click goes to the organic link. The visit still happens. The conversion still happens.

Brand search ads intercept this intent and claim credit for it. Attribution records the ad click as the converting touchpoint. But the ad didn't create the visit — it intercepted a visit that was already happening. The reported ROAS is high because the denominator is small (brand keywords are cheap) and the numerator is large (these users were already going to buy). The actual incremental impact — conversions that would not have happened without the ad — is a fraction of what's reported.

Geo-testing confirms this reliably. Pause brand ads in a set of regions, keep them running everywhere else, and compare. The pattern is always the same: nearly all of the paid brand traffic shifts to organic. The conversions don't disappear. They just arrive through a different link. The gap between reported brand search performance and actual incremental impact is consistently enormous — often an order of magnitude.
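
As a rough sketch of the arithmetic behind such a test, the comparison below scales the holdout regions' pre-test conversions by the control regions' trend and treats the shortfall as the brand ads' true incremental contribution. All numbers are hypothetical, and a real geo test would add matched-market selection and significance checks on top.

```python
# Back-of-envelope geo-holdout math with hypothetical numbers.
# "pre" = brand ads running everywhere; "post" = ads paused in the holdout regions only.
control = {"pre": 10_000, "post": 10_400}   # regions where brand ads kept running
holdout = {"pre": 5_000, "post": 5_150}     # regions where brand ads were paused

# Expected holdout conversions if pausing had no effect: scale by the control trend.
expected_post = holdout["pre"] * control["post"] / control["pre"]
incremental = expected_post - holdout["post"]      # conversions actually lost to the pause

reported = 1_200   # what attribution credited to brand ads in the holdout (hypothetical)

print(f"Expected holdout conversions without the pause: {expected_post:.0f}")
print(f"Observed holdout conversions: {holdout['post']}")
print(f"Incremental conversions from brand ads: {incremental:.0f}")
print(f"Reported vs incremental: {reported} vs {incremental:.0f}")
```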

### Algorithmic multi-touch

Algorithmic multi-touch attribution — Shapley values, Markov chains, Google's Data-Driven Attribution — borrows from cooperative game theory and probabilistic modeling. The idea: calculate each channel's "fair" contribution by evaluating all possible coalitions of channels in a conversion path.

The fair-credit calculation requires testing every possible ordering of channels in the conversion path. The number of orderings grows explosively: five channels produce 120. Twenty channels produce over two quintillion. In practice, nobody runs the full calculation. They run simplified versions — statistical sampling that trades mathematical rigor for computational feasibility. The "fair" attribution you see is an approximation of an approximation.
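
A minimal sketch of what that sampling looks like in practice: instead of enumerating every ordering, estimate each channel's average marginal contribution over a few thousand random orderings. The `conv_prob` value function below is invented for illustration; in a real MTA system it is itself estimated from observed (and incomplete) path data, so the sampled credit inherits every flaw of that data.

```python
import random

def conv_prob(coalition: frozenset) -> float:
    """Invented value function: conversion probability for a set of channels.
    Real systems estimate this from observed, incomplete journey data."""
    p = 0.0
    if "prospecting" in coalition:
        p += 0.020
        if "retargeting" in coalition:      # retargeting only adds value after discovery
            p += 0.010
    if "brand_search" in coalition:
        p += 0.002
    return p

def shapley_sampled(channels, value, n_samples=10_000):
    """Approximate Shapley credit by sampling orderings instead of enumerating
    all len(channels)! permutations, i.e. the 'simplified version' run in practice."""
    credit = {ch: 0.0 for ch in channels}
    for _ in range(n_samples):
        order = random.sample(channels, len(channels))   # one random ordering
        seen, prev = set(), value(frozenset())
        for ch in order:
            seen.add(ch)
            cur = value(frozenset(seen))
            credit[ch] += cur - prev                     # marginal contribution
            prev = cur
    return {ch: total / n_samples for ch, total in credit.items()}

print(shapley_sampled(["prospecting", "retargeting", "brand_search"], conv_prob))
```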

Google's DDA is worse: a proprietary black box that requires 400 conversions over 28 days to activate. Below that threshold, GA4 silently falls back to last-click without telling you. Most accounts never hit the threshold. They think they're running data-driven attribution. They're running last-click with a better label.

But the deeper problem isn't computation. It's the counterfactual. Algorithmic MTA does not ask: "What would have happened if no marketing action were taken?" These models run on cookie-based journey data that is fundamentally incomplete — cross-device, cross-browser, ITP, and consent gaps mean the input is missing a significant share of the actual journey. A brand search ad gets credit because the user clicked it before converting, even when the user was already navigating to your site. A retargeting ad gets credit because it appears in the observed path, even when the user was already in checkout.

Algorithmic multi-touch doesn't remove bias. It launders it through enough math that no one can trace it back.

### In-app browser fragmentation

When a user taps an ad on Instagram or TikTok, it opens in the app's embedded browser — a separate cookie sandbox from Safari or Chrome. The user browses your site, then switches to their default browser and converts. Analytics records the ad click and the conversion as two unrelated visitors. The ad gets no credit. The conversion appears organic. This problem is covered in depth on the [Identity Graph](/measurement-engine/identity-graph) page.

> Every bias over-credits in the same direction — toward the channels that touched the user last, not the ones that introduced them. This isn't noise. It's a tilt.

---

## 02 — Five Properties of Trustworthy Attribution

Before choosing an attribution model, define what you expect from one. Not every model needs every property. But knowing which properties a model has — and which it lacks — prevents the common mistake of trusting a number because it came from something called "data-driven."

The table below scores seven attribution models against five properties. Each property is a standalone requirement — a question you should be able to answer about any model before trusting its output.

**Traceable** — A conversion must be traceable to a specific session: one click, one timestamp, one landing page. Without traceability, you cannot investigate why a specific conversion was attributed the way it was. Traceability is the foundation of debugging and auditing.

**Explainable** — The attribution logic must be explainable in one sentence to a non-technical stakeholder. If you cannot explain it simply, you do not understand it well enough. Complexity that cannot be communicated cannot be challenged.

**Directionally Neutral** — The methodology must not systematically over-credit one end of the funnel. A directionally neutral methodology has no structural tilt — its errors are either random or self-correcting, rather than consistently favoring acquisition or re-engagement channels.

**Auditable** — The raw data and attribution logic must be inspectable in your own data warehouse. If you only see the output but not the inputs or the calculation, you are trusting a black box. Auditability means you can reproduce the result independently.

**Deduplicated** — Each conversion must be counted exactly once across all channels. If multiple systems independently claim credit for the same conversion, the total attributed conversions will exceed actual conversions. Deduplication requires a single source of truth.

> **[Interactive: Attribution Comparison Table]**
> HTML table scoring seven attribution models (Ad Platform Post-Click, Ad Platform Post-View, Ad Platform "Incremental", Last-Click, First-Click, Algorithmic MTA, "Cookieless") against five properties (Traceable, Explainable, Dir. Neutral, Auditable, Dedup.). Each cell shows a pass (filled circle), partial (half circle), or fail (X circle) icon. First-Click is the only model that passes all five properties and is highlighted with an accent row.

> **[Interactive: Attribution Rationale]**
> Seven collapsible accordion cards, one per attribution model. Each expands to show the scoring rationale for all five properties with pass/partial/fail icons. Explains why each model earned its score.

### Same journey, three models

Consider a real conversion path: Meta Prospecting ad, then an organic return visit, then Google retargeting, then TikTok retargeting, then a brand search click, then conversion.

> **[Interactive: Journey Three Models]**
> A five-touchpoint journey (Meta Prospecting, Organic Return, Google Retargeting, TikTok Retargeting, Brand Search) shown under three attribution models via a segmented control. Last-Click gives 100% to Brand Search. Algorithmic MTA distributes credit across all touchpoints (18%, 12%, 22%, 25%, 23%) — the discovery channel gets less than each retargeting channel. First-Click gives 100% to Meta Prospecting. Horizontal credit bars animate when switching models. Each model includes a verdict explaining its bias.

### The triangulation fallacy

A common response to broken attribution: "Use multiple models and triangulate." The logic sounds reasonable. GPS uses multiple satellites. Surely multiple attribution models converge on truth.

The analogy fails. GPS satellites measure the same physical reality — the speed of light is constant, the satellite positions are known, and the signal propagation is governed by physics. Attribution models don't measure reality. They impose different allocation rules on the same incomplete data. Averaging a compass that points north and a compass that points south doesn't give you east. It gives you the illusion of a direction.

---

## 03 — Why First-Click Is Closest to Incrementality

The question attribution should answer isn't "which channel touched the user last?" or "how should credit be distributed fairly?" It's: "which channel introduced this person for the first time?"

That question — who introduced the user — is the closest any click-based attribution method can get to incrementality. A channel that brings someone to your site for the first time has done something no other channel in the path could have done. Every subsequent touchpoint — retargeting, email nurture, brand search — requires that first visit to exist.

### The logic chain

1. **Incrementality** measures whether a conversion would have happened without marketing intervention
2. The most incremental touchpoint is the one that **introduced the user** — without it, no subsequent touchpoints exist
3. The channel that introduced the user is the **first click**

First-click attribution is not perfect incrementality. But it is the only rule-based attribution method whose bias — crediting discovery — aligns with the question marketers actually need answered: "Where are my new customers coming from?"

### Identity graph prerequisite

First-click attribution is only as good as your ability to identify who the user is across sessions. If the identity graph fails to stitch a retargeted session back to the user's original visit, first-click credit goes to the retargeting channel — the same error every other model makes.

This is why [the identity graph](/measurement-engine/identity-graph) is a prerequisite, not an optional add-on. Sessions are grouped by `universal_id` (post-identity-graph), not by `anonymous_id`. The first click is the first session for the resolved user, not the first session for a cookie.

### Attribution window

First-click does not mean first-ever. The attribution window is configurable: 30 days by default, up to 90 days maximum. If a user's first visit was 120 days ago, the attribution window has closed. Their next visit starts a new attribution cycle.

This prevents the absurdity of crediting a Facebook ad from six months ago for today's conversion. The window defines how far back "first" can mean.
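
In code, the rule itself is short. The sketch below assumes sessions have already been grouped under a single resolved `universal_id`; the class and field names are illustrative, not the production schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Session:
    universal_id: str      # identity-resolved user, not a cookie-level anonymous_id
    started_at: datetime
    source: str            # e.g. "meta / paid_social"

def first_click_source(sessions: list[Session], converted_at: datetime,
                       window_days: int = 30) -> str | None:
    """Traffic source of the earliest session inside the attribution window.
    `sessions` must all belong to one resolved universal_id."""
    window_start = converted_at - timedelta(days=window_days)
    eligible = [s for s in sessions if window_start <= s.started_at <= converted_at]
    if not eligible:
        return None        # window closed: the next visit starts a new attribution cycle
    return min(eligible, key=lambda s: s.started_at).source
```

If the identity graph failed to stitch the original visit into that `universal_id`, the earliest eligible session would be a retargeting or brand search visit and the function would misassign credit, which is exactly the failure mode described above.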

Attribution accuracy matters because every downstream decision — budget allocation, marginal ROAS, channel sequencing — compounds from it. Wrong attribution at the source produces wrong optimization everywhere else.

This is the attribution methodology the SegmentStream Measurement Engine implements: first-click credit assignment on identity-resolved journeys, with click-time reporting and maturation prediction. The sections below describe how.

---

## 04 — Click-Time vs Conversion-Time Reporting

Attribution determines which channel gets credit. But there is a separate question that distorts the numbers just as badly: *when* does that credit land in your report?

Most analytics platforms default to conversion-time reporting — revenue appears on the date the conversion happened. This creates a problem whenever spend and conversions don't land in the same period. Cut your budget this month and conversions from last month's high spend inflate this month's ROAS. Increase spend and conversions haven't caught up yet — ROAS craters. Shift budget between channels and the lag makes the old channel look better and the new one look worse. Any budget fluctuation creates the distortion.

### The seasonal illusion

The distortion is easiest to see when spend varies across months. A business runs campaigns from April through September, scaling up for peak season and down after. True ROAS is 3.0 across every month — same campaigns, same performance. But their analytics shows something different:

> **[Interactive: Seasonal Toggle]**
> Toggle between Click-Time and Conversion-Time views of the same campaign data (April–September). Click-Time table shows consistent ROAS 3.0 across all months. Conversion-Time table shows distorted ROAS: April 0.6, May 1.2, June 4.5, July 4.0, August 2.2, September 10.2 — with "illusion" labels explaining the wrong conclusion each number would trigger. Same campaigns, same performance, opposite conclusions.

Conversion-time reporting attributes revenue to the month the conversion happened, not the month the click happened. Clicks in April drive conversions in June. The result: April looks like a failure (ROAS 0.6) and the team concludes "campaigns underperforming — cut budget." September shows ROAS 10.2 and the team concludes "best month ever! Keep going!" Same campaigns. Same performance. Opposite conclusions — both wrong.

Click-time reporting with maturation prediction fixes this — accurate ROAS within days, not months.

### Click-time reporting

Click-time reporting stamps revenue to the date of the original click, not the date of the conversion. Every conversion produces two records: one at the conversion date (for finance), one at the click date (for marketing).

Click-time reporting answers: "What is the ROAS of the clicks I paid for in April?" — regardless of when those clicks converted. Conversion-time reporting answers a different question: "How much revenue was recognized in April?" Both are valid. But only click-time ROAS tells you whether your April campaigns worked.
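
The difference between the two questions is just a grouping key. The toy example below uses invented figures: the same conversions produce very different monthly ROAS depending on whether revenue is grouped by click date or by conversion date.

```python
from collections import defaultdict

# Invented records: each conversion carries both its click month and its conversion month.
conversions = [
    {"click_month": "2024-04", "conv_month": "2024-06", "revenue": 300.0},
    {"click_month": "2024-04", "conv_month": "2024-04", "revenue": 120.0},
    {"click_month": "2024-06", "conv_month": "2024-06", "revenue": 180.0},
]
spend = {"2024-04": 150.0, "2024-06": 60.0}   # hypothetical monthly ad spend

def roas_by(records, spend, date_key):
    """Monthly ROAS, grouping revenue by either the click date or the conversion date."""
    revenue = defaultdict(float)
    for r in records:
        revenue[r[date_key]] += r["revenue"]
    return {month: round(revenue[month] / cost, 1) for month, cost in spend.items()}

print("click-time ROAS:     ", roas_by(conversions, spend, "click_month"))   # steady ~3.0
print("conversion-time ROAS:", roas_by(conversions, spend, "conv_month"))    # 0.8 then 8.0
```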

### The maturation tradeoff

Click-time reporting has a catch: recent periods are always incomplete. Clicks from two weeks ago are still converting. ROAS for recent cohorts is artificially low because conversions are still arriving.

This is where conversion maturation prediction comes in. The system projects what the final ROAS will be based on the maturation pattern of older, fully-converted cohorts. The result: you can read click-time ROAS for last week without waiting a month for conversions to finish arriving.

> The reporting window is invisible infrastructure. Get it wrong and every ROAS number, every budget decision, every scaling call is built on a mirage. Click-time reporting with maturation prediction is the fix.

---

## 05 — The Infrastructure

First-click attribution is a rule. Implementing it correctly requires infrastructure that most analytics stacks don't provide. The SegmentStream Measurement Engine runs three capabilities underneath the attribution output, each solving a specific measurement problem.

### Conversion maturation prediction

Click-time reporting has a lag problem. Clicks from this week have not finished converting yet. ROAS for recent periods is artificially low because conversions are still arriving.

Maturation prediction solves this with statistical projection — not machine learning. The approach:

1. Observe weekly conversion curves for mature cohorts (clicks old enough that conversions have stabilized)
2. Calculate the median maturation period — how many days until a cohort reaches 90%+ of its final conversion count
3. Project forward for immature cohorts using the mature pattern, with a 10% growth threshold to determine when a cohort is considered fully matured
4. Score confidence based on sample size and variance. High-variance cohorts get lower confidence scores and wider prediction intervals

> **[Illustration: Maturation Curve]**
> Horizontal bars showing weekly conversion maturation over 6 weeks: Week 1 (38%), Week 2 (65%), Week 3 (82%), Week 4 (91%), Week 5 (96%), Week 6 (99%). Each bar has a filled portion (observed) and empty portion (remaining expected). A vertical dashed "You are here" line at Week 3 marks 82% tracked. Earlier weeks are fully matured; later weeks show observed counts understating true performance.

The prediction uses a minimum 3-week sample of mature data. Built-in data leakage prevention ensures the model never uses future conversion data to predict current performance.
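
A simplified sketch of the projection step, with invented curves: the median share of final conversions visible at each week since the click is learned from mature cohorts, then used to scale up an immature cohort's observed count. Confidence scoring, the 10% growth threshold, and the leakage guard are omitted.

```python
import statistics

# Invented cumulative conversion shares by weeks since click, observed on mature cohorts
# (clicks old enough that their conversion counts have stabilized).
mature_curves = [
    [0.40, 0.66, 0.83, 0.92, 0.97, 1.00],
    [0.36, 0.63, 0.81, 0.90, 0.96, 1.00],
    [0.38, 0.67, 0.82, 0.91, 0.95, 1.00],
]

# Median share of final conversions already visible after each week.
median_curve = [statistics.median(curve[w] for curve in mature_curves)
                for w in range(len(mature_curves[0]))]

def project_final(observed_conversions: int, weeks_since_click: int) -> float:
    """Scale an immature cohort's observed conversions by the median maturation curve."""
    share_seen = median_curve[min(weeks_since_click, len(median_curve)) - 1]
    return observed_conversions / share_seen

# A cohort clicked 3 weeks ago with 82 tracked conversions projects to roughly 100 final.
print(round(project_final(observed_conversions=82, weeks_since_click=3)))
```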

### Noise smoothing

The smaller the campaign, the noisier the signal. A campaign averaging 2 conversions per day — 14 per week — swings 200% from pure randomness. That isn't a performance change. It's a coin flip.

> **[Illustration: Noise Sample Size]**
> Three horizontal range bars centered on true CPA $60, showing variance at different conversion volumes. 50 conv/day: $54–$66 (±10%), tight range highlighted in accent. 5 conv/day: $30–$90 (±50%), medium range. 1 conv/day: $0–$180 (±100%), full-width range. Bars expand outward from the true CPA center line on scroll.

For most campaigns, neither daily nor weekly data has enough conversions to be statistically meaningful. "CPA spiked 150% yesterday, pause the campaign." "ROAS dropped 40% this week, cut budget." Both are reactions to noise, not signal. The campaign didn't change. The sample is just too small for either reporting window.

Noise smoothing applies sliding-window normalization with anomaly detection. Spikes above 1.5x the window median and drops below 0.5x are flagged and dampened. The maturation confidence score penalizes high-variance cohorts — if the data is noisy, the prediction says so.

> **[Illustration: Noise Smoothing]**
> Two-line overlay chart. Raw daily CPA (gray, jagged line) swings wildly across 14 data points. 7-day rolling average (accent/indigo, stable line) barely moves. The visual gap between the lines IS the noise. End labels show yesterday's raw CPA vs. the 7-day average. The campaign didn't change — the sample size is too small for daily reporting.
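
A minimal sketch of the dampening rule, with invented daily CPA values: each day is compared to its trailing-window median, and values outside the 0.5x to 1.5x band are clamped back to that median. The production system also feeds cohort variance into the maturation confidence score, which is not shown here.

```python
import statistics

def smooth_cpa(daily_cpa, window=7, spike=1.5, drop=0.5):
    """Trailing-window median smoothing with simple anomaly dampening:
    values above spike*median or below drop*median are clamped to the median."""
    smoothed = []
    for i, value in enumerate(daily_cpa):
        trailing = daily_cpa[max(0, i - window + 1): i + 1]
        median = statistics.median(trailing)
        if value > spike * median or value < drop * median:
            value = median                 # flagged as noise, dampened
        smoothed.append(value)
    return smoothed

raw = [58, 61, 140, 55, 63, 25, 59, 62, 57, 150, 60, 58, 64, 61]   # invented daily CPA
print(smooth_cpa(raw))   # the 140, 25, and 150 outliers collapse toward the window median
```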

### Consent modeling

Users who decline cookie consent are invisible to attribution. The identity graph cannot see them — by design. But they still click ads and they still convert. Ignoring them systematically underreports every channel's true performance.

The Measurement Engine fills this gap using bucket-level statistical redistribution. For non-consented clicks, the system captures: date/time, geolocation, user agent, landing page product, and traffic source. For non-consented conversions: date/time, geolocation, user agent, and purchased product.

These signals are matched at the bucket level — channel by geo by product category by time window — to redistribute unattributed conversions proportionally across channels.
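
A toy version of the redistribution, with invented events and only three of the matching dimensions (date, geo, product) for brevity: each unattributed conversion is split across sources in proportion to each source's share of non-consented clicks in the same bucket.

```python
from collections import Counter, defaultdict

# Invented non-consented events, reduced to (date, geo, product) buckets.
clicks = [
    ("2024-06-01", "DE", "shoes", "google / cpc"),
    ("2024-06-01", "DE", "shoes", "meta / paid_social"),
    ("2024-06-01", "FR", "bags", "google / cpc"),
]
unattributed_conversions = [
    ("2024-06-01", "DE", "shoes"),
    ("2024-06-01", "DE", "shoes"),
]

# Count non-consented clicks per bucket and source.
clicks_per_bucket = defaultdict(Counter)
for date, geo, product, source in clicks:
    clicks_per_bucket[(date, geo, product)][source] += 1

# Redistribute each unattributed conversion proportionally to click share in its bucket.
# The result is a modeled estimate, not an individual journey.
modeled = Counter()
for bucket in unattributed_conversions:
    sources = clicks_per_bucket.get(bucket)
    if not sources:
        continue                          # no matching clicks: stays unattributed
    total = sum(sources.values())
    for source, n in sources.items():
        modeled[source] += n / total

print(dict(modeled))   # {'google / cpc': 1.0, 'meta / paid_social': 1.0}
```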

> **[Interactive: Consent Modeling Table]**
> HTML table showing four sources (Google Ads, Meta Ads, Direct, Unattributed) with columns for Clicks, Sessions, Conversions, Consent %, and Modeled conversions. Below the table, an animated SVG shows the bucket matching process: left-side click dimensions (Date & Time, Geolocation, User Agent, Landing Page Product) and right-side conversion dimensions (Date & Time, Geolocation, User Agent, Purchased Product) converge through merge brackets into a central "Bucket Match" box.

The limitation is explicit: consent-modeled conversions have no individual customer journey. They are statistical estimates, not deterministic links. The system labels them accordingly so downstream reports can distinguish between tracked and modeled conversions.

---

## 06 — Calibrated Imperfection

### Two opposing forces

First-click attribution has a built-in balancing mechanism that most people miss. Two forces pull credit in opposite directions:

**Force 1: Stitched journeys push credit up.** When the identity graph successfully links a returning visitor to their original session, first-click credit flows to the channel that introduced them — by definition an acquisition or awareness channel. Better stitching means more upper-funnel credit.

**Force 2: Unstitched journeys push credit down.** When the identity graph fails to link sessions, a returning user's retargeting or brand search visit looks like a first visit. Credit flows to a lower-funnel channel. Worse stitching means more lower-funnel credit.

These forces pull in opposite directions. The stitching rate is the dial. At real-world stitching rates, the result is approximate directional neutrality — not because the system is tuned for it, but because the two error modes naturally counterbalance.

### The retargeting test

You can measure the balance directly. Check the first-click conversion rate on retargeting campaigns.

Retargeting reaches people who already visited your site. If the identity graph correctly stitches the retargeted session to the original visit, first-click credit goes to the channel that originally introduced the user — not to retargeting. Near-zero retargeting credit means Force 1 dominates: stitching is working.

If retargeting shows significant first-click conversions, Force 2 is active: the identity graph is failing to link sessions. The retargeting first-click rate tells you where you sit on the spectrum between upper-funnel bias and lower-funnel leakage.
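
The check itself is one line of arithmetic: the share of attributed conversions whose first-click channel is a retargeting campaign. A sketch with invented channel labels:

```python
def retargeting_first_click_share(conversions) -> float:
    """Share of conversions whose first-click channel is a retargeting campaign.
    Near zero means stitching is holding; a large share means journeys are breaking apart."""
    if not conversions:
        return 0.0
    flagged = sum(1 for c in conversions if "retargeting" in c["first_click_channel"])
    return flagged / len(conversions)

sample = [   # invented attributed conversions
    {"first_click_channel": "meta / prospecting"},
    {"first_click_channel": "google / retargeting"},
    {"first_click_channel": "google / brand_search"},
    {"first_click_channel": "meta / prospecting"},
]
print(f"{retargeting_first_click_share(sample):.0%}")   # 25%
```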

> **[Interactive: Attribution Split View]**
> Three-column comparison table showing 6 channels (Paid Social, Paid Search, Organic, Retargeting, Email, Direct) with first-touch conversion counts "Without Identity Graph" versus "With Identity Graph." Key shifts: Retargeting drops from 18 to 5, Email drops from 12 to 2, Direct drops from 51 to 38. Paid Social rises from 26 to 41, Paid Search from 22 to 32, Organic from 18 to 29. Numbers animate with count-up on scroll.

### The 100% stitching counter

A natural objection: "So first-click only works because stitching is imperfect? At 100% stitching, it would just be an upper-funnel bias machine."

The opposite is true. At 100% stitching, every journey is fully resolved. Every retargeted session is linked to its original visit. Every brand search click is traced back to the campaign that introduced the user. In that scenario, first-click credit goes to the channel that genuinely introduced the person — pure incrementality credit.

What looks like "upper-funnel bias" at perfect stitching is actually the correct answer: the acquisition channel did introduce the user, and every subsequent touchpoint was re-engagement. The directional neutrality at current stitching rates is a fortunate practical property. The theoretical endpoint is even better.

> Perfect attribution doesn't exist. The choice is between a model with a known, disclosed, testable upper-funnel bias — and models with hidden, undisclosed, untestable biases that systematically favor the channels selling the most ads.

---

## 07 — See It Work

Everything described above runs inside the SegmentStream MCP server. Two queries show cross-channel attribution in practice. The first pulls a channel-level attribution report with maturation predictions. The second traces a single user's journey to show which touchpoint received first-click credit and why.

> **[Interactive: Attribution Terminal]**
> Claude Code-style terminal with two tabs. Tab 1 "Attribution Report": prompt "Show first-click attribution by channel for last 30 days with maturation" calls `get_attribution_model` and `get_report_table`. Shows a 6-channel table (Meta Prospecting, Google Search, TikTok Ads, Google Retargeting, Meta Retargeting, Brand Search) with columns for Costs, observed Revenue, observed ROAS, projected Revenue, and projected ROAS. Insight: Meta Prospecting is 1.3x observed but 3.2x projected; TikTok looks unprofitable at 0.9x but projects to 2.4x; retargeting stays below 1.0x projected. Action: shift $6K from retargeting to Meta Prospecting. Tab 2 "User Journey": prompt "Show the attribution path for the most recent conversion" calls `get_user_journey`. Shows a 3-device, 4-session journey (iPhone Instagram in-app via meta/paid_social with fbclid, iPhone Safari via IP match, Desktop Chrome via user_id match with google/retargeting and google/brand_search) ending in a $312 purchase. Attribution result: First-click, 100% credit to meta/paid_social.

### What the report shows

The attribution report combines three layers: first-click credit assignment (which channel introduced the user), click-time reporting (revenue attributed to the click date, not the conversion date), and maturation prediction (projected final ROAS for immature cohorts).

The user journey query exposes the full resolution path: every anonymous session, the identity signals that linked them into a single `universal_id`, the first session that received attribution credit, and the traffic source of that session.

### Validating the output

Every number in the attribution report is auditable. You can trace a conversion to a specific `universal_id`, see every session in that user's journey, verify which session was first within the attribution window, and confirm the traffic source recorded for that session.

The data lives in your BigQuery warehouse. You can query it directly. The attribution logic is deterministic: given the same inputs, it produces the same outputs. There is no black box.
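
As an illustration of that audit, the sketch below pulls every session for one `universal_id` within a 30-day window and flags the earliest one. It uses the official BigQuery Python client; the table and column names are assumptions for illustration, not the actual SegmentStream schema.

```python
from datetime import datetime, timezone
from google.cloud import bigquery

client = bigquery.Client()

# Hypothetical dataset, table, and column names: adjust to your own warehouse schema.
sql = """
SELECT
  session_id,
  started_at,
  traffic_source,
  started_at = MIN(started_at) OVER () AS is_first_click
FROM `your_project.analytics.sessions`
WHERE universal_id = @universal_id
  AND started_at BETWEEN TIMESTAMP_SUB(@converted_at, INTERVAL 30 DAY) AND @converted_at
ORDER BY started_at
"""
job = client.query(sql, job_config=bigquery.QueryJobConfig(query_parameters=[
    bigquery.ScalarQueryParameter("universal_id", "STRING", "u_1a2b3c"),
    bigquery.ScalarQueryParameter("converted_at", "TIMESTAMP",
                                  datetime(2024, 6, 15, 10, 0, tzinfo=timezone.utc)),
]))
for row in job:
    print(row.session_id, row.started_at, row.traffic_source, row.is_first_click)
```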
