Why your attribution numbers don't add up (and 5 fixes that actually work)
Most attribution projects fail. Not because the team picked the wrong model, and not because the tool didn't work. They fail because the underlying touchpoint data is a mess.
Bad touchpoint data looks like this:
- First-touch source is "unknown" for 38% of leads
- Two channels that should add to 100% total credit actually add to 83%
- The same customer appears under three different company records
- Ad platform reports show 2× the conversions your CRM shows
- Revenue by channel doesn't reconcile to revenue by lead source
If any of these sounds familiar, no attribution model is going to save you. This post is about the five data hygiene problems that produce those symptoms, ordered by how much damage they do.
Problem 1: UTM sprawl and inconsistency
The symptom
You look at your CRM's lead source report and see: google-ads, Google Ads, google/ads, google_ads, Google, google, paid_search, paidsearch, and g-ads — all of which are the same thing.
The cause
No one defined the UTM parameter taxonomy. Every marketer adds campaigns with whatever UTM values feel right at the time. After 18 months you have 40 variations of 8 actual channels.
The fix
- Publish a UTM taxonomy document. One page. Format: utm_source={platform} (one of 6–10 approved values), utm_medium={channel-type} (cpc, social, email, referral, etc.), utm_campaign={campaign-id}. Include examples for every approved combination.
- Build a normalization layer. Don't rely on marketers to get UTMs right. Have your CRM (or a pipeline step) normalize google-ads, Google Ads, and g-ads all to google_ads before saving. Use regex or an allow-list.
- Flag unknown UTM values at ingest. If a UTM source doesn't match the allow-list, log a warning and flag the lead for review.
- Audit weekly for the first 3 months. UTM discipline doesn't stick until it's enforced. Once the normalization layer catches 100% of variations, you can drop this.
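As a rough illustration of the normalization layer and the allow-list check, here is a minimal Python sketch. The allow-list contents, the CANONICAL_SOURCES name, and the normalize_utm_source function are all hypothetical; your approved values and variants will differ.

```python
import re

# Hypothetical allow-list: canonical source -> variants seen in the wild
# (variants are listed in their cleaned, lowercase, underscore form).
CANONICAL_SOURCES = {
    "google_ads": {"google_ads", "google", "g_ads", "paid_search", "paidsearch"},
    "linkedin_ads": {"linkedin_ads", "linkedin", "li_ads"},
}

# Reverse lookup: cleaned variant -> canonical source.
_VARIANT_TO_CANONICAL = {
    variant: canonical
    for canonical, variants in CANONICAL_SOURCES.items()
    for variant in variants
}


def normalize_utm_source(raw):
    """Return (source, needs_review) for a raw utm_source value."""
    if raw is None:
        return None, True
    # Lowercase, then collapse spaces, hyphens, slashes, and underscores
    # into a single underscore so "Google Ads" and "google/ads" match.
    cleaned = re.sub(r"[\s\-/_]+", "_", raw.strip().lower()).strip("_")
    canonical = _VARIANT_TO_CANONICAL.get(cleaned)
    if canonical is None:
        # Not on the allow-list: keep the cleaned value and flag for review.
        return cleaned or None, True
    return canonical, False


for value in ["google-ads", "Google Ads", "google/ads", "g-ads", "bing"]:
    print(value, "->", normalize_utm_source(value))
```

Anything that comes back with needs_review set to True is a candidate for the weekly audit: either a typo to add as a variant, or a genuinely new source that needs a taxonomy decision.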
Problem 2: CRM-marketing dedupe failure
The symptom
A customer at "Acme Corp" appears in your CRM as three different contact records with three different first-touch sources. Their attribution is fragmented across three channels. The analytics team can't get a clean view of the account.
The cause
Email capture forms don't dedupe against existing CRM records. Marketing sends a lead to CRM; sales creates a new contact for "the same person." A webinar form submission creates another. Each has its own lead source.
The fix
- Dedupe on email (case-insensitive, normalized). Every lead-capture source must check if an existing contact has the same email before creating a new one. This is basic but widely skipped.
- Store attribution at the contact level AND the account level. For B2B, the account is often what matters: 4 people from Acme Corp interacting with your brand should roll up to one "Acme Corp account attribution" view.
- Don't overwrite first-touch on update. When a known contact re-engages, do not overwrite their first-touch source. Many CRMs do this by default and it destroys your multi-touch history.
- Audit monthly for duplicate companies. Same as emails but for company records. "Acme Corp" / "Acme Corporation" / "acme" / "ACME Inc": merge them. Ongoing.
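The first and third fixes can be sketched together as an "upsert" that looks up the normalized email before creating anything. This is a minimal Python sketch, not any CRM's actual API: the Contact class, the upsert_contact function, and the in-memory dict standing in for the CRM are all assumptions, and "normalized" here only means trimmed and lowercased.

```python
from dataclasses import dataclass, field


@dataclass
class Contact:
    email: str
    first_touch: str                              # set once, never overwritten
    touches: list = field(default_factory=list)   # every source, oldest first


def normalize_email(email):
    # Assumption: normalization is just trimming whitespace and lowercasing.
    return email.strip().lower()


def upsert_contact(contacts, email, source):
    """Create the contact if the email is new; otherwise append the touch.

    `contacts` is a dict keyed by normalized email, a stand-in for a CRM lookup.
    """
    key = normalize_email(email)
    contact = contacts.get(key)
    if contact is None:
        contact = Contact(email=key, first_touch=source)
        contacts[key] = contact
    # Re-engagement is recorded as a new touch; first_touch stays untouched.
    contact.touches.append(source)
    return contact


crm = {}
upsert_contact(crm, "Jane@Acme.com", "google_ads")   # marketing form
upsert_contact(crm, " jane@acme.com ", "webinar")    # webinar signup
print(len(crm), crm["jane@acme.com"].first_touch, crm["jane@acme.com"].touches)
```

The same pattern applies to company records, with a normalized company name or domain as the key instead of the email.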
Problem 3: cookie loss and cross-device tracking
The symptom
Your first-touch data is blank (or "direct / none") for a suspicious percentage of leads — often 20–40%. You know these leads came from somewhere, but the first-touch attribution is missing.
The cause
- Users browse on mobile, convert on desktop (or vice versa). Cookies don't follow them.
- Safari's Intelligent Tracking Prevention (ITP) and similar privacy features cap the lifetime of script-set tracking cookies at 7 days or less.
- Users clear cookies.
- Cross-subdomain tracking wasn't set up properly (cookies on www.elir.app don't follow to docs.elir.app).
The fix
- First-party tracking wherever possible. Use first-party cookies (set from your own domain) and server-side tracking. Third-party cookies are blocked or restricted by default in Safari and Firefox; stop relying on them.
- Persist attribution data server-side. Once captured, store first-touch in your CRM / data warehouse. Don't rely on client-side cookies to preserve it.
- Cross-subdomain configuration. If you have multiple subdomains (app.example.com, docs.example.com), configure your analytics to share session data across them.
- Accept some loss. You'll never recover 100% of first-touch data. If your "direct / none" bucket is under 15%, you're doing fine. Over 25% suggests a tracking setup problem worth investigating.
- Backfill with probabilistic models where possible. For organic search, use keyword-level probabilistic attribution to recover some of what cookies can't.
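The 15% and 25% thresholds are easy to turn into a recurring check. A minimal Python sketch follows; the direct_share_status name, the set of labels treated as "missing", and the "watch" label for the 15–25% band are assumptions for illustration.

```python
# Hypothetical labels that count as a missing first touch.
MISSING_LABELS = {"", "direct", "direct / none", "unknown"}


def direct_share_status(first_touch_sources):
    """Return (share, status) for a list of first-touch source labels."""
    if not first_touch_sources:
        raise ValueError("no leads to evaluate")
    n_missing = sum(
        1 for s in first_touch_sources
        if (s or "").strip().lower() in MISSING_LABELS
    )
    share = n_missing / len(first_touch_sources)
    if share < 0.15:
        status = "fine"          # under 15%: normal loss
    elif share > 0.25:
        status = "investigate"   # over 25%: likely a tracking setup problem
    else:
        status = "watch"         # in between: assumed label, not from the post
    return share, status


leads = ["google_ads"] * 7 + ["direct / none"] * 3
print(direct_share_status(leads))
```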
Problem 4: double-counting across platforms
The symptom
Google Ads reports 200 conversions. Facebook reports 180 conversions. Your CRM shows 230 total customers. But the channel breakdown adds up to 380 (200 + 180 = 380). Something is wrong.
The cause
Ad platforms credit their own platform for conversions they touched, even if they weren't the last touch. Google Ads counts a conversion when there was an ad click within its attribution window (30 days by default) before the conversion. Facebook applies the same logic with its own window. When a single customer clicked both, both platforms claim credit, and your channel numbers are inflated.
The fix
- Trust your CRM, not the ad platforms. Ad platforms are optimized to claim credit. Your CRM is (in theory) optimized to track one source of truth.
- Use a single attribution model as the source of truth. Apply the same multi-touch attribution model to credit each channel fractionally. Fractional credits across channels should sum to the total number of customers, with no double-counting.
- Use ad platform data only for campaign optimization, not business reporting. Facebook's conversion count inside Facebook Ads Manager is for bidding. Your CRM's attribution is for business decisions. Don't confuse the two.
- Reconcile monthly. Pull ad platform conversions, pull CRM channel attribution, reconcile the difference. Any gap over 10% should be investigated.
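The monthly reconciliation can be as simple as a per-channel comparison. In this Python sketch the reconcile_channels function, the input dicts, and the sample numbers are hypothetical, and the gap is measured relative to the CRM figure, which is one reasonable definition among several.

```python
def reconcile_channels(platform_conversions, crm_attributed, threshold=0.10):
    """Compare ad platform conversion counts with CRM-attributed credit.

    Both arguments map channel name -> count. The gap is measured relative
    to the CRM figure (an assumption; the choice of base is up to you).
    """
    report = {}
    for channel, platform_count in platform_conversions.items():
        crm_count = crm_attributed.get(channel, 0.0)
        if crm_count == 0:
            # No CRM credit at all: any platform conversions are a full gap.
            gap = float("inf") if platform_count else 0.0
        else:
            gap = abs(platform_count - crm_count) / crm_count
        report[channel] = {
            "platform": platform_count,
            "crm": crm_count,
            "gap": gap,
            "investigate": gap > threshold,
        }
    return report


# Made-up monthly numbers for illustration only.
platform = {"google_ads": 200, "facebook_ads": 105}
crm = {"google_ads": 125.0, "facebook_ads": 100.0}
for channel, row in reconcile_channels(platform, crm).items():
    print(channel, row)
```

A channel that exceeds the threshold is not automatically wrong; it is a prompt to check attribution windows, view-through settings, and duplicate conversion tags.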
Problem 5: return-visit and multi-session noise
The symptom
The last-touch source is "direct" for 40% of leads. That seems high, since direct traffic isn't usually the dominant acquisition channel. Something's off.
The cause
Return visitors often appear as "direct" because their session starts with a bookmark or a typed URL, not a UTM-tagged link. Their actual journey included paid search, content, social, etc. — but only the first click was tagged, and subsequent visits lost the attribution chain.
The fix
- Preserve first-touch for the entire customer lifecycle. Don't overwrite it on return visits. The first-touch is canonical.
- Track session-level sources but attribute from journey-level sources. For attribution, use the full journey (all touches), not just the latest session.
- Use multi-touch attribution, not last-touch, precisely because last-touch is where this problem shows up worst. Multi-touch distributes credit so "direct" return visits don't eat all the credit.
- Instrument return-visit prompts if they exist. If a customer saw a retargeting ad before their "direct" visit, that retargeting ad is the real last touch. Integrate ad platform data with your first-party analytics to capture it.
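To make the difference from last-touch concrete, here is a minimal Python sketch of a linear (equal-weight) multi-touch model. The post doesn't prescribe a specific model, so linear is swapped in as the simplest example; the linear_credits function and the sample journeys are hypothetical.

```python
from collections import defaultdict


def linear_credits(journeys):
    """Split one unit of credit per customer equally across their touches.

    `journeys` holds one list of touch sources per customer, oldest first.
    """
    credits = defaultdict(float)
    for touches in journeys:
        if not touches:
            # No recorded touches: keep the customer so totals still reconcile.
            credits["unknown"] += 1.0
            continue
        share = 1.0 / len(touches)
        for source in touches:
            credits[source] += share
    return dict(credits)


# Made-up journeys: every customer's final session is "direct".
journeys = [
    ["google_ads", "content", "direct"],
    ["direct"],
    ["facebook_ads", "direct"],
]
credits = linear_credits(journeys)
print(credits)
print("total credit:", sum(credits.values()))
```

Under last-touch, "direct" would take all three customers in this example. Under the linear model it takes a share of each journey, and the credits still sum to the number of customers.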
The reconciliation test
Once you've addressed these five problems, run the reconciliation test:
- Total customers in a period: 80
- Sum of fractional multi-touch credits across all channels: should equal 80 (or extremely close — 78–82 is fine, accounting for rounding)
- Total booked revenue: $3.2M
- Sum of attributed revenue by channel: should equal $3.2M
If these reconcile, your attribution data is trustworthy. If they don't, find the gap — it's one of the five problems above.
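The test itself is a pair of sum checks. A minimal Python sketch, using the 80-customer and $3.2M totals from above; the reconciles function, the per-channel breakdowns, and the revenue tolerance are made up for illustration.

```python
def reconciles(channel_values, expected_total, tolerance):
    """True if the per-channel values sum to the expected total within tolerance."""
    return abs(sum(channel_values.values()) - expected_total) <= tolerance


# Hypothetical per-channel breakdowns for the period.
credits_by_channel = {"google_ads": 31.5, "content": 22.25, "email": 14.0, "direct": 12.25}
revenue_by_channel = {"google_ads": 1_300_000, "content": 900_000, "email": 550_000, "direct": 450_000}

# Customers: 80 expected, 78 to 82 accepted (tolerance of 2).
print("credits reconcile:", reconciles(credits_by_channel, 80, tolerance=2))
# Revenue: $3.2M expected; the $1,000 tolerance is an assumption.
print("revenue reconciles:", reconciles(revenue_by_channel, 3_200_000, tolerance=1_000))

# Summing raw ad platform counts (200 + 180) against 230 customers fails.
print("platform counts reconcile:", reconciles({"google_ads": 200, "facebook_ads": 180}, 230, tolerance=2))
```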
The order to fix them
If you're starting from scratch, fix in this order:
- UTM sprawl (highest leverage)
- CRM dedupe (second-highest)
- Return-visit noise (moderate)
- Double-counting (usually a reporting fix, not a data fix)
- Cookie loss (unfixable fully; just accept and work around)
Problems 1 and 2 are fixable with one week of engineering effort each. Problems 3 and 5 require ongoing discipline. Problem 4 is about report consumption, not underlying data.
Connecting to the full RevOps picture
Clean attribution data is a prerequisite for every other RevOps report:
- CAC by channel requires accurate channel attribution
- Pipeline velocity requires clean opportunity source data
- Win rate by channel requires reliable channel tagging at the opportunity level
- The Monday morning revenue dashboard requires all the above
In other words: data hygiene isn't a nice-to-have. It's the foundation.
Where Elir fits
Elir's attribution engine has normalization, dedupe, and reconciliation built in — the cleanups that most teams have to build themselves in a warehouse. If you want to skip 6 weeks of data hygiene work and get to reporting faster, book a 20-minute walkthrough.
TL;DR
Attribution reports that don't reconcile are almost never a model problem — they're a data problem. Five culprits: UTM sprawl, CRM dedupe failures, cookie loss, double-counting, and return-visit noise. Fix UTMs first (highest leverage). Accept some cookie loss (unfixable). Reconcile monthly by summing fractional channel credits to total customers. Without clean data, no attribution model produces trustworthy output.