CDAI Engine Validation Case Study — Real Tests, Real Results | Allocera Intelligence

The Engine That Refuses to Lie

Document Type: Validation Case Study
Test Date: May 2026
Test Organizations: 2 Independent Businesses
Data Quality: Real-World, Flawed

Most marketing analytics tools display confident numbers on incomplete data. CDAI does the opposite: when the data isn't trustworthy, the engine flags it and refuses to issue directives. Here's what happened when we ran it on two real businesses with real data quality problems.

The Question Every CFO Asks

"How do I know your tool isn't just hallucinating numbers?"

It's the question that kills analytics deals. Sophisticated buyers — CFOs, in-house data teams, finance committees — have seen too many dashboards confidently display figures that don't survive a five-minute audit. The MarTech category is full of tools that calculate ROAS to two decimal places on data they don't actually have.

CDAI was built specifically against that failure mode. The engine measures one thing — true contribution margin per campaign, after every cost layer the platforms don't report — and issues one of five enforceable directives: SCALE, HOLD, CUT, PAUSE, FLAG.

This document is an operational record. Two real businesses, real data, real engine output. What the engine did. What it refused to do. Why both matter.

What CDAI Measures

Most marketing analytics tools answer the question: "What did this campaign return in revenue divided by ad spend?"

CDAI answers a different question: "What did this campaign actually contribute to the bottom line, after every cost the platforms don't report?"

The Seven-Cost Stack

Platform-reported cost-per-lead and cost-per-acquisition reflect only the costs the platform controls — its own media spend. CDAI reconciles all seven cost layers present in real lead-gen and acquisition operations:

Cost Layer | Visible to Google/Meta?
1. Media spend | Yes — the only layer the platforms accurately report
2. Platform fees | Partial — platform-side fees only, not third-party tooling
3. Broker payouts | No — agency margin and lead vendor fees are invisible to platform reporting
4. Refunds | No — post-conversion refund data sits in finance systems, not ad systems
5. Chargebacks | No — payment processor data, weeks-to-months delayed
6. Compliance costs | No — TCPA, HIPAA, state-specific compliance overhead never enters ad reporting
7. Variable costs | No — intake fallout, sales close rate, fulfillment cost

In real operations, the gap between what platforms report (layer 1, sometimes part of layer 2) and the true cost-per-bound-acquisition can be 30 to 70 percent. The campaign that looks profitable on a Google Ads dashboard is often the campaign that's quietly destroying margin once the full cost stack is reconciled.
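The arithmetic behind that gap is simple to sketch. The snippet below is illustrative only — the field names and dollar figures are hypothetical examples, not CDAI's actual schema or a real client's numbers — but it shows how a 30-to-70-percent understatement falls out of summing the layers the platforms never see:

```python
from dataclasses import dataclass

@dataclass
class CampaignCosts:
    """All figures in dollars. Field names are illustrative, not CDAI's schema."""
    media_spend: float     # layer 1 - the only layer platforms report fully
    platform_fees: float   # layer 2
    broker_payouts: float  # layer 3
    refunds: float         # layer 4
    chargebacks: float     # layer 5
    compliance: float      # layer 6
    variable_costs: float  # layer 7

def true_cost(c: CampaignCosts) -> float:
    """Full seven-layer cost, versus the media spend a dashboard shows."""
    return (c.media_spend + c.platform_fees + c.broker_payouts
            + c.refunds + c.chargebacks + c.compliance + c.variable_costs)

def reporting_gap(c: CampaignCosts) -> float:
    """Fraction by which platform-reported cost understates true cost."""
    return 1 - c.media_spend / true_cost(c)

# Hypothetical campaign: $2,000 of media spend carries $1,400 of invisible cost.
costs = CampaignCosts(media_spend=2000, platform_fees=150, broker_payouts=600,
                      refunds=180, chargebacks=90, compliance=120, variable_costs=260)
print(f"reported: ${costs.media_spend:,.0f}  true: ${true_cost(costs):,.0f}  "
      f"gap: {reporting_gap(costs):.0%}")
```

With these example numbers the dashboard shows $2,000 while the true cost is $3,400 — a 41% gap, squarely inside the 30-to-70-percent range seen in real operations.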

CDAI exists to close that gap. The rest of this document is what happened when it ran on real data.

Test Case 1: E-Commerce Business

Industry: E-commerce, regulated product category (mushroom cultivation supplies)
Reach: Ships to 60+ countries
Founded: 2020 (5+ years operating)

The Test

The business ran a Google Search Ads campaign from August 31, 2025 to December 2, 2025. The campaign had concluded, with full historical data exportable from Google Ads.

Test Parameters
Data source: Real exported Google Ads CSV
Campaign period: August 31, 2025 – December 2, 2025 (~94 days)
Total spend: $2,112.17
Conversions reported: 37
Ingestion method: Direct SQL via ingestion_pipeline.py
Data age at time of test: ~4 months stale

What the Engine Did

✓ Step 1: Ingestion
Successfully Loaded Real Platform Data
Engine connected to live Supabase production database. Real Google Ads export ingested via SQL into the campaigns table. Campaign and cost event records inserted with correct org_id binding. No errors.

This confirms what every analytics tool claims but few actually demonstrate: that the engine can read real exported platform data, not just demo fixtures.
✓ Step 2: Calculation
Ran Directive Logic Without Runtime Error
Engine successfully read campaigns from the database, attempted contribution margin calculation, and ran the directive logic end-to-end without runtime error.
✓ Step 3: Health Monitor
Engine Refused to Issue Directives on Stale Data
Before issuing any directives, CDAI runs a health monitor against incoming data. The monitor evaluates data freshness, completeness, and integrity, and sets a single boolean: directive_safe.

On this data, the monitor identified the campaign as approximately four months stale and set directive_safe = FALSE.

The engine refused to issue directives.

This is the behavior most analytics tools do not have. A SCALE directive issued on stale data is worse than no directive at all — it tells a marketer to commit budget to a campaign whose underlying conditions may have changed. CDAI's health monitor exists specifically to prevent that failure mode.
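The shape of that gate can be sketched in a few lines. This is a minimal illustration, not CDAI's actual monitor: the 30-day threshold and function names are assumptions chosen for the example, and the dates mirror Test 1's timeline:

```python
from datetime import datetime, timedelta, timezone

# Illustrative threshold - CDAI's actual staleness limit is not disclosed here.
STALENESS_LIMIT = timedelta(days=30)

def directive_safe(last_cost_event: datetime, now: datetime) -> bool:
    """Freshness gate: stale data means no directives, not degraded directives."""
    return (now - last_cost_event) <= STALENESS_LIMIT

# Campaign ended December 2, 2025; the audit ran roughly four months later.
last_event = datetime(2025, 12, 2, tzinfo=timezone.utc)
audit_time = datetime(2026, 4, 1, tzinfo=timezone.utc)
print(directive_safe(last_event, audit_time))  # False - no directives issued
```

The point of the boolean is that it sits upstream of every directive: when it is False, the directive logic never runs, so there is no "qualified" output to misread.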
✓ Step 4: Multi-Tenant Isolation
Zero Cross-Organization Data Leakage
The data was ingested into its own organization within the multi-tenant Supabase architecture. Row-Level Security policies were active. Subsequent queries filtering by org_id returned only this organization's data — no cross-organization data appeared, even when running against the same database that contained other test organizations.

This is the architectural foundation that allows CDAI to safely serve multiple clients from the same engine. It was verified on real data, not just simulation.
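The scoping discipline can be shown with an in-memory stand-in. The sketch below uses SQLite to simulate the pattern — every read path is bound to one org_id — while in production the same constraint is enforced at the database layer by Supabase Row-Level Security. Table schema, org names, and figures here are illustrative:

```python
import sqlite3

# In-memory stand-in for the multi-tenant campaigns table (schema is illustrative).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE campaigns (org_id TEXT NOT NULL, name TEXT, spend REAL)")
db.executemany("INSERT INTO campaigns VALUES (?, ?, ?)", [
    ("org_ecom", "Search - Branded", 2112.17),
    ("org_health", "Meta - Senior Care", 2414.12),
])

def campaigns_for(org_id: str) -> list[tuple]:
    # Every query is parameterized by a single org_id; rows from other
    # organizations in the same physical database never come back.
    return db.execute(
        "SELECT name, spend FROM campaigns WHERE org_id = ?", (org_id,)
    ).fetchall()

print(campaigns_for("org_ecom"))  # only this organization's rows
```

Row-Level Security moves this filter out of application code and into the database itself, so even a query that forgets the filter cannot leak another tenant's rows.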
What This Validates:
• CDAI ingests real exported platform data without modification or fabrication
• The engine calculates against whatever cost layers are populated and does not invent values for layers that aren't
• The health monitor functions as a safety gate — directives do not issue on data the engine cannot stand behind
• Multi-tenant data isolation works on real data, not just architecture diagrams

Test Case 2: Healthcare Services Business

Industry: Healthcare services, senior care vertical
Location: North Carolina
Engagement: Pilot client (non-paying)
Onboarding Date: April 23, 2026

The Test

The business provided 3.5 months of Meta Ads Manager exports from January 1, 2026 to April 22, 2026, plus a CRM contacts file. The data set represents a real-world condition: messy, partial, and with one critical attribution gap.

Test Parameters
Campaigns ingested: 7
Cost events ingested: 8
Total Meta spend: $2,414.12
Total impressions: 62,211
Lead records received: 276 contacts in CRM export
Lead records ingested: 0 — see analysis below

What the Engine Did

✓ Step 1: Multi-Tenant Setup
Production Architecture Verified
New organization created in the orgs table. Meta Ads channel created and bound to the org_id. Client user account created and linked to the organization via the client_users table. Row-Level Security policies confirmed active on every data table.

Cross-tenant isolation verified: The data was inserted into the same physical database that already contained the first organization's data. Subsequent queries filtering by org_id returned only this organization's data. The architectural isolation works on real client data — not just in test fixtures.
✓ Step 2: Campaign and Cost Ingestion
Data Preserved Exactly As Exported
Seven Meta campaigns ingested via SQL, all tagged with the correct org_id. Campaign records and cost events were created with values preserved exactly as exported from Meta Ads Manager — no alterations, no inferred values, no fabricated columns to fill gaps.

This is a discipline most ingestion pipelines abandon at scale. When a CSV has a missing column, the easy path is to fill it with a sensible default and keep going. CDAI does not. If the data isn't there, it isn't there.
✓ Step 3: The Lead Attribution Gap
Engine Flagged Missing Attribution, Refused to Invent
The CRM contacts file contained 276 lead records. The Source column on every lead read "Contact Import" — meaning no campaign attribution existed at the lead level. There was no way to determine which Meta campaign generated which lead.

Without that attribution, the engine cannot calculate true cost-per-lead by campaign, contribution margin by campaign, or issue SCALE / HOLD / CUT / PAUSE / FLAG directives at the campaign level. These are the engine's primary outputs.

The engine did not attempt to fabricate the attribution it lacked. Lead records were not ingested. The audit pipeline halted at the data quality gate.

A weaker tool would have done one of three things: (a) ingested the leads with null campaign_id values and silently degraded the analysis, (b) used a heuristic — chronological proximity, last-touch fuzzy match — to invent attribution, or (c) issued directives anyway, qualified by a footnote. CDAI did none of those. The audit was incomplete because the data was incomplete, and the engine reported that condition rather than disguising it.
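The gate that produced this outcome can be sketched as a simple admit-or-report split. Field names (`source`, `campaign_id`) are illustrative, not CDAI's schema, and the loop mirrors Test 2's condition: 276 CRM contacts, all sourced as "Contact Import" with no campaign attribution:

```python
def attribution_gate(leads: list[dict]) -> tuple[list[dict], list[str]]:
    """Admit only leads with real campaign attribution; report the rest
    as findings instead of filling the gap with nulls or heuristics."""
    admitted, findings = [], []
    for lead in leads:
        if lead.get("campaign_id"):
            admitted.append(lead)
        else:
            findings.append(f"lead {lead['id']}: source={lead.get('source')!r}, "
                            "no campaign attribution - not ingested")
    return admitted, findings

# Mirrors Test 2: every CRM record arrived unattributed.
crm_export = [{"id": i, "source": "Contact Import", "campaign_id": None}
              for i in range(276)]
admitted, findings = attribution_gate(crm_export)
print(len(admitted), len(findings))  # 0 276 - the audit halts at the quality gate
```

The key design choice is the return type: the gap comes back as findings, a first-class output, rather than as silently degraded rows inside the analysis.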
✓ Step 4: Health Monitor
Refused to Issue Directives on Incomplete Data
With incomplete attribution upstream, the health monitor flagged the data set as not directive-safe. directive_safe = FALSE. No directives were issued.

This is the same behavior demonstrated on the first test. The trigger condition was different (stale data on Test 1; missing attribution on Test 2), but the engine's response was the same: refuse to issue directives the data cannot support.
What This Validates:
• Multi-tenant isolation works on real client data, in production, against a database that already contains other organizations
• The engine does not fabricate attribution to fill gaps — it surfaces the gap as a finding
• The health monitor responds to multiple distinct integrity failure modes, not just one
• CDAI's data quality discipline is structural, not optional

What These Two Tests Prove Together

Two real businesses. Two different industries. Two different platforms (Google Ads and Meta Ads). Two different data quality problems (stale data and missing attribution). One consistent engine behavior.

Test Dimension | Test 1 | Test 2
Platform | Google Ads | Meta Ads
Industry | E-commerce | Healthcare / senior care
Data Condition | Complete but ~4 months stale | Recent but missing campaign attribution
Ingestion | Successful | Successful (campaigns + costs); halted on lead attribution gap
Multi-Tenant Isolation | Verified | Verified
Health Monitor Result | directive_safe = FALSE (stale) | directive_safe = FALSE (incomplete)
Directives Issued | None — by design | None — by design
Fabricated Values | Zero | Zero

The Behavior Worth Naming

CDAI is built on a principle that's rare in the marketing analytics category: the engine refuses to issue an output it cannot stand behind.

That sounds obvious. In practice, it's not. The competitive landscape is full of tools that confidently display modeled conversions, attributed revenue, and projected ROAS — calculated on data with significant integrity issues, presented without any indication that the underlying inputs were incomplete or stale. Most CFOs evaluating these tools have learned to discount the headline numbers by 30 to 50 percent before believing them.

The two tests above demonstrate the opposite posture. When the data was complete enough but stale, the engine flagged it and refused to issue directives. When the data was recent but missing critical attribution, the engine surfaced the gap and refused to invent the attribution. In neither case did CDAI produce a confident-looking output that wouldn't have survived scrutiny.

That refusal-to-fabricate behavior is the single most important property an analytics engine can have when the buyer is a CFO with budget authority and a long memory for tools that lied to them.

What CDAI Looks Like With Complete Data

So that the integrity-first picture above isn't mistaken for a limitation, here's what CDAI's full output looks like when the cost stack is populated and the data is current.

In a separate prior test, the engine was run against a complete simulated data set — six campaigns, full seven-layer cost stack populated, all data current. The engine executed end-to-end without error and produced one directive per campaign across all five directive types:

  • SCALE — for campaigns where true contribution margin justified increased spend
  • HOLD — for campaigns with developing data not yet conclusive
  • CUT — for campaigns where true margin was negative once the full cost stack was reconciled
  • PAUSE — for campaigns showing anomalies under review
  • FLAG — for campaigns requiring manual review

Each directive carried a numeric confidence score and a reason code. The directive sheet is the engine's primary client-facing artifact and is what every paying audit will deliver.

Production Evidence — Automated Health Checks
system_health table (May 7-8, 2026):
→ Health checks running automatically every few minutes to hours
→ Confirmed 15+ automated entries with timestamps
→ All 5 check types executing: cost_freshness, lead_freshness, spend_reconciliation, attribution_integrity, directive_freshness
→ Scheduler deployed and operational via Render backend
Directive Issuance Evidence
directive_events table:
→ 121 directives issued across 2 organizations
→ Span: 32-39 days of directive coverage
→ Types issued: SCALE, HOLD, CUT, PAUSE, FLAG
→ All directives confidence-scored: HIGH, MEDIUM, LOW
→ Zero directives issued when directive_safe = FALSE

The simulated test confirms the engine's full output capability. The two real-data tests above confirm the engine's behavior under real-world data quality conditions. Both are necessary. Neither is sufficient on its own.

Why This Matters to a Buyer

If you run paid acquisition at any meaningful scale — six figures monthly or more — three things follow directly from these tests.

One · Your reported cost-per-acquisition is structurally incomplete

The platforms report what they control. They do not reconcile broker margin, refunds, chargebacks, or compliance overhead. A campaign that looks profitable in your Google Ads or Meta dashboard is often the campaign that's destroying margin once the full cost stack is reconciled. The 30 to 70 percent gap between reported CPL and true cost-per-bound-acquisition is not a hypothesis — it's structural arithmetic that compounds every month you don't measure it.

Two · The tools that say they fix this often don't

The contribution-margin-aware tools in market today are largely Shopify-native, optimized for e-commerce SKU economics. They do not handle lead-gen broker payouts, regulated-industry compliance overhead, multi-platform partner ecosystems, or refund-and-chargeback dynamics specific to high-ticket services. CDAI is built for these conditions specifically.

Three · The engine you choose has to refuse to lie

If your CFO is going to make capital allocation decisions on the engine's output, the engine has to be willing to tell you when it doesn't have enough data to recommend an action. CDAI does this. The two tests above are the proof.

The Question That Reveals the Gap

"Do you know your actual cost per signed case after lead vendor margin, intake fallout, and chargebacks?"
If they need a moment: "Let me put it another way: Do you know your true cost per signed case after your lead vendor takes their margin, after prospects fall out during intake, and after refunds and chargebacks hit?"

Think of anyone who runs or owns:

  • A personal injury or mass tort law firm — spending on lead vendors like EvenUp, Axiom, Litigation Leaders, or running their own Google/Meta campaigns
  • An insurance agency or Medicare brokerage — buying leads from aggregators or running direct acquisition
  • A senior care or assisted living facility — acquiring patients through lead vendors, referral networks, or paid ads
  • A home services company — roofing, solar, HVAC, water damage restoration — buying HomeAdvisor, Angi, Thumbtack leads or running their own campaigns
  • Any business spending serious money buying leads from vendors — if they're writing five-figure monthly checks to lead sources and can't tell you the true cost per customer that actually closes and stays closed
Almost nobody can answer this question. That's the whole point. If anyone comes to mind, a warm intro is all we need — not asking for a pitch, just an introduction so we can ask them the question above.

See What Your Platform Isn't Telling You

Most businesses are making capital allocation decisions on numbers that are wrong by 30–70%. The gap between what your dashboard reports and what you're actually paying is structural arithmetic — it compounds every month you don't measure it.

A 30-day distortion audit on your campaign data costs $3,500 and delivers within 7 days. If we don't surface margin distortion you weren't tracking, you don't pay.

Request a Distortion Audit

Technical Architecture Summary

The CDAI Engine is deployed on:

  • Database: Supabase (PostgreSQL) with Row-Level Security policies
  • Backend: Python on Render
  • Portal: React on Vercel (live at cdai-portal.vercel.app)
  • Scheduler: Deployed and running automated health checks
  • Email Delivery: Operational via Resend integration

Multi-tenant capable. Zero consumer PII stored in any table.

Full Methodology Available Under NDA: The detection logic, directive classification engine, and attribution reconciliation framework are original intellectual property. Complete technical disclosure is available to qualified clients and partners under signed NDA.