Performance Max Over-Reports. Here’s the Calibration Playbook.

PMax routinely over-reports ROAS by claiming credit for sales that would have closed organically. The most-cited single-brand geo-lift puts the gap at 33%; multi-brand work shows brand-exclusion CAC reductions of 19-60%. Here's the practitioner playbook for getting PMax to report numbers a CFO can defend.

By: Dusty Dean | Q2 2025

“If PMax is reporting a 9x ROAS, what would my Google revenue look like if I turned it off tomorrow?”
— The question every PMax-heavy advertiser should be asking

33%

Caraway PMax-With-Brand Over-Report (single-brand 2023 geo lift)

~40%

Avg CAC Reduction From Brand Exclusion (Haus multi-brand, causal)

19–60%

Range of CAC Reduction Across Tested Brands (Haus, causal)

91.5%

Optmyzr Accounts With Search/PMax Overlap (observational)

Our last piece walked through a Q1 2025 incrementality study where Google’s reported 6.8x ROAS collapsed to a true incremental 3.1x. That study was a single-account, single-quarter modeled channel-level decomposition — directional for our practice, not a portable constant. PMax was the loudest contributor to the gap. This is the follow-up: a practitioner’s view of why PMax over-reports, what the credible evidence actually shows, and the controls that close the gap without starving the algorithm.

01 / The Problem

PMax is goal-seeking software pointed at the cheapest reported conversions on the network.

Performance Max optimizes for reported conversion value. The cheapest reported conversions on Google’s network are existing customers searching the brand name. When PMax is allowed anywhere near those queries — directly through PMax’s automated query matching (Search themes plus signal-driven matching against your feed and final URL), or indirectly through Final URL expansion — last-click and data-driven attribution credit the campaign with sales that would have closed for free.

This is not a dark pattern. It is a math problem baked into the product’s design. For context on scale: PMax accounted for 68% of Google Shopping spend among Tinuiti’s advertiser base in Q3 2025 (platform-reported spend share, not an incrementality finding), up from 53% in Q1. Smarter Ecommerce’s 4,000-campaign analysis put PMax at a peak of 81.8% of ecommerce Google Ads spend in May 2024; that share has since declined by about 6 percentage points as practitioners rebalance toward Standard Shopping and Search. When that much budget runs through one black-box bidder, the reported ROAS becomes a load-bearing number, and the structural over-attribution becomes a board-level problem.

Four documented mechanisms drive the over-report.

Mechanism	What it does	Effect on reported vs. incremental ROAS
Branded query inclusion (default)	PMax bids on brand searches via its automated query matching (Search themes plus signal-driven matching against your feed and final URL) unless brand exclusions are explicitly applied. Self-serve brand exclusions only shipped December 2024.	Reported ROAS inflated. PMax absorbs revenue that would have closed organically and the algorithm doubles down on the cheapest-conversion path.
Final URL expansion	On by default. Google rewrites your landing page to a more “relevant” URL — including branded subdirectories — based on the user query.	Reinjects brand intent into campaigns scoped to non-brand. Compounds with mechanism #1.
Search term opacity (now partly fixed)	Pre-March 2025, PMax exposed only “search categories.” The new Search Terms report adds a source column and negative-keyword application.	Historical reported ROAS was unauditable. The fix is real but advertisers still need to actively use it.
Search/PMax cannibalization	PMax overlaps with the account’s own Search campaigns on a majority of converting terms; Search wins on quality when both can serve.	Spend gets double-counted in mental models. The same query funds two campaigns; only one was needed.

Adalysis analyzed 3,300+ non-retail PMax campaigns covering roughly 1.2 million search terms (observational): 67% of PMax campaigns overlapped with the account’s Search campaigns on at least one query, and on those overlapping queries Search outperformed PMax on conversion rate 84% of the time. Optmyzr’s broader 503-account study (also observational) found 91.45% had Search/PMax keyword overlap. The collision is the rule, not the exception.
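The vendor figures above are aggregates, but the same kind of check can be approximated on a single account. What follows is a minimal sketch, assuming search terms have been exported from the Search campaigns and from PMax as plain lists of strings. The function names and the normalization rule are illustrative assumptions, not the Adalysis or Optmyzr methodology, and the sample queries are made up.

```python
def normalize(term):
    """Lowercase and collapse whitespace so trivially different strings match."""
    return " ".join(term.lower().split())


def query_overlap(search_terms, pmax_terms):
    """Return the shared queries and the share of PMax queries that Search also matched."""
    search_set = {normalize(t) for t in search_terms}
    pmax_set = {normalize(t) for t in pmax_terms}
    shared = search_set & pmax_set
    share = len(shared) / len(pmax_set) if pmax_set else 0.0
    return shared, share


# Made-up example queries, not study data.
search = ["acme cookware", "nonstick pan set", "Ceramic  Skillet"]
pmax = ["acme cookware", "ceramic skillet", "gift for cooks", "dutch oven"]

shared, share = query_overlap(search, pmax)
print(sorted(shared), f"{share:.0%}")
```

An overlap share computed this way only says that both campaign types matched the same query. It does not say which campaign should have served it.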

02 / The Evidence

What the credible incrementality work shows.

The over-attribution claim is easy to make and easy to dismiss. The reason to take it seriously is that two independent measurement shops, Haus and Measured, have published three causal studies on it, with sample sizes that matter and methodology that holds up.

Source	Sample	Headline finding	What it means
Caraway × Haus, 3-cell geo lift (Jun 2023) — causal, n=1 brand	US split into thirds: PMax-with-brand, PMax-no-brand, holdout. 3 weeks. Single brand (DTC cookware).	PMax-with-brand drove 1.95x more total revenue (and 1.75x more new-customer revenue) than PMax-no-brand. Google’s reported revenue overstated PMax-with-brand’s true incremental contribution by 33%.	Including brand in PMax can be the right call for revenue — but the reporting needs to be deflated by a known multiplier before any budget decision. Treat the 33% as a directional single-brand benchmark, not a category constant.
Haus multi-brand meta (Aug 2024) — causal	Multiple geo-lift experiments across Haus’s portfolio.	Excluding brand from PMax reduced new-customer CAC in 100% of tested brands, by 19–60% (avg ~40%). On revenue (not CAC), excluding brand only won 50% of the time — but when it won, it drove ~24% more incremental revenue.	If new-customer acquisition is the goal — not revenue per se — brand exclusion is one of the highest-confidence levers in the playbook (CAC down in 100% of tested brands). For pure revenue, the call is genuinely closer: the same Haus data shows brand exclusion only wins on revenue half the time. The “free lunch” framing is wrong; the right framing is “free lunch on CAC, coin flip on revenue.”
Measured.com matched-market meta (Jan 2025) — causal	50+ matched-market geo lift tests across brands with varying PMax adoption.	PMax incrementality sits between Standard Brand Shopping (lower) and Standard Non-Brand Shopping (higher). Empirical incrementality bands published.	The empirical contribution is the incrementality band. The iROAS = incrementality % × reported ROAS relationship is an arithmetic identity (definition), not a Measured finding — but Measured’s bands give you the input you need.
Tinuiti Q3 2025 benchmark — platform-reported, not causal	Advertiser base running both PMax and Standard Shopping.	PMax conversion rate ran 2% higher than Standard Shopping; CPC 7% higher; net ROAS 2% lower. Standard Shopping’s edge is real but narrow — it is not a slam-dunk alternative.	Useful for sizing the spend mix. Do not cite as causal evidence — Tinuiti is reporting platform-reported data, which is exactly the number under audit.

Caraway’s 33% over-report is the most-cited published single-brand number in the field. It is n=1 brand, 3 weeks, June 2023, in cookware — useful as a directional benchmark for brand-heavy DTC, not a portable category-wide floor. Note the dual finding: the same Caraway test showed PMax-with-brand drove 1.95x more revenue than PMax-no-brand. The 33% over-report is real; so is the lift. Both have to be in the budget model. Haus’s CAC finding — multi-brand and causal — is the one to cite if your client cares about new customers more than revenue. The calibration lens we use internally for every PMax recalibration is straightforward arithmetic: true iROAS equals incrementality (an empirical input from work like Measured’s) times reported ROAS (the platform number).

03 / Why It Is Hard To Fix

Google has shipped real controls. They are still partly behind a request form.

Three product changes between October 2024 and March 2025 reshape what advertisers can actually do about this. They also explain why a lot of the older playbook now under-delivers.

First: in October 2024, Google narrowed the prior across-the-board prioritization of Search over PMax for the same query. Exact-match Search keywords matching the query still take priority; otherwise ad rank decides which campaign serves. The brand-fence tactic still works for exact-match brand campaigns but is less load-bearing elsewhere — explicit brand exclusions on PMax are now the primary lever for non-exact-match coverage.

Second: self-serve brand exclusions launched in December 2024, and the negative keyword limit jumped from 100 to 10,000 in March 2025. Both are material. Both still require active opt-in — defaults still allow brand traffic, and several account-level features rolled out unevenly through 2025. Confirm what is actually enabled on the account before assuming it.

Third: the March 2025 PMax Search Terms report finally exposes actual queries with a source column distinguishing user-defined Search themes from PMax’s automated matching. AdExchanger characterized the move as the first material transparency concession in the product’s history. It is enough to audit a campaign — not enough to skip incrementality testing, because the report still does not tell you which conversions were incremental.

The honest summary: PMax was a black box in 2022. In 2026 it is a black box with windows. The windows only help if you open them.
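One way to open those windows is to bucket an export of the Search Terms report by brand intent and total the conversion value in each bucket. The sketch below is a hedged illustration, not a reproduction of any published script: `BRAND_TERMS`, the bucket names, and the 0.8 fuzzy cutoff are assumptions, and it only handles single-word brand names.

```python
from difflib import SequenceMatcher

BRAND_TERMS = ["acme"]  # hypothetical brand list; replace with your own variants


def bucket_query(query, brand_terms=BRAND_TERMS, fuzzy_cutoff=0.8):
    """Classify one search term as brand, close-to-brand, non-brand, or blank."""
    q = (query or "").strip().lower()
    if not q:
        return "blank"
    tokens = q.split()
    # Exact token match against a brand term.
    if any(brand in tokens for brand in brand_terms):
        return "brand"
    # Near-miss spellings: a token that is very similar to a brand term.
    for brand in brand_terms:
        for token in tokens:
            if SequenceMatcher(None, token, brand).ratio() >= fuzzy_cutoff:
                return "close-to-brand"
    return "non-brand"


def value_by_bucket(rows):
    """Sum conversion value per bucket from (query, conversion_value) pairs."""
    totals = {}
    for query, conv_value in rows:
        bucket = bucket_query(query)
        totals[bucket] = totals.get(bucket, 0.0) + conv_value
    return totals


# Made-up rows standing in for an exported report.
rows = [
    ("acme skillet", 500.0),
    ("acmee pans", 120.0),
    ("ceramic skillet", 300.0),
    ("", 80.0),
]
print(value_by_bucket(rows))
```

The bucket totals show how much reported value sits on brand and near-brand queries. They do not show how much of it was incremental.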

04 / The Mitigation Playbook

Stack these in order. Stop when the gap closes enough.

Tactic	Effort	Lift expectation
Confirm brand exclusion access is live on the account (request via Google rep if gated)	Low (1–14 days elapsed)	Prerequisite for the row below. Several account-level features rolled out unevenly through 2025; do not assume.
Build and apply brand exclusion list (brand variants, misspellings, foreign-language, subsidiary brands)	Low–Medium (1–10 hrs depending on brand-variant complexity; longer for enterprise)	Closes the largest single source of over-attribution. CAC reductions of 19–60% (avg ~40%) in Haus’s multi-brand causal data when new-customer acquisition is the goal.
Build account-level negative keyword list (brand terms, misspellings, competitor brand)	Low	Catches what brand exclusions miss. Belt and suspenders.
Turn off Final URL expansion, or scope it with URL contains/excludes rules	Low	Prevents PMax from rewriting traffic onto branded subdirectories. Especially impactful on sites with brand-heavy URL paths; complements brand exclusion rather than substituting for it.
Run a brand-fence Search campaign (exact-match; with the post-Oct 2024 caveat)	Medium	Less reliable than it was. Still useful as a second line of defense — exact-match Search keywords matching the query retain priority.
Segment PMax by SKU role (loss leaders / hero / long tail in separate campaigns)	Medium	Stops the algorithm from over-spending on cheap conversions. Smarter Ecommerce now sees 3–7 PMax campaigns per account as the working norm.
Use a 10–25% PMax allocation as a directional hypothesis to test (correlational benchmark, not a target)	Medium	In Optmyzr’s 24,702-campaign data, accounts in the 10–25% allocation band showed the best CPA and CVR, and PMax underperformed sibling campaigns above 50% allocation. This is correlational (accounts that cap PMax tend to be more sophisticated), so treat it as a directional benchmark, not a hard cap. The optimal band varies by category and account maturity.
Run a Google in-platform conversion lift study (threshold reportedly lowered into the low-five-figure range in 2025)	Medium-high; rep-dependent	Requires named Google rep engagement; not reliably available to accounts without one. Confirm the current floor and your eligibility with your rep before scoping. Gives you a Google-blessed incrementality number for that account, that quarter. Calibration only — does not solve cross-platform attribution.
Geo-holdout via Haus, Measured, Northbeam (incrementality module new as of April 2026; track record still building), or DIY matched-market design	High	Gold standard. Produces the iROAS multiplier you actually deflate reported numbers by.

Three pieces of nuance matter more than the list itself.

First, do not strip brand from PMax on a small account. Smarter Ecommerce’s data on the 30/60 monthly conversion threshold is unambiguous: PMax campaigns under 30 conversions a month swing wildly, anywhere from −100% to +400% versus target. Pulling brand traffic from a campaign that is already starved of conversion volume kills the algorithm’s ability to bid intelligently anywhere else. Below roughly $10K/month in PMax spend, the right move is usually to fix the controls that don’t risk algorithm starvation (Final URL expansion off, account-level brand negatives, basic feed segmentation) and skip the heavier mitigations (PMax-level brand exclusions, sequenced campaign restructure) until the account scales into 30+ monthly conversions per campaign. The over-attribution is real; it is also less expensive to live with than to fix poorly when volume is thin.

Second, even on big accounts, when a competitor is bidding on your brand, paid brand clicks can be genuinely incremental — a paid ad blocks the competitor’s. Test the contested-vs-uncontested brand split before pulling brand spend across the board. Average geo-holdout findings are averages; the right answer for any given account depends on who else is in the auction.

Third, run the calibration math, not just the controls. The framing to internalize is an arithmetic identity: true iROAS equals incrementality times reported ROAS. (It is a definition, not a finding from any single source — what work like Measured’s contributes is the empirical incrementality input.) If a geo test puts PMax incrementality at 35% and the platform reports 8x, your true iROAS is 2.8x. That is the number that goes into the budget model. Mike Rhodes’ PMax script and Mike Ryan’s PMax Brand Traffic Analyzer (free, from Smarter Ecommerce) bucket queries into brand / close-to-brand / non-brand / blank — useful for the ongoing audit even after Google’s native search-terms report, because they normalize across asset groups and time periods.
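The calibration arithmetic above can be sketched in a few lines. This is a minimal illustration of the identity, assuming incrementality is expressed as a fraction (0.35 for 35%). The function names are hypothetical, and the spend figure in the usage example is made up for illustration.

```python
def incremental_roas(reported_roas, incrementality):
    """True iROAS = incrementality x reported ROAS (an arithmetic identity)."""
    if reported_roas < 0:
        raise ValueError("reported_roas must be non-negative")
    return reported_roas * incrementality


def incremental_revenue(spend, reported_roas, incrementality):
    """Deflate platform-reported revenue by the measured incrementality."""
    reported_revenue = spend * reported_roas
    return reported_revenue * incrementality


# Worked example from the text: 35% incrementality, 8x reported ROAS.
print(round(incremental_roas(8.0, 0.35), 2))  # 2.8

# Hypothetical monthly spend, for illustration only.
print(round(incremental_revenue(50_000, 8.0, 0.35), 2))
```

The incrementality input has to come from a measurement such as a geo test. The function only applies it and does not estimate it.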

How to actually run a Google in-platform conversion lift study on PMax

Eligibility. Account must be enrolled in Google Ads Conversion Lift through your Google rep. The minimum-spend floor reportedly dropped into the low-five-figure range in 2025, but it is still per experiment (not per program) and rep-dependent — accounts without a named rep often cannot get a study set up at all. Confirm the current floor before scoping; most reps will not surface this proactively.

Design. Geo-based holdout, randomized at the geo level, typically 4–6 weeks. Google handles the geo split and significance testing. You decide which campaigns to suppress in the holdout cell.

What you get. An incremental conversions number with a confidence interval, scoped to the campaigns and time window you chose. Use it as the incrementality % input to the iROAS calibration. Re-run quarterly if budget allocation shifts materially.

What you do not get. Cross-platform truth. The study sees only Google. If Meta or Pinterest are running concurrently in the same geos, the lift you measure is “PMax incremental on top of everything else running” — which is the right business question, but it is not “PMax incremental in a vacuum.”
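For a DIY read of a geo holdout, the point estimate behind the incrementality percentage can be sketched as follows. This is a simplified illustration under stated assumptions. It scales the holdout cell to the treated cell using a pre-period ratio, which is one common approach and not necessarily what Google’s study does, and it produces no confidence interval. All names and numbers are hypothetical.

```python
def geo_incrementality(pre_treated, pre_holdout, study_treated, study_holdout,
                       reported_pmax):
    """Point estimate of incrementality from a two-cell geo holdout.

    pre_treated / pre_holdout: total conversions per cell before the study.
    study_treated / study_holdout: total conversions per cell during the study.
    reported_pmax: conversions the platform attributed to PMax in the treated cell.
    """
    if pre_holdout <= 0 or reported_pmax <= 0:
        raise ValueError("pre_holdout and reported_pmax must be positive")
    # How much larger the treated cell normally is than the holdout cell.
    scale = pre_treated / pre_holdout
    # Counterfactual: what the treated cell would have done with PMax off.
    expected_without_pmax = study_holdout * scale
    incremental = study_treated - expected_without_pmax
    return incremental / reported_pmax


# Made-up numbers: the treated cell is normally twice the size of the holdout.
share = geo_incrementality(
    pre_treated=10_000, pre_holdout=5_000,
    study_treated=11_000, study_holdout=5_000,
    reported_pmax=2_500,
)
print(f"{share:.0%}")
```

In this made-up case, 1,000 of the 2,500 platform-reported conversions are incremental, so the incrementality input for the iROAS calibration would be 40%.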

05 / What We Are Watching For Next

Three signals worth tracking through the rest of 2026. The brand exclusion universal rollout — currently still gated for some accounts — should land cleanly this year and remove one of the more frustrating support-ticket bottlenecks. AI Max for Search exited beta in May 2025 and is on a publicly-stated path to absorb Dynamic Search Ads’ inventory through 2026 (exact full-replacement timing has slipped from Google’s original guidance and is worth tracking on Google Ads release notes rather than taking as fixed). It creates a structurally similar brand-attribution problem at the Search-campaign level, and the early independent commentary (Revvim, January 2026) is already flagging it. And the next round of multi-brand geo-lift meta-analyses from Haus, Measured, and Northbeam (whose incrementality module shipped April 2026 with a track record still building) will tell us whether the 2024–2025 product changes have actually moved measured incrementality, or whether the gap between reported and real has held steady.

“True PMax ROAS is incrementality times reported ROAS, an arithmetic identity. The platform only shows you one of the two terms. Until you measure the other, you are budgeting in the dark.”

Bitcadet · Paid Media Practice

Same lesson as the incrementality study. Different campaign type. Same fix.

06 / Lessons

What we take into every PMax engagement now.

Assume PMax over-reports until proven otherwise. The Caraway 33% is a credible single-brand benchmark for brand-heavy DTC — directional, not a portable category-wide floor. The actual gap on any account is a function of brand control posture and Final URL expansion settings, and is best measured locally.
Brand exclusions and Final URL expansion controls do most of the work. Stack them before anything else. The remaining tactics are second-order.
Calibrate, do not just measure. True iROAS = incrementality × reported ROAS — an arithmetic identity. Without a measured incrementality input, the platform number is decorative.
Do not strip brand from a starved account. Below 30 monthly conversions, the algorithm needs the volume more than the audit needs the cleanliness. Fix it when scale justifies the cost.
Re-test quarterly. Budget mix shifts. Algorithm behavior shifts. The multiplier from last quarter’s geo-test is a directional input, not a permanent constant.

Bottom line. PMax is not a fraud and it is not a black box anymore. It is a goal-seeker that will absorb whatever credit the attribution model lets it absorb. The job is to give it less to absorb (brand exclusions, Final URL expansion off, segmentation by SKU role), audit what it does take credit for (Search Terms report, Brand Traffic Analyzer scripts), and deflate what it reports by a measured incrementality multiplier (geo-holdout, Google’s in-platform conversion lift, or a third-party platform). Do those three things and the reported number becomes a number you can defend. Skip them and you are budgeting against a structural over-attribution gap that the most-cited single-brand causal study put at 33% — and that, for accounts with weak brand controls, runs materially worse.

About the author

Dusty Dean, founder of BITCADET, specializes in e-commerce strategies, leveraging technical expertise and team building to drive revenue growth and digital sales success.
