Data

| product | vertical | region | numerator | denominator | rate |
|---|---|---|---|---|---|
| Product_X | Tech | A | 55 | 100 | 55.0% |
| Product_X | Finance | A | 270 | 900 | 30.0% |
| Product_Y | Tech | A | 400 | 800 | 50.0% |
| Product_Y | Finance | A | 600 | 1200 | 50.0% |
| Product_X | Tech | B | 468 | 900 | 52.0% |
| Product_X | Finance | B | 27 | 100 | 27.0% |
| Product_Y | Tech | B | 360 | 900 | 40.0% |
| Product_Y | Finance | B | 480 | 1100 | 43.6% |
| Product_X | Tech | C | 600 | 1200 | 50.0% |
| Product_X | Finance | C | 900 | 2700 | 33.3% |
| Product_Y | Tech | C | 450 | 900 | 50.0% |
| Product_Y | Finance | C | 550 | 1100 | 50.0% |

Oaxaca Decomposition — Product-Level Walkthrough (NA vs ROW)

This note shows exactly how to attribute the NA–ROW rate gap to each product using Blinder-Oaxaca, with terms that sum to the top line. Data are from your corrected table.

Setup & identity

Per product $i$:

  • $m_{A,i}, m_{B,i}$: mix shares (NA vs ROW).

  • $r_{A,i}, r_{B,i}$: within-product rates.

  • Top lines computed from the same rows: $\bar r_A=\sum_i m_{A,i}r_{A,i}$, $\bar r_B=\sum_i m_{B,i}r_{B,i}$, total gap $G=\bar r_A-\bar r_B$.

Use ROW as baseline (“two-part” Oaxaca). For each product:

$$\underbrace{\text{mix}_i}_{\text{composition}}=(m_{A,i}-m_{B,i})\,r_{B,i},\qquad \underbrace{\text{rate}_i}_{\text{within}}=m_{A,i}\,(r_{A,i}-r_{B,i}),\qquad \text{gap}_i=\text{mix}_i+\text{rate}_i.$$

Identity (per row and in total):

$$\text{gap}_i= m_{A,i}r_{A,i}-m_{B,i}r_{B,i},\qquad \sum_i \text{gap}_i = G.$$
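The identity follows by expanding the two B-baseline terms; the cross term $m_{A,i}r_{B,i}$ cancels. A short derivation, using only the symbols defined in the setup:

$$\begin{aligned}
\text{mix}_i+\text{rate}_i
&=(m_{A,i}-m_{B,i})\,r_{B,i}+m_{A,i}\,(r_{A,i}-r_{B,i})\\
&=m_{A,i}r_{B,i}-m_{B,i}r_{B,i}+m_{A,i}r_{A,i}-m_{A,i}r_{B,i}\\
&=m_{A,i}r_{A,i}-m_{B,i}r_{B,i}.
\end{aligned}$$

Summing over $i$ gives $\sum_i \text{gap}_i=\bar r_A-\bar r_B=G$, which is why the per-product terms add up to the top line when both are computed from the same rows.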

Sanity checks (from your inputs)

  • $\sum m_A=0.99998$, $\sum m_B=0.99999$ (≈1; rounding OK).

  • $\bar r_{\text{NA}}=(m_A\cdot r_A)=\mathbf{0.352988}$ (≈ 0.352996 you stated).

  • $\bar r_{\text{ROW}}=(m_B\cdot r_B)=\mathbf{0.494539}$.

  • Total gap $G=\mathbf{-0.141551}$.

  • $\sum_i \text{gap}_i=\mathbf{-0.141551}$ (exact match).

Worked row examples (showing the arithmetic)

Reels (mA=0.17333, mB=0.18592, rA=0.44038, rB=0.55034)

  • mix = $(0.17333-0.18592)\cdot0.55034 = -0.01259\cdot0.55034 = \mathbf{-0.006929}$

  • rate = $0.17333\cdot(0.44038-0.55034)=0.17333\cdot(-0.10996)=\mathbf{-0.019059}$

  • gap = $\mathbf{-0.025988}$

Simplification (mA=0.19285, mB=0.21834, rA=0.44642, rB=0.58148)

  • mix = $(0.19285-0.21834)\cdot0.58148 = -0.02549\cdot0.58148 = \mathbf{-0.014822}$

  • rate = $0.19285\cdot(0.44642-0.58148)=0.19285\cdot(-0.13506)=\mathbf{-0.026046}$

  • gap = $\mathbf{-0.040868}$

Other (mA=0.10603, mB=0.04014, rA=0.26836, rB=0.16913)

  • mix = $(0.10603-0.04014)\cdot0.16913 = 0.06589\cdot0.16913 = \mathbf{+0.011144}$

  • rate = $0.10603\cdot(0.26836-0.16913)=0.10603\cdot0.09923=\mathbf{+0.010521}$

  • gap = $\mathbf{+0.021665}$

Full per-product table (sorted by |impact|)

| product | mA | mB | rA | rB | mix_i | rate_i | gap_i | share of gap |
|---|---|---|---|---|---|---|---|---|
| Simplification | 0.192850 | 0.218340 | 0.446420 | 0.581480 | -0.014822 | -0.026046 | -0.040868 | 28.9% |
| Reels | 0.173330 | 0.185920 | 0.440380 | 0.550340 | -0.006929 | -0.019059 | -0.025988 | 18.4% |
| Other | 0.106030 | 0.040140 | 0.268360 | 0.169130 | +0.011144 | +0.010521 | +0.021665 | -15.3% |
| Automation | 0.091700 | 0.089260 | 0.373730 | 0.604510 | +0.001475 | -0.021163 | -0.019688 | 13.9% |
| CAPI | 0.078410 | 0.097820 | 0.323180 | 0.433340 | -0.008411 | -0.008638 | -0.017049 | 12.0% |
| Scaling Good Campaigns | 0.011220 | 0.037900 | 0.277780 | 0.465550 | -0.012421 | -0.002107 | -0.014528 | 10.3% |
| Lead Ads | 0.044860 | 0.054950 | 0.164350 | 0.346870 | -0.003500 | -0.008188 | -0.011688 | 8.3% |
| Creative | 0.141340 | 0.127020 | 0.524610 | 0.673080 | +0.009639 | -0.020985 | -0.011346 | 8.0% |
| CTX | 0.011630 | 0.029780 | 0.053570 | 0.328460 | -0.005962 | -0.003197 | -0.009159 | 6.5% |
| Paid Messaging | 0.005610 | 0.037900 | 0.037040 | 0.169190 | -0.005463 | -0.000741 | -0.006205 | 4.4% |
| A+SC/AC | 0.027830 | 0.024180 | 0.320900 | 0.521010 | +0.001902 | -0.005569 | -0.003667 | 2.6% |
| Gen_AI | 0.115170 | 0.056780 | 0.070330 | 0.196050 | +0.011447 | -0.014479 | -0.003032 | 2.1% |
| Totals / checks | 0.999980 | 0.999990 | | | -0.021901 | -0.119650 | -0.141551 | 100% |

Interpretation: negative gap_i rows make NA worse than ROW; positive gap_i offset the deficit. Rank by $|\text{gap}_i|$ to prioritize.
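The table can be reproduced from its four input columns. The following is a minimal sketch, assuming the B-baseline (ROW) split defined in the setup; the names `ROWS` and `decompose` are illustrative and not part of any framework referenced in this note.

```python
# Inputs (mA, mB, rA, rB) copied from the per-product table above.
ROWS = {
    "Simplification":         (0.19285, 0.21834, 0.44642, 0.58148),
    "Reels":                  (0.17333, 0.18592, 0.44038, 0.55034),
    "Other":                  (0.10603, 0.04014, 0.26836, 0.16913),
    "Automation":             (0.09170, 0.08926, 0.37373, 0.60451),
    "CAPI":                   (0.07841, 0.09782, 0.32318, 0.43334),
    "Scaling Good Campaigns": (0.01122, 0.03790, 0.27778, 0.46555),
    "Lead Ads":               (0.04486, 0.05495, 0.16435, 0.34687),
    "Creative":               (0.14134, 0.12702, 0.52461, 0.67308),
    "CTX":                    (0.01163, 0.02978, 0.05357, 0.32846),
    "Paid Messaging":         (0.00561, 0.03790, 0.03704, 0.16919),
    "A+SC/AC":                (0.02783, 0.02418, 0.32090, 0.52101),
    "Gen_AI":                 (0.11517, 0.05678, 0.07033, 0.19605),
}


def decompose(rows):
    """Return {product: (mix, rate, gap)} with group B (ROW) as baseline."""
    out = {}
    for name, (m_a, m_b, r_a, r_b) in rows.items():
        mix = (m_a - m_b) * r_b      # composition effect at baseline rates
        rate = m_a * (r_a - r_b)     # within-product effect at A's mix
        out[name] = (mix, rate, mix + rate)
    return out


parts = decompose(ROWS)
total_mix = sum(p[0] for p in parts.values())
total_rate = sum(p[1] for p in parts.values())
total_gap = sum(p[2] for p in parts.values())

# Top lines computed from the same rows, so the identity holds exactly.
top_a = sum(m_a * r_a for m_a, _, r_a, _ in ROWS.values())
top_b = sum(m_b * r_b for _, m_b, _, r_b in ROWS.values())

# Rank by absolute impact, as in the table.
ranked = sorted(parts.items(), key=lambda kv: abs(kv[1][2]), reverse=True)
for name, (mix, rate, gap) in ranked:
    print(f"{name:<24}{mix:+.6f} {rate:+.6f} {gap:+.6f} {gap / total_gap:7.1%}")
print(f"{'Totals':<24}{total_mix:+.6f} {total_rate:+.6f} {total_gap:+.6f}")
```

Because the inputs are rounded to five or six decimals, recomputed totals may differ from the table in the last digit.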

What to remember

  • Compute top lines from the same rows you decompose; then $\sum \text{gap}_i = \bar r_A-\bar r_B$ exactly.

  • Use ROW baseline for a benchmarked story; use the symmetric split for baseline-agnostic attribution.

  • Rank by $|\text{gap}_i|$; use signs and the mix/rate split to drive action (composition vs execution).


Core identities

For group A vs B by product $i$:

  • Top lines: $\bar r_A=\sum_i m_{A,i}r_{A,i},\;\bar r_B=\sum_i m_{B,i}r_{B,i}$.

  • Per‑product identity (path‑independent): $\textbf{gap}_i=m_{A,i}r_{A,i}-m_{B,i}r_{B,i}$.

  • Totals: $\sum_i \textbf{gap}_i=\bar r_A-\bar r_B$.

Splits (how to apportion gap into mix vs rate):

  • B‑baseline (two‑part): $\text{mix}_i^B=(m_A-m_B)r_B,\;\text{rate}_i^B=m_A(r_A-r_B)$.

  • A‑baseline (reverse): $\text{mix}_i^A=(m_A-m_B)r_A,\;\text{rate}_i^A=m_B(r_A-r_B)$.

  • Three‑part (expose interaction): pure‑mix $(m_A-m_B)r_B$ + pure‑rate $m_B(r_A-r_B)$ + interaction $(m_A-m_B)(r_A-r_B)$.

  • Symmetric (path‑independent): $\text{mix}_i^{sym}=(m_A-m_B)\tfrac{r_A+r_B}{2},\;\text{rate}_i^{sym}=\tfrac{m_A+m_B}{2}(r_A-r_B)$.
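The four splits can be written as one-line functions. This is a sketch under the definitions listed above; the function names and the single input row (the P1 values from the toy data in section A) are chosen here for illustration only.

```python
def split_b_baseline(m_a, m_b, r_a, r_b):
    """Two-part split with B as baseline: (mix, rate)."""
    return (m_a - m_b) * r_b, m_a * (r_a - r_b)


def split_a_baseline(m_a, m_b, r_a, r_b):
    """Reverse two-part split with A as baseline: (mix, rate)."""
    return (m_a - m_b) * r_a, m_b * (r_a - r_b)


def split_three_part(m_a, m_b, r_a, r_b):
    """Three-part split: (pure mix, pure rate, interaction)."""
    return (m_a - m_b) * r_b, m_b * (r_a - r_b), (m_a - m_b) * (r_a - r_b)


def split_symmetric(m_a, m_b, r_a, r_b):
    """Mid-point split: (mix, rate), each weighted by the A/B average."""
    return (m_a - m_b) * (r_a + r_b) / 2, (m_a + m_b) / 2 * (r_a - r_b)


SPLITS = {
    "B-baseline": split_b_baseline,
    "A-baseline": split_a_baseline,
    "three-part": split_three_part,
    "symmetric": split_symmetric,
}

# One illustrative row: mA, mB, rA, rB.
m_a, m_b, r_a, r_b = 0.6, 0.4, 0.7, 0.6
gap = m_a * r_a - m_b * r_b  # path-independent per-product gap

results = {name: fn(m_a, m_b, r_a, r_b) for name, fn in SPLITS.items()}
for name, parts in results.items():
    shown = ", ".join(f"{p:+.4f}" for p in parts)
    print(f"{name:<11} parts=({shown}) sum={sum(parts):+.4f} gap={gap:+.4f}")
```

Every split returns components that add up to the same `gap`; only the apportionment differs. The interaction term of the three-part split is exactly what the B-baseline folds into rate and the A-baseline folds into mix.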

A) Hidden interaction & path dependence

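Toy data (two products). Every split sums to the same total gap, but the per-product mix/rate attribution moves with the path: P1 is 0.12 mix / 0.06 rate under the B-baseline and 0.14 / 0.04 under the A-baseline. In this particular example the Σ mix and Σ rate totals happen to be identical across splits, because $r_A-r_B=0.10$ for both products; when rate differences vary by product, the totals shift with the path as well.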

| product | mA | mB | rA | rB | mix_B | rate_B | gap_B | mix_A | rate_A | gap_A | mix_sym | rate_sym | gap_sym |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| P1 | 0.600000 | 0.400000 | 0.700000 | 0.600000 | 0.120000 | 0.060000 | 0.180000 | 0.140000 | 0.040000 | 0.180000 | 0.130000 | 0.050000 | 0.180000 |
| P2 | 0.400000 | 0.600000 | 0.500000 | 0.400000 | -0.080000 | 0.040000 | -0.040000 | -0.100000 | 0.060000 | -0.040000 | -0.090000 | 0.050000 | -0.040000 |

Totals by split

| Split | Σ mix | Σ rate | Σ gap |
|---|---|---|---|
| B-baseline | 0.040000 | 0.100000 | 0.140000 |
| A-baseline | 0.040000 | 0.100000 | 0.140000 |
| Symmetric (mid-point) | 0.040000 | 0.100000 | 0.140000 |
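The three rows agree on Σ mix and Σ rate because of a property of this toy data, not of the method. The difference between the A-baseline and B-baseline mix totals is the summed interaction term, and here the rate difference is the same constant for both products:

$$\sum_i \text{mix}_i^A-\sum_i \text{mix}_i^B=\sum_i (m_{A,i}-m_{B,i})(r_{A,i}-r_{B,i})=0.10\sum_i (m_{A,i}-m_{B,i})=0.10\,(1-1)=0.$$

Per product the interaction is $+0.02$ for P1 and $-0.02$ for P2, which is why the per-row splits differ while the totals coincide. If $r_{A,i}-r_{B,i}$ varied across products, the summed interaction would generally be non-zero and the Σ mix and Σ rate totals would differ by split.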

B) Aggregation bias (Simpson's paradox)

The Problem: Within each segment, A beats B, but different segment mix creates misleading aggregate conclusions.

Business Example: APAC (group A in the tables below) appears to underperform ROW (group B) in the "Reels" product overall, but actually outperforms in every vertical within Reels. The issue is that APAC has a higher mix in harder verticals.

Detailed segment table (Reels product by vertical)

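| product | vertical | nA | xA | rA | mA | nB | xB | rB | mB | mix_A | rate_A | gap_A | mix_sym | rate_sym | gap_sym |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Reels | Tech | 500 | 275 | 0.550000 | 0.100000 | 2000 | 1040 | 0.520000 | 0.400000 | -0.165000 | 0.012000 | -0.153000 | -0.160500 | 0.007500 | -0.153000 |
| Reels | Finance | 4500 | 1350 | 0.300000 | 0.900000 | 3000 | 800 | 0.266667 | 0.600000 | 0.090000 | 0.020000 | 0.110000 | 0.085000 | 0.025000 | 0.110000 |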

Key insight from decomposition results:

  • Tech vertical: A rate (55.0%) > B rate (52.0%) ✓ A wins, but gap = -0.153000 (A loses overall due to lower mix: 10% vs 40%)

  • Finance vertical: A rate (30.0%) > B rate (26.7%) ✓ A wins, and gap = +0.110000 (A wins overall due to higher mix: 90% vs 60%)

  • Net effect: A has worse vertical mix (90% in lower-performing Finance vs B's 60%)

Collapsed to product level (misleading aggregate)

| product | rate_A | rate_B | gap | misleading_conclusion |
|---|---|---|---|---|
| Reels | 0.325000 | 0.368000 | -0.043000 | "A underperforms in Reels" |

The Simpson's paradox revealed:

  • A outperforms B in both Tech (55.0% vs 52.0%) and Finance (30.0% vs 26.7%) verticals

  • Yet A underperforms overall (-4.3pp gap) due to unfortunate vertical composition

  • Decomposition shows: Finance contributes +0.110000 to A's advantage, but Tech contributes -0.153000, netting to -0.043000

  • Root cause: A has 90% exposure to the lower-absolute-rate Finance vertical vs B's 60%
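The reversal can be checked directly from the counts in the segment table. A minimal sketch follows; the helper names are illustrative, and the check shown (A ahead in every segment while behind in aggregate) is the plain definition of the reversal, not the framework's detection logic.

```python
# (n, x) = (denominator, numerator) per vertical, from the Reels segment table.
A = {"Tech": (500, 275), "Finance": (4500, 1350)}
B = {"Tech": (2000, 1040), "Finance": (3000, 800)}


def segment_rates(group):
    """Rate within each vertical."""
    return {seg: x / n for seg, (n, x) in group.items()}


def overall_rate(group):
    """Pooled rate across verticals (sum of numerators / sum of denominators)."""
    total_n = sum(n for n, _ in group.values())
    total_x = sum(x for _, x in group.values())
    return total_x / total_n


rates_a, rates_b = segment_rates(A), segment_rates(B)
a_wins_everywhere = all(rates_a[s] > rates_b[s] for s in A)
aggregate_gap = overall_rate(A) - overall_rate(B)
is_reversal = a_wins_everywhere and aggregate_gap < 0

for seg in A:
    print(f"{seg:<8} A={rates_a[seg]:.3f}  B={rates_b[seg]:.3f}")
print(f"overall  A={overall_rate(A):.3f}  B={overall_rate(B):.3f}")
print(f"aggregate gap={aggregate_gap:+.3f}  reversal={is_reversal}")
```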

How to detect Simpson's paradox in RCA:

  1. Product-level analysis: Reels shows A underperforms by 4.3pp → suggests execution problem

  2. Vertical-level analysis: Both Tech and Finance show A > B in rates → contradicts execution story

  3. Decomposition reveals: The issue is mix composition (A overweighted in Finance) not execution

  4. Detection signal: When segment-level rate advantages don't translate to aggregate advantage due to mix effects

Business implication: Without vertical-level analysis, you'd conclude "A needs to improve Reels execution" when actually "A needs better vertical portfolio allocation within Reels." This fundamentally changes the improvement strategy from operational fixes to strategic mix optimization.

Detection rule (framework heuristic): flag a potential Simpson's paradox when subcategories carrying ≥70% of volume move in the opposite direction from the aggregate pattern.

Region-Level Simpson's Paradox

The Problem: A region underperforms overall but outperforms in every product category due to unfortunate product mix.

Business Example: Region X appears to underperform vs Region Y overall (13.6% vs 20.4%), but actually outperforms in every product. The issue is X has higher mix in lower-performing products.

Test Case Data:

| Region | Overall Rate | Product_Low | Product_High | Mix Analysis |
|---|---|---|---|---|
| X | 13.6% | 11.0% rate, 80% mix | 24.0% rate, 20% mix | Overweighted in low-performing product |
| Y | 20.4% | 10.0% rate, 20% mix | 23.0% rate, 80% mix | Overweighted in high-performing product |

Product-Level Analysis (Reveals Paradox):

  • Product_High: X (24.0%) > Y (23.0%) = +1.0pp advantage ✓

  • Product_Low: X (11.0%) > Y (10.0%) = +1.0pp advantage ✓

Key Insight: X outperforms in both products but underperforms overall (-6.8pp) due to product composition.

Framework Detection:

# Region-level Simpson's paradox detection
region_paradox = detect_aggregation_bias(
    df=df,
    category_columns=None,  # Region-level detection
    subcategory_columns=["product"],
    threshold=0.7
)

Detection Results:

  • Region X: contradiction_rate = 1.0 (100% of products show opposite pattern)

  • Severity: High (all products contradict overall performance)

  • Root cause: 80% exposure to Product_Low vs Y's 20%
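The implementation of `detect_aggregation_bias` is not shown in this note, so the following is only a sketch of how a volume-weighted contradiction rate could be computed for the X vs Y test case. The function `weighted_contradiction_rate` is hypothetical, and weighting by Region X's own product mix is an assumption.

```python
def weighted_contradiction_rate(segments, aggregate_gap):
    """Share of weight in segments whose gap has the opposite sign to the aggregate.

    segments: iterable of (weight, rate_region, rate_baseline).
    """
    total_weight = sum(w for w, _, _ in segments)
    contradicting = sum(
        w for w, r, b in segments if (r - b) * aggregate_gap < 0
    )
    return contradicting / total_weight


# product -> (mix share, rate), from the test case table above.
X = {"Product_Low": (0.80, 0.11), "Product_High": (0.20, 0.24)}
Y = {"Product_Low": (0.20, 0.10), "Product_High": (0.80, 0.23)}

overall_x = sum(m * r for m, r in X.values())
overall_y = sum(m * r for m, r in Y.values())
aggregate_gap = overall_x - overall_y

# Assumption: weight each product by Region X's mix share.
segments = [(X[p][0], X[p][1], Y[p][1]) for p in X]
contradiction_rate = weighted_contradiction_rate(segments, aggregate_gap)
flagged = contradiction_rate >= 0.7  # threshold from the call above

print(f"overall X={overall_x:.3f}  Y={overall_y:.3f}  gap={aggregate_gap:+.3f}")
print(f"contradiction_rate={contradiction_rate:.2f}  flagged={flagged}")
```

With both products favoring X while the aggregate favors Y, all of the weight contradicts the aggregate, which is consistent with the contradiction rate of 1.0 reported above.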

Business Implication: Without product-level analysis, you'd conclude "X needs overall performance improvement" when actually "X needs strategic product portfolio rebalancing." This changes the strategy from operational improvements to strategic mix optimization.

D) Example RCA Narrative: Regional Performance Analysis

Based on actual test output from comprehensive Oaxaca-Blinder analysis with validated mathematics.

1. Gap Identification & Mathematical Validation

Headline Performance Gaps (from actual test output):

  • Region A: 40.00% vs Rest 37.71% = +2.3pp gap (outperformance)

  • Region B: 40.00% vs Rest 37.71% = +2.3pp gap (outperformance)

  • Region C: 40.00% vs Rest 37.71% = +2.3pp gap (outperformance)

  • Region D: 33.12% vs Rest 40.00% = -6.9pp gap (significant underperformance)

Mathematical Validation Check:

Validation Results (from actual test output):
Region A: Actual gap +2.3pp, Decomposed gap +2.3pp, Error: 0.000000 ✅
Region B: Actual gap +2.3pp, Decomposed gap +2.3pp, Error: 0.000000 ✅
Region C: Actual gap +2.3pp, Decomposed gap +2.3pp, Error: 0.000000 ✅
Region D: Actual gap -6.9pp, Decomposed gap -6.9pp, Error: 0.000000 ✅

Perfect Mathematical Validation: All regions show zero validation error,
confirming the decomposition framework correctly attributes 100% of each
region's performance gap to the sum of construct and performance components.

2. Regional Analysis Results

Business-Focused RCA Analysis Results by Region (from actual test output):

| Region | Total Gap | Performance Gap | Composition Gap | Business Conclusion | Business Narrative |
|---|---|---|---|---|---|
| A | +2.3pp | +2.3pp | +0.0pp | performance_driven | Maintain execution excellence |
| B | +2.3pp | +2.3pp | +0.0pp | performance_driven | Maintain execution excellence |
| C | +2.3pp | +2.3pp | +0.0pp | performance_driven | Maintain execution excellence |
| D | -6.9pp | -2.4pp | -4.5pp | composition_driven | Focus on strategic portfolio optimization |

3. Clean Business Narratives (No Technical Jargon)

Region A: Strong Execution-Driven Performance

  • Business Narrative: "A outperforms by 2.3pp. Maintain execution excellence. Key drivers: Product_X (+0.6pp: rate 40.0% vs benchmark 38.8%), Product_Y (+1.7pp: rate 40.0% vs benchmark 36.7%)"

  • Business Conclusion: performance_driven

  • Key Insight: Strong outperformance (+2.3pp) driven entirely by superior execution (+2.3pp) with neutral composition (0.0pp)

  • Action Plan: Continue current execution practices and operational excellence across both products

Region B: Strong Execution-Driven Performance

  • Business Narrative: "B outperforms by 2.3pp. Maintain execution excellence. Key drivers: Product_X (+0.6pp: rate 40.0% vs benchmark 38.8%), Product_Y (+1.7pp: rate 40.0% vs benchmark 36.7%)"

  • Business Conclusion: performance_driven

  • Key Insight: Strong outperformance (+2.3pp) driven entirely by superior execution (+2.3pp) with neutral composition (0.0pp)

  • Action Plan: Continue current execution practices and operational excellence across both products

Region C: Strong Execution-Driven Performance

  • Business Narrative: "C outperforms by 2.3pp. Maintain execution excellence. Key drivers: Product_X (+0.6pp: rate 40.0% vs benchmark 38.8%), Product_Y (+1.7pp: rate 40.0% vs benchmark 36.7%)"

  • Business Conclusion: performance_driven

  • Key Insight: Strong outperformance (+2.3pp) driven entirely by superior execution (+2.3pp) with neutral composition (0.0pp)

  • Action Plan: Continue current execution practices and operational excellence across both products

Region D: Strategic Portfolio Challenge

  • Business Narrative: "D underperforms by 6.9pp (analyzing 1.9pp of 6.9pp total gap). Focus on strategic portfolio optimization and mix rebalancing. Key drivers: Product_X-Finance (+2.4pp boost: vertical mix 47.5% vs benchmark 25.0%)"

  • Business Conclusion: composition_driven

  • Key Insight: Major underperformance (-6.9pp total) with Simpson's paradox detected. Focused analysis shows composition problem (-4.5pp) with execution advantages (+2.6pp) in analyzed portion

  • Action Plan: Strategic portfolio optimization and vertical mix rebalancing within Product_X as primary focus, leveraging existing execution strengths. Full gap analysis shows additional improvement opportunities beyond the paradox-affected portion

4. Simpson's Paradox Detection Results

🚨 BUSINESS-RELEVANT PARADOX DETECTED (Actual Test Results)

Critical Paradox Found in Region D, Product_X:

SIMPSON'S PARADOX DETECTION
🚨 BUSINESS-RELEVANT PARADOX DETECTED:

Region D, Product Product_X:
  Aggregate recommendation: improve_execution
  Subcategory recommendation: fix_portfolio_mix
  Product ranks #1 worst performer
  Weighted contradiction: 95.0%
  Business impact: high
  Focus subcategories: 2
  Gap analysis:
    - Aggregate: +0.000 construct, -0.019 performance
    - Subcategory: -0.045 construct, +0.026 performance

Why This Paradox Matters for Business Decisions:

The Contradiction Explained:

  • Aggregate Analysis: Product_X appears to need execution improvement (performance-driven gap of -1.9pp)

  • Subcategory Reality: Product_X actually needs portfolio mix optimization (composition-driven gap of -4.5pp)

  • Business Risk: Without subcategory analysis, would invest in wrong improvement area

  • Resource Impact: High - Product_X is the #1 worst performer requiring immediate attention

Business Value Delivered:

  • Prevents Wrong Investment: Stops execution training when portfolio rebalancing needed

  • Resource Efficiency: Directs improvement efforts to correct root cause (composition vs performance)

  • Decision Accuracy: Ensures Product_X receives appropriate strategic intervention

Integration with Enhanced RCA (Actual Results - 4 Regions):

Region A: Has Simpson's Paradox: False → Use product-level analysis reliably ✅
Region B: Has Simpson's Paradox: False → Use product-level analysis reliably ✅
Region C: Has Simpson's Paradox: False → Use product-level analysis reliably ✅
Region D: Has Simpson's Paradox: True → Requires subcategory analysis for Product_X ⚠️

Mathematical Validation: All regions show 0.000000 validation error ✅
Regional Performance: A(+2.3pp), B(+2.3pp), C(+2.3pp), D(-6.9pp)

5. Multi-Baseline Strategic Analysis

Baseline Sensitivity Analysis (from actual test output):

| Region | Rest-of-World | Global Average | Top Performer | Strategic Implication |
|---|---|---|---|---|
| A | +2.3pp | +1.7pp | +0.0pp | Top performer - maintain excellence |
| B | +2.3pp | +1.7pp | +0.0pp | Top performer - maintain excellence |
| C | +2.3pp | +1.7pp | +0.0pp | Top performer - maintain excellence |
| D | -6.9pp | -5.2pp | -6.9pp | Consistent underperformance requiring improvement |

Strategic Insights:

  • vs Rest-of-World: Regions A, B, C show identical strong performance (+2.3pp), Region D significantly lags

  • vs Global Average: Similar pattern with slightly reduced gaps (+1.7pp vs -5.2pp)

  • vs Top Performer: Regions A, B, C are tied as top performers (0.0pp gap), Region D shows full improvement potential

  • Baseline Selection Impact: Region D consistently underperforms across all baselines, confirming need for intervention
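The three baselines can be illustrated with the three-region Data table at the top of this note. That table is a different dataset from the four-region test output above, so the numbers will not match the sensitivity table. The sketch assumes that "global average" means the pooled rate across all regions and that "rest-of-world" pools every other region; both are assumptions, and `baseline_gaps` is an illustrative name.

```python
# (region, numerator, denominator) rows from the Data table.
ROWS = [
    ("A", 55, 100), ("A", 270, 900), ("A", 400, 800), ("A", 600, 1200),
    ("B", 468, 900), ("B", 27, 100), ("B", 360, 900), ("B", 480, 1100),
    ("C", 600, 1200), ("C", 900, 2700), ("C", 450, 900), ("C", 550, 1100),
]


def region_totals(rows):
    """Sum numerators and denominators per region."""
    totals = {}
    for region, num, den in rows:
        n, d = totals.get(region, (0, 0))
        totals[region] = (n + num, d + den)
    return totals


def baseline_gaps(rows):
    """Return (rates, gaps) where gaps[region] holds the gap vs each baseline."""
    totals = region_totals(rows)
    rates = {r: n / d for r, (n, d) in totals.items()}
    all_n = sum(n for n, _ in totals.values())
    all_d = sum(d for _, d in totals.values())
    top_rate = max(rates.values())
    gaps = {}
    for region, (n, d) in totals.items():
        rest_rate = (all_n - n) / (all_d - d)  # pooled rate of all other regions
        gaps[region] = {
            "rest_of_world": rates[region] - rest_rate,
            "global_average": rates[region] - all_n / all_d,
            "top_performer": rates[region] - top_rate,
        }
    return rates, gaps


rates, gaps = baseline_gaps(ROWS)
for region in sorted(rates):
    g = gaps[region]
    print(
        f"{region}: rate={rates[region]:.2%}  "
        f"vs RoW={g['rest_of_world'] * 100:+.1f}pp  "
        f"vs global={g['global_average'] * 100:+.1f}pp  "
        f"vs top={g['top_performer'] * 100:+.1f}pp"
    )
```

The gap against the global average is always smaller in magnitude than the gap against rest-of-world, because the global pool includes the region itself.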

Performance vs Composition Interpretation Guide

When analyzing the decomposition results, use this framework to determine the primary driver and appropriate business actions:

Composition-Driven Performance (|Composition Gap| > |Performance Gap|)

Interpretation: The region's performance difference is primarily due to having a different mix of products/categories compared to the baseline, rather than executing differently within those categories.

Business Actions:

  • Portfolio Optimization: Rebalance toward higher-performing product categories

  • Strategic Mix Adjustment: Increase allocation to products with better baseline performance

  • Market Positioning: Focus on segments where the region has natural advantages

  • Resource Reallocation: Shift investment from lower-performing to higher-performing categories

Performance-Driven Performance (|Performance Gap| > |Composition Gap|)

Interpretation: The region executes differently (better or worse) within the same product categories compared to the baseline, regardless of product mix.

Business Actions:

  • Execution Excellence: Improve operational processes and capabilities

  • Best Practice Sharing: Transfer successful execution methods from high-performing regions

  • Training & Development: Enhance team capabilities and operational efficiency

  • Process Optimization: Streamline workflows and eliminate execution bottlenecks
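The magnitude rule in the two headings above reduces to a single comparison. A minimal sketch, assuming the labels used in the results table (`composition_driven`, `performance_driven`); the function name is illustrative, and sending ties to `performance_driven` is an assumption because the note does not define tie handling.

```python
def classify_driver(composition_gap, performance_gap):
    """Label a gap by whichever component is larger in absolute value.

    Gaps are in percentage points. Ties fall through to "performance_driven"
    (an assumption; the rule as stated does not cover equal magnitudes).
    """
    if abs(composition_gap) > abs(performance_gap):
        return "composition_driven"
    return "performance_driven"


# Values from the regional results table: (composition gap, performance gap).
examples = {"A": (0.0, 2.3), "D": (-4.5, -2.4)}
for region, (comp, perf) in examples.items():
    print(region, classify_driver(comp, perf))
```

The label says only which component dominates in size. The sign of each component still needs to be read separately to tell a favorable driver from an unfavorable one.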

7. Key Methodological Insights

Mathematical Rigor

  • Perfect Decomposition: Every gap decomposes exactly into composition + performance components

  • Validation: Zero mathematical error across all test cases confirms framework reliability

  • Path Independence: The total gap and each per-product gap are identical under every split; only the apportionment between composition and performance depends on the baseline chosen

Business Intelligence

  • Simpson's Paradox Detection: Prevents misallocation of resources by identifying when product-level analysis misleads

  • Actionable Recommendations: Clear distinction between portfolio optimization vs execution improvement strategies

  • Multi-Baseline Analysis: Provides strategic context for setting improvement targets and priorities

6. Performance vs Composition Interpretation Guide

Framework for Reading Decomposition Results:

When Simpson's Paradox Detection Matters

Case 1: Gap Magnitude and Paradox Detection AGREE (Confirmatory)

Region C: |Performance gap| = 2.0pp, |Composition gap| = 12.7pp
Simpson's Paradox: Not detected
Conclusion: Composition-driven (both methods agree)
Action: Strategic mix rebalancing - reliable conclusion

Case 2: Gap Magnitude and Paradox Detection DISAGREE (Critical)

Hypothetical Region X: |Performance gap| = 8.0pp, |Composition gap| = 3.0pp
Simpson's Paradox: Detected in Product_Y
Gap magnitude suggests: "Performance-driven"
Paradox detection reveals: "Product-level analysis is misleading"
Conclusion: Must use subcategory-level analysis - product-level conclusion unreliable

Case 3: Gap Magnitude Suggests Composition, Paradox Confirms (Reinforcing)

Region B: |Performance gap| = 3.9pp, |Composition gap| = 9.4pp
Simpson's Paradox: Detected in Product_X
Both methods suggest: "Composition-driven"
Additional insight: Subcategory analysis reveals execution issues masked at product level

Why Simpson's Paradox Detection Matters Even When Gap Magnitudes Agree

Critical Insight: Paradox Detection Validates Analysis Reliability, Not Just Conclusions

Even when |composition_gap| > |performance_gap| AND Simpson's paradox detection both suggest "composition-driven," the paradox detection serves a crucial validation role:

Scenario: Both Methods Agree on "Composition-Driven"

Region B: |Performance gap| = 3.9pp, |Composition gap| = 9.4pp
Gap magnitude conclusion: "Composition-driven" ✓
Simpson's paradox detected: Product_X contradictions ⚠️

Why Paradox Detection Still Matters:

  1. Analysis Level Reliability: Paradox detection tells you whether the product-level composition story is trustworthy

  2. Hidden Execution Issues: Even if composition dominates, paradox reveals masked execution problems at subcategory level

  3. Action Precision: Changes business focus from "leverage good composition" to "leverage good composition AND address hidden execution gaps"

The Problem with Product-Level Analysis When Paradox Exists:

Product-Level View (Incomplete Truth):

Region B Product_X: +2.5pp contribution from composition, -0.8pp from performance
Gap magnitude conclusion: "B has superior Product_X composition advantage"
Business action: "Leverage B's Product_X portfolio strategy"

Vertical-Level Reality (Complete Truth):

Product_X Tech: B rate 52% vs baseline 55% = underperforms by 3pp
Product_X Finance: B rate 27% vs baseline 30% = underperforms by 3pp
Mix effect: B has 90% in high-performing Tech vs baseline's 10%
Complete story: B has superior vertical mix BUT underperforms in BOTH verticals
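The vertical-level picture can be reproduced from the Product_X rows of the Data table. The sketch below assumes the baseline is Region A, whose rows match the 55%, 30% and 10% figures quoted above. It decomposes B's within-Product_X gap using the baseline-as-reference split, so the results are within-product figures and are not the +2.5pp / -0.8pp region-level contributions reported by the framework.

```python
# Product_X rows as (numerator, denominator), from the Data table.
REGION_B = {"Tech": (468, 900), "Finance": (27, 100)}
BASELINE = {"Tech": (55, 100), "Finance": (270, 900)}  # Region A (assumed baseline)


def shares_and_rates(group):
    """Return {vertical: (mix share within the product, rate)}."""
    total_den = sum(den for _, den in group.values())
    return {v: (den / total_den, num / den) for v, (num, den) in group.items()}


b = shares_and_rates(REGION_B)
base = shares_and_rates(BASELINE)

# Mix effect valued at baseline rates; rate effect valued at B's own mix.
mix_effect = sum((b[v][0] - base[v][0]) * base[v][1] for v in b)
rate_effect = sum(b[v][0] * (b[v][1] - base[v][1]) for v in b)

product_rate_b = sum(m * r for m, r in b.values())
product_rate_base = sum(m * r for m, r in base.values())
product_gap = product_rate_b - product_rate_base

print(f"Product_X rate: B={product_rate_b:.3f}  baseline={product_rate_base:.3f}")
print(f"gap={product_gap:+.3f}  mix={mix_effect:+.3f}  rate={rate_effect:+.3f}")
```

At product level B looks far ahead, yet the rate effect is negative: the whole advantage, and more, comes from vertical mix.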

Why Region-Level Analysis Cannot Provide This Insight:

Region-Level Analysis Limitations:

  • Shows B has "good Product_X performance contribution" (+1.7pp net)

  • Cannot distinguish between:

    • Scenario A: "B executes Product_X well across all verticals"

    • Scenario B: "B underperforms in all Product_X verticals but has superior vertical mix"

  • Both scenarios produce identical region-level metrics but require completely different business actions

Business Action Implications:

  • Without Subcategory Analysis: "Replicate B's Product_X execution methods" (wrong if Scenario B)

  • With Subcategory Analysis: "Replicate B's vertical allocation strategy within Product_X while addressing execution gaps" (correct for Scenario B)

The Critical Distinction:

  • Operational Training (if execution advantage) vs Strategic Portfolio Optimization (if mix advantage)

  • Capability Building (if skill gaps) vs Resource Allocation (if portfolio gaps)

  • Process Improvement (if operational issues) vs Strategic Planning (if composition issues)

Practical Interpretation Examples

Region A Example (Reliable Analysis):

  • |Performance gap| = 0.1pp, |Composition gap| = 1.2pp

  • Simpson's Paradox: Not detected ✅

  • Conclusion: Composition-driven with minimal performance impact (1.2pp > 0.1pp)

  • Business Meaning: Slight outperformance stems primarily from favorable product mix

  • Action Focus: Maintain portfolio advantage while fine-tuning execution

  • Reliability: Product-level analysis is trustworthy

Region B Example (Analysis Requires Caution):

  • |Performance gap| = 5.0pp, |Composition gap| = 6.6pp

  • Simpson's Paradox: Detected ⚠️

  • Gap Magnitude Suggests: Composition-driven (6.6pp > 5.0pp)

  • Paradox Detection Reveals: Product-level performance analysis is misleading

  • Business Meaning: Success comes from superior product portfolio, but significant execution issues masked at product level

  • Action Focus: Leverage portfolio advantage while conducting subcategory-level execution analysis

  • Reliability: Product-level composition conclusion reliable, performance conclusion requires subcategory verification

Region C Example (Reliable Analysis):

  • |Performance gap| = 2.6pp, |Composition gap| = 4.5pp

  • Simpson's Paradox: Not detected ✅

  • Conclusion: Composition-constrained with execution strength (4.5pp > 2.6pp)

  • Business Meaning: Strong execution offset by unfavorable product portfolio

  • Action Focus: Strategic mix rebalancing while preserving execution excellence

  • Reliability: Both composition and performance conclusions are trustworthy

Key Decision Framework

When to Trust Product-Level Analysis:

  • ✅ No Simpson's paradox detected

  • ✅ Gap magnitude comparison provides clear direction

  • ✅ Business actions can focus on product-level insights

When to Require Subcategory Analysis:

  • ⚠️ Simpson's paradox detected in any product

  • ⚠️ Gap magnitude suggests performance-driven but paradox detected

  • ⚠️ Need to verify execution conclusions before operational changes

  • ⚠️ Business actions involve replicating "successful execution" patterns

The Critical Insight: Simpson's paradox detection prevents incorrect business conclusions. Without it, Region B would be seen as having "good Product_X execution to replicate" when actually they have "good Product_X vertical mix to replicate." This fundamentally changes the improvement strategy from operational training to strategic portfolio optimization.

7. Key Methodological Insights

Framework Validation Results:

  • Mathematical Accuracy: All decomposition components sum correctly to total gaps (0.000000 error for all regions)

  • Simpson's Paradox Detection: Successfully identified Region B contradictions at subcategory level

  • Region-Independent Analysis: Each region receives separate reliability assessment

  • Multi-Baseline Flexibility: Different baselines provide strategic context (rest-of-world, global average, top performer)

  • Top Performer Logic: Correctly identifies Region B as best performer (44.50% rate)

Business Value Demonstrated:

  • Perfect Mathematical Validation: Zero validation error across all regions confirms framework reliability

  • Paradox Prevention: Detected Region B contradictions at subcategory level, preventing misleading product-level conclusions

  • Strategic Clarity: Clear action priorities based on mathematical gap decomposition

  • Analytical Reliability: Region-specific paradox detection ensures trustworthy insights

  • Benchmark Identification: Region B identified as top performer for stretch targets

Critical Implementation Factors:

  • Mathematical Rigor: Perfect decomposition validation (0.000000 error) establishes credibility

  • Multi-Level Validation: Subcategory analysis prevents Simpson's paradox misinterpretation

  • Region-Specific Assessment: Each region's analytical reliability evaluated independently

  • Baseline Strategy: Multiple baselines reveal different improvement opportunities

  • Implicit Paradox Checking: Simpson's paradox detection integrated into narrative generation

Real-World Application: This validated framework enables leadership to make data-driven decisions about resource allocation, distinguishing between regions needing composition optimization (Region A), analytical caution due to paradoxes (Region B), and strategic portfolio rebalancing (Region C).
