Why You Need More Metrics to Evaluate Your Marketing Mix Models (MMM)
By expanding the evaluation criteria for MMM, we believe we are paving the way for robust and insightful modeling practices.
One of the key questions we at Aryma Labs asked ourselves while building Marketing Mix Models (MMM) was: Are traditional MMM calibration metrics simply measuring the same thing in slightly different ways?
Why is this important?
Traditional MMM calibration metrics such as R squared, NRMSE, and in-sample MAPE are widely used, but they tend to focus on similar aspects of model performance, mainly how well the model fits the data. While these metrics are useful, relying solely on them limits our understanding of model quality.
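As a quick reference, the three traditional metrics can be sketched as follows. This is a minimal illustration using common textbook definitions; in particular, NRMSE normalization conventions vary (range vs. mean), and the toy numbers are invented for demonstration.

```python
import numpy as np

def traditional_metrics(y_true, y_pred):
    """Compute the three fit-focused metrics: R squared, NRMSE, MAPE."""
    resid = y_true - y_pred
    ss_res = np.sum(resid ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r_squared = 1.0 - ss_res / ss_tot
    # NRMSE: RMSE normalized by the range of the observed series
    # (some practitioners normalize by the mean instead)
    nrmse = np.sqrt(np.mean(resid ** 2)) / (y_true.max() - y_true.min())
    # In-sample MAPE, expressed as a fraction (multiply by 100 for %)
    mape = np.mean(np.abs(resid / y_true))
    return r_squared, nrmse, mape

# Illustrative toy data, not real sales figures
y_true = np.array([100.0, 120.0, 90.0, 110.0, 130.0])
y_pred = np.array([98.0, 118.0, 95.0, 108.0, 128.0])
r2, nrmse, mape = traditional_metrics(y_true, y_pred)
```

Note how all three quantities are functions of the same residual vector, which is one intuition for why they tend to move together.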
When evaluating an MMM, you ideally want to assess it across a diverse set of parameters.
This realization led us to develop new calibration metrics for MMM. (You can find the link to our research paper in the resources section.) These innovations also inspired the creation of MMM Diagnose, a tool designed to evaluate models across diverse dimensions.
New Calibration Metrics
We introduced new calibration metrics for MMM built on established statistical tools such as KL Divergence, PIT residuals, and Chebyshev’s inequality. These metrics go beyond traditional measures, capturing information related to:
▪️ Probability distributions
▪️ Residual behavior
▪️ Information theory
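To illustrate the three ideas, here is a minimal sketch of how each quantity might be computed for a fitted model. The exact definitions in our papers may differ; the histogram binning, smoothing constant, assumed normal residual model, and synthetic data below are all illustrative assumptions.

```python
import numpy as np
from scipy import stats

# Synthetic actuals and predictions (illustrative only)
rng = np.random.default_rng(0)
y_true = rng.normal(100.0, 10.0, 500)
y_pred = y_true + rng.normal(0.0, 5.0, 500)
resid = y_true - y_pred

# Information theory: KL divergence between histograms of the
# actual and predicted series (smoothed to avoid log(0))
bins = np.histogram_bin_edges(np.concatenate([y_true, y_pred]), bins=20)
p, _ = np.histogram(y_true, bins=bins)
q, _ = np.histogram(y_pred, bins=bins)
p = (p + 1e-9) / (p + 1e-9).sum()
q = (q + 1e-9) / (q + 1e-9).sum()
kl = np.sum(p * np.log(p / q))

# Probability distributions: PIT residuals. If the residuals match
# the assumed N(mu, sigma) model, their PIT values are ~ Uniform(0, 1)
pit = stats.norm.cdf(resid, loc=resid.mean(), scale=resid.std())

# Residual behavior: Chebyshev's inequality says at most 1/k^2 of
# residuals can lie beyond k standard deviations of the mean
k = 2
frac_outside = np.mean(np.abs(resid - resid.mean()) > k * resid.std())
```

A uniformity check on `pit` (e.g. a Kolmogorov–Smirnov test) and comparing `frac_outside` against the 1/k² bound are two ways such diagnostics can flag problems that a high R squared would hide.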
📌 Mapping Metrics in a Vector Space
To understand how these new metrics relate to traditional ones, we projected all the calibration metrics into a vector space. By conducting a Principal Component Analysis (PCA) on ten of the latest models we built for clients in Q4 2024, we aimed to explore their orientations and interrelationships.
Hypothesis:
Metrics calculated from residuals (R squared, NRMSE) might cluster together, while metrics derived from probability distributions or information theory (KL Divergence, PIT residuals) could orient differently in the vector space.
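The projection itself is straightforward to sketch. The snippet below builds a synthetic models-by-metrics matrix (illustrative values, not our client data) in which two residual-based metrics share a common "fit quality" driver while a distribution-based metric does not, then runs PCA via SVD on the standardized matrix:

```python
import numpy as np

# Rows = models, columns = calibration metrics (synthetic, illustrative).
# The two residual-based columns are driven by the same fit-quality
# signal; the KL-like column follows an unrelated pattern.
fit = np.linspace(0.7, 0.95, 10)                       # shared driver
r2_like = fit
nrmse_like = 1.0 - fit + 0.01 * np.cos(np.arange(10))  # tracks fit
kl_like = np.sin(np.arange(10))                        # unrelated
X = np.column_stack([r2_like, nrmse_like, kl_like])

# Standardize columns, then PCA via SVD of the standardized matrix
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
U, S, Vt = np.linalg.svd(Xs, full_matrices=False)
explained = S**2 / np.sum(S**2)  # variance explained per component
loadings = Vt                    # rows = PCs, columns = metrics

# The same geometry viewed as correlations: the residual-based
# metrics align almost perfectly; the KL-like metric does not
corr = np.corrcoef(Xs, rowvar=False)
```

Inspecting `loadings` (or a biplot of the first two components) shows the residual-based metrics pointing in nearly the same direction, while the distribution-based metric loads on a different component — the clustering pattern the hypothesis predicts.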
Key Insights from the PCA
The PCA results seem to validate our hypothesis:
Distinct Orientations: Residual-based metrics exhibited similar orientations, indicating that they measure closely related aspects of model performance.
New Perspectives: Metrics like KL Divergence and PIT residuals showed distinct orientations, offering insights that traditional metrics cannot provide.
Why This Matters
If you restrict yourself to traditional metrics such as R squared, NRMSE, and MAPE, you risk evaluating your models on redundant dimensions. Incorporating more distinct metrics provides a broader, more accurate understanding of model performance.
Some caveats
While our findings are promising, there’s more to explore. We will continue our research to:
- Validate these insights across a broader set of models.
- Refine the application of new calibration metrics.
- Share practical guidelines for their implementation.
Stay tuned for further updates: we plan to publish a detailed research paper or whitepaper soon.
Resources:
Our research papers on calibration metrics:
https://www.techrxiv.org/users/778033/articles/912223-investigation-of-marketing-mix-models-business-error-using-kl-divergence-and-chebyshev-s-inequality
https://www.techrxiv.org/users/778033/articles/941571-calibrating-marketing-mix-models-through-probability-integral-transform-pit-residuals
You can read more of our research work here: https://arymalabs.com/resources/
Link to the MMM Diagnose app: https://arymalabs.com/mmmdiagnose/
Thanks for reading.
For consulting and help with MMM implementation, click here.