Causation in a Nutshell

Clear thinking about causation is increasingly important in marketing research (MR), especially as we move into a world of bigger, more diverse data sets.

by Kevin Gray

President at Cannon Gray

Editor’s Intro: Causation is not a simple analytic topic, and it is easy to get caught out by a mistake. Kevin lays out some of the challenges and lessons learned, which makes this a good introduction to the topic.

Every move we make, every breath we take, and every heartbeat is an effect that is caused. Even apparent randomness may just be something we cannot explain.

Knowing the who, what, when, where, etc., is vital in marketing. Predictive analytics can also be useful for many organizations. However, also knowing the why helps us better understand the who, what, when, where, and so on, and the ways they are tied together. It also helps us predict them more accurately. Knowing the why increases their value to marketers and increases the value of marketing.

Analysis of causation can be challenging, though, and there are differences of opinion among authorities. The statistical orthodoxy is that randomized experiments are the best approach. Experiments in many cases are infeasible or unethical, however. They also can be botched or be so artificial that they do not generalize to real-world conditions. They may also fail to replicate. They are not magic.

Non-experimental research may be our only option in many instances. The key distinction between randomized experiments and non-experimental research is that, in experiments, subjects (e.g., consumers) are randomly assigned to treatment conditions (e.g., different versions of a website). Randomization reduces the possibility that the groups were different before the experiment in ways which might bias the results.
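To make the role of randomization concrete, here is a minimal simulation sketch. Everything in it is a hypothetical assumption for illustration: the variable names, the "prior spend" trait, and the opt-in probabilities are not taken from any real study. It compares how well a coin-flip assignment and a self-selected assignment balance a pre-existing trait across groups.

```python
import random
import statistics


def covariate_gap(prior_spend, assigned):
    """Mean prior spend of the treated group minus that of the control group."""
    treated = [p for p, a in zip(prior_spend, assigned) if a]
    control = [p for p, a in zip(prior_spend, assigned) if not a]
    return statistics.mean(treated) - statistics.mean(control)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 20_000

# Hypothetical pre-existing trait: each consumer's spend before the test.
prior_spend = [rng.gauss(100, 20) for _ in range(n)]

# Randomized experiment: a coin flip decides who sees the new website version.
randomized = [rng.random() < 0.5 for _ in range(n)]

# Non-experimental data: heavier spenders are ASSUMED more likely to opt in.
self_selected = [rng.random() < (0.8 if p > 100 else 0.2) for p in prior_spend]

random_gap = covariate_gap(prior_spend, randomized)
selected_gap = covariate_gap(prior_spend, self_selected)

print(f"Pre-existing gap under randomization:  {random_gap:.2f}")
print(f"Pre-existing gap under self-selection: {selected_gap:.2f}")
```

Under these assumptions the randomized groups start out nearly identical on prior spend, while the self-selected groups differ substantially before any treatment is applied. Any later difference in outcomes between the self-selected groups would mix the treatment effect with that head start.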

In non-experimental research, this random assignment mechanism is absent. Fortunately, various statistical methods have been developed to reduce bias caused by pre-existing differences among groups. This short article summarizes some of the more popular ones.
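As one simple illustration of such an adjustment, the sketch below uses stratification: the treated and untreated groups are compared within levels of an observed pre-existing characteristic, and the within-level differences are then averaged. The data are simulated and every name and number (the `heavy` segment, the true effect of 5, the exposure probabilities) is a hypothetical assumption. Stratification is only one of several possible methods, and it can only correct for differences that were actually measured.

```python
import random
import statistics

TRUE_EFFECT = 5.0  # assumed true lift from the treatment, for illustration only


def make_data(n=20_000, seed=1):
    """Simulate (heavy_user, treated, outcome) rows with a confounded exposure."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        heavy = rng.random() < 0.5
        # Heavy users are ASSUMED more likely to be exposed to the treatment.
        treated = rng.random() < (0.8 if heavy else 0.2)
        outcome = (50 + (20 if heavy else 0)
                   + (TRUE_EFFECT if treated else 0) + rng.gauss(0, 10))
        rows.append((heavy, treated, outcome))
    return rows


def diff_in_means(rows):
    """Treated mean outcome minus untreated mean outcome."""
    treated = [y for _, t, y in rows if t]
    control = [y for _, t, y in rows if not t]
    return statistics.mean(treated) - statistics.mean(control)


def stratified_estimate(rows):
    """Average the within-stratum differences, weighted by stratum size."""
    estimate = 0.0
    for level in (True, False):
        stratum = [r for r in rows if r[0] == level]
        estimate += len(stratum) / len(rows) * diff_in_means(stratum)
    return estimate


rows = make_data()
naive = diff_in_means(rows)
adjusted = stratified_estimate(rows)

print(f"Naive difference:      {naive:.2f}")
print(f"Stratified difference: {adjusted:.2f} (assumed true effect: {TRUE_EFFECT})")
```

In this simulation the naive comparison overstates the effect because heavy users are over-represented among the treated, while the stratified estimate lands close to the assumed true value. With real data the relevant pre-existing differences are rarely this tidy or fully observed.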

Causal analysis is part of my work and has been an area of interest to me for many years. Based on my experience, outside reading, and interaction with academics and other researchers, I’d like to offer a few practical tips for marketing researchers.

First, be very clear about objectives and client expectations. Politics can knock out any statistical model in the first round. If the recommendations of academic consultants can be disregarded, so can ours!
Use randomized experiments whenever possible. Design and analysis of experiments has come a long way since the days of R.A. Fisher, but simple designs are often all we’ll need. An excellent reference is Experimental Design: Procedures for the Behavioral Sciences (Kirk).
Most effects have more than one cause, and alternative causal models may fit the data about equally well but suggest different courses of action to decision-makers.
Don’t be fooled by randomness. Stuff Happens explains what I mean by this.
Don’t be fooled by spurious correlations. A correlation between ice cream consumption and sunburn is probably attributable to weather, for instance.
There are also interactions – moderated effects – in which the relationship between two variables depends on a third variable. For example, older consumers may be heavy users among males but light users among females.
Mediation is easy to confuse with moderation, though they are not the same. A mediator variable is influenced by an independent variable and, in turn, affects the dependent variable. Thus, an independent variable may have both direct and indirect effects, depending on the causal model. Mediation is currently a hot topic in the marketing literature.
Relationships between variables are not always linear. Below a certain point, for instance, increases in ad spend may have no impact. Above that point, sales may increase very rapidly until they begin to taper off, with further increases in ad spend having little or no effect. Statistical analyses need to account for patterns such as this.
Cause will normally precede effect, but causation can also be reciprocal. For instance, A influences B, then B influences A. Brand usage and image frequently interact in a similar fashion.
Lagged effects are common in time-series data. The impact of a new ad campaign, for example, may not show up in sales data for several days or weeks. If we analyze data collected across time with methodologies designed for cross-sectional data, we may get a very distorted picture of the marketplace.
There are often distinct causal models at work for different consumer segments. Some of these segments, such as lifestage, may be obvious but others are hidden and must be uncovered through statistical analysis.
Beware of regression to the mean. In key driver analysis, for example, variables scoring very high (or very low) in relative importance will probably rank more towards the middle the next time we conduct the study (though still high or low). Regression to the mean is a statistical phenomenon but researchers often mistakenly attribute its effects solely to marketing activities or market trends.
More data and bigger data have actually made business understanding more important, not less. Correlation has not replaced theory, despite some claims by data scientists.
Use significance testing judiciously. Statistical significance does not imply business significance, and a variable’s effect size is far more meaningful.
Also, be wary of automated modeling, even if it’s been branded as machine learning or AI. With just a few variables, many causal models are possible and mathematical criteria can seldom determine which is best for decision-makers. As the legendary statistician George Box put it, “Essentially, all models are wrong, but some are useful.”
Marketing ROI is often treated as an accounting exercise but is really a form of causal analysis. Econometric methods are generally most suitable because they enable us to examine how inputs (e.g., marketing activity) covary with outputs (e.g., sales) over time.
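
Several of the tips above can be explored with quick simulations. As one example, here is a minimal sketch of regression to the mean. It assumes, purely for illustration, that each measured score is a stable underlying level plus independent noise in each wave of a study. Nothing changes between waves, yet the items that scored highest in the first wave score lower on average in the second, while remaining above the overall average.

```python
import random
import statistics

rng = random.Random(2)  # fixed seed so the sketch is repeatable
n = 5_000

# Hypothetical stable level for each item, plus fresh noise in each wave.
true_level = [rng.gauss(0, 1) for _ in range(n)]
wave1 = [t + rng.gauss(0, 1) for t in true_level]
wave2 = [t + rng.gauss(0, 1) for t in true_level]

# Pick the items in the top 10% of wave-1 scores.
cutoff = sorted(wave1)[int(0.9 * n)]
top = [i for i in range(n) if wave1[i] >= cutoff]

top_wave1 = statistics.mean([wave1[i] for i in top])
top_wave2 = statistics.mean([wave2[i] for i in top])
overall_wave2 = statistics.mean(wave2)

print(f"Top group, wave 1: {top_wave1:.2f}")
print(f"Top group, wave 2: {top_wave2:.2f}")
print(f"All items, wave 2: {overall_wave2:.2f}")
```

Because no marketing activity or market trend exists in this simulation, the drop in the top group's average is produced entirely by noise in the first measurement. That is the pattern that can be mistaken for a real change.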

This short interview with Harvard professor Tyler VanderWeele provides a snapshot of causal analysis. Mastering ’Metrics (Angrist and Pischke) and Observation and Experiment (Rosenbaum) are two comparatively non-technical overviews of the topic.

The Shadish, Cook and Campbell classic Experimental and Quasi-Experimental Designs is a hard read in places, but I’d recommend it to any marketing researcher. The book’s diagrams of research designs and summaries of their advantages and vulnerabilities are priceless and timeless.

There are also advanced books more appropriate for marketing scientists, such as Causal Inference (Imbens and Rubin), Counterfactuals and Causal Inference (Morgan and Winship), Explanation in Causal Inference (VanderWeele), and Linear Causal Modeling with Structural Equations (Mulaik).

The importance of theory in causal analysis is difficult to overstate. A theory is often tested multiple times and in different ways, and a set of procedures known as meta-analysis is increasingly used to statistically synthesize the results of numerous primary studies.
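
To give a flavor of how a meta-analysis combines studies, the sketch below applies fixed-effect inverse-variance pooling, one of the simplest approaches: each study's effect estimate is weighted by the inverse of its squared standard error, so more precise studies count for more. The three effect sizes and standard errors are made-up numbers for illustration, not results from any real study, and real meta-analyses often use more elaborate models (for example, random-effects models).

```python
import math


def pool_fixed_effect(effects, std_errors):
    """Return the inverse-variance weighted mean effect and its standard error."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se


# Hypothetical primary studies: (effect estimate, standard error).
effects = [0.10, 0.30, 0.20]
std_errors = [0.10, 0.10, 0.05]

pooled, pooled_se = pool_fixed_effect(effects, std_errors)
print(f"Pooled effect: {pooled:.3f} (standard error {pooled_se:.3f})")
```

The third study has the smallest standard error, so it dominates the pooled estimate. The pooled standard error is smaller than that of any single study, which is the statistical payoff of synthesizing several tests of the same theory.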

Causality is finally beginning to receive the attention I feel it has long deserved. Hopefully, it will not turn into another business fad, with fallacies and embellishments going viral on the Internet.


Disclaimer

The views, opinions, data, and methodologies expressed above are those of the contributor(s) and do not necessarily reflect or represent the official policies, positions, or beliefs of Greenbook.
