By Steve Cohen
It’s not often that I take offense at something written on the Internet. After all, it’s a wild, wild West out there where moderation is often in short supply and signs of intelligent life are hard to find.
With this in mind, during a recent Google search I came across a White Paper that claims to “debunk” the use of “fancy, schmancy” segmentation procedures. In particular, the author laments the fact that the statisticians who “are running the asylum” recommend for segmentation studies the use of MaxDiff Scaling and Latent Clustering (which is called Latent Class or Mixture Models by all people I know). In fact, the author states that using both in segmentation studies is a “recipe for disaster.”
Wow. Just wow.
As someone who has won several awards for my work introducing and using MaxDiff and Latent Class segmentation in the marketing research community, I had to read this document in depth. When I did, I took immediate offense at several of the boneheaded assertions in it.
Let’s take a brief tour of what the author claims. First, MaxDiff
“… is a great measurement tool that should not be used as the source of segmentation inputs. Sound segmentation inputs need to be measured at the individual level and use a method that can be readily reproduced in “short forms” applied during follow-up research. MaxDiff does neither.”
I have two very serious problems with this. First, in my experience using MaxDiff in segmentation studies since 1997, I know that MaxDiff can produce stable, reliable, and very usable results that provide much better differentiation and interpretability than traditional methods. And, second, I contend that a clever analyst can develop a very compact and accurate short-form using MaxDiff results that can be applied in follow-up research.
MaxDiff does measure segmentation inputs at the individual level. These measures are the responses collected in the best-worst choice tasks. Under certain circumstances, MaxDiff can also yield individual-level utilities that are estimated using a Hierarchical Bayesian multinomial logit (HB-MNL) model.
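To make "individual level" concrete, here is a minimal sketch of one simple way to summarize a single respondent's best-worst choices: a best-minus-worst count per item. This is not the HB-MNL estimation mentioned above, and the function name, item names, and responses are all hypothetical, invented for illustration.

```python
from collections import Counter

def best_worst_counts(tasks):
    """Best-minus-worst count per item for one respondent.

    tasks: list of (shown_items, best_item, worst_item) tuples.
    """
    best = Counter(b for _, b, _ in tasks)
    worst = Counter(w for _, _, w in tasks)
    shown = Counter(item for items, _, _ in tasks for item in items)
    # Items never picked as best or worst get a score of 0.
    return {item: best[item] - worst[item] for item in shown}

# Hypothetical responses from a single respondent (item names are made up).
respondent_tasks = [
    (("price", "quality", "service", "brand"), "price", "brand"),
    (("price", "speed", "service", "brand"), "price", "service"),
    (("quality", "speed", "service", "brand"), "quality", "brand"),
]
scores = best_worst_counts(respondent_tasks)
```

Each respondent gets a score profile of their own, which is the sense in which the raw MaxDiff responses are already person-level data.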
The author seems to be blissfully unaware of the discussions in the marketing science literature these past few years about the nature of segments. Prof. Greg Allenby of Ohio State has argued that heterogeneity (segments) should be measured on a person-by-person basis and we should think of segments as people who behave at the extremes, based on an examination of the individual-level utilities. Others, like Michel Wedel at Maryland and Wagner Kamakura at Rice, claim that segments are really constructs that help managers deal with the complexity of markets by providing shorthand ways of talking about consumers and customers in aggregates — which we call segments.
My own view leans heavily to not using individual-level utilities estimated with hierarchical Bayesian tools, since the utilities are assumed to be drawn from a normal distribution, meaning the distribution of utilities is smooth and thus does not display any obvious places to “cut” into groups. What I do instead is use Latent Class Models, which assume that the utilities can be estimated to be lumpy and multi-modal, meaning that segments, if they exist, can be discovered.
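The contrast between a smooth and a lumpy distribution can be illustrated numerically. The sketch below only compares densities; it does not estimate either model. All parameter values are made up, and giving each class a normal spread around its mean is an assumption made purely so the two shapes can be compared.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

def mixture_pdf(x, components):
    """Density of a mixture given (weight, mean, sd) components."""
    return sum(w * normal_pdf(x, m, s) for w, m, s in components)

# Hypothetical utility distributions for one attribute.
single_normal = [(1.0, 0.0, 2.0)]                  # one smooth population
two_classes = [(0.5, -2.0, 0.7), (0.5, 2.0, 0.7)]  # two latent classes

# Density at the center (0) versus at a class mean (2).
smooth_center = mixture_pdf(0.0, single_normal)
smooth_side = mixture_pdf(2.0, single_normal)
lumpy_center = mixture_pdf(0.0, two_classes)
lumpy_side = mixture_pdf(2.0, two_classes)
```

Under these assumed values the single normal peaks at the center and falls away smoothly, while the two-class mixture has a valley at the center between two peaks. That valley is the kind of natural place to "cut" that a single normal distribution never offers.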
By the way, since Choice-Based Conjoint Analysis also uses choice inputs and then estimates individual-level utilities using HB-MNL, would the author make the same argument to debunk CBCA? Somehow, I think not.
My guess is that the author has been using Sawtooth Software, which does generate individual-level utilities, in a rote way too often and has not paid much attention to the behavioral science behind segmentation nor to the assumptions underlying these tools.
Short Form MaxDiff?
Let’s examine the second claim that MaxDiff does not yield a method that can be used in a short-form after the segmentation study is complete. Specifically, the author says,
“There is no way to reproduce the MaxDiff importance scores in a short-form classification algorithm.”
First of all, follow-up short-form classification surveys are never designed to reproduce the MaxDiff importance scores. What is this claim all about?
Rather, as in traditional segmentation studies which employ Discriminant Analysis for post hoc classification, the function of the short-form is to assign people to known segments which have known characteristics by using as few questions as is reasonably possible. Got that? We are not looking to reproduce importances, but just to put people into groups with good accuracy.
I find it hilarious that this wrong-headed assertion is compounded by this declaration about short-form classification tools:
“… the accuracy rates are so low they would scare you. As a result, short forms generated off MaxDiff segmentation schemes tend to be both lengthy and inaccurate.”
I can state categorically that, in my experience, we can create short forms that are as accurate as, or even more accurate than, traditional methods and are much more compact. I have personally created such short forms, and these typically contain fewer than 10 questions with accuracy rates in excess of 85%.
Latent Clustering (sic)
Latent Class (LCM) or Mixture Models are based on sound statistical foundations and have a long history of use in marketing science and many other disciplines for uncovering hidden (latent) groups (classes or segments).
So what is the author’s beef with Latent Clustering (sic)? Again, I quote:
“Consumer segmentations are generally done on survey data and respondents have the unfortunate tendency to use scales in slightly different ways from each other (see benefits of MaxDiff). The reason this is a problem in Latent Clustering is that frequently the model tends to form segments based on how people use the scale (e.g., high raters or middle raters) rather than what people were trying to tell us on the scale.”
Hello? Respondents using a rating scale badly is a ubiquitous problem, not only for clustering or grouping of any flavor, but also for brand ratings and many other typical marketing research tasks. Blaming LCMs for how people answer surveys in a biased way is just absurd.
Is there a suggested alternative?
“Transformations (e.g., within-respondent-standardization) that are an effective solution to this issue in Euclidean distance models do not prevent Latent Clustering from generating these meaningless groups,”
I really tried to untangle this word salad, but there are so many ideas happening in this one sentence, I was forced to reach for the aspirin bottle.
But suppose, just for example, that some survey respondents show little or no within-person variation. Claiming that within-respondent standardization solves this issue is wrong; it can create yet another set of thorny problems. Think about it. If a respondent “straight-lines” a series of survey attitudes (which happens quite frequently), a within-respondent standardization will require dividing each response’s deviation from that person’s mean by his/her own standard deviation, which is exactly equal to or very close to zero. Good luck with that being an effective solution.
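The straight-lining problem is easy to demonstrate. The following is a minimal sketch, assuming the usual z-score form of within-respondent standardization (each rating minus the respondent's own mean, divided by the respondent's own standard deviation). The function name and the ratings are hypothetical.

```python
import statistics

def standardize_within_respondent(ratings):
    """Z-score one respondent's ratings; return None if they have no spread."""
    mean = statistics.fmean(ratings)
    sd = statistics.pstdev(ratings)
    if sd == 0:
        # Straight-liner: (x - mean) / 0 is undefined, so there is nothing to return.
        return None
    return [(x - mean) / sd for x in ratings]

# Hypothetical 5-point ratings from two respondents.
varied = standardize_within_respondent([1, 3, 5, 3, 1])
straight_liner = standardize_within_respondent([4, 4, 4, 4, 4])
```

The respondent with varied answers gets usable standardized scores, while the straight-liner has a standard deviation of zero and cannot be standardized at all. The analyst is then left to drop, impute, or special-case those respondents, which is exactly the extra set of thorny problems described above.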
Mixed Levels of Measurement
Yet another beef with LCMs!
“The ability to mix metrics generates the temptation to throw in the kitchen sink and segment on virtually the entire survey (attitudes, needs, behaviors, demographics and even brand usage!).”
Good lord! You mean to say that there are researchers in our industry who dump the kitchen sink in a segmentation analysis without even thinking about what they are doing? Oh, no! Where have I been all these years?
My contention is that, used judiciously and wisely, variables at mixed levels of measurement are a great help in developing actionable segmentation solutions. Dumping everything in at once is not a flaw of LC models, but rather of an ineffective analyst.
So what is the suggested alternative?
You dear readers who have actually spent the time to read the quoted article were, no doubt, eager to hear the punch line.
Once the author has “debunked” these tools, surely the magic bullet, the keys to the kingdom, the secrets of life, and the sacred tablets as written by the author will be shown to us lowly mortals.
And what do we get? What do we hear? What is the long-awaited wisdom? What should we do instead of using these heinous methods?
(That is the sound of crickets.)
I suggest that this author clearly needs to get a firm grip on the behavioral and statistical assumptions, theories, and methods of MaxDiff, Latent Class Models, and Hierarchical Bayesian modeling. Spending time trashing these modern advances, misunderstanding their uses and application, and then suggesting nothing to replace them is not even remotely helpful.
Expecting everyone in marketing research to be an above-average analyst born in Lake Wobegon is foolhardy. Perhaps the author will come to realize that some people are just good examples of the Dunning-Kruger effect.
Over twenty years of working with these tools have convinced me that conducting segmentation studies using MaxDiff and Latent Class models represents a powerful combination of tools for marketing researchers and is not at all a recipe for disaster. Is this combination to be used all of the time? Of course not. Marketing researchers should select the best methods and statistical procedures to meet the objectives at hand.
Are the statisticians running the segmentation asylum? Hardly.
Let’s not follow flawed guidance that may not be based on a full picture of the collective experience and best thinking of many experts (not just me!). Otherwise, the incompetents may end up running the segmentation asylum and that is really why it could get scary out there.