By Kevin Gray
One of the things I like most about Marketing Research is its diversity. On the job, all of us wear many hats. What’s more, Marketing Research is no longer a cottage industry and has developed into a large global business, and we come from many different ethnic, religious and national backgrounds. In terms of education we also hail from all over the map, including literature, linguistics, philosophy and the social sciences. Our educational backgrounds often have given us just a fleeting glimpse of statistics and research methodology and to many of us “analytics” is arcane. It can mean different things to different Marketing Researchers and the role of a Marketing Scientist also varies from company to company.
In this short article I’d like to draw back the curtain for a moment and let you peer inside the sometimes mystifying world of the Marketing Scientist. Let’s imagine a hard case scenario: you are a Marketing Scientist who is given data and asked to “find something interesting for my client.” While this sort of request might strike even someone new to our business as unprofessional and unreasonable, it has happened to me on more occasions than I care to remember.
What to do under these circumstances? First, some fairly obvious questions come to mind apart from confirming budget and timing:
- What do the data mean? You’ve been given a long questionnaire and data map but it could take ages to go through and make sense of them. Not clearly understanding what the data mean can lead to a lot of rework later on or, even worse, misleading or incorrect results and recommendations.
- How clean are the data? This is something you’ll have to ascertain yourself and you should never assume data are error-free.
- Who will be using the results, and how and when will they be used?
- What are the client’s expectations?
None of these questions can be answered automatically by software and each requires human expertise. The last two questions are most critical and it is at this point that many wrong turns are taken because incorrect assumptions are made. This can lead to a lot of agony (and cost!) downstream. “Getting to know” the data and data cleaning can be accomplished in tandem, through an iterative process. Recoding data is usually done at this stage. This can be laborious and time-consuming but is usually essential because many questions or data fields have a large number of categories that must be re-grouped or combined so they are easier to interpret or because of small base sizes. Rating scales may also need to be reversed so that higher numbers mean more positive scores than lower numbers. There is occasional glory in analytics but also a lot of grunt work!
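As a small illustration of the recoding grunt work described above, here is a minimal Python sketch. Everything in it is an assumption made for the example: the 1-to-5 scale, the helper names `reverse_scale` and `collapse_categories`, and the city-to-region grouping are hypothetical and not taken from any particular questionnaire.

```python
# Hypothetical recoding helpers; the scale range and category groupings are
# illustrative assumptions, not taken from any real study.

def reverse_scale(score, low=1, high=5):
    """Flip a rating so that higher numbers mean more positive scores."""
    if not low <= score <= high:
        raise ValueError(f"score {score} is outside the {low}-{high} scale")
    return low + high - score


def collapse_categories(value, mapping, other="Other"):
    """Map a detailed category to a broader group; unmapped values go to `other`."""
    return mapping.get(value, other)


# Assumed coding: 1 = "strongly agree" ... 5 = "strongly disagree"
raw_ratings = [1, 2, 5, 3]
recoded_ratings = [reverse_scale(r) for r in raw_ratings]  # [5, 4, 1, 3]

# Assumed grouping of a many-category field into a few larger groups
region_groups = {
    "Tokyo": "Kanto",
    "Yokohama": "Kanto",
    "Osaka": "Kansai",
    "Kyoto": "Kansai",
}
raw_cities = ["Tokyo", "Kyoto", "Sapporo"]
recoded_regions = [collapse_categories(c, region_groups) for c in raw_cities]

print(recoded_ratings)
print(recoded_regions)
```

The code itself is trivial. The real decisions, such as which categories belong together and what to do with small bases, still need someone who understands the data.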
SPSS, Excel and CSV are the lingua franca for data files but the software you will be using may require some other format. Not infrequently, statisticians need only a small part of the original data file and must create one or more data files for analytic purposes. Sometimes data processing (DP) can do this for you but normally there are decisions the person analyzing the data is in the best position to make.
Once beyond these first critical steps, you’ll now need to think concretely about how to analyze the data. In some instances this is clear from your understanding of the client’s needs and the data themselves but often it is not, as in our hard case illustration. Perhaps surprisingly, in some respects analytics is becoming more difficult precisely because we have more options than ever before. (See Analytics Revolution and Why Survey? for brief overviews of analytics.) Frequently, several techniques are combined in the analysis. Moreover, many statistical terms such as “regression” and “Bayesian” are quite generic and refer to broad families of methods, and your research exec or the client may have suggested a technique without really understanding what it is. Many distinctions that seem like geeky minutiae to those with limited statistical background actually are very consequential but tricky to communicate. This takes practice and is an important skill a Marketing Science person must acquire.
Generally speaking, you should keep the analysis as simple as possible and be solutions-driven rather than technique-driven. When considering various approaches ask yourself “How will this choice affect my client’s decisions? Will they be able to communicate the results to their internal clients?” Don’t use a method just because you are comfortable with it and don’t try to show off your mathematical virtuosity. That said, running huge numbers of cross tabs and letting significance tests (more or less) do your thinking for you is commonplace but dicey practice.
This is a lengthy topic but, in short, significance testing (and computer algorithms generally) can only suggest rough cutoffs for deciding what is “important” and what isn’t from a business standpoint. Furthermore, significance testing is only concerned with sampling error and as a rule assumes simple random sampling, which is seldom in fact used in survey research. Significance tests also are not independent and Type I error (“false positives”) rapidly accumulates when many tests are conducted on the same data. In Data Mining and Predictive Analytics sampling is often less problematic but the data may contain millions of records, and minuscule differences may be flagged as highly statistically significant. What’s more, if the data represent an entire population – records for all customers for example – inferential statistics are meaningless. On the other hand, abandoning significance testing altogether is unwise; sometimes it is helpful in cutting through the clutter and how to use it is a case-by-case decision.
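To make the point about accumulating false positives concrete, here is a rough Python sketch. It rests on a simplifying assumption that does not hold in practice: that the tests are independent. As noted above, tests run on the same data are not, so the figures are only an order-of-magnitude illustration. The function names are hypothetical, and the Bonferroni adjustment shown is just one common (and conservative) correction, not a recommendation.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive across n_tests tests.

    Assumes the tests are independent, which is a simplification.
    """
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_alpha(n_tests, alpha=0.05):
    """Per-test significance level that keeps the overall rate at or below alpha."""
    return alpha / n_tests


for m in (1, 10, 20, 100):
    fwer = familywise_error_rate(m)
    print(f"{m:>3} tests: P(at least one false positive) ~ {fwer:.2f}, "
          f"Bonferroni per-test alpha = {bonferroni_alpha(m):.4f}")
```

Under these assumptions, 20 tests at the 5% level already give roughly a 64% chance of at least one spurious “significant” result, which is why a large deck of cross tabs will nearly always turn up something.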
Eminent statistician George Box had many wise words over the course of his long and distinguished career and “all models are wrong, but some are useful” is one of his most quoted pieces of advice. Analytics requires many choices and we usually will never know what mechanism or mechanisms gave rise to the data we are analyzing. Often several models will provide equivalent “fit” to the data but suggest different courses of action, and the choice among them may dramatically affect the client’s decisions. While model comparison indices such as the BIC and AIC or other heuristics can help narrow down the range of plausible models, as with significance testing, they cannot provide THE answer. We need to roll up our sleeves and think.
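For readers who have not met these indices, the standard definitions are AIC = 2k - 2 ln(L) and BIC = k ln(n) - 2 ln(L), where L is the maximized likelihood, k the number of estimated parameters and n the sample size; lower values are better. The Python sketch below uses made-up log-likelihoods for two hypothetical models, purely to show that the two indices can point in opposite directions.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln(n) - 2 ln(L)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical models fitted to the same 500 respondents.
# These log-likelihoods are invented for illustration only.
n_obs = 500
models = {
    "A (simple)": {"log_likelihood": -1000.0, "n_params": 5},
    "B (complex)": {"log_likelihood": -990.0, "n_params": 12},
}

scores = {
    name: (aic(m["log_likelihood"], m["n_params"]),
           bic(m["log_likelihood"], m["n_params"], n_obs))
    for name, m in models.items()
}

for name, (a, b) in scores.items():
    print(f"{name}: AIC = {a:.1f}, BIC = {b:.1f}")

best_by_aic = min(scores, key=lambda name: scores[name][0])
best_by_bic = min(scores, key=lambda name: scores[name][1])
print("Preferred by AIC:", best_by_aic)
print("Preferred by BIC:", best_by_bic)
```

With these invented numbers AIC favors the more complex model while BIC, which penalizes extra parameters more heavily at this sample size, favors the simpler one. Neither index settles the matter on its own.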
“Correlation is not Causation” is currently a buzz-phrase in the business media. (Ironic, given that conspiracy theories flourish in many news outlets!) An association may support or suggest a hypothesis but it does not prove a causal relationship. Why does this matter? When prediction rather than explanation is really what is necessary we sometimes can lighten up a bit and rely on semi-automated methods popular in Data Mining and Predictive Analytics. However, while these techniques often excel at prediction, they frequently yield results that are hard to interpret. Being able to spot potentially high-spending customers, for example, by itself may be insufficient. Lacking insights into why they and similar customers behave the way they do will make it more difficult to design marketing programs that will work in practice. Also, many decision makers are understandably distrustful of “black box” solutions.
There are many decisions to make in analytics and it is only possible to mention a handful of the most typical ones here. To the extent possible a Marketing Scientist should be involved early in the design of the research. That will reduce the headaches described at the beginning of the article! In some situations, though, Marketing Scientists can become involved too early in the process and the discussion veers off towards methodological details before the key business concerns have been sorted out. This also is to be avoided.
I should stress that I’ve only given a glimpse of Marketing Science, which is much broader and more varied than the foregoing might suggest. One small request before we draw the curtain closed and get back to work: please do not just give your Marketing Scientists some data and ask them to find something interesting for your client! You are all part of a team so please interact with them proactively and provide them as much background and feedback as you can. Try to understand what your client really needs – which is not always what they request – and work backwards into the methodology. This is a better way to do research and a better way to do business.