By Kevin Gray
The term data science has entered business vernacular with a bang…but what exactly is it? Despite all the media buzz, one story that has gone largely untold is that statisticians are asking themselves the very same question: “The exact meaning of this term is a matter of some debate; it seems like a hybrid of a computer scientist and a statistician.” I have quoted from Statistics and Science: A Report of the London Workshop on the Future of the Statistical Sciences, a product of a meeting in London in November, 2013 that was attended by more than 100 prominent statisticians from around the world.
If such a distinguished body doesn’t have the answer, for me to declare that I do would strain credibility. In place of suggesting my own definition of data science I will offer some thoughts about it and what I feel is its place in marketing research, based on my experience as a marketing science person as well as interaction with contacts and business associates who describe themselves as data scientists.
The first dimension
As noted in Statistics and Science, “data science” is loosely used to refer to lines of work that make extensive use of computer science and statistics. Most of these occupations, genomic research and seismology being two examples, are not directly related to marketing, and they now play a role in many fields. Data science is often coupled with the term big data, and I should note that there doesn’t appear to be much agreement about what big data means either (see, for example, http://datascience.berkeley.edu/what-is-big-data/?utm_source=linkedin&utm_medium=social&utm_campaign=blog).
Many working in these areas are computer scientists and mainly concerned with IT matters. However, I perceive a rough continuum, on the opposite side of which there is greater emphasis on analysis and interpretation of data. Statisticians and marketing scientists (with assorted job titles) are mostly located on that side of the continuum. Of course, it’s not quite as simple as IT people on one side and statisticians on the other and there are other attributes, such as industry or subject matter expertise, that distinguish the various kinds of data scientists. There is more than one dimension to data science. Here is another, psychographic, perspective on data scientists that may also be of interest: http://www.information-age.com/industry/uk-industry/123458536/uks-data-scientists-face-burnout-due-work-related-stress .
The extremes of my (real or imagined) continuum have become increasingly mindful of one another, and in LinkedIn discussion groups and other public forums there are often heated exchanges between them. The former often characterize the statto types as stuck in the past and out of touch, while the latter frequently see the IT-focused as lacking in basic analytical skills and scientific thinking. Both score points in these debates but what I think is more important is that these two groups differ in educational background and skills, and also seem to be different sorts of people. Statisticians, for instance, are notoriously comfortable with uncertainty; probability, after all, lies at the heart of their discipline and if you want a quick yes-or-no answer, don’t ask a statistician. (I confess…)
Heavily IT-focused data scientists are often not well-versed in statistics and some are actually distrustful of statistical models. Data management and related tasks are their main concerns. Conjoint, structural equation modeling, time series analysis and many other statistical tools widely used in marketing research are a foreign world for some, and statisticians often criticize current data science practice as mechanical and algorithm-driven or as focusing too much on the What and not enough on the Why.
To flesh out these criticisms, let’s consider an example from marketing. We may be able to predict future purchase patterns of consumers fairly accurately from their demographics and past purchases alone. But by integrating data from various sources, such as consumer surveys, and by using advanced statistical modeling, we can also gain insights into why certain types of consumers behave the way they do in certain situations. Marketing is also about changing behavior, not just predicting it, and these insights can help us develop more effective and profitable marketing, as well as improve our predictions. Generally speaking, I believe these criticisms have substantial merit but will concede that causal modeling is not feasible or necessary in every data science project.
Quite a few universities now offer Data Science or Analytics programs that blend statistics and computer science but, with swift advances and increasing specialization within each discipline, these programs may be difficult to sustain. Needless to say, it will always be hard to develop individuals who are highly competent in both statistics and computer science, to say nothing of subject matter expertise or the political savvy needed to survive in today’s rough work environments. Admittedly, I am greatly simplifying here and quite a few job descriptions for data science positions I’ve seen are not that dissimilar to what I do for a living, and I now include data science in my LinkedIn headline and company website. More importantly, though, data science teams can include computer scientists, statisticians, economists, psychologists and specialists from many other backgrounds and there is no mandate for such teams to be composed of only one type of data scientist.
Not quite plus ça change, plus c’est la même chose
So, what is the role of data science in marketing research? Many aspects of data science are actually already part of marketing research, even if the term data science is fairly new. Beyond doubt, in the last few years there has been an explosion in the amount of data we are able to capture, store and retrieve, accompanied by rapid developments in computer hardware and software. Nevertheless, over the past several decades many organizations have increasingly been using data and analytics in decision-making, including in marketing. Since the 1990s, much of this activity has been referred to as data mining or predictive analytics, though data science is now commonly used in their place.
I can recall a senior colleague who had spent much of his career with multinational manufacturers commenting, in this context, that the strongest competitors MR agencies faced were their own clients. That was back in the last century! The popular data mining software originally known as Clementine was developed by a company called Integral Solutions Limited (ISL) and released in 1994. SPSS acquired ISL four years later and SPSS Clementine was launched with much flourish – the rollout event I attended drew a crowd of more than 1,000 people. So, while many things have changed, many things have remained more or less the same.
That said, I wouldn’t agree with those who believe data science and big data can be dismissed as mere semantic fiddling. I also disagree with MR colleagues who fear them as tsunamis racing towards us and, instead, I see data science and big data more as opportunities for marketing research than as threats. Though gut feel will always be a part of most decisions, I concur with those who predict that data and analytics will play a much larger role in management than is now normally the case, and this dovetails very neatly with the essential purpose of marketing research.
Back to the present
We shouldn’t let ourselves get carried away, though. In A Practitioner’s Guide to Business Analytics, Randy Bartlett devotes considerable space to organizational cultural challenges, more than he does to technical matters. We should note that the author is not a journalist or software vendor but an analytics veteran of more than 20 years who holds degrees in both computer science and statistics. I share his view that the old ways still dominate true science in most decisions: “Corporations are not as sophisticated or as successful as we might grasp from the sound bytes appearing in conferences, books, and journals. Instead opinion-based decision making, statistical malfeasance, and counterfeit analysis are pandemic. We are swimming in make-believe analytics.” That is the real world as I see it too.
Big data and data science are Big Business and in my opinion have been overhyped. We humans do not appear to be hard-wired to use data to make decisions and for years, if anything, managers have complained about information overload. Our schooling by and large has not prepared us to fully exploit new data sources and advanced information technology. Even if there were radical changes in the way we are educated, as long as there are human managers and human consumers, data and analytics will never entirely replace gut feel in decisions. We are emotional and not easily persuaded by logic or evidence and the often rancorous debates about data science are ironic reminders of that part of our nature.
Besides, many important decisions cannot simply be calculated; after all, even thermostats are regularly overruled by humans! Something else we should be alert to is that more data, particularly when the numbers aren’t trending in the same direction, will be more fuel for organizational politics in some companies and only make decision-making more unwieldy. Add to these our natural inclination to stay the course and the very bureaucratic character of many organizations, and an abrupt and radical transformation in the way we make decisions would seem unlikely.
Decision-making will gradually evolve and become more, if never wholly, evidence-based. Over the next few years I foresee decreasing emphasis on data infrastructure and more emphasis on what data tell us and how they can be leveraged. With bigger and frequently messier data, understanding people will become more critical, not less, and demand will rise for marketing scientists able to see beyond math and programming who truly understand marketing and consumers. Incompletely observed behavior or conversations only tell us part of the story and have the potential to mislead. More analytic options also mean more risk and increase the need for well-trained and experienced researchers. The resurgence of Bayesian statistics is further evidence that human judgment cannot be purged from analytics; as Noel Cressie and Chris Wikle point out in their heavily mathematical textbook, Statistics for Spatio-Temporal Data, “Science cannot be done by the numbers.”
An unfortunate corollary of rapid technological change is increasing specialization and even more silos and misunderstandings. Buyers may not really know what they’ve bought and sellers may not really know what they’ve sold. Closer to home, in marketing research, the well-rounded generalist is already becoming hard to find and I think over-specialization is hurting our profession. We have lots of shiny new tools that many of us don’t know how to use properly, and MR educational and training programs will need to provide more cross-training to counteract this flip side of progress.
Though there will always be things outside our control, there is much we marketing researchers can do to shape the destiny of our profession. Besides embracing new technologies and methodologies, less exotic activities such as educating clients about how to use marketing research to make better decisions will not lose their importance. Just the opposite. Changing habits of thinking will be crucial and improving our own decision-making skills would do us little harm. We must also be on guard against dubious claims and pseudo-science, which I see as threats to genuine innovation. After all, not everything that is far-fetched actually works!
We must also learn to be more effective at marketing marketing research; paradoxical though it may be, I think many of us will admit that our industry has historically been pretty lousy at marketing itself. We must compellingly respond to contentions that data science has made marketing research irrelevant, and one way is to demonstrate that “data science” has, in fact, been part of marketing research for quite some time.
Data science is not entirely new and not entirely old. It can do amazing things but cannot work miracles. Despite the hype and hogwash, I see it much more as friend than foe.