October 23, 2013

Analytics Revolution

There’s a revolution happening in analytics. We can answer more questions more quickly than ever before. But there are downsides.

Kevin Gray

President at Cannon Gray

There’s a revolution going on in Analytics. But first, what is Analytics?

Analytics has generated a massive amount of buzz in recent years, most recently in connection with Big Data and Data Science. The term “analytics” is by no means new but, perhaps surprisingly, there is little consistency in what it means or implies. Sometimes it is used to designate a process, from problem identification through recommended actions. It can also refer to inferential statistics such as standard errors, confidence intervals and t-tests, or to basic measures of association, e.g., the Pearson product-moment correlation or chi-square. At other times, though perhaps couched in esoteric claims, it merely refers to descriptive statistics such as frequency counts, means and standard deviations, or to commonplace graphics such as line graphs, histograms and pie charts. More sophisticated data visualization is also at times called analytics.

In addition, analytics can refer to an extensive assortment of multivariate statistical methods and machine learning algorithms. That usage is the focus of this article. These techniques can be classified in various ways; one is to characterize them as either Interdependence methods or Dependence methods. A second point of differentiation is whether a method is intended for Cross Sectional data or Time Series data.

Factor Analysis and Cluster Analysis are probably the best known Interdependence methods, though there are many others.  Put very simply, Factor Analysis groups variables and Cluster Analysis groups observations, respondents in a consumer survey for example.
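To make "grouping observations" concrete, here is a minimal sketch of k-means, one common clustering algorithm, written in plain Python. The respondent ratings, the two variable names and the choice of two clusters are all hypothetical and chosen only for illustration; real segmentation work would normally rely on an established statistical package.

```python
import random

def kmeans(points, k, n_iter=20, seed=0):
    """Tiny k-means for 2-D points. Returns (labels, centers)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k randomly chosen observations
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each observation joins its nearest center.
        labels = [
            min(range(k),
                key=lambda c: (p[0] - centers[c][0]) ** 2 + (p[1] - centers[c][1]) ** 2)
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = (sum(m[0] for m in members) / len(members),
                              sum(m[1] for m in members) / len(members))
    return labels, centers

# Hypothetical survey ratings per respondent: (price sensitivity, brand loyalty)
respondents = [(1.0, 9.0), (1.5, 8.5), (2.0, 9.5),
               (8.0, 2.0), (8.5, 1.5), (9.0, 2.5)]
labels, centers = kmeans(respondents, k=2)
print(labels)
```

With these made-up data the first three respondents end up in one group and the last three in the other. Factor Analysis works in the other direction, grouping the columns (variables) rather than the rows.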

Dependence methods differ in that there are one or more Target (Dependent) variables we would like to explain or predict from one or more Predictor (Independent) variables. Many kinds of Dependence methods see extensive use in Marketing Research. They can be further subdivided according to whether the dependent variables are quantities, counts, ordered categories or nominal categories that have no natural order or rank. Regression and Discriminant Analysis in particular are well known in Marketing Research; the former is used when the dependent variable is quantitative (or we decide to treat it as such) and the latter comes into play when we wish to differentiate groups (e.g., User/Non-User).
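As a small illustration of a Dependence method with a quantitative target, the sketch below fits a one-predictor least-squares regression in plain Python. The advertising and sales figures are invented for the example, and the function name `fit_line` is a hypothetical helper, not part of any library.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data: predictor = ad spend, target = unit sales
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.1, 4.9, 7.2, 8.8, 11.0]

intercept, slope = fit_line(ad_spend, sales)
print(f"sales = {intercept:.2f} + {slope:.2f} * ad_spend")
```

The slope estimates how much the target changes per unit of the predictor. When the target is a group label such as User/Non-User, a classification method such as Discriminant Analysis would take the place of this regression.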

Actually, it’s not quite this simple. Partial Least Squares Regression and some varieties of Structural Equation Modeling are a blend of Interdependence and Dependence methods!

The techniques described thus far have been designed for cross-sectional data, data collected at one point in time.  Time Series Analysis is used when the data have been collected over many time periods.  Weekly sales data are an example of Time Series data.  Exponential Smoothing, ARIMA, Dynamic Regression, State Space and GARCH models are just a few examples of Time Series Analysis Methods.  They are household words to Econometricians but more opaque to most of us in Marketing Research.  Time Series Analysis plays important roles in Marketing Mix Modeling and ROI analysis as well as in sales forecasting.
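Simple Exponential Smoothing is probably the easiest of these to show in a few lines. The sketch below assumes a hypothetical four-week sales series and a smoothing weight of 0.5; both are illustrative choices, and the more elaborate variants (trend, seasonality) are left out.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the smoothed level after each observation.

    Each new level is a weighted average of the latest observation
    (weight alpha) and the previous level (weight 1 - alpha).
    """
    level = series[0]  # initialize the level with the first observation
    smoothed = [level]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
        smoothed.append(level)
    return smoothed

# Hypothetical weekly sales
weekly_sales = [100, 110, 105, 120]
levels = simple_exponential_smoothing(weekly_sales, alpha=0.5)

# The last level serves as the one-step-ahead forecast for next week.
forecast = levels[-1]
print(levels, forecast)
```

A larger `alpha` makes the forecast react faster to recent weeks; a smaller one gives a smoother, slower-moving forecast.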

Once again, though, things are not quite this simple!  There are also methods appropriate for Within-Subjects (Repeated Measures) and Longitudinal data.  An example of when Within-Subjects designs are suitable is when consumers are asked to evaluate two or more products, real or hypothetical, as in an in-home product use test (real) or conjoint study (typically hypothetical).  The venerable Repeated Measures MANOVA might be familiar to some of you.  Longitudinal designs are useful when we observe consumers’ behavior over time.  Survival Analysis is one such method and in Marketing Research is used in customer churn modeling.

That was just the Old Stuff

Out of breath? Well, these are mostly “trad” methods. It would not be exaggerating to say there has been an explosion in the number and variety of analytic methods in recent years. Advances in computer technology have taken many methods off the drawing board and put them right onto our laptops. Mixture Modeling (a.k.a. Latent Class) is one example that not long ago was impractical on the computers most Marketing Scientists were using. It is proving very useful in segmentation as well as other kinds of analyses, in part because of its ability to model different kinds of data (e.g., quantitative and nominal) at once.

Bayesian methods, which can be intricate and are not easy to describe in a nutshell, are seeing increasing use in Marketing Research.  Put very, very simply, in Bayesian statistics we incorporate prior beliefs about the problem we’re studying directly into our analysis and then update our understanding of the problem we’re investigating when new data become available.  From the outset we are explicit about uncertainty.  Bayesian methods have some important advantages in comparison with the more recognizable Frequentist methods.  They are often more adept at handling sparse and messy data, for instance.
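The prior-then-update idea can be sketched with the simplest textbook case, a Beta prior on a purchase rate combined with binomial data. The prior parameters and the survey counts below are hypothetical, and the helper names are made up for this example.

```python
def update_beta(prior_a, prior_b, successes, failures):
    """Beta-Binomial update: add observed counts to the prior parameters."""
    return prior_a + successes, prior_b + failures

def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)

# Hypothetical prior belief: purchase rate is around 20% (Beta(2, 8))
prior_a, prior_b = 2, 8

# Hypothetical new data: 30 buyers among 100 respondents
post_a, post_b = update_beta(prior_a, prior_b, successes=30, failures=70)

print(f"prior mean     = {beta_mean(prior_a, prior_b):.3f}")
print(f"posterior mean = {beta_mean(post_a, post_b):.3f}")
```

The posterior sits between the prior belief and the observed rate, and it remains a full distribution rather than a single number, which is what being explicit about uncertainty means in practice. Most real Bayesian models have no such closed-form update and are fitted by simulation.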

Many methods are being developed outside of university Statistics departments, most notably by computer scientists. Many are termed Machine Learning, though the way that term is used is often ambiguous. Machine Learning methods are often much better at pattern recognition (e.g., in text analytics) than the statistical methods we are used to. Some examples of these new methods, including those developed by statisticians, are Neural Networks, Support Vector Machines, Bayesian Networks and approaches utilizing boosting and bagging.

These “non-trad” techniques are core methods in Data Mining and Predictive Analytics, nowadays often lumped together under the vaguely used labels “Big Data” and “Data Science.” There is now a vast array of these methods and many are also handy for analysis of consumer survey data, including segmentation and driver analysis. One important downside many of these methods share, however, is that their results are often difficult to interpret; while they are adept at prediction (the “What”), they are often not as useful as traditional methods for helping us understand the “Why.” Fortunately, in many cases the two can be used in combination to get the best of both analytic worlds.

Whatever the analytic methods used, it is also now easier than ever to perform various kinds of “What if?” simulations to make educated guesses about what might happen under various marketing scenarios, such as the introduction of a new product or competitor activity.  Done prudently, simulations can help our models speak to us and guide decisions we need to make.
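A bare-bones "What if?" simulation might look like the following sketch. It assumes a linear sales model has already been fitted; the coefficients, scenario names and input values are entirely hypothetical and stand in for whatever model an analyst has actually estimated.

```python
def predict_sales(price, ad_spend, coefs):
    """Predicted sales from a (hypothetical) fitted linear model."""
    return (coefs["intercept"]
            + coefs["price"] * price
            + coefs["ad_spend"] * ad_spend)

# Hypothetical coefficients, e.g. taken from an earlier regression
coefs = {"intercept": 500.0, "price": -40.0, "ad_spend": 2.0}

# Hypothetical marketing scenarios: (price, ad spend)
scenarios = {
    "baseline": (5.0, 50.0),
    "price_cut": (4.5, 50.0),
    "more_ads": (5.0, 80.0),
}

results = {name: predict_sales(price, ads, coefs)
           for name, (price, ads) in scenarios.items()}

for name, value in results.items():
    print(f"{name}: {value:.0f} units")
```

Such point predictions are only as good as the model behind them, and they become less trustworthy the further a scenario moves from the data the model was fitted on, which is one reason simulations should be done prudently.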

What does all this imply?

The foregoing is only a sample of the methods used in Marketing Research. Though many haven’t yet diffused very far into the Marketing Research mainstream, it should be evident that we have no shortage of tools for analytics! There is truly a gigantic number, and brilliant academics around the world are developing new ones around the clock. And, due to space limitations, I haven’t even mentioned Social Network Analysis, Biometrics or many other newer kinds of analytics. True Artificial Intelligence still lies in the future but perhaps one day…

Some of you will have heard of R, open-source (free!) statistical software that is becoming a standard research tool for Marketing Scientists. Though R is not as user-friendly or well-documented as some statistical packages we’ve become accustomed to, there are now several thousand R packages and a large and increasingly sophisticated R user base. Many R packages perform cutting-edge analytics and, perhaps surprisingly, first-rate graphics, in addition to standard methods. There are also many other open-source tools besides R.

There’s a revolution happening in analytics: we are now able to give better answers to more questions more quickly than ever before. But there are downsides. With more tools, there is more to master, and more mistakes will be made if our skill sets become too thin. Increasing specialization will be needed and more silos among analytics professionals may emerge. We must also avoid using methods merely because they are new – newer is not synonymous with better.

More importantly, let’s not lose sight of our raison d’être. Who will be using our deliverables, and how and when they will be used, matter most. Let’s first focus on the decisions, not the technology.
