Editor’s Note: Market researchers are often enamored of the “new” – we tend to get excited over the latest shiny object. I don’t think we differ all that much from other people in that regard. Once the excitement of the new wears off a bit, we’re then left with the question “what’s it all really good for at the end of the day?” This is when the hard work really begins, as different people and groups try to work with it and make judgments based on actual experience. In this post, Mike Kelly gives us a very mature explanation and evaluation of some new complex modeling techniques, and how harnessing them to more standard approaches and “artistry” can provide real business value. A very valuable read.
Unfulfilled Promise of Complex Models
Business analysts can leverage a large and growing portfolio of advanced (typically non-linear) modeling techniques to predict key outcomes like customer spend or defection. Available techniques include decision trees, random forests, structural equation models, neural networks, polynomial regression, regression splines, and support vector machines, among others. To novices, the options can be intimidating. Even data scientists are working hard to develop their points of view.
Advanced techniques have proliferated because they are assumed to deliver higher levels of predictive accuracy than what simpler, linear models can achieve. But do complex models actually deliver better predictions?
It turns out that they may outperform linear ones in the lab with artificial data sets, but in the real world their incremental value is less clear-cut. Data scientists who’ve been tracking the evolution of complex models have noted that gains attributable to more advanced modeling developments can be surprisingly small, and sometimes inconsequential.
The Shortest Distance is Often a Straight Line
Observations like these are prompting renewed appreciation for simple linear models – and growing recognition that it’s not necessarily a matter of “either-or”. It turns out that we’re able to strengthen models when we harness more advanced techniques to old-fashioned “linear” horsepower. In fact, a primary reason for the recent performance gains in neural networks has been the replacement of smooth, curved non-linear functions with a piecewise-linear one (specifically “rectified linear units” or “ReLUs”) in network architecture. A ReLU is simply zero for negative inputs and a straight line for positive ones. This “back to the future” approach has improved overall accuracy and learning speed in diverse areas including language translation and image classification (e.g., facial recognition).
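To make the ReLU idea concrete, here is a minimal Python sketch comparing a ReLU with a sigmoid, an older S-shaped function chosen here purely for contrast. The helper names (`relu`, `sigmoid`, `slope`) and the input values are illustrative assumptions, not taken from any particular library.

```python
import math


def relu(x: float) -> float:
    """Rectified linear unit: zero for negative inputs, a straight line otherwise."""
    return max(0.0, x)


def sigmoid(x: float) -> float:
    """S-shaped function, included only for comparison (an assumption of this sketch)."""
    return 1.0 / (1.0 + math.exp(-x))


def slope(f, x: float, h: float = 1e-6) -> float:
    """Numerical slope of f at x, using a central difference."""
    return (f(x + h) - f(x - h)) / (2 * h)


# Illustrative inputs: the ReLU slope stays at 1 for any positive input,
# while the sigmoid slope shrinks toward zero as inputs grow.
for x in [-4.0, -1.0, 0.5, 4.0, 10.0]:
    print(
        f"x={x:5.1f}  relu={relu(x):5.1f}  relu_slope={slope(relu, x):.3f}  "
        f"sigmoid={sigmoid(x):.4f}  sigmoid_slope={slope(sigmoid, x):.5f}"
    )
```

The constant slope on the positive side is the "linear" part of the story: the signal used for learning does not fade for large inputs the way it does with the flattening S-curve.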
In a market research context, the use of simple linear models can provide several advantages over complex models:
- Easier to understand and communicate: Linear model clarity makes it easier to see the implications for business decisions, facilitating translation of insights to action.
- More democratic: Linear models are relatively easy to learn and apply, increasing the bandwidth of market analysts who might otherwise require specialized support from others in or outside the organization.
- Highly effective with small data sets: More complex models such as neural networks typically require very large data sets (“Big Data”) to optimize performance.
- Less likely to lead to wrong conclusions: Risk of model overfitting is significantly higher with non-linear models. Even if they are slightly more “accurate”, they are also more likely to produce the wrong business decisions.
- Less expensive and faster to apply: Linear models get most of the way toward accurate prediction, which means that investment in more complex models produces diminishing returns.
- Flexible and adaptable: Linear models can be modified to include non-linear components; address multicollinearity among predictors (e.g., through Kruskal regression); and create complex, multi-stage models out of linear components (e.g., structural equation models).
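As one illustration of the last point, the sketch below adds a non-linear component (a squared term) to an otherwise ordinary linear regression. The model stays linear in its coefficients, so the same least-squares machinery applies. The data are made up for illustration, and the helper names (`solve`, `fit_ols`, `sse`) are hypothetical; in practice an analyst would likely use a statistics package rather than hand-rolled code.

```python
def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def sse(rows, y, coefs):
    """Sum of squared errors of the fitted model."""
    return sum((t - sum(c * v for c, v in zip(coefs, r))) ** 2 for r, t in zip(rows, y))


# Made-up data: customer spend rises with tenure, then levels off.
tenure = [1, 2, 3, 4, 5, 6, 7, 8]
spend = [12, 22, 27, 33, 37, 36, 36, 34]

straight = [[1.0, float(t)] for t in tenure]               # intercept + tenure
curved = [[1.0, float(t), float(t * t)] for t in tenure]   # adds tenure squared

straight_coefs = fit_ols(straight, spend)
curved_coefs = fit_ols(curved, spend)

print("straight-line SSE:", round(sse(straight, spend, straight_coefs), 2))
print("with squared term SSE:", round(sse(curved, spend, curved_coefs), 2))
```

The fitted coefficients remain easy to read: a negative coefficient on the squared term simply says that the effect of tenure tapers off.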
Together, these advantages suggest that linear models will not be going away any time soon. In fact, the pendulum is poised to swing back a bit, with added momentum from DIY platforms designed to expedite or automate model-building and share the responsibility.
But linear models will not get us the whole way there. There is much we remain unable to predict or explain about our customers, and an increasingly urgent need to better understand them. And if complex models aren’t actually much more accurate than linear ones, does that mean we’re stuck with underperforming models overall? The answer is no. We can get higher horsepower from our engines when we hybridize and upgrade the fuel.
‘Feature Engineering’ to Improve Linear Models: Skilled Preparation, Great Ingredients
It’s important to keep in mind that there are two components to modeling: the analytic technique and the predictive features. Academics have tended to focus their attention on optimizing the “machinery” (because it’s sexy) and not the “fuel”, but some of the best thinkers in the field are starting to talk about shifting the emphasis to inputs by building better predictors through “feature engineering”. Serious evaluations of modeling success using real-world data sets suggest that the nature of the predictive features used as inputs can make a big difference.
At NAXION, our applied experience across hundreds of commercial modeling assignments bears that out. While we can typically squeeze out a bit more predictive accuracy by supplementing linear models with non-linear components, the return is substantially higher if we concentrate more on feature engineering.
For example, B2B models are commonly based on Dun & Bradstreet or other business databases. These databases contain many variables that can serve as predictors in modeling, such as revenues, number of employees, and industry. But we’ve found that model accuracy is significantly enhanced if we go beyond these raw variables to engineer new ones. In fact, most of the predictors in our final B2B models consist of engineered rather than raw database variables. Some heuristics that have proven helpful in “engineering” predictive features include combining raw variables to create better indicators of key metrics, assigning businesses to industry categories customized for a particular market, and creating variables that score businesses relative to appropriate peers rather than on an absolute scale.
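Two of these heuristics can be sketched in a few lines of Python: combining raw variables into a ratio, and scoring each business against its industry peers. The records, field names, and helper functions below are hypothetical and are not drawn from Dun & Bradstreet or from any actual model; they only illustrate the general idea.

```python
from collections import defaultdict

# Hypothetical firmographic records (made-up values).
businesses = [
    {"name": "A", "industry": "retail", "revenue": 2_000_000, "employees": 40},
    {"name": "B", "industry": "retail", "revenue": 9_000_000, "employees": 60},
    {"name": "C", "industry": "retail", "revenue": 500_000, "employees": 25},
    {"name": "D", "industry": "software", "revenue": 9_000_000, "employees": 30},
    {"name": "E", "industry": "software", "revenue": 20_000_000, "employees": 50},
    {"name": "F", "industry": "software", "revenue": 3_000_000, "employees": 20},
]


def add_revenue_per_employee(rows):
    """Combine two raw variables into one indicator."""
    for row in rows:
        row["revenue_per_employee"] = row["revenue"] / max(row["employees"], 1)


def add_peer_score(rows, value_key, group_key, out_key):
    """Score each row by the share of same-group peers at or below its value."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])
    for row in rows:
        peers = groups[row[group_key]]
        row[out_key] = sum(v <= row[value_key] for v in peers) / len(peers)


add_revenue_per_employee(businesses)
add_peer_score(businesses, "revenue_per_employee", "industry", "peer_score")

for row in businesses:
    print(row["name"], row["industry"], row["revenue_per_employee"], round(row["peer_score"], 2))
```

In this made-up data, businesses B and F have the same revenue per employee, yet B leads its retail peers while F trails its software peers. That is the kind of distinction an absolute scale hides and a peer-relative feature brings out.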
In consumer applications, the use of created variables that integrate certain types of purchase behaviors or brand affinities can produce more illuminating driver models than some of the “naturally occurring” inputs from our survey data sets.
The Art in Artificial Intelligence
While feature selection has long been automated in modeling, the art of feature invention will not be automated any time soon. The best predictive features will, instead, be hand-carved using skills and intuition sharpened through experience. Thus, while we can continue to look forward to automation in many aspects of data science, further gains in modeling acuity will still rely heavily on industry knowledge and business acumen. What comes out of the hopper on its own will not take us as far as we need to go to guide critical business decisions.