How to Solve the Most Common Data Problems in Retail

The most successful retail companies use data science and predictive analytics to improve efficiency, sharpen marketing campaigns, and gain customer insights that give them a competitive advantage.

By Pauline Brown

In the retail business, big data is poised in the coming years to open up huge opportunities in the way stores (both physical and online) fundamentally operate and serve customers. Given the industry's incredibly small margins, big data will also provide much-needed efficiency improvements, from tighter supply chain management to more targeted marketing campaigns, that can make a big difference to a retail business of any size.

Making data-driven decisions is no longer about learning from the past; it means constantly making changes to the business based on real-time input from all data sources across the organization. Predictions and machine learning draw on traditional data but also on new and innovative sources like connected Internet of Things (IoT) devices and sensors or, going a step further with deep learning, unstructured data such as static images or footage from cameras monitoring stock in warehouses. Consumers can be fickle, so being able to accurately anticipate what they will do next and react quickly is what puts the most innovative and successful retailers above the rest.

Data science software maker Dataiku recently explored the types of data problems facing retail, how to solve them, and the steps that any retail organization can take to become more data-driven.

PROBLEM #1: Siloed, Static Customer Views

Many retailers still struggle with siloed data: transaction data lives apart from web logs, which in turn are separate from CRM data, and so on.

SOLUTION: Complete, Real-Time Customer Looks

Cutting-edge retailers look at customers as a whole, combining traditional data sources with the non-traditional (like social media or other external data sources that can provide valuable insight).

RESULTS:


  • More accurate and targeted churn prediction.
  • Robust fraud detection systems.
  • More effective marketing campaigns due to more advanced customer segmentation.
  • Better customer service.
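
As a rough illustration of what a combined customer view can look like, the sketch below merges three separate sources into one record per customer. The field names, sample records, and the `build_customer_view` helper are all hypothetical, chosen only for this example; a real project would read from the actual transaction, web log, and CRM systems.

```python
from collections import defaultdict

# Hypothetical sample records; in practice each list comes from a separate system.
transactions = [
    {"customer_id": "c1", "amount": 40.0},
    {"customer_id": "c1", "amount": 15.5},
    {"customer_id": "c2", "amount": 99.0},
]
web_logs = [
    {"customer_id": "c1", "page": "/shoes"},
    {"customer_id": "c3", "page": "/sale"},
]
crm = [
    {"customer_id": "c1", "segment": "loyalty"},
    {"customer_id": "c2", "segment": "new"},
]


def build_customer_view(transactions, web_logs, crm):
    """Combine three siloed sources into one profile per customer."""
    view = defaultdict(
        lambda: {"total_spend": 0.0, "orders": 0, "page_views": 0, "segment": None}
    )
    for t in transactions:
        view[t["customer_id"]]["total_spend"] += t["amount"]
        view[t["customer_id"]]["orders"] += 1
    for w in web_logs:
        view[w["customer_id"]]["page_views"] += 1
    for c in crm:
        view[c["customer_id"]]["segment"] = c["segment"]
    return dict(view)


customer_view = build_customer_view(transactions, web_logs, crm)
for customer_id, profile in sorted(customer_view.items()):
    print(customer_id, profile)
```

Customers who appear in only one source (like `c3`, who browsed but never bought) still get a profile, which is the point of looking at customers as a whole rather than one silo at a time.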

PROBLEM #2: Time Consuming Vendor & Supply Chain Management

Supply chains are already driven by numbers and analytics, but retailers have been slow to embrace real-time analytics and to harness huge, unstructured data sets.

SOLUTION: Automation and Prediction for Faster, More Accurate Management

Combine structured and unstructured data in real time for things like more accurate forecasts or automatic reordering.

RESULTS:


  • More efficient inventory management based on real-time data and behavior.
  • Optimized pricing strategies.
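
To make "automatic reordering" concrete, here is a minimal sketch using the textbook reorder-point rule (forecast daily demand times supplier lead time, plus safety stock). It is not Dataiku's method; the moving-average forecast, the function names, and the sample numbers are assumptions for illustration only.

```python
from statistics import mean


def forecast_daily_demand(recent_daily_sales, window=7):
    """Naive forecast: average of the last `window` days of unit sales."""
    return mean(recent_daily_sales[-window:])


def should_reorder(on_hand, recent_daily_sales, lead_time_days, safety_stock):
    """Return True when stock on hand is at or below the reorder point."""
    reorder_point = (
        forecast_daily_demand(recent_daily_sales) * lead_time_days + safety_stock
    )
    return on_hand <= reorder_point


daily_sales = [12, 9, 14, 11, 10, 13, 15]  # hypothetical units sold per day
# Forecast is 12 units/day, so the reorder point is 12 * 4 + 10 = 58 units.
print(should_reorder(on_hand=60, recent_daily_sales=daily_sales,
                     lead_time_days=4, safety_stock=10))  # prints False
```

A production system would likely swap the moving average for a proper forecasting model and feed it live sales data, but the decision rule itself can stay this simple.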

PROBLEM #3: Analysis Based on Historical Data

Looking back at shoppers’ past activity often isn’t a good indication of what they will do next.

SOLUTION: Prediction and Machine Learning in Real Time

Instead, real-time prediction based on current trends and behaviors from all sources of data is the key.

RESULTS:


  • Anticipating what a customer will do next.
  • A more agile business based on up-to-the-minute signals.
  • The ability to adapt automatically with customer behavior.
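
One simple way to let an estimate follow up-to-the-minute signals is an exponentially weighted moving average, which gives recent observations more weight than older ones. The sketch below is only an illustration of that idea; the `update_estimate` function, the `alpha` value, and the sample signals are assumptions, not something described in the Dataiku material.

```python
def update_estimate(previous_estimate, new_observation, alpha=0.3):
    """Exponentially weighted moving average update.

    A higher `alpha` makes the estimate react faster to new signals.
    """
    return alpha * new_observation + (1 - alpha) * previous_estimate


estimate = 100.0  # hypothetical starting estimate of hourly demand
for observed in [100, 120, 150]:  # hypothetical incoming signals
    estimate = update_estimate(estimate, observed)
    print(round(estimate, 1))
```

Each new observation nudges the estimate immediately, so the business does not have to wait for a periodic batch report to notice that demand is climbing.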

PROBLEM #4: One-Time Data Projects

Completing one-off data projects that aren’t reproducible is frustrating and inefficient.

SOLUTION: Automated, Scalable and Reproducible Data Initiatives

The best data teams in retail focus on putting a data project into production that is completely automated and scalable.

RESULTS:


  • A more efficient team that can scale as the company grows.
  • With reproducible workflows, the team can work on more projects.

While each organization is different, the data challenges are the same. It takes a data production plan to guide a team of any size to a working predictive model that yields meaningful insights for the business.

How to Complete any Data Project in Retail

The most successful retail companies worldwide solve these four issues by efficiently leveraging all of the data at their fingertips and following set processes to see data projects through from start to finish. They also ensure those data projects are reproducible and scalable, so the data team is free to work on new projects rather than maintaining old ones. This is as easy as following the seven fundamental steps to completing a data project:

  1. DEFINE: Define your business question or business need: what problem are you trying to solve? What are the success metrics? What is the timeframe for completing the project?
  2. IDENTIFY DATA: Mix and merge data from different sources for a more robust data project.
  3. PREPARE & EXPLORE: Understand all variables. Ensure clean, homogeneous data.
  4. PREDICT: Avoid the common error of training your model on both past and future events.  Train only on data that will be available to you when a predictive model is actually running.  Choose your evaluation method wisely; how you evaluate your model should correspond to your business needs.
  5. VISUALIZE: Communicate with product/marketing teams to build insightful visualizations.  Use visualizations to uncover additional insights to explore in the predictive phase.
  6. DEPLOY: Determine if the project is addressing an ongoing business need, and if so, ensure the model is deployed into production for a continuous strategy and to avoid one-off data projects.
  7. TAKE ACTION: Determine what should be done next with the insights you’ve gained from your data project.  Is there more automation to be done? Can teams around the company use this data for a project they’re working on?
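
The advice in step 4 about not training on future events can be sketched as a time-based split: everything before a cutoff date is used for training, and everything on or after it is held out for evaluation. The event records and the `time_based_split` helper below are hypothetical and exist only to illustrate the idea.

```python
from datetime import date

# Hypothetical labeled events: (event_date, features, label)
events = [
    (date(2023, 1, 5), {"visits": 3}, 0),
    (date(2023, 2, 10), {"visits": 7}, 1),
    (date(2023, 3, 15), {"visits": 2}, 0),
    (date(2023, 4, 20), {"visits": 9}, 1),
]


def time_based_split(events, cutoff):
    """Train on events strictly before `cutoff`; evaluate on the rest."""
    train = [e for e in events if e[0] < cutoff]
    test = [e for e in events if e[0] >= cutoff]
    return train, test


train, test = time_based_split(events, cutoff=date(2023, 3, 1))

# Every training event precedes every evaluation event,
# so nothing from the "future" leaks into training.
assert max(e[0] for e in train) < min(e[0] for e in test)
print(len(train), "training events;", len(test), "evaluation events")
```

A random shuffle of the same events would mix past and future records, which tends to make a model look better in evaluation than it will perform once it is actually running.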

There is no doubt that data science, machine learning, and predictive analytics combined with big data will become an even more fundamental part of both online and traditional retail in the coming years. All retail organizations will use them, but only the successful ones will have an effective data production plan that yields insights into their business and gives them an edge over the competition.


One response to “How to Solve the Most Common Data Problems in Retail”

  1. Years ago, Tom Anderson wrote a great piece on straw men as an intellectual tool – here we see the same thing. Problem #1 (static silo-ed views) means they have poor researchers, not poor data – none of the results are likely from getting and integrating more data. Problem #2 (supply chain) is likewise not going to be impacted by unstructured data. It’s going to be improved by better prediction/buying models. Problem #3 (historical data) – remember what Santayana said. In most businesses, what the shopper bought last time is what they’ll buy next time. Current trends affect very few categories of products (fashion excepted) and any impact we see takes a long time to happen.
    We should not be so quick to embrace the “making changes to business constantly” approach to management. Businesses need long term planning and long term marketing to get to where they want to be. I’m not advocating an ostrich approach, but few businesses would survive if they change their purpose and focus every month.
