
Has Research Quality Really Gone Downhill?

In articles about the quality of consumer insights, a common opinion is that research quality has gone downhill in recent years. I question that perspective.



Ron Sellers

This post actually started as a reply to Scott Weinberg’s terrific Greenbook Blog post Is Online Sample Quality A Pure Oxymoron?  After doing a little writing in the reply box, I realized my comments were lengthy enough to warrant an actual blog post of their own rather than a reply.

Blogs, articles, and reader comments I’ve seen regarding research quality often have the perspective that quality in the consumer insights industry is worse today than at other times in the sector’s history.  Whether various writers blame this on DIY research, online panels, new methodologies, lack of training, or other reasons, this is a fairly common perspective.

Having been in the industry for more years than I care to admit, I have a somewhat different view.  Yes, quality is often pretty bad today, and I shudder to read lengthy lists of transgressions that Scott and others have personally witnessed.  But I question whether things are worse today than they were in past years.

First, as human beings we have the tendency to focus on recent events and situations and forget about what’s happened before.  This was really brought home to me by reactions to the New England Patriots’ recent Super Bowl victory over the Seattle Seahawks.  For those who aren’t NFL fans, Seattle was three feet away from the winning touchdown with about 40 seconds left.  Seattle has one of the most dominating running backs in the game in Marshawn Lynch, and a terrific running quarterback in Russell Wilson, so everyone expected them to use one of those two to score the winning touchdown by running the ball in.

Instead, Seattle called a passing play, and the ball was intercepted at the goal line to preserve an unexpected win for the Patriots.  After the game, a lot of the talk by pundits and fans alike focused on two opinions:

  • That was the worst play call in the history of the Super Bowl (and even that it was the worst play call in the history of sports).
  • That was the best defensive play in the history of the Super Bowl.

Now, it was a pretty bad call and a pretty great defensive play, but was it really the worst/greatest in all of 49 different Super Bowls?  I won’t get into details, but without much effort I can think of two other plays that would give it a run for the “best defensive play ever” title.  But because it’s what we just witnessed a few days ago, and because many people haven’t seen a single play from Super Bowls back in the 70s or 80s, it’s considered the best/worst ever.

We see the same things when Americans are surveyed about who is the greatest president ever.  Modern names such as Ronald Reagan and Bill Clinton generally outpoll historical greats such as Thomas Jefferson, James K. Polk, or Theodore Roosevelt.  But most respondents experienced Clinton’s presidency, while for most people Polk is just another name they might have heard of briefly in high school history.

So as bad as things are in consumer insights, are things really worse than they were 10, 20, or 30 years ago?  There’s still a problem of unqualified people doing bad research, just using a different methodology.  We still have decision makers cutting corners in order to get the lowest cost possible.  I’m guessing we’ll soon have some of the same issues with galvanic skin response, eye tracking, and any of the newer methodologies as they become more popular.

Back in the days when the phone survey was king, I worked for a boss who ordered 70% listed sample and 30% RDD sample for most studies because using listed sample was much cheaper in the phone room.  His reasoning?  Only 30% of phone numbers (at that time) were unlisted, so he was using the RDD to represent the unlisted phone numbers.  He couldn’t figure out that if 70% of phone numbers were listed, it would mean 70% of the RDD numbers would be listed, so in effect he was running with 9% unlisted and 91% listed sample.  Oh, and clients were never informed about his sample decisions, so they were unaware of the possible quality implications.
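The arithmetic behind that blunder is worth spelling out. Here is a minimal sketch (assuming, as the story states, that roughly 70% of phone numbers were listed at the time):

```python
# Sketch of the boss's flawed sample mix. Assumption from the anecdote:
# at the time, about 70% of phone numbers were listed, 30% unlisted.
listed_share = 0.70    # true fraction of all phone numbers that are listed
from_listed = 0.70     # fraction of the sample drawn from listed directories
from_rdd = 0.30        # fraction drawn via random-digit dialing (RDD)

# RDD reaches listed and unlisted numbers in their true proportions, so
# only the unlisted slice of the RDD portion actually covers unlisted homes.
unlisted_in_sample = from_rdd * (1 - listed_share)        # 0.30 * 0.30 = 0.09
listed_in_sample = from_listed + from_rdd * listed_share  # 0.70 + 0.21 = 0.91

print(f"listed: {listed_in_sample:.0%}, unlisted: {unlisted_in_sample:.0%}")
# prints: listed: 91%, unlisted: 9%
```

To represent unlisted households at their true 30% share, the entire sample would have had to come from RDD, or the RDD completes would have needed to be weighted up accordingly.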

I also remember fielding a tracking study by phone in about 1988.  It had ridiculous demographic quotas and could be over an hour long for some people.  In getting it programmed, I came upon a question that made absolutely no sense to me – I didn’t even understand what it was asking.  When I questioned the client, he also had no clue and said it was worthless.  When I asked if we could change or eliminate it, he was shocked – “Absolutely not – it’s a tracking study!”  So we continued to track meaningless data for them.

I remember being a respondent for an in-person interview.  The study was about oil company advertising, and I got to listen to a variety of radio commercials with the name of each company bleeped out to see if I could identify the sponsor.  The audio editing was terrible; the bleeping generally consisted of things such as “Texa-beep” to try to hide the Texaco brand.

At the end of the survey, the interviewer asked my occupation, and I told her I was a project director at a market research company.  She looked at me and said, “I can’t put that.”  I told her that the screener had not included a security question or asked me my occupation, and she informed me that “They just know they’re supposed to ask that.”  I told her I also did some media work for the company, so she lied on the questionnaire and put me as a media liaison so that she could get credit for the interview.

I was asked by one client to falsify data to make sure that their intended advertising campaign would look good in the findings.  I was told by another client to change a question so they could get the answers they wanted, and that I had to learn that “Sometimes you want real answers and sometimes you want to make sure you get the answers you want.”  Both of these happened back when fax machines were considered to be high tech.

When I took a corporate research job in 1993, the first thing I did was visit all of our vendors.  I monitored survey calls at one phone room and heard interviewers going completely off script and getting into conversations with respondents.  When I raised the point with the field supervisor, she was totally comfortable with what they were doing and saw no problems (needless to say, under my watch they were never used again).

We also subscribed to a number of syndicated reports, including a Hispanic tracker.  When I started digging into the data, I found that the research company was regularly reporting and graphing quantitative data from subsets of fewer than 20 people (without noting the sample sizes anywhere).  When I objected, they admitted they “probably shouldn’t do that,” but no changes were made in future waves (which is why we stopped subscribing to it).
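For context on why subgroups that small are a problem, the standard margin-of-error formula for a proportion makes the point. This is a generic illustration, not a calculation from that tracker’s actual data:

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion p at sample size n."""
    return z * math.sqrt(p * (1 - p) / n)

for n in (20, 100, 400):
    print(f"n = {n:>3}: about ±{margin_of_error(n) * 100:.1f} points")
# n = 20 gives roughly ±21.9 points, versus roughly ±4.9 at n = 400
```

At n = 20, a reported "55%" could plausibly be anywhere from the low 30s to the high 70s, which is why reporting such subsets without noting the base size is so misleading.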

Back on the vendor side, I took over a telephone brand tracker for a bank in 1998.  The previous vendor had first asked people an aided question about where they banked.  Only after naming about six different local banks did they ask “unaided” brand awareness.  No bias there, of course!

I could relate many other horror stories from 20 years ago as well as from last year, but you get the idea.  Are things really worse than they were in the past?  I have no quantitative way to measure that and prove or disprove my hypothesis, but I truly question whether consumer insights quality is worse today.  We had plenty of multi-paragraph concepts we had to read to people by phone, plenty of 30-minute phone questionnaires with lengthy and repetitive grids, plenty of questions which were incomprehensible, and plenty of shoddy sampling and field work back then.  We have many of the same problems today, just with different methodologies and technologies.

Is this even a relevant issue?  I would contend that it is.  For one thing, it is easy to become depressed when we believe that things are going downward, and figure there’s nothing we can do about the trend.  Can you change the whole industry?  Maybe not.  But you can darn well make certain that what you do in the industry is done properly, and you can work to point out the quality problems to those who fail to understand their importance.  If you feel the battle is already lost, it becomes much easier to throw in the towel.

For another thing, it becomes easy to blame certain methodologies for the problem, rather than human greed, sloth, or incompetence.  We have tremendous government waste in our republic, but then again so have countries under monarchies, dictatorships, socialist governments, and communist governments.  Is government waste a function of our form of government or of government in general?  Only in understanding that question can we attack the real problems rather than the symptoms.

Yes, online panel research is often atrocious, but so was a lot of the phone, intercept, and mail research that went on in the past, and so is much of the big data analysis and social media monitoring that goes on today.  We need to attack the root causes rather than the symptoms.

Finally, I want to be very clear that this post is not any sort of attack on what Scott or others have written on this topic.  Scott’s post is what got me thinking about today versus the past, but it was outstanding and I agree with the points he made.  I just wanted to bring a slightly different perspective to the discussion than we often hear about when discussing research quality, because I believe it is an important nuance that deserves some consideration.  This is an ongoing battle, not a recent development.


6 responses to “Has Research Quality Really Gone Downhill?”

  1. I really appreciate voices of quality in the industry. And I equally appreciate good context. So I’m glad to see these two pieces linked. I also have very clear memories of our past wins and losses, and a keen sense of conscience for our present and future. There will always be sinners and saints in our field. And every method will have its strengths and weaknesses. Our goal should be to stay open, transparent, and improving.

  2. Ron, I really appreciated your article and I agree with it totally. I have been in the research business over 40 years and, as Melanie says in her comments, we have all had saints and sinners as clients. What worries me most today is that so many surveys conducted online, on cell phones, and via panels are not representative of the respondents they are trying to reach. There is no appreciation for the science of our business, like random samples, confidence levels, completion rates, margins of error, etc. In fact, these statistical measures are viewed as mere obstacles and raisers of cost, which also slow down meeting schedule demands. Please call me at 484-483-7692. I would like to ask whether you would be willing to participate in an important Research Conference in May.

  3. Ron, I think your points are well made and on point. I think the realization today is that even when stat research is done well, the accuracy is less than what we assumed in the past. Corollary data has made that apparent, and it is clear that stated data leaves a lot to be desired. Having said that, up until a few years ago, it was the best tool we had for the decision process. It’s no surprise that managers are relying more on data compilation and correlations these days; the real issue is that, beyond qualitative research, research has lost equity and is largely directional in many cases.

    The issue(s) aren’t about tools or mobile or anything other than the fact that the validity of stated responses doesn’t mirror the results where measurements at decision points are concerned. What is apparent is that the measurements need to be more granular and segment oriented, and further, that boiling the ocean is not an option. Whatever tools are chosen need to be administered in a way where actions speak louder than words. The technology exists to monitor floor traffic, push stimulus, and identify influencers, all without asking a question. It seems a natural link for panel providers and stat managers to use smartphone participation to accomplish those goals. By joining those technologies with actual purchase metrics, the data is stronger and the accuracy almost has to be better. The one caveat is that the models don’t support mass marketing or large-scale bucketing. Researchers have to let go of stat-based analysis and look toward the future if they want to keep a place at the table. Just in the last three weeks, I have had three clients directly impacted by management’s needs for more timely and efficient data. Their jobs are on the line. Like it or not, the definitions are changing. Researchers need to focus on results that inform, not techniques that relate to traditional validations.

  4. I may be wrong, but techniques which monitor floor traffic, push stimulus, identify influencers, monitor smartphone participation, etc. must also be accurate and true. Such techniques and methods can also have hidden biases and result in a compilation of data junk which is not representative and cannot be easily identified. Time-of-day purchase, gender and age imbalances, and who or what is an influencer can also be very subjective and inaccurate. I still want to primarily use techniques that have a scientific and objective basis, rather than data-based clouds and Big Data that can have all kinds of hidden biases and imbalances.

  5. I guess it’s a question of collecting enough data to see consistency with regard to observation techniques, and most using them understand that there have to be baselines. Things like purchase timing are recorded through beacons and can be overlaid with purchase data and even loyalty card purchase data.

    The interesting thing about beacon and purchase data is that you can map purchase timing, store navigation, and time spent in an area to actual purchase, and you can compare store to market trending. Is it 100% accurate? No. Does it have bias? Yes. But the difference is that the bias is not introduced; it is already present and in many ways can be quantified. The focus here is on answers that either define how choices are made or deliver on at least some of the purchase considerations at the time of service.

    Dropping a coupon to a smartphone while within the purchase consideration likely has a degree of influence that a targeted email/mailing or receipt coupon cannot deliver, and the validity of that can be compared if the data is integrated with purchase metrics. In the above case, if brand-specific coupons for diapers are delivered to those lingering in the diaper aisle over 7 days and they are not used, while the same process occurs in paper towels with a 10% lift, that is a finding regardless of demo. It might be interesting to see the influence difference between targeted non-store marketing and in-store site stimulus.

    The same is true of non-conversions when the traffic is high within an area. Certainly gender-specific products or age-related products have additional considerations that might warrant other techniques. Most manufacturers have very little data on end users of their products, and there might be a way to create a new data resource that would better serve both retailers and manufacturers. In any event, stated techniques most likely have a much larger margin of error than that associated with the confines of the stat program. The question is how much for any given study, and it varies. For me, it’s not an either/or question but more how we can leverage less invasive methods to gain more understanding.

    It is just a different way of looking at data and it is not meant to feed into mass marketing constraints.
