
Brain Scans, Cultural Popularity And Sample Sizes


Editor’s Note: The business of market research is changing rapidly, and nonconscious measurement techniques are growing ever more important as a class of consumer insight techniques. Nonconscious methods are an area exhibiting rapid cycles of innovation and growth. Certainly adoption has been slow for a variety of reasons, but the impact across many categories has been high. In today’s post one of the preeminent applied neuroscientists in the world today, Dr. Thomas Ramsoy, gives a bit more insight into why this is such an important topic and what traditional market researchers need to understand about the science behind the techniques.

We think this is such an important topic that we’re launching a unique one day event in November that is designed to advance the conversation and increase collaboration among corporate clients, market research consultants and technology providers.

Save the date for the Nonconscious Impact Measurement Forum (#NIMF) on November 6th! Brought to you by GreenBook and the Burke Institute, NIMF is an intensive day of learning and collaboration around the business impact generated by the dynamic and growing field of nonconscious measurement. You will learn the business case for and be inspired by nonconscious measurement methods, including applied neuroscience, implicit approaches, behavioral economics techniques, biometric measurement and holistic nonconscious models, and have the opportunity to test out many methods being discussed for yourself!

Today’s post is a great lead-in for the type of bigger conversations that will be happening at NIMF.


By Thomas Ramsoy

One of the key criticisms and concerns one hears about applied neuroscience and neuromarketing is sample size. Is it really valid, let alone representative, to test 40 people? In traditional methods, such as surveys, interviews and even focus groups, the number of respondents usually runs in the hundreds or thousands. In applied neuroscience we usually see tests of around 100 people in nation-wide samples. In studies in mobile settings, we can see tests with as few as 40 people. Surely, that is a problem?

Maybe the reason that applied neuroscience is not testing hundreds or thousands of people is that it will be too expensive or too hard to scale? Testing a single person with functional Magnetic Resonance Imaging (or fMRI) often costs something like $2,000. Even with other methods such as EEG (electroencephalography), although the price can be down to $100 per person or lower, it may be hard to scale, since every person you test simultaneously requires a full hardware and software setup.

Things have changed dramatically. Several neuroscience studies have found that neuroscience measures can predict behaviour. Even with very small samples, neuroscience can predict not only individual choice, but effects at a cultural level. For example, in a recent study by Dmochowski and colleagues (Dmochowski et al., 2014), the brain responses of only 16 participants predicted effects at the cultural level. Using EEG, the researchers developed a novel method that assessed how consistent brain activation was across the group. Higher group consistency in responses to TV series and commercials was related to both higher social activation (the number of Tweets) and stated preference (Nielsen ratings).
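The group-consistency idea can be sketched in a few lines of code. The snippet below is a toy illustration only, not the actual method of Dmochowski et al. (who use a more sophisticated component-based technique): it scores a group by the mean pairwise correlation of subjects' responses to the same stimulus, using simulated data.

```python
import numpy as np

def group_coherence(signals):
    """Mean pairwise Pearson correlation across subjects.

    signals: (n_subjects, n_timepoints) array holding one channel's
    response from each subject to the same stimulus.
    """
    n = signals.shape[0]
    corr = np.corrcoef(signals)              # n x n correlation matrix
    upper = corr[np.triu_indices(n, k=1)]    # unique subject pairs only
    return upper.mean()

rng = np.random.default_rng(0)
shared = rng.standard_normal(500)            # common stimulus-driven component

# 16 simulated subjects: one group tracks the shared signal, one does not
coherent = shared + 0.5 * rng.standard_normal((16, 500))
incoherent = rng.standard_normal((16, 500))

# Expect a clearly higher coherence score for the group with the shared response
print(group_coherence(coherent), group_coherence(incoherent))
```

A single score like this, computed per stimulus, is the kind of quantity that can then be correlated with external outcomes such as tweet volume or ratings.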

Other studies have found a similar pattern. In a study by Berns and Moore (2012) the researchers went back to older fMRI data from a study of responses to music. Here, they found that activation of a deep brain structure – the nucleus accumbens – predicted whether the music had become a massive cultural hit, far better than self-reports from the same people. These and other studies have hinted that there may be some common brain activations in a group that are representative of how a whole culture will respond.

So the criticism of applied neuroscience on sample size is erroneous, and the concerns are unwarranted. Part of the problem with these attitudes is that they assume that traditional research methods are the gold standard. Think of it this way: in traditional methods, each person contributes only a few data points, as given by their answers. Furthermore, traditional methods only assess what a person is privy to, and is willing and able to share.

Neuroscience measures are different. An fMRI scan divides the brain into 3D boxes (voxels) of approximately 3x3x3 millimeters, measuring several thousand voxels across the brain. Each voxel value is assessed about every other second, so a test lasting 20 minutes yields roughly 2.1 million raw data points per person. In EEG, each electrode samples the signal at millisecond resolution. This signal is then divided into several frequency bands (e.g., alpha, beta, delta, gamma, theta), and a single person can therefore contribute an order of magnitude more data than in fMRI: 60 million data points or more per person. By merely contrasting the data load of traditional methods and applied neuroscience, we can see that we are in a completely different ball game.
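The arithmetic behind these figures is easy to make explicit. The voxel count, electrode count and survey length below are illustrative assumptions chosen to be consistent with the numbers in the text, not specifications of any particular scanner, headset or questionnaire:

```python
# Rough data-volume comparison, using the figures from the text as assumptions.

# fMRI: ~3,500 voxels, each measured once every other second for 20 minutes
fmri_voxels = 3_500
fmri_samples = (20 * 60) // 2                  # one measurement per 2 seconds
fmri_points = fmri_voxels * fmri_samples
print(fmri_points)                             # 2,100,000 data points per person

# EEG: e.g. 10 electrodes at 1 kHz (millisecond resolution) for 20 minutes,
# decomposed into 5 frequency bands (alpha, beta, delta, gamma, theta)
eeg_electrodes = 10
eeg_samples = 20 * 60 * 1000                   # samples per electrode
eeg_bands = 5
eeg_points = eeg_electrodes * eeg_samples * eeg_bands
print(eeg_points)                              # 60,000,000 data points per person

# A survey respondent answering, say, 50 questions contributes 50 data points
survey_points = 50
print(eeg_points // survey_points)             # ~1,200,000x more raw data than a survey
```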

Crucially, neuroscience measures do not only assess a limited set of responses that a person has conscious access to. These measures provide deep insights into what people are attending to and missing; how consumers’ emotional responses occur within milliseconds of seeing a message, a brand or a product; and how particular motivational responses drive attention, desire and ultimately choice.

Today, the scientific foundation of applied neuroscience is unparalleled, and continues to grow in both its insights and its applications. Those who employ and adapt these methods (with the premise that they are used correctly) consistently see an improved understanding of their consumers, increased communication effects, and thereby increased revenue.

Today, it is fair to claim that the obstacles to adopting applied neuroscience lie not in the science or the methods, but within the market research industry itself.



Berns, Gregory S., and Sara E. Moore. “A Neural Predictor of Cultural Popularity.” Journal of Consumer Psychology 22, no. 1 (2012). doi:10.1016/j.jcps.2011.05.001.

Dmochowski, Jacek P., Matthew A. Bezdek, Brian P. Abelson, John S. Johnson, Eric H. Schumacher, and Lucas C. Parra. “Audience Preferences Are Predicted by Temporal Reliability of Neural Processing.” Nature Communications 5 (2014). doi:10.1038/ncomms5567.


9 responses to “Brain Scans, Cultural Popularity And Sample Sizes”

  1. Thanks Thomas, for this excellent perspective on the sample-size issue, and thanks Lenny, for connecting it to the upcoming Nonconscious Impact Measurement Forum. The Dmochowski study is rightly making a lot of waves in both academic and marketing circles. Another study worth mentioning along the same lines is Emily Falk et al., “From Neural Responses to Population Behavior: Neural Focus Group Predicts Population-Level Media Effects,” which Roger Dooley covered nicely in 2012 on his blog.

    What’s interesting to me about the Dmochowski study is that it finds the population correlation not with the magnitude of the average response in the small sample, but with the concentration of the responses. In other words, what is predictive is the variance in responses, not the mean. Similar results have been found in other studies in related areas, for example, Uri Hasson’s work on audience synchronization while watching movies. This approach differs from what many (not all) neuromarketing firms are doing, which is providing magnitude metrics rather than spread metrics as the indicator of “good” responses – e.g., the higher the average attention allocated to an ad, the better. Given these results from variance measures (also sometimes called reliability measures vs. validity measures), that assumption may be worth looking at again.

    It is also more than a little ironic that this breakthrough validation of neuro measures using proprietary TV viewership and tweeting data should have come from a University researcher, rather than the ratings giant’s own in-house neuro research unit.

  2. Thanks Steve,

    I think we’ve seen some discussions outside this forum that suggest people may misinterpret these results. The point of these studies is NOT that we are no longer interested in differences between groups/genders/ages/ethnicities etc. These are still interesting and important questions to deal with.

    The crucial component here is that *across* groups and differences, some situations and events seem to robustly activate the same kind of response, and this level of “group coherence” is highly predictive of effects at the cultural or even global level. Indeed, it suggests that many of our responses are highly similar. To many of us, this is not surprising at all, since we all have the same foundational emotional structures. However, the fact that we can measure this in smaller samples, and predict cultural effects, is indeed a positive surprise.

    The really nice thing IMO is that many of these studies do not even have to focus on brain-mind links as such: they can do well by only focusing on brain responses and what they predict, and be completely agnostic about what they “mean”.

  3. Even when we find a huge effect size in an experiment, the effect size may still be zero in the population of interest (especially given that probability sampling is rarely used in human experiments). But in the real world of market research those situations are rare. Much more typically, it will be necessary to compare a set of measures among subgroups of consumers whose responses/reactions will often vary considerably across groups. Replications and meta-analyses also were not cited in the article. (BTW, I’d tried to respond last week but for some reason kept getting a CAPTCHA error. Fingers crossed this makes it through.)

  4. You can have a billion measures from each person – it still comes down to one person is one observation in a repeated-measures design. The degrees of freedom are based on bodies, not observations per body. Unless, of course, you’d be silly enough to treat your 2 million observations as independent. The only thing you can accurately predict from 16 people is what the other 16 are going to do in a four-cell square dance.

  5. Thanks for the comments. I don’t think anyone is arguing that we treat each voxel as an independent measure (which is why, e.g., fMRI studies notoriously use Family-Wise Error and False Discovery Rate corrections, although these remain incomplete or imperfect corrections for functional imaging).

    That said, the point of this new trend is to construct a “group coherency score”, which is an indication of how reliably a group, small or large, responds in the same way to a given event. This single score has been shown to be significantly related to (and hence predictive of) effects at a population level.

    @ Mr. Needel: “it still comes down to one person is one observation in a repeated-measures design”
    *** I completely disagree, and this illustrates how the point is missed. With repeated measures you can limit yourself to looking at a person as one single aggregate score, or you can look at the variance with which that person responds to a repeated event. This provides additional and crucial clues about how reliably a certain event leads to a given response. Without the variance, the mean often becomes meaningless.
    *** In the same way, for groups you can choose to look at how coherently or incoherently a group responds to a given event. Again, looking at an average score tells you part of the story, but looking at how coherently the group responds provides crucial additional insights.
    *** In neuroimaging research, we have known this for a couple of decades: while we could find neat aggregate responses showing parts of the brain that were activated across the group, we also realised that this did not tell us how large the individual variance was. Today, individual variance is a substantial area of research in its own right, in which we no longer treat individual differences as mere “noise” but look to the effects of genes and environments that drive such differences. The work my post refers to looks in the other direction, at instances where the variance is low.

    I’d love to have your comments on the primary article(s) referred to in this article and these comments, rather than assumptions about the claims. In the original paper by Dmochowski et al., all procedures and formulas are made available.
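    The repeated-measures point made in this reply can be shown with a toy example. The numbers below are entirely hypothetical, invented for illustration: two subjects rate the same ad ten times, with identical mean ratings but very different reliability, which an aggregate score alone would hide.

    ```python
    import statistics

    # Two hypothetical subjects rating the same ad on 10 repeated exposures.
    subject_a = [5.0, 5.1, 4.9, 5.0, 5.2, 4.8, 5.0, 5.1, 4.9, 5.0]   # consistent
    subject_b = [1.0, 9.0, 2.0, 8.0, 5.0, 0.5, 9.5, 3.0, 7.0, 5.0]   # erratic

    # Same aggregate score...
    print(statistics.mean(subject_a), statistics.mean(subject_b))    # 5.0 5.0

    # ...but the variance reveals how reliably each subject responds.
    print(statistics.stdev(subject_a), statistics.stdev(subject_b))
    ```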

  6. I think our comments pertained to your blog piece, not to the two original articles (which were paywalled when I checked last week). If strong generalizations had been made on the basis of two unreplicated studies, I would object even had I been the author of both. Multiple measurements of a construct are normally preferable to a single measurement, assuming no serious measurement concerns (an ambitious assumption in the case of neuroscience). Marketing researchers are aware of this and we frequently use composite scores and latent variables in our research.

    Just for background, I am by no means anti-neuroscience. In fact, my father was a pharmacologist who worked in neuroscience for many years, and I can recall being called a Nazi on several occasions for the mere suggestion that the human brain was in some way connected with human behavior. It was all the “environment” in those days. I feel the discipline holds great promise, but there is much work to be done, and considerable amounts of taxpayer money are, accordingly, being invested in it.
