Is Social Media Data Predictive? A Comparison Of Polls & Sentiment Analysis
Editor’s Note: Malcolm De Leo has been doing a bang-up job lately of using social media analysis as a tool to understand current events. From the Netflix debacle to the 2012 Presidential Race, over on his Innovation Muse-ings blog he has been showing just how far you can stretch the basic metrics of buzz, sentiment, and intensity to generate insight. In this post he teams up with Professor Mitch Lovett of the University of Rochester to examine the correlation between social media data and traditional polling. The results are intriguing, to say the least. Last week at ESOMAR, Facebook shared that its internal polling matched Gallup and Rasmussen data and that its internal data predicts box office performance, so a compelling argument is emerging that social media IS representative and predictive. Of course more testing and validation is needed, but at the very least it seems safe to assume that if we don’t use the potential gold mine of data available via social media, we are doing ourselves and our clients a disservice.
By Malcolm De Leo & Professor Mitch Lovett
If you’ve been following my blog you know that I’ve been interested in the use of social media data in understanding political races for a while now (see my previous two political posts on President Obama (Post 1, Post 2)). The other day Professor Mitchell Lovett of the University of Rochester, Simon School of Business called me to tell me about some analysis he had been doing regarding the primary season using NetBase’s Brand Passion Index. He is a fantastic collaborator in the world of social media theory and application, not to mention one of our lead users and a main contact point for our university partnership with the Simon School of Business. We had a great conversation that turned into a small collaboration, and this co-authored blog post.
This blog post is really about how social media can capture the voice of the “crowd” to help make sense of, and possibly even forecast, things like poll movements and candidates’ strategies. In the process we will end up touching on two of Professor Lovett’s recent research projects, including one still very much in process. Throughout we are going to focus on the current frontrunners: Mitt Romney, Herman Cain, and Rick Perry (so sorry to all the Gingrich, Paul, and Bachmann fans).
We’re going to share with you some analysis and thoughts on the Republican primary and a little on what it might portend for the general election. Throughout we will be referring to the NetBase Brand Passion Index; NetBase’s explanation of how it is calculated has more details. The Brand Passion Index is made up of three key social media measures: buzz (total number of mentions of the candidate), sentiment (positive or negative content about the candidate), and passion (intensity of emotion about the candidate). With that intro, let’s dive in.
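To make the three measures concrete, here is a minimal Python sketch of how one month of data for one candidate might be represented. Everything in it is an assumption made for illustration: the class name, the field names, and especially the net sentiment formula (positive minus negative mentions, as a percentage of opinionated mentions), which is one common convention and not necessarily how NetBase computes it. Passion is simply taken as a supplied score, since this post does not spell out its calculation.

```python
from dataclasses import dataclass


@dataclass
class MonthlySnapshot:
    """One candidate's social media measures for one month (hypothetical layout)."""
    candidate: str
    positive_mentions: int
    negative_mentions: int
    neutral_mentions: int
    passion: float  # supplied intensity score; not computed here

    @property
    def buzz(self) -> int:
        # Buzz: total number of mentions of the candidate.
        return self.positive_mentions + self.negative_mentions + self.neutral_mentions

    @property
    def net_sentiment(self) -> float:
        # ASSUMPTION: (positive - negative) / (positive + negative) * 100.
        # This is an illustrative convention, not NetBase's documented formula.
        opinionated = self.positive_mentions + self.negative_mentions
        if opinionated == 0:
            return 0.0
        return 100.0 * (self.positive_mentions - self.negative_mentions) / opinionated


# Made-up numbers, purely for illustration.
snapshot = MonthlySnapshot("Candidate A", 300, 100, 600, passion=50.0)
```

With these made-up counts, the snapshot has a buzz of 1,000 mentions and a net sentiment that is positive, since positive mentions outnumber negative ones.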
Herman Cain. Cain has virtually no political experience, but lots of business experience. Whether you agree or not with the concept of 9-9-9, it is catchy and has led to a huge amount of attention for Cain. His poll numbers have been on the rise, particularly in October, even though his organization and fundraising are well behind the other two leading candidates (for example, see OpenSecrets.org for financial contributions data, which we will reference throughout). What can social media add to this?
To understand the social media story we see, you really should know a little about two of Professor Lovett’s recent research projects. The first, coauthored with Ron Shachar, is titled “Seeds of Negativity: Knowledge and Money” and was recently published in Marketing Science, one of the leading quantitative marketing journals. That paper shows that when voters know more about a candidate and the opponent, and when candidates have more money, candidates go more negative in their advertising. While that study considered general elections and negativity in advertising, the general idea may be relevant to primary elections and other forms of attack, such as during debates and on the campaign trail. Applied to Cain’s situation, the increased attention is going to lead to more attacks on Cain. That’s exactly what we saw in the last debate, and this is what we see playing out in social media, too.
The second research project is still in its infancy, but even the preliminary results could have relevance here. Mitch Lovett and Paulo Albuquerque, also at the Simon School of Business, correlated NetBase’s net sentiment measure to polls for over 30 races in 2010 for Governor and U.S. Senate. Though their results are still tentative, they suggest that net sentiment may lead changes in polls by up to a month. In other words, we might be able to use the net sentiment as a leading indicator of public opinion. We’ll see some anecdotal evidence on this shortly.
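The “leading indicator” idea can be sketched as a lagged correlation: pair each month’s net sentiment with the poll average one or more months later, then measure how closely the two series move together. The Python below is only an illustration under stated assumptions. The function names are hypothetical, the numbers are invented toy data (not results from the study or from NetBase), and plain Pearson correlation is our stand-in, since the post does not say which method the researchers used.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def lagged_correlation(sentiment, polls, lag):
    """Correlate sentiment in month t with the poll average in month t + lag."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if lag == 0:
        return pearson(sentiment, polls)
    # Drop the last `lag` sentiment values and the first `lag` poll values
    # so that each sentiment reading lines up with a later poll reading.
    return pearson(sentiment[:-lag], polls[lag:])


# Invented toy data: polls are constructed to follow sentiment by one month.
toy_sentiment = [10, 20, 15, 5, -5, -10]
toy_polls = [12, 15, 20, 17.5, 12.5, 7.5]

same_month = lagged_correlation(toy_sentiment, toy_polls, lag=0)
one_month_lead = lagged_correlation(toy_sentiment, toy_polls, lag=1)
```

Because the toy polls were built to trail the toy sentiment by exactly one month, the one-month-lead correlation comes out higher than the same-month correlation. With real data, one would compare several lags and treat any single race as anecdotal.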
Okay, so now let’s turn to Herman Cain’s Brand Passion Index. First, consider the small yellow circle in the middle of the chart. This was Cain’s brand passion index in August. The center of this circle is positioned about half-way left-to-right on the graph, which indicates people on social media express moderate passion (for a political candidate, though low for a top commercial brand) about Cain. It is also about half-way up-and-down, which indicates people view Cain relatively neutrally (which is actually relatively positive for politicians, who generally are viewed fairly negatively). Cain’s buzz (indicated by the size of the circle) in August was tiny for a national candidate. People just weren’t talking about him much online.
In September he started to get more attention and this led to an initial increase in net sentiment, but that all changed in October, when for the first time he became the focus of public attention. He came under attack by his opponents and under the scrutiny of the media. His buzz went way up, but his net sentiment took a nose-dive and he is now well in the negative. How does all of this square with polls?
The chart above shows just how well the movements in net sentiment track with the poll averages. In fact, this chart is displaying the net sentiment from the prior month. In other words, the net sentiment seems to predict the movements in poll averages extremely well, at least for Cain. Given what we just learned about Professor Lovett’s research, it seems we are likely to see three things in the future: 1) Cain may slide in the polls some in the coming month as mass opinion catches up with the sentiment expressed in social media, 2) Cain may continue to be attacked (for now), and 3) Cain may start throwing back some serious punches, too. Given his lower funding levels, for now Cain’s attacks might be restricted to the debate and campaign event setting.
Mitt Romney. Mitt Romney proved himself an impressive fundraiser in the last Presidential nomination race, and that funding machine and organization appear well in place this time around as well. In the polls he has been consistently in the top two places at least since August, and he is a competent debater. What does social media have to add?
Romney has considerably lower passion than Cain. People just aren’t that excited about Romney. He is the kind of candidate that is probably adequate and could end up the last man standing, but people aren’t charging the gates for him. He’s no Ronald Reagan. At the same time, he is getting a lot of buzz, with most months showing hundreds of thousands of mentions. How about net sentiment? His net sentiment is actually a little lower than even Cain’s lowest level. As for changes, his net sentiment appears to have decreased slightly in September, but passion is actually picking up (a little), and buzz is essentially unchanged. Nothing major going on here, which for Romney is probably a good sign given his other advantages.
So, one has to ask, why isn’t Romney advertising on television to leverage his funding advantage? Romney’s opponents haven’t been as well known as Romney, so based on Lovett’s research, the benefits of attacking them were likely lower. As a result, Romney might have preferred to wait. Besides, attacking (at least early) may cast Romney as too aggressive (remember Reagan’s wisdom to not attack fellow GOP members). The social media data suggests that Romney’s opponents are even less secure than their current poll numbers might indicate, and the longer he waits, the clearer his main target will become. At the same time he is building his war chest. In other words, he doesn’t really need to go to the airwaves, but given his cash on hand, look out if someone takes a swing at him.
Rick Perry. With Rick Perry, we see a different story altogether. His entry into the race was received very positively and he led the polls for a while afterward. He had an excellent fundraising month in September, but he didn’t seem to fare as well in the debates, and he has been polling much lower lately, with Cain stealing his top position. What can social media add to this?
His passion level is actually the highest among the top three players. When he entered the race in August his buzz just took off. Like Cain, however, his negative sentiment also took off. Then the gaffes started. He performed relatively poorly in the debates. He had the misstep on immigration. His tax plan, while bold in concept, didn’t have the desired punch. And this past week he stated he might start skipping the Republican debates. His net sentiment has headed essentially straight south. Worse, since August his buzz, though still much higher than his pre-entry level, has decreased, suggesting he is losing the interest and attention of the public. A look at the graph below again suggests that net sentiment is tracking ahead of the polls, and that is a bad thing for Perry, since net sentiment keeps going down.
What can we expect from Perry? Well, he isn’t making the impression he’d hoped for in the debates, and now he is much better known, in a bad way. With his poll position sliding and his large cash position, he seems likely to go to the air. Will he go negative? Professor Lovett’s research suggests some negative advertisements could be in his future plans. But whom he should target is less clear. Cain? Romney? Both?
This last picture pretty much sums it up. Perry is heading south rapidly, while Romney is only sliding a little and Cain dropped a lot in his first month in the full public eye. Cain is still the frontrunner on net sentiment, but he and Romney are pretty close on buzz, while Perry’s buzz has been decreasing. Perry is the clear winner on passion while Cain is a little weaker and Romney weaker yet.
Adding Obama to the picture really gives some perspective on all of this data. Though we see differences in the GOP candidates, they are all weaker than Obama on passion and much weaker on buzz. Yet, perhaps more importantly, they are higher than Obama on net sentiment. Is passion more important to the American people than sentiment? If so, Perry might have a chance for a soft landing and a rebound. Of course, that would also suggest whichever Republican is nominated could face a tough struggle against Obama. Stay tuned for more updates.