February is an exciting month for movie, television and sports enthusiasts. It’s that time of year: Super Bowl madness and Oscar buzz, a frenzy so electric that it spills over into the social media world. Think about it: how long does it take for you to see a Tweet or Facebook post once you hear the winner for Best Motion Picture, or following the first touchdown? Seconds?
Information flows so quickly that Twitter alone is handling approximately 35 MB of data a second, every second. The majority of this social media data represents public ‘streams of consciousness’: data that approximates human thought and speech, which we in the business call unstructured data. But, as anyone who has filled in a tax form, booked a flight or applied for a loan knows, computers prefer data with structure: data fields that have entries in strictly controlled formats.
The good news is that change is coming. Computers are becoming smarter about unstructured data, which isn’t just natural language: it’s also photos, videos, emails, tweets, audio, sensor data and mobile device data. For example, using advanced analytics technologies and natural language processing, we can now begin to understand the patterns behind human expression. Not just ‘key words’ that have been identified and indexed, but all words, as we type them or say them. We may have spent most of the computing age training humans to communicate with computers, using methods optimized for the machines, but today the reverse is happening. We’re now training computers to communicate with us and understand us in our own language. It is not easy. It is, as the IBM Research team behind Watson declared, a Grand Challenge. But it’s a challenge that can lead to some very important and far-reaching results.
Watson represents a pinnacle achievement in DeepQA and natural language processing, but there are many routes to the top and plenty of room for additional exploration and discovery. The team of researchers, students and faculty at the University of Southern California (USC) Annenberg Innovation Lab is taking a slightly different approach to the Grand Challenge. Rather than using the answer-and-question format of Jeopardy!, they are applying IBM analytics software, and some very smart coding and modeling, to train computers to understand and analyze Tweets. The project is part of an ongoing collaboration between the lab and IBM to explore how technology can be used by organizations, from news outlets and journalists to movie studios, broadcasters and retailers, to better understand, respond to, and predict public sentiment. To date, the model has been applied to film forecasting, the World Series and fashion retailing trends, in an effort to identify social media trends and better understand public opinions. For example, just last week IBM and USC analyzed millions of public tweets to determine the fans’ sentimental favorite among the Super Bowl quarterbacks: Tom Brady or Eli Manning. Just as in the game, Eli Manning made a late game-changing move and overtook Tom Brady as the Social Media MVP, with 66% positive sentiment vs. Brady’s 61%.
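This post does not describe how the positive-sentiment percentages were calculated. As a minimal sketch, assuming each tweet has already been labeled positive, negative or neutral, and assuming neutral tweets are left out of the ratio, a figure of that kind could be computed as below. The function name, the labels and the toy data are all hypothetical and are not taken from the IBM and USC model.

```python
from collections import Counter


def positive_share(labels):
    """Fraction of positive labels among the positive and negative ones.

    Assumption: neutral tweets are ignored. The original analysis may
    have treated them differently.
    """
    counts = Counter(labels)
    opinionated = counts["positive"] + counts["negative"]
    if opinionated == 0:
        return 0.0
    return counts["positive"] / opinionated


# Made-up labels for illustration only; not data from the study.
toy_labels = {
    "player_a": ["positive", "positive", "negative", "neutral"],
    "player_b": ["positive", "negative", "negative", "positive", "positive"],
}

shares = {name: positive_share(labels) for name, labels in toy_labels.items()}
leader = max(shares, key=shares.get)

for name, share in shares.items():
    print(f"{name}: {share:.0%} positive")
print("leader:", leader)
```

Leaving out neutral tweets is only one possible choice. Dividing by all tweets instead would lower both shares and could change which name comes out ahead.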
But why stop at the World Series and Super Bowl? The Annenberg Innovation Lab (AIL) and IBM are now collaborating with the Los Angeles Times to measure moviegoer sentiment toward the upcoming Academy Awards race. In a project dubbed the ‘Senti Meter’, we’re analyzing Oscar-related positive and negative opinions shared via millions of tweets to determine who will win “The People’s Oscars”. The project has been profiled by the Los Angeles Times, and we can all follow the evolving sentiment for the Best Actor, Best Actress and Best Picture categories over the next two weeks by visiting http://graphics.latimes.com/senti-meter/.
This project is much more than just analyzing which best picture or movie star fans are rooting for. It’s an example of how movie studios can better understand their audience’s preferences and use social media to improve their marketing programs and, in turn, their box office results. There is no doubt that the Twitterverse and other social media platforms are changing communication as we know it. Tweets, Facebook posts and blog posts are becoming a vital resource for many organizations, including the media industry, to identify trends, inform reporting, and understand and connect with their audience.
Think of how much change in the last year has been driven by, expressed in or reported on social media. Think how much social value could have been derived if we’d had the ability to understand and react to these social media conversations and sentiments, in context and in real time. We can now analyze the vast river of public data that streams from Twitter in its unstructured complexity, and assign a sentiment to the commentary. In other words, the computer can now determine, with the certainty level of a non-native speaker, whether the tweet it just analyzed expressed a positive or negative sentiment and how strongly that sentiment was stated, all in real time. We can then apply this analysis to deliver business value: measuring the effectiveness of marketing activities, customer responses to services, products and promotions, the impact of advertising, or the reaction to real-world events. The list is limitless.
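To make the idea of a polarity plus a strength concrete, here is a deliberately simple sketch of lexicon-based scoring. It is a stand-in technique, not the IBM analytics software or the USC model, which this post does not describe in detail. The word weights, the negation list and the function name are hypothetical choices made for illustration.

```python
import re

# Hypothetical toy lexicon: word -> weight between -1 and 1.
LEXICON = {
    "love": 1.0,
    "great": 0.8,
    "good": 0.5,
    "bad": -0.5,
    "awful": -0.9,
    "hate": -1.0,
}
# Hypothetical negation words that flip the weight of the next word.
NEGATIONS = {"not", "never", "no"}


def score_tweet(text):
    """Return (label, strength) for one tweet using the toy lexicon."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = 0.0
    hits = 0
    for i, token in enumerate(tokens):
        if token in LEXICON:
            weight = LEXICON[token]
            # Flip polarity when the previous word is a negation ("not good").
            if i > 0 and tokens[i - 1] in NEGATIONS:
                weight = -weight
            total += weight
            hits += 1
    if hits == 0:
        return ("neutral", 0.0)
    average = total / hits
    if average > 0:
        label = "positive"
    elif average < 0:
        label = "negative"
    else:
        label = "neutral"
    # Strength is the size of the average weight, regardless of direction.
    return (label, abs(average))


print(score_tweet("I love this movie"))
print(score_tweet("Not good at all"))
print(score_tweet("See you at the game"))
```

A word list like this misses sarcasm, slang and context, which is part of why understanding tweets at scale is hard and why the certainty of any such system is limited.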
This new capability will eventually deliver solutions, founded on semantic analysis of Big Data, that are only just now being imagined. And it will happen faster than we expect. Stay tuned: there is more on the way.
Learn more about the work IBM and USC are doing on social media sentiment: http://www.ibm.com/press/us/en/pressrelease/36720.wss