Editor’s note: This article is by Christer Johnson, IBM Global Business Services’ Advanced Analytics Services Leader for North America. A member of Christer’s team, Ryan Hendricks, will participate in a panel, “Big Data in the Sports Industry,” at the IBM Research Colloquium “Box Office to Front Office: Winning with Big Data” on August 10, 2012. Watch the livestream beginning at 10:00 a.m. US Pacific Time.
One of the many things I’ve learned from more than 19 years of using analytics to solve challenging business problems is that the word analytics means different things to different people. So before diving into numbers, I define analytics by the objectives it is meant to achieve and the decisions it is meant to improve or accelerate. In that context, analytics falls into three categories: descriptive, predictive, and prescriptive.
Descriptive analytics, also referred to as business intelligence, provides a clear understanding of what has happened in the past, through visualization of key performance metrics or other data in a report or dashboard. Today, the past can be as recent as just a millisecond ago.
The sports world has long been a leader in the use of descriptive analytics to provide fans, coaches, and players with a wide range of statistical reports that help them understand what’s happening on the field – whether a coach wants to improve play, or fans want to win their fantasy league.
However, with descriptive analytics, fans and coaches alike must rely on their own intuition and ability to interpret the data in order to gain any insight into the relationship or correlation between data inputs and data outputs.
That’s where predictive analytics, the second category of analytics, comes into play.
In predictive analytics, the objective is to use advanced mathematical techniques on that past data to understand the underlying relationship between data inputs, outputs and outcomes. Effective predictive models let us quickly understand and estimate outcomes across a wide array of scenarios and conditions. Commonly used for forecasting, simulation, root cause analysis, and data mining, predictive modeling techniques provide insight into complex data that we can’t manually interpret from a report or interactive dashboard.
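As a minimal sketch of that idea, the Python snippet below fits a straight line relating one data input to one outcome, then uses the fitted line to estimate the outcome under a few scenarios. Everything in it is an assumption made for illustration: the numbers are invented toy data, and the names fit_line and predict_wins are hypothetical. Real predictive models typically use many inputs and more sophisticated techniques.

```python
# Toy data, invented for illustration only: one input (on-base percentage)
# and one outcome (season wins) for six hypothetical teams.
obp = [0.310, 0.320, 0.330, 0.340, 0.350, 0.360]
wins = [70, 76, 79, 86, 90, 97]


def fit_line(xs, ys):
    """Ordinary least squares with a single input; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of input and outcome, and variance of the input (both unscaled).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(obp, wins)


def predict_wins(on_base_pct):
    """Estimate the outcome for a scenario using the fitted line."""
    return slope * on_base_pct + intercept


# Estimate the outcome across several scenarios.
for scenario in (0.325, 0.345, 0.365):
    print(f"OBP {scenario:.3f} -> about {predict_wins(scenario):.0f} wins")
```

The decision maker still has to compare the scenarios and choose among them; the model only supplies the estimates.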
Billy Beane of the Oakland A’s famously used predictive modeling techniques to uncover new data inputs that were highly correlated with the outcome of winning baseball games. In tennis, IBM recently began using predictive analytics to automatically sift through a multitude of factors from seven years of data about every point played in the Grand Slam tournaments – all to estimate the top three keys to each player’s match.
Predictive analytics still requires manual evaluation of the various scenarios and the predictive results of each scenario, in order to make a decision. This works well when a decision involves just a few options and the decision maker has time to interpret the predictive results from the various scenarios (for example, a coach using past game statistics to plan for the next game).
It does not work well, however, when a decision maker is faced with thousands or millions of options. Nor does it work well when a decision is needed just seconds after key data inputs are received. This is where prescriptive analytics comes into play.
This third category of analytics, prescriptive analytics, uses mathematical optimization to take into account a multitude of data inputs and constraints related to an objective. The formulas sift through potentially millions of possible decisions to prescribe the actions that will maximize the user’s objectives.
Major League Baseball now uses a complex collection of optimization models to create its schedule each year. Some of the most common uses of optimization outside of sports include pricing optimization for airlines, hotels, and retail chains; transportation planning and scheduling for distribution companies; and decisions about how to allocate marketing dollars across channels and product categories.
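To make the idea of prescribing a decision concrete, here is a small Python sketch of the marketing allocation case. It assumes a made-up objective (diminishing returns per channel), made-up response numbers, and a made-up cap per channel, and it simply enumerates every possible allocation and keeps the best feasible one. The channel names and the functions expected_return and best_allocation are hypothetical. Problems with millions of options are normally handed to dedicated optimization solvers rather than enumerated like this.

```python
from itertools import product
from math import sqrt

# Hypothetical setup: 10 budget units to split across three channels.
BUDGET_UNITS = 10
# Assumed response strength per channel (made-up numbers).
RESPONSE = {"tv": 5.0, "online": 4.0, "email": 2.0}
# Assumed constraint: no channel may receive more than 6 units.
MAX_PER_CHANNEL = 6


def expected_return(allocation):
    """Objective: diminishing returns, modeled here as strength * sqrt(units)."""
    return sum(RESPONSE[ch] * sqrt(units) for ch, units in allocation.items())


def best_allocation():
    """Check every candidate decision; keep the feasible one with the highest objective."""
    channels = list(RESPONSE)
    best, best_value = None, float("-inf")
    for units in product(range(MAX_PER_CHANNEL + 1), repeat=len(channels)):
        if sum(units) != BUDGET_UNITS:
            continue  # infeasible: the whole budget must be spent, no more and no less
        candidate = dict(zip(channels, units))
        value = expected_return(candidate)
        if value > best_value:
            best, best_value = candidate, value
    return best, best_value


plan, value = best_allocation()
print(plan, round(value, 2))
```

Unlike a predictive model, the output here is the recommended action itself, not a set of estimates for someone to weigh.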
Analyzing Big Data
Today, even small companies are armed with the software and hardware platforms that can efficiently and effectively perform these three types of analytics on enormous volumes of data – Big Data.
Big Data is defined by the four Vs: Volume (terabytes, petabytes, or more), Velocity (streaming data), Variety (structured variables in a database, versus unstructured text, voice, or video), and Veracity (the degree to which data is accurate and can be trusted).
With the explosion of unstructured data on social media, companies are rushing to analyze this type of Big Data to better understand customers’ views, preferences, and behaviors. As exemplified during the Olympics, there are few industries that generate more excitement, discussion, and ultimately data than sports.
The key for sports franchises, as with any company needing to make the most of Big Data, is to start with the question to be answered and the decision to be made. Once the question and decision are clear, you have a much higher chance of collecting the right data; using the most appropriate analytical techniques; and producing insight that you can turn into value for your customers, your company, and yourself.
Watch “Winning with Big Data”
On August 10, my colleague Ryan Hendricks will join a panel as part of “Box Office to Front Office: Winning with Big Data.” The other panelists are Dave Kaval, club president of the San Jose Earthquakes; Mike Zoglio, vice president of marketing at Electronic Arts Sports; and Rory Brown, director of content operations and analytics at Bleacher Report.