The digital world has graduated from what scientists call “very large” databases to “extremely large” ones, which gather many terabytes of data per day (a terabyte is 1,000 gigabytes). It’s the era of Big Data, and the situation will only get more extreme with the passage of time. Market researcher IDC predicts that between 2009 and 2020 digital data will grow 44-fold to 35 zettabytes (a zettabyte is 10 to the 21st power bytes), an amount that defies easy comparisons. This ocean of data is potentially very valuable to its owners if it’s managed and analyzed properly. But that’s a big if. IBM Fellow Laura Haas calls this situation “the full employment project” for information management experts like her.
The challenge of getting maximum value from Big Data is the subject of an IBM Research Colloquium, Planet Scale Analytics, which is being held Sept. 20 at IBM Research – Almaden in San Jose, Calif. The colloquium is part of an IBM Centennial program designed to convene thought leaders, including leading researchers and scientists, academics, industry leaders, public policy makers and key IBM clients, for a series of talks and panel discussions on transformational technologies and their potential impact on the world.
The biggest problem with Big Data isn’t the sheer volume of data. It’s the diversity. Data comes in different types, including numbers, text, images and videos. It’s stored in different formats, using different data standards. Within each type of database, companies organize their information differently in forms and tables. Finally, different kinds of data are isolated in separate storehouses of information. A bank, for instance, might store information about a single customer in six or eight databases, one for each service the bank provides for the customer. So before data can be analyzed, it has to be managed.
But there’s a second major challenge for Big Data handlers as well. In the Smarter Planet view, the world is a complex system of systems, each interdependent with the others. So business and government leaders and scientists must view data holistically, Haas says. To analyze flooding danger and impact in a particular location, for instance, experts have to take into account data about water flows, weather, land topography, transportation and utility infrastructure, and human activity patterns. (An emergency response system that IBM built for Rio de Janeiro maps the topography of the city down to the square meter. Now, that’s a ton of data!) This wide variety of data has to be integrated, often in real time.
Haas believes that the best way to make progress in data management and integration is to bring together data and domain experts from academia and industry so they can co-develop methods for solving real-world Big Data problems. Watch this blog for more news on this topic.