By Dr. John E. Kelly III
The microprocessor was one of the most important inventions of the 20th century. Those chips of silicon and copper have come to play such a vital role that they’re frequently referred to as the “brains” of the computer. Today’s computer designs put the processor at the center.
But the needs of businesses and society are changing rapidly, so the computer industry must respond with a new approach to computer design—which we at IBM call data-centric computing. In the future, much of the processing will move to where the data resides, whether that’s within a single computer, in a network or out on the cloud. Microprocessors will still be vitally important, but their work will be divided up.
This shift is necessary because of the explosion of big data. Every day, society generates an estimated 2.5 billion gigabytes (2.5 exabytes) of data—everything from corporate ledgers to individual health records to personal tweets.
Because of the fundamental architecture of computing, data has to be moved repeatedly from where it’s stored to the microprocessor. That consumes a lot of time and energy. And now, with the emergence of the big data phenomenon, it’s no longer sustainable. That’s why we need to turn computing inside out—moving processing to the data.
Over time, the shift will have huge consequences for everybody—from the managers of high-end data centers to kids playing games on their smartphones.
An early step toward data-centric computing came today, when the United States’ Oak Ridge National Laboratory and Lawrence Livermore National Laboratory announced they’re investing $325 million to purchase two supercomputing systems based in part on this new approach. IBM is developing Oak Ridge’s Summit and Lawrence Livermore’s Sierra systems around our POWER microprocessors, in collaboration with technology partners NVIDIA and Mellanox.
Rather than relying solely on the central processing unit, or CPU, to do their data crunching, these systems also use specialized graphics processing units, or GPUs, from NVIDIA to handle some of the data processing tasks. The GPUs and POWER CPUs are tightly coupled to memory chips where often-used data is stored, and the CPU and GPU elements talk to one another via Mellanox’s interconnect.
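As a conceptual illustration only, the division of labor in such a heterogeneous system might be sketched as follows. The task names and the `parallel` flag are hypothetical; real systems like Summit and Sierra use far more sophisticated runtimes and schedulers.

```python
# Conceptual sketch only: how a heterogeneous system might divide
# work between CPUs and GPU accelerators. The task list and the
# `parallel` flag are hypothetical illustrations, not the actual
# scheduling logic of Summit or Sierra.

def dispatch(tasks):
    """Route data-parallel tasks to the GPU work list and
    control-heavy serial tasks to the CPU work list."""
    cpu_work, gpu_work = [], []
    for task in tasks:
        if task["parallel"]:
            gpu_work.append(task["name"])
        else:
            cpu_work.append(task["name"])
    return cpu_work, gpu_work

tasks = [
    {"name": "parse-input",     "parallel": False},
    {"name": "matrix-multiply", "parallel": True},
    {"name": "reduce-results",  "parallel": True},
    {"name": "write-report",    "parallel": False},
]

cpu, gpu = dispatch(tasks)
print("CPU work:", cpu)  # control-heavy serial steps
print("GPU work:", gpu)  # data-parallel kernels
```

The design point the sketch captures is the one in the paragraph above: each kind of processor gets the work it is best suited for, with data kept close to whichever element is operating on it.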
When Summit and Sierra are delivered starting in 2017, they are expected to achieve five to ten times the processing performance of current supercomputers. But raw computation is only part of the story. Just as important, a series of system and software innovations will enable the computers to efficiently handle a wider array of analytics and big data applications.
IBM’s Blue Gene supercomputers made great leaps forward in energy efficiency. Summit and Sierra represent great advances in data efficiency along with significant improvements in energy efficiency.
That’s important because the national laboratories offer researchers from academia, government, and industry access to time on their computers to address grand challenges in science and engineering. Traditionally, the labs’ computers have been optimized to handle hardcore scientific problem solving, using techniques such as modeling and simulation. But, increasingly, researchers are seeking help with projects in diverse domains such as healthcare, genomics, economics, financial systems, social behavior and visualization of large and complex data sets. They need systems that help them manage and sort data, not just run algorithms.
Here are some examples of the new capabilities enabled by these systems:
Healthcare: Pharmaceutical researchers will be able to better simulate the interactions of molecules to identify patterns that will help their companies develop drugs to target specific cells.
Energy: Engineers will be able to increase the use of wind energy by designing efficient wind turbines that can withstand the elements in geographic regions known for inclement weather.
Air Travel: Metallurgists and mechanical engineers will be able to design better jet engines to withstand heat and other stresses, for faster, more efficient air travel.
Big data and analytics applications benefit from increased flexibility in the way computer systems are designed. In traditional computing, much of the innovation happens on and around the CPU. In a sense, it’s a one-size-fits-all world. As the big data phenomenon grows, we believe, innovation will increasingly take place throughout the computer system. Moving processing closer to the data will be one locus of innovation, and there will be others.
That’s why IBM has opened up its POWER processor architecture for others to use and build upon—through the OpenPOWER Foundation. Opening POWER to outside innovation and collaboration makes it easier for makers of server computers and companies like Google to design computers that fit their needs like a custom-tailored suit. Another plus: the member companies of the Foundation innovate continuously to create new capabilities and address new challenges. They’re not limited by the product upgrade cycles of a single microprocessor supplier.
Revolutions typically start imperceptibly. A shift takes place out of plain sight, but it sets in motion a series of actions and reactions. Eventually, a pattern begins to emerge. Then it gains momentum. So it will be with data-centric computing.
IBM’s aha! moment came in 2011, when David Turek, IBM’s vice president for exascale computing, asked a roomful of IBM technologists a seemingly dumb question: “How much does it cost to move a single bit of data from point of origin to point of computation?” People in the computer industry hadn’t been asking that question. It took a while for the team to dig up answers, but when they did, it became readily apparent that the industry had to begin to address the costs in time, money and energy of moving large volumes of data within computing systems and networks. Out of that revelation came our data-centric design initiative.
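A back-of-envelope calculation shows why the answer to that question surprised people. The per-byte and per-operation energy figures below are illustrative assumptions chosen for the sketch, not measured values; the only point they encode is the widely observed imbalance that fetching data from off-chip memory costs far more energy than computing on data already at hand.

```python
# Illustrative (assumed, not measured) energy costs: moving a byte
# from off-chip memory is taken to be 20x the cost of one
# floating-point operation on data already in registers.
PJ_PER_BYTE_MOVED = 20.0   # picojoules to fetch one byte from memory
PJ_PER_FLOP = 1.0          # picojoules per floating-point operation

def energy_joules(bytes_moved, flops):
    """Total energy in joules for a workload that fetches
    `bytes_moved` bytes and performs `flops` operations."""
    picojoules = bytes_moved * PJ_PER_BYTE_MOVED + flops * PJ_PER_FLOP
    return picojoules * 1e-12

# A streaming workload: 1 TB of data, one operation per byte.
movement_j = energy_joules(1e12, 0)   # energy spent moving data: 20 J
compute_j = energy_joules(0, 1e12)    # energy spent computing: 1 J
print(f"movement: {movement_j:.0f} J, compute: {compute_j:.0f} J")
```

Under these assumed numbers, roughly 95 percent of the energy budget goes to moving data rather than computing on it—exactly the imbalance that a data-centric design attacks by bringing the processing to the data.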
Today, the concept is beginning to find its way into mainstream computer design. For instance, IBM recently introduced a product called the Elastic Storage Server, which tightly packages servers and disk drives in a single appliance-like device.
The changes will come in technical computing first, but, eventually, data-centric computing will become pervasive. Social networking websites gather and move vast amounts of data. That’s a job for data-centric design. The same is true of e-commerce and of organizational functions such as marketing, financial management and product development.
Eventually, the new approach will transform the computers on your desk and in your hand. Because of the new design paradigm, they’ll work faster and smarter. They’ll handle more data. And they’ll use less power. We’re at the front end of one of the most significant shifts in the history of computing.
To learn more about the new era of computing, read Smart Machines: IBM’s Watson and the Era of Cognitive Computing.