Suddenly, Big Data is all the rage. Organizations are deluged with huge amounts of information from a combination of their own computing systems, networks of sensors (the Internet of Things) and social networking sites. You can be overwhelmed by data or do something useful with it, and many organizations are harvesting valuable insights from theirs. Today, IBM is hosting a discussion in Washington, D.C., titled Big Data: The New Natural Resource, to explore the Big Data opportunities for government. Please check in from 11:30 a.m. to 1 p.m. Eastern Daylight Time. Follow @IBMPolicy on Twitter, and tweet to #BigData and #IBMPolicy.
11:45 a.m. Opening Remarks from Dianne Feinstein, U.S. Senator from California
Lawrence Livermore National Laboratory’s new Sequoia computer was recently named the world’s most powerful computer. High performance computing is important for safeguarding our nuclear stockpile, but it’s also key to national competitiveness. That’s why I have taken the lead in assuring funding for our high-performance computing initiatives.
Now, we have another program at Lawrence Livermore, the High Performance Computing Innovation Center. The center provides computing power and expertise to help businesses make use of high performance computing.
I have a strong belief that maintaining America’s lead in high performance computing and analytics has bipartisan support in Congress.
11:55 a.m. Big Data: The New Natural Resource
David McQueeney, vice president for software in IBM Research, talks about the opportunity for big data in government.
Data is exploding, coming from sensors, corporate and government systems, and social networking. But much of today’s data isn’t being captured and put into the hands of decision makers.
If we harvest that data, we can take instinct and intuition and combine them with analytics to help make better decisions. And we can make decisions more quickly, down to minutes and even seconds.
We can do predictive analytics and continuous optimization of systems.
Our government has some of the largest and most important data sets in the world. A lot of the magic of the Innovation Center at Lawrence Livermore is taking their expertise in high performance computing and helping companies apply it to their business problems and opportunities.
The volume of data is exploding. The speed of data is increasing. We need a new computing paradigm that extracts insights on the fly. We also have to deal with a wide variety of data–including unstructured sources of data. We’re coming up with new techniques for handling that.
There’s also an issue with veracity of data. Big data sources are full of noise. We have to figure out what’s important, and which pieces of information are correct, and which are not.
We see big data and high performance computing converging. We can use HPC to manage and make the most of big data.
Rather than moving all of the data to processing, we move some of the processing nearer to the data. The data becomes the center of interest. It’s a different sort of computing.
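As a rough illustration of moving processing nearer to the data, the sketch below has each storage node compute a small summary locally, so that only a count and a sum cross the network rather than every raw reading. The `DataNode` class, its `local_summary` method and the sample readings are hypothetical names invented for this example; they are not taken from the talk or from any IBM system.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DataNode:
    """Hypothetical storage node that holds its own slice of the readings."""
    name: str
    readings: List[float]

    def local_summary(self) -> Tuple[int, float]:
        # Runs where the data lives; only (count, sum) leaves the node.
        return len(self.readings), sum(self.readings)


def global_mean(nodes: List[DataNode]) -> float:
    # The coordinator combines small per-node summaries instead of
    # pulling every raw reading to a central processor.
    count = 0
    total = 0.0
    for node in nodes:
        n, s = node.local_summary()
        count += n
        total += s
    if count == 0:
        raise ValueError("no readings on any node")
    return total / count


# Illustrative data only.
nodes = [
    DataNode("a", [1.0, 2.0, 3.0]),
    DataNode("b", [4.0, 5.0]),
]
print(global_mean(nodes))
```

The same pattern applies to any statistic that can be assembled from per-node partial results; statistics that cannot be split this way would still need some raw data to move.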
We’ve done a lot of work on cities, on public safety, transportation and disaster preparedness. But there are a lot of new ways to use computational power on large volumes of data.
Can we predict infection in premature newborns 24 hours earlier than was previously possible?
Why are we talking about this now? We have sensors, social networks, and transaction records. We have new and more powerful analytics tools. So we can take on challenges in ways we could only imagine before.
We’re working with government and other industry players to push forward our common agendas. A key element of that is using computing resources to boost economic development.
12:15 p.m. Panel Discussion: How Analytics Technology Creates an Enormous Opportunity for Government
Moderated by: Dr. Steven E. Koonin – Director of the Center for Urban Science and Progress (CUSP), New York University (Former Undersecretary for Science, U.S. Department of Energy)
Tony Elder, Deputy Police Chief, Charleston, South Carolina: One of the keys is taking all of the data that’s out there and getting it out to the officer on the street in an automated fashion. Officers need to get the data so they can take immediate action.
Today, we can start to predict where and when crimes might occur. There are implications from weather and other factors.
We bring together data from disparate databases. This is the future of policing.
Shantenu Jha, Associate Director of Research Cyberinfrastructure, Rutgers Discovery Informatics Institute:
We’re able to use high performance computing to make discoveries about disease. It’s key to share and disseminate information between different groups.
Koonin: A lot of the power of big data is in bringing together diverse data sets. Is there a universality there? Is there a regular formatting of data?
Elder: In Charleston, we went to consolidated dispatch. We’re moving toward consolidated records management in our county. But ultimately we have to move to states and beyond, because criminals don’t limit their activities to one county. We have to get everybody working together so we can be more successful.
Koonin: Are there cultures and turf issues?
Elder: That’s an issue. We have to get everybody together to talk about the challenges they face. There’s a need-to-know issue. But you have to balance that with giving enough people the information they need to do their jobs. So you have to get people together to talk through this. You have to find a balance.
Steven Ashby, Deputy Director for Science and Technology, Pacific Northwest National Laboratory:
In computer science, there’s a lot of focus on people’s own solutions. We have to work on coming up with ways of sharing data so we can exploit it for the public good. The government can also play a role in promoting standards that enable the sharing of data.
Jha: Given the quantities of data we have now, business as usual is no longer possible. Earlier approaches are just not scalable.
The real advantage comes when you link data and make it possible to compute on the data.
Koonin: All the big data applications involve people. They’re operated by people and for people. How do we do a better job of dealing with people?
Jha: It’s fundamental, critical and it has never been so important. Training is vital. We’re starting courses where data analytics is the key focus. We have to teach big data and analytics. We have to take into account the science, engineering and social contexts. You can’t be scientists sitting in a silo.
Koonin: We have to bring in social sciences. It’s not easy.
Ashby: In the national labs, we’re multidisciplinary, but we’re a bunch of scientists focusing on a few domains. We have to reach out into other areas of science and into industry. We want to make this new knowledge actionable. So we need to bring in the people who are working directly on solving the problems.
Koonin: At CUSP, we think the new data and the technology will change the social sciences as well.
Elder: At the heart of what we do is social science. In the past, most of the roles in the department were police officers. Now we’re hiring analysts and emergency management experts. They work with the data and hand it to the applied social scientists in the street, the police officers.
Koonin: For government, big data often is open data. Government activities will become more transparent for citizens and non-profits. I suggest that IBM make a version of Watson for government, so people can ask questions about government and get answers. So, how does this change the game for government?
Jha: I spent some time in Britain. The expenses of Parliamentarians were made public. It forced a fundamental change in the way government supports legislators and officials.
It’s important to make the open data easy for citizens to get their hands on.
Koonin: Tony, do you have crime heat maps in Charleston?
Elder: We do put up heat maps of where crimes have occurred. They tell you what has happened and who has been arrested. We’re very transparent in what we do. We also put up reports with our crime statistics.
Ashby: The big issue will be that more data will bring more transparency, which we’re all in favor of.
We also have to be able to exploit the data for the public good.
Let’s look at the power grid. It’s owned by 3,200 power providers around the country. We’re creating a network of sensors that will tell us what’s going on with the power grid right now. It’s 1,000 times more data. What will we do with it? Look at the blackouts. We typically analyze the data after the fact. Now we’ll take 1,000 times more data and process it 1,000 times faster. We can then use analytics in near real time so people with the information can address instabilities and prevent blackouts.
Koonin: People will try to twist data and misrepresent–for their own reasons. What do you do about that?
Jha: As data becomes a first-class entity in our society, we’ll be able to make sure it’s credible, so you can evaluate where it comes from.
Q: What can government do to get the most out of big data?
Steven Ashby, Pacific Northwest National Lab: We need to do fundamental research that helps us exploit the data, not just gather it. We need multidisciplinary teams to get insights. Government should help organize these teams and target them at specific domains.
David McQueeney, IBM: The HPC Innovation Center at Lawrence Livermore Lab is an example of what can be done to make high performance computing available to industry.
Ashby: In some cases we have barriers to cooperation between government, industry and academia. Those barriers should be removed.
Tony Elder, Charleston PD: One important element is keeping deep databases of past events. That’s a trove of information that will remain useful for many years.
McQueeney: In the world of big data, you want to remember the answers and the questions you asked. The answers might change over time, so you have to keep asking the questions. This can yield insights that might never have come up just because of the time sequence.
Q: How can government mine personal data without violating privacy?
McQueeney: You start out with clear principles and laws. You build into the systems an understanding of combining different kinds of data. You implement those policies within the computing systems.
Steven Koonin, CUSP: You can set it up so you see patterns in human behavior without identifying the individuals involved.
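One common way to see patterns without identifying individuals is to report only aggregate counts and to suppress any group too small to hide a person. The sketch below is an illustration under assumptions, not a description of any system discussed on the panel: the record layout, the `pattern_counts` function and the `min_group` threshold are all hypothetical. Dropping identifiers and suppressing small groups reduces re-identification risk but does not eliminate it.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Hypothetical record layout: (person_id, district, hour_of_day)
Record = Tuple[str, str, int]


def pattern_counts(
    records: Iterable[Record], min_group: int = 3
) -> Dict[Tuple[str, int], int]:
    # De-duplicate so each person counts once per (district, hour) group.
    seen = set(records)
    # Count distinct people per group; person_id is discarded here.
    counts = Counter((district, hour) for _, district, hour in seen)
    # Suppress groups smaller than min_group so that no published
    # count describes only one or two people.
    return {group: n for group, n in counts.items() if n >= min_group}


# Illustrative data only.
records = [
    ("p1", "north", 8),
    ("p2", "north", 8),
    ("p3", "north", 8),
    ("p1", "north", 8),   # duplicate sighting of the same person
    ("p4", "south", 22),  # a group of one, which is suppressed
]
print(pattern_counts(records))
```

The threshold is a policy decision rather than a technical one, which fits the panel's point that principles and laws come first and are then implemented in the computing systems.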
Elder: You have to respect everybody’s privacy and protect the public at the same time. When you put camera systems in a city, you have to think about how they will be used. You have to work with the community to set the usage and accountability standards. And you have to be clear there will be consequences if privacy is abused. Before we set up cameras, we met with the community to talk it over.
Gregory Mullen, Charleston’s police chief, writes about using technology to reduce crime.
Here’s an IBM Smarter Planet TV Ad on stopping crime:
Professor Manish Parashar writes about RDI2 at Rutgers University, an institute focused on using high-performance computing and big data in science, finance and other industries.
Here’s Science and Technology for the Nation, a publication of Pacific Northwest National Laboratory.
Here’s a video about the Lawrence Livermore National Lab’s Sequoia supercomputer and the quest for exascale computing:
Here’s a video about the High Performance Computing Innovation Center at Lawrence Livermore: