New U.S. Research Will Aim at Flood of Digital Data - 2012-08-20


“The federal government is beginning a major research initiative in big data computing. The effort, which will be announced on Thursday, involves several government agencies and departments, and commitments for the programs total $200 million. Administration officials compare the initiative to past government research support for high-speed networking and supercomputing centers, which have had an impact in areas like climate science and Web browsing software. ‘This is that level of importance,’ said Tom Kalil, deputy director of the White House Office of Science and Technology Policy... Big data refers to the rising flood of digital data from many sources, including the Web, biological and industrial sensors, video, e-mail and social network communications. The emerging opportunity arises from combining these diverse data sources with improving computing tools to pinpoint profit-making opportunities, make scientific discoveries and predict crime waves, for example. ‘Data, in my view, is a transformative new currency for science, engineering, education, commerce and government,’ said Farnam Jahanian, head of the National Science Foundation’s computer and information science and engineering directorate. ‘Foundational research in data management and data analytics promise breakthrough discoveries and innovations across all disciplines.’ On Thursday, the National Science Foundation will announce a joint program with the National Institutes of Health seeking new techniques and technologies for data management, data analysis and machine learning, which is a branch of artificial intelligence. Other departments and agencies that will be announcing big data programs at a gathering on Thursday at the American Association for the Advancement of Science in Washington include the United States Geological Survey, the Defense Department, the Defense Advanced Research Projects Agency and the Energy Department. 
These initiatives will mostly be seeking the best ideas from university and corporate researchers for collaborative projects. The private sector is the leader in many applications of big data computing. Internet powers like Google and Facebook are masters at instantaneously mining Web data, click streams, search queries and messages to finely target users for online advertisements. Many major software companies, including I.B.M., Microsoft, Oracle, SAP and SAS Institute, and a growing band of start-ups, are focused on the opportunity in big data computing. Still, there is an important complementary role for the government to play where the incentives for private investment are lacking, according to administration officials and computer scientists. Such areas, they say, include scientific discovery in fields like astronomy and physics, research into policy issues like privacy, and funding for research at universities, where the high-technology work force of the future is educated. At the session on Thursday, there will be presentations by scientists who are experts in big data computing. Astronomy is a pioneering discipline for the approach. The Sloan Digital Sky Survey has used digital sensors to scan distant galaxies from an optical telescope in New Mexico, collecting vast amounts of image data that are processed with powerful computers. The resulting three-dimensional mapping has yielded a ‘visual representation of the evolution of the universe,’ said Alexander Szalay, a professor at Johns Hopkins University. He calls the digital sky program a ‘cosmic genome project.’ At Stanford University, an intriguing big-data experiment in online education is under way. Last year, three computer science courses, including videos and assignments, were put online. Hundreds of thousands of students have registered and participated in the courses. 
The courses generate huge amounts of data on how students learn, what teaching strategies work best and what models do not, said Daphne Koller, a professor at the Stanford Artificial Intelligence Laboratory. In most education research, teaching methods are tested in small groups, comparing results in different classrooms, Ms. Koller explained. With small sample groups, research conclusions tend to be uncertain, she said, and results are often not available until tests at the end of school semesters. But in an online class of 20,000 students, whose every mouse click is tracked in real time, the research can be more definitive and more immediate, Ms. Koller said. ‘If 5,000 people had the same wrong answer, it’s obvious a concept is not getting through, and you have a clear path that shows where students went wrong,’ she said. That kind of data tracking in education, she said, provides ‘an opportunity no one has exploited yet.’”



08/16/2012, 06:08

From feeds:

Open Access Tracking Project (OATP) »

Tags: oa.policies oa.mining oa.comment oa.government oa.ssh oa.usa oa.nih oa.universities oa.physics oa.oer oa.social_media oa.funding oa.courseware oa.astronomy oa.usgs oa.privacy oa.ostp oa.stanford.u oa.doe oa.nsf oa.facebook oa.sloan oa.dod oa.stem oa.rdm oa.hei



Date tagged:

08/20/2012, 18:39

Date published:

03/30/2012, 20:08