Researchers aim to chart intellectual trends in Arxiv: Nature News & Comment

abernard102@gmail.com 2012-08-20

Summary:

“When physicist Paul Ginsparg goes to next week's American Physical Society meeting in Boston, Massachusetts, he plans to take with him a 64-gigabyte flash drive containing all 740,000 or so articles from Arxiv, the preprint repository he founded in 1991 that is managed by Cornell University in Ithaca, New York. He will pass the data on to researchers from the Cultural Observatory at Harvard University in Cambridge, Massachusetts. They want to break down the full text of the articles into component phrases to see how often a particular word or phrase appears relative to others — a measure of how 'meme-like' a term is. Their goals: to give Arxiv a new tool for identifying original source papers in physics, mathematics and computer science — and to enable historians to spot trends from the 20 years that the repository has existed... They have applied the new interface, which they call Bookworm, to about 1 million copyright-free books collected by the Open Library, adding an ability to screen for books by genre and place of publication. They have already tested the tool using one month of Arxiv data, but plan to add the full Arxiv data set in the coming weeks. Michel is excited not only because a new group of users will be testing the tool, but also because the knowledge embodied in Arxiv is different. “It might show different patterns than pop culture,” he says. One of Bookworm’s creators, Benjamin Schmidt, a graduate student in history at Princeton University... wants to mine Arxiv’s articles on quantitative finance to see if the adjectives surrounding the Black–Scholes equation — used to set prices for financial derivatives — changed before and after the 2008 market crash... Ginsparg says... ‘What you’re going to be registering are intellectual movements in the community.’ He suggests that science policy-makers could even use Bookworm to identify new fields that are in need of funding, or moribund fields that might require fewer grants... 
Schmidt and his colleagues won't be short of new data sets to mine next. Examples include not only general-interest sources such as newspapers, but also scientific ones such as PubMedCentral, an online repository containing some 2.3 million biomedical articles.”
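
Illustration:

The phrase-frequency measure described in the summary (how often a word or phrase appears relative to others, tracked over time) can be sketched roughly as follows. This is a minimal illustration only, not Bookworm's actual code. The function names, the simple lowercase tokenizer, and the (year, text) input format are all assumptions made for the example.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lowercase runs of letters and digits.
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(tokens, n):
    # All consecutive n-token phrases, joined with single spaces.
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def relative_frequency_by_year(docs, phrase):
    """docs: iterable of (year, text) pairs (hypothetical input format).

    Returns {year: share of that year's n-grams equal to `phrase`},
    where n is the number of tokens in `phrase`.
    """
    target_tokens = tokenize(phrase)
    n = len(target_tokens)
    target = " ".join(target_tokens)
    hits = Counter()
    totals = Counter()
    for year, text in docs:
        grams = ngrams(tokenize(text), n)
        totals[year] += len(grams)
        hits[year] += grams.count(target)
    # Skip years with no n-grams to avoid dividing by zero.
    return {y: hits[y] / totals[y] for y in sorted(totals) if totals[y]}


# Toy data standing in for article full text.
docs = [
    (1995, "string theory and string theory"),
    (2005, "loop gravity and string theory"),
]
result = relative_frequency_by_year(docs, "string theory")
```

Dividing by the year's total n-gram count, rather than reporting raw counts, keeps years with many articles comparable to years with few.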

Link:

http://www.nature.com/news/researchers-aim-to-chart-intellectual-trends-in-arxiv-1.10103

Updated:

08/16/2012, 06:08

From feeds:

Open Access Tracking Project (OATP) » abernard102@gmail.com

Tags:

oa.new oa.data oa.pubmed oa.policies oa.mining oa.comment oa.green oa.copyright oa.societies oa.events oa.google oa.physics oa.arxiv oa.funding oa.tools oa.history oa.harvard.u oa.preprints oa.open_library oa.cornell.u oa.aps oa.bookworm oa.repositories oa.versions oa.humanities oa.ssh

Authors:

abernard

Date tagged:

08/20/2012, 14:51

Date published:

02/25/2012, 12:29