Oh, the humanity

Connotea Imports 2012-07-31

Summary:

"This Google Books data set, which is available for download along with the Google Books Ngram Viewer, is a free quantitative tool made available to supplement humanities research worldwide. It is based on the full text of about 5.2 million books, with more than 500 billion words in total. About 72 percent of its text is in English, with smaller amounts in French, Spanish, German, Chinese, and Russian. It is the largest data release in the history of the humanities, the authors note, a sequence of letters 1,000 times longer than the human genome. If written in a straight line, it would reach to the moon and back 10 times over...."

Link:

http://news.harvard.edu/gazette/story/2010/12/cultural-genome/

From feeds:

Open Access Tracking Project (OATP) » Connotea Imports

Tags:

oa.new oa.data oa.mining oa.humanities oa.google.books oa.ssh

Authors:

petersuber

Date tagged:

07/31/2012, 15:19

Date published:

12/16/2010, 16:02