HathiTrust Research Center Releases Massive Dataset of Features ... | HathiTrust Digital Library

ab1630's bookmarks 2015-05-13

Summary:

"The HathiTrust Research Center is pleased to announce the release of the Extracted Features Dataset (v.0.2), a dataset derived from 4.8 million public domain volumes, totaling over 1.8 billion pages currently available in the HathiTrust Digital Library collection. The dataset includes over 734 billion words, dozens of languages, and spans multiple centuries. Features are informative, quantified characteristics of a text, and include ..."

Link:

http://www.hathitrust.org/htrc-releases-massive-dataset

From feeds:

Open Access Tracking Project (OATP) » abernard102@gmail.com

Tags:

oa.new oa.hathi oa.books oa.pd oa.data oa.metadata oa.licensing oa.libre oa.announcements oa.copyright oa.creative_commons

Date tagged:

05/13/2015, 08:07

Date published:

05/13/2015, 04:06