Chemistry’s web of data expands: Nature News & Comment

abernard102@gmail.com 2012-08-20

Summary:

“Other areas of science feast on free online data, but chemistry has been late to the party. Now it is catching up. In the latest effort to provide free access to chemical information, the London-based company SureChem (owned by Digital Science, a sister company to Nature Publishing Group) said this week that it would release data on 10 million molecules patented by the pharmaceutical industry since 1976. Harvested automatically from some 20 million patents, the data could lower barriers to drug discovery by academic researchers. The announcement, made on 26 March at the spring meeting of the American Chemical Society (ACS) in San Diego, California, follows a similar move by computing giant IBM last December. IBM deposited computer-harvested data on about 2.4 million small molecules into PubChem, the world’s largest free chemistry repository, which is run by the US National Library of Medicine in Bethesda, Maryland... But Michael Walters, a chemist working in academic drug discovery at the University of Minnesota in Minneapolis, thinks that the initiatives could mark ‘a sea change in the way in which patent data are accessed and analysed’. The data should make it easier for chemists to see which bioactive molecules have drawn the attention of the drug industry — and to explore new drug targets by designing compounds that are not named in patents... Academic drug discovery will get another boost in September, when a consortium of eight pharmaceutical firms, three biotechnology companies and a number of leading informaticians releases its own free, online drug-discovery platform, the Open Pharmacological Concepts Triple Store (OpenPHACTS). Supported in part by a €10-million (US$13-million) grant from the European Union’s Innovative Medicines Initiative, the website will link data on small molecules and their biological effects, to provide a library of compounds that anyone can download and explore... 
Until a few years ago, the market in chemical information was monopolized by the ACS Chemical Abstracts Service, a manually curated registry that now holds more than 65 million structures, charges individual users thousands of dollars a year for access and does not allow large downloads or repurposing of its information. Its SciFinder service offers tools to make sense of the data. Similar analytical services are sold by firms such as IBM, Thomson Reuters and Elsevier in Amsterdam, which offers the Reaxys tool... But in 2004, the US National Institutes of Health (NIH) created PubChem, into which anyone can deposit data on structures and their biological activity... The database has now grown to more than 32 million structures and, according to PubChem, has roughly 100,000 unique users per day. In 2007, another free repository, ChemSpider, was created by chemist Antony Williams; in 2009, it was purchased by the UK Royal Society of Chemistry in London and it now holds 27 million structures. These two databases are now the Internet’s main chemistry hubs, linking out to other sources of free online information, such as ChEMBL, a database of about 1 million bioactive drug-like small molecules hosted by the European Bioinformatics Institute... The result is a web of interconnected free data, contrasting with high-quality but closed-off subscription databases... free online data can be poorly curated — and chemical data is no exception. In a project presented at the ACS meeting in San Diego, Williams and his colleagues showed how five large online databases disagreed on the structures of 150 top-selling drugs: the best got 99% of structures correct, whereas the worst managed only 76%. In fact, notes Williams, Wikipedia proved the most reliable source of structural information in that experiment — mostly because of an effort to clean up the site’s 13,000 pages about chemicals. 
Williams says that more chemists need to concentrate on data standards and start actively correcting information online...”

Link:

http://www.nature.com/news/chemistry-s-web-of-data-expands-1.10328

Updated:

08/16/2012, 06:08

From feeds:

Open Access Tracking Project (OATP) » abernard102@gmail.com

Tags:

oa.new oa.data oa.npg oa.mining oa.comment oa.usa oa.universities oa.elsevier oa.societies oa.libraries oa.harvesting oa.europe oa.uk oa.funding oa.chemistry oa.acs oa.prices oa.patents oa.pharma oa.consortia oa.chemspider oa.nlm oa.thomson_reuters oa.chembl oa.pubchem oa.ibm oa.openphacts oa.surechem oa.hei oa.announcements

Authors:

abernard

Date tagged:

08/20/2012, 18:35

Date published:

04/02/2012, 21:17