Repositories: Not Just About Publications Any More
“Not that repositories ever really were only about published scholarly output, but for some organizations that was the easiest first bar to reach. At Open Repositories 2012, though, it was clear that the bar has been raised. OR2012, held at the University of Edinburgh from July 9-13, 2012, had over 480 registered attendees from over 40 countries. The entire conference was live-blogged, which provided some remarkable coverage. There is also a tweet archive for the tag #or2012. Oh, and there is a Flickr pool.
But if there was one word that was woven into almost every presentation, it was this: DATA. Which made me very happy... Library, archive, and museum collections are now being mined as data by researchers, which requires new management strategies and new self-serve services. Consequently, cultural institutions all have big data and need different IT infrastructures for processing and serving these collections. I said as much in my presentation at OR2012 (check out the live blog post from the session I spoke in).
Just about everyone was discussing RDM, or Research Data Management. It has become clear that institutional repositories must manage not only scholarly publications but also the data that was created through observation and experimentation, or collected and published, in order to support the ‘re-’ activities: review, reuse, replicability and reproducibility. RDM platforms are needed to help researchers capture, share, and publish their datasets. The public-facing discovery infrastructure is but a small part of this effort: the greater need is in capturing data from the original instruments and formats, and in transferring and documenting datasets reliably enough to support a forensic level of authenticity for future researchers. The Digital Curation Centre has a great blog post reviewing some of the sessions on this topic.
Another word that was everywhere was ‘identifiers.’ Disambiguation of researchers/authors has been a known issue since the earliest institutional repositories, where one publication might be by ‘Leslie Johnston,’ another by ‘L. Johnston,’ and another by ‘Leslie L. Johnston,’ depending upon the publication’s stylebook. Is that the same person to someone searching for all my publications? ORCID is the more mature service for the assignment and resolution of identifiers, but the status of ISNI, aka ISO 27729, was also presented. There will likely never be a single unique identifier, as there are these two international services as well as national and institutional services. The catch will be crosswalking between all the identifiers. The same can be said for article or item identifiers, such as the DOI, which has a high level of buy-in in the publishing realm but uneven adoption for other types of objects.
Linked Open Data was, unsurprisingly, a topic of discussion. The opening plenary by Cameron Neylon from the Public Library of Science very much emphasized this point: ‘It’s about links; it’s about connectedness.’ And it’s not just linking between objects and repositories, but synchronization between them. Some very interesting early work was presented on the Webtracks and ResourceSync projects.
A number of NDIIPP partners presented their projects at Open Repositories, including DuraCloud and Chronopolis.”