Some limitations of DOAJ metadata for research purposes | Sustaining the Knowledge Commons / Soutenir les savoirs communs

heather.morrison's.bookmarks 2021-02-10


by Xuan Zhaon, Luan Borges & Heather Morrison

This post explains some limitations of DOAJ metadata that researchers need to be aware of. In brief, DOAJ metadata must be opened with Unicode encoding to retain non-English characters. Some values appear in the wrong column, so clean-up is needed to avoid errors in data analysis. Metadata may also be inconsistent; anomalies in the listing of publisher names are presented as an example. An open dataset of the DOAJ metadata as of Jan. 5, 2021 has been released, with non-English characters retained, information in the correct columns, and an added column of standardized publisher names; the link can be found in the post. Limitations in DOAJ updating are also explained. As of Jan. 5, 2021, only 30% of records had been updated in the past year, and researchers have no way of knowing whether the last update reflects a full review; that is, the particular metadata of interest might not have been updated.
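For readers who work with the metadata in code, the three data-quality points above (Unicode, misplaced columns, inconsistent publisher names) can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the procedure used to build the released dataset: the column names, the sample rows, and the helpers `read_rows` and `normalize_publisher` are hypothetical, and the sketch assumes the downloaded file is UTF-8 encoded.

```python
import csv
import io
import unicodedata


def read_rows(stream):
    """Split CSV records into well-formed rows and suspect rows.

    A record whose field count differs from the header's is a likely
    sign that some value has landed in the wrong column.
    """
    reader = csv.reader(stream)
    header = next(reader)
    good, suspect = [], []
    # record_no counts CSV records; the header is record 1.
    for record_no, row in enumerate(reader, start=2):
        if len(row) == len(header):
            good.append(dict(zip(header, row)))
        else:
            suspect.append((record_no, row))
    return header, good, suspect


def normalize_publisher(name):
    """Build a comparison key for publisher names (hypothetical rule).

    NFKC folds compatibility forms of characters, casefold() removes
    case differences, and split()/join() collapses stray whitespace.
    """
    folded = unicodedata.normalize("NFKC", name).casefold()
    return " ".join(folded.split())


# Illustrative sample only; these are not real DOAJ records or headers.
SAMPLE = (
    "Journal title,Publisher,Country\n"
    "Revista de Educação,Universidade de São Paulo,Brazil\n"
    "Études,  UNIVERSIDADE DE SÃO PAULO ,Brazil\n"
    "Broken row,Some Publisher,Canada,extra\n"
)

# For a real download, open the file with an explicit encoding instead:
#   open(path, encoding="utf-8", newline="")   # UTF-8 is an assumption
header, good, suspect = read_rows(io.StringIO(SAMPLE))
publisher_keys = {normalize_publisher(row["Publisher"]) for row in good}
```

Here the two spellings of the same publisher collapse to a single key, and the record with an extra field is set aside for manual review rather than being silently analysed. A field-count check only catches records with too many or too few fields; values that are swapped between columns within a correctly sized record still need to be inspected by hand.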


