"The chart shows the same measures taken (using the same methods and data sources) over successive years. The lines should match, but in more recent years, they diverge. The data varies depending on when the readings were taken.

Notice how, for example, the number of articles published in 2016 varies by 14% depending on when the index was consulted. The data suggest that articles continued to be published after the year they were published in. The trends suggest a catastrophic fall-off in output.

Clearly something is wrong. If publication output had dropped by 90%+ since 2016, every scholarly publishing stakeholder would be both aware and on high alert!...

The divergence illustrated in the chart occurs because it takes time for the major indexes to count publication outputs. Our industry lacks common infrastructure for gathering basic measures, leaving it instead to the thousands of publishers to deposit information. Even where infrastructure exists – such as CrossRef – publishers are not consistent about how quickly, how much, or even whether they deposit information about their outputs. Additionally, the formats and standards they use do not always include the most effective metadata for characterizing publications (case in point: clearly and consistently specifying open access articles in hybrid journals)....

One might be tempted to think that the state of our data in scholarly publishing is “par for the course” – surely all industries are like this. However, that is not the case....

Basic metadata in our information industry should be like basic hygiene in healthcare. Boring but necessary. If scholarly publishers are stewards of the world’s evidence base, then surely we need to get our own evidence in order?"


