Does Clarivate understand what citations are for?

flavoursofopenscience's bookmarks 2026-02-26

A month ago Clarivate announced a new yet-to-be-released product called Nexus: "Clarivate Nexus acts as a bridge between the convenience of AI and the rigor of academic libraries". This is a pitch to librarians who have correctly identified generative AI chatbots as purveyors of endless bullshit, but also know that students and some researchers are going to use them anyway. Clarivate tells us that we can patch up the fabrications of chatbots with reassuring terms like "trusted sources", "verified academic references", and "authoritative".

Looking more carefully at Clarivate's marketing material, what they are proposing suggests that Clarivate understands neither what citations are for nor why fabricated citations are a problem. This is somewhat surprising for the company that controls and manages such key parts of the scholarly publishing system as the citation database Web of Science, the scholarly publishing and indexing company ProQuest, and the Primo/Summon Central Discovery Index.

Why we cite

It can get a little more complicated than this, but there are essentially two reasons for citations in scholarly work.

The first is to indicate where you got your data. If I write that the population of Australia in June 2025 was 27.6 million people, I need to back up this claim somehow. In this case, I would cite the Australian Bureau of Statistics as the source. This adds credibility to a claim by enabling readers to check the original source and assess whether it actually does make the same claim, and whether that claim is credible. If I said that the population of Australia in 2025 was 100 million people and cited a source which made that claim and in turn cited the ABS as its source, you could follow the chain of references back and identify that the paper I cited is where the error occurred.

The second reason we cite a source is to give credit for a concept, term, or model for thinking. This is less about checking facts and more about academic norms and manners, though it also indicates how credible a scholar might be in terms of their understanding of a field. For example, I might describe a concept whereby librarians feel that the mission of libraries is good and righteous, and this leads to burnout because they feel they can never complain about their working conditions. If I did not cite Fobazi Ettarh's "Vocational Awe and Librarianship: The Lies We Tell Ourselves" whilst describing this, I would rightly not be seen as a credible scholar in the field, or alternatively might be seen as surely knowing about Ettarh's work but deliberately ignoring it or even claiming her work as my own idea.

Why fabricated citations are bad

So that's the basics of why scholars include citations in their work. We can now explore why fabricated citations are a problem. There are two related but distinct reasons.

Citations that look real but are actually fake waste the time of already-busy library resource-sharing teams by making them spend time checking whether the citation is real, and sometimes looking for items that don't exist. This aspect of fabrication is bad because the cited item doesn't exist. If we match this to our first reason for citing, we can see that a claim that is backed by a citation to nothing at all is, uh, pretty problematic if the reason we cite is to link to the source data backing up a claim. It's equivalent to simply not providing a citation at all, except worse because we're claiming that our plucked-out-of-the-air "fact" is backed up by some other source.

The second problem with fabricated citations is that there is no connection between the statement being made and the source being cited. Even if the source being cited exists, the connection between the statement and the cited item is fabricated. This is slightly more difficult to understand because generative AI is based on probability, so in many cases there will appear to be a connection. But without a tightly controlled retrieval-augmented generation (RAG) system, it's likely to simply be a lucky guess. The problem here is one of academic integrity – we've cited a source that exists, but it may or may not back up our claim, and the claim doesn't follow from the source.

A false nexus

Clarivate seems to be conflating these two issues. Their Nexus product has two core functions: checking citations to see if they are real, and suggesting references for content in chatbot conversations. The first is genuinely useful, though highly constrained – Clarivate only checks their own indexes, and defines anything that doesn't appear in those indexes as either non-existent or "non-scholarly" (it's unclear how it would define, for example, something with a DOI that exists but doesn't appear in Web of Science). Neither academia nor the tech industry is short on hubris, but even in that context, "anything not listed in our proprietary databases isn't credible" is a pretty eyebrow-raising claim.

The second function kicks in when the citation checker defines a citation as failed – it offers to "Find Verified Alternative". That is, Nexus offers to replace both cited sources that don't exist and cited sources that "aren't scholarly" with another real source. This addresses the first problem (cited sources that don't exist) but not the second (cited sources that aren't the real source of a claim or quotation).
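To make the distinction concrete, here is a minimal toy sketch in Python. It is not how Nexus or any real product works: the index, the identifiers, and the function names are all invented for illustration, and "support" is reduced to an exact string match that no real system could rely on. It only shows that swapping a fabricated source for a real one can pass an existence check while the claim remains unsupported.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    claim: str      # the statement made in the text
    source_id: str  # identifier of the work cited for that statement


# Hypothetical index: each known work mapped to the claims it actually makes.
# The identifiers and this structure are assumptions made up for this sketch.
INDEX = {
    "abs-population": {
        "The population of Australia in June 2025 was 27.6 million",
    },
    "ettarh-vocational-awe": {
        "Vocational awe contributes to librarian burnout",
    },
}


def source_exists(citation: Citation) -> bool:
    """Problem one: does the cited item exist in the index at all?"""
    return citation.source_id in INDEX


def source_supports_claim(citation: Citation) -> bool:
    """Problem two: does the cited item actually make the claim?"""
    return citation.claim in INDEX.get(citation.source_id, set())


def replace_with_real_source(citation: Citation) -> Citation:
    """Swap the cited source for one that exists, leaving the claim as written."""
    return Citation(citation.claim, sorted(INDEX)[0])


fabricated = Citation(
    "The population of Australia in 2025 was 100 million",
    "made-up-paper",
)
patched = replace_with_real_source(fabricated)

print(source_exists(fabricated), source_supports_claim(fabricated))
print(source_exists(patched), source_supports_claim(patched))
```

In this toy, the patched citation now points at something real, but the claim is exactly as unsupported as it was before the swap.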

With Nexus, Clarivate are essentially integrity-washing synthetic text, giving it an academic sheen without any academic rigour. Far from helping librarians, Clarivate's Nexus threatens to further unravel the hard work we do to teach students information literacy skills and its sparkling variety, "AI literacy". Students are already inclined to write their argument first and go on a fishing expedition for citations to back it up later (I certainly wrote my undergraduate essays this way). The last thing we want to do is direct them to a product that encourages this academically dishonest behaviour.

ChatGPT is designed to provide something that looks like a competent answer to a question. Nexus seems to be designed to amend this answer-shaped text into something that looks like a correctly-cited academic essay. But the point of student assessments isn't to produce essays – it's to produce competent researchers and systematic thinkers. Perhaps Clarivate thinks there is a large potential market of universities who want to help their own students cheat on assignments in ways that look more credible. To that, I would say "[citation needed]".