Peer Review

Computational Complexity 2024-05-22

Daniel Lemire wrote a blog post, "Peer Review is Not the Gold Standard in Science." I wonder who was claiming it was. There is a whole section on peer review in an online Responsible Conduct in Research course we are required to take, which discusses its challenges: "honesty, objectivity, quality control, confidentiality and security, fairness, bias, conflicts of interest, editorial independence, and professionalism". With apologies to Winston Churchill, Peer Review is the worst form of measuring academic quality, except for all of the others.

Peer review requires answering two questions.

  1. Has the research been done properly?
  2. What is the value of the research?

For theoretical research, the first comes down to checking the proofs, which sounds like an objective check. Here we have a "gold standard": formalizing the proof so it can be verified in a proof system like Lean. That's a heavy burden, so we generally only require authors to give enough detail that it's clear we could formalize the proof given enough time. That becomes subjective, and reviewers, especially for conferences, may not have the time or inclination to check the details of a 40-page proof. Maybe one day AI can take a well-written informal proof and formalize it for a proof system.

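To make that concrete, here is a minimal sketch of what formalization looks like in Lean 4. The theorem and names are illustrative, not from any actual submission: the one-line informal argument "the sum of two even numbers is even, since 2a + 2b = 2(a + b)" becomes a statement the proof checker verifies mechanically.

```lean
-- Informal claim: the sum of two even numbers is even.
-- Formal version: if m = 2*a and n = 2*b, then m + n = 2*(a + b).
theorem even_add_even {m n : Nat}
    (hm : ∃ a, m = 2 * a) (hn : ∃ b, n = 2 * b) :
    ∃ c, m + n = 2 * c :=
  match hm, hn with
  | ⟨a, ha⟩, ⟨b, hb⟩ =>
    -- Witness c := a + b; rewriting reduces both sides to 2*a + 2*b.
    ⟨a + b, by rw [ha, hb, Nat.mul_add]⟩
```

Even this trivial fact needs explicit witnesses and rewrite steps; scaling that effort to a 40-page research proof is exactly the burden described above.
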
But the second question is almost entirely subjective. How does the work advance previous research? What value does it give to a field and how does it set up future research? Different researchers will give different opinions. And then there are the people who consciously or unconsciously cheat, from helping their friends get papers accepted to running citation rings. As we focus on metrics to judge researchers, too many people will game the system to pump up those metrics.

In 2023, NeurIPS had over 13,000 submissions for roughly 3,500 slots. Even with the best of reviewers' intentions, it's impossible to maintain any sense of consistency at these large-volume conferences.

Despite the problems with peer review, you'd hate to use a different system, say delegating the reviewing to some AI process, even if it could lead to more consistency. I suspect many reviews are being delegated anyway.

Peer review grew in importance as journals and conferences had to make choices to fill limited proceedings. These days we have the capacity to distribute every paper. So perhaps the best form of measuring academic quality is no review at all.