OpenScholar: The open-source A.I. that’s outperforming GPT-4o in scientific research | VentureBeat
peter.suber's bookmarks 2025-01-08
Summary:
"At OpenScholar’s core is a retrieval-augmented language model that taps into a datastore of more than 45 million open-access academic papers....
When tasked with answering biomedical research questions, GPT-4o cited nonexistent papers in more than 90% of cases. OpenScholar, by contrast, remained firmly anchored in verifiable sources....
The OpenScholar team has released not only the code for the language model but also the entire retrieval pipeline, a specialized 8-billion-parameter model fine-tuned for scientific tasks, and a datastore of scientific papers....
Still, OpenScholar is not without limitations. Its datastore is restricted to open-access papers, leaving out paywalled research that dominates some fields...."
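The retrieval-augmented pattern mentioned in the summary can be illustrated with a toy sketch. This is not OpenScholar's pipeline: the three-entry datastore, the paper ids, the word-overlap scoring, and the function names `retrieve` and `build_prompt` are all hypothetical stand-ins, chosen only to show how an answer can be tied to passages that actually exist in a datastore.

```python
import re
from collections import Counter

# Hypothetical toy datastore of (paper_id, text) pairs; these are not real papers.
DATASTORE = [
    ("paper-001", "Retrieval augmented language models reduce citation hallucination"),
    ("paper-002", "Protein folding prediction with deep neural networks"),
    ("paper-003", "Open access publishing trends in biomedical research"),
]

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def score(query, passage):
    """Count overlapping tokens (a crude stand-in for a learned retriever)."""
    q = Counter(tokenize(query))
    p = Counter(tokenize(passage))
    return sum(min(q[t], p[t]) for t in q)

def retrieve(query, k=2):
    """Return up to k datastore entries that share at least one token with the query."""
    ranked = sorted(DATASTORE, key=lambda d: score(query, d[1]), reverse=True)
    return [d for d in ranked[:k] if score(query, d[1]) > 0]

def build_prompt(query, k=2):
    """Assemble a prompt that restricts the model to the retrieved, citable passages."""
    hits = retrieve(query, k)
    context = "\n".join(f"[{pid}] {text}" for pid, text in hits)
    return (
        "Answer using only the sources below and cite their ids.\n"
        f"{context}\n"
        f"Question: {query}"
    )

if __name__ == "__main__":
    print(build_prompt("biomedical open access research", k=1))
```

Because every citation id in the prompt comes from the datastore, a model that follows the instruction can only cite entries that exist there; the same constraint also means anything missing from the datastore, such as paywalled papers, cannot be retrieved.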