From OMP to DOAB, via Thoth: Open Collaboration to Amplify Slovenian OA Books

Reflections from the OPERAS Open Infrastructures for Open Access Books Working Group workshop held on May 20 as part of the OPERAS Conference 2026, Warsaw. Presentation slides are available here.

On 20 May 2026, the OPERAS Open Infrastructures for Open Access Books Working Group organised a workshop during the first day of the OPERAS Conference in Warsaw, bringing together approximately 35 participants from across the OPERAS network and the wider open scholarly communication community.

The workshop, titled “From OMP to DOAB, via Thoth: Open Collaboration to Amplify Slovenian OA Books”, explored how open, community-led infrastructures can work together to improve the dissemination, discoverability, interoperability, and long-term sustainability of Open Access (OA) books.

The session brought together expertise from four key not-for-profit infrastructures – Thoth Open Metadata, the OAPEN Foundation, the Directory of Open Access Books (DOAB), and the Public Knowledge Project (PKP), all of which are OPERAS members – alongside representatives from the Slovenian OPERAS node (OPERAS-SI), including ZRC SAZU, the University of Maribor, and the University of Ljubljana.

More broadly, the workshop reflected the goals of the OPERAS Open Infrastructures for OA Books Working Group: strengthening collaboration across the fragmented landscape of scholarly book publishing infrastructures and developing practical pathways toward more interoperable and sustainable workflows for OA books.

Fig. 1: Visual summary of feedback received via Mentimeter: From Local Files to Global Discovery: The OA Book Metadata Journey, from OMP to DOAB, via Thoth.

Why Slovenia?

The workshop centred on a concrete national use case: three publishers within the Slovenian OPERAS node, based at ZRC SAZU, the University of Maribor, and the University of Ljubljana. Their publishing units – Založba ZRC, University of Ljubljana Press and University of Maribor Press – together produce more than 70% of the country’s open access scholarly monographs. All three publishers make about 60% of their books available in open access. They have been successfully using Open Monograph Press (OMP) for their e-publications for several years; however, they continue to face common challenges around metadata management, dissemination, international visibility, preservation, and workflow integration.

As workshop participants explained, OMP is widely used to manage and publish books, assign DOIs, host files, and expose metadata. Yet in practice, many publishers (including all three Slovenian publishers) do not use it to manage submissions or editorial workflows. Instead, publishing processes often evolve around local requirements, organisational capacities, and historical practices, since the peer-review process for scholarly monographs differs in many ways from that required for scholarly articles.

To complicate things further, OMP as a system only provides a basic set of metadata that does not match the more detailed requirements of downstream aggregators. As one example, the Directory of Open Access Books (DOAB), one of the key discovery platforms active in the open access books landscape, requires distinct subject classification metadata provided via the controlled Thema vocabulary that is widely used across academic libraries.

Fig. 2: Schematic representation of a publisher’s perspective vis-à-vis metadata and dissemination in OA book publishing

This reality provided an ideal starting point for a broader discussion: how can open infrastructures work together to support diverse publishing practices while reducing duplication of effort and improving the quality and reach of OA books?

To ground the discussion in participants’ experiences, the workshop opened with a series of live Mentimeter polls. The results immediately revealed the diversity and complexity of today’s OA book publishing ecosystem.

What are publishers using OMP for?

Among OMP users, the poll indicated that the most widely used functionality was the hosting of book files and landing pages. Participants also reported significant use of metrics and usage tracking, DOI registration, and title management functions. This suggests that, for many organisations represented at the workshop, OMP primarily serves as a dissemination and publishing platform rather than a complete end-to-end workflow solution.

The discussion that followed highlighted how institutions often combine multiple systems to fulfil different requirements. Metrics collection, repository deposit, preservation, metadata management, and dissemination are frequently handled by separate tools and services. As one participant noted, many publishers have developed workflows around specific institutional needs rather than around a single integrated platform.

A highly fragmented ecosystem

When participants were asked about other publishing environments they use alongside OMP, the responses revealed a remarkably heterogeneous landscape.

Fig. 3: Results from a poll asking “What other book publishing environments (beyond OMP) are you using?”

Some respondents reported using platforms such as Pressbooks, PubPub, WordPress-based publishing environments, or proprietary systems. Others relied primarily on institutional repositories, while several indicated that they operate without a dedicated publishing system altogether, using only their organisational websites.

The discussion highlighted significant national and institutional variation. Participants from library publishing programmes described repository-based workflows designed around preservation and institutional visibility. Others referenced commercial or custom-built systems developed to meet local requirements. Croatia was cited as an example of a mixed environment where OMP coexists with repository infrastructures, particularly among smaller universities.

The overall conclusion was clear: there is no single dominant workflow model for OA books. Instead, publishers operate within a complex ecosystem of interconnected platforms, repositories, services, and local practices; OMP and similar specialised publishing systems for books were in the minority. We dare to speculate that, in a similar survey regarding journals, the results would be significantly more favourable for more integrated, system-based solutions.

The Submission Workflow Question

One of the most interesting discussions emerged around manuscript submission systems.

Participants were asked how important it is for a publishing platform to support the submission process. The results challenged some common assumptions. While a minority considered submission management essential, the largest group reported handling submissions through alternative processes rather than dedicated platform functionality. Others indicated that they simply do not use formal submission systems at all.

Fig. 4: Results from a poll asking “What are the key features of OMP you are actively using?”

Representatives from PKP noted that submission workflows are being actively considered as an area of future OMP development. However, the discussion demonstrated why the question is more complicated than it first appears.

Across the board, participants agreed that book publishing workflows differ significantly from journal publishing. Peer review cycles are often longer. Editorial relationships tend to be more personal and less transactional. Publishers producing only a small number of books each year frequently maintain direct communication with authors and therefore see limited value in requiring authors to learn and use a dedicated submission platform.

The nature of the publication also matters. Managing a single-author monograph presents very different challenges from coordinating an edited volume involving dozens of contributors. For single-author monographs, peer-reviewing is generally a longer process and can even involve close collaboration between the author and the reviewers. Participants repeatedly emphasised that flexibility remains critical. The diversity of book publishing practices means that highly standardised submission workflows may not fit many organisations’ needs.

Another recurring theme was the role of human resources. Several participants noted that barriers to adopting more sophisticated workflows are often organisational rather than technical. Publishers may have access to suitable systems but lack the staff capacity required to provide training, support, and ongoing maintenance.

Metadata: Essential, Yet Still Difficult

Metadata emerged as one of the workshop’s central themes. Across the Mentimeter responses and subsequent discussion, participants consistently described metadata management as both indispensable and labour-intensive.

Fig. 5: Results from a poll asking “Would you be able to extract essential metadata [as outlined in the two-tiered metadata recommendations framework] from your system?”

Most respondents characterised their experiences as manageable but requiring substantial effort. Only a small minority reported fully streamlined workflows. When asked whether their systems could automatically extract essential metadata, most participants indicated that only partial automation was possible. For many organisations, at least half of the metadata still requires manual input and verification.

This resonated strongly with examples shared throughout the workshop. Participants described a range of challenges, including inconsistent metadata standards, missing identifiers, formatting errors, DOI issues, incompatible export structures, and the need to clean metadata received from external systems.

Several institutions reported developing local tools specifically to bridge gaps between platforms and metadata requirements. Representatives from Masaryk University, for example, described an internally developed system designed to manage metadata across multiple services and specifications.

Yet participants also stressed that metadata becomes significantly easier to manage when it is collected early in the publishing process. Systems that integrate metadata capture from the outset reduce duplication of effort and improve downstream interoperability.

A Shared Framework for Better Metadata

These practical discussions connected directly to one of the workshop’s core presentations: the report International Metadata Recommendations for OA Books and Chapters. The report reviews existing policy recommendations and platform requirements, and then synthesises these findings into a two-tiered framework for metadata management consisting of Essential and Desirable metadata elements.

The presentation highlighted several important findings. First, metadata quality and interoperability remain fundamental requirements for discoverability, dissemination, preservation, accessibility, and reuse. Second, recent policy developments and accessibility requirements are increasing expectations around metadata completeness and quality. Finally, the OA books ecosystem still lags behind journal publishing in terms of technical and policy maturity. The proposed framework therefore seeks to establish a common baseline of metadata elements that can support interoperability across different platforms and infrastructures while remaining realistic for publishers with varying capacities.
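To make the two-tiered idea concrete, the following is a minimal sketch of how a publisher might check a record against an Essential and a Desirable tier. The field names are hypothetical placeholders chosen for illustration, not the elements defined in the report, and the function and class names are likewise assumptions.

```python
from dataclasses import dataclass

# Hypothetical placeholder fields; the report defines the actual
# Essential and Desirable elements, which may differ from these.
ESSENTIAL = ("title", "contributors", "doi", "licence", "publication_date")
DESIRABLE = ("abstract", "subject_codes", "orcid", "funding")


@dataclass
class TierReport:
    missing_essential: list
    missing_desirable: list

    @property
    def meets_baseline(self) -> bool:
        # A record meets the baseline when no Essential field is missing.
        return not self.missing_essential


def is_blank(value) -> bool:
    """Treat None, blank strings, and empty collections as missing."""
    if value is None:
        return True
    if isinstance(value, str):
        return not value.strip()
    if isinstance(value, (list, tuple, set, dict)):
        return len(value) == 0
    return False


def check_record(record: dict) -> TierReport:
    """List the fields of each tier that are absent or empty in the record."""
    return TierReport(
        missing_essential=[f for f in ESSENTIAL if is_blank(record.get(f))],
        missing_desirable=[f for f in DESIRABLE if is_blank(record.get(f))],
    )
```

Separating the two tiers in the result lets a publisher with limited capacity treat gaps in the Essential tier as blocking while tracking Desirable gaps as improvements to make over time.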

Understanding data submissions workflows at OAPEN and DOAB

Building on the metadata framework, representatives from OAPEN and DOAB then explained how metadata moves into and through their systems, and which challenges publishers commonly encounter when trying to submit data to OAPEN. The discussion highlighted the continuing importance of high-quality metadata for successful dissemination.

Participants learned more about DOAB membership processes, quality assurance procedures, metadata requirements, usage dashboards, and ongoing work to develop improved guidance for publishers. Particular attention was given to chapter-level metadata, ONIX workflows, and the provision of Thema subject classifications, all of which are becoming increasingly important for discoverability and reporting.

The session also explored the ingestion routes of DOAB and OAPEN. While manual upload workflows remain available in DOAB, participants acknowledged that these processes can be cumbersome for publishers. To address this, OAPEN has increasingly invested in automated metadata pipelines capable of handling larger volumes of content more efficiently.

One of these pipelines has been developed in close collaboration with Thoth Open Metadata in the context of the COPIM (2019-23) and Open Book Futures (2023-26) projects, providing an automated dissemination route into the OAPEN Library utilising a data stream facilitated through Thoth via an implementation of the SWORD protocol. The automated submissions workflow is now being used in production, with a dedicated outline provided by Thoth Open Metadata and OAPEN.

Interestingly, many technical problems encountered by OAPEN during ingestion are not caused by missing systems but by inconsistent metadata practices. Common issues include incorrect DOI formatting, inconsistent separators, malformed HTML, and incomplete records. As several speakers noted, metadata quality problems often originate far upstream in the publishing process.
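Two of the issues named above, DOI formatting and inconsistent separators, can be caught with light checks before export. The sketch below is one possible approach, not the validation that OAPEN or Thoth actually apply: the function names are hypothetical, the DOI shape check is deliberately loose, and lower-casing is a normalisation choice that relies on DOIs being case-insensitive.

```python
import re

# Common ways a DOI is written; each is reduced to the bare DOI.
_DOI_PREFIX = re.compile(r"^(?:https?://(?:dx\.)?doi\.org/|doi:\s*)", re.IGNORECASE)
# Loose shape check: "10.", a registrant code of 4-9 digits, "/", a suffix without spaces.
_DOI_SHAPE = re.compile(r"^10\.\d{4,9}/\S+$")


def normalise_doi(raw: str):
    """Return the bare, lower-cased DOI, or None if the value does not look like one."""
    candidate = _DOI_PREFIX.sub("", raw.strip()).lower()
    return candidate if _DOI_SHAPE.match(candidate) else None


def split_values(raw: str):
    """Split a multi-valued field written with ';' or '|' into clean items."""
    return [part.strip() for part in re.split(r"[;|]", raw) if part.strip()]
```

For example, `normalise_doi("https://doi.org/10.1234/ABC")` yields `"10.1234/abc"`, while a value such as `"10.1234"` with no suffix returns `None` and can be flagged for manual review. Commas are intentionally not treated as separators here, since they also occur inside inverted personal names.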

This is why initiatives such as the two-tiered metadata recommendation framework and the work that has gone into developing the Thoth platform are considered important interventions – they provide publishers with the means to streamline their metadata workflows, while also empowering them to directly embed good metadata practice into their day-to-day work.

“I’m Glad You Are Angry”

One of the workshop’s most memorable moments emerged during a discussion of metadata loss across the scholarly communications supply chain.

Reflecting on findings from the metadata report, participants observed that publishers frequently provide rich metadata only to see much of it disappear as records move between commercial systems and intermediaries. The discussion touched on broader structural issues within scholarly publishing infrastructures, pertaining to what has repeatedly been dubbed the “Leaky Metadata Pipeline”.

Participants noted that significant amounts of metadata – up to 90% of the data provided upstream by born-OA publishers – can be lost during distribution processes, creating inefficiencies that later require libraries and institutions to reconstruct or supplement missing information. The exchange sparked a wider conversation about why metadata quality remains difficult to preserve despite widespread recognition of its importance.
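One simple way to put a number on such loss is to compare which populated fields survive between the record a publisher sends and the record that appears downstream. The sketch below illustrates that idea only; it is not the method behind the figure cited in the workshop, and the function name and the flat-dictionary record shape are assumptions.

```python
_EMPTY = (None, "", [], {})


def field_retention(upstream: dict, downstream: dict) -> float:
    """Share of populated upstream fields that are still populated downstream."""
    populated = [key for key, value in upstream.items() if value not in _EMPTY]
    if not populated:
        # Nothing was provided upstream, so nothing could be lost.
        return 1.0
    kept = [key for key in populated if downstream.get(key) not in _EMPTY]
    return len(kept) / len(populated)
```

A record that leaves the publisher with four populated fields and arrives with only one of them filled in would score 0.25, meaning three quarters of the supplied fields were lost along the way. This counts only whether a field is present, not whether its value was altered.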

At its core, the discussion reinforced one of the workshop’s central messages: metadata should not be treated as an afterthought. It is infrastructure that is integral to the wider discoverability of OA books and chapters.

Connecting OMP, OAPEN, and DOAB through Thoth

The second half of the workshop focused on practical solutions. The Slovenian use case provided an opportunity to demonstrate how multiple open infrastructures can be connected into a more coherent workflow.

A key example was the development of a plugin capable of transferring metadata from OMP into the Thoth platform. Through Thoth, publishers can enrich, validate, manage, and disseminate metadata in one central space while maintaining a CC0 metadata model designed for maximum interoperability. Participants also learned about the different service pathways available through Thoth, including the freely available self-service metadata platform and export route offered via Thoth Oasis, which enables publishers to export high-quality ONIX records released under CC0 that are already tailored to OAPEN’s specifications. In addition, Thoth services offer multiple automated dissemination and archiving workflows, such as the automated SWORD data transfer into OAPEN, while also providing usage statistics or the option to host full websites and catalogues under a publisher’s own domain – all of which is made possible through open metadata and interoperable, open infrastructures.

This infrastructure layer enables publishers to connect more efficiently with downstream services such as OAPEN, DOAB, repositories, and analytics platforms while reducing manual input and repeated duplication of effort. The demonstration illustrated the benefits of an interoperable open infrastructure ecosystem: in this network, each service performs a specialised role, while interoperability allows publishers to move metadata and content more effectively across the ecosystem.

Measuring Impact

The workshop also examined the impact of improved metadata practices. Examples shared during the session demonstrated how enhanced metadata quality can contribute to increased visibility, discoverability, dissemination, and usage. Participants discussed evidence from Open Book Publishers and other initiatives showing how comprehensive metadata can significantly improve the reach of OA books. One example was a recent snapshot of Open Book Publishers’ Crossref Participation Report, in which the publisher excelled in almost every dimension thanks to the streamlined metadata provision through the Thoth system.

Fig. 6: Snapshot of the Crossref Participation Report of Open Book Publishers, who are utilising the Thoth metadata system to manage high-quality metadata for their DOI registration with Crossref.

In doing so, the discussion also challenged persistent assumptions about the quality of OA publications. Participants argued that, far from representing a weakness, the metadata practices of community-led OA publishers are increasingly stronger than those of many commercial publishers. In this sense, metadata quality becomes part of the broader argument for the value and professionalism of Diamond OA book publishing.

Looking Ahead

The workshop concluded with a discussion of future opportunities for collaboration. Several priorities emerged, including further collaboration on automated DOI registration, streamlined metadata dissemination workflows, repository integration, automated archiving workflows, improved usage statistics and reporting (via the OPERAS Metrics service), and in more general terms, greater interoperability between community-owned infrastructures.

Participants repeatedly emphasised that the OA books ecosystem has now reached a stage where numerous open infrastructures exist and provide complementary services. The challenge moving forward is thus not necessarily to build entirely new systems, but to ensure that existing systems, such as those of PKP, OAPEN, Thoth Open Metadata, and DOAB, work together more effectively through open data and interoperable, open APIs. Equitable collective funding models such as that of the Open Book Collective, itself a non-profit charity and OPERAS member, exist to provide sustainable pathways for libraries to support the operations of those open infrastructures.

For Slovenian publishers, the collaboration between PKP, Thoth, OAPEN, and DOAB demonstrates one possible configuration of such a collaborative model. More broadly, it offers an example of how community-governed infrastructures can collectively address challenges that individual organisations would struggle to solve alone.

Conclusion

The OPERAS Conference workshop demonstrated both the complexity of the OA books landscape and the growing maturity of the infrastructures that support it. Participants brought diverse publishing experiences, workflows, and institutional contexts. From those different backgrounds, common themes emerged throughout the discussions: the importance of open metadata, the need for interoperability, the realities of limited resources, and the value of open collaboration.

Perhaps most importantly, the workshop highlighted that sustainable solutions are increasingly being developed not by isolated platforms but through partnerships between infrastructures, publishers, libraries, and scholarly communities.

As OA book publishing continues to expand, initiatives such as the OPERAS Open Infrastructures for Open Access Books Working Group, which is part of the OPERAS OA Books Special Interest Group (SIG), play a crucial role in creating the connections, standards, and collaborations needed to support a more open, visible, and resilient scholarly publishing ecosystem.

By Toby Steiner, Karla Avanço, Aleš Pogačnik, Wiktor Florian, Rupert Gatti, Anna Wałek