Enzyme Engineering Database (EnzEngDB): a platform for sharing and interpreting sequence-function relationships across protein engineering campaigns

Nucleic Acids Res. 2025 Dec 8:gkaf1142. doi: 10.1093/nar/gkaf1142. Online ahead of print.

ABSTRACT

The discovery and engineering of new enzymes are important across the bioeconomy, with applications ranging from foods to pharmaceuticals and from sensors to agriculture. However, enzyme engineering, and machine learning-guided engineering in particular, is hampered by a lack of data. Currently, no database is designed to capture and interpret the datasets created in this domain, nor are there easy-to-use analysis and visualisation tools. We developed the Enzyme Engineering Database (EnzEngDB) as a centralised resource and online analysis tool that consolidates sequence-function data from enzyme engineering campaigns. It makes three contributions: (i) a database into which researchers can deposit public data, (ii) visualisation and analysis tools with which protein engineers can analyse their own data or compare enzyme variants with those from other engineering campaigns, and (iii) a gold-standard dataset for benchmarking automated extraction, along with the first large language model extraction pipeline specific to enzyme engineering campaigns. The Enzyme Engineering Database is accessible at http://enzengdb.org/.

PMID:41359034 | DOI:10.1093/nar/gkaf1142