GMrepo v3: a curated human gut microbiome database with expanded disease coverage and enhanced cross-dataset biomarker analysis
Nucleic Acids Res. 2025 Nov 24:gkaf1190. doi: 10.1093/nar/gkaf1190. Online ahead of print.
ABSTRACT
GMrepo (Gut Microbiome Data Repository) is a curated and consistently annotated database of human gut metagenomes, designed to improve data reusability and enable cross-project and cross-disease comparisons. In this latest release, GMrepo v3 has been expanded to 890 projects and 118 965 runs/samples, including 87 048 16S rRNA and 31 917 metagenomic datasets. The number of annotated diseases has increased from 133 to 302, allowing more comprehensive disease-related microbiome analyses. We systematically identified microbial markers between phenotype pairs (e.g. healthy versus diseased) at the project level and compared them across datasets to detect reproducible signatures. As of this release, GMrepo v3 includes 1299 marker taxa (726 species and 573 genera) associated with 167 phenotype pairs, derived from 275 carefully curated projects. To assess marker stability, we developed the Marker Consistency Index (MCI), which summarizes the prevalence and directional consistency of markers across studies. Among 400 markers showing altered abundances in ≥10 projects, 143 were consistently enriched in healthy controls (MCI > 75%), while 85 were enriched in diseases (MCI < 25%). A marker-centric interface enables users to explore marker behavior across diseases. The GMrepo v3 database is freely accessible at https://gmrepo.humangut.info.
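The abstract gives the MCI thresholds but not its formula. The sketch below is a minimal illustration under one loudly stated assumption: that MCI is the percentage of projects reporting a marker in which that marker is enriched in healthy controls. The function names, the "healthy"/"disease" labels, and the "inconsistent" category are hypothetical and are not taken from GMrepo; only the 75% and 25% cut-offs come from the abstract.

```python
from typing import Iterable


def marker_consistency_index(directions: Iterable[str]) -> float:
    """Return an MCI-like score in percent.

    ASSUMPTION: MCI = share of projects in which the marker is enriched in
    healthy controls. The abstract does not state the actual formula.
    `directions` holds one label per project: "healthy" or "disease".
    """
    labels = list(directions)
    if not labels:
        raise ValueError("at least one project is required")
    unknown = set(labels) - {"healthy", "disease"}
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    healthy = sum(1 for label in labels if label == "healthy")
    return 100.0 * healthy / len(labels)


def classify_marker(mci: float, high: float = 75.0, low: float = 25.0) -> str:
    """Apply the thresholds quoted in the abstract (MCI > 75%, MCI < 25%)."""
    if mci > high:
        return "health-enriched"
    if mci < low:
        return "disease-enriched"
    # Hypothetical label for markers between the two thresholds.
    return "inconsistent"


if __name__ == "__main__":
    # Hypothetical marker observed in 12 projects, 10 of them health-enriched.
    example = ["healthy"] * 10 + ["disease"] * 2
    score = marker_consistency_index(example)
    print(f"MCI = {score:.1f}% -> {classify_marker(score)}")
```

Under this assumed definition, a marker reported in 12 projects and enriched in healthy controls in 10 of them would exceed the 75% threshold, whereas one split 9 to 3 would sit exactly at 75% and not qualify, because the abstract's criterion is a strict inequality.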
PMID:41277537 | DOI:10.1093/nar/gkaf1190