HLRMDB: a comprehensive database of the human microbiome with metagenomic assembly, taxonomic classification, and functional annotation by analysis of long-read and hybrid sequencing data
Nucleic Acids Res. 2026 Jan 6;54(D1):D763-D775. doi: 10.1093/nar/gkaf1152.
ABSTRACT
The human microbiome harbours an immense diversity of uncultivated microbes; short-read metagenomic sequencing has elucidated much of this diversity, but fragmented repeats and mobile elements constrain strain-level resolution. Long-read metagenomic sequencing, by contrast, can generate reads spanning tens of kilobases with single-molecule accuracies exceeding 99%, enabling near-complete genome and gene cluster recovery in a cultivation-independent manner. However, systematic resources that aggregate and standardise long-read outputs remain limited. Here, we present HLRMDB (http://www.inbirg.com/hlrmdb/), a comprehensive database of human microbiome datasets derived from long-read and hybrid metagenomic sequencing. We curated 1672 publicly available metagenomes (1291 long-read; 381 hybrid) spanning 38 studies, 39 sampling contexts and 42 host health states. A uniform assembly and binning pipeline reconstructed >98 Gb of contigs and yielded 18 721 metagenome-assembled genomes (MAGs). These MAGs span 21 phyla and 1323 bacterial species, with 6339 classified as near-complete and 5609 as medium-quality. HLRMDB integrates these genome-resolved data with extensive gene-centric functional profiles and antimicrobial resistance annotations. An interactive web interface supports flexible access to both sample-level and genome-level results, with multiple visualisations linking raw reads to assembled genomes. Overall, HLRMDB offers a harmonised, long-read-oriented repository that supports reproducible, strain-resolved comparative genomics and context-sensitive ecological investigations of the human microbiome.
PMID:41207298 | PMC:PMC12807619 | DOI:10.1093/nar/gkaf1152