PlantPan: A comprehensive multi-species plant pan-genome database

database[Title] 2025-04-20

Plant J. 2025 Apr;122(1):e70144. doi: 10.1111/tpj.70144.

ABSTRACT

The pan-genome represents the complete genomic diversity of a species and serves as a valuable resource for studying species evolution and crop domestication and for guiding crop breeding and improvement. While several single-species plant pan-genome databases exist, multi-species pan-genome databases remain scarce. In addition, differences in the methods and data types used for plant pan-genome analysis across databases hinder the comparison and integration of pan-genome information from different projects at both the multi-species and single-species levels. To address this challenge, we introduce PlantPan, a comprehensive database housing the results of pan-genome analysis for 195 genomes from 11 plant species. PlantPan aims to provide extensive information, including gene-centric and sequence-centric pan-genome information, graph-based pan-genomes, pan-genome openness profiles, gene functions and their variation characteristics, homologous genes, and gene clusters across different species. In total, PlantPan incorporates 9 163 011 genes; 694 191 gene clusters; 526 973 370 genome variations and 1 616 089 non-redundant genome variation groups at the species level; and 33 455 098 genome syntenies and 177 827 non-redundant genome synteny groups at the species level. Regarding functional genes, PlantPan contains 5 222 720 genes related to transcription factors, 395 247 literature-reported resistance genes, 455 748 predicted microbial/disease resistance genes, and 1 612 112 genes related to molecular pathways. In summary, PlantPan is a vital platform for advancing the application of pan-genomes in molecular breeding for crops and evolutionary research for plants.
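As a rough illustration of what a gene-centric "pan-genome openness profile" usually summarizes, the sketch below tracks how the pan-genome (union of gene clusters) and the core genome (intersection) change as genomes are added one at a time. The abstract does not describe how PlantPan computes its profiles, so the function name `openness_profile`, the input layout, and the toy cluster identifiers are all assumptions made for illustration only.

```python
def openness_profile(genomes):
    """Return (pan_sizes, core_sizes) as genomes are added in the given order.

    `genomes` is an iterable of collections of gene-cluster IDs, one per genome.
    This is a hypothetical helper, not PlantPan's actual method.
    """
    pan, core = set(), None
    pan_sizes, core_sizes = [], []
    for clusters in genomes:
        clusters = set(clusters)
        pan |= clusters                                   # union: every cluster seen so far
        core = clusters if core is None else core & clusters  # intersection: clusters in all genomes
        pan_sizes.append(len(pan))
        core_sizes.append(len(core))
    return pan_sizes, core_sizes


# Hypothetical toy data: three genomes described by the gene clusters they contain.
toy = [
    {"c1", "c2", "c3"},
    {"c1", "c2", "c4"},
    {"c1", "c3", "c5"},
]
pan_sizes, core_sizes = openness_profile(toy)
print(pan_sizes, core_sizes)  # [3, 4, 5] [3, 2, 1]
```

A pan-genome whose size keeps growing as genomes are added is commonly described as open, while one that plateaus is described as closed. In practice the order of genomes is typically permuted many times and the curves averaged, which this sketch omits.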

PMID:40219973 | DOI:10.1111/tpj.70144