CompoDynamics: a comprehensive database for characterizing sequence composition dynamics
(database[TitleAbstract]) AND (Nucleic acids research[Journal]) 2022-02-01
Nucleic Acids Res. 2022 Jan 7;50(D1):D962-D969. doi: 10.1093/nar/gkab979.
ABSTRACT
Sequence compositions of nucleic acids and proteins have significant impact on gene expression, RNA stability, translation efficiency, RNA/protein structure and molecular function, and are associated with genome evolution and adaptation across all kingdoms of life. Therefore, a devoted resource of sequence compositions and associated features is fundamentally crucial for a wide range of biological research. Here, we present CompoDynamics (https://ngdc.cncb.ac.cn/compodynamics/), a comprehensive database of sequence compositions of coding sequences (CDSs) and genomes for all kinds of species. Taking advantage of the exponential growth of RefSeq data, CompoDynamics presents a wealth of sequence compositions (nucleotide content, codon usage, amino acid usage) and derived features (coding potential, physicochemical property and phase separation) for 118 689 747 high-quality CDSs and 34 562 genomes across 24 995 species. Additionally, interactive analytical tools are provided to enable comparative analyses of sequence compositions and molecular features across different species and gene groups. Collectively, CompoDynamics bears the great potential to better understand the underlying roles of sequence composition dynamics across genes and genomes, providing a fundamental resource in support of a broad spectrum of biological studies.
PMID:34718745 | PMC:PMC8728180 | DOI:10.1093/nar/gkab979