SHARK: web server for alignment-free homology assessment for intrinsically disordered and unalignable protein regions
Nucleic Acids Res. 2025 May 21:gkaf408. doi: 10.1093/nar/gkaf408. Online ahead of print.
ABSTRACT
Whereas alignment has been fundamental to sequence-based assessments of protein homology, it is ineffective for intrinsically disordered regions (IDRs) due to their low sequence conservation and unique sequence properties. Here, we present a web server implementation of SHARK (bio-shark.org), an alignment-free algorithm for homology classification that compares the overall amino acid composition and short regions (k-mers) shared between sequences (SHARK-scores). The output of such k-mer-based comparisons is used by SHARK-dive, a machine learning classifier that detects homology between unalignable, disordered sequences. SHARK-web provides sequence-versus-database assessment of protein sequence homology akin to conventional tools such as BLAST and HMMER. Additionally, we provide precomputed sets of IDR sequences from 16 model organism proteomes, facilitating searches against species-specific IDR-omes. SHARK-dive offers superior overall homology detection performance to BLAST and HMMER, driven by a large increase in sensitivity to low sequence identity homologs, and can be used to facilitate the study of sequence-function relationships in disordered, difficult-to-align regions.
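To make the idea of alignment-free comparison concrete, the following minimal Python sketch compares two sequences by shared k-mers and by overall amino acid composition. It is not the SHARK-score or SHARK-dive method, whose definitions the abstract does not give. The function names (kmers, kmer_jaccard, composition_cosine), the choice of Jaccard and cosine similarity, the default k = 3, and the example sequences are all assumptions made purely for illustration.

```python
from collections import Counter
from math import sqrt


def kmers(seq, k):
    """Return the set of overlapping k-mers in seq (empty if len(seq) < k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_jaccard(a, b, k=3):
    """Jaccard similarity of the two k-mer sets (illustrative, not a SHARK-score)."""
    ka, kb = kmers(a, k), kmers(b, k)
    union = ka | kb
    return len(ka & kb) / len(union) if union else 0.0


def composition_cosine(a, b):
    """Cosine similarity of the residue-count vectors of two sequences."""
    ca, cb = Counter(a), Counter(b)
    # Counter returns 0 for residues absent from cb, so no KeyError here.
    dot = sum(ca[x] * cb[x] for x in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    # Hypothetical low-complexity sequences, made up for this example.
    s1 = "SGRGGSYGGSRGGYSQ"
    s2 = "GGYSRGGSGRGSYGGQ"
    print(kmer_jaccard(s1, s2, k=3))
    print(composition_cosine(s1, s2))
```

Neither score depends on residue positions lining up between the two sequences, which is why measures of this kind remain usable when an alignment cannot be built. A classifier such as SHARK-dive would take features of this general type as input rather than a single similarity value.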
PMID:40396357 | DOI:10.1093/nar/gkaf408