Bakta Web - rapid and standardized genome annotation on scalable infrastructures


Nucleic Acids Res. 2025 Apr 24:gkaf335. doi: 10.1093/nar/gkaf335. Online ahead of print.

ABSTRACT

The Bakta command line application is one of the most widely used and established tools for bacterial genome annotation. It balances comprehensive annotation with computational efficiency via alignment-free sequence identification. However, using command line software and interpreting result files in various formats can be challenging and pose technical barriers. Here, we present recent updates to the Bakta web server, a user-friendly web interface for conducting and visualizing annotations with Bakta without requiring command line expertise or local computing resources. Key features include interactive visualizations through circular genome plots, linear genome browsers, and searchable data tables that facilitate the interpretation of complex annotation results. The web server generates standard bioinformatics outputs (GFF3, GenBank, EMBL) and annotates diverse genomic features, including coding sequences, non-coding RNAs, and small open reading frames (sORFs), among others. The development of an auto-scaling cloud-native architecture and improved database integration led to substantially faster processing times and higher throughput. The system supports the FAIR principles via extensive cross-reference links to external databases, including RefSeq, UniRef, and Gene Ontology. In addition, new features have been implemented to foster sharing and collaborative interpretation of results. The web server is freely available at https://bakta.computational.bio.
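As an illustration of how one of the standard outputs named above might be consumed downstream, the following is a minimal sketch that tallies feature types in GFF3 text using only the Python standard library. It relies solely on the generic GFF3 layout (nine tab-separated columns, with the feature type in the third). The function name `count_feature_types` and the sample records are hypothetical and invented for illustration; they are not actual Bakta output, and real files may use different type labels.

```python
from collections import Counter


def count_feature_types(gff3_text: str) -> Counter:
    """Count how often each feature type (GFF3 column 3) occurs."""
    counts = Counter()
    for line in gff3_text.splitlines():
        # An embedded sequence section may follow the annotations; stop there.
        if line.startswith("##FASTA"):
            break
        # Skip blank lines, directives, and comments.
        if not line.strip() or line.startswith("#"):
            continue
        columns = line.split("\t")
        # A valid GFF3 feature line has exactly nine columns.
        if len(columns) != 9:
            continue
        counts[columns[2]] += 1
    return counts


# Hypothetical sample records, made up for this sketch (not real Bakta output).
SAMPLE = "\n".join([
    "##gff-version 3",
    "contig_1\tExample\tregion\t1\t5000\t.\t+\t.\tID=contig_1",
    "contig_1\tExample\tCDS\t100\t400\t.\t+\t0\tID=cds_1",
    "contig_1\tExample\tCDS\t600\t900\t.\t-\t0\tID=cds_2",
    "contig_1\tExample\ttRNA\t1000\t1075\t.\t+\t.\tID=trna_1",
    "##FASTA",
    ">contig_1",
    "ACGT",
])

if __name__ == "__main__":
    for feature_type, n in sorted(count_feature_types(SAMPLE).items()):
        print(f"{feature_type}\t{n}")
```

To apply the same function to a downloaded result file, one could read the file contents into a string and pass it in; the file name would depend on how the result was saved.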

PMID:40271661 | DOI:10.1093/nar/gkaf335