<em>Leishmania infantum</em> (JPCM5) Transcriptome, Gene Models and Resources for an Active Curation of Gene Annotations
wikidata 2024-07-24
Genes (Basel). 2023 Apr 4;14(4):866. doi: 10.3390/genes14040866.
ABSTRACT
Leishmania infantum is one of the causative agents of visceral leishmaniasis, the most severe form of leishmaniasis. An improved assembly of the L. infantum genome was published five years ago, yet its transcriptome remained to be delineated. In this work, the transcriptome was annotated by combining short and long RNA-seq reads. The good agreement between the results from both methodologies confirmed that transcript assembly based on Illumina RNA-seq, followed by delimitation according to the positions of spliced leader addition sites (SAS) and poly-A addition sites (PAS), is an adequate strategy for annotating Leishmania transcriptomes; this procedure was previously used for transcriptome annotation in other Leishmania species and related trypanosomatids. These analyses also confirmed that Leishmania transcript boundaries are relatively slippery, showing extensive heterogeneity at the 5'- and 3'-ends. However, RNA-seq reads derived from the PacBio technology (referred to as Iso-Seq) allowed us to uncover some complex transcription patterns at particular loci that would go unnoticed with short RNA-seq reads alone. Thus, Iso-Seq analysis provided evidence that transcript processing at particular loci may be more dynamic than expected. Another notable finding was a case of allelic heterozygosity, inferred from chimeric Iso-Seq reads that might have been generated by an intrachromosomal recombination event. In addition, we provide the L. infantum gene models, including both UTRs and CDS regions, which should be helpful for whole-genome expression studies. Moreover, we have laid the foundations of a communal database for the active curation of both gene/transcript models and functional annotations for genes and proteins.
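The delimitation step described in the abstract (trimming an assembled transcript to its SAS and PAS positions) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `delimit_transcript`, the inputs (one genomic position per supporting read), the forward-strand assumption, and the "most frequently used site wins" rule are all assumptions made for illustration.

```python
from collections import Counter


def delimit_transcript(start, end, sas_hits, pas_hits):
    """Trim a forward-strand transcript [start, end] to its best-supported SAS and PAS.

    Hypothetical helper, for illustration only.
    sas_hits / pas_hits: iterables of genomic positions, one entry per supporting read.
    If no site falls inside the interval, the assembled boundary is kept.
    """
    # Count reads supporting each SAS inside the assembled transcript.
    sas = Counter(p for p in sas_hits if start <= p <= end)
    # Most frequent SAS wins; ties go to the most upstream position.
    new_start = min(sas, key=lambda p: (-sas[p], p)) if sas else start

    # Only PAS positions downstream of the chosen SAS are meaningful.
    pas = Counter(p for p in pas_hits if new_start < p <= end)
    # Most frequent PAS wins; ties go to the most downstream position.
    new_end = min(pas, key=lambda p: (-pas[p], -p)) if pas else end

    return new_start, new_end


# Example with made-up coordinates: several alternative sites, one dominant at each end.
boundaries = delimit_transcript(
    100, 500,
    sas_hits=[120, 120, 120, 125, 90],
    pas_hits=[480, 480, 470, 510],
)
```

Keeping only the dominant site at each end is a simplification; the heterogeneity at the 5'- and 3'-ends noted in the abstract means a fuller model would retain the alternative sites as well.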
PMID:37107624 | PMC:PMC10137940 | DOI:10.3390/genes14040866