Review of autism spectrum disorder databases for the identification of candidate genes

Database (Oxford) 2025-11-26

Database (Oxford). 2025 Jan 18;2025:baaf067. doi: 10.1093/database/baaf067.

ABSTRACT

Research into the genetics of autism spectrum disorder (ASD) seeks to unravel its complex genetic background by identifying genes associated with the condition at varying levels of confidence. While these findings hold significant potential for clinical applications, the dispersed nature of scientific evidence presents a challenge for the reliable identification of ASD candidate genes. Although ASD candidate genes are gathered in genetic databases, these vary widely in their gene sets, biological information, and confidence-level classification methods, leading to inconsistencies and complicating research efforts. This study aims to identify and assess the quality and reliability of ASD genetic databases to support more robust identification of ASD candidate genes. Using a Systematic Mapping Study, we identified 13 specialized databases. We then followed a Data Quality Approach in two stages, first assessing the Accessibility, Currency, and Relevance dimensions to select the potentially relevant databases to be used as ASD candidate gene sources. The selected databases were then analysed by assessing Completeness, at both the schema and data levels, and Consistency among high-confidence ASD genes. The four selected databases are AutDB, SFARI Gene, GeisingerDBD, and SysNDD. SFARI Gene demonstrated the highest completeness at schema level (89%), while AutDB showed the highest completeness at data level (90%). However, only 1.5% consistency was observed across the four databases in their classification of high-confidence ASD candidate genes. Our findings highlight the unique contributions of each database and reveal substantial inconsistencies in gene classification, driven by differences in scoring criteria and the scientific evidence considered. These inconsistencies have important implications for both clinical users and researchers, as conclusions may vary depending on the database used.
This study supports researchers when using ASD genetic databases, promoting consistent interpretation and improved clinical decisions.
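
The abstract does not state how the consistency figure was computed. As an illustration only, one plausible reading is the share of genes rated high-confidence by every database, relative to all genes rated high-confidence by at least one. The sketch below assumes that definition; the function name `consistency` and the gene labels (`GENE_A`, etc.) are hypothetical placeholders, not data from the study.

```python
def consistency(gene_sets):
    """Fraction of genes that appear in every set, out of genes in any set.

    Assumed definition (intersection over union); the study's exact
    metric may differ.
    """
    sets = [set(genes) for genes in gene_sets.values()]
    if not sets:
        return 0.0
    shared = set.intersection(*sets)   # rated high-confidence everywhere
    union = set.union(*sets)           # rated high-confidence somewhere
    return len(shared) / len(union) if union else 0.0


# Hypothetical high-confidence gene lists, one per selected database.
high_confidence = {
    "AutDB": {"GENE_A", "GENE_B", "GENE_C"},
    "SFARI Gene": {"GENE_A", "GENE_C", "GENE_D"},
    "GeisingerDBD": {"GENE_A", "GENE_E"},
    "SysNDD": {"GENE_A", "GENE_B", "GENE_F"},
}

score = consistency(high_confidence)
print(f"Consistency across databases: {score:.1%}")
```

Under this definition, each additional database can only shrink the shared set while growing the union, which is one reason a four-way comparison can yield a very low figure even when pairwise overlap is substantial.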

PMID:41092272 | PMC:PMC12527254 | DOI:10.1093/database/baaf067