PheNormGPT: a framework for extraction and normalization of key medical findings

Database (Oxford) 2025-01-22

Database (Oxford). 2024 Oct 23;2024:baae103. doi: 10.1093/database/baae103.

ABSTRACT

This manuscript presents PheNormGPT, a framework for extracting and normalizing key findings in clinical text. PheNormGPT uses large language models to extract key findings and phenotypic data from unstructured clinical text and map them to Human Phenotype Ontology (HPO) concepts. It uses OpenAI's GPT-3.5 Turbo and GPT-4 models with fine-tuning and few-shot learning, including a novel strategy that selects few-shot examples tailored to each request. PheNormGPT was evaluated in the BioCreative VIII Track 3 shared task, Genetic Phenotype Extraction from Dysmorphology Physical Examination Entries. It achieved an F1 score of 0.82 for standard matching and 0.72 for exact matching, securing first place in the shared task.
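The abstract does not say how few-shot examples are chosen for each request, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a small hypothetical pool of annotated examples and ranks them by token-overlap (Jaccard) similarity to the incoming text; the pool contents, the prompt wording, and the function names are all illustrative assumptions.

```python
import re

# Hypothetical pool of annotated examples (observation text, HPO label).
# Illustrative only; not taken from the shared-task data.
EXAMPLE_POOL = [
    ("EYES: widely spaced eyes", "HP:0000316 Hypertelorism"),
    ("HEAD: small head circumference, microcephaly", "HP:0000252 Microcephaly"),
    ("EARS: low-set ears bilaterally", "HP:0000369 Low-set ears"),
    ("HANDS: long slender fingers", "HP:0001166 Arachnodactyly"),
]


def tokens(text):
    """Lowercase alphabetic tokens as a set."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity between the token sets of two strings."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def select_examples(query, pool, k=2):
    """Return the k pool examples most similar to the query text.

    The similarity measure is an assumption; an embedding-based
    score could be swapped in without changing the interface.
    """
    ranked = sorted(pool, key=lambda ex: jaccard(query, ex[0]), reverse=True)
    return ranked[:k]


def build_prompt(query, pool, k=2):
    """Assemble a few-shot prompt with examples chosen for this query."""
    parts = ["Extract findings and map them to HPO terms."]
    for text, label in select_examples(query, pool, k):
        parts.append(f"Input: {text}\nOutput: {label}")
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)


query = "EARS: ears are low-set and posteriorly rotated"
prompt = build_prompt(query, EXAMPLE_POOL)
```

Here the ear-related example ranks first for the ear-related query, so it appears earliest in the prompt. The resulting string would then be sent to whichever model is in use.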

PMID:39444329 | PMC:PMC11498178 | DOI:10.1093/database/baae103