An algorithm to identify less invasive surfactant administration using a real-world database of preterm infants
PLoS One. 2026 Apr 15;21(4):e0345768. doi: 10.1371/journal.pone.0345768. eCollection 2026.
ABSTRACT
BACKGROUND: Surfactant replacement therapy, which can be delivered using a variety of methods, is central to the management of respiratory distress syndrome (RDS) in preterm infants. Less invasive surfactant administration (LISA) has been increasingly adopted due to its association with improved neonatal outcomes. However, there are no procedure codes to identify LISA in real-world data (RWD), limiting the ability to evaluate its use and effectiveness on a large scale. This study aimed to develop an algorithm to identify LISA procedures using administrative data.
METHODS: We conducted a retrospective study using chart reviews as the gold standard to identify preterm infants receiving surfactant via LISA or non-LISA procedures across Kaiser Permanente Northern California (KPNC) facilities. We selected 82 candidate variables from administrative data recorded between birth and the date of first surfactant administration. The algorithm was developed using births between 2019 and 2023, which were randomly split into a training set (n = 884) and a testing set (n = 379). Least absolute shrinkage and selection operator (LASSO) regression was used for variable selection and model fitting. Model discrimination was evaluated using the area under the receiver operating characteristic curve (AUROC). Algorithm performance was validated, overall and by gestational age (GA), in a combined sample of the testing set and a 2024 birth cohort (n = 622) using sensitivity (Sn), specificity (Sp), positive predictive value (PPV) and negative predictive value (NPV).
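The split and discrimination steps described in METHODS can be illustrated with a minimal sketch. It is not the authors' code. The 70/30 split ratio is an assumption (the abstract reports only the resulting set sizes, which a 70/30 split of 1,263 infants happens to reproduce), the function names and seed are hypothetical, and the labels and scores are toy values. AUROC is computed here as the probability that a randomly chosen LISA case receives a higher predicted score than a randomly chosen non-LISA case, with ties counted as one half.

```python
import random


def split_train_test(ids, train_fraction, seed):
    """Randomly split ids into training and testing lists (hypothetical helper)."""
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


def auroc(labels, scores):
    """AUROC as the fraction of positive/negative pairs ranked correctly (ties = 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Assumed 70/30 split of the 1,263 infants in the development cohort.
train_ids, test_ids = split_train_test(range(1263), train_fraction=0.7, seed=0)

# Toy labels (1 = LISA, 0 = non-LISA) and toy predicted probabilities.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
toy_auc = auroc(toy_labels, toy_scores)  # 8 of 9 pairs ranked correctly
```

The pairwise form is quadratic in sample size, which is acceptable for cohorts of this size; a rank-based formula would give the same value more efficiently.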
RESULTS: Among 1,263 preterm infants who received surfactant, 462 (36.6%) received it via LISA and 801 (63.4%) received it via invasive modalities (endotracheal tube [ETT] or intubation-surfactant-extubation [INSURE]). The LASSO-based model selected 21 variables predictive of LISA methods based on the training set. The model demonstrated strong discrimination (AUROC = 0.87). Using the maximum specificity cut-point (predicted probability ≥0.79), the model achieved Sn = 43.9%, Sp = 96.8%, PPV = 90.0% and NPV = 72.5%, with an overall agreement of 75.9% when evaluated in the combined testing set and 2024 birth cohort. Sn and Sp were consistent across GA subgroups.
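How the reported validation metrics follow from a 2x2 table at the ≥0.79 cut-point can be shown with a short worked example. The abstract does not report the cell counts. The counts below are illustrative values, chosen only because they sum to 622 and reproduce the reported percentages to one decimal place; they should not be read as the study's actual confusion matrix. Function names are hypothetical.

```python
CUT_POINT = 0.79  # maximum-specificity cut-point reported in the abstract


def classify(predicted_probability, cut_point=CUT_POINT):
    """Label an infant as LISA (True) when the predicted probability meets the cut-point."""
    return predicted_probability >= cut_point


def diagnostic_metrics(tp, fp, fn, tn):
    """Return Sn, Sp, PPV, NPV and overall agreement as proportions."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "agreement": (tp + tn) / total,
    }


# Illustrative counts only (NOT reported in the source); they sum to 622.
metrics = diagnostic_metrics(tp=108, fp=12, fn=138, tn=364)
percentages = {name: round(value * 100, 1) for name, value in metrics.items()}
```

With these illustrative counts, the low sensitivity alongside high specificity and PPV reflects the choice of a conservative cut-point: few non-LISA infants are labeled as LISA, at the cost of missing more than half of true LISA cases.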
CONCLUSIONS: We used a machine-learning approach to develop an algorithm that identified surfactant administered via LISA among preterm infants with high specificity and PPV using administrative data. The algorithm can support future research to evaluate the utilization and outcomes of LISA using RWD.
PMID:41984913 | PMC:PMC13082626 | DOI:10.1371/journal.pone.0345768