Early prediction of sepsis associated encephalopathy in elderly ICU patients using machine learning models: a retrospective study based on the MIMIC-IV database


Front Cell Infect Microbiol. 2025 Apr 17;15:1545979. doi: 10.3389/fcimb.2025.1545979. eCollection 2025.

ABSTRACT

BACKGROUND: Sepsis-associated encephalopathy (SAE) is prevalent among elderly patients in the ICU and significantly affects patient prognosis. Because its symptoms resemble those of other neurological disorders and no specific biomarkers exist, early clinical diagnosis remains challenging. This study aimed to develop a predictive model for SAE in elderly ICU patients.

METHODS: Data on elderly sepsis patients were extracted from the MIMIC-IV database (version 3.1) and divided into training and test sets in a 7:3 ratio. Feature variables were selected using the LASSO-Boruta combined algorithm, and five machine learning (ML) models were then developed using these variables: Extreme Gradient Boosting (XGBoost), Categorical Boosting (CatBoost), Light Gradient Boosting Machine (LGBM), Multilayer Perceptron (MLP), and Support Vector Machine (SVM). A comprehensive set of performance metrics was used to assess the predictive accuracy, calibration, and clinical applicability of these models. For the best-performing model, we used the SHapley Additive exPlanations (SHAP) method to interpret and visualize the model.
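The 7:3 division described above can be illustrated with a minimal sketch. The abstract does not say whether the split was stratified by outcome, so the stratification here is an assumption, and the function name `stratified_split`, the seed, and the toy labels are hypothetical (they are not the study data or the authors' code).

```python
import random
from collections import defaultdict


def stratified_split(labels, train_frac=0.7, seed=0):
    """Return (train_idx, test_idx) with similar class proportions in both sets.

    Assumption: stratifying by outcome is our choice for illustration;
    the abstract only states a 7:3 ratio.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    train_idx, test_idx = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        cut = round(len(idx) * train_frac)  # 70% of each class goes to training
        train_idx.extend(idx[:cut])
        test_idx.extend(idx[cut:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical labels: 1 = SAE, 0 = non-SAE.
labels = [1] * 50 + [0] * 50
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

Splitting within each class keeps the SAE proportion roughly equal in the training and test sets, which matters when metrics such as precision depend on prevalence.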

RESULTS: Based on strict inclusion and exclusion criteria, a total of 3,156 elderly sepsis patients were enrolled in the study, with an SAE incidence rate of 48.7%. The mortality rate of elderly sepsis patients who developed SAE was significantly higher than that of patients in the non-SAE group (28.78% vs. 12.59%, P < 0.001). A total of 18 feature variables were selected for the construction of the ML models using the LASSO-Boruta combined algorithm. Compared to the other four models and traditional scoring systems, the XGBoost model demonstrated the best overall predictive performance, with Area Under the Curve (AUC) = 0.898, accuracy = 0.830, recall = 0.819, F1 score = 0.820, specificity = 0.840, and precision = 0.821. Furthermore, the results from the Decision Curve Analysis (DCA) and calibration curves demonstrated that the XGBoost model has significant clinical value and stable predictive performance. Ten-fold cross-validation further confirmed the robustness and generalizability of the model. In addition, we simplified the model based on the SHAP feature importance ranking, and the results indicated that the simplified XGBoost model retains excellent predictive ability (AUC = 0.858).
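For readers less familiar with the metrics reported above, the following sketch shows how they are conventionally defined from a confusion matrix, along with a rank-based AUC. The function names and all counts and scores are hypothetical, chosen only for illustration; they are not taken from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    recall = tp / (tp + fn)            # sensitivity: SAE cases correctly flagged
    specificity = tn / (tn + fp)       # non-SAE cases correctly cleared
    precision = tp / (tp + fp)         # flagged cases that truly have SAE
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


def auc_from_scores(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg),
    which is fine for a small illustration.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical counts and scores (not the study's data).
metrics = classification_metrics(tp=80, fp=10, tn=90, fn=20)
auc = auc_from_scores([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
print(metrics, auc)
```

As a consistency check on the reported values, the harmonic mean of precision 0.821 and recall 0.819 is approximately 0.820, which agrees with the stated F1 score.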

CONCLUSIONS: The XGBoost model effectively predicts SAE in elderly ICU patients and may serve as a reliable tool for clinicians to identify high-risk patients.

PMID:40313459 | PMC:PMC12043699 | DOI:10.3389/fcimb.2025.1545979