Clustering by patients-related characteristics identifies specific phenotypes with different prognosis after resection of non-small cell lung cancer: a French nationwide study from the Epithor database
Thorax. 2026 Apr 17:thorax-2025-224424. doi: 10.1136/thorax-2025-224424. Online ahead of print.
ABSTRACT
INTRODUCTION: Lung cancer remains the leading cause of cancer mortality worldwide despite advances in treatment. Patient-related factors beyond tumour characteristics may influence prognosis but are seldom integrated into staging or risk stratification.
METHODS: We analysed data from 59 101 patients with resectable non-small cell lung cancer (NSCLC) in the French nationwide Epithor database. Using an unsupervised machine learning approach, we applied principal component analysis followed by k-means clustering to 13 patient-specific clinical variables, excluding tumour and treatment features, to identify distinct patient phenotypes. Survival differences were evaluated with Kaplan-Meier analysis and Cox proportional hazards models.
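The dimensionality reduction and clustering step described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the data are synthetic, and the standardisation step, the 80% explained-variance threshold and the random seeds are assumptions, since the abstract does not report them. Only the number of variables (13) and the number of clusters (5) come from the abstract. Real patient variables would be of mixed type (for example sex or performance status) and would need an encoding choice that is not shown here.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for 13 patient-level variables (NOT Epithor data).
rng = np.random.default_rng(0)
n_patients, n_variables = 1000, 13
X = rng.normal(size=(n_patients, n_variables))

# Assumed preprocessing: standardise each variable, then keep enough
# principal components to explain 80% of the variance (assumed threshold),
# then run k-means with the five clusters reported in the abstract.
pipeline = make_pipeline(
    StandardScaler(),
    PCA(n_components=0.80, svd_solver="full"),
    KMeans(n_clusters=5, n_init=10, random_state=0),
)

# One cluster label (0 to 4) per patient.
labels = pipeline.fit_predict(X)

n_components = pipeline.named_steps["pca"].n_components_
cluster_sizes = np.bincount(labels, minlength=5)
print(f"components kept: {n_components}")
print(f"cluster sizes: {cluster_sizes.tolist()}")
```

Note that k-means label numbers are arbitrary, so ordering clusters by risk (as with the reference Cluster 0 in the results) would be a separate step based on observed outcomes.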
RESULTS: Five distinct patient phenotypes emerged, reflecting differing distributions of sex, age, pulmonary function, comorbidity and performance status. These phenotypes also showed varied distributions of pathological stage and histology, confirming their clinical relevance. Survival outcomes differed significantly between clusters, with 5-year overall survival ranging from 76.6% (95% CI 75.8% to 77.3%) to 53.8% (95% CI 52.9% to 54.6%). Using the lowest-risk cluster (Cluster 0) as reference, crude HRs for mortality were significantly higher in every other cluster: Cluster 1 (HR 1.48 (95% CI 1.42 to 1.54)), Cluster 2 (HR 1.94 (95% CI 1.84 to 2.03)), Cluster 3 (HR 2.45 (95% CI 2.36 to 2.55)) and Cluster 4 (HR 1.95 (95% CI 1.87 to 2.04)). Cluster assignment remained an independent predictor of survival after adjustment for usual confounders (p<0.001).
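As a reading aid for the crude HRs above, the sketch below recovers an approximate standard error of each log HR from its reported 95% CI and derives a Wald z statistic. It assumes the intervals are Wald-type, that is symmetric on the log scale, which the abstract does not state. The function name `wald_summary` is introduced here for illustration, and the rounding of the published values limits the precision of the result.

```python
import math

# Crude hazard ratios and 95% CIs as reported in the abstract
# (reference: Cluster 0). Tuple order: (HR, lower bound, upper bound).
reported = {
    1: (1.48, 1.42, 1.54),
    2: (1.94, 1.84, 2.03),
    3: (2.45, 2.36, 2.55),
    4: (1.95, 1.87, 2.04),
}

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal


def wald_summary(hr, lower, upper):
    """Return (log HR, approximate SE of log HR, Wald z).

    Assumes the CI is symmetric on the log scale (an assumption,
    not something the abstract confirms).
    """
    log_hr = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    return log_hr, se, log_hr / se


for cluster, (hr, lower, upper) in reported.items():
    log_hr, se, z = wald_summary(hr, lower, upper)
    print(f"Cluster {cluster}: log HR = {log_hr:.3f}, SE = {se:.4f}, z = {z:.1f}")
```

Under this assumption, a z statistic well above 1.96 corresponds to a CI that excludes 1, which matches the intervals reported for all four non-reference clusters.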
CONCLUSIONS: Phenotype-based clustering of non-tumour patient characteristics identifies clinically meaningful subgroups with distinct survival patterns after NSCLC surgery. Incorporating these phenotypes into preoperative evaluation may enhance risk stratification and support personalised treatment strategies alongside traditional tumour staging.
PMID:41997853 | DOI:10.1136/thorax-2025-224424