SHAP-explained machine-learning model for high-risk gastric cancer identification
  • Oh, Hyun Jin
  • Kim, Chung Ho
  • Jun, Jae Kwan
  • Suh, Mina
  • Choi, Kui Son
  • ... Park, Bomi
  • and 1 more

Abstract

Introduction: Gastric cancer (GC) remains a major public health concern in Asia. Risk prediction tailored to regional biological features such as Helicobacter pylori (H. pylori) status and high-risk mucosal findings such as atrophic gastritis (AG) and intestinal metaplasia (IM) may help improve the screening workflow.

Methods: Using a large, real-world, nationwide screening cohort with available endoscopic AG/IM codes, we developed 2-year GC risk prediction models that integrate AG/IM with regional demographic and lifestyle factors. We compared a conventional Cox proportional hazards model (CPHM) with the following machine learning (ML) approaches: extreme gradient boosting (XGBoost), decision tree (DT), and logistic regression (LR). Discrimination and calibration were evaluated through internal and external validations. Model interpretability was assessed using Shapley Additive Explanations (SHAP).

Results: The XGBoost model demonstrated the best overall performance, achieving an AUROC of 0.764 (95% CI, 0.755–0.767), a sensitivity of 0.607 (95% CI, 0.560–0.650), and a specificity of 0.746 (95% CI, 0.744–0.750) in the internal validation. In the external validation cohort, XGBoost also showed the highest discrimination with an AUROC of 0.708 (95% CI, 0.682–0.884), a sensitivity of 0.666 (95% CI, 0.470–0.830), and a specificity of 0.597 (95% CI, 0.590–0.600). SHAP analysis consistently identified Helicobacter pylori infection, age, sex, smoking, and atrophic gastritis/intestinal metaplasia (AG/IM) as the major contributors to increased predicted gastric cancer risk.

Discussion: This externally validated and interpretable short-term GC risk model incorporating endoscopically ascertained AG/IM could provide a practical approach for informing risk-adapted screening workflows. The model could help identify individuals at a higher predicted risk for prospective evaluation and closer clinical review. In addition, SHAP clarifies the main contributors to each prediction by highlighting factors most strongly associated with a higher predicted risk.
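
The discrimination metrics reported in the abstract (AUROC, sensitivity, specificity) can be illustrated with a minimal sketch. This is not the authors' code or pipeline: the labels, scores, and the 0.5 threshold below are invented toy values, and the function names are hypothetical. It only shows how such metrics are commonly defined for a binary outcome.

```python
from typing import Sequence, Tuple


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a randomly chosen case scores higher
    than a randomly chosen non-case (ties count as one half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    y_true: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Sensitivity and specificity when scores >= threshold are flagged high-risk."""
    tp = fn = tn = fp = 0
    for y, s in zip(y_true, scores):
        flagged = s >= threshold
        if y == 1:
            tp += flagged
            fn += not flagged
        else:
            fp += flagged
            tn += not flagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return tp / (tp + fn), tn / (tn + fp)


# Toy data (assumption, not from the study): 1 = developed GC, 0 = did not.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1]

auc = auroc(y_true, scores)
sens, spec = sensitivity_specificity(y_true, scores, threshold=0.5)
print(f"AUROC={auc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

Sensitivity and specificity depend on the chosen risk threshold, whereas AUROC summarizes discrimination across all thresholds; the abstract does not state which threshold the authors used.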

Keywords

atrophic gastritis; gastric cancer; Helicobacter pylori; intestinal metaplasia; risk prediction; Shapley Additive Explanations
Authors
Oh, Hyun Jin; Kim, Chung Ho; Jun, Jae Kwan; Suh, Mina; Choi, Kui Son; Choi, Il Ju; Park, Bomi
DOI
10.3389/fonc.2026.1732072
Publication date
2026-03
Type
Article
Journal
Frontiers in Oncology
16
