PREDICTION OF VENTILATOR-ASSOCIATED PNEUMONIA IN PATIENTS WITH ACUTE RESPIRATORY DISTRESS SYNDROME USING INTERPRETABLE MACHINE LEARNING ALGORITHMS

Authors

  • Wenwen Li, First Affiliated Hospital of Dalian Medical University, Dalian Medical University, Dalian 116011, Liaoning, China
  • Chengming Ma, First Affiliated Hospital of Dalian Medical University, Dalian Medical University, Dalian 116011, Liaoning, China
  • Shanshan Liu, First Affiliated Hospital of Dalian Medical University, Dalian Medical University, Dalian 116011, Liaoning, China
  • Xianyao Wan, First Affiliated Hospital of Dalian Medical University, Dalian Medical University, Dalian 116011, Liaoning, China

Keywords:

Machine Learning Algorithms; Ventilator-Associated Pneumonia; Acute Respiratory Distress Syndrome; Receiver Operating Characteristic; Shapley Additive Explanation

Abstract

Objective: This study aims to establish and validate machine learning models for predicting ventilator-associated pneumonia (VAP) in patients with acute respiratory distress syndrome (ARDS). Methods: A total of 2,702 patients diagnosed with ARDS from the Medical Information Mart for Intensive Care IV (MIMIC-IV) database were included in this retrospective cohort study. The primary outcome was the development of VAP. The total dataset was randomly split into a training set (80%) and a testing set (20%). Predictors were selected in the training set using a combination of the Pearson correlation coefficient method, the Mann-Whitney U test, and least absolute shrinkage and selection operator (LASSO) analysis. Based on the selected predictors, logistic regression (LR), support vector machine (SVM), random forest (RF), extreme gradient boosting (XGB), light gradient boosting machine (LightGBM), and CatBoost models were developed. The receiver operating characteristic (ROC) curve was used to assess the predictive performance of the models. The Shapley additive explanation (SHAP) method was used to interpret the model. Results: A total of 2,702 patients with ARDS were randomly divided into a training set (N=2,161) and a testing set (N=541). Systolic blood pressure, diastolic blood pressure, respiratory rate, positive end-expiratory pressure, simplified acute physiology score (SAPS II), ventilation duration, platelet count, sodium, gender, ICU type, trauma injury, and vasopressor use were identified as predictors. In the testing set, the area under the curve (AUC) of the CatBoost model (0.715) was higher than that of the LR (0.666), SVM (0.556), RF (0.697), XGB (0.676), and LightGBM (0.683) models. The SHAP method also indicated that ventilation duration had the highest predictive value across all prediction horizons. An increase in ventilation duration has a positive effect and pushes the prediction toward VAP. Conclusion: The CatBoost model can serve as a reliable tool for predicting VAP risk in patients with ARDS.
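
The abstract does not state which software the authors used, so the following is only a minimal sketch of two steps it describes: a random 80%/20% split and the area under the ROC curve. It uses the Python standard library alone. The function names `split_indices` and `roc_auc`, the seed, and the toy labels and scores are all assumptions made for illustration; they are not taken from the study and do not reproduce its results.

```python
import math
import random


def split_indices(n, test_fraction=0.2, seed=0):
    """Randomly split the indices 0..n-1 into (train, test) lists.

    Hypothetical helper: the test size is rounded up, and the seed is
    an arbitrary choice for reproducibility of this sketch only.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = math.ceil(n * test_fraction)
    return indices[n_test:], indices[:n_test]


def roc_auc(labels, scores):
    """Area under the ROC curve, computed by pairwise comparison.

    The AUC equals the probability that a randomly chosen positive
    case receives a higher score than a randomly chosen negative
    case, with ties counted as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Splitting 2,702 records 80/20 gives the set sizes reported in the abstract.
train_idx, test_idx = split_indices(2702)
print(len(train_idx), len(test_idx))

# Toy labels (1 = VAP, 0 = no VAP) and toy model scores, not study data.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))
```

Under this reading, an AUC of 0.715 means that the model ranks a randomly chosen patient who developed VAP above a randomly chosen patient who did not in roughly 71.5% of such pairs, while 0.5 corresponds to chance-level ranking.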

Published

2025-02-06