Proteomics-Enabled Deep Learning Machine Algorithms Can Enhance Prediction of Mortality

J Am Coll Cardiol. 2021 Oct 19;78(16):1621-1631. doi: 10.1016/j.jacc.2021.08.018.

Abstract

Background: Individualized risk prediction represents a prerequisite for providing personalized medicine.

Objectives: This study compared proteomics-enabled machine-learning (ML) algorithms with classical and clinical risk prediction methods for all-cause mortality in cohorts of patients with cardiovascular risk factors in the LIFE-Heart Study, followed by validation in the PLIC (Progressione della Lesione Intimale Carotidea) study.

Methods: Using the OLINK-Cardiovascular-II panel, 92 proteins were measured in a cohort of 1,998 individuals from the LIFE-Heart Study (derivation) and 772 subjects from the PLIC cohort (external validation). We constructed protein-based mortality prediction models using eXtreme Gradient Boosting (XGBoost) and a neural network, and compared their predictive performance with that of classical clinical risk scores (Systematic Coronary Risk Evaluation, Framingham) and of logistic and Cox regression models.

Results: All-cause mortality occurred in 156 (8%) patients in the internal validation and 68 (9%) patients in the external validation cohort, within a median follow-up of 10 and 11 years, respectively. On internal and external validation, respectively, the areas under the curve (AUCs) were 0.64 (95% CI: 0.59-0.68) and 0.65 (95% CI: 0.58-0.74) for the Framingham Risk Score; 0.65 (95% CI: 0.57-0.73) and 0.67 (95% CI: 0.59-0.74) for logistic regression; 0.55 (95% CI: 0.51-0.59) and 0.65 (95% CI: 0.57-0.73) for Cox regression; 0.83 (95% CI: 0.79-0.87) and 0.91 (95% CI: 0.86-0.95) for the XGBoost classifier; 0.83 (95% CI: 0.79-0.87) and 0.93 (95% CI: 0.88-0.97) for the XGBoost survival estimator; and 0.87 (95% CI: 0.83-0.91) and 0.94 (95% CI: 0.90-0.98) for the neural network (modern vs classical ML: P < 0.001).
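
For readers unfamiliar with how an AUC and its 95% CI can be obtained from predicted risks, the following is a minimal Python sketch. It is an illustration only: the abstract does not state how the intervals were computed, so the percentile bootstrap used here is an assumption, the function names auc and bootstrap_auc_ci are hypothetical, and the labels and scores are made-up toy values rather than study data.

```python
import random


def auc(labels, scores):
    """AUC as the Mann-Whitney statistic: the fraction of (event, non-event)
    pairs in which the event has the higher score; ties count one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the AUC (an assumed
    method, not necessarily the one used in the study)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        # Resample individuals with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples that contain only one outcome class.
        if 0 < sum(ys) < n:
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy example: 1 = died during follow-up, 0 = survived (made-up values).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(toy_labels, toy_scores)
toy_ci = bootstrap_auc_ci(toy_labels, toy_scores, n_boot=500)
```

With so few observations the toy interval is very wide; in cohorts of the size reported here, the same procedure would yield much narrower intervals.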

Conclusions: ML-driven multiprotein risk models outperform classical regression models and clinical scores for prediction of all-cause mortality in patients at increased cardiovascular risk.

Keywords: deep learning; machine learning; mortality prediction; proteomics; risk score.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Aged
  • Algorithms*
  • Cardiovascular Diseases / mortality*
  • Cohort Studies
  • Deep Learning*
  • Female
  • Follow-Up Studies
  • Humans
  • Male
  • Middle Aged
  • Models, Cardiovascular
  • Neural Networks, Computer
  • Proteomics*
  • Risk Assessment*