Proteomics-Enabled Deep Learning Machine Algorithms Can Enhance Prediction of Mortality

Matthias Unterhuber; Karl-Patrik Kresoja; Karl-Philipp Rommel; Christian Besler; Andrea Baragetti; Nora Klöting; Uta Ceglarek; Matthias Blüher; Markus Scholz; Alberico L Catapano; Holger Thiele; Philipp Lurz

doi:10.1016/j.jacc.2021.08.018

J Am Coll Cardiol. 2021 Oct 19;78(16):1621-1631. doi: 10.1016/j.jacc.2021.08.018.

Affiliations

¹ Department of Cardiology, Heart Center Leipzig at University Leipzig, Leipzig, Germany.
² Department of Pharmacological and Biomolecular Sciences, University of Milan, and I.R.C.C.S MultiMedica, Milan, Italy.
³ Medical Department III - Endocrinology, Nephrology, Rheumatology, University of Leipzig Medical Center, Leipzig, Germany; Helmholtz Institute for Metabolic, Obesity and Vascular Research (HI-MAG) of the Helmholtz, Leipzig, Germany.
⁴ Institute of Laboratory Medicine, Clinical Chemistry and Molecular Diagnostics, Leipzig University, Leipzig, Germany.
⁵ Institute for Medical Informatics, Statistics and Epidemiology, Medical Faculty, University of Leipzig, Leipzig, Germany; LIFE Research Center of Civilization Diseases, Leipzig, Germany.
⁶ Department of Cardiology, Heart Center Leipzig at University Leipzig, Leipzig, Germany. Electronic address: Philipp.Lurz@medizin.uni-leipzig.de.

PMID: 34649700
DOI: 10.1016/j.jacc.2021.08.018

Abstract

Background: Individualized risk prediction represents a prerequisite for providing personalized medicine.

Objectives: This study compared proteomics-enabled machine-learning (ML) algorithms with classical and clinical risk prediction methods for all-cause mortality in cohorts of patients with cardiovascular risk factors in the LIFE-Heart Study, followed by validation in the PLIC (Progressione della Lesione Intimale Carotidea) study.

Methods: Using the OLINK-Cardiovascular-II panel, 92 proteins were measured in a cohort of 1,998 individuals from the LIFE-Heart Study (derivation) and 772 subjects from the PLIC cohort (external validation). We constructed protein-based mortality prediction models using eXtreme Gradient Boosting (XGBoost) and a neural network, and compared their predictive performance with classical clinical risk scores (Systematic COronary Risk Evaluation [SCORE], Framingham) and with logistic and Cox regression models.

Results: All-cause mortality occurred in 156 (8%) patients in the internal validation cohort and 68 (9%) patients in the external validation cohort, over median follow-ups of 10 and 11 years, respectively. On internal and external validation, respectively, the classical approaches achieved areas under the curve (AUCs) of 0.64 (95% CI: 0.59-0.68) and 0.65 (95% CI: 0.58-0.74) for the Framingham Risk Score, 0.65 (95% CI: 0.57-0.73) and 0.67 (95% CI: 0.59-0.74) for logistic regression, and 0.55 (95% CI: 0.51-0.59) and 0.65 (95% CI: 0.57-0.73) for Cox regression. The ML models achieved AUCs of 0.83 (95% CI: 0.79-0.87) and 0.91 (95% CI: 0.86-0.95) for the XGBoost classifier, 0.83 (95% CI: 0.79-0.87) and 0.93 (95% CI: 0.88-0.97) for the XGBoost survival estimator, and 0.87 (95% CI: 0.83-0.91) and 0.94 (95% CI: 0.90-0.98) for the neural network (modern vs classical ML: P < 0.001).
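
For readers unfamiliar with the metric, the following minimal sketch shows one common way to compute an AUC with a 95% CI for a binary mortality outcome. It is an illustration only. The abstract does not state how the CIs were derived, so the percentile bootstrap used here is an assumption. The function names (`auc`, `bootstrap_auc_ci`) are hypothetical, and the data are synthetic, not taken from LIFE-Heart or PLIC.

```python
import random


def auc(labels, scores):
    """AUC via the rank-sum (Mann-Whitney U) identity; tied scores get average ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of scores tied with position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one event and one non-event")
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Point AUC plus a percentile-bootstrap CI (assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample patients with replacement
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that lack events or non-events
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return auc(labels, scores), lo, hi


# Synthetic example: about 10% events, with higher risk scores among events.
_rng = random.Random(42)
demo_labels = [1 if _rng.random() < 0.10 else 0 for _ in range(500)]
demo_scores = [_rng.gauss(1.5 if y else 0.0, 1.0) for y in demo_labels]
point, ci_lo, ci_hi = bootstrap_auc_ci(demo_labels, demo_scores, n_boot=500)
print(f"AUC {point:.2f} (95% CI: {ci_lo:.2f}-{ci_hi:.2f})")
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of patients who died from those who survived. Note that this treats mortality as a binary outcome; time-to-event models such as Cox regression are often evaluated with time-dependent variants of the AUC instead.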

Conclusions: ML-driven multiprotein risk models outperform classical regression models and clinical scores for prediction of all-cause mortality in patients at increased cardiovascular risk.

Keywords: deep learning; machine learning; mortality prediction; proteomics; risk score.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Aged
Algorithms*
Cardiovascular Diseases / mortality*
Cohort Studies
Deep Learning*
Female
Follow-Up Studies
Humans
Male
Middle Aged
Models, Cardiovascular
Neural Networks, Computer
Proteomics*
Risk Assessment*