Estimation of low-density lipoprotein cholesterol levels using machine learning

Int J Cardiol. 2022 Apr 1:352:144-149. doi: 10.1016/j.ijcard.2022.01.029. Epub 2022 Jan 20.

Abstract

Background: Low-density lipoprotein cholesterol (LDL-C) serves as both a treatment threshold and a treatment target in dyslipidemia. Although the Friedewald equation is widely used to estimate LDL-C, it is known to be inaccurate when triglyceride (TG) levels are high or in non-fasting states. We aimed to propose a novel method for estimating LDL-C using machine learning.
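
For reference, the Friedewald equation estimates LDL-C as total cholesterol minus HDL-C minus TG/5, with all values in mg/dL. The sketch below is a minimal illustration of that formula. The function name, argument names, and the `tg_limit` guard are assumptions made for this example and do not come from the abstract; the 400 mg/dL default reflects the TG level above which the equation is conventionally not applied.

```python
def friedewald_ldl_c(total_chol, hdl_c, tg, tg_limit=400.0):
    """Estimate LDL-C in mg/dL with the Friedewald equation.

    LDL-C = TC - HDL-C - TG / 5, all inputs in mg/dL.
    The tg_limit guard is an illustrative assumption for this sketch.
    """
    if tg > tg_limit:
        # The equation is conventionally considered unreliable at high TG.
        raise ValueError("TG exceeds the limit assumed for the Friedewald equation")
    return total_chol - hdl_c - tg / 5.0


if __name__ == "__main__":
    # Hypothetical lipid profile, not taken from the study data.
    print(friedewald_ldl_c(total_chol=200.0, hdl_c=50.0, tg=150.0))
```

Because the TG/5 term is a fixed stand-in for very-low-density lipoprotein cholesterol, the estimate degrades as TG rises, which is the limitation the study sets out to address.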

Methods: Using a large, single-center electronic health record database, we derived a machine learning (ML) algorithm to estimate LDL-C from standard lipid profiles. Of 1,029,572 cases with both a standard lipid profile (total cholesterol, high-density lipoprotein cholesterol, and TG) and a direct LDL-C measurement, 823,657 tests were used to derive the LDL-C estimation models. Patient characteristics such as sex, age, height, and weight, along with other laboratory values, were additionally used to create separate data sets and algorithms.

Results: Gradient boosting (LDL-CX) and neural network (LDL-CN) models correlated better with directly measured LDL-C than conventional methods did (r = 0.9662, 0.9668, 0.9563, and 0.9585 for LDL-CX, LDL-CN, the Friedewald equation [LDL-CF], and the Martin equation [LDL-CM], respectively). The overall biases of LDL-CX (-0.27 mg/dL, 95% CI -0.30 to -0.23) and LDL-CN (-0.01 mg/dL, 95% CI -0.04 to 0.03) were significantly smaller than those of LDL-CF (-3.80 mg/dL, 95% CI -3.80 to -3.60) and LDL-CM (-2.00 mg/dL, 95% CI -2.00 to -1.94), especially at high TG levels.
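
The two agreement metrics reported here can be illustrated with a short sketch. It assumes that bias means the mean of (estimated minus directly measured) LDL-C and that the 95% CI is a normal-approximation interval around that mean; the abstract does not state either detail, so both are assumptions, as are the function names and the sample values.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


def bias_with_ci(estimated, measured, z=1.96):
    """Mean difference (estimated - measured) with an approximate 95% CI.

    Assumption: normal-approximation interval, mean +/- z * SD / sqrt(n).
    """
    diffs = [e - m for e, m in zip(estimated, measured)]
    mean_diff = statistics.fmean(diffs)
    half_width = z * statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff, mean_diff - half_width, mean_diff + half_width


if __name__ == "__main__":
    # Hypothetical values in mg/dL, not taken from the study data.
    measured = [100.0, 120.0, 140.0, 160.0]
    estimated = [98.0, 119.0, 137.0, 160.0]
    print(pearson_r(estimated, measured))
    print(bias_with_ci(estimated, measured))
```

Under this sign convention, a negative bias means the method underestimates directly measured LDL-C on average.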

Conclusions: Machine learning algorithms estimated LDL-C more accurately than the conventional Friedewald equation or the more contemporary Martin equation. Through external validation and modification, machine learning could be incorporated into electronic health records as a substitute method for LDL-C estimation.

Keywords: Cost-effectiveness; Hypercholesterolemia; Low-density lipoprotein cholesterol; Machine-learning; Triglycerides.

MeSH terms

  • Algorithms
  • Cholesterol, HDL
  • Cholesterol, LDL / analysis*
  • Dyslipidemias / diagnosis*
  • Humans
  • Machine Learning*
  • Triglycerides

Substances

  • Cholesterol, HDL
  • Cholesterol, LDL
  • Triglycerides