Heterogeneity in phenotype, disease progression and drug response in type 2 diabetes

Abstract

Type 2 diabetes (T2D) is a complex chronic disease characterized by considerable phenotypic heterogeneity. In this study, we applied a reverse graph embedding method to routinely collected data from 23,137 Scottish patients with newly diagnosed diabetes to visualize this heterogeneity and used partitioned diabetes polygenic risk scores to gain insight into the underlying biological processes. Overlaying risk of progression to outcomes of insulin requirement, chronic kidney disease, referable diabetic retinopathy and major adverse cardiovascular events, we show how these risks differ by patient phenotype. For example, patients at risk of retinopathy are phenotypically different from those at risk of cardiovascular events. We replicated our findings in the UK Biobank and the ADOPT clinical trial, also showing that the pattern of diabetes drug monotherapy response differs for different drugs. Overall, our analysis highlights how, in a European population, underlying phenotypic variation drives T2D onset and affects subsequent diabetes outcomes and drug response, demonstrating the need to incorporate these factors into personalized treatment approaches for the management of T2D.

Fig. 1: A visual representation of the phenotypic characteristics of 23,137 patients at diagnosis of T2D.
Fig. 2: Visualizing the heterogeneity in diabetes progression in Scottish patients with T2D.
Fig. 3: Visualizing the heterogeneity in diabetes progression in UKBB data.
Fig. 4: Visualizing the heterogeneity in anti-diabetic drug failure using ADOPT trial data.
Fig. 5: Distribution of T2D pPSs across the phenotypic tree.

Data availability

The data that support the findings of this study are from anonymized real-world medical records available through the Scottish Care Information-Diabetes Collaboration, Tayside & Fife, Scotland unit (https://www.sci-diabetes.scot.nhs.uk/). UKBB primary care data are not publicly available but are accessible for research on approval from the UKBB (https://www.ukbiobank.ac.uk/enable-your-research). ADOPT trial data are not publicly available due to governance limitations but are available for research by approval from GlaxoSmithKline.

Code availability

All the R code that supports this analysis is specific to the Scottish Care Information-Diabetes Collaboration data and to the UKBB and ADOPT trial variables; thus, the code and data fields are not made available. The code used to implement the DDRTree (version 0.1.5) algorithm is publicly available in the ‘monocle’ package in the Bioconductor repository (https://www.bioconductor.org/packages/release/bioc/html/monocle.html).

Acknowledgements

We thank all personnel at the Health Informatics Centre for linking different datasets, maintaining all statistical packages and providing the data. The research was supported by the National Institute for Health Research using Official Development Assistance funding (INSPIRED 16/136/102) and Health Data Research UK, which receives its funding from HDR UK Ltd. (HDR-5012), funded by the UK Medical Research Council, the Engineering and Physical Sciences Research Council, the Economic and Social Research Council, the Department of Health and Social Care (England), the Chief Scientist Office of the Scottish Government Health and Social Care Directorates, the Health and Social Care Research and Development Division (Welsh Government), the Public Health Agency (Northern Ireland), the British Heart Foundation and the Wellcome Trust. The views expressed in this publication are those of the authors and not necessarily those of the National Institute for Health Research or the UK Department of Health and Social Care. J.M.D. is supported by an independent fellowship funded by Research England’s Expanding Excellence in England (E3) fund. E.R.P. holds a Wellcome Trust New Investigator Award. This research was funded in whole or in part by the Wellcome Trust (102820/Z/13/Z).

Author information

Contributions

E.R.P., A.W.A., C.B. and A.T.N. conceived and designed the study. A.L.R., S.H., L.D., J.M.D., A.D. and A.T.N. were involved in data preparation and data analysis. E.R.P., L.D. and A.T.N. interpreted the results and wrote the manuscript. A.D., S.G., A.W.A., R.M.A., V.M., C.N.A.P., R.M.C., A.S.F.D., J.M.D., A.T.H. and C.B. provided critical inputs to the revision of the manuscript.

Corresponding author

Correspondence to Ewan R. Pearson.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Medicine thanks Jose Florez, Robert Sladek and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Jennifer Sargent was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Study sample selection flow chart.

Study sample selection flow chart from (a) Tayside and Fife (SCI-DC), (b) UKBB and (c) ADOPT data (data flow for merging all phenotypes).

Extended Data Fig. 2 A visual representation of the phenotypic characteristics of (n = 7332) patients at diagnosis of T2D from UKBB.

A mapping function was used to position individuals in UKBB (n = 7332) onto the Scottish (reference) tree. The phenotype values are overlaid on the tree structure to visualise the distribution of nine phenotypes (HbA1c, BMI, HDL-c, TC, TG, ALT, creatinine, and SBP and DBP) over the reduced tree structure. Each point in the figure represents one individual. The magenta colour of the point indicates a higher value of the phenotypic variable for that individual and the green colour indicates lower values.

Extended Data Fig. 3 A visual representation of the phenotypic characteristics of (n = 4150) patients at diagnosis of T2D from ADOPT data.

A mapping function was used to position individuals in ADOPT (n = 4150) to the Scottish (reference) tree. The phenotype values are overlaid on the tree structure to visualise the distribution of nine phenotypes (HbA1c, BMI, HDL-c, TC, TG, ALT, creatinine, and SBP and DBP) over the reduced tree structure. Each point in the figure represents one individual. The magenta colour of the point indicates a higher value of the phenotypic variable for that individual and the green colour indicates lower values. Grey areas represent Scottish patients not represented in the clinical trial.

Extended Data Fig. 4 Distribution of C-peptide, adiponectin, and leptin over the tree structure.

For a sub-population of Scottish patients, C-peptide (a), adiponectin (b) and leptin (c) measurements were available and are overlaid on the tree. In panels a–c, magenta indicates higher values of the markers and green indicates lower values. Panels d–f show the beta for the linear regression (with 95% CI) against the tree dimensions of C-peptide (d), adiponectin (e) and leptin (f). C-peptide and leptin showed a significant positive correlation with dimension 1 and a negative correlation with dimension 2, whilst adiponectin was significantly positively correlated with dimension 2 and negatively correlated with dimension 1.

Extended Data Fig. 5 Distribution of measured beta-cell function and insulin resistance across the tree.

Panels a and b: A visual representation of the phenotypic characteristics (log HOMA B and log HOMA IR) of n = 4150 patients at diagnosis of T2D from the ADOPT trial. A mapping function was used to position individuals in the ADOPT trial onto the Scottish (reference) tree. Panel c shows the beta for the regression of each phenotype against the tree dimensions with 95% CI (n = 4150). HOMA B showed a positive correlation with dimension 1 and a negative correlation with dimension 2; HOMA IR was likewise positively correlated with dimension 1 and negatively correlated with dimension 2.
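The caption does not state how HOMA B and HOMA IR were derived. As a minimal sketch, assuming the original closed-form HOMA approximations (fasting glucose in mmol/l, fasting insulin in µU/ml) rather than the computer-based HOMA2 model, the log-transformed indices could be computed as follows. The function names are hypothetical and are not taken from the study code.

```python
import math


def homa_ir(fasting_glucose_mmol_l: float, fasting_insulin_uU_ml: float) -> float:
    """Insulin resistance index (assumed HOMA1 formula): glucose * insulin / 22.5."""
    return fasting_glucose_mmol_l * fasting_insulin_uU_ml / 22.5


def homa_b(fasting_glucose_mmol_l: float, fasting_insulin_uU_ml: float) -> float:
    """Beta-cell function index in % (assumed HOMA1 formula): 20 * insulin / (glucose - 3.5)."""
    if fasting_glucose_mmol_l <= 3.5:
        # The approximation is undefined at or below 3.5 mmol/l.
        raise ValueError("fasting glucose must exceed 3.5 mmol/l")
    return 20.0 * fasting_insulin_uU_ml / (fasting_glucose_mmol_l - 3.5)


def log_homa(fasting_glucose_mmol_l: float, fasting_insulin_uU_ml: float) -> tuple[float, float]:
    """Return (log HOMA B, log HOMA IR) using natural logarithms."""
    return (
        math.log(homa_b(fasting_glucose_mmol_l, fasting_insulin_uU_ml)),
        math.log(homa_ir(fasting_glucose_mmol_l, fasting_insulin_uU_ml)),
    )
```

With the reference values of 4.5 mmol/l glucose and 5 µU/ml insulin, these formulas return HOMA IR = 1 and HOMA B = 100%, which is how the original approximations are normalised.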

Extended Data Fig. 6 Distribution of albuminuria at T2D diagnosis over the tree structure.

(a) Albuminuria at T2D diagnosis over the tree structure (n = 15977; individuals with an albumin measurement close to T2D diagnosis). (b) Linear regression estimates (with 95% CI) (n = 15977) between the DDRTree dimensions and albuminuria, showing the association between the phenotype and the dimensions (global Moran’s I: 0.003, p < 0.0001). T2D, type 2 diabetes; DDRTree, Discriminative Dimensionality Reduction via Learning a Tree.
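For readers unfamiliar with the statistic reported here, the following is a minimal sketch of global Moran's I, which measures whether similar values sit close together on the tree. The caption does not say how neighbours were defined, so the binary k-nearest-neighbour weights over the two tree dimensions are an assumption made for illustration, and both function names are hypothetical.

```python
import numpy as np


def global_morans_i(values, weights):
    """Global Moran's I: (n / S0) * sum_ij(w_ij * z_i * z_j) / sum_i(z_i^2)."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()                     # deviations from the mean
    numerator = x.size * (z @ w @ z)     # n * sum_ij w_ij z_i z_j
    denominator = w.sum() * (z @ z)      # S0 * sum_i z_i^2
    return numerator / denominator


def knn_weights(coords, k):
    """Binary k-nearest-neighbour weight matrix (assumed neighbourhood definition)."""
    coords = np.asarray(coords, dtype=float)
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)       # a point is never its own neighbour
    nearest = np.argsort(dist, axis=1)[:, :k]
    w = np.zeros(dist.shape)
    rows = np.repeat(np.arange(len(coords)), k)
    w[rows, nearest.ravel()] = 1.0
    return w
```

Values near 0 indicate little spatial structure, positive values indicate clustering of similar values, and negative values indicate alternation. The dense distance matrix is only practical for small samples; a cohort of this size would need a sparse neighbour search.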

Extended Data Fig. 7 Pattern of distribution of local Moran’s I of phenotype and diabetes outcome event probability.

(a) Local Moran’s I values for nine phenotypes at T2D diagnosis; magenta indicates higher values and green indicates lower values of local Moran’s I. (b) Local Moran’s I values for diabetes progression event probabilities for insulin initiation, CKD, MACE and DR; magenta indicates higher values and green indicates lower values.

Extended Data Fig. 8 Visualizing the heterogeneity in diabetes progression in Scottish patients with T2D.

All predictions are from models with DDRTree dimensions fitted with a spline function. a. Predicted probability of insulin initiation (use of insulin for more than 6 months or a clinical requirement for insulin, indicated as two or more HbA1c readings ≥8.5% more than three months apart while taking two or more oral antihyperglycaemic agents) at 5 years from the diagnosis of T2D. b. Probability of severe or referable (R3/R4) diabetic retinopathy over a 5-year period. c. Probability of incident major adverse cardiac events (identified from SMR based on ICD 9 and ICD 10 codes) over a 5-year period. d. Probability of incident chronic kidney disease (eGFR ≤60 ml/min/1.73 m² on at least 2 readings which were 90 days apart) over a 5-year period. All outcome probabilities (a–d) were generated from a competing risk model constructed with the spline function of DDRTree dimensions.
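The chronic kidney disease definition in panel d can be read as a simple rule over dated eGFR readings. The sketch below is one possible reading, not the authors' code: it assumes "90 days apart" means at least 90 days between any two qualifying readings and does not require the readings to be consecutive. The function name and input layout are hypothetical.

```python
from datetime import date


def meets_ckd_definition(readings, threshold=60.0, min_gap_days=90):
    """Return True if two eGFR readings at or below `threshold` lie >= `min_gap_days` apart.

    `readings` is an iterable of (date, egfr) pairs, egfr in ml/min/1.73 m^2.
    """
    low_dates = sorted(d for d, egfr in readings if egfr <= threshold)
    if len(low_dates) < 2:
        return False
    # If any qualifying pair is far enough apart, the earliest and latest are.
    return (low_dates[-1] - low_dates[0]).days >= min_gap_days


# Usage: two low readings 31 days apart do not qualify; a third at 105 days does.
history = [(date(2020, 1, 1), 55.0), (date(2020, 2, 1), 58.0)]
early = meets_ckd_definition(history)
late = meets_ckd_definition(history + [(date(2020, 4, 15), 50.0)])
```

A stricter reading would also require that no reading above the threshold falls between the two qualifying ones; the caption does not settle this.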

Extended Data Fig. 9 Pattern of distribution of event probabilities derived from competing risk models constructed with continuous variables as covariates.

a. Predicted probability of insulin initiation (use of insulin for more than 6 months or a clinical requirement for insulin, indicated as two or more HbA1c readings ≥8.5% more than three months apart while taking two or more oral antidiabetic agents) over a 5-year period from the diagnosis of T2D (n = 22595). b. Probability of incident diabetic retinopathy (R3/R4) over a 5-year period (n = 22759). c. Probability of incident major adverse cardiac events (identified from SMR and GRO based on ICD 9 and ICD 10 codes) over a 5-year period (n = 18239). d. Probability of incident chronic kidney disease (eGFR ≤60 ml/min/1.73 m² on at least 2 readings which were 90 days apart) over a 5-year period (n = 19956). All outcome probabilities (a–d) were generated from a competing risk model constructed with continuous variables (age at diagnosis, sex, HbA1c, BMI, HDL-C, TG, TC, ALT, BP and creatinine) and the competing risk of death. e. Linear regression estimates (with 95% CI) between the DDRTree dimensions and the four diabetes outcome probabilities. f. Spatial autocorrelation of diabetes outcome probabilities; the Moran’s I statistic is shown on the x-axis, with higher values representing phenotypes that are more strongly autocorrelated; all values have p < 0.001.
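To illustrate what an event probability under a competing risk of death means, here is a minimal sketch of the nonparametric cumulative incidence (Aalen-Johansen) estimator. This is a deliberately simpler, covariate-free technique than the regression models used for these panels, and the event coding (0 = censored, 1 = outcome of interest, 2 = death) and function name are assumptions made for the example.

```python
import numpy as np


def cumulative_incidence(times, events, cause, horizon):
    """Estimate P(event of type `cause` occurs by `horizon`) with competing events.

    times  : follow-up time per individual
    events : 0 = censored, positive integers = event type (assumed coding)
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    event_free = 1.0   # probability of no event of any type before the current time
    cif = 0.0
    for t in np.unique(times[(events != 0) & (times <= horizon)]):
        at_risk = np.sum(times >= t)
        d_cause = np.sum((times == t) & (events == cause))
        d_any = np.sum((times == t) & (events != 0))
        cif += event_free * d_cause / at_risk   # cause-specific increment at t
        event_free *= 1.0 - d_any / at_risk     # update overall event-free survival
    return cif
```

Unlike one minus a Kaplan-Meier estimate that treats deaths as censoring, this estimator does not count individuals who died as if they could still go on to develop the outcome.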

Extended Data Fig. 10

Graphical representation of the complete analysis.

About this article

Cite this article

Nair, A.T.N., Wesolowska-Andersen, A., Brorsson, C. et al. Heterogeneity in phenotype, disease progression and drug response in type 2 diabetes. Nat Med 28, 982–988 (2022). https://doi.org/10.1038/s41591-022-01790-7

  • DOI: https://doi.org/10.1038/s41591-022-01790-7
