Multistain deep learning for prediction of prognosis and therapy response in colorectal cancer

Abstract

Although it has long been known that the immune cell composition has a strong prognostic and predictive value in colorectal cancer (CRC), scoring systems such as the immunoscore (IS) or quantification of intraepithelial lymphocytes are only slowly being adopted into clinical routine use and have their limitations. To address this we established and evaluated a multistain deep learning model (MSDLM) utilizing artificial intelligence (AI) to determine the AImmunoscore (AIS) in more than 1,000 patients with CRC. Our model had high prognostic capabilities and outperformed other clinical, molecular and immune cell-based parameters. It could also be used to predict the response to neoadjuvant therapy in patients with rectal cancer. Using an explainable AI approach, we confirmed that the MSDLM’s decisions were based on established cellular patterns of anti-tumor immunity. Hence, the AIS could provide clinicians with a valuable decision-making tool based on the tumor immune microenvironment.

Fig. 1: Clinical characteristics and CONSORT diagrams for the prognostic cohorts.
Fig. 2: Clinical characteristics and CONSORT diagrams for the neoadjuvant cohort.
Fig. 3: Training and cross-validation of the MSDLM.
Fig. 4: Determination and performance of the AIS.
Fig. 5: Assessment of the MSDLM using xAI.
Fig. 6: Predictive performance of the MSDLM in rectal cancer.

Data availability

No publicly available datasets were used in this study. The datasets and models were generated from institutional cohorts (Technical University Munich cohort, Comprehensive Cancer Centre Erlangen-EMN cohort, Mainz cohort, and the neoadjuvant cohort consisting of FFPE material in the form of TMAs) and cannot be made publicly available due to general data protection regulations and institutional guidelines. Example data from another use case (for trying out our approach) is available at https://zenodo.org/record/6791937.

Code availability

An open source version of the code base is available at https://github.com/AGFoersch/MultiStainDeepLearning. For more information please contact the corresponding author at sebastian.foersch@unimedizin-mainz.de.

Acknowledgements

S.F. was supported by the Federal Ministry of Education and Research (16SV8167), the Stage-I-Program of the University Medical Center Mainz, the Mainz Research School of Translational Biomedicine (TransMed) and the Manfred-Stolte-Foundation. J.N.K. is supported by the German Federal Ministry of Health (DEEP LIVER, ZMVI1-2520DAT111) and the Max-Eder-Program of the German Cancer Aid (grant 70113864). Aspects of this work are part of the medical doctoral theses of S.S., F.K. and K.T.

Author information

Contributions

Conception and design: S.F., C. Glasner, A.-C.W., D.-C.W.; acquisition of data: S.F., M.E., D.-C.W., S.S., F.K., K.T., C. Geppert, M.J.; analysis and interpretation of data: S.F., C. Glasner, D.-C.W., M.E., M.J.; drafting of the manuscript: S.F.; critical revision of the manuscript for important intellectual content: C. Glasner, M.E., D.-C.W., J.N.K., M.J.; statistical analysis: S.F., C. Glasner, A.-C.W.; obtaining funding: S.F., W.R.; administrative, technical or material support: A.F., A. Hartmann, A. Heintz, W.W., M.K., C. Geppert, M.J.; supervision: S.F., W.R., M.J.

Corresponding author

Correspondence to Sebastian Foersch.

Ethics declarations

Competing interests

J.N.K. reports consulting services for Owkin (France), Panakeia (UK), and DoMore Diagnostics (Norway) and has received honoraria for lectures by MSD, Eisai, and Fresenius. The other authors declare no competing interests.

Peer review

Peer review information

Nature Medicine thanks Tae Hyun Hwang, Timothy Maughan and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editors: Ulrike Harjes and Saheli Sadanand, in collaboration with the Nature Medicine team.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1

a: Scatter plots of CD3-, CD8-, CD4-, CD20-, and CD68-positive immune cell counts derived from whole slide images and tissue microarrays. Spearman’s correlation coefficients and p-values from two-sided testing are given. Note that for all markers but CD3 the scale is logarithmic. The line indicates a linear regression fit with its 95% confidence interval. b: Confusion matrices of the IS 2 and IS 3 derived from both TMA and WSI. Results of Fisher’s exact test and the χ² test are provided. c: Scatter plots of MLH1, MSH2, MSH6, and PMS2 expression derived from whole slide images and tissue microarrays. Spearman’s correlation coefficients and p-values from two-sided testing are given. The line indicates a linear regression fit with its 95% confidence interval. n = 51 for all stainings (except PMS2, where n = 45). (WSI: whole slide image. TMA: tissue microarray.)
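The agreement measure named in this caption, Spearman’s correlation, can be illustrated with a minimal sketch. The function names and the example counts below are hypothetical and are not taken from the study; the sketch only shows how a rank correlation between paired WSI-derived and TMA-derived cell counts could be computed, with tied values given their average rank. It does not compute the p-values mentioned in the caption.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho is the Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical paired counts for one marker (illustration only).
wsi_counts = [120, 450, 80, 900, 300]
tma_counts = [15, 40, 9, 95, 41]
rho = spearman_rho(wsi_counts, tma_counts)
```

Because only ranks enter the calculation, the result is unchanged by the logarithmic axis scaling mentioned in the caption.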

Extended Data Fig. 2

a: Procedure of the TMA generation for the TUM–CCC-EMN cohort. For the Erlangen cohort, three cores from the invasive margin and three cores from the tumor center were used. For the Munich cohort, one core from the invasive margin and one core from the tumor center were used. b: Procedure of the TMA generation for the Mainz cohort. Three TMA cores were taken according to the ratio of invasive margin to tumor center. c, d: Further preprocessing included serial sectioning, immunohistochemistry, and preprocessing of the TMA cores and the tiles. Scale bars on the left subpanel represent ca. 200 µm. Scale bars on the right represent ca. 100 µm. e: Examples of excluded cores. Scale bars represent ca. 200 µm. (TMA: tissue microarray. NAT: normal adjacent tissue (not used in this study).) Some illustrations were generated with BioRender.com.

Extended Data Fig. 3 Overview of the SSDLM (Single-stain deep learning model).

Arrow labels indicate each component’s output dimensions. Dim, mmhid, num_classes, etc. are defined model parameters. Fc(x, y) represents a fully connected layer with the input dimension x and the output dimension y.

Extended Data Fig. 4 Overview of the MSDLM (Multistain deep learning model).

Arrow labels indicate each component’s output dimensions. Dim, mmhid, num_classes, etc. are defined model parameters. Fc(x, y) represents a fully connected layer with the input dimension x and the output dimension y. Subblocks 1, … n-1 are structured identically to subblock 0.
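The following is a rough sketch of how the dimensions described in this caption could fit together. It is not the authors' implementation. It assumes that each stain arrives as a pre-extracted feature vector of length dim, that each subblock reduces to a single Fc(dim, mmhid) layer with a ReLU, and that the per-stain outputs are fused by concatenation before a final Fc layer to num_classes. The class names Fc and MultiStainSketch, the n_stains argument, and the random weights are illustrative choices only.

```python
import random


class Fc:
    """Fc(x, y): fully connected layer mapping a length-x vector to length y."""

    def __init__(self, in_dim, out_dim, rng):
        self.in_dim, self.out_dim = in_dim, out_dim
        # Small random weights; a trained model would learn these.
        self.weight = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
                       for _ in range(out_dim)]
        self.bias = [0.0] * out_dim

    def __call__(self, x):
        assert len(x) == self.in_dim, "input dimension mismatch"
        return [sum(w * v for w, v in zip(row, x)) + b
                for row, b in zip(self.weight, self.bias)]


def relu(vector):
    return [max(0.0, v) for v in vector]


class MultiStainSketch:
    """Hypothetical multistain model: one subblock per stain, then fusion."""

    def __init__(self, n_stains, dim, mmhid, num_classes, seed=0):
        rng = random.Random(seed)
        # Subblocks 0 .. n-1 are structured identically.
        self.subblocks = [Fc(dim, mmhid, rng) for _ in range(n_stains)]
        # Assumption: fusion by concatenating the per-stain outputs.
        self.head = Fc(n_stains * mmhid, num_classes, rng)

    def __call__(self, stain_features):
        assert len(stain_features) == len(self.subblocks)
        fused = []
        for block, features in zip(self.subblocks, stain_features):
            fused.extend(relu(block(features)))
        return self.head(fused)


model = MultiStainSketch(n_stains=3, dim=8, mmhid=4, num_classes=2)
logits = model([[1.0] * 8 for _ in range(3)])
```

Under these assumptions the fused vector has length n_stains × mmhid, which is the input dimension of the final layer.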

Extended Data Fig. 5

a–d: Accuracy (a), AUPRC (b), AUROC (c), and F1-score (d) of the MSDLM and different classical machine learning techniques after training and validation on the Mainz cohort. n = 11 models trained during 11-fold cross-validation per group. One-way ANOVA with Dunnett’s test to correct for multiple testing was used. Statistical significance is indicated by asterisks as described in the M&M section for the comparison to the MSDLM (p > 0.05: ns, p ≤ 0.05: *, p ≤ 0.01: **, p ≤ 0.001: ***, p ≤ 0.0001: ****). The 10th, 50th (median), and 90th quantiles as well as the minimum and maximum are depicted. e, f: Precision-recall and receiver operating characteristic curves of the MSDLM and the classical machine learning techniques. The mean of the 11-fold cross-validation is shown. Shaded area indicates 1 standard deviation (std. dev.). (AUPRC: area under the precision-recall curve. AUROC: area under the receiver operating characteristic curve. MSDLM: multistain deep learning model. SVM: support vector machine classifier. RF: random forest classifier. LR: logistic regression classifier. GB: gradient boosted (decision tree) classifier.)
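As an illustration of three of the metrics compared in this figure, the sketch below computes accuracy, F1-score, and AUROC for a binary classifier. The labels, scores, and the 0.5 decision threshold are hypothetical and not taken from the study. AUROC is computed here through its pairwise interpretation (the fraction of positive/negative pairs in which the positive case receives the higher score, with ties counted as one half). AUPRC is left out to keep the example short.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auroc(y_true, scores):
    """Probability that a random positive scores above a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and predicted scores (illustration only).
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
auc = auroc(y_true, scores)
```

In a cross-validation setting such as the one described, these functions would be applied once per fold, giving one value per trained model for each metric.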

Extended Data Fig. 6

a: Kaplan–Meier curves of the test cohort for each UICC stage stratified by AImmunoscore (AIS). N = 61 for UICC stage I, n = 128 for UICC stage II, n = 114 for UICC stage III, and n = 36 for UICC stage IV. Censors are indicated with a ‘+’. Log-rank test was used.
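For readers unfamiliar with how such curves are built, here is a minimal sketch of the Kaplan–Meier product-limit estimator. The follow-up times and event flags are made up for illustration, and the function name kaplan_meier is a hypothetical choice. To stratify by AIS as in the figure, one would call the function separately on each AIS group. The log-rank test is not implemented here.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    events[i] is 1 if the event was observed at times[i] and 0 if the
    observation was censored at that time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0   # events at time t
        leaving = 0    # events plus censored observations at time t
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            leaving += 1
            i += 1
        if observed > 0:
            # Product-limit step: multiply by the fraction surviving t.
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up times (months) and event indicators.
times = [5, 8, 8, 12, 15, 20]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
```

Censored observations (the ‘+’ marks in the figure) do not lower the curve themselves; they only reduce the number at risk for later time points.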

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Foersch, S., Glasner, C., Woerl, AC. et al. Multistain deep learning for prediction of prognosis and therapy response in colorectal cancer. Nat Med 29, 430–439 (2023). https://doi.org/10.1038/s41591-022-02134-1

  • DOI: https://doi.org/10.1038/s41591-022-02134-1
