The role of exome sequencing in newborn screening for inborn errors of metabolism

Adhikari, Aashish N.; Gallagher, Renata C.; Wang, Yaqiong; Currier, Robert J.; Amatuni, George; Bassaganyas, Laia; Chen, Flavia; Kundu, Kunal; Kvale, Mark; Mooney, Sean D.; Nussbaum, Robert L.; Randi, Savanna S.; Sanford, Jeremy; Shieh, Joseph T.; Srinivasan, Rajgopal; Sunderam, Uma; Tang, Hao; Vaka, Dedeepya; Zou, Yangyun; Koenig, Barbara A.; Kwok, Pui-Yan; Risch, Neil; Puck, Jennifer M.; Brenner, Steven E.

doi:10.1038/s41591-020-0966-5

Letter
Published: 10 August 2020

Nature Medicine volume 26, pages 1392–1397 (2020)

Abstract

Public health newborn screening (NBS) programs provide population-scale ascertainment of rare, treatable conditions that require urgent intervention. Tandem mass spectrometry (MS/MS) is currently used to screen newborns for a panel of rare inborn errors of metabolism (IEMs)^1,2,3,4. The NBSeq project evaluated whole-exome sequencing (WES) as an innovative methodology for NBS. We obtained archived residual dried blood spots and data for nearly all IEM cases from the 4.5 million infants born in California between mid-2005 and 2013 and from some infants who screened positive by MS/MS but were unaffected upon follow-up testing. WES had an overall sensitivity of 88% and specificity of 98.4%, compared to 99.0% and 99.8%, respectively, for MS/MS, although effectiveness varied among individual IEMs. Thus, WES alone was insufficiently sensitive or specific to be a primary screen for most NBS IEMs. However, as a secondary test for infants with abnormal MS/MS screens, WES could reduce false-positive results, facilitate timely case resolution and in some instances even suggest a more appropriate or specific diagnosis than that initially obtained. This study represents the largest sequencing effort to date of an entire population of IEM-affected cases, allowing unbiased assessment of current capabilities of WES as a tool for population screening.
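
To see why a specificity of 98.4% can be too low for a primary screen of rare conditions, the following minimal sketch applies Bayes' rule to the sensitivity and specificity figures quoted above. The prevalence of 1 in 3,000 births and the function name are assumptions chosen purely for illustration; they are not values reported in the study, and treating the reported figures as if they applied to an unselected birth population is itself a simplification.

```python
def positive_predictive_value(sensitivity: float, specificity: float,
                              prevalence: float) -> float:
    """Bayes' rule: probability of being affected given a positive screen."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


# ASSUMPTION: illustrative combined IEM prevalence, not taken from the paper.
ASSUMED_PREVALENCE = 1 / 3000

# Sensitivity and specificity as quoted in the abstract.
ppv_wes = positive_predictive_value(0.88, 0.984, ASSUMED_PREVALENCE)
ppv_msms = positive_predictive_value(0.990, 0.998, ASSUMED_PREVALENCE)

print(f"WES   PPV at assumed prevalence: {ppv_wes:.3f}")
print(f"MS/MS PPV at assumed prevalence: {ppv_msms:.3f}")
```

Under this assumed prevalence, the false-positive term dominates the denominator for both tests, so the difference in specificity drives most of the difference in PPV.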

**Fig. 1: Low positive predictive value and complex differential diagnoses of MS/MS newborn screening for glutaric acidemia.**

**Fig. 2: Whole-exome pipeline design and analysis.**

Data availability

The de-identified residual DBS from the California Biobank for this project (SIS request number 496) were obtained with a waiver of consent from the Committee for the Protection of Human Subjects of the State of California, under project no. 14-07-1650 and in compliance with the California Department of Public Health (CDPH) Biospecimen/Data Use and Confidentiality Agreement. California blood specimens and any data derived from the newborn screening program are confidential and subject to strict administrative, physical and technical protections. California law precludes any researcher from sharing blood specimens or uploading individual data derived from these blood specimens into any genomic data repository. Researchers desiring access to these data would need to make a separate application to the CDPH. Data in Fig. 2b,c and Extended Data Figs. 6, 9 and 10 can be found in Supplementary Table 3.

Code availability

Variant calling and annotation for the exome sequences were performed using previously published methods as described above. The code used for the screening analysis of exome data and subsequent assessments is deposited in GitHub (https://github.com/nbseq1200/NBSeq1200paper).

References

Hall, P. L. et al. Postanalytical tools improve performance of newborn screening by tandem mass spectrometry. Genet. Med. 16, 889–895 (2014).
Mak, C. M., Lee, H. C., Chan, A. Y. & Lam, C. W. Inborn errors of metabolism and expanded newborn screening: review and update. Crit. Rev. Clin. Lab. Sci. 50, 142–162 (2013).
McHugh, D. et al. Clinical validation of cutoff target ranges in newborn screening of metabolic disorders by tandem mass spectrometry: a worldwide collaborative project. Genet. Med. 13, 230–254 (2011).
Wilcken, B., Wiley, V., Hammond, J. & Carpenter, K. Screening newborns for inborn errors of metabolism by tandem mass spectrometry. N. Engl. J. Med. 348, 2304–2312 (2003).
Tang, H. et al. Damaged goods?: an empirical cohort study of blood specimens collected 12 to 23 hours after birth in newborn screening in California. Genet. Med. 18, 259–264 (2016).
Adams, D. R. & Eng, C. M. Next-generation sequencing to diagnose suspected genetic disorders. N. Engl. J. Med. 379, 1353–1362 (2018).
Biesecker, L. G. & Green, R. C. Diagnostic clinical genome and exome sequencing. N. Engl. J. Med. 371, 1170 (2014).
Farnaes, L. et al. Rapid whole-genome sequencing decreases infant morbidity and cost of hospitalization. NPJ Genom. Med. 3, 10 (2018).
French, C. E. et al. Whole genome sequencing reveals that genetic conditions are frequent in intensively ill children. Intensive Care Med. 45, 627–636 (2019).
Friedman, J. M. et al. Genome-wide sequencing in acutely ill infants: genomic medicine’s critical application? Genet. Med. 21, 498–504 (2018).
Berg, J. S. et al. Newborn sequencing in genomic medicine and public health. Pediatrics. 139, e20162252 (2017).
Regaldo, A. in Technology Review (2017).
Hoffmann, G. F. in Inherited Metabolic Diseases: A Clinical Approach (eds Hoffmann, G. F., Zschocke, J. & Nyhan, W. L.) 31–32 (Springer Berlin Heidelberg, 2017).
Bassaganyas, L. et al. Whole exome and whole genome sequencing with dried blood spot DNA without whole genome amplification. Hum. Mutat. 39, 167–171 (2018).
Biesecker, L. G. Secondary findings in exome slices, virtual panels, and anticipatory sequencing. Genet. Med. 21, 41–43 (2019).
Stenson, P. D. et al. Human Gene Mutation Database (HGMD): 2003 update. Hum. Mutat. 21, 577–581 (2003).
Landrum, M. J. et al. ClinVar: improving access to variant interpretations and supporting evidence. Nucleic Acids Res. 46, D1062–D1067 (2018).
Feuchtbaum, L., Yang, J. & Currier, R. Follow-up status during the first 5 years of life for metabolic disorders on the federal recommended uniform screening panel. Genet. Med. 20, 831–839 (2018).
Feuchtbaum, L., Carter, J., Dowray, S., Currier, R. J. & Lorey, F. Birth prevalence of disorders detectable through newborn screening by race/ethnicity. Genet. Med. 14, 937–945 (2012).
Consortium, G. P. et al. A global reference for human genetic variation. Nature 526, 68–74 (2015).
Lek, M. et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature 536, 285–291 (2016).
Matern, D. et al. Prospective diagnosis of 2-methylbutyryl-CoA dehydrogenase deficiency in the Hmong population by newborn screening using tandem mass spectrometry. Pediatrics. 112, 74–78 (2003).
Tiranti, V. et al. Ethylmalonic encephalopathy is caused by mutations in ETHE1, a gene encoding a mitochondrial matrix protein. Am. J. Hum. Genet. 74, 239–252 (2004).
Henriques, B. J. et al. Ethylmalonic encephalopathy ETHE1 R163W/R163Q mutations alter protein stability and redox properties of the iron centre. PLoS ONE 9, e107157 (2014).
Wang, Z. Q., Chen, X. J., Murong, S. X., Wang, N. & Wu, Z. Y. Molecular analysis of 51 unrelated pedigrees with late-onset multiple acyl-CoA dehydrogenation deficiency (MADD) in southern China confirmed the most common ETFDH mutation and high carrier frequency of c.250G>A. J. Mol. Med. (Berl.) 89, 569–576 (2011).
Goldfeder, R. L. et al. Medical implications of technical accuracy in genome sequencing. Genome Med. 8, 24 (2016).
Sulonen, A. M. et al. Comparison of solution-based exome capture methods for next generation sequencing. Genome Biol. 12, R94 (2011).
Peng, G. et al. Combining newborn metabolic and DNA analysis for second-tier testing of methylmalonic acidemia. Genet. Med. 21, 896–903 (2019).
Vockley, J., Rinaldo, P., Bennett, M. J., Matern, D. & Vladutiu, G. D. Synergistic heterozygosity: disease resulting from multiple partial defects in one or more metabolic pathways. Mol. Genet. Metab. 71, 10–18 (2000).
Batshaw, M. L., Msall, M., Beaudet, A. L. & Trojak, J. Risk of serious illness in heterozygotes for ornithine transcarbamylase deficiency. J. Pediatr. 108, 236–241 (1986).
Bodian, D. L. et al. Utility of whole-genome sequencing for detection of newborn screening disorders in a population cohort of 1,696 neonates. Genet. Med. 18, 221–230 (2016).
Clark, M. M. et al. Diagnosis of genetic diseases in seriously ill children by rapid whole-genome sequencing and automated phenotyping and interpretation. Sci. Transl. Med. 11, eaat6177 (2019).
Kingsmore, S. F. et al. A randomized, controlled trial of the analytic and diagnostic performance of singleton and trio, rapid genome and exome sequencing in ill infants. Am. J. Hum. Genet. 105, 719–733 (2019).
Calonge, N. et al. Committee report: method for evaluating conditions nominated for population-based screening of newborns and children. Genet. Med. 12, 153–159 (2010).
Rodriguez, J. M. et al. APPRIS: annotation of principal and alternative splice isoforms. Nucleic Acids Res. 41, D110–D117 (2013).
Kircher, M. et al. A general framework for estimating the relative pathogenicity of human genetic variants. Nat. Genet. 46, 310–315 (2014).
Jian, X., Boerwinkle, E. & Liu, X. In silico prediction of splice-altering single nucleotide variants in the human genome. Nucleic Acids Res. 42, 13534–13544 (2014).
Fromer, M. et al. Discovery and statistical genotyping of copy-number variation from whole-exome sequencing depth. Am. J. Hum. Genet. 91, 597–607 (2012).
Chamberlin, M. E., Ubagai, T., Mudd, S. H., Levy, H. L. & Chou, J. Y. Dominant inheritance of isolated hypermethioninemia is associated with a mutation in the human methionine adenosyltransferase 1A gene. Am. J. Hum. Genet. 60, 540–546 (1997).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
Van der Auwera, G. A. et al. From FASTQ data to high confidence variant calls: the genome analysis toolkit best practices pipeline. Curr. Protoc. Bioinformatics 43, 11.10.1–11.10.33 (2013).
DePristo, M. A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491–498 (2011).
Punwani, D. et al. Multisystem anomalies in severe combined immunodeficiency with mutant BCL11B. N. Engl. J. Med. 375, 2165–2176 (2016).
Harrow, J. et al. GENCODE: the reference human genome annotation for the ENCODE project. Genome Res. 22, 1760–1774 (2012).
Tabor, H. K. et al. Pathogenic variants for Mendelian and complex traits in exomes of 6,517 European and African Americans: implications for the return of incidental results. Am. J. Hum. Genet. 95, 183–193 (2014).
Jian, X. & Liu, X. In silico prediction of deleteriousness for nonsynonymous and splice-altering single nucleotide variants in the human genome. Methods Mol. Biol. 1498, 191–197 (2017).
Sunderam, U. et al. DNA from dried blood spots yields high quality sequences for exome analysis. Preprint at bioRxiv https://doi.org/10.1101/2020.05.19.105304 (2020).
Jun, G. et al. Detecting and estimating contamination of human DNA samples in sequencing and array-based genotype data. Am. J. Hum. Genet. 91, 839–848 (2012).
Richards, S. et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American college of medical genetics and genomics and the association for molecular pathology. Genet. Med. 17, 405–424 (2015).
Wang, Y. et al. Perturbation robustness analyses reveal important parameters in variant interpretation pipelines. Preprint at bioRxiv https://doi.org/10.1101/2020.06.29.173815 (2020).
Yorifuji, T. et al. X-inactivation pattern in the liver of a manifesting female with ornithine transcarbamylase (OTC) deficiency. Clin. Genet. 54, 349–353 (1998).
Hu, J. et al. Association of CPT II gene with risk of acute encephalitis in Chinese children. Pediatr. Infect. Dis. J. 33, 1077–1082 (2014).
Bell, C. J. et al. Carrier testing for severe childhood recessive diseases by next-generation sequencing. Sci. Transl. Med. 3, 65ra64 (2011).
Bergeron, A., D’Astous, M., Timm, D. E. & Tanguay, R. M. Structural and functional analysis of missense mutations in fumarylacetoacetate hydrolase, the gene deficient in hereditary tyrosinemia type 1. J. Biol. Chem. 276, 15225–15231 (2001).
Gallant, N. M. et al. Biochemical, molecular, and clinical characteristics of children with short chain acyl-CoA dehydrogenase deficiency detected by newborn screening in California. Mol. Genet. Metab. 106, 55–61 (2012).
Jethva, R., Bennett, M. J. & Vockley, J. Short-chain acyl-coenzyme A dehydrogenase deficiency. Mol. Genet. Metab. 95, 195–200 (2008).
Wolfe, L. et al. Short-chain acyl-CoA dehydrogenase deficiency. GeneReviews https://www.ncbi.nlm.nih.gov/books/NBK63582/ (2018).
Robinson, J. T. et al. Integrative genomics viewer. Nat. Biotechnol. 29, 24–26 (2011).
Corvelo, A., Hallegger, M., Smith, C. W. & Eyras, E. Genome-wide association between branch point properties and alternative splicing. PLoS Comput. Biol. 6, e1001016 (2010).
Sterne-Weiler, T., Howard, J., Mort, M., Cooper, D. N. & Sanford, J. R. Loss of exon identity is a common mechanism of human inherited disease. Genome Res. 21, 1563–1571 (2011).

Acknowledgements

The authors are grateful for expert technical and computational assistance from many diligent contributors, including W. Chan, J.-M. Chandonia, A. Chellappan, N. Dabbiru, B. Dispensa, A. Neumann, A. Nguyen, A. Rao, S. Rana and Z.-Y. Wu. The work was funded by the National Institutes of Health grant U19HD077627 as part of the NSIGHT project, a joint program between the National Human Genome Research Institute and the Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health. This work was also supported by a research agreement with Tata Consultancy Services. The biospecimens and/or data used in this study were obtained from the California Biobank Program (SIS request no. 496). The CDPH is not responsible for the results or conclusions drawn by the authors of this publication.

Author information

These authors contributed equally and jointly supervised the work: Jennifer M. Puck, Steven E. Brenner.

Authors and Affiliations

Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, CA, USA
Aashish N. Adhikari, Yaqiong Wang, Kunal Kundu, Yangyun Zou & Steven E. Brenner
Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
Aashish N. Adhikari, Renata C. Gallagher, Laia Bassaganyas, Flavia Chen, Mark Kvale, Robert L. Nussbaum, Joseph T. Shieh, Dedeepya Vaka, Barbara A. Koenig, Pui-Yan Kwok, Neil Risch, Jennifer M. Puck & Steven E. Brenner
Department of Pediatrics, University of California San Francisco, San Francisco, CA, USA
Renata C. Gallagher, Robert J. Currier, George Amatuni, Joseph T. Shieh & Jennifer M. Puck
Program in Bioethics, University of California San Francisco, San Francisco, CA, USA
Flavia Chen & Barbara A. Koenig
Innovation Labs, Tata Consultancy Services, Hyderabad, India
Kunal Kundu, Rajgopal Srinivasan & Uma Sunderam
Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, USA
Sean D. Mooney
Invitae, San Francisco, CA, USA
Robert L. Nussbaum
Department of Molecular, Cellular and Developmental Biology, Center for the Molecular Biology of RNA, UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
Savanna S. Randi & Jeremy Sanford
Genetic Disease Screening Program, California Department of Public Health, Richmond, CA, USA
Hao Tang
Cardiovascular Research Institute, University of California San Francisco, San Francisco, CA, USA
Pui-Yan Kwok & Jennifer M. Puck
Department of Dermatology, University of California San Francisco, San Francisco, CA, USA
Pui-Yan Kwok
Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA, USA
Neil Risch
Division of Allergy, Immunology and Blood and Marrow Transplantation, UCSF Benioff Children’s Hospital, San Francisco, CA, USA
Jennifer M. Puck
Center for Computational Biology, University of California Berkeley, Berkeley, CA, USA
Steven E. Brenner
Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
Steven E. Brenner

Contributions

R.J.C., R.L.N., B.A.K., P.-Y.K., N.R., J.M.P. and S.E.B. conceived and designed the study. A.N.A., R.J.C., G.A., L.B., F.C., M.K., S.S.R., J.S., U.S., H.T., D.V., P.-Y.K. and J.M.P. acquired data. A.N.A., R.C.G., Y.W., R.J.C., G.A., K.K., M.K., S.R., R.S., U.S., Y.Z., N.R., J.M.P. and S.E.B. analyzed data. A.N.A., R.C.G., Y.W., R.J.C., M.K., S.S.R., J.T.S., R.S., H.T., N.R., J.M.P. and S.E.B. interpreted data. A.N.A., Y.W. and U.S. created software. A.N.A. wrote the first draft of the manuscript. R.C.G., Y.W., R.J.C., F.C., M.K., S.D.M., R.L.N., J.T.S., R.S., H.T., B.A.K., P.-Y.K., N.R., J.M.P. and S.E.B. provided critical revisions. All authors approved the final version of the manuscript.

Corresponding authors

Correspondence to Aashish N. Adhikari, Jennifer M. Puck or Steven E. Brenner.

Ethics declarations

Competing interests

A.A. is currently an employee of Illumina, Inc. K.K. was an employee of Tata Consultancy Services (TCS); U.S. and R.S. are employees of TCS. Y.Z. is currently an employee of Yikon Genomics Co., Ltd. R.N. is an employee of Invitae. J.P. is the spouse of R. Nussbaum, an employee of Invitae. S.E.B. receives support at the University of California Berkeley through a research agreement from TCS.

Additional information

Peer review information Kate Gao was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Metrics for WES reads and coverage.

a, Percentage of reads unmapped to the reference genome. b, Percentage of high quality read pairs (MQ > 20), without duplicates and properly paired. c, Percentage of duplicates in the reads across three sequencing batches. d, e, Number of reads and high quality reads plotted batchwise. f, Inferred insert sizes plotted batchwise. g, Median coverage across the Nimblegen capture region plotted batchwise. h, Median coverage across the 78-gene region plotted batchwise. i, Median fraction of the capture region covered at coverage depths of 1x to 30x plotted batchwise. j, Median fraction of the 78-gene region covered at coverage depths of 1x to 30x plotted batchwise. In panels a-f and i-j, individual sample values are plotted, and adjacent box plots display the median (red) and interquartile range for the dataset; whiskers extend to the last data point within 1.5 times the interquartile range. The sample sizes for the box plots in a-h were: batch1 (n = 180), batch2 (n = 292), batch3 (n = 744). Violin plots superimposed on the box plots show the data density and mean value (blue).
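
The box plot convention described in this caption (median, interquartile range, whiskers ending at the last data point within 1.5 times the interquartile range) can be sketched as below. This is an illustrative sketch only: the function name is hypothetical, and the quartile interpolation method (`method="inclusive"`) is an assumption, since the caption does not say how quartiles were computed.

```python
import statistics


def box_plot_summary(values):
    """Summarize one batch the way the caption describes its box plots."""
    data = sorted(values)
    # ASSUMPTION: inclusive quartile interpolation; the source does not specify.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        # Whiskers stop at the most extreme data points inside the fences.
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [v for v in data if v < low_fence or v > high_fence],
        "mean": statistics.fmean(data),
    }


# Toy values, not data from the study.
summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
print(summary)
```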

Extended Data Fig. 2 DNA damage related metrics for the three sequencing batches.

a, b, Fraction of reads with 0 (green), 1 (yellow), 2 (orange), and ≥3 (red) mismatches with the reference genome considering (a) all bases of the reads and (b) the first 100 bases of the reads. Batches 1 and 2 had read lengths of 101 bases and batch 3 had a read length of 151 bases. All three batches had similar mismatch rates when only the first 100 bases were considered. c, Nucleotide mismatches by base change (NMBC) in the 1,216 samples plotted batchwise. d, Frequencies of all single nucleotide changes by base type in high quality SNVs in the 1,216 samples plotted batchwise. High quality SNVs are VCF calls marked PASS by the GATK VQSR algorithm with GQ ≥ 30. In both c and d, box plots display the median and interquartile range for the dataset, whiskers extend to the last data point within 1.5 times the interquartile range, and outliers beyond this are marked with circles. The sample sizes for the box plots were batch1 (n = 180), batch2 (n = 292), batch3 (n = 744).
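
The binning in panels a and b, and the reason for truncating reads to their first 100 bases so that batches with different read lengths are comparable, can be illustrated with a small sketch. It assumes each read is already paired with an equal-length, ungapped stretch of reference sequence, which is a simplification; the function names and toy sequences are hypothetical and do not reproduce the study's pipeline.

```python
from collections import Counter


def count_mismatches(read, reference, max_bases=None):
    """Count mismatching positions, optionally over only the first max_bases."""
    if max_bases is not None:
        read, reference = read[:max_bases], reference[:max_bases]
    return sum(1 for r, g in zip(read, reference) if r != g)


def mismatch_fractions(pairs, max_bases=None):
    """Fraction of reads with 0, 1, 2 and 3 or more mismatches."""
    labels = ("0", "1", "2", "3+")
    bins = Counter()
    for read, reference in pairs:
        n = count_mismatches(read, reference, max_bases)
        bins["3+" if n >= 3 else str(n)] += 1
    total = sum(bins.values())
    if total == 0:
        return {label: 0.0 for label in labels}
    return {label: bins[label] / total for label in labels}


# Toy (read, reference) pairs; real reads would be about 101 or 151 bases.
toy_pairs = [("ACGT", "ACGT"), ("ACGT", "ACGA"), ("AAAA", "TTTT"), ("ACGT", "TCGA")]
all_bases = mismatch_fractions(toy_pairs)
first_three = mismatch_fractions(toy_pairs, max_bases=3)
print(all_bases, first_three)
```

Passing `max_bases=100` to real reads would correspond to the comparison in panel b.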

Extended Data Fig. 3 Variant related quality metrics for 1,216 samples plotted batchwise.

a, Confident sites across the capture region (from the GVCF file). b, Confident sites across the 78 genes (from the GVCF file). c, Common high quality SNVs. d, Rare high quality SNVs. e, Common high quality indels. f, Rare high quality indels. g, Transition/transversion ratios for high quality common SNVs. h, Transition/transversion ratios for high quality rare SNVs. High quality variants are those marked as PASS by GATK VQSR with GQ ≥ 30. Common variants have a frequency greater than 0.001 in the 1000 Genomes Project phase 3 database and rare variants have a frequency less than 0.001 in the database. Individual sample values are plotted and adjacent box plots display the median (red) and interquartile range for the dataset; whiskers extend to the last data point within 1.5 times the interquartile range. Violin plots superimposed on the box plots show the data density and mean value (blue). The sample sizes for the box plots were batch1 (n = 180), batch2 (n = 292), batch3 (n = 744).
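
The filters named in this caption (PASS with GQ ≥ 30, and a population frequency cutoff of 0.001) and the transition/transversion ratio can be sketched as follows. The `Snv` record and function names are hypothetical, and the handling of a frequency exactly equal to 0.001 (treated here as rare) is an assumption, since the caption leaves that case unspecified.

```python
from dataclasses import dataclass

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


@dataclass(frozen=True)
class Snv:
    ref: str
    alt: str
    filter_status: str  # e.g. "PASS" from the VCF FILTER column
    gq: int             # genotype quality
    pop_freq: float     # allele frequency in a reference population database


def is_high_quality(v: Snv) -> bool:
    return v.filter_status == "PASS" and v.gq >= 30


def frequency_class(v: Snv) -> str:
    # ASSUMPTION: a frequency of exactly 0.001 is classed as rare.
    return "common" if v.pop_freq > 0.001 else "rare"


def is_transition(v: Snv) -> bool:
    # Transitions stay within purines (A<->G) or within pyrimidines (C<->T).
    pair = {v.ref, v.alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def ti_tv_ratio(variants) -> float:
    variants = list(variants)
    ti = sum(1 for v in variants if is_transition(v))
    tv = len(variants) - ti
    return ti / tv if tv else float("inf")


# Toy variants, not data from the study.
toy = [
    Snv("A", "G", "PASS", 50, 0.2),
    Snv("C", "T", "PASS", 40, 0.0001),
    Snv("A", "C", "PASS", 35, 0.3),
    Snv("G", "T", "PASS", 10, 0.5),      # fails GQ >= 30
    Snv("A", "T", "LowQual", 60, 0.05),  # fails PASS
]
high_quality = [v for v in toy if is_high_quality(v)]
common = [v for v in high_quality if frequency_class(v) == "common"]
rare = [v for v in high_quality if frequency_class(v) == "rare"]
print(len(high_quality), ti_tv_ratio(common), ti_tv_ratio(rare))
```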

Extended Data Fig. 4 Example showing variability of gene coverage in two IEM genes in the study across 1,216 samples.

MCCC2, top, has poor coverage in the first exon across all samples. In contrast, ACADM, bottom, has good coverage across the gene. The blue vertical lines indicate positions with known pathogenic variants in HGMD and ClinVar. Each plot shows log₁₀ of the median, 20th percentile and minimum coverage for each coding exon across all samples in a given sample set. Dark grey: median coverage; medium grey: 20th percentile coverage; light grey: minimum coverage for each position. Coverage quality of each exon is indicated by colored blocks beneath the coverage plot. Red: greater than 15% of the exon has less than 10x median coverage; green: 95% of the exon has minimum 20x coverage. UTRs that are part of the coding exons have a smaller indicator thickness. Regions of the exon that overlap with the capture array are indicated in blue just below the coverage plot. Exon scale in bases is shown in each plot.
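
The red and green exon flags described above can be expressed as a small rule. The sketch below is one reading of the caption, not the authors' implementation: it takes "95% of the exon has minimum 20x coverage" to mean that at least 95% of positions have a minimum coverage across samples of 20x or more, and it uses a hypothetical `"unflagged"` label for exons that meet neither criterion.

```python
def exon_coverage_quality(median_cov, min_cov):
    """Classify one exon from per-position coverage summaries across samples.

    median_cov[i]: median coverage across samples at position i
    min_cov[i]:    minimum coverage across samples at position i
    """
    n = len(median_cov)
    if n == 0 or n != len(min_cov):
        raise ValueError("coverage lists must be non-empty and equal in length")
    frac_low_median = sum(1 for c in median_cov if c < 10) / n
    frac_min_ok = sum(1 for c in min_cov if c >= 20) / n
    if frac_low_median > 0.15:
        return "red"
    if frac_min_ok >= 0.95:
        return "green"
    return "unflagged"  # hypothetical label; the caption names only red and green


# Toy 100-base exons, not data from the study.
well_covered = exon_coverage_quality([50] * 100, [25] * 100)
poorly_covered = exon_coverage_quality([5] * 20 + [50] * 80, [0] * 20 + [25] * 80)
in_between = exon_coverage_quality([30] * 100, [15] * 100)
print(well_covered, poorly_covered, in_between)
```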

Extended Data Fig. 5 Alternative pipelines derived from the final exome analysis pipeline to explore sensitivity-specificity tradeoffs.

We created several alternate pipelines, altering or truncating different parts of the final exome analysis pipeline to probe contributions to overall sensitivity and specificity from various components of the pipeline. For each pipeline, the overall sensitivity and specificity on the NBSeq test set are shown. a, Final exome analysis pipeline. b-i, Alternatives: b, altering the final pipeline by considering every CNV call homozygous; c-e, truncating the CNV arm, curation arm and predicted impact arm, respectively; f, g, retaining the predicted impact arm or curation arm only, respectively; h, retaining only the rare pathogenic HGMD & ClinVar databases; i, allowing multiple gene calls for each sample if more than one gene is predicted.

Extended Data Fig. 6 Distribution of variants reported by the exome analysis pipeline in the NBSeq test set.

a, Number of different variant types reported by the pipeline in IEM-affected individuals in genes associated with their IEMs in the NBSeq test set (n = 674 individuals). b, Distribution of the types of variants responsible for the predictions of disease status in the 571 affected individuals correctly identified by the exome analysis pipeline.

Extended Data Fig. 7 Whole genome sequencing confirms potential IVD deletions in two individuals diagnosed with isovaleric acidemia initially missed in exome.

In two cases where we performed WGS upon follow-up of an exome false negative, we identified large deletions in the associated IVD gene. The WGS read alignments in the genomic region spanning IVD are shown on the right for the two cases. The first case had almost no coverage in the region spanning the first three exons of IVD. The second case had almost no coverage of exon 12 of IVD along with low coverage across the whole gene. The first case had 11 split reads spanning the deleted region, confirming the deletion of the first three exons.

Extended Data Fig. 8 Experimental splicing assay of a potentially pathogenic intronic variant in an exome false negative case.

a, In an individual affected with MCADD, the exome analysis pipeline reported only a single rare nonsynonymous variant. A second rare intronic variant 14 bases from the splice site (NM_000016.4:c.388-14A>G) was a suspected pathogenic modification of the branchpoint A nucleotide. b, Diagram of the heterologous HBB splicing reporter construct containing the wild type ACADM sequence or the c.388-14A>G variant. c, RT-PCR analysis of reporter transcripts from wild type or mutant (lanes 1 and 2, respectively) reporter plasmids expressed in HEK293T cells (amplicons resolved by 12% PAGE and stained with SYBR Gold). The two spliced products are shown to the right of the gel image. The experiments were performed three times independently with similar results. d, Chromatograms corresponding to the sequence spliced junctions between HBB exon 1 and the wild type or mutant ACADM exon 6 constructs (left and right panel, respectively). e, Open reading frame of aberrant ACADM mRNA containing a 13 nt extension of exon 6 (red), resulting in a premature termination codon (PTC, *). Top, DNA sense strand; middle, predicted polypeptide; bottom, DNA reverse complement.

Extended Data Fig. 9 Stratification of IEM-affected cases and MS/MS false positives by alleles reported by the exome analysis pipeline, for estimation of the NPV of exome sequencing as a follow-up test after a positive MS/MS screen.

For six MS/MS screens (VLCADD, PKU, LCHADD/TFP, IVA, MSUD, and GA-II), IEM-affected and MS/MS false positive cases in the NBSeq test set are stratified by the number of alleles reported by the exome analysis pipeline in the genes associated with those screens.
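
A stratification of this kind feeds directly into a negative predictive value (NPV) estimate. The sketch below shows the arithmetic under stated assumptions: the counts are invented for illustration, the function names are hypothetical, and the rule for calling a case "exome-negative" (no more than a chosen number of reported alleles) is an assumption rather than the study's definition.

```python
from collections import Counter


def stratify(cases):
    """Tally cases by true status and number of reported alleles (capped at 2).

    cases: iterable of (is_affected, n_alleles) tuples for MS/MS-positive infants.
    """
    table = Counter()
    for is_affected, n_alleles in cases:
        status = "affected" if is_affected else "false_positive"
        table[(status, min(n_alleles, 2))] += 1
    return table


def negative_predictive_value(table, max_alleles_for_negative=0):
    """NPV = true negatives / (true negatives + false negatives).

    ASSUMPTION: a case is exome-negative when the pipeline reports at most
    max_alleles_for_negative alleles in the relevant gene(s).
    """
    tn = sum(c for (s, n), c in table.items()
             if s == "false_positive" and n <= max_alleles_for_negative)
    fn = sum(c for (s, n), c in table.items()
             if s == "affected" and n <= max_alleles_for_negative)
    return tn / (tn + fn) if (tn + fn) else None


# Invented counts for illustration only.
toy_cases = ([(False, 0)] * 8 + [(False, 1)] * 2
             + [(True, 0)] * 1 + [(True, 1)] * 1 + [(True, 2)] * 8)
toy_table = stratify(toy_cases)
npv_zero_alleles = negative_predictive_value(toy_table, 0)
npv_up_to_one = negative_predictive_value(toy_table, 1)
print(npv_zero_alleles, npv_up_to_one)
```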

Extended Data Fig. 10 Zygosity distribution of variants reported by the pipeline in relevant gene(s).

For each IEM, bars show the zygosity distribution of the variants in relevant genes reported by the exome pipeline for the 674 IEM-affected cases from the test set. The numbers of cases correctly identified by the pipeline are broken down into those that had homozygous variants in relevant gene(s) (dark blue) and those that had two heterozygous variants in relevant gene(s) (orange). The numbers of cases that failed to be identified by the pipeline are broken down into those that had one heterozygous variant in relevant gene(s) (light blue) and those that had no reported variants in the relevant gene(s) (dark red). Left, core IEMs screened by California; right, secondary/add-on IEMs. IEMs sharing a common causative gene were not distinguished by the exome predictions alone. These included TFP and LCHADD (blue shading), PKU and hyperphenylalaninemia (pink shading), and the various MMA subtypes (yellow shading).
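
The four categories in this figure follow from a simple recessive-model rule applied to the variants reported in the relevant gene(s). The sketch below is illustrative only: the function and label names are hypothetical, and treating any two heterozygous variants as sufficient (without checking that they lie on different chromosome copies) is an assumption, since phasing is not described in the caption.

```python
from collections import Counter


def classify_case(zygosities):
    """Classify one affected case from its reported variants.

    zygosities: list of "hom" or "het", one entry per reported variant
    in the gene(s) relevant to the case's IEM.
    """
    if any(z == "hom" for z in zygosities):
        return ("identified", "homozygous")
    n_het = sum(1 for z in zygosities if z == "het")
    # ASSUMPTION: two or more heterozygous variants are taken to be in trans.
    if n_het >= 2:
        return ("identified", "two_heterozygous")
    if n_het == 1:
        return ("missed", "one_heterozygous")
    return ("missed", "no_variants")


# Toy cases, not data from the study.
toy_cases = [["hom"], ["het", "het"], ["het"], [], ["het", "hom"]]
tally = Counter(classify_case(case) for case in toy_cases)
print(tally)
```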

Supplementary information

Supplementary Information

Supplementary Fig. 1

Reporting Summary

Supplementary Table

Supplementary Tables 1–7.

About this article

Cite this article

Adhikari, A.N., Gallagher, R.C., Wang, Y. et al. The role of exome sequencing in newborn screening for inborn errors of metabolism. Nat Med 26, 1392–1397 (2020). https://doi.org/10.1038/s41591-020-0966-5

Received: 23 December 2019
Accepted: 08 June 2020
Published: 10 August 2020
Issue Date: September 2020
DOI: https://doi.org/10.1038/s41591-020-0966-5
