Genomic analyses implicate noncoding de novo variants in congenital heart disease

Richter, Felix; Morton, Sarah U.; Kim, Seong Won; Kitaygorodsky, Alexander; Wasson, Lauren K.; Chen, Kathleen M.; Zhou, Jian; Qi, Hongjian; Patel, Nihir; DePalma, Steven R.; Parfenov, Michael; Homsy, Jason; Gorham, Joshua M.; Manheimer, Kathryn B.; Velinder, Matthew; Farrell, Andrew; Marth, Gabor; Schadt, Eric E.; Kaltman, Jonathan R.; Newburger, Jane W.; Giardini, Alessandro; Goldmuntz, Elizabeth; Brueckner, Martina; Kim, Richard; Porter, George A.; Bernstein, Daniel; Chung, Wendy K.; Srivastava, Deepak; Tristani-Firouzi, Martin; Troyanskaya, Olga G.; Dickel, Diane E.; Shen, Yufeng; Seidman, Jonathan G.; Seidman, Christine E.; Gelb, Bruce D.

doi:10.1038/s41588-020-0652-z

Article
Published: 29 June 2020

Nature Genetics volume 52, pages 769–777 (2020)

Abstract

A genetic etiology is identified for one-third of patients with congenital heart disease (CHD), with 8% of cases attributable to coding de novo variants (DNVs). To assess the contribution of noncoding DNVs to CHD, we compared genome sequences from 749 CHD probands and their parents with those from 1,611 unaffected trios. Neural network prediction of noncoding DNV transcriptional impact identified a burden of DNVs in individuals with CHD (n = 2,238 DNVs) compared to controls (n = 4,177; P = 8.7 × 10⁻⁴). Independent analyses of enhancers showed an excess of DNVs in associated genes (27 genes versus 3.7 expected, P = 1 × 10⁻⁵). We observed significant overlap between these transcription-based approaches (odds ratio (OR) = 2.5, 95% confidence interval (CI) 1.1–5.0, P = 5.4 × 10⁻³). CHD DNVs altered transcription levels in 5 of 31 enhancers assayed. Finally, we observed a DNV burden in RNA-binding-protein regulatory sites (OR = 1.13, 95% CI 1.1–1.2, P = 8.8 × 10⁻⁵). Our findings demonstrate an enrichment of potentially disruptive regulatory noncoding DNVs in a fraction of CHD at least as high as that observed for damaging coding DNVs.

**Fig. 2: Enrichment of noncoding de novo variants with functionally relevant HeartENN scores.**

**Fig. 3: Genes with multiple de novo variants in prioritized human fetal heart enhancers.**

**Fig. 4: Massively parallel reporter assays for selected de novo variants.**

**Fig. 5: Enrichment of variants in RNA-binding-protein category annotations.**

Data availability

Whole-genome sequencing data are deposited in the database of Genotypes and Phenotypes (dbGaP) under accession numbers phs001194.v2.p2 and phs001138.v2.p2.

Code availability

Documentation, links, and availability of source code and select supplementary data are detailed at https://github.com/frichter/wgs_chd_analysis. The DNV identification pipeline is available at https://github.com/ShenLab/igv-classifier and https://github.com/frichter/dnv_pipeline. The HeartENN algorithmic framework is available at https://github.com/FunctionLab/selene/archive/0.4.8.tar.gz. HeartENN model weights and scripts for burden tests are available at https://github.com/frichter/wgs_chd_analysis. All source code is distributed under the Massachusetts Institute of Technology license.

References

1. van der Linde, D. et al. Birth prevalence of congenital heart disease worldwide. J. Am. Coll. Cardiol. 58, 2241–2247 (2011).
2. Pediatric Cardiac Genomics Consortium et al. The Congenital Heart Disease Genetic Network Study: rationale, design, and early results. Circ. Res. 112, 698–706 (2013).
3. Zaidi, S. et al. De novo mutations in histone-modifying genes in congenital heart disease. Nature 498, 220–223 (2013).
4. Homsy, J. et al. De novo mutations in congenital heart disease with neurodevelopmental and other congenital anomalies. Science 350, 1262–1266 (2015).
5. Jin, S. C. et al. Contribution of rare inherited and de novo variants in 2,871 congenital heart disease probands. Nat. Genet. 49, 1593–1601 (2017).
6. Fischbach, G. D. & Lord, C. The Simons Simplex Collection: a resource for identification of autism genetic risk factors. Neuron 68, 192–195 (2010).
7. Garrison, E. & Marth, G. Haplotype-based variant detection from short-read sequencing. Preprint at https://arxiv.org/abs/1207.3907v2 (2012).
8. Richter, F. et al. Whole genome de novo variant identification with FreeBayes and neural network approaches. Preprint at bioRxiv https://doi.org/10.1101/2020.03.24.994160 (2020).
9. Zhou, J. et al. Whole-genome deep-learning analysis identifies contribution of noncoding mutations to autism risk. Nat. Genet. 51, 973–980 (2019).
10. An, J.-Y. et al. Genome-wide de novo risk score implicates promoter variation in autism spectrum disorder. Science 362, eaat6576 (2018).
11. Jónsson, H. et al. Parental influence on human germline de novo mutations in 1,548 trios from Iceland. Nature 549, 519–522 (2017).
12. Goldmann, J. M. et al. Parent-of-origin-specific signatures of de novo mutations. Nat. Genet. 48, 935–939 (2016).
13. Seiden, A. H. et al. Elucidation of de novo small insertion/deletion biology with parent-of-origin phasing. Hum. Mutat. 41, 800–806 (2020).
14. Kent, W. J. et al. The human genome browser at UCSC. Genome Res. 12, 996–1006 (2002).
15. Bernstein, B. E. et al. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
16. Mei, S. et al. Cistrome Data Browser: a data portal for ChIP–Seq and chromatin accessibility data in human and mouse. Nucleic Acids Res. 45, D658–D662 (2017).
17. He, A. et al. Dynamic GATA4 enhancers shape the chromatin landscape central to heart development and disease. Nat. Commun. 5, 4907 (2014).
18. Sayed, D., Yang, Z., He, M., Pfleger, J. M. & Abdellatif, M. Acute targeting of general transcription factor IIB restricts cardiac hypertrophy via selective inhibition of gene transcription. Circ. Heart Fail. 8, 138–148 (2015).
19. Stefanovic, S. et al. GATA-dependent regulatory switches establish atrioventricular canal specificity during heart development. Nat. Commun. 5, 3680 (2014).
20. Sayed, D., He, M., Yang, Z., Lin, L. & Abdellatif, M. Transcriptional regulation patterns revealed by high resolution chromatin immunoprecipitation during cardiac hypertrophy. J. Biol. Chem. 288, 2546–2558 (2013).
21. Zhang, L. et al. KLF15 establishes the landscape of diurnal expression in the heart. Cell Rep. 13, 2368–2375 (2015).
22. Anand, P. et al. BET bromodomains mediate transcriptional pause release in heart failure. Cell 154, 569–582 (2013).
23. Attanasio, C. et al. Tissue-specific SMARCA4 binding at active and repressed regulatory elements during embryogenesis. Genome Res. 24, 920–929 (2014).
24. Sakabe, N. J. et al. Dual transcriptional activator and repressor roles of TBX20 regulate adult cardiac structure and function. Hum. Mol. Genet. 21, 2194–2204 (2012).
25. Roadmap Epigenomics Consortium et al. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).
26. May, D. et al. Large-scale discovery of enhancers from human heart tissue. Nat. Genet. 44, 89–93 (2012).
27. Dickel, D. E. et al. Genome-wide compendium and functional assessment of in vivo heart enhancers. Nat. Commun. 7, 12923 (2016).
28. Nord, A. S. et al. Rapid and pervasive changes in genome-wide enhancer usage during mammalian development. Cell 155, 1521–1531 (2013).
29. Blow, M. J. et al. ChIP–Seq identification of weakly conserved heart enhancers. Nat. Genet. 42, 806–810 (2010).
30. Yue, F. et al. A comparative encyclopedia of DNA elements in the mouse genome. Nature 515, 355–364 (2014).
31. Shen, Y. et al. A map of the cis-regulatory sequences in the mouse genome. Nature 488, 116–120 (2012).
32. van den Boogaard, M. et al. Genetic variation in T-box binding element functionally affects SCN5A/SCN10A enhancer. J. Clin. Invest. 122, 2519–2530 (2012).
33. Zhou, J. & Troyanskaya, O. G. Predicting effects of noncoding variants with deep learning–based sequence model. Nat. Methods 12, 931–934 (2015).
34. Huang, Y.-F., Gulko, B. & Siepel, A. Fast, scalable prediction of deleterious noncoding variants from functional and population genomic data. Nat. Genet. 49, 618–624 (2017).
35. Kircher, M. et al. A general framework for estimating the relative pathogenicity of human genetic variants. Nat. Genet. 46, 310–315 (2014).
36. Rentzsch, P., Witten, D., Cooper, G. M., Shendure, J. & Kircher, M. CADD: predicting the deleteriousness of variants throughout the human genome. Nucleic Acids Res. 47, D886–D894 (2019).
37. Davydov, E. V. et al. Identifying a high fraction of the human genome to be under selective constraint using GERP++. PLoS Comput. Biol. 6, e1001025 (2010).
38. Ritchie, G. R. S., Dunham, I., Zeggini, E. & Flicek, P. Functional annotation of noncoding sequence variants. Nat. Methods 11, 294–296 (2014).
39. Lek, M. et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature 536, 285–291 (2016).
40. Melnikov, A., Zhang, X., Rogov, P., Wang, L. & Mikkelsen, T. S. Massively parallel reporter assays in cultured mammalian cells. J. Vis. Exp. https://doi.org/10.3791/51719 (2014).
41. Werling, D. M. et al. An analytical framework for whole-genome sequence association studies and its implications for autism spectrum disorder. Nat. Genet. 50, 727–736 (2018).
42. Turner, T. N. et al. Genomic patterns of de novo mutation in simplex autism. Cell 171, 710–722.e12 (2017).
43. Yuen, R. K. C. et al. Whole genome sequencing resource identifies 18 new candidate genes for autism spectrum disorder. Nat. Neurosci. 20, 602–611 (2017).
44. Hamdan, F. F. et al. High rate of recurrent de novo mutations in developmental and epileptic encephalopathies. Am. J. Hum. Genet. 101, 664–685 (2017).
45. Peacock, J. D., Lu, Y., Koch, M., Kadler, K. E. & Lincoln, J. Temporal and spatial expression of collagens during murine atrioventricular heart valve development and maintenance. Dev. Dyn. 237, 3051–3058 (2008).
46. Kurosaka, S. et al. Arginylation regulates myofibrils to maintain heart function and prevent dilated cardiomyopathy. J. Mol. Cell. Cardiol. 53, 333–341 (2012).
47. Kleffmann, W. et al. 5q31 microdeletions: definition of a critical region and analysis of LRRTM2, a candidate gene for intellectual disability. Mol. Syndromol. 3, 68–75 (2012).
48. Mehta, G. et al. MITF interacts with the SWI/SNF subunit, BRG1, to promote GATA4 expression in cardiac hypertrophy. J. Mol. Cell. Cardiol. 88, 101–110 (2015).
49. Tshori, S. et al. Transcription factor MITF regulates cardiac growth and hypertrophy. J. Clin. Invest. 116, 2673–2681 (2006).
50. Nicholson, T. B. et al. A hypomorphic lsd1 allele results in heart development defects in mice. PLoS One 8, e60913 (2013).
51. Hamidi, T. et al. Identification of Rpl29 as a major substrate of the lysine methyltransferase Set7/9. J. Biol. Chem. 293, 12770–12780 (2018).
52. Siggs, O. M. et al. Mutation of Fnip1 is associated with B-cell deficiency, cardiomyopathy, and elevated AMPK activity. Proc. Natl Acad. Sci. USA 113, E3706–E3715 (2016).
53. Chen, C.-Y. et al. Accumulation of the inner nuclear envelope protein Sun1 is pathogenic in progeric and dystrophic laminopathies. Cell 149, 565–577 (2012).
54. Meinke, P. et al. Muscular dystrophy-associated SUN1 and SUN2 variants disrupt nuclear-cytoskeletal connections and myonuclear organization. PLoS Genet. 10, e1004605 (2014).
55. Röseler, S. et al. Lethal phenotype of mice carrying a Sept11 null mutation. Biol. Chem. 392, 779–781 (2011).
56. Guo, A. et al. E–C coupling structural protein junctophilin-2 encodes a stress-adaptive transcription regulator. Science 362, eaan3303 (2018).
57. Yamagishi, H. et al. A history and interaction of outflow progenitor cells implicated in “Takao Syndrome.” In Etiology and Morphogenesis of Congenital Heart Disease: From Gene Function and Cellular Interaction to Morphology (eds. Nakanishi, T. et al.) 201–209 (Springer, 2016).
58. Masuda, T. & Taniguchi, M. Congenital diseases and semaphorin signaling: overview to date of the evidence linking them. Congenit. Anom. (Kyoto). 55, 26–30 (2015).
59. Pierpont, M. E. et al. Genetic basis for congenital heart disease: revisited: a scientific statement from the American Heart Association. Circulation 138, e653–e711 (2018).
60. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
61. McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–1303 (2010).
62. DePristo, M. A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491–498 (2011).
63. Van der Auwera, G. et al. From FastQ data to high-confidence variant calls: the genome analysis toolkit best practices pipeline. Curr. Protoc. Bioinformatics 43, 11.10.1–11.10.33 (2013).
64. Kim, B.-Y., Park, J. H., Jo, H.-Y., Koo, S. K. & Park, M.-H. Optimized detection of insertions/deletions (INDELs) in whole-exome sequencing data. PLoS One 12, e0182272 (2017).
65. Bailey, J. A., Yavor, A. M., Massa, H. F., Trask, B. J. & Eichler, E. E. Segmental duplications: organization and impact within the current human genome project assembly. Genome Res. 11, 1005–1017 (2001).
66. Derrien, T. et al. Fast computation and applications of genome mappability. PLoS One 7, e30377 (2012).
67. Li, H. Toward better understanding of artifacts in variant calling from high-coverage samples. Bioinformatics 30, 2843–2851 (2014).
68. Ostrander, B. E. P. et al. Whole-genome analysis for effective clinical diagnosis and gene discovery in early infantile epileptic encephalopathy. NPJ Genom. Med. 3, 22 (2018).
69. Blake, J. A. et al. Mouse Genome Database (MGD)-2017: community knowledge resource for the laboratory mouse. Nucleic Acids Res. 45, D723–D729 (2017).
70. Chen, K. M., Cofer, E. M., Zhou, J. & Troyanskaya, O. G. Selene: a PyTorch-based deep learning library for sequence data. Nat. Methods 16, 315–318 (2019).
71. Price, A. L. et al. Pooled association tests for rare variants in exon-resequencing studies. Am. J. Hum. Genet. 86, 832–838 (2010).
72. Lian, X. et al. Directed cardiomyocyte differentiation from human pluripotent stem cells by modulating Wnt/β-catenin signaling under fully defined conditions. Nat. Protoc. 8, 162–175 (2013).
73. Buenrostro, J. D., Wu, B., Chang, H. Y. & Greenleaf, W. J. ATAC–seq: a method for assaying chromatin accessibility genome-wide. Curr. Protoc. Mol. Biol. 109, 21.29.1–21.29.9 (2015).
74. Corces, M. R. et al. An improved ATAC–seq protocol reduces background and enables interrogation of frozen tissues. Nat. Methods 14, 959–962 (2017).
75. Heinz, S. et al. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol. Cell 38, 576–589 (2010).
76. Yu, G., Wang, L.-G. & He, Q.-Y. ChIPseeker: an R/Bioconductor package for ChIP peak annotation, comparison and visualization. Bioinformatics 31, 2382–2383 (2015).
77. Spurrell, C. H. et al. Genome-wide fetalization of enhancer architecture in heart disease. Preprint at bioRxiv https://doi.org/10.1101/591362 (2019).
78. Sharma, A., Toepfer, C. N., Schmid, M., Garfinkel, A. C. & Seidman, C. E. Differentiation and contractile analysis of GFP-sarcomere reporter hiPSC-cardiomyocytes. Curr. Protoc. Hum. Genet. 96, 21.12.1–21.12.12 (2018).
79. Shah, A., Qian, Y., Weyn-Vanhentenryck, S. M. & Zhang, C. CLIP Tool Kit (CTK): a flexible and robust pipeline to analyze CLIP sequencing data. Bioinformatics 33, 566–567 (2017).
80. Feng, H. et al. Modeling RNA-binding protein specificity in vivo by precisely registering protein-RNA crosslink sites. Mol. Cell 74, 1189–1204.e6 (2019).

Acknowledgements

We are enormously grateful to the patients and families who participated in this research. We thank the following for patient recruitment: A. Julian, M. MacNeal, Y. Mendez, T. Mendiz-Ramdeen and C. Mintz (Icahn School of Medicine at Mount Sinai); N. Cross (Yale School of Medicine); J. Ellashek and N. Tran (Children’s Hospital of Los Angeles); B. McDonough, J. Geva and M. Borensztein (Harvard Medical School); K. Flack, L. Panesar and N. Taylor (University College London); E. Taillie (University of Rochester School of Medicine and Dentistry); S. Edman, J. Garbarini, J. Tusi and S. Woyciechowski (Children’s Hospital of Philadelphia); D. Awad, C. Breton, K. Celia, C. Duarte, D. Etwaru, N. Fishman, E. Griffin, M. Kaspakoval, J. Kline, R. Korsin, A. Lanz, E. Marquez, D. Queen, A. Rodriguez, J. Rose, J. K. Sond, D. Warburton, A. Wilpers and R. Yee (Columbia Medical School); D. Gruber (Cohen Children’s Medical Center, Northwell Health). These data were generated by the PCGC, under the auspices of the Bench to Bassinet Program (https://benchtobassinet.com) of the NHLBI. The results analyzed and published here are based in part on data generated by Gabriella Miller Kids First Pediatric Research Program projects phs001138.v1.p2/phs001194.v1.p2, and were accessed from the Kids First Data Resource Portal (https://kidsfirstdrc.org/) and/or dbGaP (www.ncbi.nlm.nih.gov/gap). This manuscript was prepared in collaboration with investigators of the PCGC and has been reviewed and/or approved by the PCGC. PCGC investigators are listed at https://benchtobassinet.com/?page_id=119. This work was supported in part through the computational resources and staff expertise provided by Scientific Computing at the Icahn School of Medicine at Mount Sinai. We are grateful to all of the families at the participating Simons Simplex Collection (SSC) sites, as well as the principal investigators (A. Beaudet, R. Bernier, J. Constantino, E. Cook, E. Fombonne, D. Geschwind, R. Goin-Kochel, E. Hanson, D. Grice, A. Klin, D. Ledbetter, C. Lord, C. Martin, D. Martin, R. Maxim, J. Miles, O. Ousley, K. Pelphrey, B. Peterson, J. Piggot, C. Saulnier, M. State, W. Stone, J. Sutcliffe, C. Walsh, Z. Warren and E. Wijsman). We appreciate the access obtained to phenotypic and/or genetic data on SFARI Base. Approved researchers can obtain the SSC population dataset described in this study (https://www.sfari.org/resource/simons-simplex-collection) by applying at https://base.sfari.org. This work was supported by the Mount Sinai Medical Scientist Training Program (5T32GM007280 to F.R.), National Institute of Dental and Craniofacial Research Interdisciplinary Training in Systems and Developmental Biology and Birth Defects (T32HD075735 to F.R.), Harvard Medical School Epigenetic and Gene Dynamics Award (S.U.M. and C.E.S.), American Heart Association Post-Doctoral Fellowship (S.U.M.), and Howard Hughes Medical Institute (C.E.S.). Research conducted at the E.O. Lawrence Berkeley National Laboratory was supported by National Institutes of Health (NIH) grants (UM1HL098166 and R24HL123879) and performed under Department of Energy Contract DE-AC02-05CH11231, University of California. O.T. is a CIFAR fellow and this work was partially supported by NIH grant R01GM071966. The PCGC program is funded by the NHLBI, NIH, US Department of Health and Human Services through grants UM1HL128711, UM1HL098162, UM1HL098147, UM1HL098123, UM1HL128761 and U01HL131003. The PCGC Kids First study includes data sequenced by the Broad Institute (U24 HD090743-01).

Author information

These authors contributed equally: Felix Richter, Sarah U. Morton, Seong Won Kim, Alexander Kitaygorodsky, Lauren K. Wasson, Kathleen M. Chen.
These authors jointly supervised this work: Deepak Srivastava, Martin Tristani-Firouzi, Olga G. Troyanskaya, Diane E. Dickel, Yufeng Shen, Jonathan G. Seidman, Christine E. Seidman, Bruce D. Gelb.

Authors and Affiliations

Graduate School of Biomedical Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
Felix Richter & Kathryn B. Manheimer
Department of Pediatrics, Harvard Medical School, Boston, MA, USA
Sarah U. Morton
Division of Newborn Medicine, Boston Children’s Hospital, Boston, MA, USA
Sarah U. Morton
Department of Genetics, Harvard Medical School, Boston, MA, USA
Seong Won Kim, Lauren K. Wasson, Steven R. DePalma, Michael Parfenov, Jason Homsy, Joshua M. Gorham, Jonathan G. Seidman & Christine E. Seidman
Departments of Systems Biology and Biomedical Informatics, Columbia University, New York, NY, USA
Alexander Kitaygorodsky, Hongjian Qi & Yufeng Shen
Flatiron Institute, Simons Foundation, New York, NY, USA
Kathleen M. Chen, Jian Zhou & Olga G. Troyanskaya
Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
Jian Zhou & Olga G. Troyanskaya
Lyda Hill Department of Bioinformatics, University of Texas Southwestern Medical Center, Dallas, TX, USA
Jian Zhou
Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
Nihir Patel, Eric E. Schadt & Bruce D. Gelb
Center for External Innovation, Takeda Pharmaceuticals USA, Cambridge, MA, USA
Jason Homsy
Sema4, Stamford, CT, USA
Kathryn B. Manheimer & Eric E. Schadt
Department of Human Genetics, Utah Center for Genetic Discovery, University of Utah School of Medicine, Salt Lake City, UT, USA
Matthew Velinder, Andrew Farrell & Gabor Marth
Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
Eric E. Schadt
Heart Development and Structural Diseases Branch, Division of Cardiovascular Sciences, NHLBI/NIH, Bethesda, MD, USA
Jonathan R. Kaltman
Boston Children’s Hospital, Boston, MA, USA
Jane W. Newburger
Cardiorespiratory Unit, Great Ormond Street Hospital, London, UK
Alessandro Giardini
Division of Cardiology, Children’s Hospital of Philadelphia, Philadelphia, PA, USA
Elizabeth Goldmuntz
Department of Pediatrics, The Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
Elizabeth Goldmuntz
Departments of Pediatrics and Genetics, Yale University School of Medicine, New Haven, CT, USA
Martina Brueckner
Children’s Hospital Los Angeles, Los Angeles, CA, USA
Richard Kim
Department of Pediatrics, University of Rochester, Rochester, NY, USA
George A. Porter Jr.
Department of Pediatrics, Stanford University, Palo Alto, CA, USA
Daniel Bernstein
Departments of Pediatrics and Medicine, Columbia University Medical Center, New York, NY, USA
Wendy K. Chung
Gladstone Institute of Cardiovascular Disease and University of California San Francisco, San Francisco, CA, USA
Deepak Srivastava
Division of Pediatric Cardiology, University of Utah School of Medicine, Salt Lake City, UT, USA
Martin Tristani-Firouzi
Department of Computer Science, Princeton University, Princeton, NJ, USA
Olga G. Troyanskaya
Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Lab, Berkeley, CA, USA
Diane E. Dickel
Department of Cardiology, Brigham and Women’s Hospital, Boston, MA, USA
Christine E. Seidman
Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
Bruce D. Gelb
Department of Pediatrics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
Bruce D. Gelb


Contributions

F.R., S.U.M., S.W.K., A.K., L.K.W., K.M.C., J.R.K., O.G.T., D.E.D., Y.S., J.G.S., C.E.S. and B.D.G. conceived and designed the experiments/analyses. J.R.K., J.W.N., A.G., E.G., M.B., R.K., G.A.P., D.B., W.K.C., D.S., M.T.-F., J.G.S., C.E.S. and B.D.G. contributed to cohort ascertainment, phenotypic characterization and recruitment. F.R., S.U.M., A.K., H.Q., N.P., S.R.D., M.P., J.H., J.M.G., K.B.M., M.V., A.F., G.M., W.K.C., Y.S., J.G.S., C.E.S. and B.D.G. contributed to whole-genome sequencing production, validation and analysis. F.R., S.U.M., A.K., K.M.C., H.Q., E.E.S., O.G.T., Y.S., J.G.S., C.E.S. and B.D.G. contributed to statistical analyses. F.R., K.M.C., J.Z., O.G.T. and B.D.G. developed the HeartENN model. S.U.M., S.W.K., L.K.W., D.E.D., J.G.S. and C.E.S. generated and analyzed fetal heart and iPSC data. F.R., S.U.M., S.W.K., A.K., L.K.W., K.M.C., Y.S., J.G.S., C.E.S. and B.D.G. wrote and reviewed the manuscript. All authors read and approved the manuscript.

Corresponding author

Correspondence to Bruce D. Gelb.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Other pipelines identified 94% of DNVs in control trios.

Overlaps with DNVs identified in 1,470 control trios with two other pipelines⁹,¹⁰. Of note, a third analysis of these trios did not include de novo calls⁴². For consistency with other pipelines, only SNVs were included and variants in LCRs, blacklists, segmental duplications, and repeats were excluded. Together, 94% of de novo SNVs were called by at least one other pipeline.
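The overlap statistic in this caption (the share of calls also reported by at least one other pipeline) can be illustrated with a small set computation. This is a minimal sketch, not the authors' comparison code: the `(chrom, pos, ref, alt)` variant key, the function name `fraction_recovered`, and the toy call sets are all assumptions made for illustration.

```python
from typing import Iterable, Set, Tuple

# Hypothetical variant key: (chromosome, position, reference allele, alternate allele).
Variant = Tuple[str, int, str, str]


def fraction_recovered(ours: Iterable[Variant], *others: Iterable[Variant]) -> float:
    """Fraction of our calls that appear in at least one other call set."""
    ours_set: Set[Variant] = set(ours)
    if not ours_set:
        raise ValueError("no variants supplied")
    # Union of every comparison pipeline's calls (empty if none are given).
    union_of_others: Set[Variant] = set().union(*map(set, others))
    return len(ours_set & union_of_others) / len(ours_set)


# Toy call sets, invented for illustration only.
ours = [("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"),
        ("chr2", 75, "G", "A"), ("chr3", 9, "T", "C")]
pipeline_a = [("chr1", 100, "A", "G"), ("chr2", 75, "G", "A")]
pipeline_b = [("chr1", 250, "C", "T"), ("chr5", 1, "A", "C")]

print(fraction_recovered(ours, pipeline_a, pipeline_b))  # 3 of 4 calls -> 0.75
```

In a real comparison the same region filters (LCRs, blacklists, segmental duplications, repeats) would need to be applied to every call set before computing the overlap.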

Extended Data Fig. 2 Correlation between parental age at proband birth and DNVs/trio.

Multiple linear regression ($y = \beta_{\text{paternal\_age}}\,x_{\text{paternal\_age}} + \beta_{\text{maternal\_age}}\,x_{\text{maternal\_age}} + \beta_{\text{intercept}} + \varepsilon$, where $y$ is the number of DNVs per trio) was fitted on 763 CHD and 1,611 unaffected individuals to calculate the associations between paternal and maternal age for SNVs, indels, and combined. Regression coefficients and P values are shown, uncorrected for multiple hypotheses. Sequencing metric comparisons between the centers, colored by cases (n = 763) and controls (n = 1,611), found moderate bias in DNV quantity, so the background statistical parameter throughout the manuscript is total number of DNVs. Box plots show medians and interquartile ranges.
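The regression in this caption can be sketched as an ordinary least-squares fit of DNVs per trio on paternal and maternal age. The example below is an illustration under stated assumptions, not the authors' analysis: the data are synthetic, the generating coefficients (1.4, 0.4, 10) are arbitrary placeholders, and the helper names are hypothetical. It solves the normal equations with the standard library only.

```python
import random
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_parental_age_model(paternal_age: Sequence[float],
                           maternal_age: Sequence[float],
                           dnvs_per_trio: Sequence[float]) -> List[float]:
    """OLS fit; returns [beta_paternal_age, beta_maternal_age, beta_intercept]."""
    rows = [[p, mo, 1.0] for p, mo in zip(paternal_age, maternal_age)]
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, dnvs_per_trio)) for i in range(3)]
    return solve_linear_system(xtx, xty)


# Synthetic trios: ages and effect sizes are invented, not taken from the study.
rng = random.Random(0)
paternal = [rng.uniform(20, 50) for _ in range(500)]
maternal = [rng.uniform(18, 45) for _ in range(500)]
counts = [1.4 * p + 0.4 * mo + 10 + rng.gauss(0, 5)
          for p, mo in zip(paternal, maternal)]

print(fit_parental_age_model(paternal, maternal, counts))
```

With real data a statistics package would normally be used instead, since it also reports standard errors and the P values mentioned in the caption.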

Extended Data Fig. 3 De novo variant (DNV) CHD-unaffected burden.

The number of DNVs in 184 noncoding annotations (points) genome-wide and within 10 kb of TSSs for 6 gene sets (facets) was counted in CHD (n = 749) and Simons unaffected (n = 1,611) individuals. The P value threshold (1.5 × 10⁻⁴, horizontal blue line) is 0.05 divided by the product of the number of effective annotations (n = 47) and number of gene sets (n = 7). The P value (y-axis) was calculated with a two-sided Fisher’s exact test; the odds ratio (x-axis) was DNVs_{annotation,CHD}/DNVs_{total,CHD} vs. DNVs_{annotation,unaffected}/DNVs_{total,unaffected}. No annotations surpassed the P value threshold. CHD, congenital heart disease; HHE, high heart expression.
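The threshold arithmetic in this legend can be reproduced with a short sketch. The function names are illustrative assumptions, and `fraction_ratio` follows the literal wording of the legend (a ratio of DNV fractions), which may differ slightly from the odds ratio reported by a Fisher’s exact test implementation.

```python
def bonferroni_threshold(alpha: float, n_effective_annotations: int, n_gene_sets: int) -> float:
    """Per-test P value threshold: alpha divided by the number of tests."""
    return alpha / (n_effective_annotations * n_gene_sets)


def fraction_ratio(case_hits: int, case_total: int, ctrl_hits: int, ctrl_total: int) -> float:
    """Fraction of DNVs in an annotation among cases, relative to the same fraction among controls."""
    return (case_hits / case_total) / (ctrl_hits / ctrl_total)


# 47 effective annotations and 7 gene sets, as stated in the legend.
threshold = bonferroni_threshold(0.05, 47, 7)
print(f"{threshold:.1e}")  # 1.5e-04
```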

Extended Data Fig. 4 HeartENN performance was comparable to DeepSEA.

HeartENN ROC AUC mean = 0.93 and AUPRC mean = 0.34. ROC AUC, receiver operating characteristic area under the curve; AUPRC, area under the precision recall curve.

Extended Data Fig. 5 Determining an absolute functional difference score range.

a, Comparison of HGMD disease mutations (blue, n = 1,564) and polymorphism (gray, n = 642) DeepSEA absolute functional difference scores at varying functional cut-offs illustrates a similar distribution and functionally impactful range ≥0.1 (arrow) for disease mutations. No statistical significance testing was performed. b, The similarity of null distributions for DeepSEA (gray, downsampled to 184 features) and HeartENN (red) HGMD polymorphism scores suggested that the DeepSEA functional score range was also applicable to HeartENN (gray and red n = 642). Scores of 0 set off to left (as 10⁻⁴).

Extended Data Fig. 6 Support for HeartENN ≥ 0.1 functional ranking.

For all DNVs (n = 170,171), overlap between HeartENN ≥0.1 (n = 6,415) and other noncoding scores was assessed with a two-sided Fisher’s exact test (left panel). Case–control burden for these other noncoding scores (right panel) was statistically significant for CADD ≥15 (P_Bonferroni = 0.019) with a two-sided Fisher’s exact test (cases n = 56,164 and controls n = 114,065). For both panels, unadjusted P-values are tabulated, and red indicates a Benjamini-Hochberg-adjusted P value (false discovery rate, FDR) < 0.05.

Extended Data Fig. 7 Relationship between sequence length inserted into the pMPRA1 plasmid and the transcript reads/plasmid copies in MPRAs.

The length of the sequences inserted into the pMPRA1 plasmid (x-axis) ranged from 300 to 1,600 bp. After transfection of four libraries (color coded as per key) into the iPSC–CMs, the resulting ratios of transcript reads (mRNA) per plasmid copies (DNA) are graphed on the y-axis, showing no systematic relationship between insert length and transcriptional level.

Extended Data Fig. 8 DNVs with a trend towards decreased expression by MPRA assay.

Box plots for two DNVs for which two MPRA replicates were significantly different but overall statistical significance across all replicates was not attained. Boxplots show the median fold change (FC), first and third quartiles (lower and upper hinges), and range of values (whiskers and outlying points). Statistical significance was assessed with two-sided t-test Benjamini-Hochberg-adjusted P-values. Each boxplot has at least 3 independent experiments with 4 technical replicates each.

Extended Data Fig. 9 Fraction of DNVs in each of the canonical variant classes.

The fraction was calculated separately within CHD and unaffected subjects for each of the three methods (including overlaps); the total number of variants in each group is shown in the table on the right.

Extended Data Fig. 10 DNV enrichment in phenotype subgroups.

a, Enrichment of DNVs with predicted functional impacts (score ≥0.1) for HeartENN (left) and DeepSEA (right) within phenotype subgroups. b, Enrichment of de novo SNVs with H3K36me3 marks implicated in RNA-binding protein disruption in different subgroups for the most significant (left) and highest effect size (right) hits. Both a and b were performed with a two-sided Fisher’s exact test (unadjusted P-values and 95% C.I.s shown) comparing the fraction of DNVs in each subgroup (HeartENN ≥ 0.1, DeepSEA ≥ 0.1, etc.) to the same control cohort. For HeartENN, there were n = 4,177 control DNVs with HeartENN ≥ 0.1 and n = 109,888 control DNVs with HeartENN < 0.1. NDD, neurodevelopmental disorder; ECA, extracardiac anomaly.
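The two-sided Fisher’s exact test used for these comparisons can be sketched with the standard library alone. The control counts are those given in this legend; the case counts are an assumption assembled from figures stated elsewhere in the article (2,238 case DNVs with HeartENN ≥ 0.1 in the abstract, and 56,164 case DNVs in total in Extended Data Fig. 6), and they describe the full CHD cohort rather than any phenotype subgroup. The function below is a generic textbook implementation, not the authors' code.

```python
from math import exp, lgamma


def log_choose(n: int, k: int) -> float:
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(a: int, b: int, c: int, d: int):
    """Two-sided Fisher's exact test on the 2x2 table [[a, b], [c, d]].

    Returns (sample odds ratio, P value). The P value sums the hypergeometric
    probabilities of all tables no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    log_denominator = log_choose(row1 + row2, col1)

    def log_prob(x: int) -> float:
        return log_choose(row1, x) + log_choose(row2, col1 - x) - log_denominator

    log_observed = log_prob(a)
    p_value = 0.0
    for x in range(max(0, col1 - row2), min(row1, col1) + 1):
        lp = log_prob(x)
        if lp <= log_observed + 1e-7:  # small tolerance for floating-point ties
            p_value += exp(lp)
    odds_ratio = float("inf") if b * c == 0 else (a * d) / (b * c)
    return odds_ratio, min(p_value, 1.0)


# Rows: cases, controls. Columns: HeartENN >= 0.1, HeartENN < 0.1.
case_high = 2238
case_low = 56164 - 2238       # assumed: total case DNVs minus high-scoring case DNVs
control_high, control_low = 4177, 109888

odds_ratio, p_value = fisher_exact_two_sided(case_high, case_low, control_high, control_low)
print(f"odds ratio = {odds_ratio:.2f}")
```

A subgroup analysis would replace the first row with the counts for that subgroup while keeping the same control row, as described in the legend.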

Supplementary information

Supplementary Information

Supplementary Note and Fig. 1

Reporting Summary

Supplementary Table

Supplementary Tables 1–16

About this article

Cite this article

Richter, F., Morton, S.U., Kim, S.W. et al. Genomic analyses implicate noncoding de novo variants in congenital heart disease. Nat Genet 52, 769–777 (2020). https://doi.org/10.1038/s41588-020-0652-z

Received: 09 March 2019
Accepted: 22 May 2020
Published: 29 June 2020
Issue Date: August 2020
DOI: https://doi.org/10.1038/s41588-020-0652-z
