Whole genome sequencing of 55 exome-negative trios identifies putative rare pathogenic noncoding variants in infants with congenital anomalies
Laboratory Genetics and Genomics
Primary Categories:
- Genomic Medicine
Secondary Categories:
- Genomic Medicine
Introduction:
Children born with congenital anomalies represent a population enriched for genetically mediated disease. Clinical exome sequencing fails to identify a genetic cause in more than half of children with congenital anomalies. Exome sequencing targets coding regions and therefore misses noncoding variants, which make up the majority of human genetic variation. Noncoding variants are difficult to interpret because of their sheer number and the lack of standards for predicting their functional consequences. Identifying pathogenic noncoding variants therefore requires additional epigenetic and functional evidence. We hypothesize that rare, high-effect noncoding variants are responsible for congenital anomalies in a subset of infants. We leveraged trio whole genome sequencing and functional genomic approaches to systematically identify and analyze putatively pathogenic noncoding variation.
Methods:
Whole genome sequencing data were generated from blood samples from infants enrolled in the Birth Defects Biorepository at the Children’s Hospital of Philadelphia. Only probands with negative clinical exome sequencing were included. Parental samples were also analyzed to identify variants with appropriate segregation. Informatic analysis prioritized variants that arose newly in the proband (de novo) and recessively inherited variants as putative pathogenic variants. These variants were filtered against public variant databases to prioritize rare variation. Variants were annotated for regulatory activity using public and in-house resources. Clinical information was gathered and manually reviewed to determine primary and secondary phenotypes.
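The segregation and rarity filters described above can be illustrated with a minimal sketch. It is not the pipeline used in this study: the `TrioCall` record, the `classify` function, and the allele frequency cutoff `MAX_AF` are hypothetical names and values chosen for illustration, and "recessive" is simplified here to a homozygous proband with two heterozygous parents (compound heterozygosity is not modeled).

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Genotype = Tuple[int, int]  # two alleles; 0 = reference, >0 = alternate


@dataclass(frozen=True)
class TrioCall:
    """One variant site genotyped in a proband and both parents (hypothetical record)."""
    variant_id: str
    child_gt: Genotype
    mother_gt: Genotype
    father_gt: Genotype
    population_af: Optional[float]  # None if absent from the reference database


# Assumed rarity cutoff; the abstract does not state the threshold used.
MAX_AF = 0.001


def alt_count(gt: Genotype) -> int:
    """Number of alternate alleles in a diploid genotype."""
    return sum(1 for allele in gt if allele > 0)


def classify(call: TrioCall, max_af: float = MAX_AF) -> Optional[str]:
    """Return 'de_novo', 'recessive', or None for a single trio call."""
    # Treat variants missing from the database as having frequency zero.
    af = 0.0 if call.population_af is None else call.population_af
    if af > max_af:
        return None  # too common to prioritize

    child = alt_count(call.child_gt)
    mother = alt_count(call.mother_gt)
    father = alt_count(call.father_gt)

    # De novo: present in the proband, absent from both parents.
    if child >= 1 and mother == 0 and father == 0:
        return "de_novo"
    # Simplified recessive model: homozygous proband, both parents carriers.
    if child == 2 and mother == 1 and father == 1:
        return "recessive"
    return None


if __name__ == "__main__":
    calls = [
        TrioCall("var_a", (0, 1), (0, 0), (0, 0), None),
        TrioCall("var_b", (1, 1), (0, 1), (0, 1), 0.0005),
        TrioCall("var_c", (0, 1), (0, 1), (0, 0), 0.0),
    ]
    for call in calls:
        print(call.variant_id, classify(call))
```

In practice, genotype quality, read depth, and sex-chromosome inheritance would also need to be considered before a call is accepted as de novo or recessive.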
Results:
Fifty-five exome-negative probands with full trio sequencing were included in the analysis. Clinical phenotypes were varied, but the cohort was enriched for infants born with congenital diaphragmatic hernia and congenital heart disease. On average, we identified ~80 de novo variants and ~100 recessive variants per proband. Most probands harbored variants near genes with known Mendelian phenotypes. A majority of variants overlapped a gene regulatory annotation. These variants were further nominated for functional testing using a massively parallel reporter assay.
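Checking whether a variant overlaps a regulatory annotation amounts to an interval lookup. The sketch below shows one simple way to do this, assuming annotations are supplied as 0-based, half-open intervals per chromosome (the BED convention). The function name and the example coordinates are hypothetical and are not data from this study.

```python
from typing import Dict, List, Tuple

# Chromosome -> list of (start, end) intervals, 0-based and half-open.
Annotations = Dict[str, List[Tuple[int, int]]]


def overlaps_regulatory(chrom: str, pos0: int, annotations: Annotations) -> bool:
    """True if the 0-based position falls inside any annotated interval."""
    return any(start <= pos0 < end for start, end in annotations.get(chrom, []))


if __name__ == "__main__":
    # Made-up intervals for illustration only.
    example: Annotations = {"chr1": [(1000, 1500), (5000, 5200)]}
    print(overlaps_regulatory("chr1", 1200, example))
    print(overlaps_regulatory("chr1", 1500, example))
```

A linear scan like this is adequate for a few hundred variants per proband; an interval tree or a dedicated tool would be a better fit for genome-wide annotation sets.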
Conclusion:
Informatic analysis prioritized the noncoding variation with the highest likelihood of pathogenicity. This approach substantially reduces the variant burden when searching for putative pathogenic noncoding variation. The final variant count is amenable to high-throughput approaches for functional validation, which further nominate pathogenic noncoding variants. This protocol can readily be applied to clinical sequencing data. This work serves as an initial step toward providing a genetic diagnosis in undiagnosed cases and increasing the understanding of pathogenic noncoding variation.