Improving Rare Disease Diagnosis: Performance of an Automated Pipeline for Genomic Reanalysis
Laboratory Genetics and Genomics
Primary Categories:
- Laboratory Genetics
Secondary Categories:
- Laboratory Genetics
Introduction:
Reanalysis of existing genomic data is highly effective in increasing diagnostic yield for rare disease; however, the incentive to pursue reanalysis remains low because it is time-consuming (in our hands, manual reanalysis typically requires ~0.5-3 hours per case) and because research funding and insurance reimbursement are difficult to secure. Here, we present the Broad Institute Center for Mendelian Genomics’ approach to reanalysis, including Talos, a pipeline developed in collaboration with the Garvan Institute and Murdoch Children’s Research Institute (Australia). Talos leverages frequently updated public data on gene-disease relationships and variant pathogenicity to efficiently identify the cases most likely to benefit from reanalysis.
Methods:
Talos is an open-source, cloud-enabled variant prioritization tool. It combines static annotations, such as variant impact, with continuously updated annotations (e.g., ClinVar variant classifications and PanelApp Australia gene-disease associations) to prioritize plausibly pathogenic variants within a user-provided, joint-called Variant Call Format (VCF) file. Talos also considers whether a variant or variant pair segregates with the expected mode(s) of inheritance for each gene. Phenotype match annotations can be generated if Human Phenotype Ontology (HPO)-encoded phenotype information is provided. To evaluate the performance of Talos, we ran it on a cohort of 688 rare disease families recruited through the Broad Institute Rare Genomes Project. Prior to this study, these families had undergone comprehensive whole-genome sequencing (WGS) analysis, with annual “manual” reanalysis of unsolved cases. We assessed the sensitivity of Talos for detecting known diagnostic variants and, as a measure of specificity, the number of irrelevant variants returned. Following this benchmarking evaluation, Talos was re-run monthly to assess its potential for identifying new diagnostic variants over time.
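The prioritization logic described above can be illustrated with a toy sketch. This is not Talos code or its API: the `Variant` fields, the `panel` mapping, the gene names, and the simplified inheritance rules (no phasing, no parental genotypes, no X-linked handling) are all assumptions made for illustration only.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical annotation vocabularies, chosen for this sketch.
HIGH_IMPACT = {"stop_gained", "frameshift_variant",
               "splice_donor_variant", "splice_acceptor_variant"}
PATHOGENIC = {"Pathogenic", "Likely pathogenic"}


@dataclass(frozen=True)
class Variant:
    variant_id: str
    gene: str
    consequence: str   # static annotation (variant impact)
    clinvar: str       # updatable annotation (ClinVar-style classification)
    alt_alleles: int   # proband alternate allele count: 0, 1, or 2


def is_candidate(v, panel):
    """Keep variants in a panel gene that are high impact or reported pathogenic."""
    return (v.gene in panel and v.alt_alleles > 0
            and (v.consequence in HIGH_IMPACT or v.clinvar in PATHOGENIC))


def prioritize(variants, panel):
    """Return IDs of candidates whose genotype fits a mode of inheritance of the gene.

    panel maps gene -> set of modes, here only "AD" and "AR" (assumed encoding).
    """
    by_gene = defaultdict(list)
    for v in variants:
        if is_candidate(v, panel):
            by_gene[v.gene].append(v)

    kept = []
    for gene, candidates in by_gene.items():
        modes = panel[gene]
        for v in candidates:
            dominant_fit = "AD" in modes
            # Recessive: homozygous, or part of a candidate pair (phase not checked).
            recessive_fit = "AR" in modes and (v.alt_alleles == 2 or len(candidates) >= 2)
            if dominant_fit or recessive_fit:
                kept.append(v.variant_id)
    return sorted(kept)


panel = {"GENE_A": {"AD"}, "GENE_B": {"AR"}, "GENE_C": {"AR"}}
variants = [
    Variant("v1", "GENE_A", "stop_gained", "", 1),
    Variant("v2", "GENE_B", "missense_variant", "Pathogenic", 1),
    Variant("v3", "GENE_C", "frameshift_variant", "", 1),
    Variant("v4", "GENE_C", "missense_variant", "Likely pathogenic", 1),
    Variant("v5", "GENE_D", "stop_gained", "", 2),
    Variant("v6", "GENE_B", "synonymous_variant", "Benign", 1),
]

first_run = prioritize(variants, panel)

# A later run with an updated gene-disease panel can surface a new candidate
# from the same, unchanged variant data.
updated_panel = {**panel, "GENE_D": {"AR"}}
later_run = prioritize(variants, updated_panel)
```

In this sketch, `v5` is returned only on the later run, once the hypothetical `GENE_D` is added to the panel. That mirrors the rationale for periodic re-runs: the variant calls stay fixed while the gene-disease and pathogenicity annotations change.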
Results:
Without phenotype matching, Talos achieved a maximal sensitivity of 83.9% (78/93) for in-scope diagnoses and 73.6% (78/106) for all diagnoses. Out-of-scope diagnostic variants included short tandem repeats, mitochondrial DNA variants, and loss of SMN1, which was detected by a bespoke caller. Missed in-scope variants included variants classified as likely pathogenic on the basis of internal data not yet available in public databases, de novo variants on the X chromosome in female participants, and a subset of variants with conflicting assertions in ClinVar. Talos returned an average of 1.8 variants per family without phenotype matching and an average of 1 variant per family with phenotype matching. Phenotype matching slightly reduced sensitivity, with 79.6% (74/93) of in-scope diagnoses returned. Review of all Talos-prioritized variants led to the clinical return of 4 previously unrecognized diagnoses to families.
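The reported sensitivities follow directly from the counts given above. The short sketch below reproduces that arithmetic; the function and variable names are ours and are not part of Talos.

```python
def sensitivity(detected: int, total: int) -> float:
    """Percentage of known diagnostic variants that were returned."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100 * detected / total


IN_SCOPE = 93          # diagnoses within the scope of the pipeline
ALL_DIAGNOSES = 106    # in-scope plus out-of-scope diagnoses
OUT_OF_SCOPE = ALL_DIAGNOSES - IN_SCOPE

results = {
    "in-scope, no phenotype matching": sensitivity(78, IN_SCOPE),
    "all diagnoses, no phenotype matching": sensitivity(78, ALL_DIAGNOSES),
    "in-scope, with phenotype matching": sensitivity(74, IN_SCOPE),
}

for label, value in results.items():
    print(f"{label}: {value:.1f}%")
# in-scope, no phenotype matching: 83.9%
# all diagnoses, no phenotype matching: 73.6%
# in-scope, with phenotype matching: 79.6%
```

The difference between the two denominators (106 - 93 = 13) is the number of out-of-scope diagnoses, and phenotype matching costs 4 in-scope diagnoses (78 - 74).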
Conclusion:
Talos has the potential to streamline the reanalysis of genomic data, demonstrating high sensitivity and specificity in variant prioritization within our rare disease cohort. By enabling ongoing, automated re-evaluation of unsolved cases, Talos offers a scalable way to increase diagnostic yield with a substantially smaller time commitment than traditional manual reanalysis.