Real-World Evaluation of ExpansionHunter for Detecting STR Expansions in Whole-Exome Sequencing Data: Insights from a Clinical Diagnostic Setting
Laboratory Genetics and Genomics
Primary Categories:
- Laboratory Genetics
Secondary Categories:
- Laboratory Genetics
Introduction:
ExpansionHunter is a bioinformatics tool designed to detect and analyze repeat expansions in short tandem repeats (STRs) using short-read sequencing data. It identifies reads that span, flank, or are fully contained within each repeat in BAM/CRAM files and maps them to the reference genome to estimate the size of the repeat expansion at each locus. STR expansions are implicated in a variety of genetic disorders, including Huntington’s disease and several spinocerebellar ataxias. ExpansionHunter has demonstrated considerable sensitivity and specificity in detecting STR expansions from PCR-free whole-genome sequencing data. However, its application to whole-exome sequencing (WES) remains underexplored, even though numerous disease-associated STR loci are located within coding regions. Here, we present a real-world evaluation of ExpansionHunter’s performance for analyzing WES data over a one-year period in a commercial diagnostic center. This study provides insights into the practical utility of the tool in a clinical diagnostic setting.
Methods:
ExpansionHunter was applied to 4,658 WES samples. All participants provided informed consent for the use of their genetic data in research. Samples with detected STR expansions, based on reference values from the STRipy database, were included in the analysis. Visual inspection of all alignments was performed using REViewer, a specialized tool for examining read alignments in tandem repeat regions. Reports from physicians who ordered the WES were reviewed to correlate the identified STR genotypes with the clinical information provided.
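The threshold-based inclusion step described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the function names, the slash-separated genotype strings (e.g. "12/87"), and the cutoff numbers are all assumptions made for the example. In practice the cutoffs would be taken from the STRipy reference values for each locus.

```python
from typing import Dict, List, Tuple


def parse_genotype(genotype: str) -> List[int]:
    """Split a genotype string such as "12/87" into integer repeat counts.

    Non-numeric parts (for example a missing call written as ".") are skipped.
    """
    return [int(part) for part in genotype.split("/") if part.strip().isdigit()]


def flag_expansions(
    genotypes: Dict[str, str], cutoffs: Dict[str, int]
) -> List[Tuple[str, int]]:
    """Return (locus, longest allele) for loci whose longest allele meets the cutoff."""
    flagged = []
    for locus, genotype in genotypes.items():
        cutoff = cutoffs.get(locus)
        if cutoff is None:
            continue  # no reference value available for this locus
        longest = max(parse_genotype(genotype), default=0)
        if longest >= cutoff:
            flagged.append((locus, longest))
    return flagged


# Hypothetical per-sample genotypes and placeholder cutoffs (NOT STRipy values).
example_genotypes = {"DMPK": "12/87", "ATXN2": "22/22", "FMR1": "30"}
example_cutoffs = {"DMPK": 50, "ATXN2": 33}

flagged = flag_expansions(example_genotypes, example_cutoffs)
print(flagged)
```

Samples flagged this way would still go through visual inspection of the read alignments before being interpreted, as in the workflow above.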
Results:
Nineteen samples were identified with expanded alleles: 1 in ATN1, 1 in ATXN2, 2 in ATXN7, 1 in GLS, 13 in DMPK, and 1 in FMR1. Five samples (1 in ATN1, 1 in FMR1, and 3 in DMPK) were excluded after visual inspection using REViewer because the calls were likely false positives. Clinical data were analyzed for the remaining 14 samples. Of these, 8 (1 in ATXN2, 2 in ATXN7, and 5 in DMPK) showed strong concordance between the identified genotype and clinical symptoms, with the ATXN2 expansion later confirmed by PCR. One case (GLS) also exhibited strong correlation, although the expansion was in a heterozygous state with no second variant identified. In the remaining 5 cases, limited or absent clinical data prevented adequate assessment of genotype-phenotype correlation.
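The per-gene counts above can be reconciled with a short tally. This is only a bookkeeping sketch using the numbers reported in this section; the variable names are illustrative. The gene assignment of the 5 unassessed cases is not stated explicitly and is inferred here by subtraction.

```python
from collections import Counter

# Expanded-allele calls per gene, as reported.
called = Counter({"ATN1": 1, "ATXN2": 1, "ATXN7": 2, "GLS": 1, "DMPK": 13, "FMR1": 1})

# Calls excluded after visual inspection with REViewer.
excluded = Counter({"ATN1": 1, "FMR1": 1, "DMPK": 3})

# Counter subtraction keeps only positive counts.
retained = called - excluded

# Cases with strong genotype-phenotype concordance (8 plus the GLS case).
strong = Counter({"ATXN2": 1, "ATXN7": 2, "DMPK": 5, "GLS": 1})

# Cases without enough clinical data (inferred by subtraction).
unassessed = retained - strong

print(sum(called.values()), sum(retained.values()), dict(unassessed))
# prints: 19 14 {'DMPK': 5}
```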
Conclusion:
This study highlights the utility of ExpansionHunter for detecting STR expansions in WES data, emphasizing its potential in clinical diagnostic workflows. Of the 19 identified cases, 14 were supported after visual inspection, with 9 showing strong genotype-phenotype correlation, including cases linked to ATXN2, ATXN7, DMPK, and GLS. This corresponds to an additional diagnostic yield of approximately 0.2% (9 of 4,658) in an unselected pool of WES samples. The exclusion of five likely false-positive results underscores the importance of integrating visual validation tools such as REViewer to ensure diagnostic accuracy. While ExpansionHunter proves to be a valuable tool for analyzing STR expansions, its application would benefit from further optimization to reduce false-positive rates in WES. This real-world evaluation supports the integration of STR expansion analysis into routine diagnostic pipelines for inherited disorders.
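The proportions behind these figures follow directly from the reported counts. The sketch below assumes that the figure of roughly 0.2% refers to the 9 concordant cases out of 4,658 samples, which is the ratio that matches it arithmetically. The helper name `pct` and the dictionary keys are illustrative.

```python
TOTAL_SAMPLES = 4658  # WES samples analyzed
CALLED = 19           # samples with an expanded allele called
RETAINED = 14         # calls supported after visual inspection
CONCORDANT = 9        # retained calls with strong genotype-phenotype correlation


def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to two decimal places."""
    return round(100 * numerator / denominator, 2)


rates = {
    "called_per_sample": pct(CALLED, TOTAL_SAMPLES),
    "retained_per_sample": pct(RETAINED, TOTAL_SAMPLES),
    "concordant_per_sample": pct(CONCORDANT, TOTAL_SAMPLES),
    "retained_of_called": pct(RETAINED, CALLED),
}

for name, value in rates.items():
    print(f"{name}: {value}%")
```

On these numbers, about 74% of the initial calls (14 of 19) survived visual inspection, which gives a rough sense of how many calls would need manual review in a comparable WES cohort.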