Validating array genotype data in the Million Veteran Program (MVP) for actionable, clinically significant rare variants

Clinical Genetics and Therapeutics
  • Primary Categories:
    • Genomic Medicine
  • Secondary Categories:
    • Genomic Medicine
Introduction:
The Million Veteran Program (MVP) is a national research program initiated by the United States Department of Veterans Affairs (VA), designed to advance precision healthcare by learning how genes, lifestyle, military experience, and exposure affect health and illness. MVP is a diverse cohort and one of the largest to be associated with a healthcare system. MVP has released over 660,000 array genotypes, 104,924 whole genome sequences (WGS), and 45,460 methylation data sets to the research community.

Methods:
The MVP custom Axiom genotyping array includes ~40% rare markers of clinical significance. Because rare variants are hard to genotype accurately, we developed a workflow to increase the accuracy of the Axiom array genotyping results. We adopted the Rare Heterozygous Adjustment algorithm (rareHet algorithm) to improve rare heterozygous genotype calling and developed an advanced SNP QC pipeline to harmonize the data. We also developed a Support Vector Machine (SVM) model to predict true rare heterozygous calls, which was used for a return-of-results project.

 

Results:
Out of 11,296,220 rare heterozygous genotypes from 159,122 participants on 145,312 markers genotyped with the rareHet algorithm and passing our advanced SNP QC pipeline, 10,992,489 (97.3%) were predicted as true heterozygous by the SVM model. We compared array and WGS genotypes in 3,990 samples and obtained 98.59%, 98.8%, 98.99%, and 99.16% concordance for the MAF bins [0%-0.001%], [0.001%-0.005%], [0.005%-0.01%], and [0.01%-1%], respectively. For familial hypercholesterolemia (FH) actionable gene ALT, 100% of the rare variant calls for individual MVP participants were confirmed by clinical tests. We further interrogated the pharmacogenomic genes for accuracy of the star allele calls.
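
As an illustration of the array-versus-WGS comparison, the following is a minimal sketch of per-bin genotype concordance. It is not the MVP pipeline: the function names, the input layout (one tuple per call), the genotype strings, and the choice to place a boundary MAF value in the lower bin are all assumptions made for the example, and the toy data are invented.

```python
from collections import defaultdict

# MAF bin edges in percent, mirroring the bins reported in the Results.
MAF_BINS = [(0.0, 0.001), (0.001, 0.005), (0.005, 0.01), (0.01, 1.0)]


def maf_bin(maf_percent):
    """Return the first (low, high) bin containing maf_percent, or None.

    Bins are checked in order, so a value on a shared edge lands in the
    lower bin (an assumption; the abstract does not specify edge handling).
    """
    for low, high in MAF_BINS:
        if low <= maf_percent <= high:
            return (low, high)
    return None


def concordance_by_bin(calls):
    """Compute the fraction of matching array/WGS genotypes per MAF bin.

    calls: iterable of (maf_percent, array_genotype, wgs_genotype) tuples
    (a hypothetical layout). Missing genotypes (None) are skipped.
    """
    agree = defaultdict(int)
    total = defaultdict(int)
    for maf_percent, array_gt, wgs_gt in calls:
        b = maf_bin(maf_percent)
        if b is None or array_gt is None or wgs_gt is None:
            continue  # ignore out-of-range markers and missing calls
        total[b] += 1
        if array_gt == wgs_gt:
            agree[b] += 1
    return {b: agree[b] / total[b] for b in total}


# Toy input, invented for illustration only.
toy_calls = [
    (0.0005, "0/1", "0/1"),
    (0.0005, "0/1", "0/0"),
    (0.05, "0/1", "0/1"),
    (0.05, "0/1", None),
]
result = concordance_by_bin(toy_calls)
```

Grouping by MAF bin before dividing keeps the rarest markers, where miscalls are most likely, from being averaged away by the more common ones.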

 

Conclusion:
In our preliminary analysis, we found 93.5% concordance of star allele calls between clinical tests and MVP genotype array data. We are performing three-way validation of the star allele calls among MVP array genotypes, MVP WGS data, and clinical tests. We plan to present the cross-validation results for the pharmacogenomic genes.

 
