Provenance/Identity Testing in Genome Sequencing: A Single Institution’s Experience
Laboratory Genetics and Genomics
Primary Categories:
- Laboratory Genetics
Secondary Categories:
- Laboratory Genetics
Introduction:
The increasing use of genome sequencing (GS) as a first-tier diagnostic tool for pediatric and adolescent individuals with suspected genetic conditions has brought significant improvements in diagnostic accuracy and efficiency. However, issues related to sample integrity, such as identity mismatches (sample swap), sample contamination, and incidental findings (e.g. misattributed parentage (MP) in duo/trio testing), remain a challenge for clinical laboratories. To address these issues, clinical laboratories are required to establish stringent quality control (QC) measures in sample handling and provenance to ensure accurate bioinformatics processing and integrity of identity throughout the testing process. This study reports our experience with sample provenance and identity testing in GS in a pediatric clinical setting.
Methods:
Between January 2024 and October 2024, genome sequencing was performed on 572 pediatric and adolescent individuals with suspected genetic disorders. To ensure sample provenance and integrity and to verify familial relationships, multiple QC steps are employed in our GS workflow. A custom mass spectrometry (MS)-based genotyping panel of 30 autosomal single-nucleotide polymorphisms (SNPs) and 3 sex markers is run concurrently with GS. The SNPs have a high minor allele frequency (MAF) in the population, which helps distinguish one individual from another. Genotype output from MS with end-point PCR is compared against genotypes derived from the aligned GS reads in the BAM file. Genotypes are displayed in a custom user interface that enables comparison of SNPs both within and between individuals for next-generation sequencing (NGS) and MS-derived data.
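The MS-versus-GS comparison described above can be illustrated with a minimal sketch. It assumes both assays have already been reduced to unphased two-letter genotype strings keyed by SNP identifier; the SNP identifiers, the function names, and the toy calls are hypothetical and are not taken from the laboratory's pipeline.

```python
from typing import Dict, List, Optional, Tuple


def normalize(gt: Optional[str]) -> Optional[str]:
    """Sort alleles so that 'GA' and 'AG' compare equal; None or '' means no call."""
    if not gt:
        return None
    return "".join(sorted(gt.upper()))


def concordance(
    ms: Dict[str, Optional[str]], gs: Dict[str, Optional[str]]
) -> Tuple[int, int, List[str]]:
    """Return (matching, compared, discordant_ids) over SNPs called in both assays."""
    matching, compared, discordant = 0, 0, []
    for snp_id, ms_gt in ms.items():
        a, b = normalize(ms_gt), normalize(gs.get(snp_id))
        if a is None or b is None:
            continue  # skip SNPs with a missing call in either assay
        compared += 1
        if a == b:
            matching += 1
        else:
            discordant.append(snp_id)
    return matching, compared, discordant


# Hypothetical calls for one sample (illustrative values only).
ms_calls = {"rs0001": "AG", "rs0002": "CC", "rs0003": "TT", "rs0004": None}
gs_calls = {"rs0001": "GA", "rs0002": "CC", "rs0003": "CT", "rs0004": "AA"}

matching, compared, discordant = concordance(ms_calls, gs_calls)
print(matching, compared, discordant)
```

In practice, a sample whose MS and GS genotypes disagree at more than a small number of sites would be flagged for review. The threshold is a laboratory-specific choice that is not stated here.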
Furthermore, we employ a panel of 224 identity-informative SNP sites, selected for their high (~0.5) population MAF in multiple ancestral groups, to compute genetic kinship between samples undergoing NGS analysis. Samples are also assessed for contamination with VerifyBamID2, which simultaneously infers and reports sample ancestry. Sample sex is confirmed by a coverage depth-based count of X and Y chromosomes, which is compared with the reported sex of the individual undergoing testing.
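The abstract does not name the kinship estimator that is used. As one possible illustration, the sketch below computes a pairwise coefficient of the form used by KING-robust-style estimators, (shared heterozygotes − 2 × opposite homozygotes) / (heterozygotes in sample i + heterozygotes in sample j), together with the count of opposite-homozygote (IBS0) sites. Genotypes are assumed to be encoded as alternate-allele counts (0, 1, 2, or None for no call); the function name and toy genotype vectors are hypothetical, and eight sites are far too few for a real estimate.

```python
from typing import Optional, Sequence, Tuple

Genotype = Optional[int]  # 0, 1, 2 alternate alleles, or None for no call


def kinship(
    g_i: Sequence[Genotype], g_j: Sequence[Genotype]
) -> Tuple[Optional[float], int]:
    """Return (kinship estimate, IBS0 count) for two samples over shared sites."""
    shared_het = ibs0 = het_i = het_j = 0
    for a, b in zip(g_i, g_j):
        if a is None or b is None:
            continue  # use only sites called in both samples
        het_i += int(a == 1)
        het_j += int(b == 1)
        shared_het += int(a == 1 and b == 1)
        ibs0 += int(abs(a - b) == 2)  # opposite homozygotes (0 vs 2)
    denom = het_i + het_j
    phi = (shared_het - 2 * ibs0) / denom if denom else None
    return phi, ibs0


# Toy genotype vectors (illustrative only).
child = [0, 1, 2, 1, 1, 0, 2, 1]
father = [0, 1, 1, 1, 2, 0, 2, 0]      # no opposite-homozygote sites vs. child
unrelated = [2, 1, 0, 1, 0, 2, 0, 1]   # several opposite-homozygote sites

phi_parent, ibs0_parent = kinship(child, father)
phi_unrel, ibs0_unrel = kinship(child, unrelated)
phi_self, ibs0_self = kinship(child, child)
print(phi_parent, ibs0_parent)
print(phi_unrel, ibs0_unrel)
print(phi_self, ibs0_self)
```

Under this form, identical samples give 0.5 and a true parent-child pair is expected to give about 0.25 with no IBS0 sites (barring genotyping error). A reported parent-child pair with many IBS0 sites and a value near or below zero would suggest misattributed parentage or a sample swap.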
Results:
Our QC process identified 5 cases with provenance issues: 2 cases of non-paternity, 1 case of non-maternity (undisclosed egg donor), and 2 sample handling errors between a proband and a father (one pre-analytical, prior to receipt in the laboratory, and one analytical). In total, 1342 individuals were processed for GS across singleton (n=125), duo (n=124), and trio (n=323) case analyses. These issues affected 0.87% (5/572) of the total case cohort and 0.45% (6/1342) of total individuals tested. Similar to other published clinical cohorts, MP was identified in 0.67% of this cohort (3 individuals among 447 total duo+trio cases). However, this rate is lower than that reported for MP in the general population, likely because families with known or suspected MP are reluctant to undergo duo/trio testing or, possibly, any GS testing. In response to MP findings, the ordering clinicians opted to convert duo/trio tests to singleton/duo tests without disclosure of MP in the clinical reports. Orthogonal data, including short tandem repeat testing, sample redraw, or additional documented clinical information on parentage, were used to resolve the underlying cause of each identity mismatch.
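The cohort totals and rates reported above follow directly from the case counts, as this short check shows. The only assumption is that each singleton, duo, and trio case contributes 1, 2, and 3 sequenced individuals, respectively; the variable names are illustrative.

```python
# Case counts as reported in the Results.
cases = {"singleton": 125, "duo": 124, "trio": 323}
# Assumption: individuals sequenced per case type.
individuals_per_case = {"singleton": 1, "duo": 2, "trio": 3}

total_cases = sum(cases.values())
total_individuals = sum(cases[k] * individuals_per_case[k] for k in cases)
family_cases = cases["duo"] + cases["trio"]

print(total_cases, total_individuals, family_cases)  # 572 1342 447
print(f"{5 / total_cases:.2%}")        # provenance issues per case: 0.87%
print(f"{6 / total_individuals:.2%}")  # affected individuals: 0.45%
print(f"{3 / family_cases:.2%}")       # MP among duo+trio cases: 0.67%
```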
Conclusion:
Our study demonstrates the importance and effectiveness of the QC measures in identifying sample provenance issues in GS. Confirming the identity and genetic relatedness of samples, especially in family-based testing, is essential for accurate variant interpretation and for ensuring proper clinical management and effective familial risk assessment.