Long-read sequencing approach for RPGR-related retinopathy
Laboratory Genetics and Genomics
Primary Categories:
- Laboratory Genetics
Secondary Categories:
- Laboratory Genetics
Introduction:
Pathogenic variants in the retinitis pigmentosa GTPase regulator (RPGR) gene are responsible for the majority of X-linked retinitis pigmentosa (XLRP) cases. Affected males present with severe retinal degeneration, while a large percentage of female carriers manifest a phenotypic spectrum that is typically milder than that of their male relatives. An RPGR isoform that includes a terminal exon known as open reading frame 15 (ORF15) is predominantly expressed in photoreceptors, and its protein product plays a critical role in intraflagellar transport. ORF15 contains a highly repetitive, purine-rich (97.5%) sequence of 1,061 bp, which makes it a hotspot for small deletions and duplications resulting from polymerase slippage during replication. The ORF15 region accounts for approximately 80% of the variants that cause RPGR-related XLRP in males, and its repetitive sequence often necessitates a specialized diagnostic testing approach. In this study, we describe a long-read sequencing method to sequence and analyze the RPGR gene.
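The purine content quoted above is the fraction of bases that are adenine or guanine. As a minimal sketch of that calculation, the function name `purine_fraction` and the example sequences below are hypothetical and are not taken from the RPGR reference sequence:

```python
def purine_fraction(sequence: str) -> float:
    """Return the fraction of bases in `sequence` that are purines (A or G)."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    # Count adenine and guanine; every other symbol counts as non-purine.
    purines = sum(1 for base in seq if base in ("A", "G"))
    return purines / len(seq)


# Hypothetical toy sequences for illustration only.
toy_repeat = "GAGGAGGAAG"   # all purines
toy_mixed = "GATC"          # half purines
print(purine_fraction(toy_repeat))
print(purine_fraction(toy_mixed))
```

Applied to the 1,061 bp repetitive segment, a calculation of this kind would be expected to give the 97.5% figure reported above.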
Methods:
This study included 55 individuals (30 males and 25 females) harboring 49 unique, previously confirmed ORF15 variants and 11 control samples (3 males and 8 females). Long-range PCR (LR-PCR) was used to amplify either the 2.1 kb region containing ORF15 or an 18 kb fragment encompassing exon 9 through ORF15. The Native Barcoding Kit (SQK-NBD114.24; Oxford Nanopore Technologies, ONT) was used for library preparation according to the manufacturer’s recommendations. In brief, the purified PCR products were enzymatically end-prepped and then ligated to barcodes. The barcoded samples were pooled for sequencing adapter ligation. Each library of 24 samples was loaded onto a primed MinION R10.4.1 flow cell and sequenced for 2 to 3 hours on an Mk1C sequencer (ONT). FASTQ files were generated using Dorado (v0.8.1) in super-accuracy mode, and full-length amplicons were aligned to the human genome (GRCh38) using minimap2 (v2.26). Variants were called using Clair3 (v1.0.10) and DeepVariant (v1.6.0). Next-generation sequencing (NGS) analyses were conducted on the NIH HPC Biowulf cluster.
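The abstract does not state how full-length amplicons were selected before alignment. One plausible approach, sketched here purely as an assumption, is to keep reads whose length falls within a tolerance of the expected amplicon size. The function names and the 10% tolerance are hypothetical:

```python
from typing import Iterable, Iterator, Tuple

FastqRecord = Tuple[str, str, str]  # (header, sequence, quality)


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield (header, sequence, quality) from well-formed 4-line FASTQ records."""
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # skip blank lines between records
        seq = next(it, "").rstrip("\n")
        next(it, "")  # '+' separator line, ignored
        qual = next(it, "").rstrip("\n")
        yield header, seq, qual


def keep_full_length(
    records: Iterable[FastqRecord],
    expected_bp: int,
    tolerance: float = 0.10,  # assumed value, not from the study
) -> Iterator[FastqRecord]:
    """Keep reads whose length is within +/- tolerance of the amplicon size."""
    low = expected_bp * (1 - tolerance)
    high = expected_bp * (1 + tolerance)
    for record in records:
        if low <= len(record[1]) <= high:
            yield record


# Toy example: one read near 2.1 kb and one truncated read.
toy = [
    "@read1\n", "A" * 2100 + "\n", "+\n", "I" * 2100 + "\n",
    "@read2\n", "A" * 500 + "\n", "+\n", "I" * 500 + "\n",
]
kept = list(keep_full_length(parse_fastq(toy), expected_bp=2100))
print([header for header, _, _ in kept])
```

For the 18 kb fragment, the same filter would be called with `expected_bp=18000`; an appropriate tolerance would need to be chosen empirically.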
Results:
All frameshift and nonsense ORF15 variants were detected in 100% of test samples, with a mean coverage of 374X (range, 30X to 2,103X). The sensitivity and specificity for DeepVariant (filter flag PASS) were 91% and 100%, respectively. The sensitivity and specificity for Clair3 (filter flag PASS) were also 91% and 100%, respectively. Of the cases, 5.5% (n=3) were called only by DeepVariant, 5.5% (n=3) were called only by Clair3, and 1.8% (n=1) were not called by either. After manual visualization of BAM files in the Integrative Genomics Viewer (IGV) for all frameshift and nonsense variants, including those flagged “RefCall”, both sensitivity and specificity reached 100%. Both callers reported each of two complex indels as two neighboring variants, which required post-calling processing. In addition, six variants in male samples were mistakenly called heterozygous by one or both variant callers. No frameshift or nonsense variants were found in control samples.
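The comparison between callers can be sketched as set operations on PASS-filtered calls, together with a check that flags heterozygous calls in male samples for manual review. This is an illustrative sketch only: the `Call` structure, the function names, and the toy variants are hypothetical and are not the study's pipeline or data.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple

Variant = Tuple[str, int, str, str]  # (chrom, pos, ref, alt)


@dataclass(frozen=True)
class Call:
    variant: Variant
    genotype: str      # VCF GT field, e.g. "0/1" or "1/1"
    filter_flag: str   # VCF FILTER field, e.g. "PASS" or "RefCall"


def passing_variants(calls: List[Call]) -> Set[Variant]:
    """Variants that a caller reported with filter flag PASS."""
    return {c.variant for c in calls if c.filter_flag == "PASS"}


def combine_callers(a: List[Call], b: List[Call]) -> Dict[str, Set[Variant]]:
    """Partition PASS calls into shared, caller-specific, and union sets."""
    pa, pb = passing_variants(a), passing_variants(b)
    return {"both": pa & pb, "only_a": pa - pb, "only_b": pb - pa, "union": pa | pb}


def sensitivity(expected: Set[Variant], detected: Set[Variant]) -> float:
    """Fraction of expected (previously confirmed) variants that were detected."""
    return len(expected & detected) / len(expected)


def male_het_calls(calls: List[Call], sex: str) -> List[Call]:
    """PASS heterozygous chrX calls in a male sample, which warrant manual review."""
    if sex != "male":
        return []
    het = {"0/1", "1/0", "0|1", "1|0"}
    return [
        c for c in calls
        if c.filter_flag == "PASS" and c.variant[0] == "chrX" and c.genotype in het
    ]


# Hypothetical toy variants (positions and alleles are made up).
v1 = ("chrX", 100, "AG", "A")
v2 = ("chrX", 200, "G", "GA")
caller_a = [Call(v1, "0/1", "PASS"), Call(v2, "0/0", "RefCall")]
caller_b = [Call(v1, "1/1", "PASS"), Call(v2, "1/1", "PASS")]

merged = combine_callers(caller_a, caller_b)
print(sensitivity({v1, v2}, passing_variants(caller_a)))
print(sensitivity({v1, v2}, merged["union"]))
print(male_het_calls(caller_a, sex="male"))
```

In this toy case, taking the union of the two callers recovers a variant that one caller flagged as RefCall, and the heterozygous call in a male sample is surfaced for inspection rather than silently corrected.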
Conclusion:
We demonstrate that ONT nanopore long-read sequencing effectively genotypes the challenging low-complexity ORF15 region, suggesting its applicability to other difficult genomic regions. By integrating multiple variant callers in our analysis pipeline, we automated the detection of 46 of 49 unique ORF15 variants. However, manual review of the sequence data at each variant call locus remains necessary to confirm variant calls and zygosity. Furthermore, by pooling samples in batches of 24, with the potential to scale up to 96, nanopore sequencing emerges as a cost-effective, high-throughput method for diagnostic testing.