EXAMINING THE PHENOTYPIC DEPTH OF THE U03 GENOMIC RESEARCH TO ELUCIDATE THE GENETICS OF RARE DISEASE (GREGoR) DATASET
Clinical Genetics and Therapeutics
Primary Categories:
- Clinical Genetics
Secondary Categories:
- Clinical Genetics
Introduction:
Three hundred million people are affected by rare genetic diseases, many of which lack a molecular diagnosis. To address this challenge, the Genomic Research to Elucidate the Genetics of Rare Disease (GREGoR) Consortium was launched in 2021. I worked with the Human Phenotype Ontology (HPO), a standardized vocabulary of clinical features (“HPO terms”) used to encode the clinical presentations of patients with genetic disorders. My goal was to assess phenotype richness, specifically the HPO terms recorded in the U03 GREGoR Consortium-wide dataset, and to compare it with rare genetic disease cases from the literature.
Methods:
The depth of a patient’s phenotype is assessed by the number of specific branches navigated within the ontology, with more specific terms (e.g., ‘nail pits’) providing higher information content than less specific terms (e.g., ‘developmental delay’). I selected one hundred rare genetic disease case reports from the British Medical Journal, Oxford Medical Reports, and the Journal of Clinical Psychiatry. I extracted each patient’s phenotypic features and matched them to specific HPO IDs (e.g., Proteinuria, HP:0000093). Using the ontologyIndex package in RStudio, I compared phenotypic depth between the case report controls and the GREGoR dataset.
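The idea of phenotypic depth and information content can be illustrated with a minimal sketch. It is not the analysis code used for this abstract and does not use the ontologyIndex API. The ontology below is a simplified, hypothetical hierarchy (not the real HPO structure), the four patients are invented, and the function names depth, ancestors, information_content, and mean_depth are illustrative assumptions. Depth is taken here as the longest path from a term to the root, and information content as the negative log of the fraction of patients annotated with a term or any of its descendants.

```python
from math import log

# Simplified, hypothetical hierarchy (NOT the real HPO structure): term -> parents.
PARENTS = {
    "Phenotypic abnormality": [],
    "Abnormality of the nervous system": ["Phenotypic abnormality"],
    "Developmental delay": ["Abnormality of the nervous system"],
    "Abnormality of the integument": ["Phenotypic abnormality"],
    "Abnormal nail morphology": ["Abnormality of the integument"],
    "Nail pits": ["Abnormal nail morphology"],
}


def depth(term):
    """Number of edges on the longest path from the term up to the root."""
    parents = PARENTS[term]
    if not parents:
        return 0
    return 1 + max(depth(p) for p in parents)


def ancestors(term):
    """The term itself plus every term above it in the hierarchy."""
    found = {term}
    for parent in PARENTS[term]:
        found |= ancestors(parent)
    return found


def information_content(patients):
    """IC(t) = -log(fraction of patients annotated with t or a descendant of t)."""
    counts = {}
    for terms in patients:
        implied = set()
        for term in terms:
            implied |= ancestors(term)  # an annotation implies all of its ancestors
        for term in implied:
            counts[term] = counts.get(term, 0) + 1
    total = len(patients)
    return {term: -log(count / total) for term, count in counts.items()}


def mean_depth(terms):
    """Average ontology depth of one patient's terms."""
    return sum(depth(t) for t in terms) / len(terms)


# Invented example patients, each a list of annotated terms.
patients = [
    ["Developmental delay"],
    ["Developmental delay", "Nail pits"],
    ["Developmental delay"],
    ["Abnormality of the nervous system"],
]

ic = information_content(patients)
print(depth("Nail pits"), depth("Developmental delay"))
print(ic["Nail pits"] > ic["Developmental delay"])
print(mean_depth(patients[1]))
```

In this toy cohort the rarer, deeper term carries more information than the common, shallower one, which is the property the depth comparison relies on. Other definitions of depth (for example, shortest path to the root) would be equally defensible.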
Results:
On average, each patient had 15.8 HPO IDs in the case reports and 4.5 in the GREGoR dataset (p < 0.001). The most frequent HPO terms in the GREGoR dataset were developmental delay (207 occurrences), seizure (134), hypotonia (119), intellectual disability (111), and muscle weakness (93). The most frequent HPO terms in the case reports were jaundice (15), anemia (13), short stature (12), respiratory distress (10), and fever (10). Phenotypic depth followed a symmetrical distribution in both the case reports and the GREGoR dataset (median: 7; mode: 6; IQR: 4; mean ~8), indicating that the GREGoR dataset’s phenotypic depth matches that of the case reports (p = 0.003).
Conclusion:
This research highlights the level of phenotyping needed to identify distinct disease traits from sets of phenotypic terms and to establish complete gene-phenotype and genotype-phenotype relationships.