Conference Program

ERVExplorer: A comprehensive list of experimentally confirmed endogenous retroviruses (ERVs) and their potential biological relevance

Laboratory Genetics and Genomics
  • Primary Categories:
    • Public Health Genetics
  • Secondary Categories:
    • Public Health Genetics
Introduction:




Endogenous retroviruses (ERVs) are non-infectious viruses that insert themselves into host genomes and persist there. Although they make up 8% of the human genome, fewer than 1% of all ERV loci have known functions, primarily because of previous notions that ERVs were non-functional in infectious cells. However, recent advances in whole-genome sequencing have revealed crucial roles for ERV RNA and proteins in disease contexts such as rheumatoid arthritis and multiple sclerosis. Catalogs and databases of ERVs are therefore key to understanding the potential disease associations of ERVs. Existing catalogs that attempt to map the human ERV (HERV) landscape have several shortcomings: they often lack up-to-date information, functional annotations, and a common framework for naming and classifying ERVs. Our systematic literature review aims to create a unified, experimentally confirmed list of ERV locations with biological relevance by first identifying ERV locations with active expression capabilities from the literature and then comparing those locations with existing repeat element databases.

Methods:




In this systematic review, we plan to conduct an initial title/abstract/keyword search of the PubMed literature database and access the full-text articles using batch-downloading software. The full texts will be evaluated in two steps. First, they will be scanned using the PDE (PDF Data Extractor) R package, which enables an approximately 40 percent reduction in the workload associated with literature review (Stricker et al., 2024). Articles incompatible with the package will undergo a manual title/abstract screen, followed by a manual full-text evaluation if the article meets the review inclusion criteria. Next, data will be extracted from the included studies using the same PDE package. The extracted fields include standard values found in every ERV catalog, such as superfamily names, as well as criteria unique to this database, such as evidence levels of RNA expression. Searches will be conducted independently by two reviewers, and disagreements will be resolved by discussion with a third member.
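The two-reviewer step can be illustrated with a minimal sketch. It assumes each reviewer records one include/exclude decision per PMID. The function name, the dictionary layout, and the PMIDs are hypothetical and not part of the described workflow. Articles on which the reviewers disagree, or which only one reviewer assessed, are set aside for discussion with the third member.

```python
from typing import Dict, List, Tuple


def reconcile(
    reviewer_a: Dict[str, bool],
    reviewer_b: Dict[str, bool],
) -> Tuple[List[str], List[str], List[str]]:
    """Split PMIDs into agreed-include, agreed-exclude, and disputed lists.

    Each input maps a PMID (hypothetical placeholder strings here) to True
    for "include" or False for "exclude".
    """
    included: List[str] = []
    excluded: List[str] = []
    disputed: List[str] = []
    for pmid in sorted(set(reviewer_a) | set(reviewer_b)):
        a = reviewer_a.get(pmid)
        b = reviewer_b.get(pmid)
        if a is None or b is None or a != b:
            # Missing or conflicting decision: refer to the third member.
            disputed.append(pmid)
        elif a:
            included.append(pmid)
        else:
            excluded.append(pmid)
    return included, excluded, disputed


if __name__ == "__main__":
    # Placeholder decisions, not real screening data.
    a = {"1001": True, "1002": False, "1003": True}
    b = {"1001": True, "1002": False, "1003": False, "1004": True}
    print(reconcile(a, b))
```

Keeping the disputed list separate means that only agreed inclusions move on to data extraction until the third member has weighed in.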

Results:




We have completed the initial title/abstract/keyword search and have nearly finished batch downloading all articles that met our review inclusion criteria under two search strategies. The first strategy yielded 11,627 results in PubMed; 9,000 of those PMIDs have been processed, and 2,241 of those 9,000 were eligible for download. The second search strategy yielded 9,460 results in PubMed, of which 3,265 have been successfully downloaded. Next, we plan to complete batch downloading for the first search strategy, extract data with PDE, and synthesize the extracted data into a standard database presentation.
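The progress figures above can be put in proportion with a short calculation. This sketch uses only the counts reported in this section; the variable and function names are illustrative. The two strategies are kept separate because the abstract does not state whether their result sets overlap.

```python
def fraction(part: int, whole: int) -> float:
    """Return part / whole as a float."""
    return part / whole


# Counts reported in the Results section.
strategy_1_hits = 11627
strategy_1_processed = 9000
strategy_1_eligible = 2241
strategy_2_hits = 9460
strategy_2_downloaded = 3265

# PMIDs from strategy 1 that still need to be processed.
strategy_1_remaining = strategy_1_hits - strategy_1_processed

if __name__ == "__main__":
    print(f"Strategy 1 processed: {fraction(strategy_1_processed, strategy_1_hits):.1%}")
    print(f"Strategy 1 eligible among processed: {fraction(strategy_1_eligible, strategy_1_processed):.1%}")
    print(f"Strategy 1 remaining PMIDs: {strategy_1_remaining}")
    print(f"Strategy 2 downloaded: {fraction(strategy_2_downloaded, strategy_2_hits):.1%}")
```

By these counts, roughly three quarters of the first strategy's results have been processed, about a quarter of the processed PMIDs were eligible for download, and about a third of the second strategy's results have been downloaded.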

Conclusion:




With this database, we hope to lay a foundation for future research on ERV-based therapies and treatments. We also aim to understand how viral remnants are transcribed in disease states and how the proteins encoded by ERVs could be functionally active in cells, perhaps serving as biomarkers for disease. Finally, our database can be used to conduct targeted association studies that map specific ERVs to specific diseases or disorders.