Advancing Genomic Research: Enhancing dbGaP Interoperability with FHIR APIs

Health Services and Implementation
  • Primary Categories:
    • Non-Clinical
  • Secondary Categories:
    • Non-Clinical
Introduction:
With 2,800 studies covering 800 diseases, 4 million participants, and 450,000 demographic and clinical phenotype variables, the NIH's Database of Genotypes and Phenotypes (dbGaP) offers an abundance of research data. More than 8,000 manuscripts, including papers in journals such as Nature Genetics, have cited dbGaP data. However, using dbGaP is not intuitive. It was built to archive submitted data as collected, study by study, so phenotypic data are not always harmonized across studies. Downloading the data requires manually installing an additional software toolkit. As researchers increasingly rely on powerful cloud computing platforms for analyses that span multiple studies, drawn not only from dbGaP but also from other data repositories, data harmonization becomes even more crucial.

Methods:
To address these challenges and make dbGaP data interoperable and analysis-ready, we are providing a set of user-friendly Application Programming Interfaces (APIs). Rather than designing dbGaP-specific APIs, we have chosen the global standard FHIR (Fast Healthcare Interoperability Resources) to further promote interoperability. FHIR was adopted in the 2020 Final Rule implementing the 21st Century Cures Act, which requires certified health IT to make Electronic Health Record (EHR) data available through standardized FHIR-based APIs. FHIR offers benefits such as open-source server implementations and a standard interface, which simplify integration for application developers. For example, LabCorp, Quest, MyChart, Apple Health, and many others use FHIR for data exchange. While FHIR has been widely adopted for EHRs, its application in genetic and genomic research is still emerging. The NIH's Office of Data Science Strategy (ODSS) has encouraged the exploration of FHIR for research data to promote data sharing. For an interoperable data exchange standard to take hold, a critical mass of data repositories needs to join forces. ODSS created the NIH Cloud Platform Interoperability (NCPI) initiative to coordinate and promote this effort, and dbGaP is one of the NCPI partners piloting FHIR.
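
To illustrate what a standard interface means in practice, the following minimal Python sketch builds a FHIR search URL of the form [base]/[type]?parameters and reads the resources out of the searchset Bundle that a FHIR server returns. The server address, the choice of the ResearchStudy `title` search parameter, and the sample response are assumptions made for illustration; they are not taken from the dbGaP service, and a given server may support a different set of search parameters.

```python
from urllib.parse import urlencode


def build_search_url(base_url, resource_type, params):
    """Return a FHIR search URL of the form [base]/[type]?name=value&..."""
    return f"{base_url.rstrip('/')}/{resource_type}?{urlencode(params)}"


def resources_from_bundle(bundle):
    """Return the resources carried in the entries of a FHIR Bundle."""
    if bundle.get("resourceType") != "Bundle":
        raise ValueError("expected a FHIR Bundle")
    return [e["resource"] for e in bundle.get("entry", []) if "resource" in e]


# Hypothetical server address, used only to show the URL shape.
EXAMPLE_BASE = "https://fhir.example.org/r4"
url = build_search_url(EXAMPLE_BASE, "ResearchStudy",
                       {"title": "asthma", "_count": 10})

# Hand-written stand-in for a server response (not real study data).
example_bundle = {
    "resourceType": "Bundle",
    "type": "searchset",
    "entry": [
        {"resource": {"resourceType": "ResearchStudy", "id": "s1",
                      "title": "Example asthma cohort"}},
        {"resource": {"resourceType": "ResearchStudy", "id": "s2",
                      "title": "Example asthma genetics study"}},
    ],
}
titles = [r.get("title") for r in resources_from_bundle(example_bundle)]
print(url)
print(titles)
```

Because the request and response shapes are defined by the standard rather than by the repository, the same two functions should work against any FHIR server that exposes the searched resource type.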

Results:
We report our initial success in using FHIR to deliver both open-access and controlled-access data. The open-access API provides study-level information, enabling researchers to programmatically discover the datasets most relevant to their needs before applying for access. For instance, with minimal scripting against the FHIR API, application developers can create disease-specific alerts that fire when new datasets are released. The controlled-access API delivers over 1.1 billion individual phenotypic values along with rich molecular sequence data files, using persistent URLs provided by the Global Alliance for Genomics and Health's (GA4GH) Data Repository Service (DRS). DRS is also a global standard API and has been integrated into many genomic cloud computing platforms. We will show how the FHIR API, together with the standard CodeSystem resource, represents a significant step toward metadata harmonization across studies. Harmonization facilitates searching across studies and building cohorts of participants who match specific phenotypic criteria (e.g., over 40 years old with blood pressure >130 mmHg). This method of cohort building increases research power and saves time and money.
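
The cohort criterion mentioned above can be sketched in Python over harmonized, FHIR-shaped records. This is an illustration under stated assumptions, not the dbGaP implementation: the blood pressure threshold is read as systolic pressure identified by LOINC code 8480-6, each Observation is simplified to carry a single valueQuantity, and the patients and observations are made-up sample records.

```python
from datetime import date

SYSTOLIC_BP = "8480-6"  # LOINC code for systolic blood pressure


def age_on(birth_date, today):
    """Whole years between an ISO birth date string and a reference date."""
    born = date.fromisoformat(birth_date)
    had_birthday = (today.month, today.day) >= (born.month, born.day)
    return today.year - born.year - (0 if had_birthday else 1)


def build_cohort(patients, observations, code, min_age, min_value, today):
    """Ids of patients older than min_age with a coded value above min_value."""
    hits = set()
    for obs in observations:
        codes = {c.get("code") for c in obs.get("code", {}).get("coding", [])}
        value = obs.get("valueQuantity", {}).get("value")
        if code in codes and value is not None and value > min_value:
            # A subject reference looks like "Patient/<id>".
            hits.add(obs["subject"]["reference"].split("/")[-1])
    return sorted(p["id"] for p in patients
                  if p["id"] in hits and age_on(p["birthDate"], today) > min_age)


def bp_observation(patient_id, mmhg):
    """Build a simplified, hypothetical blood pressure Observation."""
    return {"resourceType": "Observation",
            "subject": {"reference": f"Patient/{patient_id}"},
            "code": {"coding": [{"system": "http://loinc.org",
                                 "code": SYSTOLIC_BP}]},
            "valueQuantity": {"value": mmhg, "unit": "mmHg"}}


# Made-up sample records, for illustration only.
patients = [
    {"resourceType": "Patient", "id": "P1", "birthDate": "1970-05-01"},
    {"resourceType": "Patient", "id": "P2", "birthDate": "1995-01-01"},
    {"resourceType": "Patient", "id": "P3", "birthDate": "1960-12-31"},
]
observations = [bp_observation("P1", 142), bp_observation("P2", 150),
                bp_observation("P3", 118)]

cohort = build_cohort(patients, observations, SYSTOLIC_BP,
                      min_age=40, min_value=130, today=date(2024, 1, 1))
print(cohort)
```

The filter only works across studies if every study encodes the measurement with the same code and unit, which is the role that harmonized coding plays in the approach described above.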

Conclusion:
Embracing global standards like FHIR and DRS fosters interoperability and data sharing, paving the way toward more integrated, scalable, and innovative approaches in genetic and genomic research.
