Analyzing FHIR data with Azure Synapse Analytics

This post has been republished via RSS; it originally appeared at: New blog articles in Microsoft Tech Community.

This blog has been authored by Ranvijay Kumar, Principal Program Manager, Microsoft Health & Life Sciences

HL7 Fast Healthcare Interoperability Resources (FHIR®) is quickly becoming the de facto standard for persisting and exchanging healthcare data. FHIR specifies a high-fidelity and extensible information model for capturing details of healthcare entities and events.

This article will teach you a simple approach to creating analytical data marts by exporting, transforming, and copying data from Azure API for FHIR to Azure Synapse Analytics, which is a limitless analytics service designed for data warehousing and big data workloads. You can complete your Business Intelligence (to Artificial Intelligence (AI) analytics with Synapse due to the deep integration with Power BI, Azure Machine Learning, and Azure Cognitive services.

FHIR to Synapse.png

In this approach, as illustrated in the diagram, you will use the $export operation in Azure API for FHIR to export FHIR resources in NDJSON format (newline delimited JSON) to Azure storage. You will then use T-SQL from any of the serverless or the dedicated SQL pools in Synapse to query against those NDJSON files and optionally save the results into tables for further analysis.

Exporting FHIR data to Azure storage

Azure API for FHIR implements the $export operation defined by the FHIR spec to export all – or a filtered subset – of FHIR data in NDJSON format. It also supports de-identified export to enable secondary use of healthcare data. You can configure the server to export the data to any kind of Azure Storage account; however, we recommend exporting to ADLS Gen 2 for best alignment with Synapse.

Let’s consider a scenario in which data scientists want to analyze clinical data of patients who are former smokers. For the study, data scientists need an initial copy of data from the FHIR server followed by incremental data for the same set of patients every month for the next two years.

The first step to get this data is to identify the patients in the FHIR server who are former smokers. The following GET call searches the FHIR server using the LOINC code 72166-2 (Tobacco smoking status) for Observation, and SNOMED code 8517006 (Former smoker) for Observation value-concept to get subjects of the observations who are former smokers. You may need to use different codes depending on how your data is coded.

https://{{fhirserverurl}}/Observation?code=72166-2&value-concept=8517006&_elements=subject

You need to save this list of patients to enable exporting their clinical data monthly. There are a few options to manage a collection of resources in FHIR. Since Group is supported by the $export operation, you will manage the collection of patient resource IDs as a Group. Use the results from the above search query to create a person-type Group.

{

"resourceType": "Group", "id": "1","type": "person", "actual": true,

"member": [{"entity": {"reference": "Patient/44f6f10e-96c2-4802-b857-4861f1802522"}},

... other patient entities from the result ...

]

}

Once you have a Group, you can export all the data related to the patients in the Group with the following async REST call:

Note: Azure API for FHIR takes an optional container name to simplify the organization of exported data.

https://{{fhirserverurl}}/Group/{{GroupId}}/$export?_container={{BlobContainer}}

You can also use _type and _typefilter parameters in the $export call to restrict the resources we you want to export. Finally, you can use _since parameter in the $export call to do incremental exports every month for two years to meet your original requirement. This parameter restricts export to the resources that have been created or updated since the supplied time.

https://{{fhirserverurl}}/Group/{{GroupId}}/$export?_container={{BlobContainer}}&_since=2021-02-06T01:09:53.526+00:00

Now that you have data in ADLS Gen 2, let’s talk about Synapse and see how you can load it to Synapse.

About Azure Synapse Analytics

Create a pipeline

You can use a variety of REST clients such as Postman to export the data from the FHIR server and use Synapse Studio or any other SQL client to run the above T-SQL statements. However, it is a good idea to convert these steps into a robust data movement pipeline using Synapse Pipelines. You can use the Synapse Web activity for triggering the export, and the Stored procedure activity to run the T-SQL statements in the pipeline.

Conclusion

You can use the FHIR $export API and T-SQL to transform and move all or a filtered subset of data from FHIR server to Synapse Analytics. After the initial data load, the _since parameter in the $export operation can be used to do incremental data load. An ETL pipeline with the steps mentioned in this article can be used to keep the data in the FHIR server and the Synapse Analytics in sync.

®FHIR is registered trademark of Health Level Seven International, registered in the U.S. Trademark Office and is used with their permission.

Exporting FHIR data to Azure storage

About Azure Synapse Analytics

Create a pipeline

Conclusion

Leave a Reply Cancel reply