11 Novel Systematic Method for Identifying Congenital Anomaly Cases in Electronic Health Record Databases

Elly Brokamp; Lisa Bastarache; Nancy Cox; Rizwan Hamid; Nikhil K. Khanakari; Gillian Hooker; Megan Shuey

doi:10.1017/cts.2024.32

Part of: JCTS_2024_ABSTRACT_COLLECTION

Published online by Cambridge University Press: 03 April 2024


Elly Brokamp: Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN 37203, USA
Lisa Bastarache: Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, USA
Nancy Cox: Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN 37203, USA
Rizwan Hamid: Department of Pediatrics, Vanderbilt University Medical Center, Nashville, TN 37232, USA
Nikhil K. Khanakari: Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN 37203, USA
Gillian Hooker: Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN 37203, USA
Megan Shuey: Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37203, USA

Abstract

OBJECTIVES/GOALS: Congenital anomalies (CAs) affect 3% of live births, yet the cause of 80% of CAs is unknown, and for the 20% with an identified cause, variable penetrance suggests that additional risk drivers exist. Our method for identifying and categorizing CAs in electronic health record (EHR)-linked biobank databases can expand and improve CA etiologic research.

METHODS/STUDY POPULATION: We identified individuals with CAs in three groups: (1) those with at least one CA; (2) those with multiple CAs (MCA), defined as two or more ‘major’ CAs; and (3) those with CAs in a specific organ system. We also created a novel quantitative approach, using phenome-wide association studies (PheWAS), to determine CA-associated genetic disease billing codes, so that individuals with a known genetic cause for their CAs can be separated from those with idiopathic CAs. We updated the CA phecodes (aggregates of clinical billing codes) and used them to identify CA cases in BioVU, Vanderbilt’s EHR-linked biobank database. We created a new phecode, ‘All CAs’, so that researchers can quickly identify all individuals with at least one CA. We evaluated the definition of MCA using PheWAS analyses to compare ‘minor’ vs ‘major’ CAs.

RESULTS/ANTICIPATED RESULTS: The new CA phecode nomenclature includes 5.8 times more codes for CAs than the previous version (365 vs 56), improving granularity. Literature review identified 85 (19.7%) CA-associated genetic disease billing codes. PheWAS analyses revealed an additional 16 (3.7%) genetic disease billing codes with one or more significant (p < 2.75 × 10⁻⁵) associations with CA-related phecodes. Identifying CA-associated genetic disease billing codes allows researchers to differentiate idiopathic CAs from those that have a known genetic cause. PheWAS analyses of individuals with CAs previously considered ‘minor’ showed many associated severe health problems, revealing that the distinction between ‘minor’ and ‘major’ CAs is arbitrary when identifying individuals with MCA in the EHR.

DISCUSSION/SIGNIFICANCE: Our CA identification method is scalable to the growing number of EHR-linked biobanks. Differentiating idiopathic CAs from those with known causes will increase power in studies seeking to discover additional genetic drivers of CAs. Our novel method allows CA epidemiological research in EHR-linked biobank data to be expanded and accelerated.
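
The three case groups and the idiopathic versus known-cause split described in the methods can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the code identifiers (`CA_100`, `GEN_01`, and so on), the organ-system mapping, and the function `classify_individuals` are all hypothetical, and MCA is approximated here as two or more distinct CA phecodes, without any ‘major’/‘minor’ distinction.

```python
from collections import defaultdict

# Hypothetical lookup tables; none of these identifiers come from the abstract.
# Maps each CA phecode to the organ system it belongs to.
CA_PHECODE_TO_SYSTEM = {
    "CA_100": "cardiovascular",
    "CA_101": "cardiovascular",
    "CA_200": "renal",
    "CA_300": "limb",
}
# Genetic disease billing codes treated as CA-associated (assumed list).
CA_GENETIC_DISEASE_CODES = {"GEN_01", "GEN_02"}


def classify_individuals(records, mca_threshold=2):
    """Group individuals by CA status from (person_id, code) pairs.

    Assumption: MCA means at least `mca_threshold` distinct CA phecodes.
    """
    codes_by_person = defaultdict(set)
    for person_id, code in records:
        codes_by_person[person_id].add(code)

    summary = {}
    for person_id, codes in codes_by_person.items():
        ca_codes = {c for c in codes if c in CA_PHECODE_TO_SYSTEM}
        has_genetic_code = any(c in CA_GENETIC_DISEASE_CODES for c in codes)
        summary[person_id] = {
            # Group 1: at least one CA (the role of an 'All CAs' phecode).
            "any_ca": bool(ca_codes),
            # Group 2: multiple CAs.
            "mca": len(ca_codes) >= mca_threshold,
            # Group 3: organ systems in which the person has a CA.
            "systems": {CA_PHECODE_TO_SYSTEM[c] for c in ca_codes},
            # CA present and no CA-associated genetic disease code on record.
            "idiopathic": bool(ca_codes) and not has_genetic_code,
        }
    return summary


# Toy records for illustration only.
records = [
    ("p1", "CA_100"),
    ("p1", "CA_200"),
    ("p2", "CA_300"),
    ("p2", "GEN_01"),
    ("p3", "OTHER_9"),
]

for person_id, groups in sorted(classify_individuals(records).items()):
    print(person_id, groups)
```

In a real EHR-linked biobank the two lookup tables would come from the published phecode map and the curated list of CA-associated genetic disease codes; the grouping logic itself would stay the same.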

Type: Biostatistics, Epidemiology, and Research Design
Information: Journal of Clinical and Translational Science, Volume 8, Issue s1, April 2024, pp. 3

DOI: https://doi.org/10.1017/cts.2024.32
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives licence (https://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is unaltered and is properly cited. The written permission of Cambridge University Press must be obtained for commercial re-use or in order to create a derivative work.
