American Heart Journal

Volume 160, Issue 6, December 2010, Pages 1099-1104

Clinical Investigation
Valvular and Congenital Heart Disease
Linking clinical registry data with administrative data using indirect identifiers: Implementation and validation in the congenital heart surgery population

https://doi.org/10.1016/j.ahj.2010.08.010

Background

The use of clinical registries and administrative data sets in pediatric cardiovascular research has become increasingly common. However, this approach is limited by the relatively small number of existing data sets, each of which contains limited data and none of which communicate with one another. We describe the implementation and validation of a methodology that uses indirect patient identifiers to link The Society of Thoracic Surgeons Congenital Heart Surgery (STS-CHS) Database to The Pediatric Health Information Systems (PHIS) Database (a pediatric administrative database).

Methods

Centers submitting data to both STS-CHS and PHIS during 2004 to 2008 were included (n = 30). Both data sets were limited to patients 0 to 18 years old undergoing cardiac surgery. An exact match was defined as agreement on every one of the following variables: date of birth, date of admission, date of discharge, sex, and center. A likely match was defined as agreement on all of these variables except one date variable, which was allowed to differ by ±1 day.
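The match definitions above can be sketched in code. The following is a minimal illustration in Python, not the authors' implementation: the `Record` class, its field names, and the `classify` and `link` functions are all hypothetical names introduced here, and the sketch assumes each record already carries the five indirect identifiers as clean values.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Record:
    """Hypothetical record holding the five indirect identifiers."""
    record_id: str
    center: str
    sex: str
    birth: date
    admit: date
    discharge: date


def classify(a: Record, b: Record) -> str:
    """Return 'exact', 'likely', or 'none' for one pair of records."""
    # Center and sex must always agree.
    if a.center != b.center or a.sex != b.sex:
        return "none"
    # Absolute difference in days for each of the three date variables.
    diffs = [
        abs((a.birth - b.birth).days),
        abs((a.admit - b.admit).days),
        abs((a.discharge - b.discharge).days),
    ]
    if all(d == 0 for d in diffs):
        return "exact"
    # Likely: exactly one date differs, and only by one day.
    if sorted(diffs) == [0, 0, 1]:
        return "likely"
    return "none"


def link(registry, admin):
    """Map each registry record ID to (match kind, matching admin IDs).

    Exact matches take precedence over likely matches (an assumption
    of this sketch).
    """
    # Index administrative records by (center, sex) to limit comparisons.
    by_key = defaultdict(list)
    for rec in admin:
        by_key[(rec.center, rec.sex)].append(rec)

    result = {}
    for reg in registry:
        exact, likely = [], []
        for cand in by_key.get((reg.center, reg.sex), []):
            kind = classify(reg, cand)
            if kind == "exact":
                exact.append(cand.record_id)
            elif kind == "likely":
                likely.append(cand.record_id)
        if exact:
            result[reg.record_id] = ("exact", exact)
        elif likely:
            result[reg.record_id] = ("likely", likely)
        else:
            result[reg.record_id] = ("none", [])
    return result
```

Grouping the administrative records by center and sex before comparing dates is a design choice of this sketch: it avoids comparing every registry record against every administrative record, which matters when one data set is several times larger than the other.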

Results

Of 45,830 STS-CHS records, 87.4% matched to PHIS using the exact match criteria and 90.3% using the exact or likely match criteria. Validation in a subset of patients revealed that 100% of exact and likely matches were true matches.

Conclusions

This analysis demonstrates that indirect identifiers can be used to create a high-quality link between a clinical registry and an administrative data set in the congenital heart surgery population. This methodology, which can also be applied to other data sets, allows researchers to capitalize on the strengths of both types of data and expands the pool of data available to answer important clinical questions.

Data sources

The STS-CHS Database is the largest existing clinical congenital heart surgery data registry in the world. It currently contains data on >140,000 surgeries conducted since 1998 performed at 85 US centers in 37 states. This represents nearly three quarters of all US centers performing congenital heart surgery.12 The STS-CHS Database contains perioperative and operative data on all patients undergoing congenital heart surgery at participating centers, including demographic information, anatomical

Data link

Using method 1, 45,830 STS-CHS records and 211,973 PHIS records were identified (Table II). Broadening the initial inclusion criteria used to define the PHIS cardiac population (methods 2 and 3) did not increase the proportion of matches but did increase the number of duplicates (Table II). Therefore, method 1 was deemed to be the best method, with 87.4% exact matches and a total of 90.3% exact or likely matches.
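The comparison of methods described above rests on two quantities per method: the proportion of registry records matched and the number of duplicates. A rough sketch of how such a summary might be computed follows. The `summarize` function and its input layout are hypothetical, and the sketch assumes a duplicate means a registry record that matches more than one administrative record, which this excerpt does not define.

```python
def summarize(links):
    """Summarize linkage results for one candidate method.

    `links` maps a registry record ID to a tuple of
    (match kind, list of matching administrative record IDs),
    where match kind is 'exact', 'likely', or 'none'.
    """
    total = len(links)
    exact = sum(1 for kind, _ in links.values() if kind == "exact")
    likely = sum(1 for kind, _ in links.values() if kind == "likely")
    # Assumption: a duplicate is a registry record with more than one match.
    duplicates = sum(1 for _, ids in links.values() if len(ids) > 1)

    def pct(count):
        # Percentage of all registry records, rounded to one decimal place.
        return round(100 * count / total, 1) if total else 0.0

    return {
        "exact_pct": pct(exact),
        "exact_or_likely_pct": pct(exact + likely),
        "duplicates": duplicates,
    }
```

Running this once per candidate method would give the kind of side-by-side figures used to prefer one method over another: similar match percentages with fewer duplicates favor the narrower inclusion criteria.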

Center variation

The proportion of exact and likely matches at each of the 30 centers is

Discussion

These data demonstrate the feasibility of using indirect identifiers to create a high-quality link between clinical registry and administrative data for patients undergoing congenital heart surgery. This methodology allows researchers to capitalize on the strengths of both data sets and expands the pool of data available for analysis. Similar methodology can also be applied to link other existing data sources.

The use of indirect identifiers to link large data sets has been described previously

Conclusions

This analysis demonstrates that indirect identifiers can be used to create a high-quality link between a clinical registry and administrative data set in the congenital heart surgery population. This will allow several analyses to be performed with linked data from this study, and similar methodology can also be applied to link other data sets. Leveraging these and other linked data will enable researchers to answer important questions regarding practice variation and the efficacy and safety of

Disclosures

This study was supported by National Heart, Lung, and Blood Institute grant 1RC1HL099941-01 under the 2009 American Recovery and Reinvestment Act.

Dr Pasquali receives grant support (KL2 RR024127-02) from the National Center for Research Resources, a component of the National Institutes of Health (NIH) and NIH Roadmap for Medical Research, and from the American Heart Association Mid-Atlantic Affiliate Clinical Research Program. The contents of this publication are solely the responsibility of
