

About the initiative

Launched in March 2017, BigData@Heart is a five-year project of the Innovative Medicines Initiative (IMI), an EU public-private consortium consisting of patient networks, learned societies, SMEs, pharmaceutical companies and academia.

The ESC has joined forces with a number of European academic research groups and pharmaceutical companies to develop a big data-driven translational research platform. BigData@Heart has access to most of the relevant large-scale European databases, ranging from electronic health records (EHR) and disease registries to well-phenotyped clinical trials and large epidemiological cohorts enriched with -omics data. These include data on more than five million patients with acute coronary syndromes, atrial fibrillation, and heart failure, and about 20 million controls without these diseases. By accessing and harmonising Europe-wide data sets, the ambition is to design algorithms that predict the evolution of disease, based on medical history, hospital records, and country-specific statistics.

Using all available data across data modalities, combined with machine learning or Bayesian network models, is expected to further refine outcome prediction. In addition, BigData@Heart will explore and set standards for the use of large and heterogeneously distributed data sets. Investigations include data mapping using common standards, federated data analysis obviating the need for central databases, and the legal and ethical aspects of using consented and unconsented data, in view of the EU general data protection regulation but also across countries with varying privacy rules.
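To illustrate what "federated data analysis obviating the need for central databases" can mean in its simplest form, the sketch below pools a mean across sites by exchanging only summary statistics. It is a minimal illustration under stated assumptions, not the consortium's method: the function names and the example values are hypothetical.

```python
from typing import Iterable, List, Tuple


def local_summary(values: List[float]) -> Tuple[float, int]:
    """Runs inside each site: only the sum and the count leave the site."""
    return (sum(values), len(values))


def pooled_mean(summaries: Iterable[Tuple[float, int]]) -> float:
    """Runs at the coordinator: combines site summaries, never raw records."""
    total, n = 0.0, 0
    for site_sum, site_count in summaries:
        total += site_sum
        n += site_count
    if n == 0:
        raise ValueError("no observations reported by any site")
    return total / n


# Hypothetical measurements held separately by two sites (illustrative values).
site_a = [60.0, 70.0, 80.0]
site_b = [50.0]

summaries = [local_summary(site_a), local_summary(site_b)]
print(pooled_mean(summaries))  # 65.0
```

The pooled result equals the mean that a central database would give, yet no patient-level value is shared. Real federated analyses involve far more (model fitting, disclosure control, governance), which this sketch does not attempt to cover.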


October 2020 - Stakeholder Meeting

July 2020 - Stakeholder Meeting

CODE-EHR Publication

'CODE-EHR best practice framework for the use of structured electronic healthcare records in clinical research' has been published as open access.

Read the paper in the European Heart Journal, The BMJ, or The Lancet Digital Health.

Routinely collected healthcare data has the potential to improve the lives and wellbeing of patients across the world, through better understanding of disease, and research on existing and new treatments. In a paper presented on 29 August 2022 at ESC Congress and simultaneously published in the European Heart Journal, The BMJ, and The Lancet Digital Health, an international team proposes a framework to improve the integrity and quality of studies using healthcare data, and to boost confidence in using the results for clinical decision support.

The CODE-EHR approach and framework

The approach was compiled by a wide range of global stakeholders, coordinated by the BigData@Heart consortium and the European Society of Cardiology.  This included patients and patient advocacy groups, regulators, government agencies and leading medical journals, plus representatives from professional societies, academic institutions, the pharmaceutical industry and payers. Participants convened to review opportunities and challenges, and develop pragmatic advice on how healthcare data can be applied to research across the spectrum of disease.

The CODE-EHR framework was iteratively developed to provide researchers with step-by-step guidance on how to achieve appropriate governance and transparency, and for stakeholders to have confidence in the reported findings.  Minimum standards are outlined for five key areas, with preferred standards providing the direction for future research:

  • Dataset construction and linkage; clarifying the source, completeness and linkage of any healthcare data used in the study.
  • Data fit for purpose; providing detail on the coding systems used, any data manipulation, and assessment of data quality.
  • Disease outcome and definitions; allowing other researchers to re-use and improve by clearly stating all codes and algorithms used, including those for patient identification, therapy, procedures, comorbidities and outcomes.
  • Analysis; describing how outcome events were analysed to allow for validation and replication.
  • Ethics and governance; communicating processes for consent, data privacy, and patient and public involvement.
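As an illustration only, the five areas listed above can be tracked as a simple checklist when preparing a study report. The sketch below is hypothetical: the class and method names are assumptions, and it does not reproduce the items of the official CODE-EHR checklist.

```python
from dataclasses import dataclass, field
from typing import List, Set

# The five key areas named in the framework description above.
CODE_EHR_AREAS = (
    "Dataset construction and linkage",
    "Data fit for purpose",
    "Disease outcome and definitions",
    "Analysis",
    "Ethics and governance",
)


@dataclass
class ChecklistReport:
    """Hypothetical record of which areas a study report has addressed."""
    study: str
    addressed: Set[str] = field(default_factory=set)

    def mark(self, area: str) -> None:
        # Reject anything that is not one of the five named areas.
        if area not in CODE_EHR_AREAS:
            raise ValueError(f"unknown area: {area}")
        self.addressed.add(area)

    def missing(self) -> List[str]:
        # Areas still to be reported, in the framework's own order.
        return [a for a in CODE_EHR_AREAS if a not in self.addressed]


report = ChecklistReport(study="example-study")
report.mark("Analysis")
report.mark("Ethics and governance")
print(report.missing())
```

Here the report would still need to cover dataset construction and linkage, data fitness for purpose, and disease outcome and definitions.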

CODE-EHR Checklist

The CODE-EHR framework checklist is available for download through the links below:






'Identification and Mapping Real-World Data Sources for Heart Failure, Acute Coronary Syndrome, and Atrial Fibrillation' has been published as open access.

Read the paper

This manuscript presents the findings, summarising the value and limitations of each data source, the availability of data for researchers, and links to other data sources for ascertainment of clinical outcomes. This is the first time that such an approach has been taken for cardiovascular disease.

Mapping real-world data sources

The data sweep provides an overview of existing real-world data as a foundation for the identification of potential data sources to conduct observational research studies. The aim of the data sweep is to foster collaboration between researchers, increase access and use of real-world data to strengthen the evidence available, and eventually improve outcomes for patients with heart failure, acute coronary syndrome and atrial fibrillation. 

Explaining the approach to collecting data

The BigData@Heart consortium conducted a systematic review of publications with global real-world data (RWD) pertaining to heart failure (HF), acute coronary syndrome (ACS) and atrial fibrillation (AF), generating a list of unique data sources.

Metadata were extracted based on the source type (e.g., electronic health records, genomics, clinical data), study design, population size, clinical characteristics, follow-up duration, outcomes and assessment of data available for future studies and linkage. A total of 11,889 publications were retrieved for HF, 10,729 for ACS and 6,262 for AF.

A detailed review was conducted of 322 (HF), 287 (ACS) and 220 (AF) data sources. The majority of data sources had near-complete data on demographic variables and considerable data on comorbidities. The least reported data categories were drug codes and caregiver involvement. Only a minority of data sources provided information on access to data for other researchers or whether data could be linked to other data sources to maximize clinical impact.
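The kind of metadata described above can be pictured as one record per data source, which a researcher might filter during a feasibility assessment. The sketch below is an assumption-laden illustration: the field names, the filter rule and the three example records are hypothetical placeholders, not entries from the catalogue.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataSourceRecord:
    """Hypothetical shape of one extracted metadata record."""
    name: str
    indications: List[str]            # any of "HF", "ACS", "AF"
    source_type: str                  # e.g. electronic health records
    study_design: str
    population_size: Optional[int]    # None when not reported
    follow_up_years: Optional[float]  # None when not reported
    access_info_reported: bool
    linkage_possible: Optional[bool]  # None when not reported


def candidates(records: List[DataSourceRecord], indication: str) -> List[str]:
    """Names of sources for an indication with reported access and linkage."""
    return [
        r.name
        for r in records
        if indication in r.indications
        and r.access_info_reported
        and r.linkage_possible is True
    ]


# Placeholder records for illustration only; not real data sources.
records = [
    DataSourceRecord("Example Registry A", ["HF"], "disease registry",
                     "cohort", 12000, 5.0, True, True),
    DataSourceRecord("Example EHR B", ["HF", "AF"], "electronic health records",
                     "cohort", None, None, False, None),
    DataSourceRecord("Example Trial C", ["ACS"], "clinical data",
                     "randomised trial", 3000, 2.0, True, False),
]

print(candidates(records, "HF"))  # ['Example Registry A']
```

Representing unreported items as `None` rather than `False` keeps "not reported" distinct from "not possible", which matters given that only a minority of sources reported access or linkage information.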

Results of the review

This review has created a comprehensive resource of cardiovascular data sources, providing new avenues to improve future real-world research and paving the way for better patient outcomes.

The list and metadata for RWD sources are publicly available in the Medical Information Framework catalogue¹ after registration. Please select the BigData@Heart Literature Review.


Figure 1: Geographical distribution of HF, ACS and AF data sources

Left-hand side: Top 5 data sources as per count of publications per indication

Right-hand side top: Geographical distribution of ACS, HF and AF data sources

Right-hand side bottom: Number of patients per data source for ACS, HF and AF



APMA: Asia Pacific, Middle East and African countries; LaCan: Latin America and Canada

With a growing need for real-world evidence (RWE) and high-quality RWD, this work provides researchers with a knowledge base to conduct feasibility assessments of these data sources for RWE studies. The top five data sources per disease or syndrome, based on the highest number of publications, are presented in Figure 1 (left-hand side). Across the three cardiovascular indications, most data sources are currently from Europe and North America, with growing numbers from the Middle East, Asia, Russia, and South America. Several data sources per geographical area are outlined in Figure 1 (top right). A large number of data sources are available; at the same time, the variability in geographical scope, size and variables collected suggests a need to further enhance collaboration and harmonisation between data holders and researchers.


Figures 2, 3 and 4 show the geographical distribution of HF, ACS and AF data sources, respectively.


Figure 2: Geographical distribution of HF data sources


APMA: Asia Pacific, Middle East and African countries; LaCan: Latin America and Canada


Figure 3: Geographical distribution of ACS data sources


APMA: Asia Pacific, Middle East and African countries; LaCan: Latin America and Canada


Figure 4: Geographical distribution of AF data sources


APMA: Asia Pacific, Middle East and African countries; LaCan: Latin America and Canada


The results are being made available to the scientific and medical communities as a resource on available RWD for AF, ACS, and HF. The information in the data sweep is intended to be kept up to date, either by the data holders themselves or through a future update of the data sweep.


Notes to editor

The European Medical Information Framework (EMIF) catalogue was built to allow researchers to find databases that fulfil their research study requirements quickly and easily. EMIF has provided the tools to increase access to and re-use of human health data. For each data source, metadata and practical information are provided to help any researcher identify data sources and potential data partners. Specifically, metadata extracted from publications, now available in the catalogue, include details on:

  • the data source identified, such as description, coverage, and follow-up
  • the availability of clinically relevant key variables related to HF, ACS, and AF, such as diagnosis and staging, demographics, management (including procedures), test results and treatments, the burden of disease (including costs), deaths and resource use, quality of life, and adverse events
  • publicly available information related to the data source, such as the data source holder/owner, access and linkage possibilities, supporting documentation, and governance aspects.
