
Machine Learning and Big Data: Opportunities for Improving Risk Assessment and Treatment in Cardiology

Structured risk assessment and stratification are an essential part of modern cardiology. The European Society of Cardiology (ESC) Guidelines for Preventive Cardiology (1) recommend the use of formal stratification tools to classify subjects according to their risk of first or subsequent cardiovascular disease (CVD) events. Such tools are also recommended for patients with established CVD, including ST-elevation myocardial infarction and atrial fibrillation (AF) (2,3). Treatment should then be started, or its intensity modified, according to the estimated risk.

Despite these recommendations, it has been observed that risk assessment tools ‘are not adequately implemented in clinical practice’ (1). One might speculate about the reasons, but it seems natural that the performance of the recommended instruments is a relevant factor. For example, the concordance (C) statistics of the CHA2DS2-VASc score for the prediction of ischemic stroke in AF (4) and of the SMART score for the prediction of the 10-year risk of vascular complications in patients with CVD (5) did not exceed 0.70, which can be classified as ‘modest’. Other risk scores, such as the GRACE score for death or myocardial infarction after admission for acute coronary syndrome, have better discriminatory power (C-statistic 0.73-0.77) (6), but there is clearly room for improvement.
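
To make the C-statistic concrete, the sketch below computes it for a binary outcome as the proportion of (event, non-event) patient pairs in which the patient with the event received the higher predicted risk, counting ties as half. The function name `c_statistic` and the toy numbers are illustrative assumptions, not data from the cited studies. Scores developed on time-to-event data typically use a censoring-aware variant, which is not shown here.

```python
def c_statistic(risk_scores, outcomes):
    """Concordance (C) statistic for a binary outcome.

    risk_scores: predicted risks (or score points), one per patient.
    outcomes:    1 if the patient had the event, 0 otherwise.
    """
    events = [s for s, y in zip(risk_scores, outcomes) if y == 1]
    non_events = [s for s, y in zip(risk_scores, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")

    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0   # event patient ranked higher: concordant
            elif e == n:
                concordant += 0.5   # tie: counted as half
    return concordant / (len(events) * len(non_events))


# Toy data (made up for illustration): score points and observed events.
scores = [3, 2, 2, 1, 4, 1]
events = [1, 0, 1, 0, 1, 0]
print(round(c_statistic(scores, events), 2))  # 0.94
```

A value of 0.5 corresponds to ranking patients no better than chance and 1.0 to perfect ranking, which is why values of 0.70 or lower are described as modest.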

Multiple factors contribute to (variations in) the performance of risk prediction instruments, including, but not limited to, the population and endpoint of interest, the sample size, the number of potential predictive variables, and the analytic complexity of the data. Risk stratification tools in cardiology are usually derived from classical regression analyses on (large) routine clinical practice data sets. The outcome to be predicted (‘dependent’ variable Y) is then modelled as a function of a series of selected, predefined predictor (or ‘independent’) variables (X). This supervised approach is straightforward from a statistical point of view, and results in a transparent model. The relations between the Xs and Y are estimated by regression coefficients (‘betas’) that can easily be understood by the end users of the model.
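
As an illustration of why such a model is transparent, the sketch below shows a logistic regression risk model in which each predictor contributes additively to a linear predictor through its beta. The intercept, the betas and the variable names are hypothetical values chosen for the example; they do not come from any published risk score.

```python
import math

# Hypothetical coefficients, invented for illustration only.
INTERCEPT = -5.0
BETAS = {
    "age_decades_over_50": 0.5,  # per 10 years of age above 50
    "diabetes": 0.7,             # 1 if present, 0 otherwise
    "current_smoker": 0.6,       # 1 if present, 0 otherwise
}


def predicted_risk(patient):
    """Return the modelled probability of the outcome Y for one patient.

    The linear predictor is intercept + sum(beta_i * x_i); the logistic
    function maps it to a probability between 0 and 1.
    """
    linear_predictor = INTERCEPT + sum(
        beta * patient[name] for name, beta in BETAS.items()
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


patient = {"age_decades_over_50": 2, "diabetes": 1, "current_smoker": 0}
print(f"Predicted risk: {predicted_risk(patient):.3f}")

# Each beta has a direct reading: exp(beta) is the odds ratio for a
# one-unit increase in that predictor, holding the others fixed.
for name, beta in BETAS.items():
    print(f"{name}: odds ratio {math.exp(beta):.2f}")
```

The simplicity comes at a price: in this form every predictor acts linearly on the log-odds and independently of the others, unless non-linear or interaction terms are specified in advance.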

Goldstein et al. recently argued that the field should move beyond these regression techniques and apply machine learning (ML) instead, ‘to address analytic challenges’ (7). They argue that the performance of ‘classical’ regression is suboptimal when X-Y relationships are non-linear, when X-Y relationships depend on other Xs, and when many Xs are present. Indeed, ML techniques, such as ridge regression and LASSO regression, or the method of the ‘Nearest Neighbour’, are useful alternatives to overcome these challenges (7). My personal view is that the application of ML techniques to risk assessment in cardiology goes hand-in-hand with the exploration of ‘Big Data’.
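
The nearest-neighbour idea can be sketched in a few lines: the risk of a new patient is estimated as the proportion of events among the k most similar patients in the training data. The function name `knn_risk`, the choice of Euclidean distance and the toy data are assumptions made for this example. The data are constructed so that the outcome occurs only when exactly one of two risk factors is elevated, which is a dependency between predictors that an additive regression model without interaction terms cannot represent.

```python
import math


def knn_risk(train_x, train_y, query, k=3):
    """Estimate risk as the event fraction among the k nearest patients.

    train_x: list of predictor tuples (assumed to be on comparable scales).
    train_y: list of outcomes (1 = event, 0 = no event).
    query:   predictor tuple of the patient to assess.
    """
    if k < 1 or k > len(train_x):
        raise ValueError("k must be between 1 and the number of patients")
    # Sort training patients by Euclidean distance to the query patient.
    by_distance = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )
    nearest = by_distance[:k]
    return sum(y for _, y in nearest) / k


# Toy data (made up): events occur only when exactly one factor is high.
train_x = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),   # both low   -> no event
    (1.0, 1.0), (0.9, 1.0), (1.0, 0.9),   # both high  -> no event
    (0.0, 1.0), (0.1, 1.0), (0.0, 0.9),   # one high   -> event
    (1.0, 0.0), (0.9, 0.0), (1.0, 0.1),   # other high -> event
]
train_y = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]

print(knn_risk(train_x, train_y, (0.05, 0.95)))  # 1.0
print(knn_risk(train_x, train_y, (0.95, 0.95)))  # 0.0
```

In practice the predictors would need to be standardised and k chosen by validation. The method also gives up the directly interpretable betas of a regression model in exchange for this flexibility.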

The Cardiology Information System
Reproduced with permission from Simoons et al. Eur Heart J 2002;23:1148-52.

As a legacy at the end of his two-year term as ESC president in 2002, Prof. Maarten Simoons introduced the notion of an integrated ‘Cardiology Information System’ (figure) (8). To foster personalised cardiovascular (CV) medicine, he envisaged combining data derived from hospital information systems, regional and national registries, and the knowledge base. In his view, the ESC had an important role to play ‘… to promote development of data standards for information to achieve the required level of integration’ (8).

Since 2002, the volume and variety of data have grown dramatically: citizens and patients almost continuously contribute to a wealth of data that might be relevant for their (CV) health, including records of personal preferences (e.g. from supermarkets), spatiotemporal data and social media traces. Furthermore, subjects generate data on their personal CV health by using instruments such as smart watches to measure pulse rate and blood pressure, or smart phones, which can easily be upgraded to become a stethoscope, an ECG device, or an instrument for measuring blood glucose. Clearly, this era of smart data not only provides unprecedented opportunities to further optimise personalised CV management, but also requires smart approaches to transform data into useful information. The ESC should, therefore, be complimented on having launched the Digital Health Virtual Journal, a platform on which data scientists and clinical cardiologists can meet.


  1. Rossello X, Dorresteijn JAN, Janssen A, Lambrinou E, Scherrenberg M, Bonnefoy-Cudraz E, Cobain M, Piepoli MF, Visseren FLJ, Dendale P. Risk prediction tools in cardiovascular disease prevention: A report from the ESC Prevention of CVD Programme led by the European Association of Preventive Cardiology (EAPC) in collaboration with the Association for Acute Cardiovascular Care (ACVC) and the Association of Cardiovascular Nursing and Allied Professions (ACNAP). Eur J Prev Cardiol 2019;26:1534-44.
  2. Ibanez B, James S, Agewall S, Antunes MJ, Bucciarelli-Ducci C, Bueno H, Caforio ALP, Crea F, Goudevenos JA, Halvorsen S, Hindricks G, Kastrati A, Lenzen MJ, Prescott E, Roffi M, Valgimigli M, Varenhorst C, Vranckx P, Widimsky P; ESC Scientific Document Group. 2017 ESC Guidelines for the management of acute myocardial infarction in patients presenting with ST-segment elevation: The Task Force for the management of acute myocardial infarction in patients presenting with ST-segment elevation of the European Society of Cardiology (ESC). Eur Heart J 2018;39:119-77.
  3. Kirchhof P, Benussi S, Kotecha D, Ahlsson A, Atar D, Casadei B, Castella M, Diener HC, Heidbuchel H, Hendriks J, Hindricks G, Manolis AS, Oldgren J, Popescu BA, Schotten U, Van Putte B, Vardas P; ESC Scientific Document Group. 2016 ESC Guidelines for the management of atrial fibrillation developed in collaboration with EACTS. Eur Heart J 2016;37:2893-962.
  4. Van den Ham HA, Klungel OH, Singer DE, Leufkens HG, van Staa TP. Comparative performance of ATRIA, CHADS2, and CHA2DS2-VASc Risk scores predicting stroke in patients with atrial fibrillation: results from a national primary care database. J Am Coll Cardiol 2015;66:1851-9.
  5. Dorresteijn JA, Visseren FL, Wassink AM, Gondrie MJ, Steyerberg EW, Ridker PM, Cook NR, van der Graaf Y; SMART Study Group. Development and validation of a prediction rule for recurrent vascular events based on a cohort study of patients with arterial disease: the SMART risk score. Heart 2013;99:866-72.
  6. Fox KAA, FitzGerald G, Puymirat E, Huang W, Carruthers K, Simon T, Coste P, Monsegu J, Steg PG, Danchin N, Anderson F. Should patients with acute coronary disease be stratified for management according to their risk? BMJ Open 2014;4:e004425.
  7. Goldstein BA, Navar AM, Carter RE. Moving beyond regression techniques in cardiovascular risk prediction: applying machine learning to address analytic challenges. Eur Heart J 2017;38:1805-14.
  8. Simoons ML, van der Putten N, Wood D, Boersma E, Bassand JP. The Cardiology Information System: the need for data standards for integration of systems for patient care, registries and guidelines for clinical practice. Eur Heart J 2002;23:1148-52.

Notes to editor

Editorial on Goldstein BA, Navar AM, Carter RE. Moving beyond regression techniques in cardiovascular risk prediction: applying machine learning to address analytic challenges. Eur Heart J 2017;38(23):1805-1814.

Declaration of Interests: The author(s) have declared no conflicts of interest.

The content of this article reflects the personal opinion of the author/s and is not necessarily the official position of the European Society of Cardiology.