Recent advances in computational capabilities and the availability of large amounts of data have led to the development and widespread adoption of artificial intelligence (AI), particularly using deep neural networks, across nearly every aspect of daily life. Medicine is no exception, with numerous applications developed for creating predictive models or replicating human diagnosis from medical examinations. One area where AI application has grown exponentially in recent years is the analysis of the electrocardiogram (ECG), leveraging raw ECG signals for event prediction and classification tasks.¹
In the recent JACC Scientific Statement on Clinical Risk Assessment and Prediction in Congenital Heart Disease Across the Lifespan, Opotowsky et al.² highlighted the challenges associated with risk prediction in the heterogeneous and continuously evolving population of patients with congenital heart disease (CHD). Traditionally, risk scores and models have been based on multivariate analyses using datasets of modest size. The statement also acknowledges the emerging role of AI and its potential to advance the field.
Mayourian et al.³ take a significant step in this direction with their study published in the European Heart Journal. Their work assesses the utility of an AI-enhanced ECG model for risk stratification in a large cohort of pediatric and adult patients with CHD, including cardiomyopathies and channelopathies, at Boston Children’s Hospital. Benefiting from over 30 years of stored digital ECGs, their goal was to predict 5-year overall mortality using only ECG data.
The authors adhered to key steps required for developing AI models. The entire cohort was split, with 50% allocated for model development (95% of that half for training and 5% for tuning hyperparameters) and 50% for testing. They evaluated typical model performance metrics, including the area under the receiver operating characteristic curve (AUROC), the area under the precision-recall curve (which relates positive predictive value to sensitivity), sensitivity, specificity, and positive and negative predictive values. The model was also used to classify patients into low- and high-risk categories, followed by survival analysis using Kaplan-Meier curves and hazard ratio (HR) estimation using Cox proportional hazards regression. The primary outcome of the study was 5-year overall mortality after an ECG.
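For readers less familiar with these metrics, the following is a minimal illustrative sketch of how AUROC and the threshold-based measures can be computed from predicted risks and observed outcomes. It is not the authors' code: the function names, the toy data, and the 0.5 threshold are assumptions chosen purely for illustration.

```python
from typing import Dict, Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a randomly chosen patient who died
    received a higher score than a randomly chosen survivor (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def _ratio(numerator: int, denominator: int) -> float:
    # Undefined ratios (empty denominator) are reported as NaN.
    return numerator / denominator if denominator else float("nan")


def threshold_metrics(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Dict[str, float]:
    """Sensitivity, specificity, PPV and NPV when scores >= threshold are called high-risk."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        high_risk = s >= threshold
        if high_risk and y == 1:
            tp += 1
        elif high_risk and y == 0:
            fp += 1
        elif not high_risk and y == 1:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }


# Toy data (hypothetical): 1 = died within 5 years, 0 = survived.
toy_labels = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
toy_scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.35, 0.40, 0.60, 0.70, 0.90]

print(auroc(toy_labels, toy_scores))
print(threshold_metrics(toy_labels, toy_scores, threshold=0.5))
```

When the outcome is uncommon, as mortality usually is, the negative predictive value tends to be high at most thresholds, which is one reason the precision-recall curve is a useful complement to AUROC.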
The model was developed in a cohort of nearly 40,000 patients with a wide range of defects and over 112,000 ECGs. Raw ECG data were preprocessed to ensure good data quality and served as input for a convolutional neural network. Once developed, the AI-ECG model was evaluated in the test cohort of approximately 40,000 patients. Since patients often have multiple ECGs during follow-up, the authors tested the model using the first, the last, and a randomly selected ECG for each patient. The AI-ECG model performed well, achieving an AUROC of ~0.8 and a high negative predictive value (0.98). Interestingly, predictions based on the most recent ECG outperformed those based on the others, likely because the most recent ECG incorporates information reflecting a patient’s current clinical state. For instance, an ECG taken after surgery might differ substantially from earlier recordings, and these changes are captured by the model. Notably, the AI-ECG model outperformed traditional clinical measures, including age, QRS duration, and left ventricular ejection fraction. Patients classified as high-risk by the model had an HR >4 when the first or a random ECG was used and >7 when the most recent ECG was used.
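The per-patient ECG selection described above can be sketched as follows. This is an illustrative outline only, assuming each ECG is stored as a (patient identifier, acquisition date, ECG identifier) record; the record layout, the function name `select_ecgs`, and the fixed random seed are hypothetical and not taken from the study.

```python
import random
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, List, Tuple

EcgRecord = Tuple[str, date, str]  # (patient_id, acquisition_date, ecg_id)


def select_ecgs(records: Iterable[EcgRecord], strategy: str, seed: int = 0) -> Dict[str, str]:
    """Return one ECG identifier per patient using 'first', 'last' or 'random'."""
    by_patient: Dict[str, List[Tuple[date, str]]] = defaultdict(list)
    for patient_id, acquired, ecg_id in records:
        by_patient[patient_id].append((acquired, ecg_id))

    rng = random.Random(seed)  # seeded so the random choice is reproducible
    chosen: Dict[str, str] = {}
    for patient_id in sorted(by_patient):
        ecgs = sorted(by_patient[patient_id])  # chronological order
        if strategy == "first":
            pick = ecgs[0]
        elif strategy == "last":
            pick = ecgs[-1]
        elif strategy == "random":
            pick = rng.choice(ecgs)
        else:
            raise ValueError(f"unknown strategy: {strategy!r}")
        chosen[patient_id] = pick[1]
    return chosen


# Toy records (hypothetical identifiers and dates).
toy_records = [
    ("A", date(2005, 3, 1), "a2"),
    ("A", date(1999, 6, 15), "a1"),
    ("A", date(2012, 9, 30), "a3"),
    ("B", date(2010, 1, 20), "b1"),
]

print(select_ecgs(toy_records, "first"))
print(select_ecgs(toy_records, "last"))
print(select_ecgs(toy_records, "random"))
```

Evaluating one ECG per patient keeps patients with many recordings from dominating the test metrics, and comparing the three strategies shows how much the timing of the recording matters.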
An essential aspect for gaining clinician confidence in these models is their explainability. Avoiding the perception of AI as a “black box” can facilitate its adoption in clinical decision-making. The authors addressed this by using saliency maps, which identify ECG features that contribute most to the model’s predictions. These maps, derived from a median ECG beat, highlighted QRS width, deep S waves, and low voltage as key predictors influencing the model’s outputs.
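To convey the idea behind a saliency map, the sketch below estimates how sensitive a model's output is to each sample of an input beat. It rests on loud assumptions: `toy_risk_model` is a hypothetical four-sample logistic stand-in rather than the authors' convolutional network, and the sensitivity is approximated by central finite differences, whereas saliency maps for neural networks are typically obtained from gradients computed by backpropagation.

```python
import math
from typing import Callable, List, Sequence

# Hypothetical weights for the toy stand-in model (not from the study).
TOY_WEIGHTS = [0.0, 0.5, -2.0, 0.1]


def toy_risk_model(beat: Sequence[float]) -> float:
    """Toy stand-in: logistic function of a weighted sum of the beat samples."""
    z = sum(w * x for w, x in zip(TOY_WEIGHTS, beat))
    return 1.0 / (1.0 + math.exp(-z))


def saliency(
    model: Callable[[Sequence[float]], float],
    beat: Sequence[float],
    eps: float = 1e-4,
) -> List[float]:
    """Absolute sensitivity of the model output to each input sample,
    approximated with central finite differences."""
    values = []
    for i in range(len(beat)):
        up = list(beat)
        down = list(beat)
        up[i] += eps
        down[i] -= eps
        values.append(abs((model(up) - model(down)) / (2.0 * eps)))
    return values


toy_beat = [0.1, 0.2, -0.3, 0.05]  # hypothetical amplitudes
sal = saliency(toy_risk_model, toy_beat)
most_influential = sal.index(max(sal))
print(sal, most_influential)
```

In this toy case the third sample carries the largest weight and therefore the largest saliency. In an ECG setting the analogous output is a curve over the beat showing which segments, such as the QRS complex, most influence the prediction.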
In summary, Mayourian et al.³ demonstrate that these emerging technologies have the potential to enhance current risk stratification schemes and represent a meaningful advancement in the field. Given the diversity and heterogeneity of CHD, as well as the growing number of patients and centers involved in their management, the CHD population is well-positioned to benefit from AI. However, significant challenges remain. Chief among these is the need for widespread implementation in clinical practice. For this to occur, it is critical to ensure that these models are generalizable globally, necessitating extensive external validation in other centers and countries. Moreover, as guided by evidence-based medicine principles, prospective testing of these models in randomized clinical trials is essential.⁴