An AI algorithm developed from ~45,000 individuals and tested on a separate group of >50,000 individuals has been able to accurately identify LV dysfunction (defined as an LV ejection fraction [LVEF] ≤35%) with an area under the curve (AUC) of 0.93 [12]. Indeed, even in those with normal LV function at the time of assessment, those flagged by the algorithm were significantly more likely to develop LV systolic dysfunction over the following three years than those considered to be negative by the algorithm. The predictive value of this algorithm has subsequently been validated in a prospective dataset with similar findings [13]. Kwon et al have shown similar results in a Korean cohort of ~55,000 ECGs with a model able to detect both reduced (LVEF ≤40%) and mid-range systolic function (LVEF ≤50%) [14]. Other groups have also developed algorithms with similar efficacy for detecting systolic dysfunction [15]. As a result, such AI models, in combination with readily available clinical and biomarker data, may provide an appropriate screening tool for LV dysfunction.
ECG diagnostics in hypertrophic cardiomyopathy
AI, particularly deep learning with CNNs, has shown promise in detecting HCM from ECGs. The use of traditional ECG features suggestive of HCM has been hampered by their inconsistency, with up to 10% of those with HCM having apparently normal ECGs [16,17]. While screening with echocardiography is reliable, it is comparatively costly and time-consuming. AI-based ECG diagnostics may avoid the limitations of traditional algorithms by not relying on specific ECG criteria, such as LV hypertrophy (LVH), which presents variably among HCM patients [18].
An AI algorithm developed by a team from the Mayo Clinic using ECGs from ~2,500 individuals with HCM, and ~51,000 controls, demonstrated impressive accuracy (AUC 0.95). This finding was similar when limited to ECGs with either traditional LVH criteria or those with apparently normal ECGs [18]. This model's robust performance, particularly in younger patients and irrespective of mutation status, underscores its potential for broad application in HCM screening. Of potential importance, the high negative predictive value (NPV) suggests that this tool may be useful to rule out HCM in large-scale screening assessments [18]. More broadly, it is possible for AI to accurately determine the presence of LVH from an ECG. An AI algorithm developed from 21,000 individuals with paired ECG and echocardiography data demonstrated a greater sensitivity for the same specificity when compared to assessment of the ECG by a cardiologist [19].
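The link between a high NPV and large-scale screening can be made concrete with a short calculation. The sketch below applies Bayes' rule to a binary test; the sensitivity, specificity and prevalence values are hypothetical, chosen only for illustration, and are not taken from the cited study [18].

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test using Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical operating point (90% sensitivity and specificity) evaluated
# at two hypothetical prevalences: a general population and a referral clinic.
for prevalence in (0.002, 0.05):
    ppv, npv = predictive_values(0.90, 0.90, prevalence)
    print(f"prevalence={prevalence:.3f}  PPV={ppv:.3f}  NPV={npv:.4f}")
```

At low prevalence the NPV is very high while the PPV is low, which is why a model of this kind is better suited to ruling a condition out in screening than to confirming it; a positive result would still need confirmatory imaging.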
Similar methods have been shown to be successful in helping to identify patients at high risk of AF or those already experiencing asymptomatic paroxysms of the arrhythmia. Using over 600,000 ECGs, an AI model achieved 79% accuracy and an AUC of 0.87 in detecting paroxysmal AF by identifying subclinical changes not visible during active episodes [2]. Indeed, when ECGs from within one month of the patient being diagnosed with AF were examined, the accuracy of the AI algorithm was even higher (AUC 0.90, accuracy 83%). However, when this model was compared against an established clinical prediction model for AF (CHARGE-AF), the two were similarly predictive of AF risk over time (C-statistics for AI 0.69 [95% confidence interval {CI}: 0.66 to 0.72] and CHARGE-AF 0.69 [95% CI: 0.66 to 0.71]) [6]. As such, while models with high precision can be developed, and while AI may indeed provide an important option for “point of care” risk assessment using a single test, we should not forget the value of traditional prediction models.
Given the important pathological association between AF and stroke, determining stroke risk in those with AF is an important avenue for AI to consider. In this setting, AI models have outperformed the CHA2DS2-VASc score: in one study, three different supervised machine learning models showed modest predictive ability for stroke in those with AF (AUC 0.60 to 0.66), whereas the CHA2DS2-VASc score had an AUC of only 0.52 [20]. Despite these promising findings, stroke prediction in those with AF remains inaccurate even when AI models are combined with established clinical models.
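To put these AUC values in context, the AUC can be read as the probability that a randomly chosen patient who has the outcome receives a higher score than a randomly chosen patient who does not, so 0.5 corresponds to chance and 1.0 to perfect discrimination. A minimal sketch of that pairwise definition follows; the function name and the risk scores are hypothetical and are not data from the cited study [20].

```python
def auc_from_scores(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair (Mann-Whitney convention).
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical model outputs for patients with and without the outcome.
with_outcome = [0.9, 0.6, 0.4]
without_outcome = [0.7, 0.3, 0.2, 0.1]
print(round(auc_from_scores(with_outcome, without_outcome), 3))
```

Read this way, an AUC of 0.52 means the score ranks a patient with the outcome above one without it only slightly more often than a coin toss would, and an AUC of 0.66 remains a long way from reliable discrimination.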
ECG diagnostics in other cardiac pathologies
The ECG plays a role, to a greater or lesser extent, in the diagnosis of a range of other cardiac pathologies. AI models have been developed to determine the presence of long QT syndrome (LQTS) even in the presence of a normal corrected QT interval. In an analysis of ECGs from ~2,000 individuals from a specialised genetic heart rhythm clinic, the developed algorithm was able to separate those with confirmed LQTS and a QT interval corrected for heart rate (QTc) <450 ms from those without LQTS (AUC 0.86) [21]. The ability of AI algorithms to detect LQTS has been confirmed by other studies [22]. Identifying LQTS despite a normal QT interval has important screening implications for clinical practice.
Valvular heart disease is a common group of pathologies with important implications for mortality and morbidity [23]. Traditional diagnosis is normally based on clinical auscultation followed by echocardiographic assessment. However, auscultation is a variable skill which may not identify all patients with valvular heart disease [24]. AI algorithms have been developed which identify moderate-severe aortic stenosis (AS) with a high degree of precision (AUC ~0.90), with similar findings when used on an external validation dataset [25,26]. Interestingly, those flagged as positive by one AI model but who were found not to have significant AS were twice as likely to progress to moderate-severe AS in the following 15 years compared to those identified as negative by the algorithm [25]. Similar results have been achieved when identifying individuals with clinically significant mitral valve disease [27]. Indeed, one algorithm developed using ECGs from ~77,000 individuals had a high degree of accuracy for all left-sided valvular pathologies [28].
Validation in practice; challenges and limitations
Despite these advances, it is important to acknowledge several challenges and limitations. First, ECGs obtained in routine, real-world settings may be of poor quality. Second, even if a model performs well in one population, it must undergo rigorous evaluation of external validity in diverse populations. Third, developing AI-ECG models requires large datasets for training, validation and testing, so multicentre collaborations appear vital. Finally, incorporating AI-based diagnoses raises legal and regulatory questions, including whether and to what extent clinicians may take them into consideration, and which regulatory bodies should approve them. A legal framework to regulate AI-based decision-making is warranted.
Conclusions
Recent advances in AI, particularly in the medical field, have given clinicians new approaches to data acquisition and analysis, leading to advanced non-invasive diagnostics. Within ECG diagnostics in particular, deep-learning CNNs have enabled rapid interpretation, using ECG features as a substrate, across various pathologies, including arrhythmias, LV systolic dysfunction, cardiomyopathy and valve disease. However, as with any medical tool, AI in ECG diagnostics requires thorough validation, clinician training and an appropriate legal framework before being integrated into medical practice.