
Original research
Smartphone detection of atrial fibrillation using photoplethysmography: a systematic review and meta-analysis
  1. Simrat Gill1,2,
  2. Karina V Bunting1,2,
  3. Claudio Sartini3,
  4. Victor Roth Cardoso1,2,4,
  5. Narges Ghoreishi3,
  6. Hae-Won Uh5,
  7. John A Williams2,4,
  8. Kiliana Suzart-Woischnik3,
  9. Amitava Banerjee6,
  10. Folkert W Asselbergs7,8,9,
  11. MJC Eijkemans5,
  12. Georgios V Gkoutos2,4,
  13. Dipak Kotecha1,2,7
  1. Institute of Cardiovascular Sciences, University of Birmingham, Birmingham, UK
  2. Health Data Research UK Midlands Site, University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
  3. Medical Affairs and Pharmacovigilance, Pharmaceuticals, Integrated Evidence Generation, Bayer AG, Leverkusen, Nordrhein-Westfalen, Germany
  4. Institute of Cancer and Genomic Sciences, University of Birmingham, Birmingham, UK
  5. Julius Center for Health Sciences and Primary Care, University Medical Centre, Utrecht, Netherlands
  6. Farr Institute of Health Informatics Research, University College London, London, UK
  7. Department of Cardiology, University Medical Centre Utrecht, Utrecht, Netherlands
  8. Department of Cardiology, University College London Faculty of Population Health Sciences, London, UK
  9. Institute of Cardiovascular Science, Faculty of Population Health Sciences, University College London, London, UK

Correspondence to Dr Simrat Gill, Institute of Cardiovascular Sciences, University of Birmingham, Birmingham, UK; s.gill.5{at}bham.ac.uk

Abstract

Objectives Timely diagnosis of atrial fibrillation (AF) is essential to reduce complications from this increasingly common condition. We sought to assess the diagnostic accuracy of smartphone camera photoplethysmography (PPG) compared with conventional electrocardiogram (ECG) for AF detection.

Methods This is a systematic review of MEDLINE, EMBASE and Cochrane (1980–December 2020), including any study or abstract, where smartphone PPG was compared with a reference ECG (1, 3 or 12-lead). Random effects meta-analysis was performed to pool sensitivity/specificity and identify publication bias, with study quality assessed using the QUADAS-2 (Quality Assessment of Diagnostic Accuracy Studies-2) risk of bias tool.

Results 28 studies were included (10 full-text publications and 18 abstracts), providing 31 comparisons of smartphone PPG versus ECG for AF detection. A total of 11 404 participants were included (2950 with AF), with most studies being small and based in secondary care. Sensitivity and specificity for AF detection were high, ranging from 81% to 100%, and from 85% to 100%, respectively. 20 comparisons from 17 studies were meta-analysed, including 6891 participants (2299 with AF); the pooled sensitivity was 94% (95% CI 92% to 95%) and specificity 97% (95% CI 96% to 98%), with substantial heterogeneity (p<0.01). Studies were of poor quality overall and none met all the QUADAS-2 criteria, with particular issues regarding selection bias and the potential for publication bias.

Conclusion PPG provides a non-invasive, patient-led screening tool for AF. However, current evidence is limited to small, biased, low-quality studies with unrealistically high sensitivity and specificity. Further studies are needed, preferably independent from manufacturers, in order to advise clinicians on the true value of PPG technology for AF detection.

  • atrial fibrillation
  • smartphone
  • photoplethysmography

Data availability statement

Data sharing not applicable as no datasets generated and/or analysed for this study. This is a systematic review and meta-analysis of published and fully anonymised studies. The data used have already been published in journals in the form of full-text articles or conference abstracts.

This is an open access article distributed in accordance with the Creative Commons Attribution 4.0 Unported (CC BY 4.0) license, which permits others to copy, redistribute, remix, transform and build upon this work for any purpose, provided the original work is properly cited, a link to the licence is given, and indication of whether changes were made. See: https://creativecommons.org/licenses/by/4.0/.

Introduction

Atrial fibrillation (AF) is the most common cardiac arrhythmia encountered by healthcare professionals with rising prevalence, particularly in older patients and those with predisposing comorbidities.1 Timely identification is key due to the significant impact that AF has on patient quality of life and mortality, in addition to morbidity due to thromboembolism, heart failure and cognitive decline. In particular, there is a fivefold increased risk of stroke, where at the time of stroke, up to a third of patients either have a known or new diagnosis of AF. These strokes tend to be more disabling when compared with strokes from other causes, and are largely preventable with oral anticoagulation.2–4 According to the European Society of Cardiology guidelines, patients aged over 65 years can benefit from AF screening using single timepoint, repeated or ambulatory electrocardiogram (ECG) recordings.4 However, there is a risk of missing paroxysmal episodes as AF can be brief and sporadic. This was seen in the STROKESTOP study (systematic ECG screening for AF among 75-year-old subjects), where short intermittent home ECG recordings resulted in higher sensitivity rates for the detection of new AF compared with one-off measurement.5

The development of novel screening devices has the potential to increase screening coverage and improve clinical detection of AF. Smartphone applications can allow self-detection of arrhythmias, allowing for patient self-care and involvement.6 Photoplethysmography (PPG) technology found in smartphone cameras can be used for AF screening by patients in the community. The technique uses the light-emitting diode in cameras to measure pulsatile changes in light intensity that are reflected from a finger (or face). Several smartphone PPG applications are currently available, but their clinical value for AF detection is unclear. The majority are commercial products, and there is justified concern over publication bias.7 In this systematic review, we assess the diagnostic accuracy of AF detection using smartphone PPG applications in comparison to a gold-standard ECG recording and provide guidance to clinicians about the value and limitations of their potential use to guide clinical management.

Methods

Eligibility and search strategy

This review was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines and was prospectively registered with the PROSPERO database of systematic reviews (registration number: CRD42019109455). A systematic review of MEDLINE (1950–1 December 2020), EMBASE (1980–1 December 2020), and the Cochrane Library (until 1 December 2020) was performed without language restriction (see online supplemental tables S1 and S2 for search terms). We also manually searched reference lists of relevant studies for any further available literature.

Inclusion and exclusion criteria

All publications examining any type of AF were evaluated (paroxysmal, persistent or permanent as defined by the study), including original research and conference abstracts in participants aged 18 years and above. We required (1) comparison with a reference standard ECG (1, 3 or 12-lead); and (2) AF detection using a smartphone camera to analyse the PPG signal from a fingertip or the face. Editorials and reviews were excluded, in addition to publications that did not meet the study objectives. For example, we excluded use of wrist-worn devices to generate PPG signals (as these require additional hardware beyond a smartphone) and studies that lacked a reference ECG.

Outcomes

The outcomes considered were validation of AF detection by examining (1) sensitivity; (2) specificity; (3) positive predictive value (PPV); (4) negative predictive value (NPV); and (5) overall accuracy.
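For readers less familiar with these measures, the five outcomes can all be derived from a 2×2 table of index test (smartphone PPG) against reference test (ECG). The following is a minimal sketch only; the class and function names, and the counts in the example, are hypothetical and are not taken from any included study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """2x2 table of PPG result against reference ECG (hypothetical helper)."""
    tp: int  # PPG flags AF, ECG confirms AF
    fp: int  # PPG flags AF, ECG shows no AF
    fn: int  # PPG misses AF that the ECG shows
    tn: int  # PPG and ECG both show no AF


def diagnostic_metrics(c: Confusion) -> dict:
    """Return the five outcomes considered in this review as proportions."""
    total = c.tp + c.fp + c.fn + c.tn
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
        "accuracy": (c.tp + c.tn) / total,
    }


# Illustrative counts only (not study data).
example = Confusion(tp=90, fp=5, fn=10, tn=95)
metrics = diagnostic_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that sensitivity and specificity depend only on the rows of the table defined by the reference ECG, whereas PPV, NPV and accuracy also depend on how many participants in the sample have AF.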

Data extraction

All publications that were identified from literature searches were initially screened based on title and abstract by two independent reviewers (SG and KVB). Data were stored in a standardised tabular format and the full list was assessed for eligibility by two different reviewers independently. Following screening, any discrepancies were discussed between the reviewers (SG, KVB). Any further conflicts were resolved by reviewing the original publication and additional adjudication.

Two reviewers (SG and KVB) assessed the full text of each article or abstract with four evidence-based hierarchy criteria: (1) original research reporting findings for all outcomes considered; (2) original research reporting findings for some, but not all, outcomes; (3) conference abstract reporting findings for all outcomes; (4) conference abstract reporting findings for some, but not all, outcomes. During the assessment, publications meeting the criteria above were extracted at study level, and a table was generated including relevant information (the disease of interest, setting, population and sample size, type of smartphone application, outcomes measured, reference test and study quality results). We sought additional data on missing parameters from lead authors of the publications included, but no additional data were received. Where relevant, we made note of industry involvement in the study (eg, study funding, authors employed in industry and provision of study devices or technology).

Risk of bias

All studies were assessed by two independent reviewers (SG and KVB) using the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool.8 This assesses four key domains: (1) patient selection; (2) index test used (smartphone application for AF detection via fingertip or facial PPG signal); (3) reference standard used (1, 3 or 12-lead ECG); and (4) flow and timing. All domains were assessed for risk of bias using signalling questions, and the first three for applicability to the review question. Each domain is given a score of high, low or unclear for risk of bias and applicability.

Patient and public involvement (PPI)

A PPI team were not involved in the design, conduct, reporting or dissemination plans of our research.

Ethics

This study follows the principles of the World Medical Association’s Declaration of Helsinki. In this case, separate ethical approval was not required as the study is a meta-analysis of previously published tabular information from relevant studies.

Statistical analysis

All results for sensitivity, specificity, PPV, NPV and diagnostic accuracy for smartphone AF detection were visualised using forest plots. When not directly reported by study authors, we derived diagnostic values using 2×2 contingency tables from reported sensitivity, specificity and corresponding confidence intervals. Due to uncertainty in the number of patients with AF, we were unable to produce a contingency table for one study.9 Publication bias was assessed using a funnel plot of linear regression of the log ORs on the inverse root of the sample size, with asymmetry and/or a non-zero slope coefficient with p<0.1 indicative of small study bias. Bivariate mixed-effects regression modelling was used to meta-analyse study comparisons with confidence intervals for sensitivity and specificity. A summary receiver operating characteristic plot was constructed to provide information on the overall diagnostic accuracy of smartphones for AF detection with 95% prediction regions. Heterogeneity was assessed using the I² statistic, and visually using the bivariate box plot approach. Statistical analysis was performed using Stata V.14.2 (StataCorp LP, Texas, USA) and the MIDAS program.10
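As an illustration of how a 2×2 contingency table can be back-calculated when a study reports only sensitivity, specificity and group sizes, a minimal sketch follows. It assumes the numbers of participants with and without AF on the reference ECG are known; the function name and example values are hypothetical, and rounding to whole participants is an assumption rather than a description of the exact procedure used in this review.

```python
def reconstruct_2x2(sensitivity, specificity, n_af, n_no_af):
    """Back-calculate (tp, fp, fn, tn) from reported proportions.

    sensitivity, specificity: proportions between 0 and 1
    n_af, n_no_af: participants with / without AF on the reference ECG
    """
    if not (0.0 <= sensitivity <= 1.0 and 0.0 <= specificity <= 1.0):
        raise ValueError("sensitivity and specificity must be proportions")
    tp = round(sensitivity * n_af)       # AF cases correctly flagged
    fn = n_af - tp                       # AF cases missed
    tn = round(specificity * n_no_af)    # non-AF correctly cleared
    fp = n_no_af - tn                    # non-AF incorrectly flagged
    return tp, fp, fn, tn


# Illustrative values only (not taken from an included study).
tp, fp, fn, tn = reconstruct_2x2(0.94, 0.97, n_af=100, n_no_af=200)
print(tp, fp, fn, tn)
```

When the number of participants with AF is itself uncertain, as for the one study noted above, the table cannot be reconstructed in this way.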

Results

The search strategy identified a total of 1153 publications (figure 1), of which 28 studies were included in the systematic review. Ten studies were full-text original research publications and 18 were conference abstracts.

Figure 1

Flow diagram for systematic review. Flowchart demonstrating selection of eligible studies. AF, atrial fibrillation; ECG, electrocardiogram; PPG, photoplethysmography.

Design, devices and population

Of the 28 studies included, 25 were prospective, of which 19 were conducted in a single centre9 11–29 and 6 involved two or more centres.23 30–34 Sixteen studies were conducted in secondary care,9 12–20 22 25 26 29–31 seven in primary care,11 21 23 24 32–34 three were unspecified27 28 35 and three used retrospective databases.35 36 Nine studies conducted in secondary care included those with known AF scheduled for cardioversion or catheter ablation.9 13–15 19 20 22 25 26 Details on inclusion and exclusion criteria for each study are presented in table 1 (and online supplemental table S3 for conference abstracts).

Table 1

Summary of full-text studies

In terms of devices, 16 studies used an Apple iOS smartphone,9 11 13–16 18–20 26 29–32 36 1 study used an Android smartphone,12 a further 2 studies used a combination22 25 and 9 studies did not specify the smartphone used.17 21 23 24 27 28 33–35 The most common PPG applications used were Cardiio Rhythm mobile application (CRMA; eight studies), Fibricheck (four studies), Preventicus (five studies) and PULSESMART (two studies), with nine studies not specifying the application.13 19 20 22 23 25 36 A 12-lead ECG was used as a reference standard in 13 studies, while 11 studies used a single-lead ECG and 4 studies used a combination of 12-lead, single or 3-lead ECG. Only three studies documented performing the PPG and ECG recording simultaneously.12 27 35

The total number of participants was 11 404, of whom 2950 (25.9%) had an ECG diagnosis of AF. The prevalence of AF varied from 0.5% to 100% in individual studies. The average age (where stated) ranged from 59 to 77 years, with a weighted mean of 67 years (SD 4.9). The proportion of women ranged from 18% to 59%, with a weighted average of 48.2% (table 2).

Table 2

Characteristics of participants

Risk of bias and publication bias

The included studies were found to be of low quality overall; none were graded as meeting all QUADAS-2 criteria. Levels of bias were consistent across full-text studies (online supplemental table S4) and abstract-only studies (online supplemental table S5), even though assessment of the latter was limited by shorter description. The major concerns related to high risk of bias, particularly for patient selection (eg, selection of non-random patients), and the conduct of the study (exclusion of data from final analysis and unclear timing of reference and index tests) (figure 2A). Regression of effect size on sample size demonstrated a non-zero slope coefficient across comparisons with asymmetry (figure 2B). After excluding the largest study, the p value for the slope coefficient was 0.06 (p<0.10 suggestive of small study/publication bias). Heterogeneity is visualised on the bivariate box plot, with a number of studies outside the fence area (figure 2C).
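The regression underlying this type of funnel plot can be sketched as a weighted least-squares fit of the log diagnostic OR against the inverse square root of the sample size. The code below is an illustrative sketch only and is not the MIDAS implementation: the 0.5 continuity correction, the use of sample size as the regression weight and all function names are assumptions, and a p value for the slope would additionally require a t distribution from a statistics package.

```python
import math


def log_dor(tp, fp, fn, tn):
    """Log diagnostic odds ratio with a 0.5 continuity correction (assumed)."""
    tp, fp, fn, tn = (v + 0.5 for v in (tp, fp, fn, tn))
    return math.log((tp * tn) / (fp * fn))


def funnel_regression(sizes, log_ors):
    """Weighted least-squares fit of log OR on 1/sqrt(size).

    Returns (slope, intercept). A slope far from zero suggests that
    smaller studies report systematically different accuracy.
    """
    x = [1.0 / math.sqrt(n) for n in sizes]
    w = [float(n) for n in sizes]  # weight by sample size (assumption)
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, log_ors)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar)
              for wi, xi, yi in zip(w, x, log_ors))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Synthetic check: log ORs constructed to lie exactly on a line.
sizes = [50, 100, 400, 1600]
synthetic = [2.0 + 3.0 / math.sqrt(n) for n in sizes]
slope, intercept = funnel_regression(sizes, synthetic)
print(round(slope, 6), round(intercept, 6))
```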

Figure 2

Risk of bias, publication bias and heterogeneity. Top panel (A; bar chart) shows the overall risk of bias based on QUADAS-2 criteria (see online supplemental table S4 for each study). Bottom panel demonstrates high likelihood of publication bias (B; weighted funnel plot) and study heterogeneity (C; bivariate box plot with the inner shaded area representing the median distribution of sensitivity and specificity, and the outer area the 95% confidence bound). See table 3 for numbers linking to each study comparison.

Detection of AF

The 28 studies included provided 31 comparisons for AF detection using PPG smartphone applications against conventional ECG (29 fingertip and 2 facial PPG). A comparative summary by study is presented in table 3, and by smartphone PPG application in online supplemental table S6. Sensitivity ranged from 81% to 100%, specificity from 85% to 100%, PPV from 54% to 100% and NPV from 77% to 100% (online supplemental figures S1 and S2). Accuracy was reported in 18 comparisons and ranged from 61% to 99%.

Table 3

Summary of sensitivity, specificity and accuracy of PPG

In meta-analysis of 20 comparisons of AF detection from 17 studies (n=6891; 2299 with AF), the pooled sensitivity was 94% (95% CI 92% to 95%), with significant heterogeneity (I²=49.6%; p=0.01). The pooled specificity was 97% (95% CI 96% to 98%), with significant heterogeneity (I²=85.3%; p<0.01). Overall, the area under the summary receiver operating characteristic curve was 0.98 (95% CI 0.97 to 0.99), again with substantial and statistically significant heterogeneity (I²=98%; p<0.0001) (figure 3).
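For orientation, the I² statistic expresses the share of variability across comparisons that exceeds what chance alone would produce. A minimal sketch of the standard calculation from Cochran's Q is given below; it is a univariate simplification with hypothetical function names, and the values reported in this review came from the MIDAS program, which may compute heterogeneity differently.

```python
def cochran_q(estimates, variances):
    """Cochran's Q for study estimates (eg, logit sensitivity)."""
    weights = [1.0 / v for v in variances]  # inverse-variance weights
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    return sum(w * (e - pooled) ** 2 for w, e in zip(weights, estimates))


def i_squared(q, k):
    """I² as a percentage, for Cochran's Q over k comparisons."""
    if k < 2 or q <= 0:
        return 0.0
    return max(0.0, (q - (k - 1)) / q) * 100.0


# Illustrative value of Q only (not taken from this review).
print(i_squared(38.0, 20))
```

With 20 comparisons there are 19 degrees of freedom, so a Q of 38 would mean that half of the observed variability is attributed to between-study differences.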

Figure 3

Summary receiver operating characteristic plot. Includes all comparisons in the meta-analysis (see table 3 for numbers linking to each study comparison) with summary receiver operating characteristics. Note that significant heterogeneity was identified across studies overall (p<0.0001), and for sensitivity and specificity individually (I²=49.6%; p=0.01 and I²=85.3%; p<0.01).

Discussion

This systematic review has identified the potential for AF detection using smartphone-based PPG technology, but with insufficient evidence to demonstrate clinical utility at this time. Unrealistically high values for sensitivity and specificity were found from predominantly small, single-centre studies (in meta-analysis, 94% and 97%, respectively). We identified high risk of bias, especially for the type of patients selected to take part, and insufficient information regarding study flow, design and timings. Additionally, there was evidence of publication bias with significant asymmetry indicating negative studies were less likely to be published (figure 4). In general, commercial smartphone applications were used for AF detection, but most studies lacked algorithm transparency. Information regarding data quality assessment and characterisation methods used to delineate AF from other arrhythmias (eg, atrial flutter, tachycardia or ectopics) was often missing, making replication and validation difficult. Taken together, these findings suggest the need for larger independent studies to assess the role of smartphone PPG for AF detection.

Figure 4

Graphical summary. A graphical summary of the main findings within this systematic review and meta-analysis. AF, atrial fibrillation; PPG, photoplethysmography.

While AF is commonly associated with symptoms, asymptomatic episodes can occur and are therefore only identified incidentally during routine medical review.37 Patients with undiagnosed AF can present with an ischaemic stroke as their first clinical presentation, identified on an admission ECG or during subsequent monitoring.38 There is a clear healthcare priority to increase effective screening in the community given the increasing prevalence of AF,1 the substantial morbidity that is associated with undiagnosed AF, and the benefit of early detection and use of oral anticoagulation to prevent thromboembolism.7 The sporadic nature of AF illustrates a genuine need for non-invasive and scalable screening techniques that can be used to detect AF over a prolonged period of time. Current practice for AF screening consists mainly of opportunistic pulse palpation or detection using ECG and medical ambulatory devices, which have a limited monitoring duration.4 39 This motivates the development of new technology with which patients can repeatedly monitor their own heart rhythm, providing more opportunity to pick up AF. With smartphones now a ubiquitous part of life in most communities across the world, the potential for widespread AF screening is now realisable. However, as with most new technologies, it is likely that hardware, software and algorithms will all need to develop to provide reliable information that can help direct clinical management.

This systematic review specifically addressed the value of PPG using smartphone technology, but other forms of PPG AF detection are also available. For example, the Apple Heart study used intermittent smartwatch-based PPG monitoring in 419 297 participants, of whom 2161 (0.52%) received an irregular pulse alert. AF was newly detected in 153 of 450 (34%) participants who wore a 7-day ECG patch, giving a PPV for AF detection of 84%.37 40 The Huawei Heart study used smartwatches and smart bands in 187 912 participants, of whom 424 (0.23%) received an irregular pulse notification and in those followed up, PPV was 92%.41 Both studies show the potential that PPG technology offers for accessible long-term self-monitoring, but also the limitations of requiring specific technology (in this case smartwatches) that often limits the population to younger individuals (only 6% were aged over 65 years in the Apple Heart study). This results in low AF detection rates, the potential for false positive cases, uncertain clinical impact (eg, in those without risk factors for thromboembolism) and hence high levels of unnecessary anxiety. There is also an issue of cost borne by the consumer, and exclusion of those with socioeconomic deprivation. Conversely, the number of smartphone users grows globally at an average of 11.8% annually, with increasing numbers across all age ranges, including those aged 65 and above (source: Statista), who have the most to gain from detecting AF.

This review highlights the need for real-world studies, with minimisation of selection bias to establish the true diagnostic accuracy of smartphone PPG. With a condition as heterogeneous as AF, it seems improbable that sensitivity and specificity values would be as high in an unselected population, meaning that false positives and false negatives would need careful consideration. With regard to smartphone applications, greater transparency from commercial providers regarding AF detection algorithms is required, and further work is needed to evaluate their role in the diagnostic pathway alongside conventional AF screening. Large-scale randomised clinical trials that are powered for endpoints such as stroke and cost-effectiveness are needed to compare these devices and establish their merits,37 including studies that are independent of device or algorithm manufacturers.
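To illustrate why false positives need careful consideration in unselected populations, the sketch below applies Bayes' theorem to the pooled sensitivity (94%) and specificity (97%) from this meta-analysis. The prevalence values are hypothetical screening scenarios chosen for illustration, and the function name is an assumption.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given AF prevalence."""
    tp = sensitivity * prevalence
    fn = (1.0 - sensitivity) * prevalence
    tn = specificity * (1.0 - prevalence)
    fp = (1.0 - specificity) * (1.0 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


# Pooled estimates from this review; prevalences are hypothetical.
for prevalence in (0.01, 0.05, 0.25):
    ppv, npv = predictive_values(0.94, 0.97, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

At a hypothetical prevalence of 1%, the PPV is approximately 24% even with the pooled estimates taken at face value, so roughly three in four positive results would be false. Any reduction in real-world specificity would lower this further.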

Strengths and limitations

This systematic review addresses a contemporary technology, with rapidly changing hardware and software. In order to capture the evolving field, we included data from published articles as well as conference abstracts, where findings may not have been peer reviewed or full information may not have been available. Ascertainment of study quality and bias was challenging for abstracts, and scores could improve following full-text publication. However, as many abstracts do not go on to a full-text publication, omitting these studies would have contributed to publication bias, particularly for studies with less positive or neutral results. The full range of study designs was included (retrospective, prospective and case-control studies), which may have led to an overestimation in diagnostic accuracy. Heterogeneity was substantial and there was evidence of possible publication bias. This is perhaps not surprising, as commercial companies that supported these studies may be less inclined to publish neutral studies or may have withheld developing results in order to protect their intellectual property. Due to the overall level of study quality, we were unable to separate results for low bias studies, and in particular selection bias is likely to have substantial impact on the generalisation of our findings.

Conclusion

Due to its paroxysmal nature in many patients, the detection of AF can be challenging using conventional ECG methods. With the growing use of smartphones, PPG technology offers the potential for large scale, non-invasive, patient-led screening of AF. PPG technology has shown promise for AF detection with high sensitivity and specificity. However, the current evidence base consists of small, biased and low-quality studies which are insufficient to advise clinicians on the true value of PPG devices for AF detection. In view of the extensive global use of such devices, further research is urgently required with reference standards, standardised validation and transparent algorithms.

Key messages

What is already known on this subject?

  • With its rising prevalence and sporadic nature, early diagnosis of atrial fibrillation (AF) can prevent adverse events in at-risk patients.

  • Smartphone photoplethysmography (PPG) technology has the potential to offer widespread non-invasive community AF screening over a prolonged period of time.

What might this study add?

  • This systematic review and meta-analysis compared smartphone PPG applications with standard 1, 3 or 12-lead electrocardiograms for AF detection.

  • This meta-analysis showed unrealistically high sensitivity and specificity for AF detection, and identified concerns regarding study quality and bias, limiting applicability to current practice.

How might this impact on clinical practice?

  • This review demonstrates that at present there is insufficient evidence to recommend the use of smartphone PPG for AF detection in clinical practice.

  • Further independent large-scale studies are required to evaluate its role for diagnostic and screening purposes.

Footnotes

  • Twitter @vrothCardoso, @amibanerjee1

  • Correction notice This article has been corrected since it was first published. The open access licence has been updated to CC BY.

  • Contributors All authors are justifiably credited with authorship, according to the authorship criteria. In detail, SG, KVB, CS and DK are responsible for overall content, conception and design; SG and KVB for acquisition of data; SG, DK and VRC for analysis and interpretation of data; SG and DK for drafting the manuscript and final approval; SG, KVB, CS, VRC, NG, H-WU, JAW, K-SW, AB, FWA, MJCE, GG and DK for revision of manuscript and final approval. SG acted as guarantor.

  • Competing interests Dr Gill reports funding through the BigData@Heart Innovative Medicines Initiative, grant no.116074. Dr Bunting reports a grant from the University of Birmingham’s British Heart Foundation Accelerator Award (BHF AA/18/2/34218). Dr Sartini reports that Bayer AG is one of the partners that have funded the IMI framework and is employed by Bayer AG in this IMI collaboration. Mr Roth Cardoso has nothing to disclose. Ms Narges Ghoreishi has nothing to disclose. Dr Uh reports grants and personal fees from Innovative Medicines Initiative 2 BigData@Heart, grant no. 116074, during the conduct of the study. Dr Williams has nothing to disclose. Dr Kiliana Suzart-Woischnik has nothing to disclose. Dr Banerjee reports grants from AstraZeneca, outside the submitted work. Professor Asselbergs reports grants from Innovative Medicines Initiative 2 Joint Undertaking under grant agreement No (116074), during the conduct of the study and is supported by UCL Hospitals NIHR Biomedical Research Centre. Professor MJC Eijkemans has nothing to disclose. Professor Gkoutos has nothing to disclose. Professor Kotecha reports grants from the National Institute for Health Research (NIHR CDF-2015-08-074 RATE-AF; NIHR HTA-130280 DaRe2THINK; NIHR EME-132974 D2T-NV), the British Heart Foundation (PG/17/55/33087 and AA/18/2/34218), EU/EFPIA Innovative Medicines Initiative (BigData@Heart 116074), the European Society of Cardiology supported by educational grants from Boehringer Ingelheim/BMS-Pfizer Alliance/Bayer/Daiichi Sankyo/Boston Scientific, the NIHR/University of Oxford Biomedical Research Centre and British Heart Foundation/University of Birmingham Accelerator Award (STEEER-AF NCT04396418), Amomed Pharma and IRCCS San Raffaele/Menarini (Beta-blockers in Heart Failure Collaborative Group NCT0083244); in addition to personal fees from Bayer (Advisory Board), AtriCure (Speaker fees), Protherics Medicines Development (Advisory Board) and Myokardia (Advisory Board).

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.