Wearables for COVID-19 Detection: 1 Year Later

Last May I wrote a post discussing the growing interest in using wearables to aid the battle against COVID-19. Just over 1 year later the preliminary results of several studies on this topic have been published.

In this article I round up the research related to using wearables to detect COVID-19 before or at the time of symptom onset.

Why wearables?

Wearables could serve as a first-level screening for COVID-19 to alert individuals of a possible infection early in the course of the disease. The rationale is that an alert from a wearable would:

(1) allow the individual to take immediate precautionary measures (isolation, masking, testing) to prevent spreading

(2) help health officials (or businesses, sports organizations, schools, etc.) to prioritize who should get COVID-19 testing on a given day when resources are limited

To be clear, at this point no one is arguing for wearables to serve as a medical-grade diagnostic test. Rather, they may offer a valuable first-line defense against illness spread, because individuals can use them in a hands-off 24/7 manner and receive alerts as soon as they wake up in the morning. And while wearable data is not medical grade, the presence of many data points, over several months, and from multiple physiological data streams (heart rate, respiratory rate, temperature, activity, sleep) could overcome limitations in the accuracy of a single measure.

How this article is organized

For each paper I summarize the purpose, methods, and key results, along with an exemplary graph and caption copied from the paper to help illustrate the findings. Where possible, I separate results that I consider proof-of-concept (looking at data to figure out if there is anything possibly meaningful there) from detection algorithms (actually trying to predict infection from the data using a model that was not built on that same data).

If you don’t want to get into the nitty-gritty, you can jump to the end of the article, where I attempt to pull all these findings together and look ahead to what's next in this space.

Note: Most of the methods are complex, so I do not dive heavily into the details here. My main goal is to give a high-level overview. I strongly encourage you to dig into the papers themselves via the links provided. Two papers have not yet been peer-reviewed at the time of publication and are denoted by PREPRINT.

The Papers

Miller, D. J. et al. Analyzing changes in respiratory rate to predict the risk of COVID-19 infection. PLoS ONE 15, e0243693 (2020).

Mishra, T. et al. Pre-symptomatic detection of COVID-19 from smartwatch data. Nat Biomed Eng 4, 1208-1220 (2020).

Natarajan, A., Su, H.-W. & Heneghan, C. Assessment of physiological signs associated with COVID-19 measured using wearable devices. NPJ Digit Med 3, 156–8 (2020).

Smarr, B. L. et al. Feasibility of continuous fever monitoring using wearable devices. Sci Rep 10, 21640–11 (2020).

Hirten, R. P. et al. Use of Physiological Data From a Wearable Device to Identify SARS-CoV-2 Infection and Symptoms and Predict COVID-19 Diagnosis: Observational Study. J Med Internet Res 23, e26107 (2021).

Quer, G. et al. Wearable sensor data and self-reported symptoms for COVID-19 detection. Nat Med 27, 73–77 (2021).

Hassantabar, S. et al. CovidDeep: SARS-CoV-2/COVID-19 Test Based on Wearable Medical Sensors and Efficient Neural Networks. (2020). Preprint.

Liu, S. et al. Fitbeat: COVID-19 Estimation based on Wristband Heart Rate. (2021). Preprint.

Terminology

Several terms come up frequently in this area of study, so I’ll define them here:

HR (or RHR) - heart rate (or resting heart rate), typically taken during a period of sleep or daytime rest for the purpose of these analyses. The calculation depends on the study and may be the mean, median, or minimum during a given time interval. An unusual increase in resting HR may indicate illness.

HRV - heart rate variability, or how (in)consistent the time between heart beats is. Also typically measured during rest or sleep. There are several different ways to calculate it (for example, the root mean square of successive differences, or RMSSD), but generally too little or too much may signal illness.

RR - respiration rate, or how many breaths an individual takes per min. Similar to HR and HRV, this is typically calculated as an average during rest or sleep to maximize accuracy. RR may increase due to an illness that causes breathing difficulties, such as COVID-19.

T - temperature. In the studies presented here it represents skin temperature rather than core body temperature.

Sensitivity - true positive rate, or the percentage of individuals with the disease that are correctly identified by a classification algorithm (sick/not sick).

Specificity - true negative rate, or the percentage of individuals without the disease who are correctly identified as not having the disease by a classification algorithm (sick/not sick).

PPV - positive predictive value, or the probability that a person with a positive test actually has the disease.

NPV - negative predictive value, or probability that a person with a negative test actually does not have the disease.

AUC - area under the receiver-operating characteristic (ROC) curve, which is created by plotting the sensitivity of a diagnostic test against its false positive rate. The closer the value is to 1, the better the test is at classifying disease.
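To make the four screening metrics above concrete, here is a minimal Python sketch that computes them from the counts in a 2x2 confusion matrix. The `ConfusionCounts` class, the `screening_metrics` function, and the example counts are made up for illustration; they do not come from any of the papers.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ConfusionCounts:
    tp: int  # sick and flagged as sick (true positives)
    fp: int  # healthy but flagged as sick (false positives)
    tn: int  # healthy and flagged as healthy (true negatives)
    fn: int  # sick but flagged as healthy (false negatives)


def screening_metrics(c: ConfusionCounts) -> Dict[str, float]:
    """Return sensitivity, specificity, PPV, and NPV as fractions (0 to 1)."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # share of sick cases caught
        "specificity": c.tn / (c.tn + c.fp),  # share of healthy cases cleared
        "ppv": c.tp / (c.tp + c.fp),          # how trustworthy a positive is
        "npv": c.tn / (c.tn + c.fn),          # how trustworthy a negative is
    }


# Hypothetical counts, chosen only to keep the arithmetic easy to follow.
example = ConfusionCounts(tp=30, fp=10, tn=190, fn=70)
metrics = screening_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Note that sensitivity and specificity are computed within the sick and healthy groups respectively, while PPV and NPV are computed within the positive and negative results, so the latter two shift with how common the disease is in the group being tested.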

Now the good stuff…

Round Up of Research on Early Detection of COVID-19 via Wearables

Analyzing changes in respiratory rate to predict the risk of COVID-19 infection

Miller et al. 2020, Results of predictive algorithm for COVID-19 from respiratory rate data. Cumulative percentage of individuals from each dataset that were classified as COVID-19 positive (C+, black) or COVID-19 negative (C-, gray) relative to symptom onset (day = 0).

Purpose
To predict COVID-19 infection before or during early symptom onset using RR.

Participants
271 WHOOP Strap users who reported COVID-19 symptoms and test results from March 14 through June 6, 2020. They were divided into 3 groups: training (COVID-19 symptoms March 14-April 14), validation #1 (COVID-19 positive with symptoms April 14-June 6), and validation #2 (COVID-19 negative with symptoms April 14 – June 6).

Wearable
WHOOP Strap

Wearable Measures
RR (nighttime) transformed into several different features. The features essentially examined different ways of normalizing the current RR value to “baseline.” One feature also looked for a linear trend in RR in the preceding nights.
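As a rough illustration of what "normalizing to baseline" and "looking for a linear trend" can mean, here is a small Python sketch. It is not the feature set from the paper: the z-score form, the 14-night baseline window, the 3-night trend window, and the sample values are all assumptions chosen for the example.

```python
from statistics import mean, stdev


def baseline_zscore(nightly_rr, baseline_nights=14):
    """Z-score of the most recent night against the nights just before it.

    `baseline_nights` is an assumed window length, not a value from the paper.
    """
    history = nightly_rr[-(baseline_nights + 1):-1]  # exclude the latest night
    if len(history) < 2:
        raise ValueError("need at least two baseline nights")
    spread = stdev(history)
    if spread == 0:
        return 0.0
    return (nightly_rr[-1] - mean(history)) / spread


def recent_trend(nightly_rr, window=3):
    """Least-squares slope (breaths/min per night) over the last `window` nights."""
    recent = nightly_rr[-window:]
    n = len(recent)
    if n < 2:
        raise ValueError("need at least two nights to fit a trend")
    x_mean = (n - 1) / 2
    y_mean = mean(recent)
    numerator = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(recent))
    denominator = sum((i - x_mean) ** 2 for i in range(n))
    return numerator / denominator


# Hypothetical nightly respiratory rates (breaths/min), rising on the last two nights.
rr = [15.0, 15.2, 14.8, 15.1, 14.9, 15.0, 15.1, 14.9, 15.6, 16.4]

print(f"z-score of latest night: {baseline_zscore(rr):.2f}")
print(f"3-night trend: {recent_trend(rr):.2f} breaths/min per night")
```

Both features compare a person only with their own recent history, which is why a value that looks unremarkable in isolation can still stand out as an anomaly.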

Early Detection Algorithm
The model used several RR features to estimate the probability that an individual was infected with COVID-19 on the current day. Days with a probability of 30% or higher were classified as C+ and the remaining days as C-. In individuals with confirmed COVID-19, the model had 36.5% sensitivity and 95.3% specificity, meaning it correctly classified 36.5% of infected days (2 days before to 3 days after symptom onset) as C+ and 95.3% of healthy days (30 to 14 days before symptom onset) as C-. The PPV was 73.8% and the NPV was 80.6%. 20% of COVID-19 cases were detected 2 days before symptom onset, 30% at symptom onset, and 80% by 3 days after onset.
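The thresholding step and the "cumulative percentage detected" view can be sketched in a few lines of Python. Only the 30% cut-off comes from the study; the function names, the data layout (a dictionary mapping day relative to symptom onset to a model probability), and the probabilities themselves are hypothetical.

```python
THRESHOLD = 0.30  # probability cut-off used to label a day C+


def classify_day(probability, threshold=THRESHOLD):
    """Label a single day as C+ or C- from the model's probability."""
    return "C+" if probability >= threshold else "C-"


def first_detection_day(daily_probs, threshold=THRESHOLD):
    """Earliest day (relative to symptom onset) flagged C+, or None if never flagged."""
    flagged = [day for day, p in daily_probs.items() if p >= threshold]
    return min(flagged) if flagged else None


def cumulative_detected(people, day, threshold=THRESHOLD):
    """Fraction of individuals first flagged C+ on or before `day`."""
    hits = 0
    for daily_probs in people:
        first = first_detection_day(daily_probs, threshold)
        if first is not None and first <= day:
            hits += 1
    return hits / len(people)


# Made-up probabilities for four individuals; keys are days relative to onset.
people = [
    {-2: 0.35, -1: 0.40, 0: 0.55},
    {-2: 0.10, -1: 0.12, 0: 0.31, 1: 0.60},
    {-2: 0.05, -1: 0.08, 0: 0.10, 1: 0.20, 2: 0.33},
    {-2: 0.05, 0: 0.10, 3: 0.20},
]

for day in (-2, 0, 3):
    print(f"detected by day {day:+d}: {cumulative_detected(people, day):.0%}")
```

Lowering the threshold would flag more infected days earlier, at the cost of more healthy days being labelled C+, which is the sensitivity/specificity trade-off behind the numbers reported above.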

Summary
This is one of the first studies to use RR data as the primary input to its model. The results are promising and provide good proof of concept that RR could support illness detection. As in other studies, the results suggest the model would be best suited to identifying the absence of disease. The authors also provide valuable information about the typical variability of RR, finding that across 750,000 nights of data, RR appears to be less variable than resting HR or HRV, which may make anomaly detection easier.

So What Does It All Mean?

It’s encouraging that there are several groups working on this problem in parallel, each taking slightly different approaches with seemingly similar success. Which approach shakes out to be the most effective remains to be seen. However, several key themes have emerged:

Heart rate, heart rate variability, respiratory rate, and skin temperature have value in detecting COVID-19, and likely other flu-like illnesses, too. Galvanic skin response, activity, and sleep parameters also may be helpful.
Illness periods are best identified by looking for anomalies in the data relative to an individual’s typical or “baseline” values. There is too much difference between individuals in these types of data to allow single values in isolation to be meaningful.
Complex analyses are not necessarily superior to simpler analyses. Methods used by these studies ranged from a simple mean difference in heart rate to frequency content analysis of skin temperature data. The optimal way to analyze each signal, however, is yet to be determined.
Wearable data seem better suited to identifying absence of disease than the presence of disease. Users with a “negative” result could have good confidence to go about their regular lives, while “positive” results flag a need for medical consultation and isolation measures until a diagnostic test can be obtained.
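One way to see why a "negative" result tends to be more trustworthy than a "positive" one is to apply Bayes' rule to a fixed sensitivity and specificity while varying how common infection is. The sketch below uses the sensitivity (36.5%) and specificity (95.3%) reported by Miller et al.; the prevalence values are hypothetical and chosen only to show the trend.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied where `prevalence` of people are infected."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


SENSITIVITY = 0.365  # reported by Miller et al.
SPECIFICITY = 0.953  # reported by Miller et al.

# Hypothetical prevalence values, not taken from any of the studies.
for prevalence in (0.01, 0.05, 0.25):
    ppv, npv = predictive_values(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

When infection is rare, most alerts are false alarms even with high specificity, while a negative result remains very likely to be correct. That is consistent with treating a positive alert as a prompt for isolation and testing rather than as a diagnosis.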

Not a Silver Bullet

There are several limitations to this research that need to be noted:

These are mainly explorations of feasibility. None of this research is ready yet for real-time operation in the wild although several research groups are actively working toward this.
Research participants are not representative of the general population. Most of these studies relied on participants who already had wearable devices and also had access to COVID-19 testing in the first half of 2020. Quer et al. 2020 reported that while “a recent survey found no racial or ethnic variation in smartwatch or activity tracker usage (23%, 26% and 21% for Black, Hispanic and White individuals, respectively), the lowest percentage of users were identified in those with the lowest annual earnings (12%), the lowest educational attainment (15%) and in those over age 50 (17%).”
Defining “healthy” and “sick” is fuzzy. COVID-19 has a variable incubation period of 2-14 days from infection to symptom onset. Since most studies relied on participant self-reports of the date of symptom onset or positive COVID-19 test, it’s hard to know precisely when someone was infected. Additionally, some individuals classified as “healthy” actually may have had asymptomatic COVID-19. Finally, COVID-19 diagnostic criteria were evolving in parallel with these studies. Altogether, this makes the “healthy” and “sick” labels noisy, which in turn challenges the models.
A wearable only works when you wear it. Data was lost in several studies when individuals forgot to wear their device or let it run out of charge. This is common with wearables and worse when someone is feeling unwell. Besides compromising results, it highlights the practical challenges of relying on a wearable to manage health issues.

On the Horizon

I expect we will see more evolved algorithms emerge over the coming months. Jawbone, one of the earliest consumer wearable manufacturers, appears to be coming back to market with a focus on illness detection. Other wearable companies also have begun staking a claim in this space. Meanwhile many of the research groups cited in this article are taking their proof-of-concept findings and deploying them in real-time detection studies. And we await the first results of ongoing projects at West Virginia University, Duke University, and the Department of Defense.

The Big Picture

Even though an end to COVID-19 seems in sight (thank you vaccines!), this research is important and will remain important for many years to come. Ultimately, advancing this area of science is my main motivation for summarizing these studies here.

A wearable could help with disease screening and triage in the interim. Recent estimates suggest herd immunity to COVID-19 is at least another year away, and likely longer, with significant geographic disparity. Given the high cost of most wearable devices and lower adherence in older populations, however, it remains to be seen whether this can be done in an accessible manner.

This research can and should be leveraged to develop broader early illness detection strategies. Most of this work looks at detecting flu-like illness rather than COVID-19 specifically. As Joseph Patterson noted in our University of Michigan panel discussion last year, a bad case of strep throat ripping through a US Army barracks is just as problematic for military readiness as the presence of COVID-19. Many illnesses inflict a serious human and financial cost, so the opportunity to detect illness before transmission to others can be tremendously beneficial to employers, the military, sports, and of course, the general public.

It appears very feasible for wearable devices that provide physiological data to be used to reliably alert an individual to possible illness. Like any scientific endeavor, much more work needs to be done, so I’ll keep this space updated as the research evolves.

Until then, wishing you low resting heart rates, ample sleep durations, and a healthy summer 2021!

Extra Credit: Other Interesting COVID-19 Wearables Research

Observational study on wearable biosensors and machine learning‐based remote monitoring of COVID‐19 patients

The study used the Everion sensor (HR, HRV, RR, pulse oxygen saturation, skin temp, actigraphy) and a proprietary illness index (Biovitals Sentinel platform) to predict disease severity, clinical status, and length of stay for COVID-19 patients in the hospital. The proprietary index outperformed the National Early Warning Score 2 (NEWS2) for discriminating viral load, predicting clinical worsening events, and predicting hospital discharge.