PSA: Your “sleep monitor” is probably anything but

As the “quantified self” (probably ill-named) movement gains steam, all kinds of apps that purport to measure health-related physiological parameters are gaining popularity.

In principle, this development is to be welcomed, as individual lifestyles and metabolisms are so heterogeneous across the population that most scientific studies on the matter are too noisy to offer more than very limited benefits for optimizing individual lifestyles.

Having more data available is almost always to be welcomed, as it can inform decision making about lifestyle choices far beyond common sense (which is actually far less common than commonly assumed) and old wives’ tales. Moreover, the medical profession is more geared towards limiting downsides and minimizing suffering from acute harm than towards optimizing the upside of life. Preventative medicine is in its infancy.

However, these developments naturally also harbor significant risks. The only decisions worse than those made randomly are those based on bad data, as they tend to be systematically wrong, yet are defended with conviction.

Most scientists are properly trained in the gathering and interpretation of data, as this is how they make their living. As data collection and interpretation move into the mainstream, familiar (to scientists) concerns about objectivity, reliability and validity of data come to the fore.

Giving a primer on research methods is beyond the scope of this piece (however helpful it might be). Instead, we will focus on one recent – alarming – development.

As we have observed before, language matters. Tremendously. This is also the case here.

In a nutshell, all devices that measure physiologic parameters rely on proxies (that are actually measured). The validity of the measurement crucially relies on the tightness of the link between the proxy and the quality of interest. This can be tricky if the quality of interest is psychological or neurological in nature. For instance, current and past “lie detectors” are really only called that. In reality, they measure skin conductance changes, which are used as a proxy for sweat gland activity, which is used as a proxy for sympathetic nervous system activity, which is used as a proxy for the probability that someone is lying (because something is personally significant or unsettling). In principle, this is sound, as the autonomic nervous system is not under the voluntary control of most people. However, that’s a lot of proxies, so the link is tenuous at best.
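To put a rough number on how quickly a chain of proxies degrades, here is a minimal sketch. It assumes each link is linear with independent noise and depends only on the link before it, in which case the end-to-end correlation is simply the product of the link correlations. The value of 0.8 per link is made up for illustration and is not a measured property of any real device.

```python
from math import prod

def end_to_end_correlation(link_correlations):
    """Correlation between the first and last variable of a proxy chain.

    Assumes a linear chain with independent noise at every step, so the
    correlations of adjacent links simply multiply.
    """
    return prod(link_correlations)

# Hypothetical values: three proxy steps, each a respectable r = 0.8.
links = [0.8, 0.8, 0.8]
r = end_to_end_correlation(links)

# r is about 0.51, so the measurement explains only about 26% of the
# variance in the thing we actually care about.
print(round(r, 3), round(r ** 2, 3))
```

Even with individually decent links, three steps leave barely a quarter of the variance explained, which is the sense in which the overall link becomes tenuous.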

The same is true for sleep, which is still for the most part a mystery. Even if you go in for a clinical “sleep study”, the way sleep is currently assessed is by “polysomnography”. Briefly, several sensors measure the electrical activity on the scalp (via EEG), muscular activity (via EMG), eye movements (via EOG), breathing rate, blood oxygenation and perhaps some other parameters. Of these, EEG, EMG and EOG – in combination – are most diagnostic of gross physiological states, such as sleep. For instance, an EEG that is dominated by high amplitude and low frequency waves (delta) characterizes deep sleep. But things can get tricky. The EEG patterns during REM sleep are relatively low amplitude and high frequency and can look quite desynchronized. To the untrained observer, it would be hard to distinguish REM sleep from wakefulness based on the EEG alone. That’s where the other parameters come in. During REM, EMG will show a lack of muscle tone (there is actually an active muscle paralysis going on) whereas EOG will show characteristic rapid eye movements, both of which set it apart from the waking state. Similarly, it is hard to distinguish deep sleep from REM sleep if one were to look at the EMG alone. There isn’t much movement during either phase, even though these sleep states couldn’t be more different in any other way.

Polysomnography

The gist of this is that one really needs all three parameters in order to properly characterize sleep stages, as they are defined now (as we don’t really understand sleep yet, I anticipate there to be further, neurotransmitter-based metrics to come into common use in the future). At a minimum, one cannot forgo EEG, as one won’t be able to distinguish REM from deep sleep without it. There is a reason it is called polysomnography. Many parameters need to be measured in order to characterize sleep.
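The point that no single channel separates the states can be sketched as a small lookup table. The signatures below are caricatures of the description above, not clinical scoring rules; the EOG entries for wakefulness and deep sleep are a simplifying assumption of mine, and all names are hypothetical.

```python
from itertools import combinations

# Caricatured channel signatures per state (simplified, not clinical criteria).
SIGNATURES = {
    "wake": {"eeg": "fast_low_amplitude", "emg": "muscle_tone",
             "eog": "no_rapid_movements"},   # EOG entry is an assumption
    "deep": {"eeg": "slow_high_amplitude", "emg": "little_activity",
             "eog": "no_rapid_movements"},   # EOG entry is an assumption
    "rem": {"eeg": "fast_low_amplitude", "emg": "little_activity",
            "eog": "rapid_movements"},
}

def confusable_pairs(channels):
    """Return pairs of states that look identical on the given channels."""
    pairs = []
    for a, b in combinations(SIGNATURES, 2):
        if all(SIGNATURES[a][c] == SIGNATURES[b][c] for c in channels):
            pairs.append((a, b))
    return pairs

for channel in ("eeg", "emg", "eog"):
    print(channel, "alone confuses:", confusable_pairs([channel]))
print("all three confuse:", confusable_pairs(["eeg", "emg", "eog"]))
```

In this toy table every single channel leaves one pair of states indistinguishable, while the full combination leaves none. Real recordings are far noisier than a lookup table, which only strengthens the case for measuring several parameters at once.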

And herein lies the rub. In order to measure the EEG, one has to get an electrode on the scalp. With the advent of wireless technology, brave companies like ZEO have pioneered this approach. While the results fall far short of full polysomnography – as one would do in a clinical setting – they are quite impressive. It is remarkable what modern signal processing can do with a single electrode. Their correlations with polysomnographic recordings are high, implying good agreement with the clinical standard. This kind of “at home” sleep measurement capability opens up the potential for all kinds of investigations, both for research and personal use.

But this technology is not cheap and however important sleep might be, most people turned out to be reluctant to shell out much money for its measurement. Moreover, the ZEO device necessarily required a headband to be worn, and most people couldn’t be bothered. Consequently, ZEO (the company) is struggling for survival.

In contrast, all kinds of smartphone apps and devices that rely on accelerometers, which can be had for a few dollars, are booming. It is understandable that people want to minimize cost and do not want to bother with headbands, preferring to use devices that they have anyway (e.g. phones, calorie trackers). However, one should *not* confuse these devices with sleep monitors, as they do no such thing. Claims that they allow one to “track”, “monitor” or “measure” sleep are disingenuous at best.

Measuring sleep is not trivial

Inferring sleep from vibrations on the surface of the bed is at least one step too far. A myriad of factors could confound these measures, all of which should be obvious, such as pets jumping on the bed, sharing the bed with a partner, etc. Most serious is the insurmountable hurdle of distinguishing REM sleep from deep sleep based on data from accelerometers. Most currently available apps just lump the two together and call it “sleep depth”. This is at best inaccurate (as they could not be more different physiologically) and at worst dangerous. For instance, it has been shown that a lot of deep sleep is restorative in terms of physical exertion whereas too much REM sleep is not necessarily a good thing. Instead, it can be indicative of a major depressive episode. Sleep disturbances like that accompany almost all mood disorders. It is an extremely disconcerting prospect that someone with such a disorder could rely on measurements from such a pseudo-sleep-monitor to reassure themselves that their sleep is just fine, when it is really not.

To summarize, actigraph-based metrics can at best measure the quantity, but not the quality of sleep. Claiming otherwise is (deliberately?) misleading.
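To make the quantity-versus-quality distinction concrete, here is a minimal sketch of a movement-threshold estimator. It is not the algorithm of any particular app; the function name, the threshold and all the data are invented for illustration.

```python
def estimate_sleep_minutes(movement_counts, threshold=5, epoch_minutes=1):
    """Count every epoch with movement below the threshold as sleep.

    A crude, hypothetical rule; it sees only movement, never sleep stages.
    """
    return sum(epoch_minutes for count in movement_counts if count < threshold)

# Hypothetical movement counts per one-minute epoch, identical on both nights.
movement = [12, 9, 1, 0, 0, 2, 0, 1, 0, 8]

# Hypothetical "true" stages: the same stillness, very different sleep.
night_a = ["wake", "wake", "deep", "deep", "deep", "deep", "deep",
           "rem", "rem", "wake"]
night_b = ["wake", "wake", "rem", "rem", "rem", "rem", "rem",
           "rem", "deep", "wake"]

estimate = estimate_sleep_minutes(movement)
for name, stages in (("night A", night_a), ("night B", night_b)):
    print(name, "estimated sleep:", estimate, "min;",
          "true REM:", stages.count("rem"), "min")
```

Both nights yield the same estimate because the estimator only ever sees the movement trace. Whether the still epochs were deep sleep or REM is invisible to it.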

No one likes criticism. But in this case, it is warranted. It is understandable why people do what they do, but in this case, I must stand fast. I’m not sure if it will make a dent, but it is imperative to stem this quite unfortunate development.

Of course, there is no harm in people using these cheap accelerometer-based devices if they understand their limitations. However, given how these are currently marketed, I doubt that most people are aware of this problem. It would be a good start to stop calling them “sleep monitors”.

It is up to every individual how seriously they take the development of their own life. As long as they can live with the outcome. Because they will have to live with it.

Update: ZEO has lost the struggle for survival. It is a terrible shame that cheap actigraphs killed the only device that came close to a home sleep monitor. Hopefully, this is just an indication that ZEO was ahead of its time and not indicative of how serious people are willing to get about sleep. Perhaps people *are* rational after all, though. If the vast majority of users is unable to interpret a hypnogram anyway, it makes sense to go with the option that is cheaper, more convenient and – apparently – simpler to interpret. As detailed above, this can – however – be dangerous, as excessive REM sleep is far less refreshing than commonly believed.

In the meantime, I bet there is a sizable minority who would be happy to spend quite a bit for genuine home sleep-monitoring capabilities sensu ZEO. What to do? Kickstarter to the rescue? Maybe mattress manufacturers could sponsor it – if their marketing claims are true, one *should* see a reliable improvement in sleep, as measured.

Awesome writeup. Just want to note that part of the problem with Zeo’s survival was likely their abominably bad reliability. Look at the Amazon reviews and it’s easy to conclude that a very large portion of their devices simply failed. I went through three of them before they shuttered.

I realize this article was posted quite a while ago, but have you seen the Kokoon Headphones + Sensors that do a headband-style neural activity tracking through a headphone band? Thoughts on expected quality of readings through hair?