Editor’s Note: Millions of people suffer from serious mental
illness, but very few receive consistent, coordinated care. Since leaving his
post in 2015 after 13 years as director of the National Institute of Mental
Health, co-author Tom Insel has been on a mission to use technology (such as
mining your smartphone) to better understand your state of mind and treat
depression, schizophrenia, and other disorders. Insel and co-author Joshua
Chauvin, part of the team at a healthcare innovation company, examine the
potential and pitfalls of this next digital frontier.

Imagine that you visit your physician complaining of a fever and,
rather than taking out a thermometer, they begin hovering their “educated
hands” over you. Gradually, they press down against your arm to gain a full
impression of your skin’s temperature and the “deeper seated combustions.”
Removing their hand, they look closely at your appearance and pronounce their
assessment: you do, in fact, have a fever. You might (justifiably) be dubious.

Thankfully, clinicians today have an inexpensive, ubiquitous
tool to measure a patient’s temperature. But it wasn’t always this way.1 The first person to devise a thermometer, an
instrument to track temperature, was an Italian physician, Santorio Santorio,
who in the early 17th century described a device for measuring the expansion of
water or alcohol with heat. It was a century later that Fahrenheit, a
Polish-born Dutchman who was both a physicist and a glass blower, used mercury
instead of water or alcohol and created the temperature-measuring scale that we
continue to use today. But this device could not be used in clinical practice.
The early versions were cumbersome and, for over a century after Fahrenheit, no
one knew how to connect the measurement of body temperature to the state of
disease. Indeed, until late in the 19th century, the physician’s hand was the
standard medical instrument for detecting a fever.

Lack of Objective Measurement in Psychiatry

Just as the thermometer provided a
standardized, objective measurement for detecting fever, tools to quantify health and disease parameters have transformed medicine in almost every
major disease area—electrocardiograms for heart disease, blood glucose for
diabetes, and, recently, genetic diagnostic tests for cancer. But when it comes
to brain health, and in the case of mental illness especially, progress has
been uneven. Although direct brain imaging instruments exist, most (MRI, PET,
MEG) are expensive, inaccessible to many, rarely useful for deciding the
treatment of an individual patient, and time-intensive to administer. While
they can identify brain lesions in multiple sclerosis or dementia, they are
less useful in mental disorders. This lack
of measurement matters because, to borrow a truism from business, “we don’t
manage well what we don’t measure well.”

In the absence of reliable instruments,
clinicians treating mental illness use indirect, intuition-based measures. The DSM-5, the principal schema
for classifying mental disorders, requires clinicians to form diagnoses based
on their subjective judgments and arbitrary cut-off points. For instance, the
diagnosis of major depressive disorder requires five of nine features (such as
diminished interest or pleasure in activities, feelings of worthlessness, or
diminished ability to think or concentrate) to persist for two weeks—based on
patient or family reports.

Similarly, during treatment, patients are not
routinely monitored with objective assessments. Some clinicians employ self-report
questionnaires, like the PHQ-9 (a nine-item scale to rate depressive symptoms),
but these scales are only modestly correlated with ratings of trained observers.2 Self-reported
information is, of course, important to monitor but, like reports of chest pain
or headache, it usually proves insufficient on its own. (That
is, such measures have relatively low inter-rater reliability, do not assess
patients in real-world settings, and often cannot reliably attribute observed changes to a
given intervention.) Moreover, only 18 percent
of US psychiatrists and 11 percent of psychologists routinely use
symptom-rating scales or Patient-Reported Outcome Measures (PROMs) to monitor
patient improvement.3,4 Thus, for the
vast majority of patients with a mental illness, measurement often comes down
to “How are you feeling?” during sporadic, brief visits in primary care. This
is a bit like our example of the doctor trying to determine whether your
temperature is increasing with only their hands and clinical experience
as a guide, or like treating hypertension without a blood pressure cuff or
diabetes without a glucometer.

There has been a push toward using
“measurement-based care” that relies on standard rating scales and
patient-reported outcomes, and good evidence that it can improve clinical
outcomes.5 But this approach has its limitations.
Practically speaking, standard assessments can be difficult to implement, held
back by a lack of financial support and limited personnel to administer the
tests. They increase paperwork, which can burden stretched clinicians.6 Perhaps most problematic, these measurement tools are necessarily brief and can
capture only a narrow spectrum of a patient’s overall state (e.g., general
depression symptoms). And since they are administered infrequently, usually in
the clinic, they of necessity collect one-time, or “snapshot,” impressions of a person’s mental health.7

How can we move
beyond this state of affairs?

Let’s imagine
the ideal form of measurement for mental health. In addition to being
objective, it would be continuous (assessing symptoms frequently) and precise
(both sensitive and specific) and collected in the “real world” (outside the
context of the clinical encounter). It should give clinicians access to summarized and up-to-date patient data (e.g. on
symptom severity), easily interpretable to provide meaningful, clinically
actionable information.8,9 Such information would enable
clinicians to measure response to treatment in real-time on an ongoing basis
and to adjust treatment plans based on the patient’s preference and response. Finally, to be
effective and scale to global populations, the measurement should be
passive—done without asking individuals to change their behavior or do anything
on top of what they are already doing. Taken together, the combination of
attributes would help to ensure early and timely intervention. Instead of the
current model of care, which is largely reactive (administered when someone is
presenting symptoms), better measurement that is objective, continuous, and
passive can move the health system towards more proactive and preventive
interventions.

The Hope: Advent of Digital Phenotyping

If such a tool
sounds implausible, consider for a moment recent advances in information
technology and data science.

In 2011, the World Health Organization stated: “The use of mobile and wireless technologies to
support the achievement of health objectives (mHealth) has the potential to
transform the face of health service delivery across the globe.”10 Since then, smartphone subscriptions have increased more than five-fold (from
856 million to more than five billion today), with projections to reach nearly
seven billion by 2022.11 There’s also been astonishing growth in broadband access, even in areas without
easy access to clean water.12

Over the same period, there
have been significant advances in data science too, including the advent of
machine learning, which can find patterns in large data sets that were not
evident using conventional statistical approaches. These developments are
already transforming healthcare: diagnostic testing is beginning to incorporate
a form of machine learning called neural networks,13
healthcare systems are taking advantage of machine learning to help triage and
streamline patients through services,14 and
predictive modeling—the analysis of past and current data to forecast
outcome—is using electronic health records to drive personalized medicine and
improve healthcare quality.15

The increasing ubiquity of
smartphones and advent of technologies, such as home devices (Amazon Echo and
Google Home) and wearables (FitBit, Apple Watch), that can act as a reliable
source of measurement, combined with advances in analyzing continuous data, presents us for the first time with
an opportunity to monitor
brain function at population scale. This
approach, called digital phenotyping, is a two-step process that works
by applying machine learning to data collected from digital devices such as
wearables and smartphones.16
Obtaining the signals from the phone or wearable device is the
first step. Making sense of these signals by finding the patterns
that correlate with clinical state is often the more difficult second
step. To find patterns in complex data, most researchers have used
machine learning, a powerful statistical approach that can extract predictive
features from large data sets.17 Machine learning is a rapidly evolving field that promises to improve our
ability to find clinically relevant signals with each iteration. With it,
information from sensors (e.g. physical activity, location, heart rate),
keyboard interactions, and other features such as voice and speech can be analyzed
to provide insight into changes in a person’s behavior, their psychological
state, and cognitive function. The approach can even provide predictors of
risk.
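
To make the two steps concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the method used in any of the studies cited: the sample format, the function names, and the step counts are made up, and a simple comparison against a person's own baseline (a z-score) stands in for the machine-learning step, which in practice would be far more sophisticated.

```python
from collections import defaultdict
from statistics import mean, stdev
from typing import Dict, Iterable, List, Tuple


def daily_totals(samples: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Step 1: collapse raw (ISO date, step count) samples into one total per day.

    The (date, count) sample format is hypothetical.
    """
    totals: Dict[str, int] = defaultdict(int)
    for day, steps in samples:
        totals[day] += steps
    return dict(totals)


def baseline_z_score(history: List[float], today: float) -> float:
    """Step 2 (stand-in): how far is today from this person's own baseline?

    Returns the number of sample standard deviations between today's value
    and the mean of the earlier days.
    """
    if len(history) < 2:
        raise ValueError("need at least two baseline days")
    spread = stdev(history)
    if spread == 0:
        raise ValueError("baseline has no variation")
    return (today - mean(history)) / spread


# Made-up readings for one person: three ordinary days, then a very quiet one.
samples = [
    ("2024-01-01", 4000), ("2024-01-01", 4200),
    ("2024-01-02", 7800),
    ("2024-01-03", 8600),
    ("2024-01-04", 1500), ("2024-01-04", 500),
]

totals = daily_totals(samples)
days = sorted(totals)
history = [totals[d] for d in days[:-1]]      # every day except the latest
z = baseline_z_score(history, totals[days[-1]])
print(f"latest day is {z:.1f} standard deviations from baseline")
```

A large negative value here only says that the latest day was unusually inactive for this person. Whether such a deviation means anything clinically is exactly what the second step has to establish against real clinical data.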

Examples of using activity monitors and motion
sensors to monitor the behavior of patients with mental illness (what we are
now calling the digital phenotype) have been around since at least the 1980s.18,19 Today, studies continue to demonstrate that patterns of activity and
geolocation can herald mania or depression20 and that sleep actigraphy can predict suicidal ideation;21 moreover, other biosensors have shown that heart
rate variability can help predict a diagnosis of post-traumatic stress disorder,22 and speech and voice, which can reveal
important aspects of our emotional, social, and psychological worlds, may be
able to provide insight into depression.23

While signals from actigraphy and voice have
proven to be predictive, they are also noisy and nosey. One particularly promising
approach to developing digital phenotypes of cognition that might help to move
the field beyond these concerns involves data from human-computer
interaction (HCI). HCI-based digital biomarkers can be generated from
passively-collected, content-free interactions, like typing and scrolling
patterns on a smartphone, measuring the latency between space and character in
a text or the interval between scroll and a click. This approach was originally
developed in cybersecurity to track hackers with what was called “digital
fingerprinting.” (Everyone who spends time online leaves a unique trace in their
pattern of activity, which can be used to
create identifiers for individual users; hence the notion of a digital fingerprint.) Applying this concept to
mental health, scientists have developed digital biomarkers that strongly correlate with performance on traditional
cognitive tests and with mood ratings.24 With the average user’s
output of over 2,600 smartphone touches a day,25 these ubiquitous computer
interactions can reveal a lot about how we think and how we feel and, when
combined with other measures like sleep, activity, and speech, create a digital
phenotype.
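
As a rough illustration of what a content-free feature might look like, the Python sketch below computes two of the timings mentioned above: the latency between a space and the next character, and the interval between a scroll and a click. The event format (`TouchEvent` and its `kind` labels), the function names, and the timestamps are all hypothetical; they are not taken from any real product or from the studies cited. Note that only the type and time of each touch are recorded, never what was typed.

```python
from dataclasses import dataclass
from statistics import median
from typing import List, Optional


@dataclass
class TouchEvent:
    t_ms: int   # timestamp in milliseconds
    kind: str   # "space", "char", "scroll", or "click"; never the content itself


def latencies(events: List[TouchEvent], first: str, then: str) -> List[int]:
    """Gaps in ms for each `first` event that is immediately followed by a `then` event."""
    gaps = []
    for prev, cur in zip(events, events[1:]):
        if prev.kind == first and cur.kind == then:
            gaps.append(cur.t_ms - prev.t_ms)
    return gaps


def median_latency(events: List[TouchEvent], first: str, then: str) -> Optional[float]:
    """Median gap between consecutive `first` and `then` events, or None if there are none."""
    gaps = latencies(events, first, then)
    return median(gaps) if gaps else None


# Made-up events for illustration only.
events = [
    TouchEvent(0, "char"), TouchEvent(180, "space"), TouchEvent(420, "char"),
    TouchEvent(600, "space"), TouchEvent(900, "char"),
    TouchEvent(2000, "scroll"), TouchEvent(2750, "click"),
]

space_to_char = median_latency(events, "space", "char")
scroll_to_click = median_latency(events, "scroll", "click")
print(space_to_char, scroll_to_click)
```

The median is used here because a handful of long pauses (putting the phone down mid-sentence, say) would otherwise dominate an average. That is a design choice for this sketch, not a claim about how published biomarkers are computed.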

Supplementing clinical impressions and subjective, episodic
assessment, digital phenotyping offers an opportunity to move towards
objective, measurement-based care. For psychiatry, it could bring brain health measures to the population,
and with it the ability to target care and intervention to high-risk patients,
extending independence and improving productivity. Applications could include
screening, early detection, disease monitoring, precise diagnosis, and a new
care model based upon these.

The Challenges

Let’s go back to the history of the
thermometer for a moment. It wasn’t because thermometers didn’t exist that
19th-century physicians were reluctant to change practice. As we have seen, they had
in some form been available for 200 years. Nor was it because physicians
weren’t aware that temperature was related to illness—that had been known since
Hippocrates 2,000 years before. So, what was it that held the field back?

While physicians had a reliable
instrument for measuring body heat, they didn’t know what a normal temperature
range was. It was only with the
discoveries made by Carl Wunderlich (1868), a psychiatrist who collated nearly
100,000 observations, that data could define normal and abnormal body
temperature. At that point, with the clinical utility of the thermometer
evident, it was routinely adopted in clinical practice as part of a complete
medical evaluation. Temperature could be used as a biomarker for disease.26

To gain widespread clinical use, digital
phenotyping will need to overcome similar challenges, and a few contemporary
hurdles as well.

As with body temperature, digital phenotyping
needs to be tested in large, diverse populations to identify the digital
biomarkers that matter. This means validating digital parameters against
standard (if imperfect) measures of cognition and mood to determine which, if
any, reliably give
accurate, actionable data. The good
news? There are already many ongoing large-scale clinical trials helping to
validate this technology, and so far, the results have been promising.
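
In its simplest form, validating a digital parameter against a standard measure means asking how closely the two move together across people. The Python sketch below computes a Pearson correlation between a hypothetical digital feature and PHQ-9 totals. The numbers are invented purely to show the calculation and are not results from any study; real validation would also involve far larger samples, held-out data, and checks across diverse populations.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of the same length, at least 2 items")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)


# Invented values, one pair per participant:
# a hypothetical typing-latency feature (ms) and that participant's PHQ-9 total.
latency_ms = [210, 230, 250, 290, 320, 360]
phq9_total = [3, 5, 4, 9, 12, 15]

r = pearson_r(latency_ms, phq9_total)
print(f"r = {r:.2f}")
```

A high correlation on a small, hand-picked sample like this one proves nothing by itself, which is why the large-scale trials matter.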

But the clinical use of digital phenotyping
presents ethical, legal, and social questions that the thermometer did not.27 And there’s a gap between demonstrating clinical value and achieving public
trust: patients must be able to balance the benefits against real or perceived
risks. No doubt building an evidence base will be an essential step in this direction,
but it will not be enough.

With the recent “techlash”
against giant technology companies—consider the stir caused when Cambridge
Analytica misused personal data28
and the ongoing wave of negative news coverage for Facebook29—such
acceptance will require more than compliance with healthcare and privacy
regulations. Besides protecting user data, digital tools must offer transparency
and informed consent, and when there are questions of malpractice, users must
be able to hold designers, providers, companies, and others accountable.30
To pre-empt ethical transgressions and build trust with patients, active engagement
with users in the development of new technologies and careful consideration of
users’ concerns are essential. Tech companies must also consider limiting the
range of data they are collecting and consider the potential invasiveness of
their approach. The content-free digital phenotyping provided by
human-computer interactions described above, for example, is likely to be more
acceptable than approaches drawing upon personally identifiable information
like voice or location.

Even with scientific backing and public trust,
adoption and acceptance—by patients, clinicians, and healthcare systems—still
presents significant challenges. Patients must want to engage with
the new digital health tools. Of the more than
300,000 digital health apps currently on the market, a mere 41 account for the
bulk of all downloads, while 85 percent have fewer than 5,000 installs.31 Clinicians
must likewise be won over. According to an American Medical Association survey, current levels of
use of digital health tools by clinicians remain low, with only 26
percent currently using patient engagement technologies (i.e., solutions for
chronic conditions designed to promote patient wellness and active
participation in their care, for example, through promoting adherence to
treatment) and 13 percent using remote patient monitoring technologies designed
for daily measurement.32 Among other reasons, clinicians reject these
emerging tools because they’re disruptive, time-consuming, unvalidated, and
costly to use. For health systems to adopt
digital health tools more broadly, progress must include better curation and
evaluation of apps. This includes: establishing best practices around privacy
and security; getting patients and providers to recognize the value;
establishing regulatory guidelines and reimbursement models for payments; and
making it easy for clinicians to integrate new technologies into their
practice.33

Finally, we must recognize that digital
phenotyping is only one piece of the puzzle. Improved health outcomes require
more than detection: if the smartphone becomes a digital smoke alarm, how do we
put out the fire? For mental health, many of the best treatments involve
communication, skill building, and a therapeutic relationship. All of these can
be done on a phone, allowing a “closed loop” approach to mental healthcare,
where digital phenotyping identifies a need and the treatment is delivered immediately by a remote clinician.
The same phone can also monitor the impact of the treatment, making
measurement-based care for an individual with depression or psychosis the
equivalent of both the thermometer and the antibiotic for a patient with a
fever.

The pioneering German psychiatrist Carl
Wunderlich is said to have commented that “a physician who carried on his
profession without employing the thermometer was like a blind man endeavoring
to distinguish colors by feeling.”34 The same might one day be said about clinicians who don’t adopt more objective
measures of brain health. Whether digital phenotyping or some other method
takes precedence, it is clear that practice as usual is no longer an option if
we are to improve outcomes for people with mental illness.