Predicting Medical AI in 2017

What a blast 2016 was. It seemed like every day there was a new, massive breakthrough in deep learning research. It was also the year that the wider world really started to take notice. The media, professional groups, and the general public all climbed aboard the AI hype train in 2016. Governments commissioned major reports and conservative economic forums discussed the future of work (like, whether work will still exist).

But in medicine, the progress was much more modest. Indeed, for most of the year I thought we would not see any big disruptive breakthrough of the sort described above. But, at five minutes to the midnight of 2016, Google pulled a rabbit out of their hat with their work on diabetic retinopathy assessment. For the first time, we saw a computer system truly compete with doctors at a medical task.

Doctors started talking seriously about AI in 2016. Specialties in the firing line (like radiology and pathology) have led the conversation, although I’m not convinced by the prevailing wisdom on the short, mid and long term outlook for our professions.

I’ll talk more about the prevailing wisdom some other time, but I think part of the reason people get it wrong is that long term prediction is really hard, especially when the pace of change is so rapid and the role technology is taking is (arguably) unprecedented.

But the near term is much more clear, so I thought it might be worthwhile grabbing my prediction goggles to look to the future. Let’s consider what AI tricks and treats might be in store for the medical world in 2017.

Donning my +1 Glasses of Future Sight. Note: I don’t actually have these and could never wear them. Even sunglasses look ridiculous on me.

Phased and confused

So what is next for medical AI in 2017? In my last blogpost* I talked about the sensible way clinical trials in medicine are divided into different phases, and suggested we can understand AI research in a similar way. This is a nice way to appreciate translational research because these phases reflect how likely it is for an application to reach clinical practice, and how long it will take to get there.

So when you read about a successful clinical trial, you can understand the impact of the results in context. As a rough rule of thumb, the chance of an eventual clinical product and the time until that product is available will be:

Preclinical complete: 5% chance, 10 years

Phase I complete: 10% chance, 8 years

Phase II complete: 50% chance, 5 years

Phase III complete: 80% chance, 1 year

We don’t know if AI research will mimic these numbers, because no AI trials have made it past phase II yet. It is likely that phase I and II trials for AI can be much quicker, because clinical trials require long follow-up times whereas AI trials use retrospective data. But the phase III and regulatory stages should be pretty similar to clinical trials.

The great thing about this framework is that it predicts how soon AI can fulfill its promise in medicine. This makes it a very nice way to ground our predictions, because we only need to estimate how many good quality trials of each phase will be performed, and the size of the impact just flows from there.
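To make "flows from there" concrete, here is a minimal sketch of the arithmetic in Python. It borrows the clinical-trial rule of thumb listed above and applies it to a tally of completed trials. Two things are assumptions on my part: that AI trials follow the clinical numbers at all (as noted, we don't know this yet), and the example tally itself, which is made up purely for illustration.

```python
# Rule-of-thumb figures quoted above for clinical trials:
# phase completed -> (chance of an eventual product, years until available).
# ASSUMPTION: reusing these for AI trials is untested.
RULE_OF_THUMB = {
    "preclinical": (0.05, 10),
    "phase_1": (0.10, 8),
    "phase_2": (0.50, 5),
    "phase_3": (0.80, 1),
}


def expected_products(completed_trials):
    """Sum count * chance over phases: the expected number of eventual products."""
    return sum(
        count * RULE_OF_THUMB[phase][0]
        for phase, count in completed_trials.items()
    )


def earliest_arrival_years(completed_trials):
    """Shortest rule-of-thumb wait among phases with at least one completed trial."""
    waits = [
        RULE_OF_THUMB[phase][1]
        for phase, count in completed_trials.items()
        if count > 0
    ]
    return min(waits) if waits else None


# HYPOTHETICAL tally of completed trials, for illustration only.
example_tally = {"phase_1": 100, "phase_2": 4, "phase_3": 0}

print(expected_products(example_tally))       # 100 * 0.10 + 4 * 0.50 + 0 * 0.80
print(earliest_arrival_years(example_tally))  # phase II is the most advanced here
```

The point is only that once you have a count per phase, the expected impact and the rough timeline are a lookup and a sum away.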

To be clear about what I am predicting here:

I am only considering deep learning research in my predictions. As I have said in the past, older machine learning methods are still widely used and published in medicine, but there is no good reason to expect they will suddenly achieve breakthrough performance after 30 years without this success. Nor should we expect them to work amazingly well in medicine when they are not revolutionising other areas of technology. These older technologies make steady, incremental progress and still have a lot to offer, but aren’t going to be unexpectedly disruptive.

I am only concerned with systems that will directly alter patient care. Self-help apps that recommend you see a doctor at the first sign of trouble, image processing systems, data gathering systems and so on are cool and have a big role to play, but they aren’t “medical”. I’m talking about systems that do doctor work, and need regulatory approval.

I don’t know what I don’t know. If the big tech companies have large unpublished studies that are ahead of publicly available research, I’ll be surprised, but I can’t incorporate that into my predictions. Unless someone wants to tell me about it 🙂

You can’t expect me to know about unknown unknowns!

That definitional stuff out of the way, onto the predictions.

PHASE I

Prediction:

We will see the quantity of phase I research double in 2017.

Argument:

It is really hard to know what we should call phase I research. Technically, every CS student project with a public medical dataset and every medical Kaggle competition is a phase I study. But almost none of these will ever end up becoming products, because they are sort of “throwaway” research. There is no infrastructure to take the projects further. This isn’t something we see in clinical trials, because even preclinical and phase I trials are expensive. The lack of a cost barrier to entry in AI studies confuses the whole space to some extent.

It means I will have to be more narrow in my definitions. I will say a “true” phase I trial is one performed by a research group, published in peer-reviewed literature. These are more analogous to the phase I clinical trials, because researchers are motivated by impact and therefore they usually select projects that can grow into bigger things.

If this is our definition, then we anecdotally see something like five to ten good quality phase I AI trials each month. I “investigated” this by looking at the results of a couple of Google Scholar searches (e.g. “deep learning medicine”), covering the last 6 months. Five to ten per month seemed about right. If anything, I am over-estimating the output at the moment, which would make my prediction a bit optimistic.

I think this will happen because the number of researchers interested in deep learning is growing massively. The barrier to entry is low: practically anyone who can code can spin up cutting edge neural networks without much fuss. We see huge increases in conference attendance, new heavily attended conferences focused on deep learning targeted at medical researchers, and packed-to-the-brim deep learning workshops at the top medical ML conferences (several of the organisers for that one are actually my collaborators).

Not to mention the fact that there are an enormous number of medical informatics and medical machine learning folks who are still publishing on old methods, and they are going to catch on eventually and transition to deep learning.

I think I am being fairly conservative when I say there will be between ten and twenty good quality phase I AI trials per month by the end of 2017. I wouldn’t be surprised by a tripling instead of a doubling.
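For anyone who wants to check the numbers, here is a small Python sketch of what doubling or tripling means for the monthly range, plus the month-on-month growth rate each would imply. The steady compounding growth is an assumption added for illustration; nothing above claims the growth will be smooth.

```python
def scale_range(low, high, multiplier):
    """Scale a (low, high) monthly output range by a multiplier."""
    return low * multiplier, high * multiplier


def implied_monthly_growth(multiplier, months=12):
    """Constant monthly growth rate that compounds to `multiplier` over `months`.

    ASSUMPTION: growth is steady and compounding, which is a simplification.
    """
    return multiplier ** (1 / months) - 1


current = (5, 10)  # good quality phase I trials per month, as estimated above

print(scale_range(*current, 2))  # doubling -> (10, 20)
print(scale_range(*current, 3))  # tripling -> (15, 30)
print(f"{implied_monthly_growth(2):.1%}")  # doubling in a year is about 5.9% per month
```

Put that way, a doubling only needs output to grow by about six percent each month, which is part of why it feels conservative.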

PHASE II

Prediction:

We will see several (3-5) large phase II AI trials published in the medical literature.

Argument:

This prediction doesn’t seem very impressive, but considering we have a history of exactly one phase II trial for a deep learning system performing a medical task (see the previous blogpost), we are actually looking at a three- to five-fold increase in one year.

(On a side note, there have actually been a few other trials that I think are on the cusp of “phase II-ness”, but fall just short for a variety of reasons)

I’m not sure if I am being over-optimistic here. Google spent a lot of effort getting an army of ophthalmologists to create a dataset for them, a level of effort I am not sure other groups are ready for yet.

From the public side, anecdata suggests that funding bodies are loath to give large grants for deep learning applications research that hasn’t been supported by large studies already (we could call this the “phase I to phase II funding gap”). Many of the academic labs will probably keep focusing on old techniques for a while, if for no reason other than funding availability. I foresee zero to two phase II studies coming out of public institutions in 2017.

There will be some role for cashed-up startups, but I haven’t really had any catch my eye yet. This is probably because their public statements are investor-facing, not aimed at convincing doctors. It is difficult to grok if they know what they are doing. Enlitic is probably the most likely to surprise me, but a complete phase II trial would be a surprise.

Overall, I would be surprised if a single good phase II study came out of a startup in 2017. Maybe next year.

It will probably be up to the large tech companies to push this along, and not having any direct knowledge about the work these groups are doing behind closed doors, I am not sure about their appetites.

Though if I had to guess, considering the amount of money on the table if they are successful, I would suspect they are quite hungry. I’m mainly talking about Google, Microsoft, IBM et al., although the big med-tech companies could play a role.

I predict two to five phase II trials will come out of established industry groups.

PHASE III

Prediction:

We won’t see any complete phase III trials in 2017.

Argument:

I would love to be wrong here, but I just can’t see it happening. To be successful in phase III, you need to show that using the system is as good or better than using human doctors, on real patients in real clinics. That is a whole new ballgame.

Let’s take Google’s diabetic retinopathy study, assuming it is the only one that is almost ready for phase III and could get by an ethics board. Even if they could formulate a really good use case here (I’m not an ophthalmologist, but maybe something along the lines of a screening system that can avoid the need for specialist review in mild cases and therefore save money with no decrease in patient safety) we are looking at at least a year or two of follow-up.

Diseases like diabetes move slowly. If you reduce the follow-up period, fewer people will suffer the events you are watching out for (like blindness). This doesn’t mean you can’t do it, but you will need a larger cohort to prove it works and is safe. This adds to the cost, which is already going to be high (seven figures high).
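Here is a rough Python sketch of that trade-off between follow-up time and cohort size. It is deliberately simplified: it assumes every participant has the same constant annual risk of the event, and it only asks how many people you need so that the expected number of events reaches a target. A real trial would use a proper power calculation. The target of 50 events and the 1% annual rate are hypothetical numbers, not taken from any study.

```python
import math


def cohort_size_needed(target_events, annual_event_rate, follow_up_years):
    """Participants needed so the expected number of events reaches target_events.

    ASSUMPTION: every participant has the same constant, independent annual risk.
    """
    risk_over_follow_up = 1 - (1 - annual_event_rate) ** follow_up_years
    return math.ceil(target_events / risk_over_follow_up)


# HYPOTHETICAL numbers, for illustration only.
TARGET_EVENTS = 50
ANNUAL_RATE = 0.01

for years in (4, 2, 1):
    # Shorter follow-up means fewer events per person, so the cohort must grow.
    print(years, cohort_size_needed(TARGET_EVENTS, ANNUAL_RATE, years))
```

When events are rare, halving the follow-up roughly doubles the cohort you need, and the cost goes up with it.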

If a phase III trial comes out of nowhere, I don’t know who would have approved it ethically.

If anyone can think of good candidates for research groups that might be closing in on phase III that I haven’t heard of, let me know. I will update my predictions accordingly.

But as things stand, I don’t expect any phase III studies in 2017.

MISCELLANEOUS PREDICTIONS

Some random AI and non-AI stuff, just for fun.

Medical apps will continue to proliferate. Things like smartphone skin cancer detectors (that just recommend you see a doctor for anything unusual), health trackers and quantified self stuff, medication trackers/reminder systems, falls prediction, psychological health support bots and so on. Anything that doesn’t need regulation. I doubt any will achieve significant market penetration, but these are the kind of low-hanging fruit that will see a lot of effort and start-up interest.

News stories will continue to breathlessly report the end of doctors, despite all evidence to the contrary.

Radiologists and other “threatened” specialists will continue to talk about this at every major meeting and in opinion pieces in all the major journals, but the conversation isn’t going to really go anywhere new this year. It won’t really change unless we see disruption at scale.

On the non-AI front, augmented and virtual reality aren’t going to do much of anything useful in medicine in 2017.

Similarly, 3D printing isn’t going to do much of anything useful in medicine in 2017.

Genomics will continue to progress incrementally, without any major breakthroughs. Deep learning for genomics is tricky. We will cross the $1000 genome barrier this year though, which is actually pretty amazing.

The biotech revolution will start picking up steam. More really effective targeted cancer therapies, more stem-cell stuff, more rejuvenation tech including the first evidence on senolytics (anti-aging treatments). We should get an idea about how effective new anti-Alzheimer’s treatments are, and whether metformin extends human lifespans.

I think that is all I have for now. The most important implication is that without any phase III trials in 2017, we are at least two years away from any clinical application that can displace doctors. So rejoice, medics, and be merry.

I’ve tried to make my primary predictions numerical, and therefore falsifiable. A few of the throw-away predictions at the end are semi-falsifiable too (these are bolded). I will definitely be coming back to these by the end of the year, and we can see how I did.

Happy new year, and cheers!

*To be honest, I only started writing that last post because I wanted to write this one. Now I’m glad I did, because I really do think the framework is useful.

The problem is of course that the disruption is exponential. I assume fields with the highest amount of deviation in treatment choice will stay, such as emergency care and the non-standard cases of every type?