Podcast

The prognosis for AI-assisted radiology

Improvements in machine learning and image recognition, and gradual acceptance by regulators, have brought innovative companies to the threshold of the radiology lab. Bill sits down with HBS Professor Shane Greenstein to discuss one such company and the challenges of creating effective applications and bringing them to market. Greenstein also shares insights on how AI will impact radiologists, often labeled vulnerable to automation. Will they be freed from routine tasks? Or will they soon be a thing of the past?

Bill Kerr: As Stanford technologist Roy Amara famously observed, we tend to overestimate the effect of a technology in the short run, but ironically underestimate its effect in the long run. Decades after the introduction of the internet, we are still realizing its full potential. In the same way, many note the great promise of artificial intelligence, but it’s going to require countless implementations by companies before the technology truly transforms our world.

Welcome to the Managing the Future of Work podcast from Harvard Business School. I’m your host, Bill Kerr. Today I’m speaking with HBS professor Shane Greenstein, who has written extensively on the history of the internet and the broad impacts of digitization. Shane is co-chair of Harvard’s Digital Initiative and an expert on technology commercialization and diffusion. Shane’s newest research considers how the lessons from past waves of tech diffusion will apply to artificial intelligence. And his latest case studies highlight important best practices for business adoption. Welcome, Shane.

Shane Greenstein: Thanks for having me.

Kerr: Shane, I thought we’d kick off with Zebra Medical Vision, one of your recent case studies. It also will introduce us to the world of machine learning. Tell us a little bit about it.

Greenstein: Zebra Medical is an Israeli company, founded by three people who were looking to develop machine-learning approaches to X-rays and CT scans. Today they’re a 45-person company. They have one product that has made it all the way through the FDA approval process and six that have made it through the European process, and they’re developing dozens more. What can we say about them? They’re a pioneer. They’re recognized as a leader. And they have faced many, many challenges in trying to translate machine learning on CT scans and X-rays into something that is useful for radiologists today. Those challenges are very helpful to our students in understanding how machine learning is going to become part of real markets and real products—and also the sort of challenges they’ll face if they enter this arena.

Kerr: Shane, for some of our listeners, the phrase “machine learning” might be completely new to them. Tell us a little about what machine learning is doing.

Greenstein: Machine learning, at a basic level, is an algorithm: it translates inputs into predictions. So what’s being predicted? In a simple case, you could have an X-ray. An X-ray, at the end of the day, is a bunch of pixels—white dots, black dots.

Kerr: Images.

Greenstein: Yup. Let’s say an X-ray of your spine. What would you like to know about that? You’d like to know if you had a fracture, for example. So what the algorithm does is examine the spine and distinguish lines that should be there, because they’re part of your spinal column, from lines that shouldn’t be there, because they cut through a particular part of bone and indicate a fracture. That’s a prediction, and the algorithm attaches a probability to that prediction—to that, if you will, fracture.

So how do you get to that? The way you get there is you have to show the computer a large number of examples of lines and dots, some of which should be there and some of which should not be there. And then the computer algorithm learns from examples how to predict whether any given situation looks like it has a high probability of being a good situation or a bad situation. The language of machine learning is: You train on a data set in order to use it on a new example.
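The "train on a data set, then predict on a new example" framing can be made concrete with a toy sketch. This is not Zebra's pipeline—just a plain logistic-regression classifier on synthetic four-pixel "images," where one bright off-axis pixel stands in for a fracture line; every name and number here is invented for illustration.

```python
import numpy as np

# A toy version of "train on a data set, then predict on a new example."
# Synthetic 4-pixel "images": a bright third pixel stands in for a fracture line.
rng = np.random.default_rng(0)

def make_images(n):
    healthy = rng.normal(0.2, 0.05, size=(n, 4))   # dim, normal-looking pixels
    fracture = healthy.copy()
    fracture[:, 2] += 0.6                          # bright "fracture" pixel
    X = np.vstack([healthy, fracture])
    y = np.array([0] * n + [1] * n)                # 0 = healthy, 1 = fracture
    return X, y

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Training: plain gradient descent on the logistic loss.
X, y = make_images(200)
w, b = np.zeros(4), 0.0
for _ in range(2000):
    p = sigmoid(X @ w + b)
    w -= 1.0 * (X.T @ (p - y)) / len(y)
    b -= 1.0 * (p - y).mean()

# Prediction: the output is a probability, which a human can then act on.
new_scan = np.array([0.2, 0.2, 0.8, 0.2])          # suspicious bright pixel
fracture_probability = float(sigmoid(new_scan @ w + b))
print(fracture_probability > 0.5)                  # flagged for a second look
```

The point of the sketch is the shape of the workflow—labeled examples in, a probability out—not the model, which in practice would be a convolutional network rather than a four-weight classifier.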

Kerr: How is this different from medical school, or is it the same thing?

Greenstein: Actually, it’s really rather different, because it’s very specific, and it’s typically only about the diagnostic part of medical school. If you follow doctors around, doctors do a lot more than diagnosis. For example, in this case, a radiologist does perform diagnosis, but they do a lot more than that as well. They have to get the patient into a position to take the appropriate picture in the first place. They also, once they’ve done a diagnosis, have to consult with others if there’s ambiguity. And then they have to communicate to other physicians what the diagnosis implies for proper care, in addition to many, many other things. So a physician is trained to do a large number of things. Machine learning is only oriented toward doing the diagnosis.

Kerr: So, let’s go back and think about the data, then, that you’re feeding into this diagnostic algorithm. Tell us about how Zebra developed this data and the way that they learned from the data.

Greenstein: Oh, it’s a great story. One of the founders went through a medical situation where he could not get a radiologist. He was in the imaging industry to begin with, and it got him thinking, “This just can’t be the case. It’s got to be possible to automate some of the simpler parts of diagnosis.”

Kerr: Most of us would just look for a different place to go. But I’m still with you.

Greenstein: And so, as it is, he made a deal with one of the HMOs in Israel. It’s a very interesting deal, in part because Israel is a small country with a small number of HMOs. It would actually be a very difficult deal to make in, for example, the United States, where privacy laws make it quite challenging for a commercial firm to have access to such data.

Kerr: To health records?

Greenstein: Yeah, to health records. Don’t get me wrong, the firms here also had to comply with a series of privacy norms that generally are followed worldwide. It’s just a little easier to get trust when you have a small country, and people can talk to each other and make deals and enforce them through reputation. So they anonymized the data. One of the founders had a lot of experience in security, which helped them understand how to properly anonymize it. Then they had to take millions of records, classify them by type of diagnosis, sort them into “healthy” and “potentially not healthy,” and organize them in a variety of ways before they could train the algorithms on very specific kinds of diagnoses.

Kerr: The data didn’t already have the diagnosis attached to it?

Greenstein: Sometimes it did, sometimes it didn’t. As it turns out, radiologists in practice are very busy, so sometimes they write things down, and sometimes it’s not very thorough, and sometimes it is. One of the interesting things they found when they got the actual records is that they sometimes had to hire medical students just to complete them. The vast majority of these are “healthy” records—so there’s no particular reason why the radiologist has to write “healthy.” But the computer has to know that. And this gets to one of the interesting features of machine learning, which is the difference between structured and unstructured learning. When they started out, they were doing what’s known as “structured learning,” where the diagnosis is written down explicitly as a label. “Unstructured learning” would merely read a 120-word paragraph written by the radiologist and then infer whether the radiologist is implying “healthy” or “not healthy.” That unstructured learning is much more difficult to do. They’re only now trying to crack that. But when they first started, they were doing structured learning.
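The structured-versus-unstructured distinction can be shown in miniature. The function names and the keyword rule below are invented for illustration; reading real free-text reports is the much harder natural-language problem described here, not a keyword match.

```python
# Structured vs. unstructured labels, in miniature.
# A structured record carries the label as an explicit field; an unstructured
# record is free text from which the label must be inferred. The keyword rule
# below is a deliberately naive stand-in for that inference problem.

def label_from_structured(record):
    # Structured case: the diagnosis is already a labeled field.
    return record["diagnosis"]

def label_from_unstructured(report_text):
    # Unstructured case: infer a label from the radiologist's prose.
    negative_cues = ("no fracture", "unremarkable", "normal study")
    text = report_text.lower()
    if any(cue in text for cue in negative_cues):
        return "healthy"
    return "potentially not healthy"

print(label_from_structured({"diagnosis": "healthy"}))
print(label_from_unstructured("Lumbar spine unremarkable; no fracture seen."))
print(label_from_unstructured("Lucent line through L3 vertebral body."))
```

Even this toy shows why the unstructured case is harder: negation, hedging, and abbreviations in real reports defeat simple keyword rules, which is why completing the labels sometimes required hiring medical students.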

Kerr: In certain domains or spaces, unstructured learning may be required for the technology to take root.

Greenstein: Yeah, in some of the more challenging areas. That’s something we’re likely to see in the future, from them and from others as well.

Kerr: One of the things that you mentioned in the case was that distinction between false positives and false negatives, and some of the trade-offs that companies need to make here. Tell us a little about that.

Greenstein: If you have to make an automated diagnosis of an X-ray, you have to worry about two separate sorts of things: a false positive and a false negative. A false positive would be saying that somebody is sick when they aren’t. A false negative is missing that somebody is sick when they are. Those two things matter a lot in the medical field. So let’s take the first one. Diagnosing someone as sick when they’re not, so a false positive.

Kerr: Is that erring on the side of caution?

Greenstein: Yeah, but if you were to use the diagnosis as a screen, for example, being off by a certain percentage implies that you have to bring in a lot of people who weren’t otherwise sick for a second look. The case, for example, talks about breast cancer, which is a place where it would be very valuable to have an automated screen. But it has to reduce the number of false positives, because the last thing you want …. Let’s say you’re off 5 percent of the time. In 1,000 people, that implies that 50 people show up who actually aren’t ill, who then have to get a secondary look. That’s a problem. You’re imposing a cost on the system, and that’s something the doctors don’t want. Similarly, in the other direction, a false negative—not seeing that someone is ill when they are—in a rare cancer, for example—you really don’t want to have that. So if you’re going to use an automated system like this, you want to make sure you get it right in the situations where you’re looking for something very specific like that.

Again, they want to be very conscious as a company that the users here—radiologists—are very worried about false positives and false negatives. They, as a company, want to develop an algorithm that is as good as human beings—frankly even better than human beings. That’s a pretty, pretty difficult standard.
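The back-of-the-envelope screening arithmetic above can be written out directly. The helper names are illustrative, and the 20-patient false-negative example is a hypothetical added for symmetry, not a figure from the case.

```python
# The arithmetic behind the screening trade-off: error rates times
# population sizes give the expected burden each kind of error imposes.

def expected_false_positives(n_healthy, false_positive_rate):
    """Healthy people flagged anyway, who need an unnecessary second look."""
    return n_healthy * false_positive_rate

def expected_missed_cases(n_sick, false_negative_rate):
    """Sick people the screen fails to flag."""
    return n_sick * false_negative_rate

# A 5% false-positive rate across 1,000 healthy people: 50 needless recalls.
print(expected_false_positives(1000, 0.05))
# Hypothetical: a 10% false-negative rate across 20 true cases: 2 missed.
print(expected_missed_cases(20, 0.10))
```

The asymmetry in the discussion falls out of these two lines: false positives impose cost on the system at the scale of the healthy population, while even a small false-negative rate is intolerable when the missed case is a rare cancer.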

Kerr: Tall challenge.

Greenstein: That’s a tall challenge.

Kerr: Many people look back to 2012 and say something different happened around then that put us on the pathway to this case and many other applications of machine learning and artificial intelligence.

Greenstein: Yeah. So this is one of these examples—certainly not the only one—where the scientific community makes advances, and those in the commercialization world begin to recognize the consequences for the products they’re going to try to develop. There had been work on machine learning for a very, very long time. Around 2012, Hinton, who is one of the scientific leaders at the University of Toronto, came up with a different model—a modification of an existing model—that had particularly good properties. It’s called a “convolutional model.” It was an innovative way of taking a lot of ideas that had been around for a while and putting them together in a very flexible approach. In addition to that, there was a standardization of the contests for recognizing pictures that a professor at the University of Illinois, who had by then moved to Princeton, had come up with. There were contests already by that time in recognizing pictures. So in 2012, this convolutional model was first introduced into one of these contests and did remarkably well. Within a few years, it was being improved so much that it could beat humans in recognition of standardized pictures. That’s really the breakthrough, and people recognize in retrospect that it was a breakthrough.

The last thing that happened, in 2015, was that Google released something called TensorFlow—a standardized set of machine-learning tools that makes it very easy for a non-statistician to just input information and get output, without having to program the thing up from scratch. That also made it quite easy for, say, a company like this to go about developing a large number of algorithms.

Kerr: Well, you mentioned Google, and it also came back to Zebra there. We tend to hear about this work being done more in large companies rather than in startup companies. Is that generally true? Is Zebra the exception?

Greenstein: At the moment of this case, there are 70 different firms developing machine-learning approaches to CT scans and X-rays. Some are large, existing firms, and a lot of them are startups specializing in areas where they think they can move faster than the large firms. Because the opportunity is pretty large and potentially quite wide, it’s not surprising you’re seeing a lot of startups. It’s too much for any one firm to tackle—even a few firms would have a hard time tackling all the opportunities. And even if we agreed on what all the opportunities are—and we don’t; there is real disagreement about where the best ones lie—it would be difficult to do. So it’s unsurprising that you see a lot of startups.

Kerr: All right. We’re now to the million-dollar question. You have 70 companies attempting to create this technology. We always hear about radiology as the example of the job that’s going away in the future. Shane Greenstein, will there be radiologists in the future?

Greenstein: Next year, there will be lots of radiologists. Three years from now, I don’t think there’s going to be any change in that, nor would I expect one five years from now. Certainly, from doing this case, I learned it’s quite challenging, first, to develop products. Second, it’s quite challenging to develop them in such a way as to be useful for radiologists. Third, even if you successfully develop one, it covers only a part of the radiology profession—just diagnosis. Radiologists are already pressed for time. So the goal for any of the firms in this industry is to make a part of the job easier for radiologists so they can devote their time to other things. I don’t think we’re going to see radiologists displaced anytime in the next decade.

Kerr: We often hear this as the augmentation-vs.-automation type of debate.

Greenstein: Correct. Exactly.

Kerr: You’re coming down on the augmentation side, at least in the short term.

Greenstein: Strongly in the augmentation direction, yeah. The other thing I really learned from doing this case—one of the successful use cases for Zebra—is in Britain, where nurses used their software to look for a second fracture in the elderly who have already had one. Catching that has huge medical benefits, and it’s quite difficult to do. A screening procedure that nurses alone can run has huge value: nurses screen the at-risk elderly, and only when the software returns a high probability do you bring in a doctor. Boy, if you can prevent that second fracture from happening, you save a lot of cost, and you can potentially keep people mobile for a lot longer. Quality of life goes up quite a bit.

That’s an example where they’ve deployed their software, and radiologists aren’t involved at all for a long stretch of the process. That’s …

Kerr: … expanding access by bringing down the cost barriers.

Greenstein: Yes, it’s expanding. Yeah. That’s really very instructive, right? That’s even more than augmentation. That’s a new use case.

Kerr: Do we have any past experience about how long it takes to go from an augmentation role of technology toward ... I’m bringing you back to that million-dollar question.

Greenstein: Yeah, yeah, yeah. That is the million-dollar question, and it’s very hard to forecast. I can see a number of different barriers here. One is: Think about it from the radiologist’s standpoint. They aren’t going to want to learn 70 different ways of doing things. They have, at most, five minutes to figure this out. They want a standardized interface so that they can figure things out quickly. In that sort of environment, you would not expect the user community to want more than two or three different companies and standardized interfaces.

Kerr: So consolidation around the various approaches and designs.

Greenstein: And designs, exactly. In addition, if you look again at radiology, just the numbers are intimidating. There are some 200 things radiologists diagnose in a standard radiology job. That’s before we even get into the specialists who handle rare cancers and so on. Again, you would expect in that environment that radiologists will eventually tend toward a small number of packages that can do a wide variety of things. That’s a long way off. One way to answer your question is: I don’t expect packages like that to emerge for quite a long time. We’re just not anywhere near that today.

I’ll even give you one statistic, which we have in the case. In the most recent contest comparing machine learning to humans across 14 different applications, machine learning beat humans in only one. They tied in about 10. Humans won three.

Kerr: Go humans.

Greenstein: Yeah, there you go. And that’s the best case—those were the 14 best. In that sort of situation, my God, we’re still very far away from developing anything like a portfolio that humans will find reliable.

Kerr: Yeah. Especially moving to an environment where you would not be employing the radiologist as well alongside the …

Greenstein: Oh, yeah, yeah. No, we’re very far away. That’s before we even start to think about things like legal considerations.

Kerr: Exactly. I think that’d be a great place to go next. Think about your broader work on technology commercialization and diffusion. This takes time. We’re still integrating the internet into our daily lives. What are the factors that determine the pace? How do we think about that future?

Greenstein: Yeah. One of the lessons you learn from other technologies is that the number of decision makers makes a huge difference. Here, you have multiple decision makers. You have an administrator who has to set up the billing. You’ve got a radiologist who has to get used to, if you want to describe it that way, having a process for using the software. In the initial stages, they’re going to rely on their own judgment, anyway. They’ll only want things that definitely improve on what they’re doing. You have lawyers who then have to figure out how to write the briefs in the rare cases where things get questioned. Those are very typical situations in technology commercialization. We have them here as well.

Another thing you can learn from is just comparing this with, say, something like dentistry, where, again, automated reading of X-rays for cavities is an obvious application. There, you have a very different situation. It’s typically the dentists themselves who are doing the diagnosis. The billing’s being done by somebody else. They’re small practices, typically. They’re not …

Kerr: They’re fragmented. Yeah.

Greenstein: They’re fragmented. They’re not large hospitals.

Kerr: With a hospital, you get to work with a consolidated decision maker, but here you have to find the decision makers all over the place.

Greenstein: Yeah, exactly. The lesson in technology commercialization is that, if you’re a seller, you have to adapt what you’re doing to the environment in which you’re trying to sell. We have, in the U.S., a very particular health system. Go to another part of the developed or developing world, and you’ll find a different kind of health system. Again, you’re going to have to accommodate the processes in those countries as well.

Kerr: Shane, I guess maybe I can generalize this question a little bit. It sounds like they’re trying to fit the technology into existing processes. One can also imagine, though, a new process being developed around the new technology. Does that just take a lot of time to emerge?

Greenstein: Yeah, that does happen.

Kerr: But it sounds like it’s the rarer or the longer-term case.

Greenstein: Yeah, I mean, we go to history for examples like that. We’re certainly nowhere near that in machine learning. Let’s at least get there. Historically, you could look at things like the automobile. What happened initially when the automobile was coming out? There weren’t gas stations everywhere, okay? It was kind of catch as catch can.

Kerr: Did they have AAA to come fill you up if you ...?

Greenstein: Yeah. Then eventually, service stations started emerging as a regular way to refuel an automobile. But that’s the equivalent here. That took decades.

Kerr: And the complementary technologies and infrastructures take a long time.

Greenstein: Infrastructures take a very long time to put in place, yeah.

Kerr: You’ve worked a lot on digitization and how it affects the costs in our lives—transaction costs, search costs, and so forth. I’m asking you to summarize a decade’s worth of work, but do these new technologies consolidate activity? Do they diffuse it? How do we think about the impact they’re having on our business and economic geography?

Greenstein: One of the things I find totally fascinating is the argument about death of distance. When the internet first began to diffuse, it was very popular to say, “Well, we’re all going to move to the forest or the beach” or wherever it is that you find refreshing and renewing. “Then we’re going to work remotely.” I don’t know about you, but it hasn’t happened to me. We aren’t seeing a great …

Kerr: We’re in a podcast studio right now …

Greenstein: We have not seen an exodus of …

Kerr: … it has no windows.

Greenstein: If anything, I regret taking the internet with me to the beach. What has happened, if anything, has been the opposite of that forecast. The deployment of digital technology has been coincident with a consolidation of economic activity and business into major urban areas. There are a couple of different reasons why. Part of it comes from economies of scale in the support technologies—wireless and wireline broadband, and increasingly things like data centers and the cloud. That is our present generation.

Some of it also is because these technologies disproportionately require labor from an educated workforce, which disproportionately is located inside urban areas. Then finally, it’s also the case that simple things like PCs, local area networks, the software you use to make your computer work, and all the complementary things that keep that going—all the support networks—are in major urban areas as well, and more easily supplied in such areas. Just to summarize a long line of research—and I’m certainly not the only one to have noticed this and been surprised by it—it has turned out that digital technology tends to be a consolidating technology. It tends to move things into dense urban areas rather than away from them. We’re seeing this worldwide. It’s not unique to the U.S.

Kerr: Yeah. One of the things I often remind audiences is this: What’s the No. 1 destination of emails from Harvard Business School? Well, it ends up being Harvard Business School. In fact, the No. 1 destination from the second floor of The Rock Center is the second floor of The Rock Center. Just because email is weightless and can travel across distance doesn’t mean it doesn’t help reinforce the value of place.

Greenstein: Yeah. Just to be particularly clear about this, the other interesting thing about digital technologies is that they have affected different parts of life—not just production, but also use and leisure time. One of the things we’ve certainly noticed is that life in low-density areas has improved, because consumption is easier with access to a wider array of goods than was previously possible. I don’t want to argue that things have gotten worse in a low-density environment—certainly, they’ve gotten better for consumption and leisure time. It’s a really interesting problem, because it affects different areas to different degrees, and it’s complicated. It doesn’t give you obvious answers.

Kerr: Let’s end with a little bit of advice. Do you have anything you’d like to leave for the listeners of the podcast, business leaders, and so forth, about how to think about managing the future of work?

Greenstein: Well, let’s be very concrete. If you’re in the artificial intelligence or machine learning world, where we started, in many respects this looks like technology commercialization of the past; many of the same issues and lessons apply here. What is of interest to the scientific community isn’t the same as what’s of interest to a commercial buyer. And creating value for a commercial buyer isn’t the same as capturing value, so the job of the entrepreneur and the large firm in these markets is the same as it’s always been: create value and capture a piece of it. The capturing part is, as always, messy. It’s always going to involve careful study of what users want, meeting user needs, and being very sensitive and empathetic to the value created for the user. These look like businesses we’ve been in before, and the same lessons we’ve always had should apply here as well. So that’s the place to start when you talk about commercialization and machine learning.

Kerr: Great. Thank you, Shane, for taking us inside Zebra Medical Vision and how it’s commercializing AI, and also for your insights on how this technology’s going to diffuse.

Greenstein: Thank you for having me. It was my pleasure.

Kerr: From Harvard Business School, I’m Professor Bill Kerr. Thanks to all of you for listening in.