In this post I'm going to try to take a look at what safety claims can, and cannot, be made for the Tesla Autopilot based on publicly available information. It's sufficient just to look at what Tesla itself says and assume it is 100% accurate to get a broad idea of where they are, and where they aren't. Short version: they have a really long way to go to demonstrate they are at least as good as a normal vehicle.

The Math:

First, let's take Tesla's blog posting about the tragic death that occurred in June.https://www.tesla.com/blog/tragic-loss
Tesla claims 130 million miles of Autopilot driving, and a US fatality every 94 million miles.

One might be tempted to draw an incorrect conclusion from Tesla's statement that Autopilot is safer than manual driving, because 130 million is larger than 94 million. However, that is far too simplistic an approach. Rather, the jury is still very much out on whether Tesla Autopilot will prove safer than everyday vehicles.

From a purely statistical approach, the question is: how many miles does Tesla have to drive to demonstrate they are at least as good as a human (94 million mile mean time to fatality).

This tool: http://reliabilityanalyticstoolkit.appspot.com/mtbf_test_calculator tells you how long you need to test to get a 94M mile Mean Time Between Failure (MTBF) with 95% confidence. Assuming that a failure is a fatal mishap in this case, you need to test 282M miles with no fatalities to be 95% sure that the mean time is only 94 million miles. Yes, it's hard to get that lucky. But if you do have a mishap, you have to do more testing to distinguish between 130 million miles having been lucky, or just reflecting normal statistical fluctuations in having achieved a target average mishap rate. If you only have one mishap, you need a total of 446M miles to reach 95% confidence. If another mishap occurs, that extends to 592M miles for a failure budget of two mishaps, and so on. It would be no surprise if Tesla needs about 1-2 billion miles of accumulated experience for the statistical fluctuations to settle out, assuming the system actually is as good as a human driver.

Looking at it another way, given the current data (1 mishap in 130 million miles), this tool: http://reliabilityanalyticstoolkit.appspot.com/field_mtbf_calculator tells us that Tesla has only demonstrated an MTBF of 27.4M miles or better at 95% confidence at this point, which is less than a third of the way to break-even with a human.
(Please note, I did NOT say that they are only a third as safe as a human. What I am saying is that the data available only supports a claim of about 29.1% as good. Beyond that, the jury is still out.)

With any analysis like this, it's important to be clear about the assumptions and possible flaws in analysis, of which there are often many. Here are some that come to mind.

- The calculations above assume that it is more or less the same software for all 130 million miles. If Tesla makes a "major" change to software, pretty much all bets are off as to whether the new software will be better or worse in practice unless a safety critical assurance methodology (e.g., ISO 26262) has been used. (In fact, one can argue that *any* change invalidates previous test data, but that is a fine point beyond what I want to cover here.) Tesla says they're making a dramatic switch to radar as a primary sensor with this version. That sounds like it could be a major change. It would be no surprise if this software version resets the clock to zero miles of experience in terms of software field reliability both for better and for worse.

- The Tesla is most definitely not a fully autonomous vehicle. Even Tesla makes it quite clear in their public announcements that constant driver attention and supervision is required. So the Tesla experience isn't really about fully self-driving cars per se. Rather, it is about the safety of a human+car partnership (Level 3 autonomy in NHTSA-speak). That includes both the pros (humans can react to weird things the software might get wrong), and the cons (humans easily "drop out" of systems that don't require their attention). If Tesla moves to Level 4 autonomy later, at best they're going to have to make a very nuanced argument to take credit for Level 3 autonomy experience.

- The calculations assume random independent failures, which in general is probably not a good assumption for software. Whether it is a valid assumption for this scenario is an interesting question.

(Calculation note: I just plugged miles instead of hours into the MTBF calculations, because the units don't really matter as long as you are consistent. If you want to translate to hours and back, 30 mph is not a bad approximation for vehicle speed accounting for city and highway driving. If you'd like 90% confidence or 99% confidence feel free to plug the numbers into the tools for yourself.)

About Me

I've done embedded systems for big industry, the US military, startup companies, and now Carnegie Mellon University. I'm the author of the book Better Embedded System Software, which goes into more detail on most of the topics discussed in my corresponding blog.As with any blog, these posts often contain speculative and partially formed thoughts, and should not be interpreted as a fully considered opinion unless stated otherwise.Key pages:Academic home page at CMUEmbedded Software Blog Checksum and CRC Blog