Another year, another post. (Isn’t it great that the timestamp is actually updated
when I start writing the post? So this is still good for 2015.)

At 32C3, I gave a talk on
the Volkswagen emissions cheating (“Dieselgate”), together with Daniel Lange.

You can watch the video
here,
or get the slides as
PDF
or html.
Jake over at lwn.net wrote a very good summary
here, in case you haven’t watched the video.

We got some pretty good feedback and a number of questions, and I wasn’t
able to answer all of them in enough detail.
I’ll quickly re-iterate what I did, what I found out, and will attempt to
answer the most frequently asked questions.

What is this about?

I found out that on my particular car (Volkswagen Sharan 7N, MY2013), when a particular driving cycle is
followed (and some other parameters are in a hardcoded range), exhaust treatment works differently than otherwise.
In detail, when following the
NEDC, the
DEF (AdBlue™)
dosing is in an acceptable range, whereas leaving the cycle will make the
dosing immediately drop.

This by itself isn’t very new, and matches the Volkswagen press
release.
However, and this is - as far as I know - new, I can show the code path in
the ECU firmware that’s responsible for this behavior, as well as the exact
conditions that are responsible for the switch. I can further show why
dosing is operating in a non-default mode by default.

I’ll try to cover questions which I wasn’t able to answer in the talk here.
If you want to know more, simply leave comments, and I’ll try to address them.

How did you analyze the code?

First, I obtained the code. I did this by getting another, very similar ECU
on eBay; the ECU is a Bosch
EDC17. I paid roughly 150 EUR for this. It came out of a Passat, so while not
the same car as mine, it was very similar. It allowed me to extract the
code, play around with the data logging, and understand how the firmware is
encrypted when uploaded via CAN
(spoiler: not very well).

Then, I threw the code into a
disassembler. Bosch doesn’t appear
to write the algorithms in C by hand, but to design them in a graphical
language. (You can find a
function sheet, or “Funktionsrahmen”, of a motor sport ECU here;
the EDC17 is much more complicated, but appears to be based on the same
principle.) The graphical algorithms then seem to be auto-generated into C
and compiled. The resulting code is interesting to look at because it’s
a large number of top-level functions with barely any flow control in them
(the only flow control is there to implement algorithmic blocks like
“filter” or “integrate”, or selective assignments for muxes).

This means that the amount of code that runs for each “step” is constant -
they always calculate everything, and then combine the results for example
with a mux. This helps ensure that all the software runs within the strict
realtime requirements even in a worst-case scenario.
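
As a rough illustration of this "calculate everything, then select" style,
here is a minimal Python sketch. The block names, constants and the step
function are made up for the example and are not taken from the firmware; the
point is only that every block runs on every step and the choice happens at
the very end.

```python
def mux(select: bool, a: float, b: float) -> float:
    """Selective assignment: pass on `a` when `select` is true, else `b`."""
    return a if select else b


def step(state: dict, sensor: float, use_result_a: bool) -> float:
    """One fixed-cost calculation step (all constants are hypothetical)."""
    # Both blocks are always evaluated, no matter which result is used later.
    state["filtered"] += 0.1 * (sensor - state["filtered"])  # low-pass "filter" block
    state["integral"] += sensor * 0.01                       # "integrate" block
    result_a = 2.0 * state["filtered"]
    result_b = 0.5 * state["integral"]
    # Only the final selection depends on the mode.
    return mux(use_result_a, result_a, result_b)


state = {"filtered": 0.0, "integral": 0.0}
first = step(state, 10.0, use_result_a=True)
second = step(state, 10.0, use_result_a=False)
```

The work done per call is the same for both values of `use_result_a`, which
is what makes the worst-case runtime easy to bound.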

Since the ECU firmware does not have any ASCII output, it’s hard to find
“anchor points” with known functionality. I started by finding the CAN
message parser, which is entirely data/table driven. I knew
some CAN messages, so I was able to find the steering wheel angle, for
example, and could then infer all functions that use the steering wheel angle.
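
To give an idea of what "table driven" means here, the following is a small
Python sketch of a parser whose behavior is defined entirely by a table. The
CAN IDs, byte positions and scale factors are invented for illustration; they
are not the real Volkswagen message layout.

```python
from typing import NamedTuple


class SignalDef(NamedTuple):
    name: str
    can_id: int
    start_byte: int
    length: int
    scale: float
    signed: bool


# Hypothetical table: every entry describes where one signal lives in a frame.
SIGNAL_TABLE = [
    SignalDef("steering_angle_deg", 0x0C2, 0, 2, 0.1, True),
    SignalDef("vehicle_speed_kmh", 0x1A0, 2, 2, 0.01, False),
]


def parse_frame(can_id: int, data: bytes) -> dict:
    """Decode all signals that the table lists for this CAN ID."""
    values = {}
    for sig in SIGNAL_TABLE:
        if sig.can_id != can_id:
            continue
        chunk = data[sig.start_byte:sig.start_byte + sig.length]
        raw = int.from_bytes(chunk, "little", signed=sig.signed)
        values[sig.name] = raw * sig.scale
    return values


decoded = parse_frame(0x0C2, bytes([0x9C, 0xFF, 0, 0, 0, 0, 0, 0]))
```

In a disassembly, such a parser shows up as one generic loop plus a large
data table, so a known signal can serve as an anchor: find its table entry,
then follow where the decoded value is stored.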

However, the biggest help came from so-called “A2L files”, or “Damos Files”,
which are basically fancy symbol files.

Another big help was logging data from my actual car.

Q&A

You said that the A2L didn’t match your firmware image. Doesn’t this invalidate all your findings?

Sigh. This came up a lot. The short answer is: “I f*cking know what I’m
doing.” (Sorry to sound so condescending, but usually the question is asked
in a condescending way, too).

The long answer is what everybody who has reverse-engineered a bunch of
related binaries (for example different versions of the same codebase) would
explain - you start with a firmware that you have understood very well (for
example because you have symbols for it - or an A2L file in this case - or
because you have worked with it for a long time), and then you try to match
up code sections to another version. If you’re lucky, the generated code for a
number of functions may be identical, so you can find a function from
firmware A in firmware B by searching for the same byte pattern while ignoring
(non-local) code and data references. IDA Pro can do this with “?”-characters in
the binary search, and bgrep can do the same.

Once you find a match (and you need to make sure that it’s a unique match),
you can then transcribe the references from the high-quality disassembly of
the function in firmware A to the disassembly of the function in firmware B.
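
The wildcard search itself is simple enough to sketch in a few lines of
Python. This is not how IDA Pro or bgrep implement it, just a minimal
stand-in using the standard `re` module; the pattern syntax (`??` for "any
byte") and the function names are choices made for this example, and the
bytes below are arbitrary, not real TriCore opcodes.

```python
import re


def compile_pattern(pattern: str):
    """Turn a pattern like '11 22 ?? 44' into a bytes regex.

    '??' matches any single byte. The lookahead makes overlapping
    matches visible, which matters when checking for uniqueness.
    """
    body = b"".join(
        b"." if token == "??" else re.escape(bytes([int(token, 16)]))
        for token in pattern.split()
    )
    return re.compile(b"(?=" + body + b")", re.DOTALL)


def find_all(image: bytes, pattern: str) -> list:
    """Return the offsets of every match, including overlapping ones."""
    return [m.start() for m in compile_pattern(pattern).finditer(image)]


def find_unique(image: bytes, pattern: str):
    """Return the offset only if the pattern matches exactly once."""
    offsets = find_all(image, pattern)
    return offsets[0] if len(offsets) == 1 else None


image = bytes.fromhex("00 11 22 33 44 11 22 99 44")
ambiguous = find_unique(image, "11 22 ?? 44")  # two candidates, so rejected
unique = find_unique(image, "11 22 33")
```

Masking out too many bytes quickly makes a pattern ambiguous, which is why
the uniqueness check is part of the search rather than an afterthought.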

If you can’t find the exact function, for example because it was changed,
helpful methods include:

Usually variables are in stable order. Which means that as long as no new
variables were added in the middle, they will appear in the same order. If
you found two variables in firmware B that are close to each other, it’s
likely that the variables between those are the same as in firmware A. Of
course this requires careful validation that this is indeed the case - be
assured that I did the due diligence in these cases by comparing the code
referencing the variables in A vs. B.

If you want to find a particular variable in firmware B, but all code that
accesses it is rather generic and ambiguous, it’s useful to find related
variables - either ones that are used to calculate the variable in question,
or ones that are calculated from the variable in question (plus maybe
others). Maybe related variables are easier to find.
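
The "stable order" heuristic can be written down as a small Python sketch.
The symbol names and addresses are hypothetical, and the function only
proposes candidates: it refuses to guess when the distance between the two
anchors differs between the firmwares, and every candidate still has to be
validated against the code that references it.

```python
def propose_between(a_symbols: dict, anchor_lo: str, anchor_hi: str,
                    b_lo_addr: int, b_hi_addr: int) -> dict:
    """Propose firmware-B addresses for variables between two anchors.

    a_symbols maps variable names to addresses in firmware A (e.g. from an
    A2L file). The anchors are variables already located in firmware B.
    """
    a_lo, a_hi = a_symbols[anchor_lo], a_symbols[anchor_hi]
    # If the gap changed, something was inserted or removed in between.
    if a_hi - a_lo != b_hi_addr - b_lo_addr:
        return {}
    shift = b_lo_addr - a_lo
    return {
        name: addr + shift
        for name, addr in a_symbols.items()
        if a_lo < addr < a_hi
    }


# Hypothetical symbols and addresses, for illustration only.
firmware_a = {"VarA": 0x1000, "VarB": 0x1002, "VarC": 0x1004, "VarD": 0x1008}
candidates = propose_between(firmware_a, "VarA", "VarD", 0x2010, 0x2018)
```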

In the end, it’s not unlike doing a jigsaw puzzle - not everything that appears to
match at first turns out to be placed correctly, so you have to be careful in rejecting false-positives by
looking at the big picture. But the more you complete, the easier it is to
spot them.

Your car has AdBlue and EURO5. EURO5 cars don’t require AdBlue.

My car (Sharan 7N, 103kW Diesel CFFB) does require AdBlue and is a EURO5 car.
It is correct that only a minority of Volkswagen Diesel cars with EURO5
require an SCR catalyst, but my car is one of the few.

My work was specific to my car, which uses AdBlue. I haven’t looked at other cars.
They may, or may not, use the same or a similar technique. I don’t know. All
I know is how my car does it.

Is it correct that in the regular (“Street”) mode no AdBlue is dosed at all?

No.

Unfortunately I ran out of time during my presentation so I couldn’t explain
this further. The graphic I’ve shown for the “aborted NEDC”-cycle showed
zero dosage after leaving the test cycle.

I haven’t done a good analysis yet to figure out a.) how limited dosing is
in alternate mode exactly, or b.) whether there may be situations in which the
alternate model works just as well as the main model. All I know is that in
the particular situation on the dyno, dosing immediately stopped when leaving
the main model.

Please don’t draw incorrect conclusions. There is dosing in the alternate
model (“Street mode”), and I’m not sufficiently skilled to make a statement
on how different this dosing is from the main model (“test stand mode”).

All I know is that the dosing calculation is different, and that in a
particular situation on the Dyno the main model did dose, but the alternate
model did not, and that the “test cycle detection” switches between the modes.

How exactly did you extract the firmware image from the ECU? You’ve mentioned using a 0-day exploit in the TriCore chip.

On my ECU (EDC17CP46), the TriCore PFLASH read protection (RPROT) is
configured. This means that when strapping the TC1767 to boot from CAN (or
rather, anything other than flash), read access to the program memory of
PFLASH is not possible without writing a correct unlock password first.
Apparently there are ways to recover the unlock password by nicely asking
the running firmware - since firmware sets the read protection password in
the first place, the algorithm to compute it is there, too - but I wasn’t
aware of how this would work on my particular software version, and open
documentation on this is very sparse.

So what I did instead was to attack the TC1767 directly. It allowed me to
boot in an alternate bootmode (such as booting from CAN) without read protection
being active, and then dump the program data. Having this
data then also allowed me to compute the read protection password, although it
wasn’t technically necessary anymore at that point. But I confirmed that the
computed password does indeed disable read protection when it was kept
enabled at boot (i.e. not applying my “0-day” hack).

I tried this on a single device, and it worked. I don’t know if it works on
other devices. I tried it on another ECU (which uses a TC1766) without success.

I did need to open the device for applying my hack, so I didn’t do this on
my car.

Instead of just describing how my hack worked, I’ll make this a bit more
interesting (for me): I promise if you leave the correct guess in the
comments, I’ll confirm it. So go ahead, post your theories.

How did you read out the realtime data on your car, to produce the logs/graphs?

“It’s just a bunch of code.”

OBD-2 allows you to send CAN frames that are forwarded to the powertrain CAN
bus (to which the Engine ECU is connected). You can talk UDS over it. By
default, it allows you to read certain signals (internal variables) “by ID”
- there’s a vendor-specific, but fairly stable list of values to read.
However, not all of the signals that I was interested in were available this way.

Luckily there’s a second command (UDS calls them “services”) that allows
reading “by address” - with “address” being defined in the internal 32-bit
TriCore address space. Not all variables are readable - security features
like anti-theft or the mentioned read password are protected - but
everything else is readable.
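
For reference, this is roughly what the two request types look like at the
UDS payload level, as a Python sketch. UDS (ISO 14229) defines
ReadDataByIdentifier as service 0x22 and ReadMemoryByAddress as service 0x23.
The 4-byte address / 1-byte size format and the example address are
assumptions for illustration, and the transport framing needed to actually
put these bytes on the CAN bus is left out.

```python
READ_DATA_BY_IDENTIFIER = 0x22
READ_MEMORY_BY_ADDRESS = 0x23
NEGATIVE_RESPONSE = 0x7F


def read_by_id_request(did: int) -> bytes:
    """Build a read "by ID" request for a 16-bit data identifier."""
    return bytes([READ_DATA_BY_IDENTIFIER]) + did.to_bytes(2, "big")


def read_by_address_request(address: int, size: int) -> bytes:
    """Build a read "by address" request (assumed 4-byte address, 1-byte size)."""
    # Format byte: high nibble = number of size bytes, low nibble = address bytes.
    fmt = (1 << 4) | 4
    return (bytes([READ_MEMORY_BY_ADDRESS, fmt])
            + address.to_bytes(4, "big")
            + size.to_bytes(1, "big"))


def parse_read_by_address_response(payload: bytes) -> bytes:
    """Return the memory contents from a positive response, else raise."""
    if payload[0] == NEGATIVE_RESPONSE:
        raise ValueError(f"negative response, code 0x{payload[2]:02X}")
    if payload[0] != READ_MEMORY_BY_ADDRESS + 0x40:
        raise ValueError("unexpected response service ID")
    return payload[1:]


# Hypothetical RAM address of a 2-byte variable.
request = read_by_address_request(0xD0001234, 2)
```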

A stupid little Python tool, talking via SocketCAN to an 8devices
USB2CAN (though a Lawicel
CANUSB worked as well), queried the
variables as fast as possible (which allowed me to read ~50 variables per
second or so), and dumped them into a JSON file. A small websocket-based
webpage then rendered them to something human-readable. It’s too crappy to
put this up online, but I’ll do it. (Right after writing Part 2 of the
Optical Media Authentication post.)
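
Until then, the overall shape of such a logger can be sketched as follows.
This is not the actual tool: the bus access is replaced by a stand-in
function, and the variable names and addresses are hypothetical. It only
shows the polling loop and the JSON output.

```python
import json
import time


def poll_variables(read_variable, variables: dict, rounds: int) -> list:
    """Read every variable `rounds` times and collect timestamped samples."""
    samples = []
    for _ in range(rounds):
        for name, (address, size) in variables.items():
            raw = read_variable(address, size)  # e.g. a UDS read "by address"
            samples.append({"t": time.monotonic(), "name": name, "raw": raw.hex()})
    return samples


def fake_read(address: int, size: int) -> bytes:
    """Stand-in for the real bus access: returns the low address bytes."""
    return address.to_bytes(4, "big")[-size:]


# Hypothetical variables: name -> (address, size in bytes).
VARIABLES = {
    "hypothetical_signal_a": (0xD0001234, 2),
    "hypothetical_signal_b": (0xD0001240, 1),
}

log = json.dumps(poll_variables(fake_read, VARIABLES, rounds=2))
```

Passing the read function in as a parameter keeps the loop testable without
a car attached.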

I got the variables by disassembling my firmware and matching it up to
the firmware for which I had the symbolic names from the A2L file.

The cheating speedometer thing… What?

During the talk I gave two random examples of how an ECU can “cheat”, unrelated
to emissions. One example was the tachometer, one was the speedometer. Note
that “cheating” in this case just means that the displayed value is
intentionally not the physically correct one. I don’t want to imply that
something bad is happening here.

Let me explain. First the tachometer - the ECU has a very exact representation of the current
engine speed (“rpm”). It needs that to precisely control - well - the
engine. It sends the speed to the head unit, which displays it, usually
using an analog dial.

Now, there are a couple of things that the engine is doing that the owner
doesn’t want to see, for example idle speed fluctuations.

Mechanical RPM displays back then used mechanical parts to provide some
filtering - this has now moved into the digital domain. Here’s a plot:

You can see that while the real RPM fluctuates around the requested idle
speed (900 or 780 in this example, depends on the remaining load - for
example alternator, A/C etc.), the RPM displayed in the head unit is
constant. You can also see that undershoots are filtered away. (Download raw
data)
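
The following Python sketch shows one way such a display filter could
behave. It is not the algorithm from the firmware; the band width and the
smoothing factor are invented. It only reproduces the two effects described
above: fluctuations and undershoots near the requested idle speed are hidden,
while larger changes are followed with some smoothing.

```python
def displayed_rpm(actual: float, idle_target: float, previous: float,
                  band: float = 50.0, alpha: float = 0.2) -> float:
    """Return the value to show on the dial (all constants are hypothetical)."""
    # Close to the requested idle speed: show the target itself, which
    # hides small fluctuations and undershoots.
    if abs(actual - idle_target) <= band:
        return idle_target
    # Otherwise follow the real speed through a first-order low-pass filter.
    return previous + alpha * (actual - previous)


shown = 900.0
trace = []
for rpm in [905.0, 880.0, 920.0, 870.0]:  # fluctuation around a 900 rpm target
    shown = displayed_rpm(rpm, 900.0, shown)
    trace.append(shown)
```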

The other example is the speedometer. Actually, the physical speed of the
vehicle is calculated from a number of sources, and is usually very accurate.
But the displayed speed is - also for regulatory reasons - higher than the
actual speed. This “tweak” doesn’t happen in the engine ECU, but actually in
the head unit, which has interesting implications. Let’s say the driver
engages cruise control when the head unit displays 100 km/h (but the car
moves at 95 km/h). Now the driver tips up the speed to 110 km/h; the
engine ECU needs to calculate the new speed, and since the original speed
was never 100 km/h, it’s not as easy as adding 10 km/h to your current speed.

Instead the head unit sends back the displayed speed, and the ECU takes that
value, does the necessary adjustments, and then uses that as a target. It
sounds complicated to get this right in all situations - and yes, there’s a
lot of code in the ECU that deals with things like that.
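
To make the arithmetic concrete, here is a tiny Python sketch using the
numbers from the example. It assumes the displayed speed is simply the actual
speed times a constant factor; that is an assumption for illustration, and
the real mapping in the head unit may well be different.

```python
def display_factor(displayed_kmh: float, actual_kmh: float) -> float:
    """Ratio between displayed and actual speed (assumed to be constant)."""
    return displayed_kmh / actual_kmh


def target_actual_speed(displayed_target_kmh: float, factor: float) -> float:
    """Convert a target in displayed km/h back into actual km/h."""
    return displayed_target_kmh / factor


factor = display_factor(100.0, 95.0)           # display shows 100 at 95 km/h
target = target_actual_speed(110.0, factor)    # about 104.5 km/h actual
naive = 95.0 + 10.0                            # 105 km/h, display would overshoot 110
```

Under this assumption the naive "add 10 km/h" target is about half a km/h
too fast, so the display would end up slightly above the 110 km/h the driver
asked for.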

All of this is nice and not really a problem - but it shows how much we tend
to trust a software block that we don’t really understand. The ECU could
cheat in all kinds of situations, and we wouldn’t know.