Motivation and background for this field of research

Building a hardware product that cannot be copied is hard. Small integrated chips in particular make it difficult to distinguish between a knockoff device and a real one. This is not only a copyright concern, but also important for trust in a device's origin: a chip could, for example, be replaced somewhere in the manufacturing chain with a backdoored version. A good way to understand the problem is to look at smart cards, especially the ones used for decrypting premium TV channels. The whole business model relies on a shared secret key embedded inside the chips. It's obviously in the interest of the company that nobody can create a working copy of such a card. But once the secret key is extracted through various hardware attack techniques, creating a copy of a smart card is trivial.

Is there a way to manufacture hardware such that it is practically impossible to copy - and not just very expensive to copy?

Physically Unclonable Function

A Physically Unclonable Function (PUF) is a concept that attempts to exploit physical impurities, which are different for each device, to make exact physical copies impossible to manufacture. In practice this is often used to verify that a particular piece of hardware (a chip) is not counterfeit.
This is usually implemented with a challenge-and-response protocol. A vendor collects valid responses for random challenges of a chip, and the customer can later verify that the device they bought was really made by that manufacturer.
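A minimal sketch of that enrollment/verification flow (all names here are hypothetical, and `puf_query` is a deterministic software stub standing in for the real hardware):

```python
import secrets

def puf_query(challenge: int) -> int:
    # Stand-in for the physical device: in reality the response depends on
    # device-unique manufacturing variation, not on a formula like this.
    return bin((challenge * 0x9E3779B9) & 0xFFFFFFFF).count("1") & 1

def enroll(n_pairs: int, n_bits: int = 32) -> dict:
    """Vendor phase: record responses for random challenges."""
    table = {}
    while len(table) < n_pairs:
        c = secrets.randbits(n_bits)
        table[c] = puf_query(c)
    return table

def verify(crp_table: dict, device=puf_query) -> bool:
    """Customer phase: the device is genuine if it reproduces the recorded responses."""
    return all(device(c) == r for c, r in crp_table.items())

table = enroll(100)
```

The vendor only needs to store the table; a counterfeit device that cannot reproduce the responses fails `verify`.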

While creating an exact copy of the chip might be impossible, one could try to understand the mathematical model underlying its behavior and thereby create a device that emulates the original chip. Will every PUF have this flaw, that math can describe its behavior, or are there PUFs that are truly random and thus unpredictable? That is an unsolved question.

With the experiments I conducted, we tried to better understand a certain PUF family whose underlying mathematical model is unknown. Using the data I collected, Fatemeh Ganji and Shahin Tajik were able to construct a machine learning algorithm that can learn the behavior of this PUF family.

Bistable Ring PUF

The Bistable Ring PUF (BR-PUF) exploits the behavior of inverters in a ring configuration. A digital inverter could, for example, output 0V if the input was 5V, and output 5V if the input was 0V. Connecting the output of a digital inverter back to its input results in an oscillator, which constantly tries to correct the output based on the new input. Connecting two inverters, as shown in the picture below, should result in a stable configuration.

But in theory it's not possible to predict whether the right wire will output a logical 1, or the left wire will. While this is unpredictable in theory, in practice manufacturing variations can cause a device to always show the same result when powered on.

This fact can then be used in a configuration like the one shown below to create a unique challenge-and-response behavior.

The BR-PUF contains an even number of stages.
Each stage contains two inverters in parallel, and only one is used at a time. Which one depends on the challenge input bit c[i]. For example, if the first challenge bit is set (c[0]=1), the first inverter in the first stage is used; if it's not set (c[0]=0), the second inverter is used.
So based on the challenge bit string c[0]...c[n] (e.g. 011..110), a unique inverter ring can be configured. Ideally, when a single challenge bit is changed, the outcome should be unpredictable - similar to a cryptographic hash function: change one bit in the input, observe a basically random change in the output.
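To make the stage selection concrete, here is a toy software model. It is purely illustrative - it ignores the real analog ring dynamics, and the seed merely stands in for manufacturing variation:

```python
import random

def make_toy_puf(n_stages: int, seed: int):
    # Each stage has two inverters with slightly different, device-unique
    # "strengths"; the seed models manufacturing variation.
    rng = random.Random(seed)
    return [(rng.random(), rng.random()) for _ in range(n_stages)]

def toy_response(puf, challenge) -> int:
    # The challenge bit picks one inverter per stage; the sum of the selected
    # strengths is a crude stand-in for which stable state the ring falls into.
    total = sum(stage[bit] for stage, bit in zip(puf, challenge))
    return int(total * 10**6) & 1

puf = make_toy_puf(n_stages=64, seed=1234)
challenge = [0, 1] * 32
r = toy_response(puf, challenge)
```

Even in this toy model, flipping a single challenge bit changes which strengths enter the sum, so the response can flip in a hard-to-predict way.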

The question is: can a mathematical model describe the behavior, or can the behavior be emulated by a machine learning algorithm?

Experiment Setup

Plan

To evaluate a large number of BR-PUFs, we used an FPGA-based implementation. This way we can automatically generate new PUF implementations, program the board, run our tests, and continue with the next one, gathering huge amounts of data for a lot of different PUF configurations.

Hardware

For the experiments we used a DE0-Nano FPGA Board with an ALTERA Cyclone IV.
The following setup was used to perform the tests:

Windows Host (1) was connected via USB to the FPGA (3), in order to program new PUF configurations via Quartus Programmer (Quartus II 15.0 64-bit Web Edition). After successfully programming the FPGA, it would notify the OSX Host (2) via an Ethernet connection.

The OSX Host (2) ran a simple server that waits for the signal that the FPGA is ready. A script then runs a large number of challenges via the UART connection and collects the responses. Once all challenges have been executed, the server responds back to the Windows Host (1), which can then generate another PUF configuration and start the whole cycle over.
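The exact wire protocol isn't shown in this post; as an illustration, a challenge bit string could be packed into bytes before being written to the UART like this (a sketch, not the actual implementation):

```python
def pack_challenge(bits: str) -> bytes:
    """Pack a challenge bit string such as '011110' into bytes, MSB first,
    zero-padded on the right to a whole number of bytes."""
    if set(bits) - {"0", "1"}:
        raise ValueError("challenge must be a bit string of 0s and 1s")
    n_bytes = -(-len(bits) // 8)          # ceiling division
    padded = bits.ljust(n_bytes * 8, "0")
    return bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))
```

`pack_challenge("10000000")` yields `b"\x80"`, ready to be written to the serial port (e.g. with pyserial's `Serial.write()`).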

Based on the collected challenges, some basic analyses were performed, and the results were saved in a GitHub repository for further analysis.

Experiment Execution

Automating PUF generation with Quartus

The first big challenge was to figure out how to automate programming the FPGA with Quartus. This was not straightforward: we didn't just want to generate Verilog code and recompile it, we wanted to control where the PUF stages are placed inside the chip - basically, which logic cells will be used. This can be achieved with the GUI Assignment Editor. To figure out how the same thing can be done without the GUI, I observed what kind of files are generated and which programs Quartus invokes upon compilation. This way I learned that the configurations from the Assignment Editor, which can be used to choose the physical location on the FPGA for certain pieces, are written into a Quartus settings file (.qsf).

This setup can be used to automatically configure the PUF in various places on the FPGA - for example in a row or column of logic cells - and then move this line around: first implement a PUF in columns 1/2, then in 2/3, and so forth.
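A sketch of how such .qsf assignments can be generated from a script. The logic-cell naming (LCCOMB_X.._Y.._N..) follows the Cyclone IV convention, but the exact cell indices and the instance path are assumptions - adapt them to your own design:

```python
def qsf_column_assignments(x: int, y: int, n_stages: int,
                           inst: str = "br_puf:ring|stage") -> str:
    """Emit set_location_assignment lines that place each PUF stage into a
    column of logic cells, assuming 16 LEs per LAB (combinational cells at
    even N indices). Instance path `inst` is a hypothetical placeholder."""
    lines = []
    for i in range(n_stages):
        cell = "LCCOMB_X{}_Y{}_N{}".format(x, y + i // 16, (i % 16) * 2)
        lines.append("set_location_assignment {} -to {}[{}]".format(cell, inst, i))
    return "\n".join(lines)

qsf = qsf_column_assignments(x=16, y=1, n_stages=32)
```

Appending the generated lines to the project's .qsf file before compilation has the same effect as entering them in the Assignment Editor by hand.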

Visualizing the FPGA usage

In the Quartus Chip Planner we can see where our FPGA configuration will be placed. This is great for visualizing the kind of PUF we have configured. Notice the two colored straight columns; those are the 32 BR-PUF stages - the ring configuration.

To verify that the PUF was really configured the way we wanted, and also to better understand the data we collect, I visualized the FPGA configuration after compilation. Quartus generates an output file puf_top.fit.eqn, which contains the information about which component gets placed where. I wrote a script to parse this file and generate an image like this one:

The implementation is horrible but "works for me": I generated an .html file with a huge <div> grid, colored it accordingly with CSS, and then rendered it with the Selenium WebDriver to get the image.
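A minimal sketch of that div-grid approach, assuming the .fit.eqn file has already been parsed into a mapping from (x, y) logic-cell coordinates to colors (the parsing and the Selenium screenshot step are omitted):

```python
def render_grid(cells: dict, width: int, height: int, cell_px: int = 8) -> str:
    """Render a colored <div> grid as an HTML string. `cells` maps (x, y)
    coordinates to CSS colors; unused cells are drawn gray."""
    style = ("<style>.row{{display:flex}}"
             ".row div{{width:{0}px;height:{0}px}}</style>").format(cell_px)
    rows = []
    for y in range(height):
        divs = "".join(
            '<div style="background:{}"></div>'.format(cells.get((x, y), "#ddd"))
            for x in range(width)
        )
        rows.append('<div class="row">{}</div>'.format(divs))
    return "<html><body>" + style + "".join(rows) + "</body></html>"

html = render_grid({(1, 0): "red", (2, 1): "blue"}, width=4, height=2)
```

Loading the resulting file in a headless browser and taking a screenshot gives the FPGA-usage image.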

Collecting data

As mentioned above, we used UART to set a challenge and then read the response. We used several different ways to generate challenges throughout the research. For most of it we used randomly generated challenges, but for small PUFs with only 8 or 16 bits, we could run all possible challenges.

We also looked at the impact of a single bit change in a challenge: take a random challenge, then look at the response when only one bit is changed at a time. This way we can learn whether a single bit change truly changes the output randomly, or whether a single bit has little effect in a particular configuration.

It quickly became clear that certain challenges produce unstable outputs. This means that if you apply the same challenge to the PUF (always selecting the same path of inverters), the result might be a 0 or a 1. So we ran every challenge multiple times and stored each response. This way we can see which challenges are very stable and always produce the same response, and which ones are unstable and sometimes switch.
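The per-challenge stability can be quantified with something like the following (a sketch with toy data, not the actual analysis code):

```python
from collections import Counter

def stability(responses) -> float:
    """Fraction of repeated runs that agree with the majority response.
    1.0 means perfectly stable; values near 0.5 mean a coin flip."""
    counts = Counter(responses)
    return counts.most_common(1)[0][1] / len(responses)

# repeated measurements per challenge (toy data)
runs = {
    "0110": [1, 1, 1, 1, 1, 1, 1, 1],   # always the same -> stable
    "1010": [0, 1, 0, 0, 1, 1, 0, 1],   # switches -> unstable
}
stable_challenges = {c for c, r in runs.items() if stability(r) >= 0.95}
```

Challenges below some stability threshold can then be excluded from (or analyzed separately in) the dataset.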

Generating Reports

For the collected data I then created a report like the one below. In this case you can see several different PUF configurations. The table shows simple information like the number of challenges and the ratio between 0 and 1 responses. For example, the PUF implemented in columns 16/17 (LAB1_X16_X17) has a fairly balanced output, while the PUF in columns 23/24 (LAB1_X23_X24) is extremely biased towards returning a 0.

For each PUF I generated a more detailed report. This includes the PUF configuration and simple pie charts, but also more complex diagrams - for example, visualizing, for each challenge bit, in how many of the challenges that returned a 0 that bit was set to 1. This way you can identify "influential bits" in a challenge, which have a huge effect on the result.
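That "influential bits" statistic can be sketched like this (toy data; the real reports were generated from the collected CRP datasets):

```python
def influential_bits(crps):
    """For every challenge-bit position, compute the fraction of challenges
    that returned 0 in which that bit was 1. Positions whose fraction is far
    from the overall average hint at an influential bit."""
    zeros = [c for c, r in crps if r == 0]
    n_bits = len(crps[0][0])
    return [sum(c[i] == "1" for c in zeros) / len(zeros) for i in range(n_bits)]

# (challenge, response) pairs -- toy data
crps = [("0011", 0), ("0111", 0), ("1011", 1), ("0001", 0)]
fractions = influential_bits(crps)
```

A fraction of 0.0 or 1.0 for some bit position means that bit's value is perfectly correlated with a 0 response - a strong hint that the bit dominates the outcome.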

Analyzing the data and interesting observations

When I started with my experiments and got the first data, we identified problems, weird behaviors and other things that changed the strategy and setup throughout the research. Here are some examples:

Biased PUFs

Most of the PUF configurations we tested turned out to be hugely biased, meaning they return a 0 (or a 1) most of the time. That is quite bad if you want to implement a strong PUF with unpredictable responses: if a PUF returns mostly 1 for all challenges, it's not hard to guess the response for a different challenge. A strong PUF would have a basically unbiased 50:50 outcome over a large number of random challenges. Because of this, we focused most analysis on PUFs that we considered to be quite strong - meaning their response cannot be predicted with high accuracy based on their bias alone.

Unstable vs. stable challenges

Another variable we wanted to control is at what point in time we read the response from the inverter ring. At the beginning, the state of the PUF was read when the response was requested via UART. But at some point I implemented a counter setting: the counter starts at 0 when the device is turned on, and once it reaches a configured number, the current state of the PUF is read and saved. The result can then be extracted via UART at some later point in time.

This way we were able to answer questions such as whether a BR-PUF gets more stable over time, and to make sure that a collected dataset has more controlled variables.

Here is an example of a stable challenge with a counter. Yellow is the output of the PUF. Blue indicates when the counter is done and the current logical value of the PUF (yellow) is stored.

(1) The PUF is turned on. The inverter ring powers up and starts to oscillate slightly, but it fairly quickly moves towards a stable logical output of 1.

(2) The counter, which started at 0 when the device turned on, reached the configured value. At this point in time the logical value of the PUF (yellow) is stored as a response for this particular challenge.

(3) Then we can see how the PUF output finally stabilizes on the logical 1, a little bit after the response was already saved.

And here is an example of an unstable challenge:

(2) The counter reached the configured value and reads the current state of the PUF (yellow). Because the PUF is still oscillating heavily, the response is either a 0 or a 1 by chance.

(3) It's extremely interesting that even after a longer period of time, the PUF never becomes stable. We would expect an even number of inverters to settle into a stable state at some point, but this is apparently not the case.

Results influence each other

While testing, I noticed that some challenges are heavily influenced by previous results. For example, if challenge X returned a 1 and is followed by challenge Y, then challenge Y also returns a 1; but if Z returned a 0 and is followed by Y, then challenge Y returns a 0. This was quite a shock, but I'm glad we caught it.

A BR-PUF implemented in an FPGA is not a "perfect" implementation, in the sense that logic cells are configured with a lookup table to make them behave like inverters. Here is a picture from Quartus showing how one particular inverter cell is connected.

Not every possible input into the logic cell is connected, and this causes some analog electrical interference magic that I don't quite understand. We connected these inputs in a specific way and have not observed challenges influencing each other since. Unfortunately I haven't figured out how to apply this fix automatically after creating a new configuration, so I had to do the post-fitting assignment by hand.

Stable but oscillating challenges

Another interesting observation I made while looking at the PUF with an oscilloscope was challenges that were basically stable - but not quite. What I mean is that certain challenges cause the ring of inverters to oscillate not chaotically, but to create a spike that travels around the ring with a certain frequency.

The yellow line again shows the state of the PUF, while the blue one indicates when a response was stored. The interesting part here is that the yellow line is a logical 0 most of the time, but once in a while shows a peak - a wave traveling around the inverter ring. So even if we humans would read this as a generally stable response of 0, the PUF could by accident sample the logical state right at the point of a spike and read a 1 instead.

This traveling wave is especially interesting when looking at different points in time. The top trace is captured when the counter reached 0x1ff, a fairly short amount of time, and the bottom trace shows the state of the PUF after the counter has counted to 0x4ffff.

Applying the same challenge over and over again shows that with a short counter the spikes appear at basically the same place every time. But when more time passes (bottom trace), the spikes appear in more chaotic places, which indicates that the waves don't travel with exactly the same frequency every time, but slowly drift.

This post is about a vulnerability I discovered in Apache Wicket in 2014 but never got around to publishing a write-up for, so it's kinda outdated now...
Apache Wicket is a web application framework for Java and is used by quite a few big sites. I had a closer look at the encrypted URL feature, which supposedly protects against cross-site request forgery.

Unfortunately the proposed simple example is inherently flawed, for two reasons. First I will give a quick reminder of what CSRF (cross-site request forgery) is - you can skip over it if you are familiar with the term. Then I will explain why this solution doesn't protect you from CSRF, and at the end I will propose a solution that works.

CSRF Introduction

Cross-site request forgery is very simple but powerful. Imagine a browser game with a form to send gold to another user:

When you submit this form, the browser will send a GET request to http://www.example.com/send_gold?gold=9999&user=samuirai.
Now wouldn't it be great if all players of the game would be so nice to send you all their gold for showing them what CSRF is?

Just embed this URL as a picture, for example in your game profile or on a fan site:

<img src="http://www.example.com/send_gold?gold=9999&user=samuirai">

← Hint: Open the developer console of your browser, go to the Network tab, and reload this site.

Every player who is logged into the game and visits a site with this image will unknowingly send this request to the game server and transfer their gold to you.

Defeating Encrypted URLs

Apache Wicket had the great idea that encrypted URLs stop an attacker from doing this, presumably because an attacker can't guess the URL.
The default implementation org.apache.wicket.util.crypt.SunJceCrypt uses CRYPT_METHOD = "PBEWithMD5AndDES";, which means a password is hashed with MD5 (with a salt and 17 rounds) and this hash is used as key and IV for DES - not a very strong method, but there are bigger problems.

Apache Wicket makes two mistakes here. The first mistake is that the example implementation uses the default password: WiCkEt-FRAMEwork. Many sites don't bother to change the password, or don't know they should. So an attacker can easily decrypt the URLs and generate all the valid URLs they want - not only for CSRF but also for other attacks such as reflected XSS (how convenient that the URL hides injected JavaScript from the XSS auditor and alert users :P).

Proof of concept: this Python script will try to decrypt URLs using the standard password (pip install pycrypto required).
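For reference, the key derivation behind PBEWithMD5AndDES is PKCS#5 v1.5 (PBES1): MD5 is iterated over password||salt, and the 16-byte result is split into an 8-byte DES key and an 8-byte IV. A stdlib-only sketch of that derivation - note the salt below is a placeholder (SunJceCrypt ships its own fixed default salt), and the DES decryption itself, which needs pycrypto, is omitted:

```python
import hashlib

def pbe_md5_des_key_iv(password: bytes, salt: bytes, iterations: int = 17):
    """PBES1 key derivation as used by PBEWithMD5AndDES: iterate MD5 over
    password||salt; the first 8 bytes become the DES key, the next 8 the IV."""
    digest = password + salt
    for _ in range(iterations):
        digest = hashlib.md5(digest).digest()
    return digest[:8], digest[8:16]

# deriving key material from Wicket's default password (placeholder salt!)
key, iv = pbe_md5_des_key_iv(b"WiCkEt-FRAMEwork", b"\x00" * 8)
```

With key and IV in hand, decrypting a captured URL is a single DES-CBC call in any crypto library.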

Ok, let's assume the developers knew about the default password and changed it to sUp3r-pw, and that nobody has a fast brute-force implementation for PBEWithMD5AndDES. They are still vulnerable to CSRF. How? Well, as an attacker I really don't care about the content of the URL. I just want to know where I can send the request to.

So the "secure" example implementation doesn't protect from CSRF at all.

Defeating Stateful URLs

Actually, a bigger obstacle than encrypted URLs are Apache Wicket's stateful URLs. They are easily identified by a number as a parameter, such as ?2:

http://www.example.com/?2

Form URLs typically look like this:

http://www.example.com/send_gold?2-3

Basically, the first number is incremented while visiting different subsites, while the second number is incremented on multiple refreshes of a single page. This actually makes guessing the URL more difficult: as an attacker, I don't know at what number a user currently is.

This even makes the encrypted URLs look more "cryptic" (constantly changing):

Solutions

The best CSRF protection is a so-called CSRF token. The server generates a random string for each form and embeds it as <input type="hidden" name="csrf-token" value="r4nd0m123">. When the form is submitted, the server verifies the token.
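A minimal sketch of such a token scheme (this is one common stateless variant - deriving the token from the session via an HMAC instead of storing it per form; the names are illustrative, not Wicket's API):

```python
import hashlib
import hmac
import secrets

SERVER_KEY = secrets.token_bytes(32)  # one random key per deployment

def issue_token(session_id: str) -> str:
    """Derive the CSRF token from the session id, so no per-form
    server-side state is needed."""
    return hmac.new(SERVER_KEY, session_id.encode(), hashlib.sha256).hexdigest()

def check_token(session_id: str, token: str) -> bool:
    # constant-time comparison to avoid timing side channels
    return hmac.compare_digest(issue_token(session_id), token)

token = issue_token("session-1234")
```

An attacker's page cannot read the victim's token (same-origin policy), so forged form submissions fail the check.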

When using encryption, Apache Wicket should be configured to use org.apache.wicket.util.crypt.KeyInSessionSunJceCryptFactory which doesn't take a fixed key, but generates a new key for each user.

This info should also be added to the standard Apache Wicket guide. Otherwise developers will continue to implement the insecure default example.

Funny sidenote: this master's thesis analyzed the security of this feature and got it wrong.

You can read the previous article on how to setup and access the NodeJS hacking challenge. I will now spoil the challenge, so if you want to try it yourself, stop reading now!

Scroll down for a TL;DR writeup.

1. getting an overview

When we first access the page we find this nice landing page. I tried to make a lame joke, but also to hint at the issue. Languages like C are very prone to memory corruption vulnerabilities, especially when an inexperienced programmer starts writing C code. That's why it's advised to choose "memory safe" languages for regular projects, or generally languages that make it harder to make mistakes. JavaScript is one of those safer languages. But the bug that will be exploited here shows that even in this very high-level language, you might not be as safe as you think you are.

The /admin or private Vexillology area is protected by a big password prompt. When we enter a password we get told that the password is wrong.

When we open the developer console from our browser, we can see that when we enter a password, a POST request to /login is performed with the password as JSON data {"password": "test"}.

Another thing we should pay attention to is the cookie. In fact there are two cookies: session=eyJhZG1pbiI6Im5vIn0= and session.sig=wwg0b0z2AQJ2GCyXHt53ONkIXRs. When you decode the base64 session cookie, you will see that it says {"admin":"no"}. Now you might think that we can simply set this to "yes". But this won't work, because the cookie is HMAC protected: if you change it, the server will simply throw it away.
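You can check the decoding yourself; a quick sketch:

```python
import base64
import json

session_cookie = "eyJhZG1pbiI6Im5vIn0="
data = json.loads(base64.b64decode(session_cookie))
assert data == {"admin": "no"}

# the naive tampering attempt: a new cookie value, but without a valid
# session.sig the server will reject it
forged = base64.b64encode(
    json.dumps({"admin": "yes"}, separators=(",", ":")).encode()
).decode()
```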

There is a good reason why you would want to store this information in a cookie with the client. This way you can have a stateless server application, and you can easily spin up new machines or do load-balancing without having to think about sharing a database with the session information.

2. code review

Now let's have a look at the source code. A good place to start is the app.js file. We can learn several things from it. First, we can see that the app uses the Express web framework (var express = require('express');). But this doesn't really matter too much here.

We can also have a look into the config.js file, which contains a dummy secret_password and dummy session_keys. Those keys are used to generate the HMAC for the cookies.

Next we should have a look at routes/index.js to see where our requests are handled. And it's really not much code.

You might notice that the secret_password is given as flag to the admin template. If you look at the template code in views/admin.jade you can see that if you were authenticated as an admin, you would get the secret_password.

if admin === 'yes'
  p You are admin #{flag}
else
  ....

The only function that seems to have a bit more functionality is /login. Login checks whether a password is set. Then it creates a Buffer() from the password and converts the Buffer to a base64 string, which is then compared to the secret_password. If the comparison succeeds, the session sets admin = 'yes'.

3. the vuln

Somebody with a hacker mindset might immediately try to trace where untrusted user input is handled, and eventually you would come across the Buffer class. It turns out that Buffer() behaves differently based on the type of its parameter.
You can test this with NodeJS on the command line:

You can see that when Buffer is called with a string, it will create a Buffer containing those bytes. But if it's called with a number, NodeJS will allocate an n-byte Buffer. If you look closely, the buffer is not simply <Buffer 00 00 00 00> - it seems to always contain different values. That is because Buffer(number) doesn't zero the memory, so it can leak data that was previously allocated on the heap.

This is the issue that recently surfaced. NodeJS issue #4660 discusses the problem and possible fixes. And yes, there were real-world packages affected.

So because we have a JSON middleware (app.use(bodyParser.json())), we can actually send POST data that contains a number. And when you do that, the API will return some memory that is leaked from the heap:

Why can the session key be leaked here? And why can I not leak the secret password? I only have an assumption for the latter: the hardcoded password lives somewhere in the memory area that is mapped when the JIT compiler takes care of the JS code, while the Buffer() allocated memory area is somewhere else.

The NodeJS app uses cookie-session (var session = require('cookie-session')), which has a dependency on cookies, which has a dependency on keygrip. And keygrip does the HMAC signature using the Node core crypto package, and crypto creates a Buffer from the key. This means that an old session key could be leaked from memory.
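The signature format keygrip produces can be reproduced in a few lines: HMAC-SHA1, base64 with '+' → '-', '/' → '_' and trailing '=' padding stripped. A sketch (the key below is a made-up placeholder, not the challenge's key):

```python
import base64
import hashlib
import hmac

def keygrip_sign(data: bytes, key: bytes) -> str:
    """Sign `data` the way keygrip does: HMAC-SHA1, then base64 with
    '+'/'/' replaced and '=' padding removed."""
    mac = hmac.new(key, data, hashlib.sha1).digest()
    return (base64.b64encode(mac).decode()
            .replace("+", "-").replace("/", "_").rstrip("="))

# the cookies module signs the string "name=value"
sig = keygrip_sign(b"session=eyJhZG1pbiI6InllcyJ9", b"not-the-real-key")
```

With the leaked session key in place of the placeholder, this produces a valid session.sig for any forged session cookie.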

With this session key we can now simply create an {"admin": "yes"} cookie with a valid signature, which gives us access to the private area. You can do that by using the source code of this app: change the session_key in config.js and set the default cookie to req.session.admin = 'yes' in app.js.

Then you can grab the values from your modified local application, and simply set those cookies for the challenge server: session=eyJhZG1pbiI6InllcyJ9 and session.sig=oom6DtiV8CPOxVRSW3IFtE909As.

And now we can decode the base64 flag, which is our secret_password:

ALLES{use_javascript_they_said.its_a_safe_language_they_said}

TLDR: send a number as password to get a memory leak from NodeJS Buffer(number). POST /login {"password": 1000}. With a couple of tries you should leak the session key, which can be used to create a new valid signed cookie with {"admin": "yes"}. Win!

I really like to play CTFs (hacking games), because I always learn something new. But sometimes it's also fun to create a challenge yourself. A couple of days ago a nice NodeJS issue surfaced on my twitter feed and because I didn't have a lot of experience with NodeJS, I thought it would be a cool idea to learn more about it, by creating a challenge around it.

The goal is to successfully gain access to the restricted area and find the secret_password. The source code contains a dummy password and keys, which are obviously different on the actual challenge server. But they are easily identifiable because they follow the same format ALLES{...}, so you know when you've got it.

If you stumble across this post at some point in the future and my VM is no longer running, you can just host it locally.
Make sure you have NodeJS and npm installed. In case something changes in the future, I am running the following versions:

Creating this system was an interesting challenge - the main threat vector is root exploits. I'm not a sysadmin and my Linux knowledge is not very in-depth, but I'm still pretty confident in my design. So now I want to go over every design decision.

> The whole setup is currently running on a very cheap vServer running a 64bit Debian

I got a cheap vServer because I didn't want to pay a lot of money for something nobody will use. And I chose Debian because that's the OS I'm most familiar with on a server. But the distro shouldn't really matter, as you will see soon.

> Chroot Jail for the game:

I wanted to separate the game from the real system, and chroot seemed like a very good choice for handcrafting the system. This can be easily done with sshd:
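The original snippet isn't reproduced here, but the relevant sshd_config directives look roughly like this (the group name and chroot path are placeholders, not my actual setup):

```
# /etc/ssh/sshd_config
Match Group players
    ChrootDirectory /var/chroot/matrix
    X11Forwarding no
    AllowTcpForwarding no
```

After a `Match` block like this, every SSH login from that group lands inside the chroot and only sees the handcrafted filesystem.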

This allows me to handcraft the filesystem used by the players and limit the attack surface.

> No access to potential dangerous stuff like /proc and setuid binaries:

Being able to set up the filesystem the way I want, I can choose not to mount /proc or /dev. There is no reason why a user should have access to /proc/kallsyms and know where kernel symbols are. I pulled up a random root exploit from the Exploit Database and it relies on access to /proc.

It's a lot of work to copy all the files necessary for a Linux system into the chroot jail. I need to copy every binary, including the shell itself and ls, cd, ..., and all the libraries like libc as well. But this allows me to carefully control which binaries users have access to and to exclude any setuid root binaries. setuid binaries are another way a root exploit could be achieved - so better to remove those.

> Use Linux file attributes to prevent players from modifying or deleting files, even though they are the owner of them:

The game relies on setuid binaries for levels. For example, you exploit the /matrix/level1/level1 binary, which belongs to user level2, so when you exploit it you gain the rights of level2. But when you log in as level2 you should not be able to delete or modify that binary - that would destroy the game. You should also not be able to create files anywhere, even in your home folder. That's why I use Linux file attributes to control this.

Here is level1 as an example. The owner of the level1 binary is user level2, but the group is still level1; together with the setuid bit (s), user level1 can execute the binary, but it will run as level2. Additionally, the immutable file attribute (i) is set, so that even the owner level2 cannot modify it.

The same goes for the files in the home folder of the user. They all belong to level1, but they are immutable. You may notice that the iwashere file has the write permission for the level1 owner and that the append-only file attribute (a) is set. This allows the user to add a line to the file, for example with echo "samuirai was here" >> /home/level1/iwashere, but the user cannot delete or overwrite it.

One issue will always be root exploits like the recent CVE-2015-3290. But I hope the restricted filesystem together with the virtualized vServer will protect me from the majority of them.

The other big issue is race conditions when setting up new levels or making changes to current levels. When I make changes to levels, I cannot make them atomic: I have to remove the immutable attribute, modify a file, and re-add the attribute. There is a window of opportunity where an attacker could make a mess. But this can be avoided by blocking SSH access, killing all the players' processes, making the changes, and then allowing them back in.