CrySyS Lab Blog (https://blog.crysys.hu)

Certificate Transparency – the current landscape in Hungary
https://blog.crysys.hu/2018/06/certificate-transparency-the-current-landscape-in-hungary/
Thu, 14 Jun 2018

This blog post was written by Márton Horváth, who worked on implementing a monitor for Certificate Transparency logs in the context of his student semester project. As a proof-of-concept, he used the monitor he created to collect information about logged certificates issued in Hungary or issued to Hungarian web sites. This blog post contains the main findings of the analysis of those certificates.

SSL certificates and the underlying PKI are the backbone of secure web transactions, but this system – just as any other – has weaknesses. One such weakness is the slow propagation of information, which makes fraudulent activity harder to detect. This problem motivated Google to introduce Certificate Transparency (or CT for short), which aims at extending the existing PKI by making it more open and verifiable. CT defines three main components: logs, monitors, and auditors.

The most important components of CT are the log servers, which are cryptographically assured append-only databases of certificates. Certificates can be logged by the issuer (typically a Certificate Authority) before or after issuance, or by the subject (e.g., the operator of a web site). By logging a certificate to a known log, it becomes visible to others, including monitors, which query these logs and check the logged certificates.

What is this good for? In case of any abuse of a logged certificate, there is a higher chance that someone detects it, so submitting certificates to a log server is a good thing from a security point of view. In this post, we study how far Hungarian web server operators have progressed in adopting this practice.

The data used for this study was collected from public CT logs by a proof-of-concept CT monitor implementation developed in the CrySyS Lab. At the time of writing, this monitor had been running for 11 days, filtering for certificates that have a Hungarian reference in the subject or issuer fields.

The monitor – a simple program that queries logs for certificates that are interesting to its operator (in our case, Hungarian certificates) – found 427,026 matching entries during this time period. As a rough simplification, every issuer and subject was considered distinct if there was any difference in the corresponding field (e.g., by this logic, two certificates come from different issuers if one of them had the values [‘HU’, ‘Microsec Ltd.’, ‘Qualified’, ‘e-Szigno CA 2009 info@e-szigno.hu’] in the issuer field and the other had [‘HU’, ‘Microsec Ltd.’, ‘Advanced Class 2’, ‘e-Szigno CA 2009 info@e-szigno.hu’], although both were issued by Microsec Ltd). Using this method, the aforementioned certificates have 122,416 distinct subjects and 129 distinct issuers.
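The distinctness criterion can be sketched as follows; the counting logic is our illustration, and only the two example issuer tuples come from the text above:

```python
# "Rough simplification" distinctness: issuers count as different whenever the
# tuple of issuer-field values differs in any component (no normalization).
certs = [
    ('HU', 'Microsec Ltd.', 'Qualified', 'e-Szigno CA 2009 info@e-szigno.hu'),
    ('HU', 'Microsec Ltd.', 'Advanced Class 2', 'e-Szigno CA 2009 info@e-szigno.hu'),
    ('HU', 'Microsec Ltd.', 'Qualified', 'e-Szigno CA 2009 info@e-szigno.hu'),
]

distinct_issuers = set(certs)  # exact tuple equality
print(len(distinct_issuers))   # the two Microsec variants count as distinct issuers
```

With this method both Microsec variants are counted separately even though they come from the same CA, which is why the 129 distinct issuers reported above overestimates the number of real CAs.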

According to Google, the preferred type of certificate logging in the long run is logging before issuance. This requires intervention in the CA’s issuing process, so it is more difficult to implement: the CA has to produce a special, not-yet-signed form of the certificate, called the pre-certificate, and log it before signing the real certificate. This way, the structure returned by the log server (called the SCT, or signed certificate timestamp) can be attached to the certificate as an extension. A growing share of pre-certificates should therefore be detected in the logs as time passes, but at the current stage, many more ‘classic’ x509 entries are expected. And this is exactly the case: the current share of pre-certificates in the logs is tiny, as shown in the following figure.

An interesting retrospection and a good indicator of popularity is the time distribution of logging events. The first certificate that we detected was logged on the 28th of February, 2017 at 9:55. It was still the only one on the 1st of April, so we started enumerating the monthly distribution from that date, sampling on the first day of each month. Representing this data on charts shows more intense logging activity from October 2017 to February 2018.

This roughly fits the time period when it was believed that Google was going to mandate the use of SCTs in Chrome for HTTPS-based communications. Google has since announced a delay in introducing this policy, but it still seems that Google’s intentions had a pressing effect on site owners and CAs. The same tendency can be seen if we display only the delta in the number of logged certificates in each month: there were two bulks of logging, a smaller one from May 2017 to August 2017 and a greater one from November 2017 to February 2018.

Now let’s see the perspective of the Certificate Authorities! Which CAs have been logging their certificates already? Who is using the pre-certificate format when logging?

It is not surprising that Let’s Encrypt logged the most certificates by far: 421,555 out of the 427,028 entries contain the string “Let’s Encrypt” in the issuer field. This is probably due to the fact that these certificates are free to acquire, and Let’s Encrypt integrated automatic logging into its issuing workflow following Google’s requirements. But one should not forget that the operator of a web site can also log its already issued certificates, so the sheer number of entries is not a good measure of progress on the CA side. Rather, the existence of pre-certificate entries should be examined, as those can only be produced before issuance, so they must have been logged by the CAs themselves. Considering this, we have to re-evaluate Let’s Encrypt’s advantage in logging, as it has no pre-certificate entries at all. Considering logged pre-certificates, the best result was achieved by DigiCert with 1,395 entries.

The Hungarian CAs have also started to implement logging: a total of 374 certificates contain a reference to a Hungarian issuer CA. Regarding the share of pre-certificates vs. “classic” x509 certificates, Microsec is leading with 18.25%, followed by NetLock with 2.01%.

An entry also contains meta information, such as the timestamp of the log event. We can combine this with the “not before” field of the certificate to estimate the time difference between issuance and logging. This is not very precise, because the “not before” value can be set later than the actual issuance; in any case, a large difference is notable. We calculated the (logging_timestamp – not_before) difference for every entry. We did not find any negative differences, which means that no certificate was logged before it became valid. This implies that issuers already account for the delay introduced by logging and set the “not before” field accordingly. The minimal time to log was 1 hour and 1 second, the maximum more than 18 years; the latter was a certificate with a “not before” field from 1999 that was logged in 2017. The average difference was about 4 days, but because of such edge cases, the median of 3 hours probably represents the dataset better. These numbers also describe the x509 entries well, because those make up the majority of the log entries. Pre-certificates, on the other hand, show a very different distribution: their minimum difference is the same 1 hour and 1 second, but the maximum is 1 day and 1 hour. This is not surprising, because most logs offer a 24-hour maximum merge delay (MMD), meaning that they append an entry to their database within this time interval. In this case, the 11-hour average and 10-hour median did not differ that much.
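The per-entry computation can be sketched as follows; the sample values are made up for illustration, only the computation mirrors the one described above:

```python
from datetime import datetime
from statistics import median

# Illustrative (not_before, logging_timestamp) pairs; the values are invented,
# roughly echoing the extremes mentioned in the post.
entries = [
    (datetime(2018, 3, 1, 8, 0, 0), datetime(2018, 3, 1, 9, 0, 1)),   # ~1 hour
    (datetime(2018, 3, 2, 0, 0, 0), datetime(2018, 3, 2, 11, 0, 0)),  # 11 hours
    (datetime(1999, 1, 1, 0, 0, 0), datetime(2017, 5, 1, 0, 0, 0)),   # ~18 years
]

# A positive difference means the certificate was logged after it became valid.
diffs = [(logged - not_before).total_seconds() for not_before, logged in entries]

assert all(d >= 0 for d in diffs)  # no certificate logged before its validity start
print('min hours:', min(diffs) / 3600)
print('median hours:', median(diffs) / 3600)
```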

The examined extension field (extended key usage) provided the expected result: most of the entries contain the serverAuth and clientAuth values. The public key algorithm field within the subject public key info shows that most certificates (426,692) contain an RSA key and only a few of them (334) contain ECDSA keys.

Certificate Transparency is a new way to improve the security of the PKI without much effort from the participants of the TLS certificate issuance process. In addition, it is designed to be optional, hence it does not require everyone to adopt it instantly. So introducing it is a good trade-off for the community. The Hungarian CAs and web server operators have already started to include logging in their procedures, but there is still much room for improvement, especially in logging pre-certificates rather than already issued certificates.

Linux integrity monitoring on the Raspberry Pi
https://blog.crysys.hu/2018/06/linux-integrity-monitoring-on-the-raspberry-pi/
Tue, 12 Jun 2018

This blog post, written by István Telek, is the last post in a series of blog posts on transforming the Raspberry Pi into a security enhanced IoT platform. It describes how we implemented a very basic integrity monitoring function as a trusted application running in OP-TEE.

Introduction

Runtime integrity monitoring can enhance the security of our system by providing information about possible failures, misconfigurations, and break-in attempts. OP-TEE and the ARM TrustZone technology provide a secure way to map certain memory regions and check their contents. In our example, we use this feature to obtain, in the secure world OP-TEE, the list of processes running in the normal world OS. One can then use a whitelisting technique to reliably detect any unknown or unwanted processes running in the normal world OS.

We explain our work in 6 main steps below:

1. Linux process list structure
2. Linux kernel memory addressing
3. Implementation as a Linux kernel module
4. Mapping Linux kernel memory from OP-TEE
5. Implementation as a Trusted Application
6. Creating a client application for our TA

1. Linux process list structure

Linux processes are represented by task_struct structures in kernel memory (see Linux kernel 4.6.3 task_struct structure source code). An instance of this structure is around 3 KiB in size and it contains a lot of information about the particular running process. These structures are stored in the kernel memory cache (kmem_cache).

Key points of the structure:

pid_t pid; Contains the PID of the process

struct list_head children; Holds a circular doubly linked list with the children of the given process.

struct list_head sibling; Holds a linkage to the parent’s children.

char comm[16]; Contains the name of the process. If a kernel process’ name ends with a / character followed by a number then the number indicates which CPU/core the thread is running on.

For our purposes, we are using a minimal task_struct structure called my_task_struct which only has these key members:
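A minimal sketch of such a structure, mirroring only the key members listed above (in a real kernel these fields sit at version-specific offsets inside task_struct, so this layout is illustrative only):

```c
#include <stddef.h>
#include <assert.h>

/* Minimal mirror of the fields we need from the kernel's task_struct.
 * Real offsets depend on the kernel version and configuration; this sketch
 * only shows the shape of the data we traverse. */
struct my_list_head {
    struct my_list_head *next, *prev;
};

struct my_task_struct {
    int pid;                      /* PID of the process (pid_t in the kernel) */
    struct my_list_head children; /* head of this process's children list */
    struct my_list_head sibling;  /* linkage in the parent's children list */
    char comm[16];                /* process name */
};
```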

Each task_struct structure contains a link to its first child, which is linked to the child’s sibling field. The last task_struct structure’s sibling in this chain is linked back to the parent’s children field. We can get the process list by traversing this task_struct tree starting from the init_task symbol, which is statically allocated in the kernel memory following boot. The first child of init_task is init with PID 1, and the second child is kthreadd which is the kernel daemon with PID 2. To get the task_struct structure from the children->next pointer we have to calculate the offset of the sibling field and then create a pointer which points to the first element of this struct. This can be done with the list_entry macro which is supplied by the kernel (see list_entry macro documentation).
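The pointer arithmetic behind list_entry can be demonstrated in plain user-space C with our minimal structures (this is an illustration of the container_of idea, not the kernel's actual macro expansion):

```c
#include <stddef.h>
#include <assert.h>

struct my_list_head { struct my_list_head *next, *prev; };

struct my_task_struct {
    int pid;
    struct my_list_head children;
    struct my_list_head sibling;
    char comm[16];
};

/* Same idea as the kernel's list_entry/container_of: subtract the offset of
 * `member` from the address of the embedded list node to recover the
 * containing structure. */
#define my_list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct my_task_struct parent = { .pid = 0, .comm = "swapper" };
struct my_task_struct child  = { .pid = 1, .comm = "init" };

/* Follow children->next (which points at the child's sibling node) back to
 * the child's my_task_struct. */
struct my_task_struct *first_child(struct my_task_struct *p)
{
    return my_list_entry(p->children.next, struct my_task_struct, sibling);
}
```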

2. Linux memory addressing

The virtual address of the root of the process list (the init_task symbol) can be found in the System.map static symbol table which is created during the kernel compilation. On a running system /proc/kallsyms contains these static symbols along with the dynamic symbols. This first task is not a running process but the head of the process tree. Its process id (PID) is 0 and it is called swapper.

The address of init_task (which is 0xffffff80089bad00 in our case) depends on the particular kernel version and configuration. This is a Kernel Virtual Address (more about memory addressing can be found here and here), which is linearly mapped to physical memory, so it is easy to translate it to a physical address. This translation is implemented as a macro called __virt_to_phys in memory.h.

This macro first checks the type of the kernel virtual address and it calculates the offsets accordingly. The virtual address of init_task (0xffffff80089bad00) can be translated to physical address 0x00000000009bad00. The physical addresses are important for us, because we need to map the physical memory regions in OP-TEE with our Trusted Application to access them.
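For the kernel-image case, the arithmetic reduces to subtracting a fixed virtual offset. The offset value below is inferred from the two addresses quoted above and is specific to this kernel build, so treat it as an illustration:

```c
#include <stdint.h>
#include <assert.h>

/* Linear translation for kernel-image addresses on this RPi3 setup:
 * physical = virtual - offset. The offset is inferred from the pair
 * 0xffffff80089bad00 (virt) -> 0x009bad00 (phys) quoted in the post and
 * is configuration specific. */
static const uint64_t KIMAGE_VOFFSET = 0xffffff8008000000ULL;

uint64_t virt_to_phys(uint64_t virt)
{
    return virt - KIMAGE_VOFFSET;
}
```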

3. Implementation as a Linux kernel module

We can create a simple kernel module which traverses and prints the process list. This kernel module does not use built-in macros or types, because we are going to implement it as a trusted application in the future. Because of this, we have to define our own list_head and task_struct structures, and a list_entry macro.

To create a kernel module we have to implement the init_module function which is defined in the <linux/module.h> header in the kernel sources. To do this we create a directory called kmod_pslist next to the linux kernel sources directory, which in our case is called linux. This directory is where we put our kernel module source file linux_pslist.c and the following Makefile:
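A minimal Makefile of the usual out-of-tree module shape could look like this (the kernel path and cross-compile variables are assumptions for this layout; the `kernel` target matches the command used below):

```make
# Out-of-tree kernel module build; KDIR points at the kernel source tree
# (the directory called `linux` next to this one in our layout).
obj-m := linux_pslist.o

KDIR ?= ../linux
ARCH ?= arm64
CROSS_COMPILE ?= aarch64-linux-gnu-

all:
	$(MAKE) -C $(KDIR) M=$(PWD) ARCH=$(ARCH) CROSS_COMPILE=$(CROSS_COMPILE) modules

kernel:
	$(MAKE) -C $(KDIR) ARCH=$(ARCH) CROSS_COMPILE=$(CROSS_COMPILE)

clean:
	$(MAKE) -C $(KDIR) M=$(PWD) clean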

To print out the process tree we can write a recursive function which traverses through the process list. This function prints the name and the PID of the task together with the address of the current task_struct structure’s virtual address. The full source code of linux_pslist.c is the following:
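A sketch of linux_pslist.c follows. This is our reconstruction of the approach described above: the padding sizes and the init_task address are placeholders that must be derived from your own kernel build (System.map for the address, offsetof() against the real task_struct for the paddings):

```c
/* linux_pslist.c -- sketch of the process-list walker described above. */
#include <linux/module.h>
#include <linux/kernel.h>

MODULE_LICENSE("GPL");

struct my_list_head { struct my_list_head *next, *prev; };

/* Placeholder paddings: NOT real values. Derive them for your own kernel,
 * e.g. with offsetof() against the real task_struct in a throwaway module. */
#define PAD_TO_PID      0x3c0
#define PAD_TO_CHILDREN 0x80
#define PAD_TO_COMM     0x100

struct my_task_struct {
    char pad1[PAD_TO_PID];
    int pid;
    char pad2[PAD_TO_CHILDREN];
    struct my_list_head children;
    struct my_list_head sibling;
    char pad3[PAD_TO_COMM];
    char comm[16];
};

#define my_list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Address of init_task from System.map (example value from this post). */
#define INIT_TASK_ADDR 0xffffff80089bad00UL

/* Recursion depth is bounded by the depth of the process tree, which is
 * shallow enough for this demonstration. */
static void print_tasks(struct my_task_struct *task, int depth)
{
    struct my_list_head *pos;

    pr_info("%*s%s (pid %d) at %p\n", depth * 2, "", task->comm,
            task->pid, task);

    for (pos = task->children.next; pos != &task->children; pos = pos->next)
        print_tasks(my_list_entry(pos, struct my_task_struct, sibling),
                    depth + 1);
}

static int __init pslist_init(void)
{
    print_tasks((struct my_task_struct *)INIT_TASK_ADDR, 0);
    return 0;
}

static void __exit pslist_exit(void) { }

module_init(pslist_init);
module_exit(pslist_exit);
```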

After obtaining all sources, we have to compile the kernel first. You can speed up the compilation by specifying the number of make jobs with the -jX parameter; it is recommended to set this to at least the number of threads your processor can run simultaneously.

$ cd kmod_pslist
$ make kernel -j4

To compile the kernel module simply issue the make command in the kmod_pslist directory:

$ make

If the compilation was successful, the kernel module linux_pslist.ko is created in the current directory. We can copy this to the Raspberry Pi and load it to see the process list:

$ sudo insmod linux_pslist.ko
$ dmesg

If the kernel module was loaded successfully, messages listing the process tree should appear in the kernel log.

4. Mapping Linux memory from OP-TEE

To access any memory in OP-TEE, it needs to be mapped. We need to map the area of the kernel memory where the init_task symbol and the other task_struct instances are located. Every mapping must be PAGE_SIZE (usually 4 KiB) aligned. For a more detailed description of memory management in OP-TEE, see the Design Documentation. The mappings can be registered with the register_phys_mem_ul macro, which can be called in core_mmu.c or in the platform-specific main.c:
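A registration along these lines could be used; the base address and size below are illustrative (a region covering the init_task page at physical 0x009bad00), and MEM_AREA_RAM_NSEC is OP-TEE's area type for non-secure RAM:

```c
/* Hypothetical mapping (addresses and size are illustrative): register the
 * normal-world RAM region that holds init_task and the task_structs so the
 * pseudo TA can read it. */
register_phys_mem_ul(MEM_AREA_RAM_NSEC, 0x00000000, 0x01000000 /* 16 MiB */);
```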

According to the kernel virtual memory layout (see below), 0xffffff80089bad00 (virt) / 0x009bad00 (phys) falls in the .data section, while 0x30000000 (phys) falls in the memory region.

Memory mapping limitations

In core_mmu_lpae.h, the MAX_XLAT_TABLES constant must be set to at least 10, because with lower values OP-TEE panics: the 248 MiB normal-world memory cannot be mapped. The core_mmu_entry_to_finer_grained function tries to make the tables “finer grained” (probably by creating more small L1 tables) but cannot, since it reaches the defined maximum translation (xlat) table limit. Setting the limit higher solves the problem.

Note: There is a slight inconsistency between the ARM TF platform-specific config constants and the OP-TEE config constants: ARM TF defines the secure RAM to be 28 MiB in size (DRAM_SEC_SIZE), while OP-TEE defines 32 MiB (TZDRAM_SIZE). Also, according to the definitions in the config files, there is 8 or 12 MiB of unused secure RAM.

5. Implementation as a Trusted Application

There are two types of Trusted Applications (TAs) in OP-TEE. The most commonly used one is the user mode TA. User TAs are full-featured Trusted Applications as specified by the GlobalPlatform TEE specifications; they are loaded into memory from the normal world untrusted file system by OP-TEE when called (by specifying their UUID). They can access the GlobalPlatform core API and run at a lower exception level than the OP-TEE core.

The other type is the pseudo TA. These are implemented directly in the OP-TEE core tree and statically built into the OP-TEE core, and because of this they cannot access the GlobalPlatform core API. They are also called by specifying their UUID, and they run at the same exception level as the OP-TEE core (like kernel modules in Linux). For a more detailed explanation, see the TA section of the Design Documentation.

The pseudo TA uses the same algorithm as the kernel module. The main difference is that it must be independent of the Linux kernel headers used in the module, and since it is compiled into the OP-TEE core, there is very limited standard IO and string manipulation support. The only usable printing macros are defined in trace.h, like DMSG and IMSG.

The entry point of the TA is static TEE_Result invoke_command, just like in user TAs, but it must be registered with the pseudo_ta_register macro. For example:
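A registration could look roughly like this; the UUID, TA name, and command ID are placeholders of our own, while the macro shape follows OP-TEE's pseudo_ta_register:

```c
/* Hypothetical UUID and name for our integrity-monitoring pseudo TA. */
#define PSLIST_TA_UUID \
    { 0x0b8a67a6, 0x0000, 0x0000, \
      { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01 } }

static TEE_Result invoke_command(void *sess_ctx, uint32_t cmd_id,
                                 uint32_t param_types,
                                 TEE_Param params[TEE_NUM_PARAMS])
{
    (void)sess_ctx; (void)param_types; (void)params;

    switch (cmd_id) {
    case 0: /* e.g., walk and print the normal-world process list */
        return TEE_SUCCESS;
    default:
        return TEE_ERROR_NOT_SUPPORTED;
    }
}

pseudo_ta_register(.uuid = PSLIST_TA_UUID,
                   .name = "pslist.pta",
                   .flags = PTA_DEFAULT_FLAGS,
                   .invoke_command_entry_point = invoke_command);
```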

We implemented the Linux kernel macros (defined in memory.h) in the pseudo TA for translating normal world virtual addresses to physical addresses. The physical addresses must then be converted to secure world virtual addresses so they can be used in code; this is achieved by calling phys_to_virt (defined in core_memprot.h) (ref. issue #1496).

Further issues and problems encountered

Securing the RPi3 hardware watchdog timer (unsuccessfully)

The Raspberry Pi 3 has a hardware watchdog timer (WDT) in the BCM2837 SoC. We wanted to use this timer securely from OP-TEE, in such a way that only the Secure World (SW) can access the WDT. By doing this, periodic execution of a Trusted Application (e.g., our integrity monitoring app) could be ensured. Unfortunately, we could not achieve this on the RPi3 board: the device peripherals are mapped in high memory, and there is currently no known method to secure parts of that memory region without potentially causing problems for Normal World operation and the Linux kernel. The board’s lack of security features makes this process generally harder.

Lack of information in general

The BCM2837 does not have publicly available documentation. The official page references the BCM2835 and BCM2836 SoCs used in the previous generation Pis. In these two documents there is only minimal information about the WDT: the BCM2836 document mentions that one of the timers on the SoC can be used as a watchdog timer, but provides no further information.

Addressing

The datasheets mention three types of addresses: physical, virtual, and bus. The addresses in the documents are bus addresses and therefore cannot be used directly in programs. Also, the physical address mapping in the documents is different from the real physical address mapping of the RPi3. This can add another layer of confusion when trying to work with peripheral addresses (e.g., in code).

Watchdog reset cycle

One other possible problem is that the WDT on this board does not fully reset the device (e.g., it does not re-execute bootcode.bin, the first stage bootloader running on the GPU) the way a full power-cycle reset does. Some information about this can be found on the Raspberry Pi Forums and in comments in the Linux driver of the WDT.

Memory regions

The physical address of the device peripherals can be found in the ARM-TF RPi3 platform-specific source code (contributed by Sequitur Labs): 0x3f000000. This region extends to the end of the physical memory (0x40000000). Every peripheral offered by the SoC is mapped in this region (WDT and other timer registers, GPIO, UART, SPI, I2C, …).

The WDT physical address is also defined in the source code (and not documented anywhere else): 0x3f100000. (The corresponding C struct is also available in the Linux driver.)

This memory region (0x3f000000 - 0x40000000) is non-secure, i.e., accessible by the Normal World too. ARM-TF configures the memory separation of the NW and SW, because it executes at a higher exception level (EL3) than OP-TEE. Only one region of the memory can be made secure this way.

Possible solutions

Remapping OP-TEE: One solution for securing the peripheral region could be to remap the OP-TEE secure region to include the peripheral region too. But this is probably not recommended, since every other device used and required by the NW Linux kernel is mapped there too, and doing this could lead to hard-to-identify problems.

External WDT: Since the GPIO can also not be made secure, this solution could only add the possibility of a full power-cycle reset for the board, if that is required. For a discussion on this, refer to this RPi forum thread.

ARM-TF TSP (Test Secure Payload) secure timers: Instead of using the WDT, this approach uses the ARM secure timers to periodically execute TA code. There is an issue for this on the OP-TEE GitHub page, but no further information for the RPi board, or on the status or success of the approach. In this ARM Community question there are instructions for another board, but right now this approach seems like a long shot.

Note: Other solutions for securing arbitrary memory regions probably do not exist publicly. Refer to this OP-TEE GitHub issue.

Other resources

Enabling WiFi and converting the Raspberry Pi into a WiFi AP
https://blog.crysys.hu/2018/06/enabling-wifi-and-converting-the-raspberry-pi-into-a-wifi-ap/
Tue, 12 Jun 2018

This blog post, written by Márton Juhász, is the fifth in a series of blog posts on transforming the Raspberry Pi into a security enhanced IoT platform. This post will explain how to convert the Raspberry Pi into a WiFi access point such that it can perform some gateway-like functionality. First, we describe how to enable WiFi, and then how to enable the other software components needed to make the Pi an access point.

Enabling Wi-Fi

In this subsection, we enable the Wi-Fi interface of the Raspberry Pi 3, so that it will be able to connect to Wi-Fi networks.

1. Buildroot build options

By default, Wi-Fi is not enabled when creating an image with Buildroot. Since the Wi-Fi components come up a little late, a dynamic hardware configuration (hotplug) tool must be used. Buildroot supports mdev, which can be built into the filesystem and used for the required dynamic hardware configuration.

BR2_PACKAGE_WPA_SUPPLICANT_NL80211: Enable support for nl80211. This is the current wireless API for Linux, supported by all wireless drivers in vanilla Linux, but it may not be supported by some out-of-tree Linux wireless drivers; wpa_supplicant will still fall back to using the Wireless Extensions (wext) API with those drivers. If this option is disabled, then only the deprecated wext API will be supported, with far fewer features. Linux may support wext with modern drivers using a compatibility layer, but it must be enabled in the kernel configuration. Check Target packages -> Networking applications -> wpa_supplicant -> Enable nl80211 support (BR2_PACKAGE_WPA_SUPPLICANT_NL80211 = y).

BR2_PACKAGE_WPA_SUPPLICANT_PASSPHRASE: Install the wpa_passphrase command line utility. This is optional, and only needed if you want to connect to other Wi-Fi networks than the originally chosen one without the help of another machine (or if you don’t have wpa_passphrase on your build system). Check Target packages -> Networking applications -> wpa_supplicant -> Install wpa_passphrase binary (BR2_PACKAGE_WPA_SUPPLICANT_PASSPHRASE = y).
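In .config terms, the two options above amount to the following fragment (the parent wpa_supplicant option is implied by the menu path and added here for completeness):

```
BR2_PACKAGE_WPA_SUPPLICANT=y
BR2_PACKAGE_WPA_SUPPLICANT_NL80211=y
BR2_PACKAGE_WPA_SUPPLICANT_PASSPHRASE=y
```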

2. wpa_supplicant

Create a file named interfaces in buildroot/board/raspberrypi/ (all the other raspberrypi* directories are symlinks to this folder). The auto wlan0 line will make sure that wlan0 is started when ifup -a is run, which is done by the init scripts.
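A minimal interfaces file for the client case could look like this (the DHCP choice and the wpa_supplicant invocation in the pre-up hook are assumptions for this setup):

```
auto lo
iface lo inet loopback

auto wlan0
iface wlan0 inet dhcp
    pre-up wpa_supplicant -D nl80211 -i wlan0 -c /etc/wpa_supplicant.conf -B
    post-down killall -q wpa_supplicant
```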

Create another file named wpa_supplicant.conf with wpa_passphrase in buildroot/board/raspberrypi/ (all the other raspberrypi* directories are symlinks to this folder). It should look something like this:

network={
    ssid="SSID"
    #psk="PASSWORD"
    psk=XXX
}

3. post-build.sh

The hotplug helper must be set to mdev and the /etc/mdev.conf file must be written; the mdev package itself has a helper script for this that can be used directly. The files created above must also be copied, so add the following lines to buildroot/board/raspberrypi/post-build.sh:
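A sketch of the additions (all paths are assumptions for this layout; Buildroot exports TARGET_DIR when the hook runs, and the defaults below only allow a standalone dry run of the copy logic):

```shell
#!/bin/sh
# Sketch of the post-build.sh additions.
BOARD_DIR="${BOARD_DIR:-board/raspberrypi}"
TARGET_DIR="${TARGET_DIR:-output/target}"

mkdir -p "${BOARD_DIR}" "${TARGET_DIR}/etc/network"

# For a standalone dry run, fabricate the input files if they do not exist.
[ -f "${BOARD_DIR}/interfaces" ] || printf 'auto wlan0\n' > "${BOARD_DIR}/interfaces"
[ -f "${BOARD_DIR}/wpa_supplicant.conf" ] || printf 'network={\n}\n' > "${BOARD_DIR}/wpa_supplicant.conf"

# The actual additions: install the network configuration into the rootfs.
cp "${BOARD_DIR}/interfaces" "${TARGET_DIR}/etc/network/interfaces"
cp "${BOARD_DIR}/wpa_supplicant.conf" "${TARGET_DIR}/etc/wpa_supplicant.conf"

# Install a board-provided mdev.conf if one exists (hypothetical file name).
[ -f "${BOARD_DIR}/mdev.conf" ] && cp "${BOARD_DIR}/mdev.conf" "${TARGET_DIR}/etc/mdev.conf" || true
```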

Making a Wi-Fi AP

With these additional steps the Wi-Fi interface of the Raspberry Pi 3 can work as a Wi-Fi AP. These depend on the steps in the previous part (Enabling WiFi), so first follow those before continuing here.

1. Buildroot build options

The following must be set before a system build, alongside with the other configuration settings in Buildroot.

BR2_PACKAGE_BUSYBOX_SHOW_OTHERS: Show packages in menuconfig that are potentially also provided by busybox. Some of the packages that are needed are provided by busybox and if the following option is not selected, necessary packages will not appear in the package list. Check Target packages -> BusyBox -> Show packages that are also provided by busybox. (BR2_PACKAGE_BUSYBOX_SHOW_OTHERS = y)

BR2_PACKAGE_WPA_SUPPLICANT_AP_SUPPORT: With this option enabled, wpa_supplicant can act as an access point much like hostapd does, with a limited feature set. This links parts of the hostapd functionality into wpa_supplicant, making it bigger but dispensing with the need for a separate hostapd binary in some applications, hence being smaller overall. Check Target packages -> Networking applications -> wpa_supplicant -> Enable AP mode. (BR2_PACKAGE_WPA_SUPPLICANT_AP_SUPPORT = y)
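In .config terms, these two options amount to:

```
BR2_PACKAGE_BUSYBOX_SHOW_OTHERS=y
BR2_PACKAGE_WPA_SUPPLICANT_AP_SUPPORT=y
```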

2. wpa_supplicant

Edit buildroot/board/raspberrypi/interfaces (all the other raspberrypi* directories are symlinks to this folder). Now the network must be configured manually; the settings here must match the settings in dhcpd.conf (below).
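For example (the 192.168.10.0/24 addressing is our assumption and must agree with the dhcpd.conf below):

```
auto wlan0
iface wlan0 inet static
    address 192.168.10.1
    netmask 255.255.255.0
    pre-up wpa_supplicant -D nl80211 -i wlan0 -c /etc/wpa_supplicant.conf -B
    post-down killall -q wpa_supplicant
```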

Edit buildroot/board/raspberrypi/wpa_supplicant.conf (all the other raspberrypi* directories are symlinks to this folder). mode=2 selects Access Point mode, and the other settings are for WPA2-PSK. Change the weak psk value of “12345678” to a strong password!
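A file along these lines (the SSID is a placeholder of our own; the weak "12345678" psk is the one mentioned above and must be changed):

```
network={
    ssid="RPi3-AP"
    mode=2
    key_mgmt=WPA-PSK
    proto=RSN
    pairwise=CCMP
    psk="12345678"
}
```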

3. dhcpd.conf

Create a file named dhcpd.conf in buildroot/board/raspberrypi/ (all the other raspberrypi* directories are symlinks to this folder). Below is a modified default /etc/dhcp/dhcpd.conf file, which we changed as follows:
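A sketch of such a dhcpd.conf (the subnet, range, and DNS values are assumptions, chosen to match the static interface address used for wlan0):

```
# Hand out addresses on the AP subnet served via wlan0.
subnet 192.168.10.0 netmask 255.255.255.0 {
    range 192.168.10.10 192.168.10.100;
    option routers 192.168.10.1;
    option domain-name-servers 8.8.8.8;
    default-lease-time 600;
    max-lease-time 7200;
}
```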

4. post-build.sh

As in the previous part, the hotplug helper must be set to mdev and the /etc/mdev.conf file must be written; the mdev package has a helper script for this that can be used directly. The files created above (now including dhcpd.conf) must also be copied, so extend buildroot/board/raspberrypi/post-build.sh accordingly.

5. sysctls

This sysctl must be set after each boot, so create a file named sysctl.conf and copy it to /etc/ on the root partition of the memory card:

# Enable IP forwarding.
net.ipv4.ip_forward = 1

The scripts in /etc/init.d/ run after each boot and before each shutdown. The following script is responsible for setting the sysctls above, so create another, executable file named S02procps and copy it to /etc/init.d/ on the root partition of the memory card:

#! /bin/sh
if [ "$1" = "start" ]; then
    sysctl -p
fi

6. Firewall settings

The following script is responsible for setting up the firewall, so create another executable file named S99firewall and copy it to /etc/init.d/ on the root partition of the memory card:
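A sketch of such a script; the interface names and the NAT policy are our assumptions (clients on wlan0 are masqueraded out through eth0), and a dry-run fallback is added so the sketch can be exercised without root:

```shell
#!/bin/sh
# Sketch of S99firewall: NAT traffic from Wi-Fi AP clients (wlan0) out
# through eth0. Interface names and policy are assumptions for this setup.

setup_firewall() {
    # Use the real iptables only when it is actually usable (e.g., running
    # as root on a box with the needed kernel support); otherwise dry-run.
    if iptables -t nat -L >/dev/null 2>&1; then
        IPT="iptables"
    else
        IPT="echo iptables"
    fi

    $IPT -t nat -A POSTROUTING -o eth0 -j MASQUERADE
    $IPT -A FORWARD -i eth0 -o wlan0 -m state --state RELATED,ESTABLISHED -j ACCEPT
    $IPT -A FORWARD -i wlan0 -o eth0 -j ACCEPT
}

if [ "$1" = "start" ]; then
    setup_firewall
fi
```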

OS hardening on the Raspberry Pi
https://blog.crysys.hu/2018/06/os-hardening-on-the-raspberry-pi/
Sun, 10 Jun 2018

This blog post, written by Márton Juhász, is the fourth in a series of blog posts on transforming the Raspberry Pi into a security enhanced IoT platform. Previous posts discussed building a custom Linux system with Buildroot, installing OP-TEE, and verified boot on the Raspberry Pi. This post describes some OS hardening options you can use to reduce the attack surface.

Some of the hardening options we discuss are build-time CONFIGs and some are runtime settings (and some runtime settings depend on certain build CONFIGs). Note that these are for maximum security, so they can cause a performance drop, and they can make some other packages fail during build time or malfunction later.

Buildroot build options and Linux Kernel CONFIGs

These must be set before a system build, alongside with the other configuration settings in Buildroot.

-fstack-protector

BR2_SSP_REGULAR: Emit extra code to check for buffer overflows, such as stack smashing attacks. This is done by adding a guard variable to functions with vulnerable objects. This includes functions that call alloca, and functions with buffers larger than 8 bytes. The guards are initialized when a function is entered and then checked when the function exits. If a guard check fails, an error message is printed and the program exits.

BR2_SSP_STRONG: Like -fstack-protector but includes additional functions to be protected – those that have local array definitions, or have references to local frame addresses.

BR2_SSP_ALL: Like -fstack-protector except that all functions are protected. This option might have a significant performance impact on the compiled binaries.

BR2_FORTIFY_SOURCE_1: This option sets _FORTIFY_SOURCE to 1 and only introduces checks that shouldn’t change the behavior of conforming programs. Adds checks at compile-time only.

BR2_FORTIFY_SOURCE_2: This option sets _FORTIFY_SOURCE to 2; some more checking is added, but some conforming programs might fail. It also adds checks at run-time (a detected buffer overflow terminates the program).

Password encoding

Choose the password encoding scheme to use when Buildroot needs to encode a password (e.g., the root password, below). Note: this is used at build-time, not at runtime.

BR2_TARGET_GENERIC_PASSWD_MD5: Use MD5 to encode passwords. This is the default and is widely available. However, MD5 is an old hash function that suffers from known weaknesses, which makes it susceptible to brute-force attacks.

BR2_TARGET_GENERIC_PASSWD_SHA256: Use SHA256 to encode passwords. Very strong, but not ubiquitous, although available in glibc for some time now.

BR2_TARGET_GENERIC_PASSWD_SHA512: Use SHA512 to encode passwords. Extremely strong, but not ubiquitous, although available in glibc for some time now.

root login

BR2_TARGET_ENABLE_ROOT_LOGIN: Allow root to log in with a password. If not enabled, root will not be able to log in with a password. However, if you have an ssh server and you add an ssh key, you can still allow root to log in.

BR2_TARGET_GENERIC_ROOT_PASSWD: Set the initial root password. If set to empty (the default), then no root password will be set, and root will need no password to log in. If the password starts with any of $1$, $5$ or $6$, it is considered to be already crypt-encoded with MD5, SHA256 or SHA512, respectively. Any other value is taken to be a clear-text value, and is crypt-encoded as per the "Password encoding" scheme above. WARNING! The password appears as-is in the .config file, and may appear in the build log! Avoid using a valuable password if either the .config file or the build log may be distributed, or at the very least use a strong cryptographic hash for your password!
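To avoid putting a clear-text password in .config, you can pre-hash it yourself and paste the crypt-encoded value into BR2_TARGET_GENERIC_ROOT_PASSWD. A sketch using OpenSSL ('changeme' is a placeholder; mkpasswd from the whois package works too):

```shell
# Produce a SHA512-crypt hash (starts with $6$) suitable for
# BR2_TARGET_GENERIC_ROOT_PASSWD.
openssl passwd -6 'changeme'
```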

The scripts in /etc/init.d/ run after each boot and before each shutdown. The following script is responsible for setting the sysctls above. Create an executable file named S02procps with the content below, and copy it to /etc/init.d/ on the root partition of the memory card.

#! /bin/sh
# Apply the sysctl settings (from /etc/sysctl.conf) at boot
if [ "$1" = "start" ]; then
    sysctl -p
fi

The next post will discuss how to make the Raspberry Pi function as a WiFi access point (such that it can perform some gateway functionality).

Sources

Verified boot on the Raspberry Pi
https://blog.crysys.hu/2018/06/verified-boot-on-the-raspberry-pi/ (Sun, 10 Jun 2018)

This blog post, written by István Telek, is the third post in a series of blog posts on transforming the Raspberry Pi into a security enhanced IoT platform. It describes how you can implement a verified boot process on the Raspberry Pi.

Introduction

Securing the boot process is the first step in securing an embedded system. Booting untrusted images can circumvent existing security measures, therefore it is vital to ensure the integrity of these images. U-Boot introduced a feature called “verified boot” which can be used to verify images while still allowing them to be upgraded when needed. This feature needs an initial trusted image called a “root of trust” which is missing on the Raspberry Pi platform. This is a known limitation of the platform, but we are only using it to demonstrate the verified boot process. More information about verified boot can be found at [1].

The following steps are required to implement verified boot:

Create a signed image: The first step is to create a signed image which can be verified during boot.

Compile U-Boot with FIT image support: By default U-Boot doesn’t verify the images, so we have to configure it to support verified boot.

Install the image: The next step is to install the signed image and boot from it.

1. Create a signed image

U-Boot’s image format is called FIT (Flat Image Tree) which is a structured container format that supports multiple images, device trees, etc. We can add signatures during or after creating a FIT.

The FIT image source file is an .its (image tree source) file, which describes the included images and the signature methods. We can use multiple configurations for booting and we can sign these configurations. In our example we are using OP-TEE with Linux to demonstrate loading multiple images during boot. More information about booting multiple images can be found at [2]. Our .its file is the following:
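A minimal sketch of what such an .its file looks like (image names, load addresses, and the key name "dev" are illustrative, not the exact file used in the project):

```
/dts-v1/;
/ {
    description = "Signed FIT image (sketch)";
    #address-cells = <1>;

    images {
        kernel {
            data = /incbin/("Image");
            type = "kernel";
            arch = "arm64";
            os = "linux";
            compression = "none";
            load = <0x01080000>;
            entry = <0x01080000>;
            hash-1 { algo = "sha256"; };
        };
    };

    configurations {
        default = "conf-1";
        conf-1 {
            kernel = "kernel";
            signature-1 {
                algo = "sha256,rsa2048";
                key-name-hint = "dev";
                sign-images = "kernel";
            };
        };
    };
};
```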

2. Create a signed FIT image

The next step is to create the image and sign our configuration with the generated private key. The image source is called im-magic.its, which is the source file shown above. The signed image is called image.fit. We are storing the public key in U-Boot's Control DTB so it can verify the image during boot.
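The key pair itself can be generated with OpenSSL; a sketch (the key directory keys/ and the key name dev are illustrative):

```shell
# Generate an RSA key pair for FIT image signing (illustrative names).
mkdir -p keys
openssl genrsa -out keys/dev.key 2048
openssl req -new -x509 -key keys/dev.key -out keys/dev.crt -days 365 -subj '/CN=dev'
```

The signed FIT can then be produced with something like `mkimage -f im-magic.its -k keys -K u-boot-rpi.dtb -r image.fit`, where -K writes the public key into U-Boot's Control DTB and -r marks the signature as required.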

3. Install the image

To install the image, copy optee/out/uboot.env, optee/u-boot/u-boot-rpi.bin and fit/image.fit to the SD card boot partition. Delete Image and optee.bin to save some space. If the image was verified successfully, a similar output should appear on the UART terminal during boot:

Sources

Using Buildroot to create custom Linux system images
https://blog.crysys.hu/2018/06/using-buildroot-to-create-custom-linux-system-images/ (Sun, 10 Jun 2018)

This blog post, written by Szilárd Dömötör, is the second post in a series of blog posts on transforming the Raspberry Pi into a security enhanced IoT platform. The first post explained how to build and install the default OP-TEE implementation for the Raspberry Pi 3. This one describes how you can build your own custom Linux system (with OP-TEE) using the Buildroot environment.

Introduction

When using OP-TEE on the Raspberry Pi 3, the default root file system (rootfs) is generated with a simple initramfs build script: gen-rootfs. This is usable for testing OP-TEE and its functionality, but more complex applications need greater customizability. Adding Linux packages this way is not practical: updating the packages and the OP-TEE trusted applications, and delivering those updates, becomes cumbersome over time. One might prefer a traditional Linux distribution like Raspbian, but unfortunately OP-TEE support is missing or incomplete in those mainline distributions.

This is where Buildroot and the Yocto Project come into the picture. With these solutions one can build a completely customizable embedded Linux system. These build systems provide the rootfs, toolchains, kernel, bootloader and a great number of installable packages. Everything is compiled from scratch using cross compilation. Both are very actively developed and maintained projects, and they are widely used in the industry by companies such as Intel, Juniper, Xilinx, and Texas Instruments. Both have great documentation and online training courses. Both are Free Software: Buildroot is released under GNU GPL version 2 (or later), Yocto is a mix of MIT and GPLv2.

We chose Buildroot because it is simpler to use and understand than Yocto: it re-uses existing technologies, such as kconfig and make, and by default it generates small, minimalist images. It is also free of corporate management and has an open community. A more in-depth comparison of the two projects is available online.

The main product of Buildroot is the root filesystem image. We replaced the default rootfs mentioned above, as well as the kernel, with the ones generated by Buildroot; the OP-TEE and U-Boot images and their configuration come from the OP-TEE build output. You can follow the tutorial below to achieve this.

Note: In the meantime, the OP-TEE build project also started using Buildroot, but at the time of this writing, the RPi3 still uses gen-rootfs.

Buildroot can be downloaded from buildroot.org as a tarball, or alternatively cloned from their git:

$ git clone git://git.buildroot.net/buildroot
$ cd buildroot

Check out the tag for the preferred version, for example:

$ git checkout tags/2018.02 -b 2018.02

Note: Buildroot releases are made every 3 months, in February, May, August and November. Long Term Support is available: every YYYY.02 release is maintained for one year, with security, build and bug fixes.

Configuration

An optional out-of-tree build can be used to build different Buildroot configurations from the same source tree. This way you don't need to rebuild everything to switch back to a previous config. This can be useful for testing more than one or two configurations, but it is not necessary here, since the config below should work quite well.

From now on, the O= (output directory) and -C (source directory) make options can be omitted. The commands below should be run in the out-of-tree build directory.

The default build configuration for the RPi3 is provided in Buildroot. It contains the necessary settings to make a Linux system for the RPi3. Using the 64 bit version is important, because OP-TEE requires a 64 bit kernel. You can edit configs/raspberrypi3_64_defconfig directly, or make a new defconfig in configs/raspberrypi3_64_custom_defconfig:
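A minimal sketch of what such a custom defconfig could contain, based on the settings explained below (the hostname and banner values are examples):

```
# configs/raspberrypi3_64_custom_defconfig (sketch)
BR2_aarch64=y
BR2_TOOLCHAIN_EXTERNAL=y
BR2_TOOLCHAIN_EXTERNAL_LINARO_AARCH64=y
BR2_LINUX_KERNEL=y
BR2_TARGET_GENERIC_HOSTNAME="rpi3"
BR2_TARGET_GENERIC_ISSUE="Welcome to Buildroot"
BR2_TARGET_ROOTFS_TAR=y
```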

Put this file in the configs directory, and run make to create the .config file, which contains the previous settings and expands them (setting other dependent flags):

$ make raspberrypi3_64_custom_defconfig

To edit the configuration, use gconfig for GNOME, xconfig for Qt, or menuconfig/nconfig for a console menu.

$ make gconfig

Note: This only edits .config; to store the changes in the currently used defconfig too, use make savedefconfig.

Explanation of the configuration

If you use the defconfig above, these settings are already applied. The locations of the settings in the menuconfig/gconfig window are given in bullet points.

Toolchain

To reduce build time, and to be compatible with the kernel built with OP-TEE, we use the external toolchain provided by Linaro. With this setting, the toolchain will be downloaded, and not compiled like the internal toolchain.

Set Toolchain -> Toolchain -> Linaro AArch64 2017.11 (BR2_TOOLCHAIN_EXTERNAL_LINARO_AARCH64 = y). This is selected by default when you set the external toolchain option.

Kernel

We build the kernel with Buildroot too, since it is an integral part of the system, and this way it can be configured just like Buildroot, from the same directory, with linux-menuconfig. We use the same kernel as OP-TEE (see here and here).

Additional configuration is also needed for the kernel. This can be done with config fragment files, like rpi3.conf located in the OP-TEE project. These fragments are merged with the main Linux configuration file.

Note: Copy optee/build/kconfigs/rpi3.conf (available here) to the Buildroot folder if you have not done so already.

We also have to set a Device Tree Source file, which is needed by Linux to describe non-discoverable hardware, e.g., UART, I2C, some timers, etc. Check this site for a detailed explanation of the device tree.

Output Images

To extract the rootfs to the SD card, a tarball should be produced. By default, Buildroot generates an sdcard.img image file, which could be written directly to the SD card (this is done by the post-image script). This is unnecessary here, because a tarball is more practical for our purposes.

Delete System Configuration -> Custom scripts to run after creating filesystem images (BR2_ROOTFS_POST_IMAGE_SCRIPT=""). This step is needed because the post-image script would otherwise create sdcard.img.

An uncompressed rootfs.tar and the kernel image Image will be created in the output/images directory.

Post-build scripts are run before building the filesystem image, kernel and bootloader. Post-image scripts can be used to perform specific actions after all images have been created. See more about this here.

Other

The system hostname and banner can be set:

System Configuration -> System hostname (BR2_TARGET_GENERIC_HOSTNAME).

System Configuration -> System banner (BR2_TARGET_GENERIC_ISSUE).

To set the root password:

System configuration -> Root password

Buildroot provides many packages which can be added to the build in Target Packages.

Note: This script also installs S09_optee in /etc/init.d to autostart tee-supplicant. For TA development, OPTEE_APPS_PATH/out is the TA and CA output location.

Start the build with make. (For out-of-tree builds, call it from the created build directory.)

$ make

Copy the Image to the FIT image directory, and rebuild the FIT image.

For rootfs installation, simply wipe the rootfs partition of the SD card, and extract rootfs.tar to it. Don't forget to `sync` the mounted device before unmounting it, otherwise the rootfs might get corrupted and the Pi won't boot.

The next post will explain how a verified boot process can be implemented on the Raspberry Pi.

OP-TEE default build and installation on the Raspberry Pi
https://blog.crysys.hu/2018/06/op-tee-default-build-and-installation-on-the-raspberry-pi/ (Sun, 10 Jun 2018)

This blog post, written by Márton Juhász, is the first in a series of blog posts on transforming the Raspberry Pi into a security enhanced IoT platform.

This blog post explains how to build and install the default OP-TEE implementation for the Raspberry Pi 3. The easiest way is to follow the steps described in the corresponding git repo of OP-TEE. However, for the sake of completeness (and because some steps may actually be a bit confusing in the original description), we provide a comprehensive description here.

Prerequisites

Theoretically, you can use any Linux distribution to build OP-TEE. However, a few packages need to be installed before you can build and run it. Therefore, first install the following packages:

Install Android Repo

Note that you do not install a huge SDK here; it is simply a Python script that you download and put in your $PATH. That's all. To install the Repo, make sure you have a bin/ directory in your home directory and that it is included in your path:

Get the source code of OP-TEE

Note that the repo sync step will take some time if you aren’t referencing an existing tree.

Get the toolchains

Create the toolchains by:

$ cd build
$ make toolchains

Build OP-TEE

The repo manifests have been configured so that repo always automatically symlinks the Makefile to the correct device-specific makefile, which means that you can simply start the build by running:

$ make

Note: Remember to add -jX to make to run a parallel build. This step will also take some time.

Flash the device

The last step is to partition and format the memory card and to put the files onto it. This is not automated, since if anything goes wrong, it might, in the worst case, wipe one of your regular hard disks. Instead, there is another makefile target that tells you exactly what to do. Run the following command and follow the instructions:

$ make img-help

Note: If you want to avoid warnings, errors, or a reboot, start with an empty, unpartitioned memory card.

Boot up the device

With all files on the memory card, put the memory card into the Raspberry Pi 3 and boot up the system. On the UART interface, you will see the system booting up.

Load tee-supplicant

Theoretically tee-supplicant is already loaded (check with $ ps aux | grep tee-supplicant). If it’s not running, then start it by typing:

$ tee-supplicant &

Run xtest

The entire xtest test suite was deployed when you ran $ make in the previous steps, i.e., in general there is no need to copy any binaries manually. Everything has been put into the root FS automatically. So, to run xtest, simply type:

$ xtest

If everything went well, then xtest should end with something like this:

Sources

Enhancing the Security of the Internet of Things
https://blog.crysys.hu/2018/06/enhancing-the-security-of-the-internet-of-things/ (Sun, 10 Jun 2018)

The Internet has grown beyond a network of laptops, PCs, and large servers: it also connects millions of small embedded devices. This new trend is called the Internet of Things, or IoT in short, and it enables many new and exciting applications. At the same time, IoT also comes with a number of risks related to information security. The lack of security, however, cannot be tolerated in certain applications of IoT, including connected vehicles and smart factories. In those applications, security failures may lead to substantial physical damage or monetary loss. Therefore, one of the biggest challenges today, which hinders the application of IoT technologies in certain application areas, is the lack of security guarantees.

In the CrySyS Lab, one of our main research topics is the design and development of security enhancing technologies for IoT systems. We believe this will enable the use of IoT in a wider range of applications. However, securing IoT systems is challenging, because IoT devices have resource constraints and they may not be capable of running traditional security mechanisms. Another problem is that typical IoT systems contain a large number of those devices, which makes the management of security on them a serious practical issue. At the same time, in typical application environments, IoT devices are not directly connected to the Internet, but they are using gateway devices. Gateways usually have more resources and they may be physically better protected, so they can perform security functions to protect themselves and they can monitor the operation of the simpler IoT devices that they serve. Therefore, placing IoT devices behind gateways and protecting the gateways from cyber attacks seems to be a technically feasible, scalable, and cost-efficient approach to increasing the security of IoT systems.

Gateways directly face the Internet and they may be subject to various attacks, so it is important to implement strong security mechanisms on them. The following is a list of mechanisms that we believe to be essential for securing the operation of IoT gateway devices:

Hardware root of trust: While gateways can be physically better protected than IoT devices themselves, they are still not strongly tamper resistant, as providing real tamper resistance for the entire device would be too expensive. This means that attackers with physical access to the gateway can compromise it. Therefore, it is important that the gateway has at least some tamper resistant module inside that can withstand physical attacks and that can serve as a root of trust for the other security functions listed below. Such a tamper resistant module can be used to store sensitive cryptographic secrets (e.g., the private key used to authenticate the device) to which access is strictly controlled.

Verified boot process: It must be ensured that after a reset, the gateway boots into a secure state. This can be achieved by digitally signing the software layers loaded during the boot process, and enforcing each layer to verify the digital signature on the next layer before it is loaded and executed. If verification fails, the boot process is halted. The public signature verification keys needed for this can be stored inside the software layers themselves, as the signature on a software layer also protects the integrity of the key stored in it. The very first bootloader component must be stored in and executed from unmodifiable ROM memory, and it must use a public signature verification key that is stored in physically write-protected memory (e.g., a one-time programmable memory where the key is written during device initialization in a secure environment) or whose protection can be traced back to the hardware root of trust. All these mechanisms ensure that the code loaded and executed after reset cannot be modified on the gateway, not even if the gateway is compromised at run-time and executes malicious code.

Security hardened firmware/OS: The firmware/OS running on the device must be hardened by disabling unused services and enabling available protection mechanisms that are optional and not used by default. Hardening reduces the attack surface and makes the gateway more resistant to typical known attacks.

Trusted execution environment: It is desirable to execute sensitive security functions in a trusted execution environment (TEE), which we can better trust to behave correctly and to protect sensitive secrets (e.g., shared symmetric keys and private signature generation keys) during computations than the main firmware/OS itself. Such a TEE could be provided by a security co-processor to which sensitive operations can be delegated, but this approach may be too expensive in practice. A more cost efficient solution is a software based TEE with some hardware support, such as ARM's TrustZone technology.

Run-time firmware/OS integrity monitoring: While the verified boot process ensures that the gateway boots into a known good state after reset, it does not prevent compromising the device at run-time by exploiting some software vulnerability. This may happen despite firmware/OS hardening, because hardening is just a best-effort approach, and it cannot eliminate the entire attack surface. Hence, it is useful to periodically check the integrity of the firmware/OS and identify malicious components, suspicious network connections, or any other anomaly that may indicate a run-time compromise. However, this integrity verification should not be performed by the firmware/OS itself, as it may already be compromised and the result of the check may not be reliable. Instead, integrity verification must be performed by a trusted application running in the TEE and having access to the entire memory of the firmware/OS. Yet, integrity verification itself is far from trivial, even if it executes in a trusted execution environment. The integrity verification code may extract the list of the currently running processes and compare it to a whitelist, or it can even check whether the processes in memory have the right hash values and do not contain any potentially malicious additional code. If a compromise is detected, the gateway can be forced to reboot and get back to a known good state.
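The whitelist comparison can be illustrated with a toy sketch (file paths are hypothetical, and a real implementation would run inside the TEE and hash process memory rather than files on disk):

```shell
# Record known-good hashes of critical components into a whitelist.
printf 'known good code\n' > /tmp/critical-component
sha256sum /tmp/critical-component > /tmp/integrity-whitelist
# Later, periodically re-verify; any mismatch indicates a possible
# run-time compromise and can trigger a reboot to a known good state.
sha256sum -c /tmp/integrity-whitelist || echo "integrity violation detected"
```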

Remote attestation of state: Run-time integrity checking can be triggered by an on-board secure timer (or watchdog mechanism), or it can be invoked remotely by an operator. In the latter case, it may make sense to send a response to the caller informing him/her about the result of the integrity verification procedure. In order to prevent a compromised device from falsifying the response, it should be digitally signed. So essentially, the trusted integrity verification application running in the TEE can send a digitally signed attestation report that can be verified by anyone. The private signature generation key must be protected by the TEE and secure storage mechanisms traceable back to the hardware root of trust.
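The signing step can be sketched with OpenSSL (file and key names are illustrative; in practice the private key never leaves the TEE's secure storage):

```shell
# The trusted application signs the attestation report with the
# TEE-protected private key (generated here only for illustration).
openssl genpkey -algorithm RSA -pkeyopt rsa_keygen_bits:2048 -out tee.key 2>/dev/null
echo "attestation report: firmware integrity OK" > report.txt
openssl dgst -sha256 -sign tee.key -out report.sig report.txt
# The operator verifies the report with the corresponding public key.
openssl pkey -in tee.key -pubout -out tee.pub
openssl dgst -sha256 -verify tee.pub -signature report.sig report.txt
```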

Secure and fail-safe remote firmware/OS update: The immediate response to a run-time compromise could be rebooting the gateway; however, in the long run, the vulnerability that made the compromise possible should also be eliminated, and a security patch or the entire fixed firmware/OS should be downloaded to and installed on the gateway. This requires a secured firmware/OS update process, which is also fail-safe, meaning that if something goes wrong and the new firmware/OS does not work correctly, there must be a way to revert to the old version, at least temporarily. In addition, the revert feature should not introduce the possibility of forcing the gateway to boot an old, potentially vulnerable version of the firmware/OS, when actually a newer version is available and functions correctly.

Besides the essential security features described above, the following mechanisms could also be useful:

Disk encryption: In order to protect the data permanently stored on the gateway device, its persistent memory can be encrypted with standard disk encryption tools. The key or passphrase needed to decrypt that memory content can be supplied by a trusted application running in the TEE, which also takes care of the secure storage of the key/passphrase using mechanisms that are traceable back to the hardware root of trust.

Secure communications/remote access: The gateway should be able to communicate securely with its operator. This is needed not only for data transmission but also for providing secure remote access to the device for configuration and management purposes. Secure communications can be implemented with cryptographic mechanisms. However, long-term secret passwords or private keys needed for authenticating the gateway to the operator when setting up a secure communication session should not appear in memory accessible to the potentially compromised firmware/OS. This means that those secrets must be stored in secure storage and the cryptographic operations that use them should be implemented by trusted applications executed in the TEE.

In the spring semester of 2018, in a set of student semester projects, we created proof-of-concept (PoC) implementations for some of the essential security features (verified boot, OS hardening, run-time integrity verification) described above. We used a Raspberry Pi embedded computer as the IoT gateway platform in our pilots. This ARM based platform supports TrustZone technology and the OP-TEE trusted execution environment, which our implementations heavily rely on. At the same time, unfortunately, the Raspberry Pi does not have a hardware root of trust, so our PoC implementations are incomplete in this sense. Yet, creating these PoC implementations was still very useful as a learning process for us. Our plan is to port these implementations to other, more secure embedded platforms that also feature some hardware root of trust element.

While working on the PoC implementations, we realized that for many of the issues that we encountered during development, there is no easily usable documentation available on the Web. Information can be found here and there on different forums and GitHub repositories, but not in a single place and not in a comprehensive, easy-to-understand form. So we decided to share our know-how and experience by publishing a series of blog posts that one could follow in a step-by-step manner to repeat what we did. More specifically, our blog posts cover the following topics:

We hope these blog posts will be helpful to the community working on IoT security and, in particular, to those using the Raspberry Pi and aiming at making it a bit more secure than the baseline platform that one gets by default.

Recent technological advancements have enabled the collection of large amounts of personal data at an ever-increasing rate. Data analytics companies such as Cambridge Analytica (CA) and Palantir can collect rich information about individuals' everyday lives and habits from big data-silos, enabling profiling and micro-targeting of individuals, for example in political elections or predictive policing. As has been reported at several major news outlets, already in 2015 approximately 50 million Facebook (FB) profiles were harvested by Aleksandr Kogan's app, "thisisyourdigitallife", through his company Global Science Research (GSR) in collaboration with CA. This data has been used to draw a detailed psychological profile for every person affected, which in turn enabled CA to target them with personalized political ads, potentially affecting the outcome of the 2016 US presidential elections. Whether CA used similar techniques at the time of the Brexit vote, elections in Kenya and an undisclosed Eastern European country (and several other countries) is under investigation. Both Kogan and CA deny the allegations and say they have complied with regulations and acted in good faith.

This blog post does not take sides in this debate; rather, it provides technical insight into the data collection mechanism, namely collateral information collection, that has enabled harvesting the FB profiles of the friends of app users. In this context, the term collateral damage refers to the privacy loss these friends suffer. In a larger nexus, this issue is part of interdependent privacy: your privacy unavoidably depends on the actions of others (in this case, your friends).

Collateral information collection

Facebook has morphed into an immense information repository, storing individuals' personal information and logging interactions between users and their friends, groups, events and pages. It also offers third-party applications (apps) developed by third-party application providers (app providers). When a user installs an app on Facebook, the app is granted access to the user's profile information. By accepting the permissions, the user allows the app to collect personal and often sensitive information, such as profile image, dating preferences, religious and political interests.

From a technical standpoint, Facebook provides an Application Programming Interface (API) and a set of permissions that allows third-party apps to gain access and transfer the information of users to app providers. Up until 2014 (2015 for already existing apps, Graph API v1.0) the profile information of a user could also be acquired when a friend of a user installed an app effectuating privacy interdependence on Facebook. Facebook claims that since 2015 (Graph API v2.0), this problem has been mitigated by changing the permission system, requiring mutual consent and mandating app reviews. A user who shared personal information with his/her friends on Facebook had no idea whether a friend had installed an app that also accessed the shared content. In short, when a friend of a user installed an app, the app could request and grant access to the profile information of a user such as the birthday, current location, and history. Such access took place outside the circle of trust, the user was not aware whether a friend had installed an app collecting his/her information; collateral information collection was enabled only based on the friend’s consent and happened without the consent of the user herself. By default, almost all profile information of a user could be accessed by their friends’ applications, unless they manually unchecked the relevant boxes in “Apps others use”. Note that, in some cases, one or more app providers may offer several applications and thus potentially construct a more complete personal profile via data fusion. Such fusion is straightforward to carry out, as all users are assigned a unique ID on Facebook that carries over to any app. This “feature” is still available and will not be affected by FB CEO Mark Zuckerberg’s recent privacy-enhancing announcements.

Collateral damage

We have been studying the interdependent privacy issues of third-party FB apps since 2012, showing the existence of the problem, investigating whether FB users cared about the issue and trying to quantify its likelihood and impact. Our main findings were the following:

Is it likely that an installed application enables collateral information collection?

To answer this question, we computed the probability of the following event in the Facebook ecosystem: one of the friends of a user installs an application that requests friend permissions and thereby enables collateral information collection. Taking into consideration the FB network topology and app adoption dynamics, we concluded that the likelihood of collateral information collection for a given user depends on the number of her friends and the popularity of the app under study. Assuming a single app with more than 5 million active users (such as TripAdvisor), this risk materializes for an average user with roughly 80% probability.
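A simplified version of this computation can be sketched as follows. Assume (our illustrative simplification, not the paper’s exact model) that each friend installs the app independently with the same probability; then the risk for a user with F friends is 1 − (1 − p)^F. With roughly 5 million app users out of about 1.1 billion Facebook users and an average of around 340 friends (illustrative figures), this already lands near the 80% reported above.

```python
def collateral_risk(p_install: float, n_friends: int) -> float:
    """Probability that at least one of n_friends installs the app,
    assuming independent installs with probability p_install each."""
    return 1.0 - (1.0 - p_install) ** n_friends

# Illustrative assumptions: ~5M app users out of ~1.1B Facebook users,
# ~340 friends for an average user.
p = 5e6 / 1.1e9
risk = collateral_risk(p, 340)
print(round(risk, 2))  # ≈ 0.79
```

The real analysis accounts for the network topology and adoption dynamics rather than assuming independence, but this back-of-envelope model shows why even a modestly popular app puts most users at risk.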

How significant is the collateral information collection?

Utilizing real data from the Facebook third-party apps ecosystem (from the AppInspect study by Huber et al.), we computed the proportion of profile information collected by apps installed by the friends of a user.

We identified that almost half of the popular apps (those with more than 10,000 monthly active users) that enable collateral information collection collected the photos of a user’s friends, with location and work history being the second and third most popular attributes. 72% of these apps collected exactly one profile item in a collateral fashion; the remaining 28% collected between 2 and 11, which amounts to an average of 2 profile items collected per app. We also showed that some app providers do acquire more complete user profiles by offering multiple apps.
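As a quick consistency check on these figures (our own back-of-envelope arithmetic, not part of the study): if 72% of the apps collect exactly one item and the overall average is 2 items per app, the remaining 28% must collect about 4.6 items on average, comfortably inside the reported 2–11 range.

```python
# 0.72 * 1 + 0.28 * avg_rest = 2  =>  solve for avg_rest,
# the average number of items collected by the 28% of apps
# that collect more than one profile item.
avg_rest = (2.0 - 0.72 * 1.0) / 0.28
print(round(avg_rest, 1))  # ≈ 4.6
```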

Is collateral information collection a risk for the protection of the personal data of Facebook users?

Through the prism of the new European General Data Protection Regulation (GDPR), we first clarified who the data controllers and data processors most likely are. Second, we verified whether collateral information collection is an adequate practice from a data protection point of view. Finally, we identified who is accountable. We concluded that such collateral information collection is likely to result in a risk for the protection of the personal data of Facebook users. The causes are the lack of notification (transparency as per Article 5 GDPR) and consent (as per Articles 6 and 7 GDPR), the non-existence of privacy by default in Facebook’s privacy settings, and the amplifying effect of data fusion and, potentially, profiling.

The collateral information collection goes far beyond the legitimate expectations of users and their friends. Consent can only be given by the person whose data is collected, not by the friend – the only exceptions are cases where the user cannot give consent (e.g. children under a certain age, in which case parents can give consent). One could claim that consent has implicitly been given, since Facebook’s application settings allow users to control access: users can uncheck the appropriate privacy settings (i.e., the “Apps Others Use” sub-menu). However, consent needs to be a “freely given, specific, informed and unambiguous indication of the data subject’s wishes […] by a statement or by a clear affirmative action” (art. 4 (11) GDPR); therefore, leaving a box unchecked is not sufficient for consent. Adding to this, “Apps Others Use” has not really functioned since the deprecation of Graph API v1.0 in 2015: it has remained in the user interface as eye candy, but it has not resulted in more or less protected profile items.

Are users concerned that apps installed by their friends can collect their profile data on Facebook?

We also investigated the opinion of Facebook users: are they concerned about collateral information collection? We designed a questionnaire and distributed it among 114 participants to identify their concerns about collateral information collection, the lack of transparency (notification) and not being asked for their approval (consent). We found that the vast majority of our participants, 77%, are very concerned and would like proper notification and control mechanisms regarding collateral information collection. On top of that, their concern is bidirectional: they would like both to notify their friends and to be notified by their friends when an installed app enables collateral information collection. They would also like to be able to restrict which apps can access their own and their friends’ profile data, in both directions.

Outlook

Our academic study was based on real data about which permissions FB apps actually request, and thus which personal profile items they can gather. We believe that our computations with regard to collateral damage showed that it is indeed a practically relevant issue and a worthy research topic. Strengthening our case, we found that collateral information collection is problematic under the new European data protection legislation and that users are genuinely concerned about it. However, it was the detailed, multi-angle media coverage of the Facebook/Cambridge Analytica case that really put our numbers into context and increased interest in our research. Finally, it is important to note that Kogan’s “thisisyourdigitallife” is only one of many FB apps that have collected friends’ personal data. In fact, the interdependent privacy/collateral damage issue is not limited to the FB platform; it is fairly ubiquitous in our connected age: it is present on mobile platforms, in cloud apps and location-based services, and even in personal genomics services.

A special thanks to Jessica Schroers (CiTiP, KU Leuven) for her expert opinion on legal implications.

Territorial Dispute – NSA’s perspective on APT landscape

Boldizsár Bencsáth (Boldi) will give a presentation at the Kaspersky Security Analyst Summit on Friday, 09/03/2018. The presentation is based on a technical paper describing findings about the modules and information in the April 2017 Shadow Brokers leak. In particular, the leaked material categorizes external APT attackers, labeling them SIG1 to SIG45. For more details, please check the paper, and do not forget the corresponding external sample hash list text file.