Low-level iOS forensics

iOS filesystem encryption and data protection mechanisms are now well
documented and supported by many forensics tools. iOS devices use
NAND flash memory as their main storage area, but physical imaging
usually refers to a "dd image" of the logical partitions. The iOS
Flash Translation Layer for current devices is software-based
(implemented in iBoot and the kernel), which means that the CPU has
direct access to raw NAND memory. In this post we will describe how to
acquire a NAND image and use FTL metadata to recover deleted files on A4
devices. The information presented here is based on the great reverse
engineering work done by the iDroid/openiBoot team.

Reading the NAND memory

A NAND memory chip is made of multiple pages grouped into blocks. Each
page is split into two parts: the actual data area and a smaller spare
area that contains error correcting information, and optionally
metadata. Pages can be read and written, but a page must be erased
before new data can be written to it. This is the main physical limitation of NAND
storage. Erase operations are done at the block level: multiple pages
must be erased at once. Additionally, pages can only support a limited
number of erase/write cycles before presenting too many errors and
becoming unusable.

iOS devices use one or more identical chips addressed by their "chip
enable" number or CE. The actual geometry parameters (number of CEs,
number of blocks per CE, number of pages per block, page and spare area
sizes) depend on the device model and the total storage capacity. A
physical address is composed of the "chip enable number" (CE), and a
physical page number (PPN) on that chip.

Because of NAND limitations, translation mechanisms (FTL) are commonly
used to allow operating systems to use NAND memory as a regular block
device while optimizing its lifespan and performance under the hood. The
main goal is to reduce the number of erase operations, and spread them
evenly across all blocks. From a forensics point of view, the
interesting side effect is that when a logical block is overwritten at
the block device level, most of the time the older data is not erased
immediately at the physical level. Thus, working on the raw image of the
NAND can be very useful when searching for deleted data.
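
To illustrate why overwritten data tends to survive at the physical level, here is a toy log-structured page-mapping FTL. It is a deliberately simplified sketch and not Apple's implementation: the class name, the in-memory "flash" list and the absence of garbage collection are all assumptions made for the example.

```python
class ToyFTL:
    """Toy log-structured page-mapping FTL (illustration only)."""

    def __init__(self, num_physical_pages):
        self.flash = [None] * num_physical_pages  # None means "erased"
        self.mapping = {}                         # logical page -> physical page
        self.next_free = 0                        # next erased physical page

    def write(self, lpn, data):
        if self.next_free >= len(self.flash):
            raise RuntimeError("no erased page left (garbage collection needed)")
        # Never overwrite in place: program the next erased page instead.
        self.flash[self.next_free] = (lpn, data)
        self.mapping[lpn] = self.next_free
        self.next_free += 1

    def read(self, lpn):
        """Block-device view: only the latest version is visible."""
        return self.flash[self.mapping[lpn]][1]

    def stale_versions(self, lpn):
        """Raw view: older copies that are still physically present."""
        current = self.mapping.get(lpn)
        return [entry[1] for ppn, entry in enumerate(self.flash)
                if entry is not None and entry[0] == lpn and ppn != current]


ftl = ToyFTL(num_physical_pages=8)
ftl.write(5, b"secret")
ftl.write(5, b"zeroed")   # "overwrite" at the block device level
```

After the second write, the block device only returns the new contents, while a raw scan of the toy flash still finds the first version.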

It is possible to read the raw NAND using openiBoot, but currently USB
transfers are quite slow, which makes it impractical for dumping the
whole Flash memory.

Starting with iOS 3, a program called ioflashstoragetool can be found
on Apple iOS ramdisks. This utility can perform many low-level
operations related to flash storage, and can read the raw NAND pages and
spare areas (without performing any kind of decryption). These
functionalities are exposed by the IOFlashControllerUserClient kernel
service.

In iOS 5, most of the functions exposed by this IOKit interface were
removed. To create a dump using this interface, we can boot a ramdisk
using an older iOS 4 kernel. This works well, but we then lose the
ability to use the existing kernel code to bruteforce the newer iOS 5
keybags. In order to dump the NAND with the iOS 5 kernel, we
re-implemented the part of the
IOFlashControllerUserClient::externalMethod function responsible for
the read functionality. When our NAND dumper runs under the iOS 5
kernel, it replaces this function with one that handles the
kIOFlashControllerReadPage selector.

Additionally, we can set the nand-disable-driver boot flag to prevent
high-level access to the NAND and make sure it is not modified during
the acquisition.

iOS Flash Translation Layer

The Flash Translation Layer used on iOS devices is based on the
Samsung Whimory FTL (WMR). The openiBoot team did a great job at
reverse-engineering it, so we could understand the mechanisms and
structures used. Whimory translation code is split into two layers: VFL
and FTL.

The Virtual Flash Layer (VFL) is responsible for remapping bad blocks
and presenting an error-free NAND to the FTL layer. The VFL layer knows
the physical geometry and translates virtual page numbers used by the
FTL to physical addresses (CE number + physical page number).
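
As a rough sketch of what this translation looks like, the function below maps a virtual page number to a (CE, physical page number) pair. The interleaving scheme (consecutive virtual pages spread across CEs) and the shape of the bad block `remap` table are assumptions made for illustration; the real VFL code in openiBoot is more involved.

```python
def virtual_to_physical(vpn, num_ce, pages_per_block, remap=None):
    """Map a virtual page number to (ce, physical page number).

    Assumptions (hypothetical, for illustration):
    - pages of a virtual block are interleaved across all CEs;
    - `remap` maps (ce, block) to a replacement block for bad blocks.
    """
    remap = remap or {}
    pages_per_vblock = num_ce * pages_per_block
    vblock, offset = divmod(vpn, pages_per_vblock)
    page_in_block, ce = divmod(offset, num_ce)
    # Bad blocks are transparently replaced by a spare block on the same CE.
    block = remap.get((ce, vblock), vblock)
    return ce, block * pages_per_block + page_in_block
```

With 4 CEs and 128 pages per block, `virtual_to_physical(5, 4, 128)` lands on CE 1, and passing `remap={(0, 1): 4000}` redirects pages of virtual block 1 on CE 0 to physical block 4000.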

The FTL layer operates over VFL, and presents the block device interface
to the operating system. It translates block device logical page numbers
(LPNs) to virtual page numbers, and handles wear leveling and garbage
collection of blocks containing outdated data.

On devices that support hardware encryption, all pages that contain
data structures related to VFL and FTL are encrypted with a static
metadata key.

The two FTL subsystems have evolved through the
various devices and iOS versions:

Since iOS 4.1, some of the newer devices are equipped with PPN
(Perfect Page New) NAND that uses a specific PPNFTL. Here PPN names the
NAND type, not the physical page number defined earlier.

PPN devices have their own controller, running a firmware that can be
upgraded through the IOFlashControllerUserClient interface, but most
of the FTL work seems to still be done in software, using YaFTL on top
of a new PPNVFL.

Based on the openiBoot code, we wrote a minimal read-only Python
implementation of YaFTL/VSVFL to get the block-device view over raw NAND
images. Combined with a Python HFS+ implementation, this allows
extraction of the logical partitions to get the equivalent of a
dd-image. We then need to understand the YaFTL mechanisms in order to
leverage the additional data available in the NAND image.

YaFTL

YaFTL is a page mapping FTL, where logical pages can be stored anywhere
and in any order on the physical media. It is quite similar to DFTL:
page mapping information (called index pages) is stored on Flash and
cached in memory on access.

YaFTL splits the virtual address space presented by the VFL layer into
superblocks. A superblock can be seen as a "row" of physical NAND
blocks. There are 3 types of superblocks, based on the type of pages
they store:

Context pages store all the information required to initialize YaFTL,
including the userTocPages array that points to up-to-date index
pages. They also store erase counters for wear leveling.

Index pages store pointers to user pages

User pages contain block device data
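
The lookup chain implied by these three page types (context, then index page, then user page) can be sketched as follows. This is a minimal illustration, assuming 32-bit little-endian entries, one index page covering `page_size // 4` consecutive logical pages, and `0xFFFFFFFF` as the "unmapped" marker; `read_virtual_page` is a hypothetical callback that returns the data of a virtual page.

```python
import struct

UNMAPPED = 0xFFFFFFFF  # assumed marker for "no page"


def translate_lpn(lpn, user_toc_pages, read_virtual_page, page_size):
    """Translate a logical page number to a virtual page number, or None."""
    entries_per_index_page = page_size // 4   # assumes 32-bit entries
    toc_index, slot = divmod(lpn, entries_per_index_page)
    # Step 1: the context tells us where the relevant index page lives.
    index_vpn = user_toc_pages[toc_index]
    if index_vpn == UNMAPPED:
        return None
    # Step 2: the index page holds the pointer to the user page.
    index_page = read_virtual_page(index_vpn)
    (user_vpn,) = struct.unpack_from("<I", index_page, slot * 4)
    return None if user_vpn == UNMAPPED else user_vpn


# Tiny fake device: 16-byte pages, so each index page covers 4 logical pages.
PAGE_SIZE = 16
fake_pages = {100: struct.pack("<4I", 7, UNMAPPED, 9, 10)}
fake_toc = [100, UNMAPPED]
```

On this fake device, `translate_lpn(2, fake_toc, fake_pages.__getitem__, PAGE_SIZE)` follows the index page stored at virtual page 100 and returns 9.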

The following figure summarizes the YaFTL translation process:

During normal operation, only one superblock of each type is “open” at a
given time: pages are written sequentially in a log-block fashion. When
the current superblock is full, YaFTL finds a free superblock to
continue the process. Outdated user data is only erased when the garbage
collector kicks in.

The last pages of Index and User superblocks are used to store the BTOC
(block table of contents). For User blocks, the BTOC lists the Logical
Page Numbers of all the pages stored in the block. For Index blocks, the
BTOC indicates the first LPN pointed to by each index page.
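
As an illustration of how a User block BTOC can be consumed, the sketch below turns it into (virtual page, LPN) pairs. The on-flash encoding is an assumption here: one 32-bit little-endian LPN per page slot, in page order, with `0xFFFFFFFF` marking an unused slot. The function name and parameters are hypothetical.

```python
import struct

UNUSED_LPN = 0xFFFFFFFF  # assumed marker for a slot that holds no user data


def user_btoc_entries(first_vpn, btoc_bytes, page_count):
    """Return (vpn, lpn) pairs for the pages described by a user BTOC.

    first_vpn:  virtual page number of the first page of the superblock
    btoc_bytes: raw BTOC contents (concatenated BTOC pages)
    page_count: number of user page slots described by the BTOC
    """
    lpns = struct.unpack_from("<%dI" % page_count, btoc_bytes)
    return [(first_vpn + slot, lpn)
            for slot, lpn in enumerate(lpns)
            if lpn != UNUSED_LPN]
```

Note that the same LPN can appear several times in one BTOC when a logical page was rewritten while the superblock was open.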

FTL restore is performed at boot when the FTL was not unmounted properly
(after a kernel panic or hard reboot for instance) and the latest
context information was not committed to flash storage. The FTL restore
function has to examine all superblocks (using BTOCs to speed up the
process) in order to reconstruct the correct context.

Spare area metadata

iOS reserves 12 bytes in the spare area of each page to store metadata.
The YaFTL metadata format is described in
openiBoot:

The lpn field allows the FTL code to check if the translation was
correct when reading a page. It is also used during the FTL restore
process, to identify pages in “open” superblocks that do not have a
BTOC.

The usn field records the global update sequence number at the time
the page was written. This number is incremented every time a new
version of the FTL context is committed or when a superblock is full and
a new one is opened. The usn makes it easy to sort superblocks by age.
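
A small parser for these 12 bytes could look like the sketch below. Only the lpn, usn and type fields are named in this post; the exact field order, the field widths and the two placeholder fields (`field_8`, `field_a`) are assumptions and should be checked against the openiBoot structure definition.

```python
import struct
from collections import namedtuple

# Assumed layout (12 bytes, little-endian): lpn, usn, one unknown byte,
# the page type, and a trailing unknown 16-bit field.
SPARE_FORMAT = "<IIBBH"
SpareData = namedtuple("SpareData", "lpn usn field_8 type field_a")


def parse_spare(meta):
    """Parse the (already unwhitened) metadata bytes of a spare area."""
    if len(meta) < struct.calcsize(SPARE_FORMAT):
        raise ValueError("spare metadata must be at least 12 bytes")
    return SpareData(*struct.unpack_from(SPARE_FORMAT, meta))
```

Once parsed, superblocks can be ordered by age with an ordinary sort on the `usn` attribute.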

Metadata whitening

Hardware encryption only applies to the page data and not to the
spare area. On recent devices running iOS 4, metadata stored in the
spare area is scrambled through a process called "metadata
whitening". The 12 bytes of the SpareData structure are XORed with
pseudorandom values depending on the physical page number. The algorithm
can be found in
openiBoot:

    static uint32_t h2fmi_hash_table[256];
    ...
    void h2fmi_init()
    {
        ...
        // This is a very simple PRNG with
        // a preset seed. What are you
        // up to Apple? -- Ricky26
        // Same as in 3GS NAND -- Bluerise
        uint32_t val = 0x50F4546A;
        for(i = 0; i < 256; i++)
        {
            val = (0x19660D * val) + 0x3C6EF35F;
            int j;
            for(j = 1; j < 763; j++)
            {
                val = (0x19660D * val) + 0x3C6EF35F;
            }
            h2fmi_hash_table[i] = val;
        }
        ...
    }

    error_t h2fmi_read_single_page(...)
    {
        ...
        if(h2fmi_data_whitening_enabled)
        {
            uint32_t i;
            for(i = 0; i < 3; i++)
                ((uint32_t*)_meta_ptr)[i] ^= h2fmi_hash_table[(i + _page) % ARRAY_SIZE(h2fmi_hash_table)];
        }
        ...
    }

Whether metadata whitening is enabled or not is indicated in the flags
field of the NANDDRIVERSIGN special page.
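
For offline processing of a NAND image, the same operation can be ported to Python. The sketch below follows the openiBoot listing quoted above (same seed, multiplier and increment, 763 generator steps per table entry); the function names are ours, and reading the metadata as three little-endian 32-bit words is an assumption based on the ARM target.

```python
import struct


def build_hash_table():
    """Rebuild the 256-entry whitening table from the fixed seed."""
    table = []
    val = 0x50F4546A
    for _ in range(256):
        # One step plus 762 more in the C code: 763 LCG steps per entry.
        for _ in range(763):
            val = (0x19660D * val + 0x3C6EF35F) & 0xFFFFFFFF
        table.append(val)
    return table


HASH_TABLE = build_hash_table()


def unwhiten_metadata(meta, page):
    """XOR the first 12 metadata bytes with the table words for `page`."""
    words = struct.unpack("<3I", meta[:12])
    plain = [w ^ HASH_TABLE[(i + page) % len(HASH_TABLE)]
             for i, w in enumerate(words)]
    return struct.pack("<3I", *plain) + meta[12:]
```

Since whitening is a plain XOR, applying `unwhiten_metadata` twice with the same page number returns the original bytes, which is a convenient sanity check.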

Recovering deleted files

Once the NAND image is acquired and the geometry is known, we can start
digging for deleted files. The first step is to build a lookup table that
references all available versions of each logical page. Two methods can
be used:

Read every spare area in the image to find pages where type is
PAGETYPE_LBN, and read the lpn field (bruteforce approach)

Loop through each non-empty user superblock, read BTOCs when
available or scan all pages if the block is not full
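
Either method ends up feeding the same kind of table. The sketch below shows the bruteforce variant under a few assumptions: the input is an iterable of already parsed `(vpn, lpn, usn, page_type)` tuples, the numeric value given to `PAGETYPE_LBN` is a placeholder (take the real constant from openiBoot), and within one superblock a higher virtual page number is treated as the more recent write.

```python
from collections import defaultdict

PAGETYPE_LBN = 0x10  # placeholder value, not verified against openiBoot


def build_version_table(spares):
    """Map each LPN to its versions as (usn, vpn) pairs, newest first.

    spares: iterable of (vpn, lpn, usn, page_type) tuples, one per page.
    """
    versions = defaultdict(list)
    for vpn, lpn, usn, page_type in spares:
        if page_type == PAGETYPE_LBN:      # keep user data pages only
            versions[lpn].append((usn, vpn))
    for entries in versions.values():
        # Sort by usn, then by vpn as a tie-break inside one superblock.
        entries.sort(reverse=True)
    return dict(versions)
```

The first entry of each list corresponds to the version the regular FTL translation would return, and the remaining entries are the outdated copies of interest.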

Once this lookup table is built, we can then easily access all available
versions of a given logical page. In order to recover deleted files in
the data partition, we implemented a simple algorithm, similar to the
HFS journal carving technique:

list all the file IDs in the data partition in its current state: we
use the regular FTL translation, and use the EMF key to decrypt data

get the location of the current catalog file and attribute file
(ranges of LBAs)

for each LBA belonging to the catalog file

  for each available version of the current LBA

    read the page for this version, decrypt it with the EMF key

    search the page for catalog file records whose file ID is not
    present in the current file IDs list (deleted files)

repeat the same process on the attribute file to find the encryption
keys for the deleted files identified previously (cprotect extended
attributes)

for each deleted file found

  loop through all the possible encryption keys and versions of the
  first logical block until the decrypted contents match a magic
  number (common file headers magic). See the
  isDecryptedCorrectly function (which can be improved).

  if the decryption key and the USN of the first block are found,
  read the next file blocks using that USN as reference. Another
  method is to start reading pages starting from the first file
  block we found, following the FTL “write order”: read until end of
  superblock, and then continue in the next one with a higher USN
  and so on until all file blocks are found.
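
The key and version search in the last step can be sketched as below. This is an illustration and not the actual isDecryptedCorrectly code: `looks_decrypted`, `find_key_and_version` and the `decrypt` callback are hypothetical names, the magic list is a small sample of well-known file headers, and the XOR "cipher" in the demo only stands in for the real per-file decryption.

```python
# A few well-known file header magics (JPEG, PNG, SQLite 3, binary plist).
MAGICS = (b"\xff\xd8\xff", b"\x89PNG\r\n\x1a\n",
          b"SQLite format 3\x00", b"bplist00")


def looks_decrypted(data):
    """Heuristic: the plaintext starts with a known file header."""
    return data.startswith(MAGICS)


def find_key_and_version(candidate_keys, first_block_versions, decrypt):
    """Try every (key, version) pair for the first block of a deleted file.

    first_block_versions: list of (usn, ciphertext) for the first LBA.
    decrypt: callable(key, ciphertext) -> plaintext (supplied by the caller).
    Returns (key, usn) for the first match, or None.
    """
    for key in candidate_keys:
        for usn, ciphertext in first_block_versions:
            if looks_decrypted(decrypt(key, ciphertext)):
                return key, usn
    return None


def xor_decrypt(key, data):
    """Stand-in cipher for the demo only (NOT the real file encryption)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


demo_versions = [(5, b"\x00" * 8),
                 (9, xor_decrypt(b"\x5a", b"\xff\xd8\xff\xe0JFIF"))]
demo_result = find_key_and_version([b"\x01", b"\x5a"], demo_versions, xor_decrypt)
```

In the demo, only the second key applied to the version with USN 9 yields a JPEG header, so that pair is returned and the USN can serve as the reference for reading the following blocks.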

This naive algorithm gives good results on “static” files like pictures,
where the whole file is written once and never updated. For files such
as SQLite databases it would require some more logic to recover
consistent snapshots of the successive versions, by detecting writes to
the file header or tracking modifications to the catalog file entry
(file modification date) for instance.

One file that could be interesting to recover is the system keybag. If
an attacker was able to access the first version of the system keybag
(when no passcode is set, right after the firmware restore), they could
then access all the class keys without having to attack the current
user's passcode. However, it is not possible to exploit older versions
of the system keybag because of a second layer of encryption: the
systembag.kb payload is encrypted with the BAG1 key, which is stored
in the effaceable area and regenerated randomly every time a new version
of the file is written to disk (when the user changes their passcode). This
mechanism was clearly designed to prevent such attacks, as explained in
the "Securing application data" talk from Apple WWDC 2010
(Session 209).

iOS 3.x wipe vulnerability

Once we had the ability to read the raw NAND contents, we took another
look at the iOS 3 mechanisms. At that time, the whole data partition
(including file contents) was encrypted with the EMF key, which was
stored (encrypted) in the last logical block of the partition. Since
there was no effaceable area at that time, we supposed that this last
logical block was managed by the FTL just like the rest of the
partition.

By acquiring a NAND image right after wiping an iOS 3 device,
it is indeed possible to find multiple versions of this logical block:
one with the new EMF key generated during the wipe, and the older one
that was only overwritten at the block device level. Thus by using the
old key and collecting the old data partition pages (based on the USN of
the wipe), it is possible to (partially) reconstruct the wiped data
partition. Of course, this vulnerability has been fixed since iOS 4 by the
effaceable area, which allows encryption keys to be erased securely.
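
Selecting the pre-wipe pages by USN can be sketched as follows, assuming a table that maps each logical page number to a list of (usn, virtual page) pairs and a known USN for the wipe. The function name and the data layout are hypothetical.

```python
def snapshot_before(versions, wipe_usn):
    """For each LPN, pick the newest version written before the wipe.

    versions: dict mapping lpn -> list of (usn, vpn) pairs.
    Returns a dict mapping lpn -> vpn; LPNs without any pre-wipe
    version left on flash are omitted (hence a partial reconstruction).
    """
    snapshot = {}
    for lpn, entries in versions.items():
        older = [(usn, vpn) for usn, vpn in entries if usn < wipe_usn]
        if older:
            snapshot[lpn] = max(older)[1]
    return snapshot
```

Logical pages whose older versions were already garbage collected simply do not appear in the result, which is why the reconstruction is only partial.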

The NAND acquisition and carving tools are now available on the
iphone-dataprotection repository. Additional details are also
available on the wiki. Finally, many thanks to Patrick Wildt and the
openiBoot team for their great work on the iOS FTL that allowed us to
build these tools.