BLOG

November 19, 2015

Back in 2012, Mark and I detailed a number of iOS kernel mitigations that were introduced in iOS 6 to prevent an attacker from leveraging well-known exploitation techniques such as the zone free list pointer overwrite. Most of these mitigations rely on entropy (of varying degree) provided by the kernel, and are therefore supported by a separate random number generator known as the early_random() PRNG. As this generator is fundamental to the robustness of these mitigations, and has received additional improvements in iOS 7, it is unarguably a very interesting target that deserves further study.

The initial version of the early random PRNG, found in iOS 6, leveraged a fairly simple generator that derived values directly from the CPU tick count and a seed (provided by iBoot). Although the generator was able to create somewhat unpredictable values, it had a serious defect in that the outputs were well-correlated, especially in the case of successively generated values. Additionally, the seed was only combined with the higher 32 bits of the output, hence the lower bits (typically the only part used on 32-bit iOS devices) were unaffected by the seed value. Thus, in an attempt to improve the early random PRNG in iOS 7, Apple decided to leverage an entirely new generator. Specifically, iOS 7 uses a linear congruential generator, a PRNG well-known both for its strengths and (notable) weaknesses.

An LCG's quality is essentially determined by its choice of parameters. Although the early random PRNG is clearly inspired by glibc random_r() and ANSI C rand(), it is alarmingly weak in practice. Notably, early_random() in iOS 7 can only produce 2^19 unique outputs, with a maximum period of 2^17 (the length of the sequence of unique outputs before it starts over). This is well below the size of the possible output space (64 bits), and may allow an attacker to predict values with very little effort. In particular, we found that an unprivileged attacker, even when confined by the most restrictive sandbox, can recover arbitrary outputs from the generator and consequently bypass all the exploit mitigations that rely on the early random PRNG. These findings have been detailed in the following slides and white paper, which include suggestions on how to improve early_random() in future iOS versions.
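To make the notion of a period concrete, here is a minimal sketch that measures how quickly the low bits of a generic power-of-two LCG repeat. It uses the example constants from the ANSI C rand() reference implementation, and the helper names (lcg_next, low_bits_period) are hypothetical. This is not the early_random() code and does not use its parameters.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative LCG step: x' = (a*x + c) mod 2^31, with the example
 * constants from the ANSI C rand() reference implementation.
 * These are NOT the parameters used by early_random(). */
uint32_t lcg_next(uint32_t x)
{
    return (1103515245u * x + 12345u) & 0x7fffffffu;
}

/* Count the steps needed for the low `bits` bits of the state to return
 * to their starting value. The low bits of the next state depend only on
 * the low bits of the current state, so this is their cycle length. */
uint32_t low_bits_period(uint32_t seed, unsigned bits)
{
    assert(bits >= 1 && bits <= 31);
    uint32_t mask = (1u << bits) - 1u;
    uint32_t x = seed & 0x7fffffffu;
    uint32_t steps = 0;
    do {
        x = lcg_next(x);
        steps++;
    } while ((x & mask) != (seed & mask));
    return steps;
}
```

With these constants the low k bits cycle after 2^k steps regardless of the seed, which is why a generator that exposes only a small slice of LCG state leaves an attacker with few candidates to enumerate.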

Privacy is a hot topic at the moment - it continues to dominate the headlines as new NSA incursions, celebrity phone hacks, and corporate breaches are reported on an increasingly regular basis. In response to this, a number of products have been brought to market that attempt to provide consumers with a greater level of privacy than typical devices allow for. In the phone market, one of the premier products to be released in recent years is undoubtedly the BlackPhone (http://www.blackphone.ch), which has been cited numerous times in tech publications as being one of the best available defenses against mass surveillance, as it provides full end-to-end encryption facilities for voice calls and text/MMS messaging.

While exploring my recently purchased BlackPhone, I discovered that the messaging application contains a serious memory corruption vulnerability that can be triggered remotely by an attacker. If exploited successfully, this flaw could be used to gain remote arbitrary code execution on the target's handset. The code run by the attacker will have the privileges of the messaging application, which is a standard Android application with some additional privileges. Specifically, it is possible to:

decrypt messages / commandeer SilentCircle account

gather location information

read contacts

write to external storage

run additional code of the attacker's choosing (such as a privilege escalation exploit aimed at gaining root or kernel-mode access, thus taking complete control of the phone)

The only knowledge required by the attacker is the target's Silent Circle ID or phone number - the target does not need to be lured into contacting the attacker (although the flaw is exploitable in this scenario as well).

This issue is now patched by both Silent Circle and Blackphone in the respective App Stores / Product updates.

The remainder of this post discusses the technical details of this vulnerability, citing the source code of the vulnerable application where appropriate. This code is available from Silent Circle's github repository (https://github.com/SilentCircle).

SilentText Messaging Application

The SilentText application bundled with Blackphone (and also made available as a standalone app for Android and iPhone) provides the ability for users to send text messages and share files over an encrypted channel. This encrypted channel is established and managed using the 'Silent Circle Instant Message Protocol' (SCIMP), which is tunneled over Silent Circle's XMPP servers. SCIMP provides end-to-end encryption, so that data exchanged in a given conversation cannot be decrypted by an eavesdropping third party (including Silent Circle). The SCIMP implementation supplied with SilentText contains a type confusion vulnerability that allows an attacker to directly overwrite a pointer in memory (either partially or in full). When successfully exploited, this can be used to gain remote, unauthenticated access to the vulnerable device.

Before discussing the vulnerability itself, a quick overview of SCIMP and YAJL (Yet Another JSON Library - a third party library relevant to the flaw) is provided.

The SCIMP Protocol

SCIMP is a simple message-oriented protocol, where messages are encoded as JSON objects, and then sent over XMPP. SCIMP messages are distinguished by a fixed header string "?SCIMP:", followed by a base64-encoded JSON object, followed by a terminator ("."). An example message looks like this.
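As a minimal sketch of the framing only (not taken from the SCIMP sources), the check below strips the "?SCIMP:" header and the "." terminator and returns the base64 text in between. The function name scimp_unframe is hypothetical, and base64 decoding is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SCIMP_HEADER "?SCIMP:"

/* Returns 1 and sets *payload / *payload_len to the base64 text between
 * the header and the trailing '.', or returns 0 if the framing does not
 * match. The payload points into msg and is not NUL-terminated. */
int scimp_unframe(const char *msg, const char **payload, size_t *payload_len)
{
    assert(msg && payload && payload_len);
    size_t hdr = strlen(SCIMP_HEADER);
    size_t len = strlen(msg);
    if (len < hdr + 1)
        return 0;                       /* too short for header + '.' */
    if (strncmp(msg, SCIMP_HEADER, hdr) != 0)
        return 0;                       /* missing "?SCIMP:" header */
    if (msg[len - 1] != '.')
        return 0;                       /* missing terminator */
    *payload = msg + hdr;
    *payload_len = len - hdr - 1;
    return 1;
}
```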

SCIMP messages have a message type, followed by a number of data fields depending on the message type. Message type can be one of the following:

commit - sent by the initiator wishing to establish a new session. Also can be used to re-key an existing session.

dh1 - sent by the remote party (responder) in response to a commit message as part of session establishment.

dh2 - sent by the initiator in response to a dh1 message as part of session establishment.

confirm - sent by the responder in response to a dh2 message indicating that session establishment was successful.

data - application-level data sent after a secure session has been established.

These messages are encoded in JSON using a single map object with the name of the map indicating the message type, and a variable number of string or integer-based variables within the map relevant to the specific message type. A JSON-encoded SCIMP message is shown.

The JSON serialization and de-serialization is handled by a third-party library named "Yet Another JSON Library", or libyajl. The source code for this library is available at http://lloyd.github.com/yajl. Understanding the basics of this API is relevant to the discovered SCIMP vulnerability, and so is briefly covered here.

JSON Parsing - The YAJL API

The YAJL library is initialized with a call to yajl_alloc(), which has the following prototype.

This function creates an opaque yajl_handle that is later passed as a parameter to yajl_parse(), the function responsible for parsing a block of JSON text. The first parameter of the yajl_alloc() function is a yajl_callbacks structure, which allows the caller to define a series of callback functions that will be invoked during JSON parsing when certain elements are encountered. The yajl_callbacks structure is as follows.

/** yajl is an event driven parser.  this means as json elements are
 *  parsed, you are called back to do something with the data.  The
 *  functions in this table indicate the various events for which
 *  you will be called back.  Each callback accepts a "context"
 *  pointer, this is a void * that is passed into the yajl_parse
 *  function which the client code may use to pass around context.
 *
 *  All callbacks return an integer.  If non-zero, the parse will
 *  continue.  If zero, the parse will be canceled and
 *  yajl_status_client_canceled will be returned from the parse.
 *
 *  \attention {
 *    A note about the handling of numbers:
 *
 *    yajl will only convert numbers that can be represented in a
 *    double or a 64 bit (long long) int.  All other numbers will
 *    be passed to the client in string form using the yajl_number
 *    callback.  Furthermore, if yajl_number is not NULL, it will
 *    always be used to return numbers, that is yajl_integer and
 *    yajl_double will be ignored.  If yajl_number is NULL but one
 *    of yajl_integer or yajl_double are defined, parsing of a
 *    number larger than is representable in a double or 64 bit
 *    integer will result in a parse error.
 *  }
 */
typedef struct {
    int (* yajl_null)(void * ctx);
    int (* yajl_boolean)(void * ctx, int boolVal);
    int (* yajl_integer)(void * ctx, long long integerVal);
    int (* yajl_double)(void * ctx, double doubleVal);
    /** A callback which passes the string representation of the number
     *  back to the client.  Will be used for all numbers when present */
    int (* yajl_number)(void * ctx, const char * numberVal,
                        size_t numberLen);

    /** strings are returned as pointers into the JSON text when,
     *  possible, as a result, they are _not_ null padded */
    int (* yajl_string)(void * ctx, const unsigned char * stringVal,
                        size_t stringLen);

    int (* yajl_start_map)(void * ctx);
    int (* yajl_map_key)(void * ctx, const unsigned char * key,
                         size_t stringLen);
    int (* yajl_end_map)(void * ctx);

    int (* yajl_start_array)(void * ctx);
    int (* yajl_end_array)(void * ctx);
} yajl_callbacks;

The ctx parameter received by each callback is the ctx value passed by the caller as the third parameter to yajl_alloc(), and can be anything the caller wishes.

Finally, it is possible for the caller to specify a custom allocator from which to allocate blocks of memory used to store JSON strings and so on. This can be achieved by filling out the second parameter to the yajl_alloc() function with callbacks to custom allocation and free routines. The yajl_alloc_funcs structure that encapsulates these callbacks is defined as follows.

/** A structure which can be passed to yajl_*_alloc routines to allow the
 *  client to specify memory allocation functions to be used. */
typedef struct
{
    /** pointer to a function that can allocate uninitialized memory */
    yajl_malloc_func malloc;
    /** pointer to a function that can resize memory allocations */
    yajl_realloc_func realloc;
    /** pointer to a function that can free memory allocated using
     *  reallocFunction or mallocFunction */
    yajl_free_func free;
    /** a context pointer that will be passed to above allocation routines */
    void * ctx;
} yajl_alloc_funcs;
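As a rough illustration of how such an allocator table might be filled in, the sketch below wires up a counting allocator. To stay compilable without the YAJL headers it redeclares the structure locally; the three function-pointer typedefs are assumed to match those in yajl_common.h, and alloc_stats, the counting_* routines and make_counting_allocator are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Local redeclarations so the sketch builds without YAJL installed.
 * The typedef shapes are an assumption about yajl_common.h. */
typedef void *(*yajl_malloc_func)(void *ctx, size_t sz);
typedef void (*yajl_free_func)(void *ctx, void *ptr);
typedef void *(*yajl_realloc_func)(void *ctx, void *ptr, size_t sz);

typedef struct {
    yajl_malloc_func malloc;
    yajl_realloc_func realloc;
    yajl_free_func free;
    void *ctx;
} yajl_alloc_funcs;

/* Hypothetical bookkeeping carried in the allocator's ctx pointer. */
typedef struct {
    size_t live;   /* blocks currently allocated */
    size_t total;  /* blocks ever allocated */
} alloc_stats;

static void *counting_malloc(void *ctx, size_t sz)
{
    alloc_stats *st = ctx;
    void *p = malloc(sz);
    if (p) { st->live++; st->total++; }
    return p;
}

static void *counting_realloc(void *ctx, void *ptr, size_t sz)
{
    alloc_stats *st = ctx;
    void *p = realloc(ptr, sz);
    /* Only a realloc of NULL creates a new block; sz == 0 is not handled. */
    if (p && !ptr) { st->live++; st->total++; }
    return p;
}

static void counting_free(void *ctx, void *ptr)
{
    alloc_stats *st = ctx;
    if (ptr) st->live--;
    free(ptr);
}

/* Build an allocator table whose ctx points at the caller's counters. */
yajl_alloc_funcs make_counting_allocator(alloc_stats *st)
{
    assert(st != NULL);
    yajl_alloc_funcs afs = { counting_malloc, counting_realloc, counting_free, st };
    return afs;
}
```

A table built this way would be passed as the second argument to yajl_alloc(); here it is only exercised directly through its function pointers.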

After a handle is allocated, a block of JSON text can be parsed by calling yajl_parse(), which has the following API.

As can be seen, this function simply takes a previously-created yajl_handle, followed by the block of text to be parsed and its length. Calling yajl_parse() on a block of text results in the user-specified callbacks defined earlier being called as each JSON element is encountered. To illustrate how the YAJL callback API is used in action, consider the following JSON block.

{"key1": 12345,"key2": "valueString","key3": { "innerkey": 67890 }}

When parsing the above block, the following sequence of callbacks will be invoked:

yajl_start_map() - called when parsing the initial "{" token

yajl_map_key() - called when parsing "key1"

yajl_integer() - called when parsing "12345"

yajl_map_key() - called when parsing "key2"

yajl_string() - called when parsing "valueString"

yajl_map_key() - called when parsing "key3"

yajl_start_map() - called when parsing the "{" token following "key3"

yajl_map_key() - called when parsing "innerkey"

yajl_integer() - called when parsing "67890"

yajl_end_map() - called when parsing the "}" token following "67890"

yajl_end_map() - called when parsing the final "}" token
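The event-driven flow can be reproduced without YAJL itself. The following is a minimal sketch, assuming a hypothetical cut-down callback table (mini_callbacks) and a toy parser that only understands maps, unescaped strings and integers. It records one character per callback so the order of events can be inspected; it is not YAJL's implementation.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback table modeled on a subset of yajl_callbacks.
 * As in YAJL, a callback returning zero cancels the parse. */
typedef struct {
    int (*on_integer)(void *ctx, long long v);
    int (*on_string)(void *ctx, const char *s, size_t len);
    int (*on_start_map)(void *ctx);
    int (*on_map_key)(void *ctx, const char *k, size_t len);
    int (*on_end_map)(void *ctx);
} mini_callbacks;

static const char *skip_ws(const char *p)
{
    while (isspace((unsigned char)*p)) p++;
    return p;
}

/* Scan a double-quoted string (no escape handling). Returns the position
 * after the closing quote, or NULL on error. */
static const char *scan_string(const char *p, const char **s, size_t *len)
{
    const char *end;
    if (*p != '"' || (end = strchr(p + 1, '"')) == NULL) return NULL;
    *s = p + 1;
    *len = (size_t)(end - (p + 1));
    return end + 1;
}

/* Parse one value (map, string or integer) and fire the callbacks. */
static const char *parse_value(const char *p, const mini_callbacks *cb, void *ctx)
{
    const char *s;
    size_t len;
    p = skip_ws(p);
    if (*p == '"') {
        p = scan_string(p, &s, &len);
        return (p && cb->on_string(ctx, s, len)) ? p : NULL;
    }
    if (*p != '{') {
        char *end;
        long long v = strtoll(p, &end, 10);
        return (end != p && cb->on_integer(ctx, v)) ? end : NULL;
    }
    if (!cb->on_start_map(ctx)) return NULL;
    p = skip_ws(p + 1);
    while (*p != '}') {
        p = scan_string(p, &s, &len);
        if (!p || !cb->on_map_key(ctx, s, len)) return NULL;
        p = skip_ws(p);
        if (*p != ':') return NULL;
        p = parse_value(p + 1, cb, ctx);
        if (!p) return NULL;
        p = skip_ws(p);
        if (*p == ',') p = skip_ws(p + 1);
        else if (*p != '}') return NULL;
    }
    return cb->on_end_map(ctx) ? p + 1 : NULL;
}

/* Context object: a small buffer holding one character per event. */
typedef struct { char buf[64]; size_t n; } trace;

static int put(void *ctx, char c)
{
    trace *t = ctx;
    if (t->n + 1 >= sizeof t->buf) return 0;   /* full: cancel the parse */
    t->buf[t->n++] = c;
    t->buf[t->n] = '\0';
    return 1;
}
static int t_integer(void *ctx, long long v) { (void)v; return put(ctx, 'i'); }
static int t_string(void *ctx, const char *s, size_t n) { (void)s; (void)n; return put(ctx, 's'); }
static int t_start_map(void *ctx) { return put(ctx, '{'); }
static int t_map_key(void *ctx, const char *k, size_t n) { (void)k; (void)n; return put(ctx, 'k'); }
static int t_end_map(void *ctx) { return put(ctx, '}'); }

/* Returns 1 if the whole text parsed; t->buf then lists the events. */
int trace_json(const char *json, trace *t)
{
    static const mini_callbacks cb = { t_integer, t_string, t_start_map, t_map_key, t_end_map };
    t->n = 0;
    t->buf[0] = '\0';
    const char *end = parse_value(json, &cb, t);
    return end != NULL && *skip_ws(end) == '\0';
}
```

Feeding the example JSON block through trace_json() leaves "{kiksk{ki}}" in the trace buffer, one character per callback in the order listed above ('{' and '}' for map start and end, 'k' for a key, 'i' for an integer, 's' for a string).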

Vulnerability Details

The vulnerability in question occurs during JSON deserialization of incoming SCIMP messages within libscimp, which is performed by the scimpDeserializeMessageJSON() function (defined in src/SCimpProtocolFmtJSON.c). The code is shown (edited for brevity).

This function simply allocates and zeroes out a context structure (jctx, which is a SCimpJSONContext structure), establishes a YAJL handle, then invokes the yajl_parse() function. The following code snippet shows sParse_start_map() and sParse_end_map(), the pair of callbacks invoked whenever an opening brace ('{') or a closing brace ('}') is encountered, respectively.

This is a very simple structure used to contain a SCIMP message - the type is denoted by the msgType field, and the data values for the message are stored within the union, which is interpreted depending on the value of the msgType field. The message type structures contained within the above union are as follows.

After initializing jctx->msg, sParse_start_map() then increments the jctx->level integer. This integer indicates the nesting level within the JSON message that is currently being examined. The sParse_end_map() function performs the corresponding decrement operation on the jctx->level variable, to indicate the end of the currently nested block. This nesting level is utilized by the sParse_map_key() callback also passed to YAJL, to indicate whether the key is a message type (occurs when nesting level is 1), or a variable name (occurs when nesting level is anything else):

Assuming a recognized message type is received, the msgType field of the SCimpMsg structure originally allocated in sParse_start_map() will be initialized with a constant denoting the type of the message. The value following the message type should be a sub-map, whose keys will be parsed by the sParseKey() function (since the jctx->level variable will be set to 2 when parsing this sub-map). The sParseKey() function is as follows.

As can be seen, depending on the message type, jctx->jType and jctx->jItem will point to the relevant field within the SCimpMsg structure to be filled out, along with information about what type of data the given field should be. These fields are then filled out by the sParse_string() and sParse_number() callbacks passed to YAJL, which are shown.

The problem with the way SCIMP employs the YAJL library is that the sParse_map_key() function will call the sParseMsgType() function potentially more than once while parsing a single JSON block, which will have the result of altering the message type even after fields in the SCimpMsg union have been filled out by sParse_string()/sParse_number(). For example, consider what happens when the following message is received:

This message will be parsed by the YAJL library, resulting in the following callbacks:

sParse_start_map() - Allocates jctx->msg, zeroes it, and sets jctx->level to 1

sParse_map_key() - Indicates that this is a "dh2" message

sParse_start_map() - Increments jctx->level to 2

sParse_map_key() - Parse and set pk and pkLen fields in the SCimpMsg structure

sParse_map_key() - Parse the "maci" field and set Maci[] array

sParse_end_map() - Decrements jctx->level to 1

sParse_map_key() - Indicates this is a "data" message *** OVERWRITES msg->msgType field with a new type ***

sParse_start_map() - Increments jctx->level to 2

sParse_map_key() - Parses the seq field, which is a 16-bit integer. This field overlaps with the low 2 bytes of the "pk" pointer allocated previously

sParse_end_map() - Decrements jctx->level to 1

sParse_map_key() - Indicates this is a "dh2" message, again overwriting msg->msgType field. Now we have a dh2 message, with the pk pointer modified with arbitrary contents by the attacker.

By resetting the jctx->msg->msgType field with the "dh2" attribute at the end of the message, a type confusion vulnerability will occur where the seq fields supplied in the "data" message will be incorrectly interpreted as the pk field - a raw memory pointer. (In this case, the low two bytes have been set to 0x8080.) Note that by utilizing messages other than "data", we could arbitrarily modify the entire pointer (and the pkLen field, indicating how much data pk points to). Assuming that we are at the correct phase of protocol negotiation, sending this message results in the following crash:

The highlighted line shows the invalid address 0x601b8080 passed to free(), which is the pointer we corrupted. By manipulating the mechanics of the underlying heap or of application data structures themselves, it is possible to leverage this flaw to gain arbitrary code execution.
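The type confusion can be reproduced in isolation. The sketch below is a simplified model, not the libscimp code: model_msg is a hypothetical stand-in for SCimpMsg with only the two overlapping fields, and the replay function performs the effect of each callback by hand. The pointer is only inspected as an integer and never dereferenced or freed. The final function shows one possible guard, which is an assumption and not necessarily the fix that was shipped.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the SCIMP message types and
 * the SCimpMsg union; the real structures carry many more fields. */
typedef enum { MSG_NONE, MSG_DH2, MSG_DATA } msg_type;

typedef struct {
    msg_type type;
    union {
        struct { uint8_t *pk; size_t pk_len; } dh2;  /* pk: heap pointer */
        struct { uint16_t seq; } data;               /* shares storage with pk */
    } u;
} model_msg;

typedef struct { model_msg msg; int level; } model_ctx;

static void start_map(model_ctx *c) { c->level++; }
static void end_map(model_ctx *c)   { c->level--; }

/* Level-1 keys select the message type. As in the flawed parser,
 * nothing prevents this from running several times per message. */
static void type_key(model_ctx *c, const char *key)
{
    assert(c->level == 1);
    c->msg.type = (strcmp(key, "dh2") == 0) ? MSG_DH2 : MSG_DATA;
}

/* Replays the callback sequence for a dh2 / data / dh2 message and
 * returns the resulting pk value as an integer. */
uintptr_t replay_confusion(uintptr_t original_pk)
{
    model_ctx c;
    memset(&c, 0, sizeof c);

    start_map(&c);                             /* outer map, level 1 */
    type_key(&c, "dh2");
    start_map(&c);                             /* level 2 */
    c.msg.u.dh2.pk = (uint8_t *)original_pk;   /* "pk" value stored */
    end_map(&c);

    type_key(&c, "data");                      /* type overwritten */
    start_map(&c);
    c.msg.u.data.seq = 0x8080;                 /* "seq" lands on top of pk */
    end_map(&c);

    type_key(&c, "dh2");                       /* type switched back */
    start_map(&c);
    end_map(&c);
    end_map(&c);

    assert(c.msg.type == MSG_DH2 && c.level == 0);
    return (uintptr_t)c.msg.u.dh2.pk;          /* corrupted, never dereferenced */
}

/* One possible guard: accept a message type key only once. A zero
 * return from a YAJL callback would cancel the parse. */
int type_key_once(model_ctx *c, const char *key)
{
    if (c->msg.type != MSG_NONE)
        return 0;
    type_key(c, key);
    return 1;
}
```

On a little-endian target, calling replay_confusion() with 0x601b1234 yields 0x601b8080: the low two bytes of the pointer have been replaced by the attacker-chosen seq value, which is the shape of the invalid address seen in the crash.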