9.7.10

Special Look: FaceTime (part 3: Call Connection Initialization)

Introduction

In part 1 of this series evaluating the FaceTime protocol, we established that the FaceTime network traffic exchange looks like this:

Unknown TCP protocol starts the conversation (TCP/5223);

Unknown UDP traffic between the iPhone and two hosts with similar IP addresses (UDP/16385 and UDP/16386);

Certificate validation through an Akamai server (HTTP);

HTTPS request to an Apple server;

STUN traffic for NAT traversal;

SIP traffic for call setup, negotiation and authentication;

UDP stream data for video/audio (RTP streaming H.264 with AAC audio).

In part 2 we looked at the SIP and RTP traffic in more depth, identifying what I believe is a proprietary authentication protocol in the SIP MESSAGE verb, along with H.264 video and AAC audio data in an RTP stream, and extracting that data with videosnarf. Jason Ostrom, one of the authors of videosnarf, has even indicated that they plan to work on getting video extraction working so we can record and play back FaceTime calls.

In this installment of the series we’ll look at the unknown protocol that starts the FaceTime conversation over TCP/5223.

Traffic Analysis

Wireshark does a great job evaluating a packet capture and applying heuristics or standard port designations when applying packet dissectors. Sadly, the FaceTime traffic over TCP/5223 is not interpreted any further than the TCP layer, as shown below (due to some lost traffic during my 888-Facetime packet capture, I’ve switched to a different capture which was more complete):

We’ll have to apply our own creativity to evaluate this traffic further. First, Wireshark’s wonderful TCP stream reassembly feature gives us the ability to view the TCP exchange in a hexadecimal view, with the option to save the data in binary format (“Raw”), ASCII, hex-dump or even C Arrays (great for taking data and dumping it into a C tool for manipulation, or otherwise modifying it to work with Python or other popular languages).

Although this is obviously a binary protocol (i.e. not ASCII-based), we can see plaintext strings that look similar to certificate content. This is a common characteristic of SSL-based protocols, though Wireshark wasn’t able to identify this automatically. Fortunately, Wireshark is also an extremely flexible tool with a little know-how. Using the “Analyze | Decode As” feature, we can tell Wireshark to treat this traffic as SSL-encrypted to gather a bit more information from the protocol.
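As a rough illustration of spotting those plaintext strings outside of Wireshark, the Python sketch below pulls runs of printable ASCII out of a binary blob, much like the Unix strings utility. The function name, the minimum length of 6 and the sample bytes are assumptions made up for illustration; in practice you would feed it the “Raw” stream data saved from Wireshark.

```python
import re


def printable_strings(data: bytes, min_len: int = 6) -> list:
    """Return runs of at least min_len printable ASCII characters found in data."""
    # 0x20-0x7e covers space through tilde, i.e. the printable ASCII range.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


# Made-up bytes standing in for a saved TCP stream (NOT real capture data).
sample = b"\x16\x03\x01\x00\x02courier.push.apple.com\x00\x01Entrust\x00ab"

for text in printable_strings(sample):
    print(text)
```

Hostnames and CA names embedded in a certificate tend to survive this kind of filter, which is usually enough of a hint that the stream is worth decoding as SSL.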

First, select one of the packets of the exchange that you want to decode using an alternate protocol and click Analyze | Decode As. From the Wireshark: Decode As menu, select the Transport tab. Specify that both ports should be decoded as SSL, as shown below:

Clicking “Apply” will cause Wireshark to reload the capture data, applying the SSL decoder to the specified port pair, as shown.

One of the great features of the Wireshark SSL dissector is that it will do stream reassembly for us, giving us the option to extract data even if it is transmitted across multiple TCP segments. For example, in the screen-shot above I’ve selected the certificate information, highlighting the bytes in the hex view below. For any highlighted data in Wireshark, we can export it to a binary file by selecting “File | Export | Selected Packet Bytes”. In the Export Raw Data dialog, save the data with the filename extension “.der” to allow Windows to open it as a certificate.

Double-clicking on the file with the “.der” (or “.cer”) extension will open the certificate viewer. We can navigate the certificate details to gather some additional information about the server service.
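If you are not on Windows, the exported DER bytes can be re-encoded as PEM text, which most command-line tools accept. The sketch below is one way to do that with Python’s standard ssl module; the helper names and the file names in the comment are hypothetical. Note that ssl.DER_cert_to_PEM_cert only base64-encodes the bytes and adds the BEGIN/END lines; it does not check that the input really is a certificate.

```python
import ssl
from pathlib import Path


def der_to_pem(der_bytes: bytes) -> str:
    """Wrap raw DER bytes in PEM armor (base64 plus BEGIN/END CERTIFICATE lines)."""
    return ssl.DER_cert_to_PEM_cert(der_bytes)


def convert_file(der_path: str, pem_path: str) -> None:
    """Read a DER file exported from Wireshark and write a PEM copy next to it."""
    Path(pem_path).write_text(der_to_pem(Path(der_path).read_bytes()))


# Hypothetical file names, assuming the bytes were exported as described above:
# convert_file("apple_server.der", "apple_server.pem")
```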

A few points of interest from this certificate:

Issued to courier.push.apple.com by Entrust on April 13, 2010;

Key use is for Digital Signatures and Key Encipherment (i.e. key encryption);

Enhanced Key Usage indicates that it is valid for Server and Client authentication (i.e. mutual authentication).

Other certificates are also delivered through this exchange, including the root certificate for Entrust.
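To peek at the names inside an exported certificate without a certificate viewer, a very small DER walker is enough. The sketch below is not a full ASN.1 parser: it assumes single-byte tags and definite lengths, does not descend into extension values wrapped in OCTET STRINGs, and will raise an IndexError on truncated input. All function names are my own, and the demo input is a hand-built name attribute rather than the real certificate.

```python
# Tags for UTF8String, PrintableString, T61String and IA5String.
STRING_TAGS = {0x0C, 0x13, 0x14, 0x16}


def read_tlv(data: bytes, offset: int):
    """Read one DER tag-length-value element; return (tag, value, next_offset)."""
    tag = data[offset]
    length = data[offset + 1]
    offset += 2
    if length & 0x80:
        # Long form: the low 7 bits say how many bytes hold the real length.
        num_bytes = length & 0x7F
        length = int.from_bytes(data[offset:offset + num_bytes], "big")
        offset += num_bytes
    return tag, data[offset:offset + length], offset + length


def der_strings(data: bytes) -> list:
    """Collect every string value found while walking nested DER elements."""
    found = []
    offset = 0
    while offset < len(data):
        tag, value, offset = read_tlv(data, offset)
        if tag & 0x20:
            # Constructed element (SEQUENCE, SET, ...): walk its contents.
            found.extend(der_strings(value))
        elif tag in STRING_TAGS:
            found.append(value.decode("utf-8", errors="replace"))
    return found


def tlv(tag: int, body: bytes) -> bytes:
    """Encode a short (under 128 bytes) DER element; only used to build the demo."""
    return bytes([tag, len(body)]) + body


# Hand-built stand-in for one name attribute: commonName (OID 2.5.4.3).
common_name = tlv(0x13, b"courier.push.apple.com")
attribute = tlv(0x30, tlv(0x06, b"\x55\x04\x03") + common_name)
sample = tlv(0x30, tlv(0x31, attribute))

print(der_strings(sample))
```

Run against a full exported “.der” file, the same function should list the subject and issuer strings in the order they appear in the certificate.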

A Gentle Tap

Curiosity getting the better of me, I decided to give the Apple server at 17.149.37.6 a “gentle tap” to find out more about the authentication requirements here. One of my favorite tools is “openssl”, the binary that ships with the OpenSSL suite. We can use this tool to connect to SSL services, extracting debug information as shown:

I’ve filtered out the hex-dump data with grep, leaving us just the informational messages in this output. The traffic marked with “>>>” is from my system to the Apple server, “<<<” is from the Apple server to my client.

First, my system attempts to do an SSL 2.0 negotiation, sending a CLIENT-HELLO message. Apple’s server responds with a TLS 1.0 ServerHello response, followed by the certificate information (such as we saw earlier). Following this delivery, Apple’s server sends a CertificateRequest to my client. My client sends an empty certificate response (as indicated with a length of 7 bytes) and tries to complete the ClientKeyExchange without the use of a client-side certificate. The Apple server rejects this with a fatal “handshake_failure” and terminates the connection.
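The 7-byte length is consistent with a Certificate handshake message that carries no certificates: a 1-byte message type, a 3-byte message length, and a 3-byte certificate list length of zero. The Python sketch below builds that message and decodes a fatal handshake_failure alert record, using the standard TLS type numbers. The function names are my own, the alert bytes are constructed by hand rather than taken from the capture, and the record parser only understands TLS-style 5-byte record headers, not the SSL 2.0 framing of the initial CLIENT-HELLO.

```python
CONTENT_TYPES = {20: "change_cipher_spec", 21: "alert",
                 22: "handshake", 23: "application_data"}
ALERT_LEVELS = {1: "warning", 2: "fatal"}
ALERT_DESCRIPTIONS = {40: "handshake_failure"}
CERTIFICATE = 11  # handshake message type for Certificate


def handshake_message(msg_type: int, body: bytes) -> bytes:
    """Prefix a handshake body with its 1-byte type and 3-byte length."""
    return bytes([msg_type]) + len(body).to_bytes(3, "big") + body


def empty_certificate() -> bytes:
    """Certificate message whose certificate list is empty (list length 0)."""
    return handshake_message(CERTIFICATE, (0).to_bytes(3, "big"))


def describe_alert(payload: bytes) -> str:
    """Turn a 2-byte alert payload into 'level description' text."""
    level = ALERT_LEVELS.get(payload[0], "unknown")
    description = ALERT_DESCRIPTIONS.get(payload[1], "alert %d" % payload[1])
    return "%s %s" % (level, description)


def parse_records(stream: bytes) -> list:
    """Split a byte stream into (content type, version, payload) TLS records."""
    records = []
    offset = 0
    while offset + 5 <= len(stream):
        content_type = stream[offset]
        version = (stream[offset + 1], stream[offset + 2])
        length = int.from_bytes(stream[offset + 3:offset + 5], "big")
        payload = stream[offset + 5:offset + 5 + length]
        records.append((CONTENT_TYPES.get(content_type, "unknown"), version, payload))
        offset += 5 + length
    return records


message = empty_certificate()
print(len(message), message.hex())

# Hand-built TLS 1.0 (version 3, 1) alert record: level 2 (fatal), description 40.
alert_record = bytes([21, 3, 1, 0, 2, 2, 40])
for name, version, payload in parse_records(alert_record):
    print(name, version, describe_alert(payload))
```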

From this exchange we can see that this TLS protocol uses mutual certificate authentication; a certificate on the Apple server from Entrust and a certificate on the iPhone to complete the exchange. This is interesting since Apple has stated that FaceTime will be an open protocol, but will apparently require a client-side certificate to connect to the Apple server, which gives them a grant/deny option for all connections on a per-device basis. Steve Papa Esteban is no dummy (here’s looking at you, Android users!)

Client-Side Certificate

Returning to the Wireshark capture decoding SSL traffic over TCP/5223, we can extract the client certificate sent from the iPhone to the Apple server using the technique detailed above.

More interesting observations are now possible:

The iPhone client certificate is issued by the “Apple iPhone Device CA”;

The iPhone client certificate common name (CN) is a GUID, likely generated at the factory;

Key constraints are for authenticating the iPhone as a device entity;

Key usage is similar to the Apple Server certificate, intended for digital signatures and key encipherment.

I Probably Should Have Started Here

I probably should have started here, but it would have been much less fun. Apple’s list of well-known TCP and UDP ports used by Apple products indicates that TCP/5223 is used for XMPP over SSL. XMPP is the Extensible Messaging and Presence Protocol, the formal name for Jabber. Apple indicates that TCP/5223 is used for authentication in unencrypted Jabber conversations, as well as for authentication and data exchange for SSL-protected Jabber sessions.

From this analysis, we can determine that FaceTime uses XMPP to authenticate and establish a connection to an Apple “Jabber” server. Although I don’t have a packet capture for the remote session, I imagine that some kind of GSM message is sent from the initiating device to the responding device to have both devices join the Jabber server, authenticate and exchange data that initiates the FaceTime conversation, including the subsequent SIP exchange. Due to the use of certificate-based mutual authentication, it’s unlikely that anyone will be able to reproduce the FaceTime protocol on another device without Apple’s assistance for certificate issuance.

Evil Thinking For Future … Evil

I’ll leave you with a final thought to consider for future evildoing. The private key that goes with the certificate used for XMPP authentication by the iPhone is stored on the iPhone device; unless the iPhone uses a TPM, it is probably stored somewhere on the file system. If you were to jailbreak your iPhone 4g and extract that certificate and its private key, you could likely use a standard Jabber client to connect to the Apple Jabber server and monitor the activity there, including who is joining and leaving the network. Maybe even set up a Jabber Bot and automate your evil manipulation of Apple’s server.

There’s someone knocking very loudly at my door, so that’s it for me today. Next time we’ll catch up on the HTTPS traffic and more FaceTime analysis fun.