Inside iPhone 4: FaceTime video calling

Playing up his characteristic "one more thing" showmanship, Apple chief executive Steve Jobs introduced FaceTime for the new iPhone 4 as an easy-to-use video chat app that works over WiFi. Here's why it matters, how it's open, why it's currently WiFi only, and how it stacks up to other Voice over IP video calling apps such as Skype.

FaceTime & iChat AV

Apple revealed FaceTime as an iChat-like service exclusive to the new iPhone 4 hardware. It's not exactly iChat, though. While the two have a lot in common, there's currently no talk of any ability to chat from iPhone 4 to desktop Mac iChat clients (although this is almost certain to happen over the next year as iPhone 4 launches).

Apple's Mac iChat was originally an IM client for AOL's proprietary AIM network. Apple later extended iChat to support open XMPP "Jabber" instant messaging. It then added support for the Internet Engineering Task Force's SIP (Session Initiation Protocol) in iChat AV, in order to do standards-based video chat and video conferencing.

As a video conferencing product, Apple's Mac iChat AV client provides exceptional picture quality at little cost but runs into a lethal natural barrier on today's Internet: NAT (Network Address Translation). The routers at both corporate and home networks often hide internal IP addresses from the open Internet, making things difficult for video chat applications that want to act as both a server and client with bidirectional rich media streams to some other host across the Internet.

The NAT problem

For iChat AV to reliably connect with other clients (including compatible PC clients running the same complex suite of video chatting standards, such as AOL's) across the Internet, it usually has to traverse NAT. That's particularly complex because everyone's NAT works a bit differently, and there are so many technical issues involved with handling different types of routers and their different implementations of NAT.

There are different kinds of NAT, and no complete standards in place on how to implement them for ideal interoperability. Additionally, the security policy a company establishes for itself might rule out individuals from setting up their own server, which is a problem for video chat because iChat AV needs to act like a server for a remote client to initiate a transaction with it.
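To make the problem concrete, here is a minimal sketch of one common NAT behavior, often called "address and port restricted." The `ToyNat` class and its method names are made up for illustration and do not model any particular router. It simply shows why a remote caller can't reach a client behind NAT until that client has sent something outbound first.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (ip address, port)


@dataclass
class ToyNat:
    """Hypothetical, simplified NAT: one public IP, one mapping per outbound flow."""
    public_ip: str
    next_port: int = 40000
    # public port -> (internal endpoint, the only remote endpoint allowed to reply)
    mappings: Dict[int, Tuple[Endpoint, Endpoint]] = field(default_factory=dict)

    def outbound(self, internal: Endpoint, remote: Endpoint) -> Endpoint:
        """An internal host sends to `remote`; the NAT allocates a fresh public port."""
        port = self.next_port
        self.next_port += 1
        self.mappings[port] = (internal, remote)
        return (self.public_ip, port)

    def inbound(self, public_port: int, source: Endpoint) -> Optional[Endpoint]:
        """Return the internal endpoint a packet is forwarded to, or None if dropped."""
        entry = self.mappings.get(public_port)
        if entry is None:
            return None  # nobody inside ever opened this port
        internal, allowed_remote = entry
        return internal if source == allowed_remote else None


nat = ToyNat(public_ip="203.0.113.5")
mac = ("192.168.1.10", 5060)      # client behind the NAT
caller = ("198.51.100.7", 5060)   # remote party on the open Internet

print(nat.inbound(40000, caller))   # unsolicited call is dropped
public = nat.outbound(mac, caller)  # the client sends first, which opens a mapping
print(public, nat.inbound(public[1], caller))
```

Real routers differ in how they allocate ports and in which remote hosts they let back in, and that variation is exactly what makes traversal hard to get right.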

Apple's iChat uses its own SNATMAP protocol to allow a client to determine its external IP address and open a port mapping that remote hosts can use to return communications through the firewall. Apple also uses UPnP (Universal Plug and Play), a Microsoft-originated standard for NAT port traversal supported by a variety of consumer router/firewall makers.

These are used to punch iChat AV's traffic through NAT routers, but they aren't always supported by enterprise routers or some models of home router appliances. In Mac OS X Leopard, Apple improved things by adding support for ICE (Interactive Connectivity Establishment), an emerging IETF NAT traversal standard, but there are still vexing problems for non-technical consumers trying to set up a simple video chat.
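ICE works by gathering several candidate addresses for each client (its local address, the public address seen through NAT, and a relay address) and trying them in priority order. The sketch below uses the priority formula and recommended type preferences from the ICE specification (RFC 5245). The `Candidate` class, the helper name, and the sample addresses are illustrative assumptions, and nothing here describes how iChat AV actually implements ICE.

```python
from dataclasses import dataclass
from typing import List

# Recommended type preferences from RFC 5245: direct paths beat relayed ones.
TYPE_PREFERENCE = {"host": 126, "prflx": 110, "srflx": 100, "relay": 0}


@dataclass(frozen=True)
class Candidate:
    kind: str                     # "host", "srflx" (via STUN) or "relay" (via TURN)
    address: str
    port: int
    component_id: int = 1         # 1 = RTP, 2 = RTCP
    local_preference: int = 65535

    def priority(self) -> int:
        # priority = 2^24 * type pref + 2^8 * local pref + (256 - component ID)
        return ((2 ** 24) * TYPE_PREFERENCE[self.kind]
                + (2 ** 8) * self.local_preference
                + (256 - self.component_id))


def order_candidates(candidates: List[Candidate]) -> List[Candidate]:
    """Highest priority first, so the cheapest working path is tried first."""
    return sorted(candidates, key=lambda c: c.priority(), reverse=True)


gathered = [
    Candidate("relay", "192.0.2.50", 49152),   # relay address (example value)
    Candidate("srflx", "203.0.113.5", 40000),  # public address seen through NAT
    Candidate("host", "192.168.1.10", 5060),   # local interface address
]
for cand in order_candidates(gathered):
    print(cand.kind, cand.priority())
```

The relay candidate sorts last because routing media through a third server adds latency and cost, so it serves as the fallback when direct paths fail.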

Making FaceTime open

Apple faces the same kinds of problems in getting video calls to work on the iPhone. So do other vendors. Apple wants to make mobile video chat an open standard for interoperable video chat sessions, so it adopted the neutral FaceTime name rather than calling the service iChat, which is very much an Apple-sounding name.

Essentially however, FaceTime is iChat AV for iPhone. Jobs presented an "alphabet soup" of technologies that were involved in making FaceTime work, many of which are shared with iChat AV, including:

H.264 and AAC, its ISO/MPEG video and audio codecs (just like iChat).

SIP (Session Initiation Protocol), the open IETF signaling protocol for VoIP used by iChat AV.

STUN (Session Traversal Utilities for NAT), an IETF standard for dealing with lots of different kinds of NAT.

TURN (Traversal Using Relays around NAT), an IETF standard for allowing a client behind NAT to receive incoming requests like a server.

ICE (Interactive Connectivity Establishment), an IETF standard which helps set up connections through NAT firewalls.

SRTP (Secure RTP), an IETF standard designed to provide encryption, message authentication and integrity for the data streams.
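As a rough illustration of the STUN entry in the list above, the following sketch builds a STUN Binding request and decodes the XOR-MAPPED-ADDRESS attribute of a success response, following the message layout in RFC 5389. To stay self-contained it fabricates the server's reply locally instead of contacting a real STUN server. The function names and sample addresses are assumptions for illustration, and Apple has not published how FaceTime itself uses STUN.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442      # fixed value defined by RFC 5389
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id=None):
    """20-byte STUN header: type, attribute length (0), cookie, 12-byte transaction ID."""
    tid = os.urandom(12) if transaction_id is None else transaction_id
    return struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE) + tid


def build_binding_success(transaction_id, ip, port):
    """Stand-in for a server reply: reports the public IPv4 address it saw."""
    xport = port ^ (MAGIC_COOKIE >> 16)
    xaddr = struct.unpack("!I", socket.inet_aton(ip))[0] ^ MAGIC_COOKIE
    value = struct.pack("!BBHI", 0, 0x01, xport, xaddr)  # 0x01 = IPv4 family
    attr = struct.pack("!HH", XOR_MAPPED_ADDRESS, len(value)) + value
    header = struct.pack("!HHI", BINDING_SUCCESS, len(attr), MAGIC_COOKIE)
    return header + transaction_id + attr


def parse_binding_success(message):
    """Return (ip, port) from the first IPv4 XOR-MAPPED-ADDRESS attribute."""
    msg_type, length, cookie = struct.unpack("!HHI", message[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        raise ValueError("not a STUN Binding success response")
    offset, end = 20, 20 + length
    while offset + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", message[offset:offset + 4])
        value = message[offset + 4:offset + 4 + attr_len]
        if attr_type == XOR_MAPPED_ADDRESS and value[1] == 0x01:
            xport, xaddr = struct.unpack("!HI", value[2:8])
            ip = socket.inet_ntoa(struct.pack("!I", xaddr ^ MAGIC_COOKIE))
            return ip, xport ^ (MAGIC_COOKIE >> 16)
        offset += 4 + attr_len + (-attr_len % 4)  # attributes are padded to 4 bytes
    raise ValueError("no IPv4 XOR-MAPPED-ADDRESS attribute found")


request = build_binding_request()
reply = build_binding_success(request[8:20], "203.0.113.5", 40000)
print(parse_binding_success(reply))
```

In a real exchange the request would go out over UDP to a public STUN server, and the address in the reply is what a client could advertise to the far end as one of its ICE candidates.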

Rather than being some radically new protocol for video chat, Apple's FaceTime is an evolution of iChat's standards-based foundations, which have already been implemented by AOL in a compatible client on the desktop PC. It's therefore no stretch to think that other phone vendors will work to create compatible FaceTime clients that work with iPhone 4 phones, and it would be very surprising if Apple's own iChat AV wasn't adapted to work with the latest FaceTime protocols to enable desktop to mobile video calls at some point.

The companies that need to buy into FaceTime are networking gear companies like Cisco (who already work to support the IETF protocols involved) and phone manufacturers like Nokia, RIM, HTC and Motorola (who are already working hard to match the iPhone's features, look, and specifications). The best way for Apple to push FaceTime would be to deliver an open source implementation of the core technology stack, much like it delivered WebKit, and much like BSD provided the world a standard IP networking stack.

Apple understands the success of WebKit, but it's not yet clear that it's ready to give away software to competitors when it doesn't absolutely have to. That might result in a variety of implementations of FaceTime-compatible devices that all have various bugs that impede interoperability. Of course, such a situation might benefit Apple, too, making it the primary vendor of reliable FaceTime phones.


Apple similarly pushed Internet email on the iPhone in preference to SMS and MMS mobile standards, which continue to charge archaic per message fees wildly out of proportion to the actual amount of data they deliver.

I am all in favour of 'facetime' once it can work with desktop clients, and ideally also with non-iPhones.
I just want to point out that the argument above is (for SMS at least) untrue in all but a few situations.
I know that US SMS fees can be ridiculous, but for the most part texting (Europe, Asia) is essentially free. Most iPhone tariffs here in the UK - and most mobile phone contract tariffs - include a vast number of texts. On the other hand, e-mails are not an immediate form of communication to most people. My mum doesn't get e-mails on her phone, my dad gets so many he is liable to ignore them. SMS have a different purpose, and Apple was wrong.

With FaceTime I hope that they are right: or rather, I hope that they get past the teething problems so that we can integrate with iChat/Skype/Fring/whatever. I know that many people will not use it, but many of us (busy parents or people with international friends and relatives) will.

Jingle is an extension to the Extensible Messaging and Presence Protocol (XMPP). It implements peer-to-peer (P2P) session control (signaling) for multimedia interactions such as in Voice over Internet Protocol or videoconferencing communications. It was designed by Google and the XMPP Standards Foundation. The multimedia streams are delivered using the Real-time Transport Protocol (RTP). If needed, NAT traversal is assisted using Interactive Connectivity Establishment (ICE). http://en.wikipedia.org/wiki/Jingle_(protocol)

Unlike most instant messaging protocols, XMPP is an open standard. Like e-mail, it is an open system where anyone who has a domain name and a suitable Internet connection can run their own XMPP server and talk to users on other servers. The standard server implementations and many clients are also free and open source software. http://en.wikipedia.org/wiki/Extensi...sence_Protocol

1. If Apple is really trying to open up the standard, why not enable FaceTime to iChat conversation, with both video and audio? Why limit it to iPhone 4 <=> iPhone 4?

2. Why not allow iPhone 3GS to iPhone 4 conversation? Remember that iPhone 3GS has the rear camera, that could be enough for many situations? I will not buy the argument that iPhone 3GS CPU is not capable enough. Just last year, it was the *screaming fast* processor, remember?

3. Why is there no mention of third party apps? Can they use the cameras to do a FaceTime-like app? For instance, can Skype make use of the cameras?

4. Why no mention of any overlapped *text* communication? Many times it's easier to copy/paste some text rather than spell it out during an audio/video conversation.

I love Apple, but let's face it, it's not playing the "open" standards game for world peace.

Are we not reading the same article?

1) This isn't even out yet and you're complaining that it doesn't already exist on Macs, too. It was just demoed yesterday as the last feature and new killer feature for the iPhone 4.

2) Who said it's only for the iPhone 4? It's included in the iPhone 4, but if they are submitting it to an open standards body it's obviously not limited only to iPhone 4. The only thing is that Apple isn't including it in the other iPhones with iOS v4.0.

3) Again, it was demoed just yesterday and they stated they were submitting it to a standards body today. I have a hard time understanding what you missed from the presentation and the article, but it appears there is nothing preventing anyone from using FaceTime. Apple may prevent apps in the App Store or prevent it from being used on non-iPhone 4 phones (though I doubt it) but that would be another issue altogether.

4) Likely because there isn't one in the spec. That does not mean you can't also include text in your app, which doesn't require all those protocols to function. Just look at iChat A/V or any other A/V messaging app. BTW, if it's easier to copy/paste something rather than "spelling it out" then why the hell would you initiate an A/V conversation to ignore an easier solution?

Of course Apple are doing this for their own goals, but that doesn't mean it's not good for the future of technology.