How Many Voice Callers Fit on the Head of an Access Point?

An Analysis of the Theoretical Maximum Capacity of Access Points to Carry VoIP Telephone Calls on 802.11a, b, and g

Over the summer, I came across a capacity calculation in the manual for a Cisco Voice over IP phone detailing the number of simultaneous calls that can be supported on an access point. Intrigued, I extended the analysis. Voice and data on wireless LANs require opposing preconditions for good performance. High-quality voice requires that frames containing voice data can be transmitted very quickly after arrival, and they need to be transmitted on a very regular schedule with tight timing requirements. Good data throughput comes from stuffing the transmission queue as full as possible. Individual frames might suffer long delays, but the overall capacity is high. Voice quality is often very sensitive to network load.

This article develops a simple model to determine the maximum theoretical capacity of an access point to carry voice calls. (Unlike the previously cited Cisco calculation, it does not attempt to model the effects of medium contention.) Most 802.11-based phones use 802.11b, which has a maximum capacity of 23 telephone calls. 802.11a and 802.11g have somewhat higher capacities, even at comparable data rates, due to a more efficient physical layer design. As with data, 802.11g must reduce its effective data rates in the presence of older devices. Protection overhead reduces the voice capacity by roughly a quarter.

Voice Codecs

A codec is the component of any voice system that translates between analog speech and the bits used to transmit it. Every codec emits a burst of data in a packet that can be reconstructed into voice. Each burst covers a short period of time, typically 20 milliseconds, though some codecs have adaptive periods. Within the codec, there are many different ways of encoding the analog data. Even though two codecs may have the same payload size, they may have different performance characteristics due to their coding methods. Several different codecs are commonly used, and many of them have found application in VoIP systems.

Common codecs:

G.711. This 64kbps codec is one of the oldest, which is why voice calls are sometimes referred to as 64kbps streams. When you hear references to "toll-quality voice," the codec is G.711. Many VoIP providers use G.711, though some also make judicious use of lower-bandwidth codecs.

G.729. Not all telephony networks have the capacity for G.711 streams, which led to the standardization of the G.729 codec. It encodes voice as an 8kbps stream to make more efficient use of the network.

G.723.1. This is an even lower-rate codec that can carry a call at about 6kbps. Furthermore, it uses a longer period than the previous two codecs (30 ms rather than 20 ms), so fewer packets are sent each second and less header information is required on the network.

GSM Full Rate (FR) and Enhanced Full Rate (EFR). The GSM system developed its own codecs for use in mobile telephony networks. Their bandwidth requirements were extremely low for the time at which they were designed.

Skype. All of the previous codecs have poor robustness. When frames are lost, the voice becomes choppy. According to this analysis (PDF), Skype uses two codecs by Global IP Sound. One, the Internet Low Bit Rate Codec (iLBC), is an open standard with a freely available reference implementation. iLBC has two modes for low bandwidth operation, and was designed to be more robust than traditional telephony codecs in the face of frame loss.

To move a phone call across an IP network, the codec data is encapsulated within the Real-time Transport Protocol (RTP). RTP is a UDP-based protocol, which leads to a three-level encapsulation of the codec data. Each layer adds its own header information. The data is carried within an RTP packet (with a 12-byte header), which is carried in UDP (with an 8-byte header), which is carried in IP (with a 20-byte header). The overhead for each level of encapsulation can require significant additional capacity. With a codec operating every 20ms, there will be 50 packets, and therefore 50 sets of headers, per second. Each packet carries 40 bytes of headers (12 + 8 + 20), so the RTP/UDP/IP encapsulation adds 16kbps. Table 1 shows the characteristics of each of the common codecs. Codecs are typically described for a one-way stream. For a telephone call, two voice streams are needed, and the total data rate would need to be doubled.
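The header arithmetic can be sketched in a few lines of Python. This is only an illustration, not part of the original analysis: the function name `header_overhead_kbps` is made up for this example, and it assumes the fixed header sizes listed above with no IP options, RTP extensions, or header compression.

```python
# Header sizes from the text; assumes no IP options or RTP extensions.
RTP_HEADER_BYTES = 12
UDP_HEADER_BYTES = 8
IP_HEADER_BYTES = 20


def header_overhead_kbps(period_ms: float) -> float:
    """One-way RTP/UDP/IP header overhead in kbps (1 kbps = 1,000 bps).

    Hypothetical helper for illustration only.
    """
    header_bytes = RTP_HEADER_BYTES + UDP_HEADER_BYTES + IP_HEADER_BYTES
    packets_per_second = 1000 / period_ms
    return header_bytes * 8 * packets_per_second / 1000


if __name__ == "__main__":
    for period_ms in (20, 30):
        overhead = header_overhead_kbps(period_ms)
        print(f"{period_ms} ms period: {overhead:.1f} kbps of headers")
```

With a 20 ms period the function returns 16 kbps. With a 30 ms period the same 40 bytes of headers are sent less often, which is why longer-period codecs spend less capacity on headers.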

Table 1. Codec comparison

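| Codec      | Period (ms) | Payload size (bytes) | Packet size (bytes) | Payload data rate (kbps)* | Total data rate (kbps)* |
|------------|-------------|----------------------|---------------------|---------------------------|-------------------------|
| G.711      | 20          | 160                  | 200                 | 64                        | 80                      |
| G.729      | 20          | 20                   | 60                  | 8                         | 24                      |
| G.723.1    | 30          | 24                   | 64                  | 6.4                       | 17                      |
| GSM FR     | 20          | 33                   | 73                  | 13.2                      | 29.2                    |
| GSM EFR    | 20          | 31                   | 71                  | 12.4                      | 28.4                    |
| iLBC 20 ms | 20          | 38                   | 78                  | 15.2                      | 31.2                    |
| iLBC 30 ms | 30          | 50                   | 90                  | 13.3                      | 24                      |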

*In telephony, a kilobit per second is 1,000 bits per second, not the 1,024 bits per second that would be more common in computing. Thus, a 64kbps codec uses 64,000 bits per second, not 65,536 bits per second.
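
The rates in Table 1 can be recomputed from just the period and payload size of each codec. The following is a minimal sketch under the assumptions stated in the text: 40 bytes of RTP/UDP/IP headers per packet, 1 kbps equal to 1,000 bits per second, and a two-way call costing twice the one-way rate. The names `CODECS`, `kbps`, and `codec_row` are invented for this example.

```python
# RTP + UDP + IP headers per packet, per the text (no link-layer overhead).
HEADER_BYTES = 12 + 8 + 20

# (codec name, period in ms, payload bytes per packet), copied from Table 1.
CODECS = [
    ("G.711", 20, 160),
    ("G.729", 20, 20),
    ("G.723.1", 30, 24),
    ("GSM FR", 20, 33),
    ("GSM EFR", 20, 31),
    ("iLBC 20 ms", 20, 38),
    ("iLBC 30 ms", 30, 50),
]


def kbps(bytes_per_packet: float, period_ms: float) -> float:
    """Bits per packet divided by ms per packet is kbps (1 kbps = 1,000 bps)."""
    return bytes_per_packet * 8 / period_ms


def codec_row(name: str, period_ms: float, payload_bytes: int) -> dict:
    """Derive the remaining Table 1 columns for one codec (illustrative helper)."""
    packet_bytes = payload_bytes + HEADER_BYTES
    total = kbps(packet_bytes, period_ms)
    return {
        "codec": name,
        "packet_bytes": packet_bytes,
        "payload_kbps": kbps(payload_bytes, period_ms),
        "total_kbps": total,        # one-way stream, including headers
        "call_kbps": 2 * total,     # a call needs a stream in each direction
    }


if __name__ == "__main__":
    for name, period_ms, payload_bytes in CODECS:
        row = codec_row(name, period_ms, payload_bytes)
        print(
            f"{row['codec']:<11} packet={row['packet_bytes']:>3} B  "
            f"payload={row['payload_kbps']:.1f} kbps  "
            f"total={row['total_kbps']:.1f} kbps  "
            f"call={row['call_kbps']:.1f} kbps"
        )
```

These figures cover only the IP packet. They do not include any 802.11 framing or medium-access overhead.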