The IP protocol stack is comprised of four layers, namely Subnetwork, IP, Transport, and Application, as described in Part I (of this book). Each layer provides service(s) to the layer above it, and depends on the service(s) offered by the layer below it. The next several chapters will focus on the subnetwork layer of the IP model. Both LAN and WAN subnetwork technologies will be described.

What services does IP need from the subnetwork layer? IP is an unreliable datagram service, so the link layer need not offer reliable delivery or any performance guarantees. What IP needs is simply a service that will transport its packets from one IP-speaking device to the next. However, as the 1990s draw to a close, more and more QoS¹- and CoS²-aware subnetwork technologies are being developed, such as ATM LAN Emulation (specifically LANE version 2.0) and IEEE 802.1p/Q-enhanced switched LANs, specifically 10/100/1000 Mbps Ethernet. Additionally, the IP "Type of Service" byte has been redefined to implement so-called "Differentiated Services," which may allow different classes of service within an IP-based network, other than the default, which is "best effort." At the same time, applications of IP are being developed that will leverage these new capabilities, perhaps providing enhanced "multimedia" services over "converged" networks.³

Traditional IP-enabled applications are for "data" transfers, in varying amounts, with generally few time-related performance constraints, except for interactive applications such as Telnet, the Internet's virtual terminal protocol. Broadly, applications can be classified as in Figure II.1.

In the Internet of the late 20th century, data-oriented applications are clearly dominant, as they have been since the Internet's inception. Rather than leave the impression that data protocols have no timing requirements, note that Telnet and other protocols, with whose applications users interact directly, are delay-sensitive for that reason alone. If these protocols perform sluggishly, the users will be unproductive, and perhaps unhappy, too.

Emerging applications are creating requirements for IP-based networks to support enhanced classes of service. A short list of such emerging IP-based applications includes voice-over-IP (VoIP), video- and audio-streaming, and video- and audio-conferencing.

Figure II.1: Broad classes of Internet applications.

The Internet enhancements required to support these new applications often boil down to some statistical assurance that most packets will arrive within a certain maximum amount of time (i.e., packets will have bounded delay). Time-sensitive applications such as interactive voice may also require that a certain bandwidth be reserved, or that the variation of packet delay (also known as jitter) be kept within certain levels.
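The two requirements above, bounded delay and limited jitter, can be made concrete with a small calculation. The following is a minimal sketch only: the function name, the timestamps, and the 60 ms bound are hypothetical, and jitter is taken here to mean the average change in delay between consecutive packets, which is one of several definitions in use.

```python
from statistics import mean

def delay_stats(send_ms, recv_ms, max_delay_ms):
    """Summarize one-way delay from matching send/receive timestamps (milliseconds)."""
    delays = [r - s for s, r in zip(send_ms, recv_ms)]
    # Assumed definition: jitter is the mean absolute difference
    # between the delays of consecutive packets.
    jitter = mean(abs(b - a) for a, b in zip(delays, delays[1:]))
    # Fraction of packets that met the delay bound.
    on_time = sum(d <= max_delay_ms for d in delays) / len(delays)
    return {"delays": delays, "jitter": jitter, "fraction_on_time": on_time}

# Hypothetical measurements: one packet sent every 20 ms.
send = [0, 20, 40, 60]
recv = [50, 75, 88, 130]
stats = delay_stats(send, recv, max_delay_ms=60)
print(stats)
```

With these made-up numbers, three of the four packets meet the 60 ms bound, which is the kind of statistical assurance the text describes.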

The topic of multimedia networking has already filled many books, and for now it will suffice to observe that such applications appear to be the Internet's "Next Big Thing." Certainly, VoIP is real enough that Internet Telephony Service Providers (ITSPs) such as L3, Qwest, and others are sprouting up. Even established telephony providers such as Bell Atlantic and AT&T are making billion-dollar-scale investments in VoIP technology.

If the Internet can be successfully enhanced to support voice, which is one of the most demanding of all "multimedia" applications, the floodgates may open on a whole new class of both data and nondata applications that can leverage these new capabilities.

Subnetwork Functions

Encapsulation and Framing

At a minimum, WAN and LAN link layer protocols provide for the encapsulation and transmission of higher-layer protocol packets, including IP packets. The link-layer encapsulation enables the higher-layer protocol's packet to travel through the subnetwork medium and be distinguished from other packets, which may be IP packets or packets of some other protocol stack. Generically, many of the subnetwork technologies (both LAN and WAN) have frames that follow the format illustrated in Figure II.2.

Figure II.2: Generic subnetwork frame format.

Framing, or the process of prepending a data-link header (and, optionally, appending a trailer) to the higher-layer protocol packet, provides for the synchronous transmission of large sequences of bits. Each frame starts with a pattern that allows for the destination station to synchronize its clock with the transmitter.4

The frame's synchronization pattern, which is transmitted immediately prior to the actual start of the frame, allows the receiver's hardware to synchronize itself to the transmitter's exact frequency. Once the receiver has been synchronized to the transmitter's frequency, it can then maintain synchronization throughout the duration of the frame. Synchronous subnetwork protocols, such as we have been discussing, require that timing be maintained over an entire frame (up to many thousands of bits). Clock recovery begins at the framing sequence and then the bits are transmitted such that the bit stream is self-clocking, once synchronization has been established.

Character-based asynchronous⁵ protocols transmit a character at a time, each with its own start and stop bit(s). The start and stop bits permit synchronization to be maintained for the temporal duration of the character--usually, but not necessarily, 8 bits.

Why is synchronization important? If a LAN is supposed to run at 10 Mbps, there is no guarantee that every station will have precisely identical clock frequencies. This is due to manufacturing differences, power supply voltage differences, and environmental variables such as temperature, the age of the clock chip, and possibly even the ambient humidity. Considering these real-world factors, it would be a bad idea for each station to use its own local clock to receive data bits from another sender, since the transmitted bit timing would almost certainly not match up with the receiving station's bit frequency, ensuring data corruption.
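A rough worked calculation shows how quickly unsynchronized clocks would drift apart. This is a sketch under stated assumptions: the 100 parts-per-million tolerance and the half-bit failure threshold are illustrative values, not figures taken from this chapter, and the function name is hypothetical.

```python
def bits_until_slip(tolerance_ppm, slip_fraction=0.5):
    """Bits a free-running receiver can sample before it has drifted
    by slip_fraction of one bit time relative to the transmitter."""
    # Worst case: the two clocks are off in opposite directions,
    # so their relative error is twice the per-clock tolerance.
    relative_error = 2 * tolerance_ppm / 1_000_000
    return slip_fraction / relative_error

# Assumed tolerance of 100 ppm per clock.
safe_bits = bits_until_slip(100)
print(f"about {safe_bits:.0f} bits before a half-bit slip")
```

Under these assumptions the receiver would be sampling in the wrong place after roughly 2,500 bits, well short of a frame that is many thousands of bits long, which is why the receiver recovers its timing from the incoming signal instead.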

Subnetwork-Layer Addressing

The addressing information that is carried in the subnetwork-layer header identifies the subnetwork-layer destination address (DA) as well as the subnetwork-layer source address (SA).⁶ IEEE LANs commonly use 48-bit (i.e., six-byte) addresses, though you may see 16-bit addresses occasionally.⁷ IEEE LAN addresses may be globally assigned and uniquely "burned-in" to the hardware (by the manufacturer), or be manually assigned.

Besides the source and destination addresses, it is often important to have some mechanism for indicating how long the frame is. Length indications can be explicit or implicit. One feature that is almost never implicit is marking a frame to indicate what higher-layer protocol it is carrying. In a LAN context, these data-link layer protocol IDs are often referred to as "EtherTypes," because Ethernet was the first commercially successful LAN technology. (IP's EtherType is 0x0800.) In the ATM world, it is possible to create multiple virtual circuits between two points and then agree that VC#x is only for AppleTalk, while VC#y is only for IP. When sending over these links, a few bytes may be saved because there is no need for explicit higher-layer-protocol marking, as the mere fact that the packet was sent on VC#y indicates that it is an IP packet. ATM is the only subnetwork technology that has defined "null encapsulation," though conceivably frame relay or X.25 could do similar things.
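To illustrate explicit protocol marking, the sketch below splits an Ethernet II style header into its destination address, source address, and EtherType. It assumes the frame has already had its synchronization pattern and FCS stripped; the function names and the sample MAC address are hypothetical.

```python
import struct

ETHERTYPE_IPV4 = 0x0800  # the value given in the text
ETHERTYPE_ARP = 0x0806

def parse_ethernet_header(frame: bytes):
    """Split a frame into (destination MAC, source MAC, EtherType, payload)."""
    if len(frame) < 14:
        raise ValueError("frame is shorter than the 14-byte header")
    # 6-byte DA, 6-byte SA, 2-byte EtherType, all in network byte order.
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return dst, src, ethertype, frame[14:]

def mac_str(mac: bytes) -> str:
    """Format a MAC address in the usual colon-separated hex notation."""
    return ":".join(f"{b:02x}" for b in mac)

# Hypothetical frame: broadcast DA, made-up SA, EtherType for IP.
frame = bytes.fromhex("ffffffffffff" "020000000001" "0800") + b"payload"
dst, src, ethertype, payload = parse_ethernet_header(frame)
print(mac_str(dst), mac_str(src), hex(ethertype))
```

A receiver would hand the payload to its IP stack when the EtherType is 0x0800 and to some other protocol handler otherwise.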

Interestingly, the original Ethernet standard did not have a length field. When the transmitting station is done transmitting, it drops its carrier, which indicates to the receiver that the frame is complete. It is also possible to encode the length in the header, which gives the receiver an idea of how long this frame is going to be. Many subnetwork technologies' frames do include an explicit length indication in their header, but implicit means are also used, as in the case of Ethernet. In either case, once the frame has been completely received, its Frame Check Sequence (FCS) is compared with the calculated FCS to verify that the frame was unaltered in transit. Each data link layer has a specific minimum and maximum frame length,⁸ which must be adhered to by the LAN or WAN devices.
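The FCS comparison described above can be sketched in a few lines. This example uses the CRC-32 from Python's `zlib` module, which uses the same CRC-32 polynomial as the Ethernet FCS. Appending the value in little-endian byte order is an assumption made for this illustration, and the helper names are hypothetical.

```python
import struct
import zlib

def append_fcs(frame_without_fcs: bytes) -> bytes:
    """Sender side: compute a CRC-32 over the frame and append it as a trailer."""
    fcs = zlib.crc32(frame_without_fcs) & 0xFFFFFFFF
    return frame_without_fcs + struct.pack("<I", fcs)  # byte order is an assumption

def fcs_ok(frame_with_fcs: bytes) -> bool:
    """Receiver side: recompute the CRC and compare it with the received trailer."""
    body, received = frame_with_fcs[:-4], frame_with_fcs[-4:]
    calculated = struct.pack("<I", zlib.crc32(body) & 0xFFFFFFFF)
    return received == calculated

good = append_fcs(b"header and higher-layer packet")
# Flip one bit to simulate corruption in transit.
damaged = bytes([good[0] ^ 0x01]) + good[1:]
print(fcs_ok(good), fcs_ok(damaged))
```

A frame whose recomputed value does not match the trailer is simply discarded; the link layer does not attempt to repair it.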

Specifications exist that define how IP should operate over most existing subnetwork technologies. In order for a station to be able to send a link-layer frame, it must know the link-layer address of the (next-hop toward the) destination. The process of discovering a neighbor's link-layer address is known as Address Resolution in the IP model.

Address Resolution

IP's "Address Resolution Protocol" (ARP) operates over LANs and allows IP endstations to learn the MAC-layer addresses of neighbors. ARP is a "helper" protocol to IP, not actually an IP-based protocol itself.9 Figure II.3 depicts ARP's status as a peer protocol to IP, also using the transmission services of the subnetwork layer to do its job. ARP is most often seen in LAN environments.

Figure II.3: ARP relative to the IP protocol stack.

ARP uses a subnetwork frame that contains the IP address of the desired target station, and includes the source's MAC and IP addresses. The frame is sent to the broadcast destination address since the IP stack doesn't know the correct MAC-layer address to send this packet to--that's what it is trying to learn!10 If a station with a matching IP address exists, then that station will respond to the ARP request directly, and the requesting station will know how to reach this destination. Figure II.4 shows the procedure two endstations must follow in using ARP before they can communicate.
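The request just described can be sketched as bytes. The field layout follows the standard ARP packet format for Ethernet and IPv4; the function name and the sender's MAC address are hypothetical, and the frame omits the padding and FCS that a real adapter would add.

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6
ETHERTYPE_ARP = 0x0806

def build_arp_request(src_mac: bytes, src_ip: str, target_ip: str) -> bytes:
    """Build a broadcast frame asking who owns target_ip."""
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1,                            # hardware type: Ethernet
        0x0800,                       # protocol type: IP
        6, 4,                         # MAC and IP address lengths in bytes
        1,                            # operation: request
        src_mac, socket.inet_aton(src_ip),
        b"\x00" * 6,                  # target MAC is unknown, so it is left as zeros
        socket.inet_aton(target_ip),
    )
    # ARP rides directly above the MAC header; there is no IP header.
    mac_header = struct.pack("!6s6sH", BROADCAST_MAC, src_mac, ETHERTYPE_ARP)
    return mac_header + arp

# Hypothetical sender MAC; the IP addresses are those of A and its gateway
# in the forwarding example later in this section.
request = build_arp_request(bytes.fromhex("020000000001"),
                            "172.16.96.214", "172.16.96.193")
print(len(request), request.hex())
```

The station that owns the target IP address answers with a unicast reply carrying its own MAC address in the sender fields.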

ARP operates over any "broadcast-capable" subnetwork layer, such as Ethernet, Token Ring, FDDI, and other LANs, even including the LAN-like SMDS, which does not have a true broadcast service in the same sense that a LAN does. Broadcast-capable subnetworks are inherently "multiple access." A degenerate form of broadcast exists on point-to-point subnetworks, in that broadcasting is equivalent to sending a frame to the other side. Thus, point-to-point networks have an implicit broadcast mechanism.

Figure II.4: ARP procedure.

However, there are WAN subnetwork technologies that do support multiple access (i.e., at least two nodes on the medium) despite not having broadcast mechanisms. The generic name for such technologies is a mouthful: "Non-Broadcast Multiple Access," or NBMA for short. Examples of NBMA technologies are X.25, Frame Relay, and Asynchronous Transfer Mode (ATM) (not LAN Emulation).

WAN encapsulations look very similar to the LAN encapsulations described earlier. The main difference is that WANs frequently carry only a destination address in their headers, while LAN frames carry both source and destination addresses. WAN subnetwork layers span from the point-to-point protocol (PPP), which always has at most two participants, and thus a very limited need for link addressing, to more complex packet- and cell-switched "cloud" NBMA technologies such as Frame Relay, X.25, ATM, and SMDS.

Despite the nonbroadcast nature of most WAN subnetworks, certain ARP variants do exist, supporting address resolution over some--but not all!--WAN subnetwork technologies. Over WANs, which generally lack broadcast capabilities, address resolution must employ very different mechanisms than the simpler LAN case. Each WAN technology's unique features determine what form an ARP-like protocol may take, or if such a protocol is even definable. Contrary to LANs, there is no generic NBMA address resolution protocol.

Inverse ARP (InARP) has been designed to work with Frame Relay networks. InARP leverages information derived from Frame Relay's associated Local Management Interface (LMI) protocol, which periodically reports the status of each defined virtual circuit. ATM-ARP has been designed to work with RFC-1577-style "IP over ATM" networks. RFC-2225 defines an ATM-ARP server to which all address resolution queries are addressed; the ATM-ARP server maintains a centralized database of IP-to-ATM address mappings. When systems initialize their IP-over-ATM stacks, they register with the ATM-ARP server, which enables the ATM-ARP server to provide accurate address mapping information to the other stations within that logical IP subnetwork (LIS).

The Next Hop Resolution¹¹ Protocol (NHRP) has also been designed to support address resolution over ATM for next-generation applications, including the Multi-Protocol Over ATM (MPOA) standards from the ATM Forum. NHRP is most certainly not a routing protocol; it is a generic address resolution protocol for NBMA networks (it could even be used with X.25). NHRP is designed to communicate with "Next-Hop Servers," which are NHRP-capable routers, to determine the best exit point from the NBMA cloud en route to a given destination. It is a matter of policy whether "shortcut calls" across the NBMA layer may be established, allowing packets to bypass some number of router hops, proceeding directly to the best egress router for some destination.

For NBMA subnetworks that have no defined ARP variant, static configuration of each relevant (neighbor IP address, neighbor subnetwork-layer address) mapping is required. In some carefully controlled private X.25 environments, where the X.121¹² addressing plan and the IP addressing plan are managed by the same organization, it is possible to create a configuration in which an algorithmic mapping between IP and X.121 addresses is possible. See RFC-1236 for an example from the U.S. military's Defense Data Network (DDN).

Even when ARP variants do exist, static mappings have the attractive property of being extremely stable over time. The WAN "addresses" associated with neighbors usually have very long lifetimes, so a dynamic address resolution protocol mainly serves to ease activation of new neighbor sites. Once up, however, adding a static entry for the new site captures its likely long-term presence and stability. Whether or not a site is reachable at some future time becomes a routing protocol issue. If a neighbor router becomes unreachable, then its formerly adjacent WAN neighbors will not have a route to destinations that previously were reachable via the dead router.

Without a forwarding table entry, then, any neighbor router will simply generate an error message if it ever gets a packet heading to one of those now-unreachable destinations. The router will never forward the packet over the WAN, even if it has a static entry for the now-dead remote router's subnetwork address, because no destination forwarding table entries will be using that IP address as their next hop.

Conversely, whenever a router does need to send a packet to the Boise, Idaho, branch office, the subnetwork-layer address will almost certainly be the same whether the router in Boise is up or down. Dynamically discovering the IP-to-subnetwork address mapping every time a site is activated is a waste of effort for the routers. The network administrator knows in advance that the answer will always be the same, and setting a static mapping in the router(s) reflects this quasi-permanence of the WAN address mappings, thus enabling the routers to send data to each other provided that the link-layer path between them is up. Stale mappings are harmless because there will be no forwarding table entries that might use that mapping.
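The interplay between static mappings and the forwarding table can be sketched as follows. Everything named here is hypothetical: the addresses, the use of Frame Relay DLCIs as the subnetwork-layer addresses, and the function itself. The point is only that the static map is consulted after a route has been found, so a stale entry is never reached when the route is gone.

```python
import ipaddress

# Hypothetical static map: neighbor IP address -> WAN subnetwork-layer address.
STATIC_NEIGHBORS = {"10.1.1.2": "DLCI 102", "10.1.1.3": "DLCI 103"}

def resolve_next_hop(dst, forwarding_table):
    """Return (next-hop IP, subnetwork address), or None when no route exists."""
    addr = ipaddress.ip_address(dst)
    matches = [(net, hop) for net, hop in forwarding_table.items() if addr in net]
    if not matches:
        return None  # no route, so the static mapping is never consulted
    # Prefer the most specific (longest) matching prefix.
    _, next_hop = max(matches, key=lambda m: m[0].prefixlen)
    return next_hop, STATIC_NEIGHBORS[next_hop]

# While the Boise router is up, the routing protocol supplies a route via it.
boise_routes = {ipaddress.ip_network("192.168.5.0/24"): "10.1.1.2"}
print(resolve_next_hop("192.168.5.9", boise_routes))

# When it goes down the route is withdrawn; the static entry remains but is unused.
print(resolve_next_hop("192.168.5.9", {}))
```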

IP Forwarding Procedure

Regardless of the type of subnetwork involved, the forwarding decision is very similar. Of course, there are always subnet-specific details, but before diving into the details, it will be useful to take a high-level view and look at the forest. The trees will be examined closely enough in the remainder of Part II.

Endstation Forwarding Decision

The first step of the journey of an IP packet begins inside an endstation. The endstation needs to determine the best way to send out a packet, and then deliver it to the next hop. A special case is when the endstation is on the same subnetwork as the destination; in this case, the IP packet can be sent "directly" to the destination. This case involves only ARP or an Address Mapping Table lookup, since all that needs to be known is the subnetwork-layer address of the destination.

If an endstation can tell that the ultimate destination of the packet is not on one of its local subnetworks, then it must send the packet to a router. The endstation's IP stack configuration includes the IP address of a "default gateway," which is the IP address of a router within (one of) the endstation's assigned prefix/mask(s). The endstation will send all non-locally-destined traffic to the default gateway for further delivery. The endstation hopes that the default gateway will know how to reach the packet's destination. Note that the default gateway must share the same IP subnetwork prefix with the endstation, or else they could not exchange packets. Sending to a router on an endstation's LAN is just like sending to any other local destination,13 with the exception that the router is not the ultimate destination.
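The endstation's decision reduces to a single prefix test, sketched below with Python's standard `ipaddress` module. The function name is hypothetical; the addresses are those of endstation A and its default gateway in the worked example later in this section.

```python
import ipaddress

def first_hop(src_interface: str, default_gateway: str, dst: str) -> str:
    """Return the IP address whose link-layer address the endstation must resolve."""
    local_prefix = ipaddress.ip_interface(src_interface).network
    if ipaddress.ip_address(dst) in local_prefix:
        return dst              # local: deliver directly to the destination
    return default_gateway      # nonlocal: hand the packet to the router

# A is 172.16.96.214 with a /27 mask, so its local prefix is 172.16.96.192/27.
print(first_hop("172.16.96.214/27", "172.16.96.193", "172.16.96.165"))
print(first_hop("172.16.96.214/27", "172.16.96.193", "172.16.96.200"))
```

In both cases the IP destination address in the packet stays the same; only the link-layer destination differs.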

IP uses a hop-by-hop forwarding paradigm, which in and of itself does not distinguish IP from other protocols. Since IP is connectionless, the packet's destination address (DA) is always the address of the ultimate destination. In connection-oriented networking technologies, it is common to use a label-switching paradigm in which the destination's label is only significant on a hop-by-hop basis. IP also uses link-layer addresses hop-by-hop, but the key difference is that there is no preestablished series of link-layer addresses that tell you how to get from point A to point B, as there is once a virtual connection is set up.

Figure II.5 illustrates a simple topology with four routers, two of which are on the path from A to B. A word on the notation: Each of a router's interfaces has at least one IP address. These router IP addresses are denoted by RX(i, a.b.c.d), indicating that interface i of router X has the indicated IP address. The MAC address of that same interface is RX(MACi).

Figure II.5: Hop-by-hop IP forwarding.

The following steps show how a packet proceeds from A to B.

Step 1: A needs to send a packet to B. First, A (172.16.96.214) notices that B's address (172.16.96.165) is not within A's local prefix, 172.16.96.192/27, which implies that the packet must be sent to A's default gateway, RB. The first time that A needs to send a packet to RB, A must send an ARP Request looking for RB(MAC3). A's IP stack is preconfigured with its default gateway set to RB(3, 172.16.96.193), which is how A knows the IP address for which it must broadcast an ARP Request. A stores RB(MAC3) in its ARP cache for later use, in case any other packets need to be sent to nonlocal destinations.

A then transmits the IP packet, with its IP destination address field set to B's IP address (172.16.96.165), and its source IP address field set to A's own address (172.16.96.214). The packet is encapsulated within a MAC-layer frame that is addressed to a destination of RB(MAC3). The MAC-layer source address will be set to MACA. The initial frame and packet are illustrated in Figure II.6.

Figure II.6: The packet's initial encapsulation.

Step 2: Now let's assume that the frame made it across the first LAN subnetwork, and that the router RB has received the frame. It examines the frame and looks at the IP DA field, comparing the value in that field with its forwarding table,14 which might be something like what is displayed in Figure II.7. Note that the parenthetical labels in the next-hop gateway column were added by the author as an aid in understanding the example. A real forwarding table would only have the next-hop routers' IP addresses.

The destination prefix that has the most leading bits in common with the packet's DA (172.16.96.165) is the third entry, carrying a destination prefix of 172.16.96.128/26. Expressed in binary, that prefix and the destination address are:

Prefix (172.16.96.128/26)

10101100.00010000.01100000.10000000, and

IPB (172.16.96.165)

10101100.00010000.01100000.10100101.

For reference, the other prefixes are:

Prefix (172.16.96.64/27)

10101100.00010000.01100000.01000000,

Prefix (172.16.96.96/27)

10101100.00010000.01100000.01100000,

Prefix (172.16.96.192/27)

10101100.00010000.01100000.11000000,

Prefix (172.16.96.224/28)

10101100.00010000.01100000.11100000, and

Prefix (172.16.96.240/28)

10101100.00010000.01100000.11110000.
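The binary forms above can be reproduced, and the number of leading bits two addresses share can be counted, with a short sketch such as the following. Both helper names are hypothetical.

```python
import ipaddress

def dotted_binary(address: str) -> str:
    """Render an IPv4 address as four dot-separated 8-bit groups."""
    packed = ipaddress.ip_address(address).packed
    return ".".join(f"{octet:08b}" for octet in packed)

def common_leading_bits(a: str, b: str) -> int:
    """Count how many leading bits two IPv4 addresses have in common."""
    difference = int(ipaddress.ip_address(a)) ^ int(ipaddress.ip_address(b))
    # The highest set bit of the XOR marks the first position where they differ.
    return 32 - difference.bit_length()

print(dotted_binary("172.16.96.128"))
print(dotted_binary("172.16.96.165"))
print(common_leading_bits("172.16.96.128", "172.16.96.165"))
print(common_leading_bits("172.16.96.192", "172.16.96.165"))
```

The destination shares exactly 26 leading bits with 172.16.96.128, which satisfies that prefix's /26 mask. It shares only 25 with 172.16.96.192, which falls short of the 27 bits that prefix requires.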

| Known Prefixes | Next-Hop Gateway | Metric | Status |
|---|---|---|---|
| 172.16.96.64/27 | 172.16.96.67 | 0 (connected) | Up |
| 172.16.96.96/27 | 172.16.96.66 (RD) | 11 | -- |
| 172.16.96.128/26 | 172.16.96.65 (RC) | 11 | -- |
| 172.16.96.192/27 | 172.16.96.193 | 0 (connected) | Up |
| 172.16.96.224/28 | 172.16.96.225 | 9 (connected) | Up |
| 172.16.96.240/28 | 172.16.96.226 (RA) | 20 | -- |

Figure II.7: Router RB's forwarding table.

The best-matching forwarding table entry's /26 mask boundary falls after the 26th bit: the first 26 bits of the prefix and of the destination address are identical. The destination address's best-matching prefix is therefore 172.16.96.128/26.¹⁵ Thus, according to RB's forwarding table, the packet needs to be forwarded to RC. By the same exact calculation that A did when it decided that IPB was a nonlocal address, RB also sees that IPB is nonlocal, and additionally, it sees that RC claims to know how to reach this packet's destination prefix.
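RB's lookup can be sketched as a longest-match search over the entries of Figure II.7. This is a simple linear scan for illustration, not a claim about how real routers organize their tables; the names `RB_TABLE` and `longest_match` are hypothetical.

```python
import ipaddress

# (destination prefix, next-hop gateway) pairs taken from Figure II.7.
RB_TABLE = [
    ("172.16.96.64/27", "172.16.96.67"),
    ("172.16.96.96/27", "172.16.96.66"),    # via RD
    ("172.16.96.128/26", "172.16.96.65"),   # via RC
    ("172.16.96.192/27", "172.16.96.193"),
    ("172.16.96.224/28", "172.16.96.225"),
    ("172.16.96.240/28", "172.16.96.226"),  # via RA
]

def longest_match(table, dst):
    """Return (prefix, next hop) for the matching entry with the longest mask, or None."""
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, next_hop in table:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best

match = longest_match(RB_TABLE, "172.16.96.165")
print(match)
```

For B's address the search selects 172.16.96.128/26 with next hop 172.16.96.65, which is RC. A destination that matches no entry yields `None`, the case in which a router reports an error instead of forwarding.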

Just as A needed to use ARP to find RB(MAC3), RB needs to use ARP to find RC(MAC1). Once RB knows RC's MAC address on their common subnetwork, RB can forward the packet to RC. The appearance of the new frame is illustrated in Figure II.8. The MAC header structure is a bit different over FDDI, which will be covered later in Part II. The important part for this discussion is observing that the MAC-layer source and destination addresses change at each hop, as well as noting that the IP packet's addressing is unchanged at each hop.

Observe that the IP portion of the packet is unchanged, except that the Time-To-Live field will have been decremented by one, forcing the recomputation of the IP Header Checksum at each intervening router. At the MAC layer, the SA and DA fields reflect the fact that RB is now the source and RC is now the destination on this example's intermediate hop. Finally, due to the changes in the MAC and IP headers, the FCS will end up being a different value from the first frame as well.
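The per-hop changes to the IP header can be sketched as below. The sketch assumes a 20-byte IPv4 header with no options; the header contents other than the two addresses (total length, protocol number, initial TTL) are made-up values, and the function names are hypothetical.

```python
import socket
import struct

def ip_checksum(header: bytes) -> int:
    """Ones' complement of the ones' complement sum of the header's 16-bit words."""
    total = sum(struct.unpack(f"!{len(header) // 2}H", header))
    while total >> 16:                       # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_header(src: str, dst: str, ttl: int = 64) -> bytes:
    """Build a minimal 20-byte header (hypothetical field values) with a valid checksum."""
    hdr = bytearray(struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0, 0, ttl, 6, 0,
                                socket.inet_aton(src), socket.inet_aton(dst)))
    hdr[10:12] = struct.pack("!H", ip_checksum(bytes(hdr)))
    return bytes(hdr)

def forward_hop(header: bytes) -> bytes:
    """What a router changes: decrement TTL (byte 8), recompute checksum (bytes 10-11)."""
    hdr = bytearray(header)
    if hdr[8] <= 1:
        raise ValueError("TTL expired; the packet would be discarded")
    hdr[8] -= 1
    hdr[10:12] = b"\x00\x00"                 # checksum field is zero while summing
    hdr[10:12] = struct.pack("!H", ip_checksum(bytes(hdr)))
    return bytes(hdr)

original = build_header("172.16.96.214", "172.16.96.165")
forwarded = forward_hop(original)
print(original.hex())
print(forwarded.hex())
```

Comparing the two hex strings shows that only the TTL and checksum bytes differ; the source and destination addresses are identical before and after the hop.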

Figure II.8: The packet's intermediate encapsulation.

Step 3: Once RC receives the packet, it also needs to examine the packet's destination address and decide how to forward it. In this case, the packet's destination (172.16.96.165) happens to match one of RC's attached subnetworks. Recalling the binary representations of the prefixes above, we see that the 172.16.96.128/26 prefix is, indeed, the longest match.

Prefix (172.16.96.128/26)

10101100.00010000.01100000.10000000, and

IPB (172.16.96.165)

10101100.00010000.01100000.10100101.

This is the very same calculation that RB did, with the difference that RC is actually directly connected to the destination's subnetwork, rather than the case with RB, in which RC was simply the next hop on the way to the final destination. Figure II.9 shows RC's forwarding table; note the differences from RB's forwarding table. Note that there is one fewer destination prefix than in RB's forwarding table, apparently because RB has aggregated 172.16.96.224/28 and 172.16.96.240/28 into a single prefix, namely 172.16.96.224/27.

| Known Prefixes | Next-Hop Gateway | Metric | Status |
|---|---|---|---|
| 172.16.96.64/27 | 172.16.96.65 | 0 (connected) | Up |
| 172.16.96.96/27 | 172.16.96.66 (RD) | 11 | -- |
| 172.16.96.128/26 | 172.16.96.129 | 0 (connected) | -- |
| 172.16.96.192/27 | 172.16.96.67 (RB) | 11 | Up |
| 172.16.96.224/27 | 172.16.96.67 | 11 | Up |

Figure II.9: Router RC's forwarding table.
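The aggregation of the two /28 prefixes into one /27, noted before the table, can be checked with the standard `ipaddress` module. This is only a verification of the arithmetic, not a description of how RB performs the aggregation.

```python
import ipaddress

subnets = [
    ipaddress.ip_network("172.16.96.224/28"),
    ipaddress.ip_network("172.16.96.240/28"),
]

# collapse_addresses merges adjacent networks into the fewest covering prefixes.
aggregated = list(ipaddress.collapse_addresses(subnets))
print(aggregated)
```

The two /28 prefixes are adjacent halves of 172.16.96.224/27, so they collapse into that single prefix, and RC needs only one forwarding table entry for both.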

Now that RC has realized that it needs to deliver the packet to a directly connected subnetwork, it requires information derived from ARP as we have seen in every step thus far. Once the router knows MACB, it can send a frame directly to the destination, IPB. The final encapsulation will appear as illustrated in Figure II.10. This packet is definitely the final hop packet because the subnetwork-layer destination address (MACB) and the IP destination address (IPB) are associated with the same machine.

Figure II.10: The last hop: Delivery to the packet's destination.

This example had the first-hop and last-hop subnetworks using the same technology, but obviously that is not a requirement. An IP packet may originate on any subnetwork type, e.g., a Token Ring LAN, and after crossing some intermediate Ethernet, FDDI, ATM, or Frame Relay subnetworks, may ultimately land on an SMDS subnetwork--or any other type of LAN or WAN destination subnetwork.

The important thing to carry away from this example is that each step in the forwarding process is the same as all the others. Whether it is the endstation deciding whether a destination is local or not, or a router along the path making the same decision, the forwarding decision is the same algorithm. What is interesting is how the intermediate routers build their forwarding tables. This is the realm of routing protocols, two of which (RIP and OSPF) will be covered in Part III.

There are more routing protocols; however, they share many similarities with RIP and OSPF. The fact that RIP and OSPF are commonplace makes them good prototypical examples. The job that routing protocols do is to exchange information among themselves so that they will know what destination prefixes are reachable.¹⁶ Also, the routing information is often tagged with "metrics." These metrics allow the routers to decide which path toward a destination is "best," in the event that multiple paths appear to exist.
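The role of metrics can be sketched in miniature. The example assumes that a lower metric is better, as is the case in RIP and OSPF; the advertisement format and the function name are hypothetical, and the second candidate path for each prefix is invented for illustration.

```python
def build_forwarding_table(advertisements):
    """Keep, for each prefix, the advertised next hop with the lowest metric."""
    table = {}
    for prefix, next_hop, metric in advertisements:
        current = table.get(prefix)
        if current is None or metric < current[1]:
            table[prefix] = (next_hop, metric)
    return table

# Hypothetical advertisements: two candidate paths for each prefix.
adverts = [
    ("172.16.96.128/26", "172.16.96.66", 21),
    ("172.16.96.128/26", "172.16.96.65", 11),
    ("172.16.96.96/27", "172.16.96.66", 11),
    ("172.16.96.96/27", "172.16.96.65", 21),
]
table = build_forwarding_table(adverts)
print(table)
```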

For the moment, we don't need to understand the operation of routing protocols. With this foundation--understanding the IP forwarding procedure--we will now begin a detailed examination of subnetwork technologies, beginning with Ethernet.

Endnotes - II

1. Quality of Service. A term used to indicate fine-grained parameters related to the connection, or "flow," such as end-to-end bandwidth, delay, variation of delay (i.e., jitter), etc.

2. Class of Service. A term used to indicate coarse-grained QoS-like parameters related to perhaps a handful of different priorities or service levels by which traffic flows may be classified.

3. The author regrets invoking the terms "multimedia" and "converged," because they have become overused and are now practically meaningless. However, they may convey the type of applications, and the evolutionary direction of networking, so in that sense they help paint the proper picture.

4. Typically, WAN subnetworks operate alongside a reference clock that is provided by the subnetwork. In this event, the only service that the framing provides is to delimit one frame from the next; clock synchronization "just happens."

5. Despite its name, Asynchronous Transfer Mode (ATM) is not such a protocol; it is asynchronous in the sense that its cells have no fixed timing relationship(s).

6. WAN subnetworks may only have a destination address in the frame, especially switched WAN services such as ATM, frame relay, and X.25. Point-to-point links have implicit addressing, in that the other side is unique, so sending to "the other side" is logically the same as sending to "every other station." So, in the point-to-point case, unicast and broadcast are effectively the same thing.

7. The IEEE has officially deprecated 16-bit LAN addresses, but certain subnetworks such as FDDI and token ring do support them, and it is possible that certain installations are still using them.

8. This maximum frame length is more commonly known as the medium's Maximum Transmission Unit (MTU).

9. ARP frames have no IP header; they are layered immediately above the MAC-layer LAN header.

10. Broadcast is less than optimal, but some method must be used to bootstrap the address discovery process. An IP-specific "ARP multicast" address could have been used, so that only IP endstations would hear the ARP traffic. Contrast this with the broadcast method, in which even non-IP endstations must receive the broadcast frame.

11. The R in NHRP is often decoded as "Routing," which is incorrect. That R actually stands for "Resolution." See RFC-2332 or its successor for all the gory details (including the correct name, available on page 1).

12. CCITT X.121 is the standard that specifies the format of X.25 addresses.

13. Local destinations are on the same subnetwork medium as the endstation, and share the same IP prefix.

14. The forwarding table is built by the routers using routing protocols. The job of routing protocols is to exchange information among themselves so that they will know what destination prefixes are reachable.

15. Because we consider a matching prefix to be "best" when its mask is longest, this type of forwarding style is known as "longest-match forwarding." Another form of forwarding is "exact match," which is used by Novell's IPX, as well as other protocols that have fixed-size network numbers (as opposed to IP's variable-sized extended-network-prefixes).

16. I have always found it to be somewhat magical that the routers use the network itself to find out about the network. Hoisted up by its own petard, indeed!

Chapter 6 - LAN Interconnection

Introduction

All LAN types need to be interconnected; it is not practical for one LAN to span the whole world, or even an entire company (unless it is very small). LAN interconnection devices typically operate at either OSI Layer 1 (the Physical layer), Layer 2 (the Data Link layer), or Layer 3 (the Network layer).

As shown in Figure 6.1, frames arriving on a LAN interconnection device's interface have three processing choices: 1) simply flow straight through the device [literally bit-by-bit, in the case of pure layer-1 repeaters], 2) be received by the device, which verifies the layer-2 frame check sequence to ensure error-free reception,¹⁷ then forwards the packet based on its layer-2 destination address, or 3) be received by the device, pass layer-2 input error checks (e.g., FCS), and then be forwarded based on the layer-3 destination address, after having the proper outgoing layer-2 header and trailer attached.¹⁸ Each layer depends on the one immediately below it.

Interconnection at OSI Layer 1

Layer 1 Ethernet interconnection devices are called "repeaters" or "hubs."¹⁹ Layer 1 devices simply pass bits along, with little or no knowledge of the Data Link layer frame structure. A repeater is depicted in Figure 6.2. A hub may attach to multiple endstations, or to other hubs (within limits set by the Ethernet standard). Repeaters allow the physical extent of a LAN to be expanded beyond the distance constraints of a single wire, and they allow different Ethernet media to be mixed within the same extended LAN.

Multiport transceivers, used with thick Ethernet, are a kind of hub, in that they allow multiple stations to share one "vampire tap." There are also many special-purpose repeaters, allowing dissimilar Ethernet physical layers to be interconnected. So-called "micro-transceivers" are a kind of repeater, allowing an AUI port to supply power to a small device that may have 10BASE-T, 10BASE-FL, or 10BASE2 thin Ethernet output.20 These devices allow the thick Ethernet AUI port to be a kind of universal Ethernet port. Today, most devices ship with either an AUI port or a native 10BASE-T port, since twisted pair is now, by far, the most prevalent physical layer over which Ethernet is run.

Figure 6.2: Logical internal diagram of a repeater.

Ethernet repeaters are like a "wire in a box," in that they are usually diagrammed just as an Ethernet cable. In all important ways, hubs behave just as if all the attached stations were sharing an internal 10 Mbps cable. Figure 6.3 shows a physical diagram of a repeater/hub.

Figure 6.3: Physical view of a 16-port hub.

This example hub has 16 ports, with one "backbone" port. Most hubs have some sort of backbone slot, allowing the hub to attach to another type of Ethernet. An AUI port could be used to attach to a thick Ethernet "backbone," or hubs could be chained from one medium to another. Returning to the hub in Figure 6.3, it is sometimes drawn as a wire with tick marks representing endstation attachments. It is most often drawn simply as a wire, as one might draw a thicknet. Figure 6.4 shows these two different logical views of a hub.21

Figure 6.4: Logical views of a hub.

Backbone attachment can be accomplished in 10BASE-T environments by using a dual-mode port (labeled MDI/MDIX) with a "straight-through" patch cable, or any regular MDI port with a crossover cable. MDI stands for "medium-dependent interface," referring to the Ethernet physical layer reference model, as shown in Figure 6.5.6

Figure 6.5: 10 Mbps 802.3 reference model.

10BASE-T Wiring

In the context of 10BASE-T, a hub's ports are typically MDIX ports, allowing the MDI ports of the endstation to be directly patched in with a "straight-through" patch cable. Fundamentally, an endstation's "transmit" pins need to be connected to a hub's "receive" pins; likewise, the hub's "transmit" pins need to be connected to the endstation's "receive" pins. One of the hub's functions is to perform this electrical crossover function--the crossover happens inside the hub. Figure 6.6 illustrates the 10BASE-T patch cable connector wiring diagram, as well as a schematic of the way the patch cable facilitates the endstation's attachment to the hub.

Figure 6.6: 10BASE-T pinout and hub attachment.

When a hub has a port labeled MDI/MDIX, this port may be switched so it acts like an endstation, which facilitates the chaining of one hub to another using a straight-through patch cable. Another way to interconnect two hubs (when one of the two does not have a switchable MDI/MDIX port) is to use a 10BASE-T crossover cable. Figure 6.7 shows how a "crossover cable" swaps transmit and receive inside the cable, so two endstations or two hubs may be interconnected. In a way, a crossover cable is like a two-port hub.

Figure 6.7: Crossover cable functionality.
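
The cabling rule described above can be condensed into a few lines. The following Python sketch is illustrative only: the function name cable_needed and the port-type strings are assumptions made for this example. The pin numbers are the usual 10BASE-T assignments for an MDI port (transmit on pins 1 and 2, receive on pins 3 and 6).

```python
# Illustrative sketch; the names below are hypothetical, not a standard API.

# 10BASE-T pin roles as seen by an MDI port (e.g., an endstation).
MDI_PINS = {1: "TX+", 2: "TX-", 3: "RX+", 6: "RX-"}

# A crossover cable swaps the transmit pair with the receive pair.
CROSSOVER_MAP = {1: 3, 2: 6, 3: 1, 6: 2}


def cable_needed(port_a: str, port_b: str) -> str:
    """Return the patch cable type for two ports, each "MDI" or "MDIX"."""
    for port in (port_a, port_b):
        if port not in ("MDI", "MDIX"):
            raise ValueError(f"unknown port type: {port}")
    # Exactly one crossover must exist in the path. Unlike ports already
    # provide it (inside the MDIX port); like ports need it in the cable.
    return "straight-through" if port_a != port_b else "crossover"
```

An endstation (MDI) attached to a hub port (MDIX) yields "straight-through," while two hubs joined through ordinary MDIX ports yield "crossover."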

Layer 1: Repeaters

Repeaters are the least sophisticated devices that may be used to interconnect two or more Ethernets. A repeater does nothing more than copy bits from one port to all the other ports. Repeaters are strictly Physical Layer devices, with little intelligence. One limitation of repeaters is that they propagate errored frames because they have no concept of what they are doing beyond the bit level.

Repeaters exist to extend the size of an Ethernet. A 10BASE5 Ethernet may only have 100 nodes, and a 10BASE2 segment is limited to 30 nodes. Interconnecting more nodes than that requires some kind of LAN extension device. Remember that 10BASE5 is limited to 500 meters per segment, and 10BASE2 segments cannot exceed 185 meters. Given a need to cover a larger distance, a repeater was the simplest device that could do the job. One cannot add repeaters indefinitely. The "four-repeater rule" holds regardless of which physical manifestations of Ethernet are being interconnected. It is easy to forget this rule and interconnect some set of Ethernets with more than four repeaters, which can lead to error conditions such as late collisions.22 The four-repeater rule is illustrated in Figure 6.8.

Figure 6.8: No more than 4 repeaters per Ethernet.

Multiport repeaters were commonplace in many networks at one time. Often these were modular "multi-media" devices, in the sense that they could interconnect thick and thin Ethernet media. Multiport repeaters evolved into hubs, especially once 10BASE-T was standardized. Later, so-called "enterprise" hubs, often supporting multiple logical repeaters within one chassis, and perhaps enhanced network management capabilities, arrived on the scene. Some of these devices also included bridging or routing features, or server cards. Hubs became a strategic product that centralized essential network functionality in the wiring closet.

Interconnection at OSI Layer 2

Layer 2 interconnection devices, for all LAN types, are known as "bridges." Since only four repeater hops can be in a collision domain and still have it be compliant with Ethernet's topology rules, "bridges"23 serve to interconnect Ethernets and help to further extend the distance over which they could operate. Plus, remember that a single Ethernet cable (or an extended LAN), logically a bus, is a shared medium--only one station may transmit at a time. This is another part of the motivation for bridges, which isolate conversations so that local traffic (traffic that does not need to cross the bridge) cannot collide with conversations taking place on other "collision domains" to which the bridge is attached. This traffic isolation is indicated in Figure 6.9, in which traffic between stations A and B can happen at the exact same instant as traffic between stations C and D.24

Figure 6.9: Bridges enable simultaneous data transfer.

Ethernet bridges are designed to be "transparent." They are supposed to be invisible in the sense that endstations operating over a bridged LAN cannot tell it from a "repeater-ed" LAN, or a piece of coaxial cable for that matter. After repeaters, bridges are the next most sophisticated Ethernet interconnection devices. Bridges operate at the Data Link layer. Transparent bridges are one major type of bridge; they receive at least the destination address of each frame, forwarding the frames based on a table they have learned by simply observing which (source) addresses are seen on each port.

One of the advantages of transparent bridges is that they are easy to deploy. (Be warned, however: The IEEE recommends that a given topology be no more than seven bridges in diameter.) Another benefit of bridges is that their presence is transparent to all higher-layer protocols. Bridges only forward packets based on the destination address in the Data Link layer frame.25 Figure 6.10 indicates where bridges fit within the OSI Reference Model, and logically illustrates the layer processing that happens within a bridge.

Figure 6.10: Logical internal diagram of a bridge.

In the mid-to-late 1990s, marketing lingo dispensed with the older label, so "bridges" first evolved into "switches," then into "layer-2 switches" (to distinguish them from "layer-3 switches," which is the new term for products that are otherwise known as "routers"). Despite their "new" name, they are still bridges. The principal difference between bridges and switches is speed. As far as their layer-2 functionality goes, they are identical.

Bridges were originally software-based devices that had relatively few ports (e.g., eight or fewer). Even though each frame was forwarded by software, it was still possible to achieve wire-speed or near-wire-speed performance with reasonably fast microprocessors--and efficient forwarding code! Switches are mostly hardware-based, especially the components related to receiving, forwarding, and transmitting frames. These are well-defined and relatively straightforward tasks, suitable for implementation in hardware.

Various companies have put the necessary MAC-layer and bridging functionality on a single chip (usually an Application-Specific Integrated Circuit, or ASIC), allowing much higher port density and enabling aggregate forwarding capacity to increase as the number of ports increases, without increasing the cost of the devices. Switches used to be more expensive, per port, than hubs, but competition and market maturity have made switches so much cheaper that hubs are fading away. Switches offer far more performance for the price, even if they cost slightly more per port than a hub.

Bridge Types

Transparent Bridging (Ethernet)

Ethernet bridges are known as "transparent bridges" because they are "plug and play" devices. If you insert a transparent bridge in the middle of a repeater-interconnected Ethernet, it just learns where stations are by watching them talk. If a source address is seen on a given port, then any traffic destined for that station should be forwarded out that port. The "bridge forwarding table" is simply a list of MAC addresses alongside the ports on which they were learned.

If a frame is seen that has an as-yet unknown destination, the bridge must flood that frame across all ports (except the one the frame arrived on)26 so that it can reach the destination station, if it exists. When a transparent bridge first starts up, it spends time in a "learning" mode so it learns which of its ports lead to which of the stations it has seen traffic from.
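
The learn-then-forward behavior just described can be sketched in Python. This is a minimal illustration, not a real bridge implementation: the class name LearningBridge and its method are invented for this example, and it assumes no table aging, no Spanning Tree Protocol, and frames reduced to a source and destination MAC address.

```python
class LearningBridge:
    """Minimal transparent bridge: learn source addresses, forward or flood."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.table = {}  # MAC address -> port on which it was last seen

    def receive(self, in_port, src_mac, dst_mac):
        """Process one frame; return the list of ports it is sent out."""
        # Learn: the source is reachable through the arrival port.
        self.table[src_mac] = in_port
        out_port = self.table.get(dst_mac)
        if out_port is None:
            # Unknown destination: flood everywhere except the arrival port.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            # Destination lives on the arrival side: local traffic, filter it.
            return []
        return [out_port]
```

With a three-port bridge, the first frame from A to B is flooded out the two other ports. Once B has replied, later frames from A to B leave only on B's port.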

Source-Route Bridging (Token Ring)

Token Ring bridges are called "source-route bridges" and do not operate by learning where MAC addresses are.27 Transparent bridges are semi-intelligent so that the endstations can operate the same in either bridged or nonbridged environments. On the other hand, source route bridges are less intelligent, forcing the endstations to do more work to operate in a bridged environment. The bridges serve to define "ring numbers" for each unique ring, which are used by the bridges to determine how to follow the frame's source "routing" header.

The forwarding tables in the source-routing bridges are entries of the form (ring number, interface number), thus the forwarding state in the bridges depends on their number of interfaces, not the number of endstations in the bridged LAN.

Source-route bridging puts extra demands on the endstations, while the bridges need only know how to forward the "explorer frames"28 to all rings without creating duplicates, and how to follow the source route tag29 in the frame header. Since all the bridged frames will have a source route tag, the bridges need not do any real work to figure out how to forward frames--the frames' headers contain instructions on how to forward them, in the form of a list of ring numbers.

Every time a token ring station wants to contact another station, it sends out a special "all-routes explorer frame," which is flooded by the bridges until the station is reached (if it exists). The destination station then responds with another broadcast frame, which the station that originally transmitted the explorer frame will presumably hear.30 Now that the original station has a "source route," or a list of bridges a frame must cross in order to reach the destination station, it can create the proper Routing Information Field in the Token Ring header. Each endstation keeps a local cache of information (destination MAC address, source-route tag), so that it need not transmit a broadcast explorer frame every time it wishes to communicate with that MAC address.
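
The division of labor between bridge and endstation can be sketched as follows. This Python fragment is a simplified illustration under stated assumptions: the route is modeled as a plain list of ring numbers (a real Routing Information Field carries more detail), explorer-frame handling is omitted, and all class and function names are hypothetical.

```python
def next_ring(route, current_ring):
    """Return the ring that follows current_ring in the route, or None."""
    if current_ring not in route:
        return None
    idx = route.index(current_ring)
    return route[idx + 1] if idx + 1 < len(route) else None


class SourceRouteBridge:
    def __init__(self, ring_to_interface):
        # Forwarding state: one (ring number, interface) entry per attached
        # ring, independent of how many endstations exist.
        self.ring_to_interface = dict(ring_to_interface)

    def forward(self, route, arrival_ring):
        """Return the output interface, or None if this bridge is not on the path."""
        target = next_ring(route, arrival_ring)
        return self.ring_to_interface.get(target)


class EndStation:
    def __init__(self):
        self.route_cache = {}  # destination MAC -> list of ring numbers

    def route_for(self, dst_mac):
        """Return the cached route, or None if an explorer frame is needed."""
        return self.route_cache.get(dst_mac)
```

Note that the bridge never consults a MAC address. It reads the next ring number from the frame and looks up the matching interface.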

"Learning Bridges" and the Spanning Tree Protocol

Transparent bridges learn where stations are by remembering which port each active source address was heard on. When a frame needs to be forwarded to some destination, the bridge looks up that destination in its "bridge forwarding table" and sends the frame out the port on which that MAC address was learned. Destinations that have not yet been learned are flooded along a "spanning tree" that is constructed among the set of bridges so that there are no loops in the topology. A spanning tree is literally a tree that spans (connects) each bridge in the topology. Trees have the property that they are free of loops in the topology.

A topology with a loop would create broadcast storms, and would even replicate unicast frames endlessly. A spanning tree is constructed because it reaches every LAN in the topology without forming a loop. The tree ensures that only one copy of a packet will be seen on any LAN in the entire topology. The Spanning Tree Protocol is the very simple protocol that ensures that the topology is loop free and that creates the spanning tree. Each bridge periodically transmits a Bridge Protocol Data Unit (BPDU), which serves to inform all other bridges (if they exist) of its presence.

The Spanning Tree Protocol was invented by Dr. Radia Perlman, then of Digital Equipment Corporation (DEC).31 It was later standardized by the IEEE as part of the 802.1D bridging specification. The two versions are not quite identical, and you will sometimes see bridges that can be configured to speak either the DEC or IEEE dialect of the protocol. Figure 6.11 shows an example of a spanning tree. Note that certain bridge ports are "blocked," (indicated by dashed lines) so that duplicate frames are not created by the loops that would otherwise have been present.

Figure 6.11: A spanning tree.

As indicated in the topology, one of the bridges has been elected as the "root" bridge. All frames that are either broadcast, multicast, or being sent to unknown destinations are flooded toward the root bridge.32 This ensures that the frames are conveyed to all the LANs in the bridged topology.

Normally, the Bridge PDUs have a default bridge priority of 32,768 (out of a range from 0-65,535). If no configuration tuning has been done, the bridge with the numerically lowest MAC address will be elected as the root bridge. This might be fine if you have a small bridged network (bandwidth is plentiful and it doesn't really matter to you where the root bridge is). However, in the event of a bridged WAN, it may be very important that the root bridge be on the main campus, near most of the users. If your main office were in New York, with small branch offices scattered throughout the U.S., you wouldn't want all your broadcast packets to be forwarded through Council Bluffs, Iowa. To prevent this from happening, it is possible to configure a bridge with a lower bridge priority value, which makes it win and become the root bridge.33 Conversely, bridges that you do not ever wish to become the root may be configured with a very high bridge priority, even as high as 65,535.
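
The election rule (lowest priority wins, with the lowest MAC address breaking ties) amounts to comparing (priority, MAC address) pairs. The Python sketch below illustrates only that comparison. The function names and the sample MAC addresses are invented for the example, and the exchange of BPDUs that carries this information between bridges is not modeled.

```python
DEFAULT_PRIORITY = 32768


def bridge_id(priority, mac):
    """Build a comparable bridge identifier: priority first, then MAC address."""
    if not 0 <= priority <= 65535:
        raise ValueError("priority must be in the range 0-65535")
    mac_value = int(mac.replace(":", ""), 16)
    return (priority, mac_value)


def elect_root(bridges):
    """bridges maps a name to (priority, mac); return the name of the root."""
    return min(bridges, key=lambda name: bridge_id(*bridges[name]))


# Hypothetical two-site network, both bridges at the default priority.
bridges = {
    "new-york": (DEFAULT_PRIORITY, "00:00:0c:aa:aa:aa"),
    "council-bluffs": (DEFAULT_PRIORITY, "00:00:0c:00:00:01"),
}
```

With default priorities, the Council Bluffs bridge wins purely because of its lower MAC address. Lowering the New York bridge's priority to any value below 32,768 makes it the root regardless of MAC addresses.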

Some bridges begin forwarding frames as soon as they have received the first six bytes (the destination address), which means that the frames incur little delay in passing through the bridge. Of course, errored frames will get through, so this "cut-through" technique is not perfect. Other bridges wait until they have received entire packets and verify the Frame Check Sequence before forwarding them. The delay incurred varies with packet size, since it takes much longer to receive a 1500-byte frame than a 64-byte frame.
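
To put rough numbers on the difference, the following Python sketch computes how long a 10 Mbps bridge must wait before it can begin forwarding. It is a back-of-the-envelope illustration: the function name is made up for this example, and the calculation ignores the preamble and any internal processing time.

```python
LINK_RATE_BPS = 10_000_000  # 10 Mbps Ethernet


def receive_time_us(num_bytes, rate_bps=LINK_RATE_BPS):
    """Microseconds needed to receive num_bytes at the given bit rate."""
    return num_bytes * 8 * 1_000_000 / rate_bps


cut_through_wait = receive_time_us(6)    # destination address only
store_small_wait = receive_time_us(64)   # minimum-size frame
store_large_wait = receive_time_us(1500) # 1500-byte frame
```

Under these assumptions, a cut-through bridge waits about 4.8 microseconds, while a store-and-forward bridge waits about 51.2 microseconds for a 64-byte frame and about 1,200 microseconds for a 1500-byte frame.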

Bridge designers always had this choice of forwarding while receiving versus receiving before forwarding. These different forwarding approaches were mostly academic while bridges were still relatively slow, software-based devices. However, as bridging performance approached "wire speed," these differing delay characteristics became a key marketing differentiator for first-generation Layer-2 switches.

Collision Domains and Broadcast Domains

Today's Ethernet environments are increasingly dominated by Ethernet "switching," so collisions are less and less likely since the switch (still a transparent bridge) limits the "collision domain" to those stations reachable via a given switch port. Ultimately, this trend indicates that there may be just one endstation attached to a port, with its own private collision domain that it shares with the switch port. Note that this is the degenerate case for the binary exponential backoff algorithm, and thus is vulnerable to the capture effect.34 However, a switch port may also connect to a set of hubs and/or switches, which could potentially reach hundreds of endstations.

The collision domain is the set of Ethernet segments that can directly "hear" each other's frames. Remember that any collision must happen within the frame's first slot-time (512 bits, i.e., 64 bytes), so the time it takes for a signal to propagate across a LAN and return to the starting point must be less than the time it takes to transmit 512 bits. Any set of Ethernets connected with repeaters is a single collision domain, since the repeaters simply copy the bits from one port to every other port. MAC-layer bridges terminate the collision domains, but preserve the MAC-layer broadcast (or multicast) domain, which is the set of stations that will receive a broadcast frame sent by any of them. Collision domains and their relationship to broadcast domains are illustrated in Figure 6.12.

Figure 6.12: Collision domain versus broadcast domain.

When there are no bridges, the collision domain and broadcast domain both encompass the same set of end-nodes. Once bridges are added, each collision domain is a proper subset of the MAC-layer broadcast (or multicast) domain, which is the superset of all the end-nodes. Each bridge port defines the edge of a collision domain.
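
One way to make the distinction concrete is to treat segments as nodes and interconnection devices as links, then group segments by the kinds of device that join them. The Python sketch below does this with a simple union-find. The topology, the segment names, and the function name domains are all assumptions made for this example.

```python
def domains(segments, links, kinds):
    """Group segments into connected sets, using only links of the given kinds."""
    parent = {s: s for s in segments}

    def find(s):
        # Follow parent pointers to the representative of s's group.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for kind, a, b in links:
        if kind in kinds:
            parent[find(a)] = find(b)

    groups = {}
    for s in segments:
        groups.setdefault(find(s), set()).add(s)
    return sorted(groups.values(), key=sorted)


# Hypothetical topology: two repeater-joined pairs with a bridge between them.
segments = ["s1", "s2", "s3", "s4"]
links = [("repeater", "s1", "s2"), ("bridge", "s2", "s3"), ("repeater", "s3", "s4")]

collision_domains = domains(segments, links, {"repeater"})
broadcast_domains = domains(segments, links, {"repeater", "bridge"})
```

Repeaters alone merge segments into collision domains, giving two here. Repeaters and bridges together merge them into a single broadcast domain.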

Evolution of Bridging: Switching

Throughout the late 1980s and early 1990s bridges had few ports, on the order of four to twelve. Bridges also were not "wire speed" devices, meaning that if they had eight ports, they could not necessarily support 40 Mbps of unicast traffic. Why not 80 Mbps? Clearly, since 10 Mbps Ethernet is a half-duplex technology, the most traffic that can ever be on a link is 10 Mbps. Any traffic that the bridge receives has obviously arrived on some port, and it needs to exit via a different port. If one considers the eight ports as being four "in" ports and four "out" ports, then there is clearly only enough capacity in the links to support 10 Mbps throughput per in/out pair. Granted, each port will have a mix of in and out traffic, but the maximum is still 10 Mbps total per pair of ports. Generically, a switch with N half-duplex 10 Mbps ports needs to have a maximum internal forwarding capacity of (N/2)*10 Mbps. This discussion is illustrated in Figure 6.13.

Figure 6.13: Rationale for high-speed uplinks.
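
The arithmetic behind this bound is simple enough to state as a one-line function. The Python sketch below is illustrative, and its function name is an assumption. It considers unicast traffic only and half-duplex ports of a single speed.

```python
def max_unicast_throughput_mbps(num_ports, port_rate_mbps=10):
    """Upper bound on aggregate unicast throughput for half-duplex ports."""
    # Each flow occupies one "in" port and one "out" port, so at most
    # num_ports // 2 flows can run at full rate at the same time.
    return (num_ports // 2) * port_rate_mbps
```

For the eight-port example, the bound is 40 Mbps rather than 80 Mbps.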

In 1992 or so, hardware implementations of "a bridge on a chip" enabled the emergence of the first Ethernet "switches." Ethernet switches were nothing more than bridges that were capable of near wire-speed or wire-speed performance. Generally, they also had more ports than the bridges which had preceded them. The inevitable march of technology has given us switches today that are fully wire-speed devices, and performance has ceased to be a differentiator for switches. Today, all switches are expected to be wire-speed and vendors differentiate their switches by incorporating extra value-added features, such as VLANs or class-of-service queueing.

A 10 Mbps Ethernet switch will usually support Fast Ethernet (100 Mbps, half- or full-duplex) uplinks, or Asynchronous Transfer Mode OC-3 uplinks (155.52 Mbps, full duplex) supporting LAN Emulation. ATM LAN Emulation, or "LANE" for short, is a standard from the ATM Forum that standardizes Ethernet and Token Ring bridging over ATM backbones. Switches with Fiber Distributed Data Interface (FDDI) uplinks are also very common, and Gigabit Ethernet uplinks (1000 Mbps) began emerging in 1998.

Rather than give the impression that Ethernet switches are the only type, note that DEC's GigaSwitch(TM) was an early FDDI switch, and is still used at many Internet Exchange points. Many other vendors sell FDDI switches today, and Token Ring switches are also available. Of course, as implied by LANE, ATM switches are also on the market.

Practical Overview of Switch Design

The earliest LAN switches to appear on the market were not necessarily wire-speed devices. At the time, only one form of Ethernet existed: 10 Mbps, which is half-duplex by design. As discussed above, a switch can only handle (N/2)*10 Mbps, where N is the number of ports. For example, the switch on the left in Figure 6.13 can clearly only accommodate four simultaneous 10 Mbps data flows.

If multiple ports have data needing to exit via the same output port, then the switch must buffer the data. The worst-case scenario for a single-speed switch is that all N-1 of the ports will have data needing to exit via only one of the ports. In this case, the switch offers no more speed than a hub, and probably less.35 Bear in mind that these best- and worst-case scenarios have only considered unicast traffic flows. If any port receives a broadcast or multicast, that frame must be flooded out the other N-1 ports (except ones which the Spanning Tree Protocol has placed in the "blocked" state).

Now that Fast Ethernet exists, a switch may have 12 to 36 (or more) 10 Mbps Ethernet ports, with one to three (or more) Fast Ethernet "uplink" ports. These faster ports may be attached to "in-demand" devices such as links to core switches, or routers, or popular servers. This neatly addresses the unicast worst-case scenario, provided the popular device has been attached to a high-speed port. However, traffic management is still an issue that must be carefully considered, as it is still possible for more than 10 Mbps of traffic to be waiting for a 10 Mbps port at any given time.

Interconnection at OSI Layer 3

Routers can also be viewed as LAN interconnection devices, but they isolate each LAN domain into a separate "subnet" or "network number," depending on which protocol stack you are "routing." Routers forward packets based on the destination address at the Network layer (layer 3). A router's logical structure is shown in Figure 6.14.

Figure 6.14: Logical internal diagram of a router.

Like any LAN endstation, a router receives a frame if it is either a) a unicast frame addressed to one of the router's interfaces, b) a frame addressed to the LAN broadcast address, or c) a frame addressed to a multicast address. Broadcast packets for protocols that the router is not participating in may be dropped.36 Multicast packets for groups with no internal forwarding state may be dropped. Unicast frames addressed to one of its interfaces must be interpreted, and possibly forwarded, provided the packet corresponds to a protocol that is currently being routed.

If a frame is addressed to one of the router's interfaces, and the packet inside is also addressed to one of the router's layer-3 interface addresses, then the packet is meant for the router itself. Packets that the router must forward are addressed to the router at layer 2, but not at layer 3; their ultimate (network-layer) destination is elsewhere. Packets addressed to the router itself will have one of the router's interface MAC addresses in the Data Link layer Destination Address field, and one of the router's Network layer addresses in the packet's Destination Address field.
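
The two-level test described above can be sketched as a small classification function. This Python fragment is a simplified illustration: it handles unicast frames only, represents addresses as plain strings, and uses invented names (classify, the sample addresses) that do not correspond to any real router software.

```python
def classify(frame_dst_mac, packet_dst_addr, my_macs, my_l3_addrs):
    """Decide what a router does with a received unicast frame."""
    if frame_dst_mac not in my_macs:
        # Not addressed to this router at layer 2.
        return "ignore"
    if packet_dst_addr in my_l3_addrs:
        # Addressed to the router at both layers: meant for the router itself.
        return "local"
    # Addressed to the router at layer 2 only: destined elsewhere.
    return "forward"


# Hypothetical router with one interface.
ROUTER_MACS = {"00:00:0c:01:02:03"}
ROUTER_ADDRS = {"192.0.2.1"}
```

A Telnet session to the router itself falls in the "local" case, while ordinary transit traffic falls in the "forward" case.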

A router acts as a protocol endstation on each of its interfaces, as well as supporting forwarding between interfaces, but not all packets that a router receives need to be forwarded. For instance, if a telnet session is connected to the router, packets are sent to and fro between the router and the device at the other end of the connection. Again, these packets are actually addressed to one of the router's interface addresses, at both layer 2 and layer 3. Other packets that may be directly addressed to the router are Internet Control Message Protocol (ICMP) packets and Simple Network Management Protocol (SNMP) packets, as well as packets associated with routing protocols, such as routing updates, etc. Lately, many routers are also manageable via Hypertext Transfer Protocol (HTTP) in so-called "Web-based management."

The size of a network layer address depends on the protocol stack. AppleTalk Phase II has a three-byte address consisting of a two-byte network number and a one-byte node address. DECnet Phase IV addresses are only two bytes, consisting of a 10-bit node address and a 6-bit network number. Novell's IPX is a close derivative of Xerox's XNS "Internet Datagram Protocol," both of which use four-byte network numbers concatenated with six-byte MAC addresses, which are used as the node addresses. IPv4 addresses are four bytes long, with no fixed boundary between the "network number" and node address.37 IPv6 addresses are 16 bytes, allowing for multilevel hierarchical addressing structures. The prize for the largest addresses goes to OSI's Connectionless Network Protocol (CLNP), which uses hierarchical addresses that are up to 20 bytes long. Both IPv6 and CLNP addresses support using a node's literal MAC address as a token within the host portion of the network-layer address, because the MAC address is a convenient six-byte number that is virtually guaranteed to be unique.38
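
Two of these layouts are compact enough to show as bit and byte manipulations. The Python sketch below is illustrative, with hypothetical function names. It assumes the DECnet network number occupies the six high-order bits of the 16-bit address, and that an IPX address is handled as ten bytes (network number followed by node address).

```python
def decnet_pack(network, node):
    """Combine a 6-bit network number and a 10-bit node address into 16 bits."""
    if not 0 <= network < 2**6:
        raise ValueError("network number must fit in 6 bits")
    if not 0 <= node < 2**10:
        raise ValueError("node address must fit in 10 bits")
    return (network << 10) | node


def decnet_unpack(addr):
    """Split a 16-bit DECnet address into (network, node)."""
    return addr >> 10, addr & 0x3FF


def ipx_node(address: bytes) -> bytes:
    """Return the six-byte node (MAC) portion of a ten-byte IPX address."""
    if len(address) != 10:
        raise ValueError("expected 4-byte network + 6-byte node")
    return address[4:]
```

The small DECnet field widths make the scaling problem visible: at most 64 network numbers of at most 1,024 nodes each.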

Routers forward packets based on their destination address at the Network layer of the OSI Reference Model. Bridges are independent of higher-layer protocols, while routers are intimately protocol-dependent. Routers define boundaries for protocols, allowing them to scale beyond operation over a single LAN. To one degree or another, each Network layer protocol depends on the judicious placement of routers for its scalability. Some protocols, such as DECnet Phase IV and AppleTalk have very small address spaces that are incapable of supporting operation on a global scale. Generally speaking, a protocol's addresses need to be big enough to support at least two layers of hierarchical structure.

Remember that bridges forward frames based on their Data Link layer destination address, also known as MAC address in the case of most LANs.39 Routers, for "routable" protocols (including IP), terminate both broadcast domains and collision domains. Broadcast domains have a one-to-one correspondence with distinct IP prefixes (a.k.a. "subnets"), or distinct network numbers in other protocols. Despite the fact that routers terminate broadcast domains, they may be configurable to selectively forward certain types of broadcast traffic, e.g., UDP/IP broadcasts of DHCP, NetBIOS, etc. In such cases, the routers are configured to pick up only certain broadcast packets and send their data--within a new unicast IP header--to a particular server.

Besides the very special cases of broadcast traffic above, the only traffic that crosses a router is unicast or network-layer multicast traffic. Unicast traffic leaving a broadcast domain is addressed to a router's MAC address, with the destination IP address indicating the ultimate target. Multicast traffic, in the case of multicast IP, is not addressed to the router's MAC layer address, but simply sent to the MAC-layer equivalent of the IP multicast destination address. IP multicast packets are not addressed to the router at either the Network or Data Link layer! The router's job is to take such packets off the wire (by noticing that they have a MAC-address prefix used by IP multicast), decide if they should be forwarded, and if so, where. How the packet is forwarded depends partly on the multicast routing protocol(s) in use, and partly on the multicast forwarding state that has already been built up within the router. (Note that OSI's CLNP also supports multicast at the network layer.)
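
The "MAC-layer equivalent" of an IP multicast address can be computed directly. The Python sketch below follows the standard Ethernet mapping for IPv4 multicast, in which the low-order 23 bits of the group address are placed behind the fixed prefix 01:00:5E. The function names are made up for this illustration.

```python
import ipaddress

IP_MULTICAST_MAC_PREFIX = 0x01005E000000  # top 25 bits are fixed


def multicast_mac(group: str) -> str:
    """Map an IPv4 multicast group address to its Ethernet MAC address."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    mac = IP_MULTICAST_MAC_PREFIX | (int(addr) & 0x7FFFFF)
    return ":".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -1, -8))


def is_ip_multicast_mac(mac: str) -> bool:
    """True if the MAC address carries the IP multicast prefix."""
    value = int(mac.replace(":", ""), 16)
    return (value >> 23) == (IP_MULTICAST_MAC_PREFIX >> 23)
```

Because only 23 bits of the group address survive the mapping, several groups share each MAC address, so a router or host must still check the IP destination after accepting the frame.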

Typically, routers are multiprotocol devices, usually also capable of supporting concurrent bridging of protocols that they are not configured to "route." If a router is bridging and routing at the same time, how can it tell if it needs to bridge a packet (i.e., forward it based on its destination MAC address alone), or route it (i.e., forward it based on its destination network-layer address)? There are two possibilities. The one that was commonplace in the late 1980s and early 1990s was based on the EtherType. If a bridge/router received a packet for an EtherType that it was configured to route, it would "route" that packet, forwarding it according to the packet's Network-layer destination address and that protocol's "routing table." If the frame's EtherType did not match a protocol that the bridge/router was configured to route, then it was bridged, or forwarded, based solely on its destination MAC address.

Another, perhaps more efficient, way to tell "bridgeable" packets from "routable" ones is to use the packet's destination MAC address alone. If the frame is addressed to one of the bridge/router's own interface MAC addresses, then it must correspond to a routed protocol. If the frame is received and not addressed to the bridge/router itself, then it must need to be bridged. (Another possibility would be that the packet was misdelivered, or a broadcast or multicast packet. Before bridging a packet, it must be examined to ensure that it should be bridged.)
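
The two decision rules can be set side by side in a short Python sketch. It is an illustration only: the function names are hypothetical, and broadcast, multicast, and misdelivered frames, which need the extra examination mentioned above, are deliberately left out. The EtherType constants are the well-known values for IPv4 and IPX.

```python
ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_IPX = 0x8137


def decide_by_ethertype(ethertype, routed_ethertypes):
    """Older rule: route protocols we are configured for, bridge the rest."""
    return "route" if ethertype in routed_ethertypes else "bridge"


def decide_by_dst_mac(dst_mac, my_macs):
    """Alternative rule: frames addressed to our own MAC are routed."""
    return "route" if dst_mac in my_macs else "bridge"
```

The second rule requires no knowledge of the protocol carried in the frame, which is one reason it may be cheaper to apply.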

References – ch6

Perlman, R., Interconnections, Addison-Wesley, Reading, MA. 1992.

Endnotes – ch6

Bridges need not receive an entire frame before making their forwarding decision. (Receiving the entire frame first is known as "store-and-forward" operation.) Alternatively, once the first six bytes have been received (i.e., the MAC-layer destination address), the frame may be streamed directly to the appropriate output interface. That mode of operation is known as "cut-through."

Note that in the case of a bridge, the frame's FCS, as received, is unaltered inside the bridge. In a router, the inbound FCS is stripped, and a new FCS is created on the output LAN interface. Why is the FCS different? The LAN headers are different--in particular, the MAC SA is now the router's output interface address, and the MAC DA is that of the next-hop gateway (or of the ultimate destination). Other elements of the LAN frame should be the same. The layer-3 packet is usually unaltered by the router, except for decrementing the packet's Time-to-Live (TTL), and updating the header checksum (if it is present).

If attachment to thick Ethernet is desired, simply use an AUI cable to a transceiver.

Of course, the number of ports is irrelevant, as long as Ethernet's topology rules are observed.

In 10BASE5 environments, the MDI is the vampire tap. In 10BASE2, it is a thinnet coaxial "T" connector.

All collisions should happen within the first 64 bytes (i.e., 512 bits) of the frame. At 10 Mbps, it takes 51.2 microseconds to transmit 512 bits, which is Ethernet's "slot time." The binary exponential backoff algorithm's chosen random numbers represent how many slot times a station must wait before attempting a retransmission.

Layer-2 interconnection devices are generically known as bridges. These devices make forwarding decisions based solely on information present in the Data Link layer header, generally just the MAC DA.

Of course, the repeaters are actually flooding the traffic out all their interfaces, one bit at a time. The figure focuses on the path between the communicating nodes, but remember that all nodes on the left side, including the bridge, receive each and every bit of every frame between A and B. Likewise, all stations on the right side receive every frame that C transmits to D.

IEEE LAN standards formerly allowed both a 16-bit and 48-bit address format. 16-bit addressing has been officially deprecated by the IEEE, but it is still supported by certain equipment, notably FDDI.

Broadcast and multicasts are handled the same way, in that they are forwarded out of all ports--except the port on which the frame arrived.

Despite the name including the word "routing," source-route bridges operate at the Data Link layer. Source-routing is a bridging technique, in which the source must discover the route to the destination MAC address.

Explorer frames are a special kind of broadcast packet that bridges use to "thread the needle" from the source to the requested destination. Source-route bridges ensure that the explorer frame is forwarded onto every ring in the bridged topology. If the destination MAC address exists, that station simply responds by reversing the explorer frame's accumulated source route. Once the original station has the source route, then it can send packets directly to that destination.

The actual name for the source-route tag is the "Routing Information Field" (RIF). See Chapter 9 for a more in-depth discussion of Token Ring (IEEE 802.5) LANs.

Yes, it would have been a lot more efficient if the "found station" had responded via unicast to the searching station. However, the use of broadcast does allow the other stations on the LAN to promiscuously build up their source-routing information caches. In this manner, future broadcasts are limited to stations that have restarted or aged-out old information.

Dr. Perlman's book, Interconnections, contains a very detailed, yet readable, description of all types of bridging and the Spanning Tree Protocol. I heartily recommend this book if you are interested in learning more about bridging. (Interconnections also covers Network layer issues, including the theory of routing protocols.)

Once the root bridge forwards the frame away from itself, the other bridges note the frame arriving from the direction of the root bridge, so they flood the frame on all their non-blocking ports. The universal forwarding rule that all the bridges follow is to flood out all non-blocking ports--except the one on which the frame arrived.

In the event that there is more than one bridge at the lowest priority level, the one with the numerically lowest MAC address becomes the root.
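The tie-break described here amounts to comparing (priority, MAC address) pairs and taking the smallest. A minimal sketch follows, assuming each bridge is represented as a hypothetical (priority, MAC) tuple and that a numerically lower priority value is preferred, as the note implies.

```python
def elect_root(bridges):
    """Pick the root from (priority, mac) pairs.

    The lowest priority value wins; among equals, the numerically lowest
    MAC address wins. Equal-length bytes objects compare in numeric order.
    """
    return min(bridges, key=lambda bridge: (bridge[0], bridge[1]))


# Hypothetical bridges: two share the lowest priority value.
candidates = [
    (32768, bytes.fromhex("00a0c9445566")),
    (4096, bytes.fromhex("00a0c9112299")),
    (4096, bytes.fromhex("00a0c9112233")),
]
priority, mac = elect_root(candidates)
print(priority, mac.hex())
```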

See Chapter 7 for a discussion of the binary exponential backoff algorithm and the capture effect.

At least with a hub, each station could participate in the collective CSMA/CD algorithm and the collisions would ensure fair access to the popular device. In this case, the switch must implement some fairness algorithm within itself, and provide enough memory to buffer packets during the inevitable times of congestion.

Such packets are also dropped if multicast routing is not enabled.

Using the endstation's MAC address as part of the network-layer address eliminates the need for an ARP-like protocol alongside the IPX protocol stack. The router that attaches to the destination network number can easily form a MAC-layer frame to the destination, by simply extracting the destination MAC address from the host portion of the network-layer destination address field (the least-significant six bytes).
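To make the point concrete, here is a small sketch of how the last-hop router could recover the destination MAC address without any ARP-like exchange. It assumes the address is handled as a 10-byte value, a 4-byte network number followed by the 6-byte host portion; the function name is invented for the example.

```python
def mac_from_ipx_address(address: bytes) -> bytes:
    """Return the host portion (the least-significant six bytes) as the MAC DA.

    Assumes a 10-byte address: 4-byte network number + 6-byte host portion.
    """
    if len(address) != 10:
        raise ValueError("expected 4-byte network number plus 6-byte host portion")
    return address[-6:]


# Hypothetical destination: network 0x0000ABCD, host 00:a0:c9:11:22:33.
destination = bytes.fromhex("0000abcd") + bytes.fromhex("00a0c9112233")
print(mac_from_ipx_address(destination).hex(":"))
```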

Manufacturers, despite their best efforts, may accidentally ship duplicate MAC addresses. As long as two such devices are not connected to the same Layer-3 network number, there is no chance for confusion.

The term "MAC layer" is often used interchangeably with the term "Data Link layer," though the MAC layer is, strictly speaking, a sublayer of the Data Link layer.

Bridges need not receive an entire frame before making their forwarding decision (waiting for the entire frame is known as "store-and-forward" operation). Alternatively, once the first six bytes have been received (i.e., the MAC-layer destination address), the frame may be streamed directly to the appropriate output interface. That mode of operation is known as "cut-through."
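A rough sense of the difference comes from counting how many bits must arrive before the forwarding decision can be made. The sketch below is back-of-the-envelope arithmetic only: it ignores the preamble and any processing time inside the bridge, and the 10 Mbps rate and 1518-byte frame are assumed values chosen for illustration.

```python
def receive_time_us(num_bytes: int, bit_rate_bps: float) -> float:
    """Time in microseconds to receive num_bytes at the given bit rate."""
    return num_bytes * 8 * 1e6 / bit_rate_bps


RATE_BPS = 10_000_000   # assumed 10 Mbps Ethernet
MAC_DA_BYTES = 6        # cut-through can decide after the destination address
FULL_FRAME_BYTES = 1518 # assumed maximum-size frame for store-and-forward

print("cut-through waits about", receive_time_us(MAC_DA_BYTES, RATE_BPS), "microseconds")
print("store-and-forward waits about", receive_time_us(FULL_FRAME_BYTES, RATE_BPS), "microseconds")
```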

Note that in the case of a bridge, the frame's FCS, as received, is unaltered inside the bridge. In a router, the inbound FCS is stripped, and a new FCS is created on the output LAN interface. Why is the FCS different? The LAN headers are different--in particular, the MAC SA is now the router's output interface address, and the MAC DA is that of the next-hop gateway (or of the ultimate destination). Other elements of the LAN frame should be the same. The layer-3 packet is usually unaltered by the router, except for decrementing the packet's Time-to-Live (TTL), and updating the header checksum (if it is present).
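The router's two changes to the layer-3 packet can be sketched as follows, taking IPv4 as the assumed case in which a header checksum is present. The function names are invented for the example, and the guard against a TTL of zero is an assumption of the sketch rather than something stated in the note. The checksum is the usual 16-bit ones' complement sum over the header; the TTL is the byte at offset 8 and the checksum occupies offsets 10 and 11.

```python
import struct


def ipv4_checksum(header: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum of the header words."""
    total = sum(word for (word,) in struct.iter_unpack("!H", header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def forward_ipv4_header(header: bytes) -> bytes:
    """Decrement the TTL and recompute the header checksum."""
    hdr = bytearray(header)
    if hdr[8] == 0:
        raise ValueError("TTL is already zero")  # assumption: caller handles expiry
    hdr[8] -= 1                # TTL lives at byte offset 8
    hdr[10:12] = b"\x00\x00"   # checksum field is zero while summing
    struct.pack_into("!H", hdr, 10, ipv4_checksum(bytes(hdr)))
    return bytes(hdr)


# Sample 20-byte header with TTL 0x40 and checksum 0xb861.
sample = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")
print(forward_ipv4_header(sample).hex())
```

A header whose checksum field is correct sums to zero under `ipv4_checksum`, which gives a quick way to check the result.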

If attachment to thick Ethernet is desired, simply use an AUI cable to a transceiver.

Of course, the number of ports is irrelevant, as long as Ethernet's topology rules are observed.

In 10BASE5 environments, the MDI is the vampire tap. In 10BASE2, it is a thinnet coaxial "T" connector.

All collisions should happen within the first 64 bytes (i.e., 512 bits) of the frame. At 10 Mbps, it takes 51.2 microseconds to transmit 512 bits, which is Ethernet's "slot time." The random numbers chosen by the binary exponential backoff algorithm represent how many slot times a station must wait before attempting a retransmission.
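The arithmetic in this note, and the way the backoff algorithm uses the slot time, can be sketched as follows. The function names are invented for the example. Capping the exponent at 10 reflects the truncated form of the algorithm as it is commonly described, and is stated here as an assumption; see Chapter 7 for the full treatment.

```python
import random

SLOT_BITS = 512  # 64 bytes


def slot_time_us(bit_rate_bps: float) -> float:
    """Slot time in microseconds: the time needed to transmit 512 bits."""
    return SLOT_BITS * 1e6 / bit_rate_bps


def backoff_slots(collisions: int, rng=random) -> int:
    """Random number of slot times to wait after the given number of collisions.

    Draws uniformly from 0 .. 2**k - 1, where k = min(collisions, 10)
    (the cap of 10 is an assumption of this sketch).
    """
    k = min(collisions, 10)
    return rng.randrange(2 ** k)


slot = slot_time_us(10_000_000)
wait = backoff_slots(3) * slot
print(f"slot time {slot:.1f} us; after 3 collisions, wait {wait:.1f} us")
```

After the third collision the station picks one of eight values (0 through 7), so its wait ranges from zero up to seven slot times.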
