Using the NLR for WAN Backbone Bandwidth

Cisco supports more than 60 percent of its global employee population in the United States, and Cisco network traffic is growing on the North American backbone WAN at a faster rate than anywhere else in the world. Cisco’s North American backbone connects the company’s San Jose, California, headquarters with other large Cisco locations, including Richardson, Texas (our new primary production data center), and Raleigh, North Carolina (a key development location and our primary data center disaster recovery site). In addition, there are seven other locations on the North American backbone that serve as WAN aggregation hubs for more than 120 regional WAN sites, as well as interconnects to backbone locations in Europe, Asia, and South America.

The WAN backbone in North America has grown from three hub locations connected via DS3 (45 Mbps) in 1999 to 10 locations connected via OC-48 (2.488 Gbps) in 2007. In 2009, the Cisco OC-48 backbone was stressed by both ongoing data center migrations (thousands of applications and petabytes of data) and rapidly increasing voice and video traffic. Utilization on several individual backbone circuits routinely reached 80 percent or higher. Such high utilization caused periodic performance issues and put the business at risk of significant impact if a backbone circuit failed during peak traffic hours.
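To make the failure risk concrete, here is a minimal sketch of the arithmetic. It assumes a simplified case that the text does not describe: two equal-capacity circuits, where all traffic from the failed circuit is rerouted onto a single surviving one. The function name and the two-circuit scenario are illustrative assumptions, not part of the Cisco design.

```python
# Circuit speeds as quoted in the text, in Mbps.
DS3_MBPS = 45.0
OC48_MBPS = 2488.0


def utilization_after_failure(failed_util, failed_capacity_mbps,
                              surviving_util, surviving_capacity_mbps):
    """Return the utilization of the surviving circuit after a failure.

    Hypothetical model: every bit carried by the failed circuit is
    rerouted onto one surviving circuit. Values above 1.0 mean the
    surviving circuit is oversubscribed and traffic will be dropped
    or delayed.
    """
    if failed_capacity_mbps <= 0 or surviving_capacity_mbps <= 0:
        raise ValueError("capacities must be positive")
    rerouted_mbps = failed_util * failed_capacity_mbps
    existing_mbps = surviving_util * surviving_capacity_mbps
    return (rerouted_mbps + existing_mbps) / surviving_capacity_mbps


# Two OC-48 circuits, each running at 80 percent during peak hours.
peak = utilization_after_failure(0.8, OC48_MBPS, 0.8, OC48_MBPS)
print(f"Surviving circuit load after failure: {peak:.0%}")

# Per-circuit capacity growth from DS3 to OC-48.
print(f"OC-48 is about {OC48_MBPS / DS3_MBPS:.0f}x a DS3")
```

Under these assumptions the surviving circuit would be asked to carry well over its capacity, which is why sustained utilization of 80 percent leaves no room for a peak-hour failure.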

Today we are migrating our main business data centers in San Jose and Raleigh to a new data center pair in Richardson. This migration creates several new challenges. The first is the bandwidth required to migrate applications and data between the old and new data centers. Second, because the migration is quite large and will take place over two years, there will be extended periods when applications and data formerly in a single location are temporarily split between regional data centers. This places new and sometimes significant traffic demands on our WAN backbone. Last, moving applications and data creates new traffic flows between the new data centers and the locations where users are based. The most significant traffic increase is in San Jose, where an employee population of more than 20,000 is now separated by 1,500 miles from applications and data that used to be local.
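A rough, idealized estimate shows why bulk migration strains the backbone. The sketch below assumes one petabyte (taken as 10^15 bytes) of data, ignores protocol overhead, and treats the usable share of the link as a single fixed fraction. The 20 percent figure is simply what remains on a circuit that is already 80 percent utilized, and the 10 Gbps link is included only for comparison. None of these numbers are measurements from the Cisco migration.

```python
SECONDS_PER_DAY = 86_400
PETABYTE_BYTES = 1e15   # assumption: decimal petabyte
OC48_BPS = 2.488e9      # OC-48 rate as quoted in the text
TEN_GIG_BPS = 10e9      # comparison link, 10 Gbps


def transfer_days(data_bytes, link_bps, usable_fraction=1.0):
    """Days needed to move data_bytes over a link of link_bps.

    usable_fraction is the share of the link available to the
    migration (hypothetical parameter). Overhead and retransmissions
    are ignored, so real transfers would take longer.
    """
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in (0, 1]")
    seconds = data_bytes * 8 / (link_bps * usable_fraction)
    return seconds / SECONDS_PER_DAY


for label, bps, share in [
    ("OC-48, whole circuit", OC48_BPS, 1.0),
    ("OC-48, 20% headroom", OC48_BPS, 0.2),
    ("10 Gbps, whole circuit", TEN_GIG_BPS, 1.0),
]:
    days = transfer_days(PETABYTE_BYTES, bps, share)
    print(f"{label}: {days:.1f} days per petabyte")
```

Even with an entire OC-48 dedicated to the transfer, each petabyte takes more than a month under these assumptions. With only the leftover headroom of a busy circuit, it takes roughly half a year.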

Network video usage has increased steadily over the years at Cisco. In the beginning it was mostly non-real-time, noninteractive video such as IPTV (multicast-based) and stored video-on-demand (VoD) service. Since the rollout of Cisco TelePresence in 2006, real-time, interactive video traffic has exploded. Cisco currently has nearly 1000 TelePresence units deployed globally, and that number is expected to increase. In addition, new high-definition desktop video, which enables ad hoc video conferencing at users’ desktops and interoperates with TelePresence, is adding more fuel to the fire.

In our quest to find creative and cost-effective ways to meet growing bandwidth demands on our WAN backbone, we investigated using the National Lambda Rail (NLR). The NLR is a 12,000-mile, nationwide, high-speed, Cisco-powered optical dense wavelength-division multiplexing (DWDM) network, built and operated by a nonprofit consortium with more than 280 members drawn from regional optical networks, government labs, and research universities. The NLR provides several wavelength, Ethernet, and IP-based services to its members. It was designed to give the consortium production transport for cutting-edge research in disciplines as diverse as biomedicine and physics. Cisco is a consortium member and uses portions of this network to transport Cisco traffic.

Once we decided to use the NLR, we had to resolve several issues and challenges. The main issue was exactly what we would use the NLR for. We decided to build a 10-Gigabit Ethernet triangle between Cisco’s three main data center facilities in San Jose, Raleigh, and Richardson. Pursuing that led to our first challenge: how would Cisco get from its facilities in each location to the nearest NLR point of presence (PoP)?
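One property of a triangle topology is that any single circuit can fail and all three sites remain reachable through the remaining two circuits. The sketch below checks that property with a simple graph traversal. It is a generic illustration of the topology only: the function names are hypothetical, and it says nothing about how routing or failover is actually configured on the Cisco backbone.

```python
from itertools import combinations

SITES = ["San Jose", "Raleigh", "Richardson"]

# A full triangle: one circuit between every pair of sites.
TRIANGLE = {frozenset(pair) for pair in combinations(SITES, 2)}


def is_connected(sites, links):
    """Return True if every site can reach every other site over links."""
    if not sites:
        return True
    seen = {sites[0]}
    stack = [sites[0]]
    while stack:
        node = stack.pop()
        for link in links:
            if node in link:
                (other,) = link - {node}  # the far end of this circuit
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
    return seen == set(sites)


def survives_any_single_failure(sites, links):
    """Return True if the network stays connected after losing any one link."""
    return all(is_connected(sites, links - {failed}) for failed in links)


print(survives_any_single_failure(SITES, TRIANGLE))

# For contrast, a two-circuit chain is partitioned by a single failure.
CHAIN = {frozenset(("San Jose", "Richardson")),
         frozenset(("Richardson", "Raleigh"))}
print(survives_any_single_failure(SITES, CHAIN))
```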

To provide access between Cisco facilities and the NLR PoPs, we decided to build out our own optical metro access ring using Cisco ONS equipment and leased dark fiber. The dark fiber that reached the NLR PoPs was leased via an IRU (indefeasible right of use). The IRUs consist of one or more existing fiber rings within each metro, but since none of the IRUs actually touched any Cisco building, construction of fiber “laterals” was necessary to connect to the IRUs and bring fiber connectivity to Cisco. These laterals required city permits for construction (trenching), and the permitting and construction work had some of the longest lead times in the project. Surprisingly, the construction of these laterals was also one of the biggest cost components (more than the IRUs themselves).

After we had fiber connectivity between the desired Cisco buildings and their local NLR PoPs, we needed to build out DWDM infrastructure so that we could extend circuits built on-net across the NLR end-to-end between Cisco buildings. We chose the Cisco ONS 15454 MSP (multiservice platform) to build out the required DWDM connectivity between each Cisco building and the local NLR PoP. For each of the Cisco ONS 15454 DWDM nodes installed at an NLR PoP, we had to arrange leased space (rack space and power) as well as “remote hands” services for routine maintenance and troubleshooting. The following diagram shows the end-to-end solution and the management demarcation points.

When the 10-Gigabit Ethernet circuits were operational, we connected them to our standard backbone router platform, the Cisco 7604, with a SIP-600 and a 10-Gigabit Ethernet WAN Shared Port Adapter. From a Layer 3 perspective, the circuits built on the NLR are treated exactly like any other circuits we might purchase from a service provider. The only difference is that Cisco IT is now its own service provider for portions of these new circuits. That creates some new and interesting challenges and would probably be a good topic for a future update.

Using the NLR is not an option for most companies in North America. However, numerous dark/dim fiber and wavelength providers offer long-haul service equivalents. Designing, building, and operating optical networks probably isn’t the right option for everyone. Still, our experience illustrates a service model separate from traditional telco-based service offerings, one suited to enterprises with large bandwidth requirements and the courage to explore nontraditional options.

The keys to managing any multi-vendor environment are clearly defined ownership/demarcation points and well-documented operational procedures. While the use of the NLR presents some new challenges, it's not the first multi-vendor solution we've deployed. The main difference with our use of the NLR is that it's the first time that Cisco IT has taken on a role that historically would have gone to a service provider.
