
Tuesday, May 24, 2011

The Necessity for Highly Available WLANs
Wireless LANs are mission-critical, and have been for a while. More so than ever, the availability of the wireless network is important to operate business, educate students, and interact with consumers. Organizations across many industries have realized the benefits of that mobility, users increasingly expect ubiquitous network access, and machine-to-machine (M2M) communications are poised to grow the demand exponentially. In such an atmosphere, the wireless network must be high performing, resilient, adaptable, and highly available.

Many organizations provide highly available WLANs by using pre-shared key (PSK) security since there is no reliance on external systems for network access. EAP authentication provides higher security than PSK deployments, but suffers from higher complexity and reliance on AAA services external to the WLAN for network access. Modern wireless networks typically rely on RADIUS as this service. However, deploying local authentication services at each remote site can be cost prohibitive due to software licensing and server resources. This, coupled with the higher latency involved with EAP authentication across a WAN circuit, typically leads many organizations to adopt central AAA services for non-mission-critical applications, and to forgo stronger EAP security in favor of PSK for mission-critical applications such as VoWiFi or transaction processing. The result is a trade-off of security for availability, which is sub-optimal and introduces much more risk to the organization.

Can't we have both high security using EAP with dynamic per-user authentication and keying coupled with the high availability of PSK networks? We can, using Aerohive credential caching!

Credential Caching Overview
Aerohive authentication cache provides remote branch offices the ability to cache successful client authentications from central directory services for use when the central site or services are unavailable, such as a WAN outage. This provides high availability of the WLAN for remote sites, enabling continuity of service for locally deployed applications, collaboration among users, and local Internet access (if deployed locally at branches and not tunneled back to the corporate head-end).

It does this by providing local RADIUS service within select HiveAPs and integrating with corporate directory services, including Active Directory, Open Directory, and native LDAP systems. In this scenario, the HiveAP provides EAP authentication for clients and functions as both the authenticator and authentication server simultaneously. EAP types supported include PEAP, EAP-TLS, EAP-TTLS, and LEAP. On the back-end the HiveAP is configured with directory access credentials to authenticate users and cache their login information locally from the directory server for offline use (similar to a Windows computer object joined to the domain). This may include Kerberos v5 if using Active Directory as the back-end directory service.

Note - I know some of you will be wondering why I'm not discussing Private PSK (PPSK) or Dynamic PSK (DPSK) offered by Aerohive and Ruckus. While these are novel approaches for small scale solutions involving user interactive device platforms, these solutions cannot address many deployment scenarios involving embedded devices or non-user devices, as is typical with voice handsets and vendor transaction processing systems. A more robust and scalable solution is required for such scenarios.

Deployment Requirements
To deploy credential caching, administrators must complete the following steps for the HiveAPs at the branch sites:

Determine the remote HiveAPs that will provide local AAA server (RADIUS) services

Assign static IP addresses to the HiveAPs providing AAA services

Create certificate(s) for HiveAP AAA servers to use during EAP authentication

Integrate the HiveAP AAA servers with user directory services

Create the HiveAP AAA server (RADIUS) configuration

Create AAA client settings for all other NAS client APs, which point to the HiveAP AAA server(s) for RADIUS authentication

Deploy configurations to all HiveAPs

First, determine which HiveAPs at each remote site will be AAA servers providing RADIUS and EAP termination. One or multiple HiveAPs may provide AAA services for other HiveAPs. Multiple APs are recommended to provide fault tolerance at each branch location.

Second, the selected HiveAPs must be configured with static IP addresses since they will be providing services that other NAS client APs will rely on for user authentication. Assign static IP addresses from within individual AP configuration settings in the "Interface and Network Settings" section (be sure to also configure DNS servers in the WLAN profile when using static IP addresses to ensure name resolution for HiveManager discovery).

Next, create one or multiple RADIUS server-side certificates for use by the HiveAP AAA servers. The server-side certificates are required during TLS tunnel setup between the RADIUS server and client during EAP authentication. This can be accomplished by creating a Certificate Signing Request (CSR) either from HiveManager or using an external utility such as OpenSSL. Using HiveManager is a simple process; navigate to Advanced Configuration > Keys and Certificates > Server CSR. Fill out the form and click "Create".

Then send off the CSR file to the Certificate Authority for verification and certificate creation. Once the CA has issued the certificate, navigate to Advanced Configuration > Keys and Certificates > Certificate Mgmt to import the certificate into HiveManager. The private key file will automatically be generated with the CSR and listed in this section. No manual merging of the private key and issued certificate is required, but it may be performed if desired. Also import the CA certificate to send the complete certificate chain to clients during outer TLS tunnel establishment. It will also be used to verify trust of client certificates if using EAP-TLS authentication. The certificate, private key, and CA certificate files will be pushed to HiveAP AAA servers later during the configuration.
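For those who prefer the external OpenSSL route mentioned above, here is a minimal sketch of how the CSR and private key could be generated. The helper name, file names, and subject values are all hypothetical placeholders; only the `openssl req` options themselves are standard. The function only assembles the command so you can review it before running it.

```python
import shlex


def build_csr_command(common_name, organization, country,
                      key_path="hiveap-radius.key",
                      csr_path="hiveap-radius.csr",
                      key_bits=2048):
    """Assemble an `openssl req` command that creates a private key and a CSR.

    All default file names and the subject layout are illustrative assumptions.
    """
    subject = f"/C={country}/O={organization}/CN={common_name}"
    return [
        "openssl", "req", "-new",
        "-newkey", f"rsa:{key_bits}",  # generate a new RSA key alongside the CSR
        "-nodes",                      # leave the key unencrypted; protect the file accordingly
        "-keyout", key_path,
        "-out", csr_path,
        "-subj", subject,
    ]


# Hypothetical values for a branch RADIUS server certificate.
cmd = build_csr_command("radius.branch.example.com", "Example Corp", "US")
print(shlex.join(cmd))  # paste into a shell, or pass `cmd` to subprocess.run
```

The resulting CSR file is what gets sent to the CA; the key file stays private and is imported alongside the issued certificate.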

Integration with AAA user directory services is accomplished in the Advanced Settings > Authentication > AAA User Directory Settings section. In this section, you will define a profile that HiveAP AAA servers will use to query a back-end directory to authenticate users. This may include Active Directory, Open Directory, or native LDAP. In this example we will use Active Directory since this is very common. Create a new profile to get started.

Aerohive HiveManager AAA User Directory Settings

Select the directory type from the option buttons and select or define a new IP address / host name network object for the Directory Server. Multiple directory servers may be configured for redundancy.

For Active Directory integration, specify the default domain in which the HiveAP computer object, AD server, and user objects to be authenticated reside (select the root domain of the forest). If users in multiple domains need to be supported, configure the "Multiple Domain Info" section at the bottom. Configure the BindDN account (user object) that the HiveAP AAA server(s) will use to authenticate to the directory server and look up user accounts. This allows the HiveAP RADIUS service to authenticate wireless users. Additionally, in order to perform credential caching, configure the Admin User account, which is used by the HiveAP to log in to Active Directory and add itself as a computer object in the domain or computer OU specified. This allows the HiveAP to cache user NTLM credentials locally in access point RAM for use when the directory server(s) are not available. The admin user specified must have rights to add computer objects into the domain or OU specified.

Note - Using this method, the admin user account information is stored in HiveManager and in the flash configuration file of HiveAPs. If this is not desirable, HiveAPs may be joined individually to the domain from the CLI which does not get stored in any configuration files. Issue the following command: "exec aaa net-join { primary | backup1 | backup2 | backup3 } username USER password PASSWORD".

Next, create a HiveAP AAA Server profile for the access points providing RADIUS service from the Advanced > Authentication > HiveAP AAA Server Settings section. Here you will define what EAP methods are supported, the certificate files to be used during EAP, database access settings, and NAS clients. For the database access settings, select the configured directory service type and previously configured AAA user directory profile. Enable RADIUS Server Credential Caching and set a cache lifetime which determines how long cached credentials will be used when the directory server(s) are not available. The remaining three interval timers determine how long an individual directory server is marked down before being retried (default 600 sec.), how long to use the local database if all directory servers are down before retrying (default 300 sec.), and how long to keep retrying an unresponsive remote directory server in the list before moving on to the next server in the list (default 30 sec.).
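To make the cache lifetime setting more concrete, here is a toy model of the decision it drives. This is only a sketch of the general idea, not Aerohive's implementation: the class, the function, and the 24-hour lifetime are my own assumptions, and the three retry timers are left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class CredentialCache:
    """Toy credential cache: remembers when each user was last verified."""
    lifetime_s: float
    entries: dict = field(default_factory=dict)  # username -> time of last verified login

    def record_success(self, username, now):
        # Called after the directory server has confirmed the credentials.
        self.entries[username] = now

    def forget(self, username):
        # Called when the directory server rejects the user.
        self.entries.pop(username, None)

    def is_fresh(self, username, now):
        # A cached entry is usable only while it is younger than the lifetime.
        verified_at = self.entries.get(username)
        return verified_at is not None and (now - verified_at) <= self.lifetime_s


def authenticate(cache, username, now, directory_reachable, directory_accepts=True):
    """Return (allowed, source), where source is 'directory' or 'cache'."""
    if directory_reachable:
        if directory_accepts:
            cache.record_success(username, now)
        else:
            cache.forget(username)
        return directory_accepts, "directory"
    # Directory unavailable (for example, a WAN outage): fall back to the cache.
    return cache.is_fresh(username, now), "cache"


# Hypothetical 24-hour cache lifetime; times are seconds since an arbitrary start.
cache = CredentialCache(lifetime_s=24 * 3600)
print(authenticate(cache, "HiveUser", now=0, directory_reachable=True))
print(authenticate(cache, "HiveUser", now=3600, directory_reachable=False))
print(authenticate(cache, "HiveUser", now=90000, directory_reachable=False))
```

In this model a user who authenticated before the outage keeps working until the lifetime expires, while a user who never authenticated at the branch is refused for the duration of the outage.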

Enable Credential Caching in the HiveAP AAA Server Settings

In the NAS settings section be sure to configure the IP address of every NAS client AP that needs to use the HiveAP RADIUS server to authenticate clients, or configure an IP network object for broader access by multiple APs in the same subnet range.
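The difference between listing individual NAS client addresses and using a network object can be sketched with Python's standard `ipaddress` module. The function name and the 10.108.31.0/24 subnet are hypothetical; this only illustrates the matching logic, not how the HiveAP RADIUS service implements it.

```python
import ipaddress


def is_permitted_nas(source_ip, allowed_entries):
    """Return True if source_ip matches any allowed host address or subnet."""
    addr = ipaddress.ip_address(source_ip)
    for entry in allowed_entries:
        # A bare address such as "10.108.30.18" is treated as a single-host network.
        network = ipaddress.ip_network(entry, strict=False)
        if addr in network:
            return True
    return False


# One explicitly listed AP plus a hypothetical subnet object covering other APs.
allowed = ["10.108.30.18", "10.108.31.0/24"]
print(is_permitted_nas("10.108.30.18", allowed))  # explicitly listed AP
print(is_permitted_nas("10.108.31.77", allowed))  # covered by the subnet object
print(is_permitted_nas("10.108.30.19", allowed))  # neither listed nor in the subnet
```

A network object saves you from touching the AAA server profile every time an AP is added, at the cost of accepting RADIUS requests from anything in that subnet that knows the shared secret.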

Finally, create a AAA Client Settings profile which will be deployed to all of the other NAS client APs instructing them to contact the HiveAP AAA Server(s) for RADIUS authentication services. This is configured from the Advanced > Authentication > AAA Client Settings section.

Once all configuration is complete, assign the HiveAP AAA Server Settings profile to APs designated to be RADIUS servers from within individual AP configuration settings in the "Service Settings" section. Also embed the AAA Client Settings profile within SSID profile(s) as the assigned RADIUS server(s), and deploy configurations to all APs.

Verification
Once the configuration for HiveAP RADIUS servers and HiveAP NAS clients has been deployed to access points, verify directory integration by reviewing access point and directory server logs. Logs in both locations will show a successful bind to the LDAP server and computer object creation.

Next, verify correct credential caching operation by performing client authentication while directory services are available and then re-testing once directory services are unavailable using the local AP cache. Issue the "show auth" command to view currently authenticated clients as well as cache entries created. Below we can see one current user session followed by an entry for the same client in the local cache table.

Finally, disable the back-end directory services and re-test client authentication using the credential cache on the access point. The following log output is in reverse chronological order, and shows the HiveAP AAA server unable to bind to the directory server at 10.22.33.44, then successfully authenticating the client "HiveUser" via RADIUS using the local credential cache from the NAS client at 10.108.30.18, which is its own IP address.

Therefore, using the Aerohive credential caching feature, EAP authentication services can be maintained during a WAN outage or central site service disruption. This feature bridges the gap for branch sites, allowing continued WLAN network access by clients during temporary service disruption. The cache lifetime setting dictates the tolerable duration of a service outage.

Revolution or Evolution? - Andrew's Take
Industries ranging from retail, education, and hospitality to healthcare and transportation are relying on mission-critical wireless networks to operate business. They expect a highly secure and highly available network at reasonable cost and with minimal complexity. And they won't tolerate trade-offs between these features. The status quo from most WLAN vendors is to provide basic RADIUS integration for client authentication. Aerohive has gone further, integrating native LDAP and Kerberos functionality to provide user credential caching, which enables a highly available WLAN without compromising security to get there.

Aerohive isn't rewriting the book on RADIUS, LDAP or Kerberos. These are existing, mature protocols. However, Aerohive has applied these features in a new and unique way that can dramatically improve WLAN availability and provides tremendous benefits for distributed organizations with branch offices.

Monday, May 23, 2011

The old adage goes, "we learn best from our mistakes". Or, to be more accurate, we learn best by "realizing we are wrong".

After all, we are wrong all the time! It's not just being wrong about minor factual details, but goes deeper, and includes our entire individual belief structures. Sure, we get that humans are fallible and wrong in the abstract, but when it comes down to ourselves, we live in this "bubble" of rightness. And individuals will go to great lengths to avoid being wrong. This is a problem for us in our personal and professional lives, as well as our culture as a whole. We need to realize that it is acceptable to be wrong.

This is the philosophy discussed by Kathryn Schulz at TED2011, where she analyzes how we misunderstand the signs around us and how we behave when that happens. It is a thoughtful look into human nature, and provides insightful lessons to allow us to be more open to "wrongness", which is fundamental to who we are as a species.

"This internal sense of 'rightness' that we all experience so often, is not a reliable guide to what is actually going on in the external world. And when we act like it is, and we stop entertaining the possibility that we could be wrong ... this is a huge practical problem. ... This attachment to our own rightness keeps us from preventing mistakes when we absolutely need to and causes us to treat each other terribly."

We are all out past the ledge, being wrong, all the time. Our minds can see the world as it "isn't". This is fundamental to who we are. Realize, and accept, when you are wrong. Wrongness provides the driving force for our moral, intellectual, and creative advancements as a society.

Given that the pass rate on the CCIE Wireless lab exam is really low, this gives Cisco and CCIE candidates an opportunity for course correction. It appears the pass rate is only around 13%, but Cisco stopped reporting official numbers last year so it's hard to be sure. Also, this doesn't give us insight into the number of attempts that candidates require in order to pass. I'm sure Cisco gained a lot of insight into why candidates were failing the lab, and from what I've heard in the candidate ranks it appears that most are stumbling on the wired network integration and wired quality of service sections.

Here is a summary of the major changes and new topics candidates will need to be prepared to encounter.

New Software

WLC updated from version 4.2 to 7.0.116.0. Cisco purposefully waited until 7.0 maintenance release 1 was publicly available before announcing the lab exam update.

Autonomous updated from version 12.3(8) to 12.4(25d)

WCS updated from version 4.2 to 7.0

ACS server updated from version 4.2 to 5.2. Candidates will need to learn the new ACS software interface since there was a major overhaul between 4.x and 5.x versions.

5500 Series Wireless LAN Controller (the WiSM and 4400 Series have been removed)

1260, 1040, and 3500 Series APs (the 1240 and 1250 Series have been removed)

3300 Series Mobility Services Engine (MSE) (the 2700 Series Location Appliance has been removed)

New Lab Exam Topics

VSS - Virtual Switching System on the Catalyst 6500 series platform used to pool multiple physical switches into one logical virtual switch.

Static Multicast Routing. PIM and IGMP were on version 1, but static multicast routing is a new addition.

IPv6 subnetting and static routing (on the wired network). I find this particularly interesting since Cisco only supports IPv6 bridging on their wireless equipment right now, and not even that when using H-REAP.

Basic PKI for dot1x and web-auth. This was implied in version 1, but is now explicitly called out as a requirement.

AP Groups used for SSID availability, which replaces WLAN Override from version 1.

WLAN load-balancing, BandSelect, and Passive Client support.

H-REAP local auth, groups, and address learning by the WLC.

ClientLink (Beamforming)

CleanAir (Spectrum Analysis)

VideoStream (Multicast Optimization)

Mesh networking

WCS Virtual Domains

WCS High Availability

Wireless Intrusion Prevention Services (WIPS) with the MSE appliance.

Context Aware Services (CAS) with the MSE appliance replaces location tracking with the 2700 appliance from version 1.

The big items in this list that are really new to version 2 are VSS, IPv6, LSC, OfficeExtend, BandSelect, Passive Client support, ClientLink, CleanAir, VideoStream, Mesh, and WIPS.

Revolution or Evolution? - Andrew's Take
The CCIE Wireless lab exam was already extremely tough and had a comparatively low pass rate versus other CCIE tracks. The inclusion of many new wired network technologies and protocols has undoubtedly expanded the scope of the lab exam. From the feedback that I am hearing from candidates, the wired integration seems to be the stumbling block for most, and the expansion of wired requirements in the lab may make it even harder to pass. Time will tell come November and beyond as candidates begin taking the new version. My recommendation is for traditional RF and wireless engineers to be fluent in wired networking technologies, as the Cisco CCIE Wireless is as much about wired networking as it is about wireless (and even less so about actual RF engineering, really). Candidates with a strong route/switch background seem to be faring a bit better at the exam in my estimation due to the exam structure and topics.

Tuesday, May 17, 2011

Everyone wants in on the "Cloud" hype, including
kitchen sinks and now centralized WLAN controllers!
Image courtesy of Accu-Tech.

Overview of the Cisco Flex 7500 Wireless LAN Controller Solution
Last week Cisco announced a new wireless LAN controller platform named the Flex 7500 Series. This solution is aimed at providing a large-scale, centralized wireless LAN controller solution using Cisco's existing Hybrid Remote Edge Access Point (H-REAP) architecture. Along with this new release, Cisco is re-branding the H-REAP solution to Flex controllers and FlexConnect access points. The FlexConnect wireless architecture distributes data plane (traffic forwarding) operation out at the edge, while centralizing control plane operations in a controller in the data center.

Having acquired hands-on lab time with a pre-release version of the Flex controller, our team was able to run it through its paces to evaluate everything from high availability and failover to performance.

This solution is aimed at the remote branch office, utilizing data center consolidation of expensive wireless controllers. Cisco calls this architecture the "Lean Branch" due to the reduction of on-premise equipment and IT staff in remote branch offices. This seems to align well with limited IT budgets by reducing the duplication of network hardware out in every branch office.

Essentially, the Cisco FlexConnect architecture amounts to an extension of existing H-REAP foundational technologies, and provides enhanced solution scalability to support thousands of access points and tens of thousands of users, across hundreds of branch locations (per-controller). It also bundles some enhancements to previous H-REAP capabilities to address the remote site survivability and high availability requirements of a centralized architecture when the WAN or central services are unavailable.

Also, a quick note on usage of the term "cloud". Enterprises have had central services hosted in private data centers for a long time. Just because an architecture relies on services hosted in a data center does not make it a "cloud" solution. True "cloud" solutions implement on-demand provisioning of services and capacity across multiple data centers. This is typically accomplished through elastic architecture, hardware abstraction, and virtualization of some form, and provides inherent client mobility. In my opinion, the FlexConnect architecture only meets one of those principles, inherent client mobility, and should not be called a "cloud" solution. The term is definitely over-hyped in the technology sector.

Value Proposition
The value of the Flex 7500 series is that it gives Cisco a large-scale controller-based solution that can compete against fully distributed wireless intelligence at the edge.

There is no doubt that the wireless LAN architecture is shifting back from a centralized control and data plane model pioneered a decade ago by Airespace and Trapeze, to distributed intelligence back at the edge in access points. As Bob O'Hara explains, the shift to wireless controllers made sense to enable advanced control plane functionality and coordination (dynamic radio management, L3 mobility, guest networking, key caching, etc.) among access points with limited processing capability. However, the advancements in silicon manufacturing are now at a point where this is no longer a restriction of hardware processing capacity, but of software development to enable intelligent AP coordination. Additionally, as wireless network bandwidth capacity continues to grow with the release of 802.11n, and subsequently with .11ac and .11ad, the controller can quickly become a bottleneck for both data plane traffic and access point control capacity.

It's apparent that Cisco and other large market-share wireless LAN manufacturers cannot dive straight into distributed access point intelligence due to cannibalization of existing controller product line revenue and support for their existing installed customer base. Therefore, Cisco must approach this market transition with a phased migration strategy. The Flex 7500 is a large step in that direction.

Therefore, the value proposition of the Flex 7500 platform can be described as:

Lower capital expense by eliminating the need for distributed controllers at each branch location. Going by list pricing, the Flex 7500 offers 43-47% cost savings versus the 5500 series controllers to support the same number of access points.

Less wasted controller licensing because multiple remote branches can utilize the same controller, thus pooling licenses into more of an enterprise-wide model. Cost savings will vary greatly depending on the exact size of branch AP deployments. At first release, the Flex 7500 will support up to 2,000 APs and 20,000 clients per-controller, with future software releases and licensing upgrades promising support for up to 5,000 APs without hardware upgrades.

Lower operational expense because fewer pieces of hardware have to be managed and supported, and controllers are removed from remote branches relieving the need for local IT support or truck-rolls. I highly doubt an enterprise with an efficient branch model would have unnecessary local IT support and this isn't likely to result in much expense reduction for most organizations with well-established procedures already in-place. The largest savings is likely to come from reduced SmartNET expenses.

Consistent policy enforcement across the organization with greater visibility and centralized control of access point configuration. This is arguably a function of any good management platform, including Cisco WCS and the forthcoming Cisco Prime NCS, so I don't really buy this as a value of moving to the Flex controller platform.

Simplified controller management and upgrades because less hardware is required. This should actually benefit large controller installations, where upgrading large amounts of equipment can become time consuming. Coupled with AP image pre-download across the WAN, this should save network administrators many hours of tedious work.
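The license pooling point in the list above is easier to see with a little arithmetic. The sketch below compares unused AP licenses when each branch buys its own controller license tier versus when all branches share one pooled license. The tier sizes and branch AP counts are hypothetical numbers chosen for illustration, not Cisco's actual license SKUs or pricing.

```python
def smallest_tier(ap_count, tiers):
    """Return the smallest license tier that covers ap_count access points."""
    for size in sorted(tiers):
        if size >= ap_count:
            return size
    raise ValueError(f"no tier large enough for {ap_count} APs")


def unused_licenses_distributed(branch_ap_counts, tiers):
    # Each branch buys its own tier, so the waste adds up per site.
    return sum(smallest_tier(n, tiers) - n for n in branch_ap_counts)


def unused_licenses_pooled(branch_ap_counts, tiers):
    # One central controller licenses the total AP count.
    total = sum(branch_ap_counts)
    return smallest_tier(total, tiers) - total


# Hypothetical tier sizes and branch sizes, for illustration only.
tiers = [12, 25, 50, 100, 250, 500]
branches = [8, 14, 30, 40]

print(unused_licenses_distributed(branches, tiers))  # 45
print(unused_licenses_pooled(branches, tiers))       # 8
```

Change the branch sizes and the gap can shrink or even reverse, which is why the savings vary so much with the exact size of branch AP deployments.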

All in all, the benefit for customers is really a large-scale controller solution that is much more cost-competitive than distributed controllers. Additionally, Cisco is looking to retain existing customers that are running their legacy Aironet Autonomous infrastructure and have found distributed controllers cost prohibitive to deploy, but are increasingly requiring features and functionality only found in their Unified architecture.

Hardware Platform

The Flex 7500 is built on the IBM x-Series server platform, specifically the x3550 M3. The system physically mounts into a standard server rack (not the 2-post telecom rack we network folk are accustomed to using), and occupies 1U of rack space.

On the internals, it's specified with 2x Quad-Core 2.4GHz Intel Xeon E5620 processors, 12 GB DDR3 1333MHz RAM, 2x 146GB 15K RPM SAS hard drives, and optional redundant power supplies. Cisco is attempting to reduce the complexity associated with typical server builds by pre-configuring the server with standard components and offering minimal substitutions, such as the redundant power supply. This should provide most customers easier ordering and less confusion when moving from wireless controller appliances (4400, 5500 series) to the new server-based platform. It's also unclear at this time how the hard drives are configured for high availability, but it's most likely a RAID-1 (mirrored) setup.

The Flex 7500 provides a myriad of network ports, but most are dedicated for specific purposes.

For network connectivity, a myriad of ports are provided on the back of the unit (as shown). However, in order to keep complexity minimal and provide a consistent software image and configuration process with existing WLC appliance product lines, the network ports are dedicated for specific uses, as follows:

Fast Ethernet is provided for system management by IBM and is not configurable from the WLC software.

Port 1: 1G is used as the WLC Service Port.

Port 2: 1G is reserved by the WLC for future use with High Availability enhancements.

Port 1: 10G requires an external 10G SFP and is used as the WLC Management Port, similar to the existing WLC Distribution System ports on the 4400 and 5500 Series platforms.

Port 2: 10G is reserved by the WLC as a backup Management Interface in the case of port failure of the primary port.

Option Gb Ethernet ports are not used.

Serial Port is used for local console connections for staging and configuration of the system.

It is important to call out that the system only supports fiber 10G SFPs (SFP-10GB-SR) for the WLC Management Ports. This in turn requires the Flex 7500 to connect to a 10G capable switch or line card. Additionally, the system does not support link aggregation (LAG) as the existing controller platforms do.

From a deployment perspective, image management and system configuration are almost identical to existing controller platforms. The system uses the same initial setup and configuration wizard, CLI software command interface, and graphical user interface. Network engineers familiar with the current Unified Wireless Network solution will have a very short learning curve migrating to the Flex 7500 platform, mainly with the physical installation and cabling requirements.

Feature Enhancements & Limitations
Most of the feature enhancements included in the Flex 7500 platform are a function of software enhancements made in the latest release of WLC code version. I provided an overview of the major enhancements in this release in my previous post Cisco WLC 7.0.116.0 New Features.

As a recap, here are the major improvements previously described, as well as a few others that are of note specifically with the Flex 7500 platform:

WIPS Enhanced Local Mode provides a subset (~73%) of Adaptive WIPS capabilities into access points that also handle client connections, eliminating the need for dedicated monitor mode APs or a 3rd party overlay WIPS solution. Note that the Cisco WCS / Prime NCS and the Mobility Services Engine (MSE) are still required for WIPS services. Additionally, since most of the processing occurs on the AP prior to sending data back to the MSE, there should be minimal WAN bandwidth impact.

H-REAP Fault Tolerance provides enhanced site survivability when the link to the Flex controller is unavailable. This is an improvement to Standalone mode operation of the AP to provide seamless client connectivity throughout the failure and fail-back processes. This is a source of competitive advantage for Cisco, as they handle these processes significantly better than any other controller-based vendor today (not controller-less vendors). However, customers should be sure to review branch office design for critical services such as RADIUS, Active Directory, DHCP, and DNS to ensure complete site survivability in a WAN or data center outage scenario.

Cisco also claims complete branch office survivability when using H-REAP local authentication. However, this requires static definition of users and passwords pushed to each AP for local authentication by the AP using LEAP or EAP-FAST (in either connected or standalone mode). Most customers will find this lacking in true scalability as well as a point of risk for the organization from a security perspective.

Increased H-REAP Group Scalability allows up to 500 groups to be defined per-controller (hence the 500 branch site limit of the system) and up to 50 access points per-group to support larger branch sites. H-REAP groups are the primary method used to distribute wireless key cache for fast roaming support using CCKM and OKC, and for common configuration of RADIUS backup servers and local authentication users.

Increased WAN Tolerance allows up to 2 second WAN latency between an H-REAP access point and the controller, but only when using H-REAP local authentication. The same 300ms WAN latency limit is in place when using external RADIUS / AAA authentication, mainly due to client timeout for the authentication to complete.

AP Mode Auto-Conversion allows the Flex 7500 platform to ease deployment of new or existing Local mode APs. Upon joining the Flex controller, the system can be configured to automatically convert all APs to H-REAP mode without administrator intervention. Alternatively, they can be converted to Monitor mode instead, or this feature can be disabled. This should come as a welcome feature for network admins during migration!
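The scaling and latency figures quoted in this post lend themselves to a quick sanity check of a proposed branch design. The sketch below is my own illustration, not a Cisco tool: it assumes one H-REAP group per branch, and the `Branch` fields and function name are hypothetical. The limits themselves are the first-release numbers mentioned in this post.

```python
from dataclasses import dataclass

# Limits quoted in this post for the first Flex 7500 release.
MAX_GROUPS = 500                    # H-REAP groups per controller
MAX_APS_PER_GROUP = 50              # access points per H-REAP group
MAX_APS = 2000                      # access points per controller
MAX_CLIENTS = 20000                 # clients per controller
MAX_LATENCY_LOCAL_AUTH_MS = 2000    # WAN latency with H-REAP local authentication
MAX_LATENCY_CENTRAL_AUTH_MS = 300   # WAN latency with external RADIUS / AAA


@dataclass
class Branch:
    name: str
    aps: int
    clients: int
    wan_latency_ms: float
    local_auth: bool  # True if the branch uses H-REAP local authentication


def check_design(branches):
    """Return a list of problems; an empty list means no quoted limit is exceeded."""
    problems = []
    if len(branches) > MAX_GROUPS:
        problems.append(f"{len(branches)} branches exceed {MAX_GROUPS} H-REAP groups")
    total_aps = sum(b.aps for b in branches)
    if total_aps > MAX_APS:
        problems.append(f"{total_aps} APs exceed the {MAX_APS} AP controller limit")
    total_clients = sum(b.clients for b in branches)
    if total_clients > MAX_CLIENTS:
        problems.append(f"{total_clients} clients exceed the {MAX_CLIENTS} client limit")
    for b in branches:
        if b.aps > MAX_APS_PER_GROUP:
            problems.append(f"{b.name}: {b.aps} APs exceed {MAX_APS_PER_GROUP} per group")
        limit = MAX_LATENCY_LOCAL_AUTH_MS if b.local_auth else MAX_LATENCY_CENTRAL_AUTH_MS
        if b.wan_latency_ms > limit:
            problems.append(f"{b.name}: {b.wan_latency_ms} ms latency exceeds {limit} ms")
    return problems


# Hypothetical branches: the second one is too large and too far away for central AAA.
design = [
    Branch("store-1", aps=10, clients=100, wan_latency_ms=250, local_auth=False),
    Branch("store-2", aps=60, clients=200, wan_latency_ms=1500, local_auth=False),
]
for problem in check_design(design):
    print(problem)
```

Note how the latency budget depends on the authentication choice: the same 1500 ms link passes with H-REAP local authentication but fails with external RADIUS.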

Along with the good comes the bad. Unfortunately, the FlexConnect architecture is simply an enhancement to H-REAP, and therefore inherits the existing H-REAP feature limitations previously discussed.

In addition to the standard H-REAP guidelines and feature limitations, FlexConnect is also missing a few other features in the first release. Most notable among these are lack of support for location tracking and mesh networking. Location tracking is on the roadmap for inclusion in the next major software release; it just didn't make first shipment. Certain AP modes are not supported by the Flex controller at this time due to solution scalability concerns, including Local, Sniffer, Rogue Detector, and Bridge/Mesh.

Minimizing Operational Risk
There are also a few other items to be aware of when deploying centralized controllers. Since a large portion of the distributed wireless network is being controlled from a single point, a greater potential exists for large network disruptions due to mis-configuration or administrator error. This highlights the need for thorough lab testing of changes and upgrades prior to production rollout. Customers deploying the FlexConnect architecture should ensure adequate change control procedures exist, including test, implementation, verification, and backout plans. Additionally, granular administrative access should be enforced, and support procedures should be reviewed to minimize the chance for error during incident response.

From a hardware perspective, monitoring for server hard drive failure should be integrated into standard server monitoring and maintenance activities so that failed units are identified and replaced promptly. The reliance on mechanical hard drives in the Flex controller may result in increased support issues versus the embedded flash drives found in existing wireless LAN controller platforms. However, when properly planned for, this should not be a major concern.

Revolution or Evolution? - Andrew's Take
The Cisco Flex 7500 centralized controller platform is an enhancement of Cisco's previous H-REAP architecture, now re-branded FlexConnect. This solution is an evolutionary advancement of the controller architecture which improves scalability and high availability, while reducing customer expense of deployment versus a distributed controller architecture. It is aimed squarely at highly distributed organizations where the economics of branch site deployment are of paramount consideration. Many customers will find this evolution a welcome addition to the Cisco portfolio. However, Cisco still has a long road ahead on the migration path to a fully intelligent edge access point architecture, and will continue developing products that allow the company to replace existing wireless LAN controller revenue with new distributed features.

Friday, May 13, 2011

Competition in the wireless LAN industry is heating up. With the rapid evolution of wireless LAN architectures and feature innovation, debate has arisen regarding the most important criteria for vendor selection. I would like to hear what you have to say about the subject.

Thursday, May 12, 2011

It was only a matter of time, but Wi-Fi Direct capable equipment is now hitting the consumer market. According to In-Stat research, over 173 million devices are expected to ship with Wi-Fi Direct in 2011. The functionality is really enabled in software, so existing devices should be able to support Wi-Fi Direct with firmware or software upgrades, but manufacturer support with legacy devices may not be a priority. Expect to see this capability mainly built into newly developed products.

A few of the notable announcements thus far include:

Eye-Fi X2 Card
I wrote about the initial announcement back in January. As I said back then:

This is a smart play by the company to leverage the increasing utilization of "smart" mobile devices by consumers to allow photo enthusiasts to immediately transfer photos from a DSLR or professional camera to a phone, tablet, or other computing device. This means less effort and complexity for photographers in the field. For the professional, this could mean immediate review for correct composition without the time required to change cameras, unpack editing equipment, or potentially miss a great photo opportunity. For consumer use, this can mean immediate upload to social networks without having to wait until returning home, which could be beneficial during travel or vacations.

HP Wi-Fi Direct Mobile Mouse
Today, HP announced a Wi-Fi Direct capable mouse, eliminating the need to use Bluetooth or an external wireless dongle, instead using a workstation's built-in Wi-Fi receiver. HP claims first to market with this type of peripheral. The mouse should also feature twice the battery life of comparable Bluetooth models.

Revolution or Evolution? - Andrew's Take
Wi-Fi Direct will revolutionize information sharing among portable and fixed electronics. Expect to see many more announcements this year about Wi-Fi Direct products, including printers, gaming systems, workstations, laptops, tablets, and peripherals. I also wouldn't be surprised to see it pop into other consumer electronics like connected televisions, Blu-ray players, and streaming music systems.

As a Ford Sync owner, I would also be interested in seeing it included in a future firmware update. Currently, Sync pairs with portable electronics via Bluetooth for audio and data transfer, and Wi-Fi Direct would be a logical extension of the system.

Wednesday, May 4, 2011

Protocol analysis skills continue to be increasingly important for network engineers in all fields, especially for wireless engineers. As the 802.11 protocol continues to increase in complexity, maintaining interoperability while implementing support for numerous optional features becomes critical. Even more critical is having the skill set to investigate, diagnose, and resolve issues.

Engineers attempting to learn protocol analysis techniques often start with free tools that allow them to get comfortable looking at packets and expected versus abnormal behavior. However, this often comes at the expense of sophisticated analysis features which can greatly simplify the process and reduce analysis time. This can be both a blessing and a curse at the same time. It's a blessing for engineers because it forces them to learn the fundamentals of protocol analysis without the aid of automated tools that abstract the underlying protocol operation. This is a good thing (despite initial grumblings by those learning). It can also be a curse, because engineers often need to resolve issues quickly and efficiently, where sophisticated analysis tools can help identify and determine the root cause much faster.

Smart IT organizations will implement a mix of both scenarios, purchasing the (expensive) analysis tools for experienced engineers and the support organization, while training junior engineers or those new-in-role using the fundamentals approach.

The first step is for an engineer to learn and understand the fundamental Wi-Fi protocol exchanges such as active scanning, association, 802.1X/EAP authentication, the 4-way handshake, as well as various packets of interest including 802.11 power management techniques, retransmissions, fragmentation, medium reservation (RTS/CTS), and protection mechanisms. Easy identification of these exchanges can be achieved using Wireshark coloring rules and display filters as previously discussed.

In this post, we will continue our look at free methods to enhance Wi-Fi protocol analysis using incrementally more sophisticated analysis techniques. In subsequent posts, we will explore professional analysis tools that can automate many of these techniques.

Wireshark WLAN Traffic Statistics
The WLAN Traffic Statistics tool provides engineers with a high-level overview of the networks (BSSIDs) that are observed within the capture.

Navigate to the Statistics menu, then select WLAN Traffic.

Wireshark WLAN Traffic Statistics View

The top frame displays network traffic volumes by BSSID as a percentage of packets observed, as well as breakdowns for common wireless frame types such as beacons, probe req/resp, authentications, and de-auths. This information can be useful to identify which base stations and SSIDs are most active in the area and time the packet capture was taken.
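To make the arithmetic behind that top frame concrete, here is a minimal Python sketch of the same kind of summary: the share of observed frames per BSSID, plus a per-BSSID count of frame types. It is not how Wireshark computes the view internally. The input format (a list of BSSID and frame-type pairs), the function names, and the sample MAC addresses are all assumptions made for illustration.

```python
from collections import Counter

def bssid_packet_share(frames):
    """Return {bssid: percent of all observed frames}.

    `frames` is a hypothetical list of (bssid, frame_type) tuples,
    e.g. exported from a capture by some other means.
    """
    counts = Counter(bssid for bssid, _ in frames)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {bssid: 100.0 * n / total for bssid, n in counts.items()}

def frame_type_breakdown(frames):
    """Return {bssid: Counter of frame types seen for that BSSID}."""
    breakdown = {}
    for bssid, frame_type in frames:
        breakdown.setdefault(bssid, Counter())[frame_type] += 1
    return breakdown

# Hypothetical sample data, not taken from a real capture.
sample_frames = [
    ("00:11:22:33:44:55", "beacon"),
    ("00:11:22:33:44:55", "data"),
    ("00:11:22:33:44:55", "data"),
    ("66:77:88:99:aa:bb", "beacon"),
]

share = bssid_packet_share(sample_frames)
types = frame_type_breakdown(sample_frames)
for bssid in sorted(share, key=share.get, reverse=True):
    print(bssid, f"{share[bssid]:.1f}%", dict(types[bssid]))
```

Sorting by share puts the most active BSSIDs first, which mirrors how the statistics view is typically read.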

By selecting a network from the top frame, a list of traffic within the BSSID is shown in the bottom frame. This can give engineers valuable information about top talkers within the network and can be useful for identifying bandwidth hogs, problematic clients, or clients having issues indicated by excessive probing or de-auth behavior. This can also be a rough measure of quality of service based on packet transmissions on the network. However, be sure NOT to use this as a measure of airtime fairness, as most vendor algorithms are based on byte-level fairness to override packet-level fairness inherent in the 802.11 protocol.

If you want to limit WLAN traffic statistics to a subset of packets in the capture, apply a display filter for the desired traffic, then open the statistics tool and check the box that states "Limit to display filter". This allows more focused analysis on subsets of data within the packet capture.

If you find a network or station of interest, Wireshark does provide some basic drill-down filtering capabilities by right-clicking on the entry, as shown below.

Wireshark's Basic Drill-Down Filtering

Wireshark IO Graphs
The Wireshark IO Graphs tool allows engineers to graphically represent data within the packet capture for more intuitive analysis of information. This can be useful to graph the occurrence of events or packet exchanges over time, or to graph the relationship between multiple types of packets over time. This automates many analysis scenarios, eliminating manual compilation of such data.

Navigate to the Statistics menu, then select IO Graphs.

Wireshark IO Graphs

For example, the graph above shows the relationship between wireless data frames (line graph) and wireless retransmissions of data frames (bar graph). This allows the engineer to graphically observe network health over time and identify periods of degraded performance due to retransmissions. Here we see a spike of retransmissions around time mark 13:51:28 in the packet capture.
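A graph like this is commonly built from two display filters, one for data frames (for example `wlan.fc.type == 2`) and one for retransmitted data frames (adding `wlan.fc.retry == 1`). The Python sketch below reproduces the idea outside of Wireshark: it buckets data frames into fixed time intervals and reports the retry percentage for each. The input format (timestamp in seconds plus a retry flag), the function name, and the sample values are assumptions for illustration only.

```python
from collections import defaultdict

def retry_rate_per_interval(frames, interval=1.0):
    """Summarize retransmissions per time interval.

    `frames` is a hypothetical iterable of (timestamp_seconds, is_retry)
    tuples, one per data frame. Returns a list of
    (interval_start, data_frames, retries, retry_percent), sorted by time.
    """
    buckets = defaultdict(lambda: [0, 0])  # start -> [data frames, retries]
    for ts, is_retry in frames:
        start = int(ts // interval) * interval
        buckets[start][0] += 1
        if is_retry:
            buckets[start][1] += 1
    return [
        (start, total, retries, 100.0 * retries / total)
        for start, (total, retries) in sorted(buckets.items())
    ]

# Hypothetical sample data: seconds since the start of the capture.
sample = [
    (0.1, False), (0.4, False), (0.9, True),
    (1.2, True), (1.5, True), (1.8, False), (1.9, False),
]

result = retry_rate_per_interval(sample)
worst = max(result, key=lambda row: row[3])
for start, total, retries, pct in result:
    print(f"t={start:>4.1f}s data={total} retries={retries} ({pct:.1f}%)")
```

The interval with the highest retry percentage (`worst` here) is the natural place to narrow the display filter for closer inspection.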

IO Graphs use the same syntax as display filters and coloring rules, so virtually any field or information within a packet capture can be graphed. Also note that if the filter is modified, you must un-select and re-select the Graph1 through Graph5 buttons to the left for the new filter to be applied and shown.

Additional Wireshark Features
In addition to WLAN traffic statistics and IO graphs, take time to explore the use of other built-in analysis tools. These include:

Enabled Protocols - used to decode various protocols for interpretation and analysis. Be sure to enable wireless protocols such as IEEE 802.11, LWAPP, CAPWAP, EtherIP (EoIP), RADIUS, EAPoL, EAP, and WLCCP. This will aid analysis of encapsulated protocols used in lightweight architectures as well as common wireless protocols either over the air or on the wire.

Endpoints - to identify top talkers and data volume per station, based on either frames or bytes.

Set Time References - used to mark packets and adjust the time displayed in subsequent packets based on the marked packet. Useful for marking the beginning of a client roam and calculating the time required for an individual roam event. It's also useful for quickly setting time references on the first packets of all roaming events at once (tip - set a display filter for EAPoL Start or EAP Request Identity frames), or to see how long a client was associated to each AP before roaming.

Sample Roam Time Calculation Using the Wireshark Set Time Reference Feature
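The calculation behind a time reference is simple subtraction: the roam time is the timestamp of the last frame of the roam exchange minus the timestamp of the frame marked as the reference. The Python sketch below works through that arithmetic on a hand-built list of frames. The marker labels (`roam_start`, `roam_end`), the function names, and the timestamps are hypothetical; in practice you would decide which frames begin and end a roam (for example, the first EAPoL frame and the final 4-way handshake message).

```python
def roam_times(frames):
    """Return the duration of each roam in seconds.

    `frames` is a hypothetical chronological list of (timestamp, marker)
    tuples, where marker is "roam_start", "roam_end", or None.
    """
    durations = []
    start = None
    for ts, marker in frames:
        if marker == "roam_start":
            start = ts  # equivalent to setting a time reference here
        elif marker == "roam_end" and start is not None:
            durations.append(round(ts - start, 6))
            start = None
    return durations

def time_between_roams(frames):
    """Approximate how long the client stayed on each AP by measuring
    the gap between consecutive roam start frames."""
    starts = [ts for ts, marker in frames if marker == "roam_start"]
    return [round(later - earlier, 6) for earlier, later in zip(starts, starts[1:])]

# Hypothetical sample data: seconds since the start of the capture.
sample = [
    (10.000, "roam_start"),
    (10.020, None),
    (10.085, "roam_end"),
    (42.500, "roam_start"),
    (42.610, "roam_end"),
]

durations = roam_times(sample)
dwell = time_between_roams(sample)
print("roam durations (s):", durations)
print("time between roams (s):", dwell)
```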

Identify the BSSID and/or station transferring the most frames in the WLAN Traffic Statistics tool, apply an appropriate display filter to limit the scope of analysis, then review the frame and byte level data using the Endpoints tool.

Identify a period of time where there are a large percentage of 802.11 retransmissions in the IO Graphs, apply a display filter to narrow the packet range to just that time interval and only retransmitted frames, then view the WLAN Traffic Statistics limited to displayed packets to see what BSSIDs or stations were having the most problems. This will help identify if there is an issue with one station (hidden node, localized interference by STA, bad hardware, multipath, etc.), all stations on one access point (failing AP, localized interference by AP, installation error, etc.), or if there are problems with multiple APs and stations in the area (larger source of interference, environmental issue, etc.).
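The triage reasoning in that last workflow can be sketched in a few lines of Python: given the retransmitted frames from the problem interval, check whether one station or one BSSID accounts for most of them. This is only an illustration of the decision logic, not a diagnostic tool. The input format, the function name, and especially the 80% `dominance` threshold are assumptions chosen for the example.

```python
from collections import Counter

def classify_retry_scope(retries, dominance=0.8):
    """Guess the scope of a retransmission problem.

    `retries` is a hypothetical list of (bssid, station) tuples, one per
    retransmitted frame. `dominance` is an assumed threshold: the share
    of retries one station or AP must own to be called the likely culprit.
    """
    if not retries:
        return "no retransmissions"
    total = len(retries)
    station, sta_count = Counter(sta for _, sta in retries).most_common(1)[0]
    bssid, ap_count = Counter(ap for ap, _ in retries).most_common(1)[0]
    if sta_count / total >= dominance:
        return f"single station: {station}"
    if ap_count / total >= dominance:
        return f"single AP: {bssid}"
    return "multiple APs and stations"

# Hypothetical sample data using short placeholder names.
one_station = [("ap1", "sta1")] * 9 + [("ap2", "sta2")]
one_ap = [("ap1", f"sta{i}") for i in range(1, 6)] * 2
widespread = [("ap1", "sta1"), ("ap2", "sta2"), ("ap3", "sta3"), ("ap1", "sta4")]

print(classify_retry_scope(one_station))
print(classify_retry_scope(one_ap))
print(classify_retry_scope(widespread))
```

The station check runs first because a single misbehaving client will also make its AP look dominant; only when no station stands out does the AP-level result mean much.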

Revolution or Evolution? - Andrew's Take
Free tools such as Wireshark are great for engineers who need to learn how protocols operate by experiencing them first hand. Also, by knowing some of the advanced features of such tools, both beginning and seasoned engineers can perform more in-depth and sophisticated protocol analysis.

However, there are limitations to free protocol analysis tools. They often have problems opening and analyzing large packet captures, difficulty or complexity in identifying and narrowing the focus of analysis, and limited ability to perform trending analysis. They also take time to learn and master.

In subsequent posts, I will explore more professional (paid) tools that eliminate some of these limitations, and automate sophisticated analysis techniques to reduce the learning curve required to accomplish similar tasks.