Tuesday, 3 November 2015

A WiFi client is usually connected to an AccessPoint for WPA-protected network access. When the AccessPoint becomes unreachable, for example because the client has moved, the client needs to switch over to another AccessPoint providing the same extended service set (ESSID). This is called roaming.

Roaming involves the following steps:

Scanning for another reachable AccessPoint (Probe)

Disconnect from the old AP

Changing the WiFi-channel

Authentication and Association

WPA-EAP-Handshake (only for 802.1X)

4-way WPA handshake

Connection is ready again

These steps take time, and the client is disconnected until it has completed them, but there are some options available to speed things up:

PSK authentication does not need the EAP handshake, but adds one (backbone) round-trip time (RTT) during the authentication handshake if the passphrase is fetched from RADIUS.

802.11i pre-Authentication enables a client to perform the EAP handshake while still connected to the old AccessPoint.

802.11r (fast transition) over-air performs the 4-way WPA handshake piggy-backed on the Authentication and Association frames and skips the EAP-Handshake.

802.11r (fast transition) over-ds does the same as over-air but has the Authentication handshake (not association) performed while still connected to the old AccessPoint.
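
The effect of these options can be summarised as a small lookup of which steps remain after the client has left the old AccessPoint. This is only a sketch of the description above: the mode names are hypothetical labels chosen for this example, and scanning is assumed to happen while the client is still connected.

```python
from typing import Dict, List

# Steps a client still has to perform after disconnecting from the old AP,
# following the list of options above. Mode names are illustrative labels.
OFFLINE_STEPS: Dict[str, List[str]] = {
    "wpa-eap": ["change channel", "authentication", "association",
                "EAP handshake", "4-way handshake"],
    # PSK has no EAP handshake at all.
    "wpa-psk": ["change channel", "authentication", "association",
                "4-way handshake"],
    # Pre-authentication moves the EAP handshake before the disconnect.
    "wpa-eap+preauth": ["change channel", "authentication", "association",
                        "4-way handshake"],
    # FT over-air piggy-backs the 4-way handshake on auth/assoc, skips EAP.
    "ft-over-air": ["change channel", "authentication", "association"],
    # FT over-DS additionally authenticates via the old AP beforehand.
    "ft-over-ds": ["change channel", "association"],
}


def offline_steps(mode: str) -> List[str]:
    """Return the steps performed while the client has no connection."""
    return list(OFFLINE_STEPS[mode])


if __name__ == "__main__":
    for mode in OFFLINE_STEPS:
        print(f"{mode:16s} {', '.join(offline_steps(mode))}")
```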

VLANs are used to separate WiFi clients connected to a BSS (which identifies the AccessPoint and the ESSID used) into groups and connect them to different ethernet networks (VLANs).

The following explanations and configuration examples partly require patches to hostapd that are not yet included in hostapd upstream, but they can be found here: https://github.com/michael-dev/hostapd .

802.11i pre-Authentication

With pre-Authentication, the EAP handshake is packed into data frames with ether type 0x88C7 and destination MAC equal to the new AccessPoint's BSSID, which are sent to the old AccessPoint the client is still connected to. As they are data frames, they get converted into ethernet packets and bridged into the VLAN the client is assigned to on the old AccessPoint. The new AccessPoint listens for those packets in the same VLAN and replies using ethernet packets with destination MAC equal to the client's WiFi MAC address, so the EAP handshake can be completed in the same way as with WPA-EAP.

# hostapd.conf
rsn_preauth=1
rsn_preauth_interfaces=br-mgnt
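
To make the addressing concrete, the following sketch builds the ethernet header of such a pre-authentication packet as described above: destination is the new AccessPoint's BSSID, source is the client, ether type is 0x88C7. The function names and the MAC addresses are made up for this example, and the payload is a placeholder rather than a real EAPOL message.

```python
import struct

ETH_P_PREAUTH = 0x88C7  # ether type used for 802.11i pre-authentication


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    parts = mac.split(":")
    if len(parts) != 6:
        raise ValueError(f"not a MAC address: {mac!r}")
    return bytes(int(part, 16) for part in parts)


def build_preauth_frame(new_ap_bssid: str, client_mac: str,
                        payload: bytes) -> bytes:
    """Ethernet header (dst, src, ether type) followed by the payload."""
    header = (mac_to_bytes(new_ap_bssid)
              + mac_to_bytes(client_mac)
              + struct.pack("!H", ETH_P_PREAUTH))
    return header + payload


if __name__ == "__main__":
    # Hypothetical, locally administered addresses.
    frame = build_preauth_frame("02:00:00:00:00:02", "02:00:00:00:00:01",
                                b"\x01\x02")
    print(frame.hex())
```

The reply from the new AccessPoint would use the same layout with the two addresses swapped.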

In order to enable 802.11i pre-Authentication with pre-configured VLANs, the AccessPoint needs to listen for packets with ether type 0x88C7 in all of its VLANs. This can be configured by giving multiple (space-separated) rsn_preauth_interfaces to hostapd. However, this does not work with fully dynamic VLANs which hostapd sets up on the fly, as the destination AccessPoint might not yet know about all VLANs, so it cannot listen in all VLANs the clients could be connected to. Therefore packets with ether type 0x88C7 need to be exchanged not in the VLAN assigned to the WiFi client but in the VLAN where the destination AccessPoint is actually listening (br-mgnt). This is solved by making hostapd copy those packets across the VLANs used.

# hostapd.conf
rsn_preauth_copy_iface=br-mgnt

As these pre-Authentication packets no longer need to be forwarded in any VLAN other than br-mgnt, they can be filtered using ebtables:

ebtables -A FORWARD --logical-in brvlan+ -p 0x88C7 -j DROP

In order for the new AccessPoint to actually receive the packets, the BSSID needs to be a local MAC on br-mgnt. This can be achieved by adding some dummy network devices to br-mgnt, for example:

# hostapd.conf
rsn_preauth_autoconf_bridge=1

802.11r fast transition (FT)

FT roaming is a bit more complicated, as it is able to avoid the EAP handshake during roaming altogether. It consists of three parts:

Key management

Key distribution

Over-DS communication

Key management

With both WPA-PSK (password) and WPA-EAP (802.1X) the WiFi client and the AccessPoint end up with a Master Session Key (MSK), that is used to derive the keys actually used for encryption of data packets.

802.11r splits the AccessPoint into two logical entities called R0KH and R1KH, where R0KH can be thought of as the "controller" and R1KH as the mere AccessPoint. R0KH gets access to a key R0 that is derived from the master key (MSK) and an identifier called R0KH-ID (its NAS-Identifier). Similarly, R1KH will only receive a derived key R1 from R0KH. R1 is derived from R0 using an identifier called R1KH-ID (which could be its MAC address). This would enable the MSK to be stored outside of the AccessPoint, so that an AccessPoint would not hold enough information to pose as a different AccessPoint (identity stealing), though this is not the case with hostapd.
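
The shape of this key hierarchy can be sketched as follows. Note that this is not the key derivation function defined by IEEE 802.11r: HMAC-SHA256 is used here purely as a stand-in, and the real derivation mixes in further inputs. The identifiers and the all-zero MSK are placeholders. The sketch only shows that R1 depends on R1KH-ID, so each AccessPoint ends up with a different R1 for the same session.

```python
import hashlib
import hmac


def derive(parent_key: bytes, label: bytes, holder_id: bytes) -> bytes:
    """Stand-in KDF (HMAC-SHA256). NOT the KDF specified by 802.11r."""
    return hmac.new(parent_key, label + b"\x00" + holder_id,
                    hashlib.sha256).digest()


def derive_r0(msk: bytes, r0kh_id: bytes) -> bytes:
    """R0 is bound to the MSK and the R0KH-ID (the NAS-Identifier)."""
    return derive(msk, b"R0", r0kh_id)


def derive_r1(r0: bytes, r1kh_id: bytes) -> bytes:
    """R1 is bound to R0 and the R1KH-ID (for example a MAC address)."""
    return derive(r0, b"R1", r1kh_id)


if __name__ == "__main__":
    msk = bytes(32)                                  # placeholder MSK
    r0 = derive_r0(msk, b"ap1.example.com")          # hypothetical R0KH-ID
    r1_ap1 = derive_r1(r0, bytes.fromhex("020000000001"))
    r1_ap2 = derive_r1(r0, bytes.fromhex("020000000002"))
    print(r1_ap1.hex())
    print(r1_ap2.hex())
```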

hostapd includes both R0KH and R1KH and derives R0 from the MSK itself, so you don't need a controller here. The concept of R0KH and R1KH is still needed for configuration, though.

To enable this key management scheme, hostapd.conf needs to contain the appropriate one of the following settings:

# 802.1X
# - only FT-clients
wpa_key_mgmt=FT-EAP
# - FT and non-FT clients
wpa_key_mgmt=WPA-EAP FT-EAP

Key distribution

The old AccessPoint, which the client was connected to just before roaming.

R1KH: The new AccessPoint, which the client is connecting to.

For key distribution, though, only R0KH and the new AccessPoint are involved.

The AccessPoint a client first connects to becomes R0KH for this session, as it stores the key R0. Whenever a WiFi client roams, it will provide the new AccessPoint the R0KH-ID of this first AccessPoint. The new access point (in the role of R1KH) now fetches the key from the first AccessPoint (R0KH). R0KH will provide to R1KH the derived key R1, which might be specific to this AccessPoint (due to R1KH-ID being involved).

hostapd can distribute the R1 in three ways:

With push mode, R0KH derives R1 for all other AccessPoints (R1KH) it knows about and sends it to them, so whenever a WiFi client connects, all AccessPoints will learn the relevant keys and roaming can happen instantly.

With pull mode, R1KH sends a request packet to R0KH, which then generates R1 and sends it back to R1KH.

With PSK, hostapd can generate R1 locally as it already knows the MSK (derived from passphrase). The passphrase needs to be present, as the client could as well start its session locally. So no inter-AP communication is required.

In order for hostapd to derive the FT keys locally when using PSK, the following setting is required:

# hostapd.conf
psk_generate_local=1

Pushing can be enabled using

# hostapd.conf
pmk_r1_push=1

Packets exchanged between R0KH and R1KH for key distribution are addressed using ethernet MAC addresses and protected using cryptography (AES). So in order to send a pull request to R0KH, hostapd needs to be able to look up the MAC address and key when given the R0KH-ID. Similarly, for push or reply, R0KH needs to be able to look up the key (given the MAC address) or enumerate all R1KH (key + MAC address).

# hostapd.conf
r0kh=R0KH-MAC R0KH-ID AES-KEY
r1kh=R1KH-MAC R1KH-ID AES-KEY
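
The two lookups described above can be illustrated with a small parser for these lines. This is a sketch and not how hostapd stores the entries internally: the function and type names are invented for this example, and the sample identifiers, MAC addresses and keys are placeholders.

```python
from typing import Dict, NamedTuple, Tuple


class KeyHolder(NamedTuple):
    mac: str
    holder_id: str
    key: str


def parse_key_holders(conf_text: str) -> Tuple[Dict[str, KeyHolder],
                                               Dict[str, KeyHolder]]:
    """Return (R0KH entries by R0KH-ID, R1KH entries by MAC address)."""
    r0kh_by_id: Dict[str, KeyHolder] = {}
    r1kh_by_mac: Dict[str, KeyHolder] = {}
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if name not in ("r0kh", "r1kh"):
            continue
        fields = value.split()
        if len(fields) != 3:
            raise ValueError(f"expected 'MAC ID KEY', got: {value!r}")
        entry = KeyHolder(fields[0].lower(), fields[1], fields[2])
        if name == "r0kh":
            r0kh_by_id[entry.holder_id] = entry   # pull: R0KH-ID -> MAC, key
        else:
            r1kh_by_mac[entry.mac] = entry        # push/reply: MAC -> key
    return r0kh_by_id, r1kh_by_mac


# Placeholder values, not taken from a real deployment.
SAMPLE = """
# hostapd.conf
r0kh=02:00:00:00:00:01 ap1.example.com 000102030405060708090a0b0c0d0e0f
r1kh=02:00:00:00:00:02 02:00:00:00:00:02 00112233445566778899aabbccddeeff
"""

if __name__ == "__main__":
    r0kh, r1kh = parse_key_holders(SAMPLE)
    print(r0kh["ap1.example.com"].mac)   # where to send a pull request
    print(list(r1kh))                    # all R1KH a push would be sent to
```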

A WiFi client learns that it can roam between two FT-enabled AccessPoints in the same ESS (SSID) by looking at the mobility_domain information.

# hostapd.conf
mobility_domain=0101

The interface used to send and receive these packets can be configured using

# hostapd.conf
ft_bridge=br-mgnt

When sending such a packet, the local BSSID is used as the source MAC address, so each AP needs to be able to receive packets on ft_bridge for its local BSSID. Therefore hostapd will add dummy interfaces to ft_bridge, so that the BSSID will appear to be local to the ft_bridge bridge.

Auto-discovery of R0KH and R1KH

Over-DS configuration

With FT-over-DS, the authentication part of the initial handshake with the new AccessPoint is done via the old AccessPoint. To do so, the client sends an FT-Action-Frame to the old AccessPoint, which basically forwards it to the new AccessPoint; the frame has roughly the same content as the FT authentication request. The forwarding does not involve the R0KH/R1KH configuration but instead is done using the new AccessPoint's BSSID as destination, similar to what is done in 802.11i pre-Authentication. The new AccessPoint will reply (possibly after querying the RADIUS server or fetching the key for WPA encryption) by sending a packet to the old AccessPoint, which will then send an FT-Action-Response frame to the client. After this, the client disconnects from the old AccessPoint, changes the channel and communicates with the new AccessPoint directly.

As addressing of the inter-AP packets is done using the BSSID as source and destination MAC address, both AccessPoints need to receive packets destined for their BSSID on ft_bridge. Therefore hostapd will add dummy interfaces to ft_bridge, so that the BSSID will appear to be local to the ft_bridge bridge.

Pitfalls

IEEE 802.11 Authentication and Association are time-critical operations, that is, they have quite small timeouts in the Linux kernel mac80211 client implementation (about 200 ms per retransmission and at most 3 vs. 2 transmissions). So if hostapd (AP) does not reply within that time, the client will switch to another AP.

There were two things in hostapd that could block:

WPA passphrase hashing from RADIUS reply

VLAN setup

Maybe they only block on slow hardware with too much debugging (address sanitizer) enabled. Nevertheless the following might help if you want to run hostapd with address sanitizer enabled.

WPA passphrase hashing

hostapd can fetch the WPA passphrase(s) per station from RADIUS. RADIUS will then return the unhashed passphrase(s), and hostapd will need to hash them together with the SSID so that the keys can be derived. These hashing functions are designed not to be too fast, so when authentication waits for the RADIUS reply processing (including hashing) to complete, a timeout might occur on the client side. This is partly mitigated as the hashed passphrase is cached along with the RADIUS reply, so when a client retries sufficiently fast at the same AP (which it usually avoids), authentication might still succeed shortly after.
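
For reference, the hashing in question is the standard WPA passphrase-to-PSK mapping: PBKDF2 with HMAC-SHA1, the SSID as salt, 4096 iterations and a 256-bit result. The sketch below reproduces it with Python's standard library and measures how long one derivation takes on the machine it runs on. The passphrase and SSID are placeholders, and the timing says nothing about the hardware an AccessPoint actually uses.

```python
import hashlib
import time


def wpa_psk(passphrase: str, ssid: str) -> bytes:
    """Derive the 256-bit PSK from a WPA passphrase and the SSID."""
    if not 8 <= len(passphrase) <= 63:
        raise ValueError("WPA passphrases are 8 to 63 characters long")
    return hashlib.pbkdf2_hmac("sha1",
                               passphrase.encode("utf-8"),
                               ssid.encode("utf-8"),
                               4096,   # iteration count
                               32)     # output length in bytes


if __name__ == "__main__":
    start = time.perf_counter()
    psk = wpa_psk("example passphrase", "example-ssid")  # placeholder values
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    print(psk.hex())
    print(f"derivation took {elapsed_ms:.1f} ms")
```

Because the SSID is part of the input, a hash precomputed outside the AP is only valid for that one SSID.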

For non-fast-transition (non-FT) roaming clients this can be worked around by deferring passphrase hashing until it is actually needed: the WPA handshake after authentication and association, which has bigger timeouts. With FT roaming clients this does not help, though, as the WPA handshake is piggy-backed on the authentication and association frames, so there is nothing to defer to. So for FT, hashing needs to be done outside of the AP. If the RADIUS server / gateway is sufficiently fast, hashing might be done there on the fly. Alternatively, the hashed passphrase needs to be stored in the authentication source.

VLAN setup

When a WiFi client is assigned to a VLAN, hostapd will automatically set up the interfaces and bridges required (about three per VLAN). Especially with tagged VLANs and/or NETLINK support, this might be slow: imagine 32 VLANs with 3 interfaces each, so 96 interfaces need to be created and brought up (ifconfig up), along with 64 bridge enslave commands that need to be issued. If this is to complete within 100 ms, less than 1 ms is left for each command. Additionally, the netlink VLAN implementation in hostapd might still have room for improvement: during some tests with heavy debugging enabled, it needed up to 300 ms to add a single VLAN interface (vlan_add).
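
The time budget can be checked with a few lines of arithmetic. The sketch assumes two enslave commands per VLAN (64 for 32 VLANs, as in the example above) and optionally counts bringing each interface up as a separate command; the function name and parameters are chosen for this example only.

```python
def per_command_budget_ms(vlans: int = 32,
                          interfaces_per_vlan: int = 3,
                          enslaves_per_vlan: int = 2,
                          deadline_ms: float = 100.0,
                          count_link_up: bool = False) -> float:
    """Average time available per command if all must fit the deadline."""
    interfaces = vlans * interfaces_per_vlan            # 32 * 3 = 96
    commands = interfaces + vlans * enslaves_per_vlan   # 96 + 64 = 160
    if count_link_up:
        commands += interfaces                          # one "up" per interface
    return deadline_ms / commands


if __name__ == "__main__":
    print(per_command_budget_ms())                      # create + enslave only
    print(per_command_budget_ms(count_link_up=True))    # including "up"
```

Either way the budget stays well below 1 ms per command, while a single vlan_add took up to 300 ms in the tests mentioned above.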

Assigning a WiFi client to a vlan interface in hostapd is strictly linked to group key setup. For a station to get the correct group key, it needs to be assigned to the correct wpa_group state machine. That state machine will fail if the related AP_VLAN interface is missing. So only uplink configuration (that is bridge, tagged vlan interfaces and tagged vlans) can be deferred. Additionally, blocking the single-threaded hostapd process at any time is not a good idea with respect to the timeouts of other clients, as hostapd cannot process a single authentication or association request while being busy configuring some VLANs.

The solution is to make VLAN configuration interruptible, so other requests can be served in parallel (interleaved). This requires a VLAN setup state machine, so setup and teardown can be resumed properly. It can be enabled by compiling hostapd with

CONFIG_VLAN_ASYNC=y
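
The idea of interleaving can be illustrated with a toy event loop in which VLAN setup is a resumable job that yields after every step. This is only a conceptual sketch and does not reflect how the hostapd patch is implemented; the step names and request strings are invented for the example.

```python
from collections import deque
from typing import Deque, Generator, List

SetupJob = Generator[None, None, None]


def vlan_setup(vlan_id: int, log: List[str]) -> SetupJob:
    """Resumable VLAN setup: one (hypothetical) step per resume."""
    for step in ("create tagged interface", "create bridge",
                 "enslave interfaces"):
        log.append(f"vlan {vlan_id}: {step}")
        yield  # hand control back to the event loop after each step


def event_loop(setups: Deque[SetupJob], requests: Deque[str],
               log: List[str]) -> None:
    """Alternate between pending client requests and one setup step."""
    while setups or requests:
        if requests:
            log.append(f"handled {requests.popleft()}")
        if setups:
            job = setups.popleft()
            try:
                next(job)           # run exactly one step
                setups.append(job)  # not finished yet, resume later
            except StopIteration:
                pass                # this VLAN is fully set up


if __name__ == "__main__":
    log: List[str] = []
    event_loop(deque([vlan_setup(10, log)]),
               deque(["auth from sta1", "assoc from sta1"]),
               log)
    print("\n".join(log))
```

In this sketch the association request is handled before the VLAN is fully configured, which is the behaviour a blocking setup would prevent.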

Issues encountered so far

After enabling FT-PSK in addition to WPA-PSK (which is still enabled), some Android 4.4.2 Samsung Galaxy devices started to refuse connecting to the network. They show "unsupported security type" (or similar). This does not happen with 1X, though (WPA-802.1X + FT-802.1X enabled).