Which means: look for packets that are VLAN tagged, where the VLAN ID is 5 and the transport protocol port number is 25, or the VLAN ID is 6 and the port is 25. This is necessary because of the way BPF works: it shifts the byte match by 4 bytes after each occurrence of "vlan" in the expression. Hence you cannot construct an expression like (vlan 5 and port 25 || vlan 6 and port 25), because the second half of the expression would only match if traffic for VLAN 6 was encapsulated inside VLAN 5. Statically using ether[a:b] will always look a bytes deep into the packet (counting from 0) and read b bytes to match. This is supposed to work regardless of whether "vlan" has occurred in the expression, which makes it a reliable way to find the outer VLAN ID.
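For anyone following along, here is a minimal Python sketch of what ether[a:b] is supposed to read on a tagged frame. It assumes the standard 802.1Q layout (TPID 0x8100 at bytes 12-13, TCI at bytes 14-15, VLAN ID in the low 12 bits of the TCI). The helper names ether_read and outer_vlan_id and the sample frame are made up for illustration; they are not part of libpcap or tcpdump.

```python
def ether_read(frame: bytes, offset: int, length: int = 1) -> int:
    """Emulate ether[offset:length]: a big-endian read counted from byte 0 of the frame."""
    if length not in (1, 2, 4):
        raise ValueError("BPF loads are 1, 2, or 4 bytes wide")
    if offset < 0 or offset + length > len(frame):
        raise IndexError("read past end of frame")
    return int.from_bytes(frame[offset:offset + length], "big")


def outer_vlan_id(frame: bytes):
    """Return the outer 802.1Q VLAN ID, or None if the frame is untagged."""
    if ether_read(frame, 12, 2) != 0x8100:  # TPID for 802.1Q
        return None
    return ether_read(frame, 14, 2) & 0x0FFF  # low 12 bits of the TCI


# Hypothetical frame: zeroed MAC addresses, 802.1Q tag for VLAN 5, IPv4 ethertype.
tagged = bytes(12) + bytes.fromhex("8100") + bytes.fromhex("0005") + bytes.fromhex("0800")
# Hypothetical untagged frame: zeroed MAC addresses, IPv4 ethertype, padding.
untagged = bytes(12) + bytes.fromhex("0800") + bytes(4)

print(outer_vlan_id(tagged), ether_read(tagged, 15), outer_vlan_id(untagged))
```

With this layout, ether[15] is the low-order byte of the TCI, which is why it is a convenient single-byte test for small VLAN IDs.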

So here's the problem: it doesn't work on ixgbe (at least not in promiscuous mode). When I try to match on ether[15] (the low-order byte of the VLAN ID) it actually matches on byte 19 (the 20th byte), which means it's shifted 4 bytes over. If I try to match on ether[11], the expression returns true when *both* byte 11 (the 12th byte) AND byte 15 (the 16th byte) equal the value, which is totally bizarre. I cannot seem to make a 2-byte pattern match at all (but maybe I just didn't run tcpdump for long enough for that amazing coincidence).

By the way, I can match any other bytes normally with ether[a:b] expressions; it's only bytes 12-15 (VLAN ethertype and VLAN ID) that behave bizarrely.
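To make the symptom concrete, here is a small Python sketch that takes the raw bytes of a captured frame and reports whether a value sits at the offset written in the filter, 4 bytes later, or both. The helper name classify_match and the sample frame are hypothetical. It only describes where the bytes are in the frame; it says nothing about why the filter matched.

```python
def classify_match(frame: bytes, offset: int, value: int, shift: int = 4) -> str:
    """Say where `value` appears relative to the offset used in the filter."""
    at_written = offset < len(frame) and frame[offset] == value
    at_shifted = offset + shift < len(frame) and frame[offset + shift] == value
    if at_written and at_shifted:
        return "both"
    if at_written:
        return "as-written"
    if at_shifted:
        return "shifted"
    return "neither"


# Hypothetical frame: VLAN 6 in the tag (byte 15 = 0x06), but byte 19 = 0x05.
frame = (
    bytes(12)
    + bytes.fromhex("8100")  # bytes 12-13: 802.1Q TPID
    + bytes.fromhex("0006")  # bytes 14-15: TCI, VLAN ID 6
    + bytes.fromhex("0800")  # bytes 16-17: inner ethertype
    + bytes.fromhex("4505")  # bytes 18-19: arbitrary payload bytes
)

print(classify_match(frame, 15, 5))  # value 5 is only found 4 bytes past offset 15
print(classify_match(frame, 15, 6))  # value 6 is found at offset 15 itself
```

Running something like this over frames that a filter such as ether[15] = 5 let through would show "shifted" for the behavior I am describing, and "as-written" for what I expected.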

I strongly suspect this is due to rx_vlan_offload being enabled, but when I try to disable it with ethtool I get:

$ sudo ethtool -K eth1 rx-vlan-offload off
ethtool: bad command line argument(s)
For more information run ethtool -h

Edit: I found that 'ethtool -K <dev> rxvlan off' is the correct command, but disabling that didn't change the behavior.

This happens with both ixgbe driver version 4.2.1-k (shipped with the CentOS kernel package) and version 4.4.6 (built from source).

I found a reference to an extremely similar sounding bug here https://sourceforge.net/p/e1000/bugs/375/, but that seems to be a much earlier version of the driver. This one appears to check for RHEL_RELEASE_VERSION > 6.1 and enable 802.1P support accordingly:

We have a flat set of VLANs that are passing through passive network taps. We're duplicating the aggregated tap output (without any encapsulation or rewriting) to servers with 82599 cards in order to perform traffic inspection. Since this is a whole lot of traffic, we're filtering out a bunch of things that we know we don't have to look at, using BPF expressions (which is the most efficient way for our IDS technology to do it). BPF, as you may be aware, is the same thing tcpdump uses to capture/exclude specific traffic.

Due to a quirk of BPF, you cannot have multiple conditions based on the 'vlan' keyword, because invoking it causes the byte pointer to be shifted over by 4 bytes. At that point, if you invoke 'vlan' again, it will only match VLAN-in-VLAN traffic (encapsulated VLANs). The way to get around this in BPF and construct complex expressions (such as, in plain language, "match VLAN 5 port 80, OR match VLAN 15 port 8080") is to explicitly read at an offset in the Ethernet frame with 'ether[<start byte>:<length in bytes>]'. This technique is what is not working on the 82599. It works great on the I350, and on every other card I've ever used. We literally copied the same filter rules we're using on I350s and they don't work on 82599s. We get the same behavior with both tcpdump and the IDS software, so it's not a bug in either of those. It seemed possible that it was a bug in libpcap, but I seem to have ruled that out: both downgrading and upgrading libpcap on the box with the 82599s made no difference.
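As a sketch of that technique, the following Python helper builds one filter string from a list of (VLAN ID, port) pairs. The name vlan_port_filter is made up, and the exact expression form is an assumption for illustration; it is not necessarily the filter we run in production. It uses a single leading 'vlan' so that 'port' is decoded past the tag, and masks ether[14:2] with 0x0fff so only the 12-bit VLAN ID is compared.

```python
def vlan_port_filter(pairs):
    """Build a pcap filter string matching any of the given (vlan_id, port) pairs."""
    if not pairs:
        raise ValueError("need at least one (vlan_id, port) pair")
    clauses = []
    for vlan_id, port in pairs:
        if not 0 <= vlan_id <= 0x0FFF:
            raise ValueError(f"VLAN ID out of range: {vlan_id}")
        if not 0 <= port <= 0xFFFF:
            raise ValueError(f"port out of range: {port}")
        # Static read of the outer TCI; the mask drops the priority and DEI bits.
        clauses.append(f"(ether[14:2] & 0x0fff = {vlan_id} and port {port})")
    # 'vlan' appears exactly once, so the offsets are shifted only once.
    return "vlan and (" + " or ".join(clauses) + ")"


expr = vlan_port_filter([(5, 80), (15, 8080)])
print(expr)
```

The resulting string can be passed to tcpdump as the filter expression (quoted on the command line), which is how I would compare behavior between the I350 and the 82599 with an identical filter.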

It's not a mistake with the BPF. The exact same BPF is working perfectly on I350 cards on the same networks. I stated that several times in this thread already. I have done literally hours of troubleshooting on this and documented nearly all of it in this thread. It's frankly insulting to tell me it's a BPF error given the information I've already provided.

This appears to me to be a bug with either the driver, the firmware, or the chip.