Packet retransmission is a fundamental TCP feature
that ensures reliable data transfer between two end nodes.
Interestingly, when it comes to cellular data accounting, TCP
retransmission creates an important policy issue. Cellular ISPs
might argue that all retransmitted IP packets should be billed
since they consume the resources of their infrastructure. On the
other hand, service subscribers might want to pay only for
application data, excluding the bytes spent on retransmission.
Regardless of the policy, however, we find
that TCP retransmission can be easily abused to manipulate the
current practice of cellular traffic accounting.

The fundamental challenge lies in detecting malicious TCP
retransmission in the middleboxes (e.g., accounting
systems). Middleboxes cannot reliably tell whether a retransmitted
packet is innocent (sent out of necessity) or malicious (sent
even though there is no sign of packet loss) because they cannot
precisely infer the TCP states of the end nodes.

Abacus is a high-speed cellular data accounting system that runs
on commodity PCs to monitor and account for packets flowing
through the cellular network. It uses Deep Packet Inspection
(DPI) on all packets to accurately distinguish between legitimate
and malicious TCP retransmission packets and to prevent
"free-riding" attacks.

Abacus runs DPI on the retransmission packets to detect tunneling
attacks. Abacus extends Monbot, a highly scalable flow monitoring
system on commodity hardware, and drastically reduces the flow
buffer requirement by probabilistically verifying the payload of
retransmission packets.
Abacus has two modes: deterministic and probabilistic.

Figure 1. Deterministic DPI accounting process

Figure 2. Probabilistic DPI accounting process

Deterministic DPI (d-DPI)
Buffer the original payload and conduct a byte-by-byte comparison
with the retransmitted payloads.

Buffer management: Let S be the largest sequence number Abacus
has seen from one end. We estimate the maximum send window size
as the receive window size, W, advertised by the receiver.
Normally, Abacus buffers:

Any sequence numbers ≥ (S - W)

However, W can change on every ACK, and packets can be delivered
out of order in practice. So, we buffer instead:

Any sequence numbers ≥ (S - 2 x W)

As S advances, we slide the flow buffer window so that it keeps
covering this range.
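
The buffering rule above can be sketched as follows. This is a
minimal illustration, not the Abacus implementation: the class
name FlowBuffer, the per-byte dictionary, and the method observe
are hypothetical, and the sketch ignores 32-bit sequence number
wraparound and assumes the caller supplies the latest advertised
window W.

```python
class FlowBuffer:
    """Hypothetical d-DPI buffer for one direction of one flow."""

    def __init__(self):
        self.data = {}   # sequence number -> original byte value
        self.S = None    # largest sequence number seen so far

    def observe(self, seq, payload, W):
        """Buffer new bytes, compare retransmitted ones; False on mismatch."""
        ok = True
        for i, byte in enumerate(payload):
            # setdefault stores the byte if unseen, else returns the original
            original = self.data.setdefault(seq + i, byte)
            if original != byte:
                ok = False           # retransmission differs from the original
        if payload:
            last = seq + len(payload) - 1
            self.S = last if self.S is None else max(self.S, last)
        if self.S is not None:
            low = self.S - 2 * W     # keep sequence numbers >= S - 2 x W
            for s in [s for s in self.data if s < low]:
                del self.data[s]
        return ok
```

A per-byte dictionary keeps the sketch short; a real system would
more likely use a contiguous ring buffer per flow.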

Probabilistic DPI (p-DPI)
Sample the original payload (n bytes chosen at random from every
1024 bytes of flow data) and check whether retransmitted payloads
carry identical values at the sampled positions.

Figure 3. Flow table format (e.g., n = 5)

Buffer management: We allocate a flow table for each flow
direction that consists of a set of sample entries. Each sample
entry has a 4-byte base sequence number (bsn) and n bytes of
sampled data. Each sampled byte in the entry is randomly chosen
from the sequence number space of [bsn, bsn + 1023].
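
To make the entry layout concrete, the sketch below packs one
sample entry as a 4-byte bsn followed by the n sampled bytes. The
function names and the big-endian byte order are assumptions made
for illustration; the text does not specify the in-memory
encoding.

```python
import struct

BLOCK = 1024  # bytes of flow data covered by one sample entry


def pack_entry(bsn, sampled):
    """Serialize one entry: 4-byte bsn (assumed big-endian) + n sampled bytes."""
    return struct.pack("!I", bsn & 0xFFFFFFFF) + bytes(sampled)


def unpack_entry(raw):
    """Split a serialized entry back into (bsn, sampled bytes)."""
    (bsn,) = struct.unpack("!I", raw[:4])
    return bsn, raw[4:]


def entry_size(n):
    """Bytes stored per 1024 bytes of flow data."""
    return 4 + n


entry = pack_entry(4096, b"abcde")          # n = 5, as in Figure 3
overhead = entry_size(5) / BLOCK            # fraction of the flow data kept
```

With n = 5, each entry takes 9 bytes per 1024 bytes of flow data,
which is under 1% of what buffering the full payload would need.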

Sampling random bytes: The locations are determined by
running a hash function with
(per-flow secret key, bsn for the entry) as input. The per-flow
secret key is generated as HMAC(secret key, nonce) at connection
setup time, where the nonce is an 8-byte random number generated
for each flow and the secret key is the system-wide key known
only to Abacus. Any hash function is fine as long as its output
size is (10*n) bits or larger, since each of the n locations
needs 10 bits to address one byte within a 1024-byte range.
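
The derivation of sampling locations can be sketched as below.
The choice of HMAC-SHA256 for both steps is an assumption (the
text only requires a hash with at least 10*n output bits), and
the function names per_flow_key, sample_offsets, take_samples,
and matches are hypothetical. The sketch does not deduplicate
offsets, so two of the n locations may coincide.

```python
import hashlib
import hmac
import os

BLOCK = 1024   # bytes of flow data covered by one sample entry
BITS = 10      # bits needed to address one byte in a block (2**10 == 1024)


def per_flow_key(system_key: bytes, nonce: bytes) -> bytes:
    """Derive the per-flow secret key as an HMAC keyed by the system-wide key."""
    return hmac.new(system_key, nonce, hashlib.sha256).digest()


def sample_offsets(flow_key: bytes, bsn: int, n: int) -> list:
    """Map (per-flow key, bsn) to n offsets in [0, 1023]."""
    digest = hmac.new(flow_key, bsn.to_bytes(4, "big"), hashlib.sha256).digest()
    if len(digest) * 8 < BITS * n:
        raise ValueError("hash output too short for n samples")
    value = int.from_bytes(digest, "big")
    # Consume 10 bits of the hash output per sampled location.
    return [(value >> (BITS * i)) & (BLOCK - 1) for i in range(n)]


def take_samples(flow_key, bsn, block, n):
    """Record the n sampled bytes of an original 1024-byte block."""
    return bytes(block[o] for o in sample_offsets(flow_key, bsn, n))


def matches(flow_key, bsn, samples, seq, payload):
    """Check the sampled positions that a retransmitted payload covers."""
    offsets = sample_offsets(flow_key, bsn, len(samples))
    for off, expected in zip(offsets, samples):
        i = bsn + off - seq          # index of the sampled byte in this payload
        if 0 <= i < len(payload) and payload[i] != expected:
            return False
    return True


key = per_flow_key(os.urandom(32), os.urandom(8))   # 8-byte nonce per flow
original = os.urandom(BLOCK)
samples = take_samples(key, 0, original, 5)
same = matches(key, 0, samples, 0, original)
```

Because the locations depend on a key known only to the
monitor, a sender cannot tell which bytes of a retransmitted
payload will be checked.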

d-DPI: d-DPI works well up to 160K concurrent flows but
it starts to drop packets at 320K flows. The memory usage grows
linearly with the number of flows, reaching 25.9 GB at 160K flows
and 53.6 GB at 320K flows. The CPU usage of d-DPI stays at
500% to 600% (where 100% = 1 CPU core fully utilized) most of
the time, but it grows to 876% at 320K flows, presumably due to
the bottleneck in the memory bandwidth.

p-DPI: p-DPI does not drop any packets even with 320K
concurrent flows. The memory usage is 202.7 MB at 160K flows
and 391.0 MB at 320K flows. The CPU usage of p-DPI stays under
100% even at 320K flows. This indicates that Abacus in p-DPI
mode can monitor a large number of flows that saturate
10 Gbps even on a low-powered desktop machine.
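
As a rough reading of the numbers above, the per-flow memory
footprint can be worked out directly. The calculation assumes
that "K" means 1,000 and that GB and MB are decimal units;
neither is stated in the text, so the results are approximate.

```python
FLOWS = {"160K": 160_000, "320K": 320_000}          # assumed K = 1,000
D_DPI_BYTES = {"160K": 25.9e9, "320K": 53.6e9}      # assumed decimal GB
P_DPI_BYTES = {"160K": 202.7e6, "320K": 391.0e6}    # assumed decimal MB


def per_flow_bytes(total_bytes, flows):
    """Average memory per concurrent flow."""
    return total_bytes / flows


for label, flows in FLOWS.items():
    d = per_flow_bytes(D_DPI_BYTES[label], flows)
    p = per_flow_bytes(P_DPI_BYTES[label], flows)
    print(f"{label}: d-DPI {d / 1e3:.1f} KB/flow, "
          f"p-DPI {p / 1e3:.2f} KB/flow, ratio {d / p:.0f}x")
```

Under these assumptions d-DPI needs on the order of 160 KB per
flow, while p-DPI needs a little over 1 KB per flow.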