Sorry for the delay in sending this weekly status report.
* Overview
Here is the progress of the project:
- Implement set affinity ioctl on BPF
Experimental code is implemented and works
- Implement affinity support on bpf_tap/bpf_mtap/bpf_mtap2
Experimental code is implemented and works
- Implement sample application
Quick hack for tcpdump/libpcap, works
- Implement multi-queue tap driver
Experimental code is implemented, not tested
- Implement interface to deliver queue information on network device driver
Partially implemented on igb(4), not tested
- Reduce lock granularity on bpf_tap/bpf_mtap/bpf_mtap2
Not yet
- Implement test case
Not yet
- Update man pages, write description of sample code
Not yet
* Detail
On an Ethernet card, bpf_mtap is called whenever a packet is received or
transmitted.
If the card supports multiqueue, every packet passing through bpf_mtap
belongs to either an RX queue or a TX queue.
To record this, I added new members to the mbuf pkthdr.
In the if_start function of igb(4), I added the following lines:
m->m_pkthdr.rxqid = (uint32_t)-1;
m->m_pkthdr.txqid = [tx queue id];
And in the receive function:
m->m_pkthdr.rxqid = [rx queue id];
m->m_pkthdr.txqid = (uint32_t)-1;
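The tagging convention above can be sketched in user space as follows. This is only a stand-alone model, not the actual igb(4) change: struct pkthdr_model, QID_NONE, tag_tx and tag_rx are hypothetical names introduced for illustration. Only the rxqid/txqid members and the (uint32_t)-1 sentinel come from the description above.

```c
#include <stdint.h>

/* Sentinel meaning "no queue of this kind", i.e. (uint32_t)-1. */
#define QID_NONE ((uint32_t)-1)

/* Hypothetical stand-in for the two new mbuf pkthdr members. */
struct pkthdr_model {
	uint32_t rxqid;
	uint32_t txqid;
};

/* What the transmit path would do before handing the packet to bpf. */
static void
tag_tx(struct pkthdr_model *ph, uint32_t txq)
{
	ph->rxqid = QID_NONE;
	ph->txqid = txq;
}

/* What the receive path would do before handing the packet to bpf. */
static void
tag_rx(struct pkthdr_model *ph, uint32_t rxq)
{
	ph->rxqid = rxq;
	ph->txqid = QID_NONE;
}
```

Exactly one of the two fields holds a real queue id at any time, so bpf can tell the direction and the queue from the header alone.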
Then I defined the following members in the bpf descriptor:
d->bd_qmask.qm_enabled
d->bd_qmask.qm_rxq_mask[]
d->bd_qmask.qm_txq_mask[]
Since the sizes of qm_rxq_mask[] and qm_txq_mask[] may differ from card
to card, we need to pass the number of queues from the driver to bpf and
allocate the arrays accordingly.
I added the queue counts to struct ifnet:
d->bd_bif->bif_ifp->if_rxq_num
d->bd_bif->bif_ifp->if_txq_num
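As an illustration of sizing the mask arrays from the per-interface queue counts, here is a user-space sketch. It assumes one boolean entry per queue. struct qmask_model, qmask_alloc and qmask_free are hypothetical names, and calloc(3) stands in for whatever kernel allocator the real code uses.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical model of bd_qmask plus the queue counts taken from ifnet. */
struct qmask_model {
	bool      qm_enabled;
	bool     *qm_rxq_mask;   /* one entry per RX queue */
	bool     *qm_txq_mask;   /* one entry per TX queue */
	uint32_t  rxq_num;       /* models if_rxq_num */
	uint32_t  txq_num;       /* models if_txq_num */
};

/* Allocate both masks; returns 0 on success, -1 on allocation failure. */
static int
qmask_alloc(struct qmask_model *qm, uint32_t rxq_num, uint32_t txq_num)
{
	qm->qm_enabled = false;
	/* calloc zero-fills, so every queue starts deselected (FALSE). */
	qm->qm_rxq_mask = calloc(rxq_num ? rxq_num : 1, sizeof(bool));
	qm->qm_txq_mask = calloc(txq_num ? txq_num : 1, sizeof(bool));
	if (qm->qm_rxq_mask == NULL || qm->qm_txq_mask == NULL) {
		free(qm->qm_rxq_mask);
		free(qm->qm_txq_mask);
		qm->qm_rxq_mask = qm->qm_txq_mask = NULL;
		return (-1);
	}
	qm->rxq_num = rxq_num;
	qm->txq_num = txq_num;
	return (0);
}

static void
qmask_free(struct qmask_model *qm)
{
	free(qm->qm_rxq_mask);
	free(qm->qm_txq_mask);
	qm->qm_rxq_mask = qm->qm_txq_mask = NULL;
	qm->rxq_num = qm->txq_num = 0;
}
```

Keeping the counts next to the arrays lets later code validate a queue id before using it as an index.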
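Now we can filter out unwanted packets in bpf_mtap like this:
LIST_FOREACH(d, &bp->bif_dlist, bd_next) {
    if (d->bd_qmask.qm_enabled) {
        /* Skip descriptors that did not select this RX queue. */
        if (m->m_pkthdr.rxqid != (uint32_t)-1 &&
            !d->bd_qmask.qm_rxq_mask[m->m_pkthdr.rxqid])
            continue;
        /* Skip descriptors that did not select this TX queue. */
        if (m->m_pkthdr.txqid != (uint32_t)-1 &&
            !d->bd_qmask.qm_txq_mask[m->m_pkthdr.txqid])
            continue;
    }
    /* ... existing filter and capture code continues here ... */
}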
d->bd_qmask.qm_enabled should be FALSE by default to keep compatibility
with existing applications.
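The per-descriptor decision can be isolated into a pure function, which makes the default-off behaviour easy to check. The sketch below is a user-space model with hypothetical names (struct qmask_model, qmask_accept, QID_NONE). The range check against the queue count is my own addition and is not part of the loop described in this report.

```c
#include <stdbool.h>
#include <stdint.h>

#define QID_NONE ((uint32_t)-1)

/* Hypothetical read-only view of a descriptor's queue mask. */
struct qmask_model {
	bool        qm_enabled;
	const bool *qm_rxq_mask;
	const bool *qm_txq_mask;
	uint32_t    rxq_num;
	uint32_t    txq_num;
};

/* Returns true if the descriptor should see a packet with these queue ids. */
static bool
qmask_accept(const struct qmask_model *qm, uint32_t rxqid, uint32_t txqid)
{
	if (!qm->qm_enabled)
		return (true);	/* default: behave like unmodified bpf */
	/* Assumption: ids outside the known range are rejected, not indexed. */
	if (rxqid != QID_NONE &&
	    (rxqid >= qm->rxq_num || !qm->qm_rxq_mask[rxqid]))
		return (false);
	if (txqid != QID_NONE &&
	    (txqid >= qm->txq_num || !qm->qm_txq_mask[txqid]))
		return (false);
	return (true);
}
```

With the mask disabled every packet is accepted, so an application that never issues the new ioctls sees no change.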
And here are the ioctls to set/get the queue mask:
#define BIOCENAQMASK _IO('B', 137)
This does d->bd_qmask.qm_enabled = TRUE
#define BIOCDISQMASK _IO('B', 138)
This does d->bd_qmask.qm_enabled = FALSE
#define BIOCRXQLEN _IOR('B', 133, int)
Returns ifp->if_rxq_num
#define BIOCTXQLEN _IOR('B', 134, int)
Returns ifp->if_txq_num
#define BIOCSTRXQMASK _IOWR('B', 139, uint32_t)
This does d->bd_qmask.qm_rxq_mask[*addr] = TRUE
#define BIOCGTRXQMASK _IOR('B', 140, uint32_t)
Returns d->bd_qmask.qm_rxq_mask[*addr]
/* XXX: We should have rxq_mask[*addr] = FALSE ioctl too */
#define BIOCSTTXQMASK _IOWR('B', 141, uint32_t)
This does d->bd_qmask.qm_txq_mask[*addr] = TRUE
/* XXX: We should have txq_mask[*addr] = FALSE ioctl too */
#define BIOCGTTXQMASK _IOR('B', 142, uint32_t)
Returns d->bd_qmask.qm_txq_mask[*addr]
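To make the semantics of the RX-side commands concrete, here is a user-space model of a handler. It does not use the real ioctl numbers: enum qcmd, struct bpf_d_model, MODEL_RXQ_NUM and qmask_ioctl are hypothetical. Returning EINVAL for an out-of-range queue index is an assumption on my part, not something the report specifies.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical command codes mirroring the RX-side ioctls listed above. */
enum qcmd {
	QCMD_ENABLE,	/* models BIOCENAQMASK */
	QCMD_DISABLE,	/* models BIOCDISQMASK */
	QCMD_RXQLEN,	/* models BIOCRXQLEN */
	QCMD_SET_RXQ,	/* models BIOCSTRXQMASK */
	QCMD_GET_RXQ	/* models BIOCGTRXQMASK */
};

#define MODEL_RXQ_NUM 4	/* fixed queue count, for the model only */

struct bpf_d_model {
	bool qm_enabled;
	bool qm_rxq_mask[MODEL_RXQ_NUM];
};

/* Returns 0 on success or an errno value, as a kernel handler would. */
static int
qmask_ioctl(struct bpf_d_model *d, enum qcmd cmd, uint32_t *addr)
{
	uint32_t idx;

	switch (cmd) {
	case QCMD_ENABLE:
		d->qm_enabled = true;
		return (0);
	case QCMD_DISABLE:
		d->qm_enabled = false;
		return (0);
	case QCMD_RXQLEN:
		*addr = MODEL_RXQ_NUM;
		return (0);
	case QCMD_SET_RXQ:
		idx = *addr;
		if (idx >= MODEL_RXQ_NUM)
			return (EINVAL);	/* assumption: reject bad index */
		d->qm_rxq_mask[idx] = true;
		return (0);
	case QCMD_GET_RXQ:
		idx = *addr;
		if (idx >= MODEL_RXQ_NUM)
			return (EINVAL);
		*addr = d->qm_rxq_mask[idx];	/* index in, mask value out */
		return (0);
	}
	return (ENOTTY);
}
```

Note that the get command reads the index from *addr and writes the result back to the same location, so the argument is used in both directions.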
However, a packet that comes through bpf_tap has no mbuf, so we are not
able to determine a queue id for it.
So I added d->bd_qmask.qm_other_mask and BIOSTOTHERMASK/BIOGTOTHERMASK for
such packets.
If d->bd_qmask.qm_enabled && !d->bd_qmask.qm_other_mask, all packets
through bpf_tap will be ignored.
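The rule for packets without an mbuf reduces to a two-flag check. A minimal sketch follows, with the hypothetical names struct tap_mask_model and tap_accept. It only models the condition stated above.

```c
#include <stdbool.h>

/* Hypothetical model of the two flags consulted on the bpf_tap path. */
struct tap_mask_model {
	bool qm_enabled;
	bool qm_other_mask;
};

/*
 * A packet with no queue id is dropped only when masking is enabled
 * and the descriptor did not ask for "other" packets.
 */
static bool
tap_accept(const struct tap_mask_model *qm)
{
	return (!qm->qm_enabled || qm->qm_other_mask);
}
```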
If we only care about the CPU affinity of a packet and a thread (= bpf
descriptor), checking PCPU_GET(cpuid) is enough.
But if we want to handle queue affinity, we probably need
the structures described above.
* Discussion
I discussed this project with some Japanese BSD hackers. They
questioned this plan and raised two points:
- Isn't it possible to filter by queue id in the BPF filter language by extending it?
- Do we really need to expose queue information and threads to user
applications?
Probably most BPF applications need to merge the per-thread packet
streams in the end.
For example, sniffer apps such as tcpdump and wireshark need to print the
packet dump on a screen, and before printing they need to
merge the per-queue packet streams into one stream.
If so, isn't it better to merge the streams in the kernel, not in userland?
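To show what such a merge involves wherever it ends up living, here is a small sketch. It assumes each packet carries a capture timestamp and that each per-queue stream is already ordered by it. struct pkt_model, struct stream_model and merge_next are hypothetical names, and nothing here reflects how tcpdump or libpcap actually work.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical captured-packet record: timestamp plus originating queue. */
struct pkt_model {
	uint64_t ts_usec;
	uint32_t qid;
};

/* One per-queue stream, assumed sorted by ts_usec. */
struct stream_model {
	const struct pkt_model *pkts;
	size_t len;
	size_t pos;	/* next unread packet */
};

/*
 * Pop the earliest pending packet across all streams into *out.
 * Returns 1 if a packet was produced, 0 when every stream is exhausted.
 */
static int
merge_next(struct stream_model *s, size_t nstreams, struct pkt_model *out)
{
	size_t best = nstreams;	/* nstreams means "none found yet" */

	for (size_t i = 0; i < nstreams; i++) {
		if (s[i].pos >= s[i].len)
			continue;
		if (best == nstreams ||
		    s[i].pkts[s[i].pos].ts_usec <
		    s[best].pkts[s[best].pos].ts_usec)
			best = i;
	}
	if (best == nstreams)
		return (0);
	*out = s[best].pkts[s[best].pos++];
	return (1);
}
```

The linear scan is fine for a handful of queues. Either way, a single consumer has to serialize the output, which is the cost the question above is pointing at.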
I'm not really sure about the use cases of BPF. Maybe there is a use case
that can benefit from multithreaded BPF?
syuu