On Tuesday 2009-03-24 22:18, David Miller wrote:
>> >Arches without efficient unaligned access can still perform a loop
>> >assuming 16bit alignment in ifname_compare()
>>
>> Allow me some skepticism, but the code looks pretty much like a
>> standard memcmp.
>
>memcmp() can't make any assumptions about alignment.
>Whereas we _know_ this thing is exactly 16-bit aligned.
>
>All of the optimized memcmp() implementations look for
>32-bit alignment and punt to byte at a time comparison
>loops if things are not aligned enough.

Yes, I seem to remember glibc doing something like

	if ((addr & 0x03) != 0) {
		// process single bytes (increment addr as you go)
		// until addr & 0x03 == 0.
	}

	/* optimized loop here. also increases addr */

	if ((addr & 0x03) != 0)
		// still bytes left after loop - process on a per-byte basis

Is the cost of testing for non-4-divisibility expensive enough
to warrant not using memcmp?
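
For reference, the head/word-loop/tail pattern recalled above can be written out as a small C sketch. This is not glibc's actual code; the name sketch_memcmp is hypothetical, and the 32-bit loads go through memcpy() so the example stays within standard C. It shows the extra alignment tests a generic routine has to perform before it can reach the word loop.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical illustration only; not taken from glibc. */
int sketch_memcmp(const void *pa, const void *pb, size_t n)
{
    const unsigned char *a = pa;
    const unsigned char *b = pb;

    /* The word loop is only usable when both pointers have the same
     * offset within a 32-bit word. */
    if ((((uintptr_t)a ^ (uintptr_t)b) & 0x03) == 0) {
        /* Head: single bytes until a (and therefore b) is 4-byte aligned. */
        while (n > 0 && ((uintptr_t)a & 0x03) != 0) {
            if (*a != *b)
                return *a < *b ? -1 : 1;
            a++; b++; n--;
        }
        /* Middle: skip over 32-bit words that compare equal. */
        while (n >= 4) {
            uint32_t wa, wb;
            memcpy(&wa, a, sizeof(wa));
            memcpy(&wb, b, sizeof(wb));
            if (wa != wb)
                break; /* the byte loop below locates the differing byte */
            a += 4; b += 4; n -= 4;
        }
    }

    /* Tail, or the whole buffer when the pointers are not mutually aligned. */
    while (n > 0) {
        if (*a != *b)
            return *a < *b ? -1 : 1;
        a++; b++; n--;
    }
    return 0;
}
```

With buffers that are only known to be 16-bit aligned, the first test can fail and the whole comparison falls through to the byte loop.
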

Irrespective of all that, I think the interface comparison
code should be agglomerated in a function/header so that it is
shared across iptables, ip6tables, ebtables, arptables, etc.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

From: Jan Engelhardt <jengelh@medozas.de>
Date: Tue, 24 Mar 2009 22:23:23 +0100 (CET)
> Is the cost of testing for non-4-divisibility expensive enough
> to warrant not using memcmp?

I think so. This is the fast path of these things.

> Irrespective of all that, I think the interface comparison
> code should be agglomerated in a function/header so that it is
> shared across iptables, ip6tables, ebtables, arptables, etc.

Agreed.

Jan Engelhardt wrote:
> On Tuesday 2009-03-24 22:18, David Miller wrote:
>>>> Arches without efficient unaligned access can still perform a loop
>>>> assuming 16bit alignment in ifname_compare()
>>> Allow me some skepticism, but the code looks pretty much like a
>>> standard memcmp.
>> memcmp() can't make any assumptions about alignment.
>> Whereas we _know_ this thing is exactly 16-bit aligned.
>>
>> All of the optimized memcmp() implementations look for
>> 32-bit alignment and punt to byte at a time comparison
>> loops if things are not aligned enough.
>
> Yes, I seem to remember glibc doing something like
>
> 	if ((addr & 0x03) != 0) {
> 		// process single bytes (increment addr as you go)
> 		// until addr & 0x03 == 0.
> 	}
>
> 	/* optimized loop here. also increases addr */
>
> 	if ((addr & 0x03) != 0)
> 		// still bytes left after loop - process on a per-byte basis
>
> Is the cost of testing for non-4-divisibility expensive enough
> to warrant not using memcmp?
>
> Irrespective of all that, I think the interface comparison
> code should be agglomerated in a function/header so that it is
> shared across iptables, ip6tables, ebtables, arptables, etc.

memcmp() is fine, but how does it solve the masking problem we have?

Also, in the case of arp_tables, _a is long-word aligned, while _b and _mask are not.
memcmp() is slower in this case (and does not handle the mask).

If you look at the various ifname_compare() functions, we have two different implementations.

So yes, a factorization is possible for the three files ip_tables.c, ip6_tables.c and xt_physdev.c.
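
To make the masking point concrete, here is a rough user-space sketch of a masked comparison done 16 bits at a time. It is not the kernel's ifname_compare(); the name ifname_compare_masked is hypothetical, IFNAMSIZ is defined locally as 16 (the Linux value), and the loads use memcpy() to stay within standard C, where kernel code would more likely dereference suitably aligned pointers directly.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IFNAMSIZ 16 /* interface name buffer size on Linux */

/* Hypothetical helper: returns 0 when a and b agree on every bit that is
 * set in mask, and a non-zero value otherwise. */
unsigned int ifname_compare_masked(const char *a, const char *b,
                                   const unsigned char *mask)
{
    unsigned int diff = 0;
    size_t i;

    for (i = 0; i < IFNAMSIZ; i += 2) {
        uint16_t wa, wb, wm;

        /* Two-byte loads; with 16-bit aligned buffers these can be
         * plain halfword accesses. */
        memcpy(&wa, a + i, sizeof(wa));
        memcpy(&wb, b + i, sizeof(wb));
        memcpy(&wm, mask + i, sizeof(wm));

        /* XOR exposes differing bits, the mask keeps only those we care about. */
        diff |= (uint16_t)((wa ^ wb) & wm);
    }
    return diff;
}
```

A plain memcmp(a, b, IFNAMSIZ) has no way to express the "& mask" step, which is the part this loop adds.
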

On Tuesday 2009-03-24 22:39, Eric Dumazet wrote:
>>> memcmp() can't make any assumptions about alignment.
>>> Whereas we _know_ this thing is exactly 16-bit aligned.
>>>
>>> All of the optimized memcmp() implementations look for
>>> 32-bit alignment and punt to byte at a time comparison
>>> loops if things are not aligned enough.
>>
>> Yes, I seem to remember glibc doing something like
>> [...]
>> Is the cost of testing for non-4-divisibility expensive enough
>> to warrant not using memcmp?
>>
>> Irrespective of all that, I think the interface comparison
>> code should be agglomerated in a function/header so that it is
>> shared across iptables, ip6tables, ebtables, arptables, etc.
>
>memcmp() is fine, but how does it solve the masking problem we have?

You are right; we would have to calculate the prefix length of the
mask first. (And I think we can assume that there will not be any 1s
in the mask after the first 0.) It would consume CPU indeed.
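
A minimal sketch of that prefix-length approach, for illustration only. It assumes a byte-granular mask made of 0xff bytes followed by 0x00 bytes (the "no 1s after the first 0" assumption above, taken at byte level); the names mask_prefix_len and ifname_match_prefix are hypothetical, and IFNAMSIZ is defined locally as 16.

```c
#include <stddef.h>
#include <string.h>

#define IFNAMSIZ 16 /* interface name buffer size on Linux */

/* Hypothetical helper: number of leading 0xff bytes in the mask.
 * This scan is the extra per-comparison cost mentioned above. */
size_t mask_prefix_len(const unsigned char *mask)
{
    size_t n = 0;

    while (n < IFNAMSIZ && mask[n] == 0xff)
        n++;
    return n;
}

/* Hypothetical helper: returns 1 when a and b agree on the masked prefix. */
int ifname_match_prefix(const char *a, const char *b,
                        const unsigned char *mask)
{
    return memcmp(a, b, mask_prefix_len(mask)) == 0;
}
```

The scan could be avoided by storing the prefix length next to the mask when the rule is set up, but that would change the rule layout and is outside what is proposed here.
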

>Also in the case of arp_tables, _a is long word aligned, while _b and _mask
>are not.
>
>If you look at the various ifname_compare() functions, we have two different implementations.
>
>So yes, a factorization is possible for the three files ip_tables.c, ip6_tables.c and
> xt_physdev.c.

Very well.

David Miller <davem@davemloft.net> writes:
>> memcmp() can't make any assumptions about alignment.
Newer gcc often[1] knows how to generate specialized aligned memcmp()
based on the type alignment.
However you need to make sure the input types are right.
So in theory some casts might be enough, given a new enough
compiler.
[1] not always unfortunately, sometimes it loses this information.
-Andi
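
One way to "make sure the input types are right" is to give the buffers a type whose alignment is part of its definition. The sketch below is an assumption-laden illustration, not code from any of the files discussed: union ifname_buf and ifname_equal are hypothetical names, and IFNAMSIZ is defined locally as 16. Whether a given compiler actually turns the memcmp() call into aligned word compares depends on its version and flags, so the generated assembly would need to be checked.

```c
#include <stdint.h>
#include <string.h>

#define IFNAMSIZ 16 /* interface name buffer size on Linux */

/* Hypothetical type: the uint32_t member gives the whole union the
 * alignment of uint32_t, which the compiler can see at every call site. */
union ifname_buf {
    char name[IFNAMSIZ];
    uint32_t words[IFNAMSIZ / 4];
};

/* Hypothetical helper: fixed-size comparison through typed pointers, so
 * both the length and the alignment are known at compile time. */
int ifname_equal(const union ifname_buf *a, const union ifname_buf *b)
{
    return memcmp(a, b, sizeof(*a)) == 0;
}
```

Passing plain char pointers instead would discard the alignment information again, which matches the footnote above about the compiler sometimes losing it.
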