Hi Marek,
On Fri, 31 Aug 2012 01:29:05 +0200, Marek Vasut <marex@denx.de> wrote:
> Dear Lucas Stach,> > > usb_hub_descriptor has to be packed as it's used for> > communication with the device. Member wHubCharacteristics> > violates the natural alignment rules.> > > > Use explicit unaligned access functions for this member.> > Fixes ARMv7 traping while using USB.> > Shouldn't a properly configured compiler prevent such behavior?
This has been discussed before. As a summary:
1. A properly configured compiler can pad a structure so that the
fields always start at an aligned address (assuming the structure base
address is itself aligned). But that alters the structure, and here
there must be no alterations to the structure.
2. A properly (and differently) configured compiler can automatically
generate native unaligned accesses to unaligned fields. This is
acceptable on armv6+ architectures, has a performance penalty on earlier
architectures, and does not necessarily work depending on the hardware
configuration.
3. A properly (and differently yet) configured compiler can
automatically generate non-native unaligned accesses to unaligned
fields. This is acceptable on all architectures, has a performance
penalty on pre-armV6 architectures for all misaligned accesses, whether
voluntary or not.
The conclusion of the discussion is as follows -- or to be more exact,
following this discussion, this is my stance as the U-Boot custodian:
i) All the code intended to run 'close to' U-Boot (i.e., U-Boot code
itself and application code) is controlled enough that we should be
able to know and limit which code requires misaligned access (such as
here for this USB structure field);
ii) On some ARM architectures, and possibly some non-ARM architectures
as well, native misaligned access incur a performance hit, and may
even simply be impossible or forbidden by a hardware or system design
decision.
iii) Thus, U-Boot should follow a strict policy of using native aligned
accesses only, possibly enforcing misaligned native access prevention in
hardware, and of explicitly emulating misaligned accesses when they
cannot be avoided.
Hope this clarifies.
Amicalement,

Dear Albert ARIBAUD,
> Hi Marek,> > On Fri, 31 Aug 2012 01:29:05 +0200, Marek Vasut <marex@denx.de> wrote:> > Dear Lucas Stach,> > > > > usb_hub_descriptor has to be packed as it's used for> > > communication with the device. Member wHubCharacteristics> > > violates the natural alignment rules.> > > > > > Use explicit unaligned access functions for this member.> > > Fixes ARMv7 traping while using USB.> > > > Shouldn't a properly configured compiler prevent such behavior?> > This has been discussed before. As a summary:
Argh, I must have missed this.
> 1. A properly configured compiler can pad a structure so that the> fields always start at an aligned address (assuming the structure base> address is itself aligned). But that alters the structure, and here> there must be no alterations to the structure.
There can be ... the compiler can even reorder the structures (that happened in
Linux kernel, as I'm on a train now with shitty internet connection, I can't
find it ... search LWN for that please).
> 2. A properly (and differently) configured compiler can automatically> generate native unaligned accesses to unaligned fields. This is> acceptable on armv6+ architectures, has a performance penalty on earlier> architectures, and does not necessarily work depending on the hardware> configuration.
Certainly, agreed.
> 3. A properly (and differently yet) configured compiler can> automatically generate non-native unaligned accesses to unaligned> fields. This is acceptable on all architectures, has a performance> penalty on pre-armV6 architectures for all misaligned accesses, whether> voluntary or not.
Correct, I'd vote for this solution -- let the compiler handle such unaligned
cases.
> The conclusion of the discussion is as follows -- or to be more exact,> following this discussion, this is my stance as the U-Boot custodian:> > i) All the code intended to run 'close to' U-Boot (i.e., U-Boot code> itself and application code) is controlled enough that we should be> able to know and limit which code requires misaligned access (such as> here for this USB structure field);> > ii) On some ARM architectures, and possibly some non-ARM architectures> as well, native misaligned access incur a performance hit, and may> even simply be impossible or forbidden by a hardware or system design> decision.> > iii) Thus, U-Boot should follow a strict policy of using native aligned> accesses only, possibly enforcing misaligned native access prevention in> hardware, and of explicitly emulating misaligned accesses when they> cannot be avoided.
I do agree with i) and ii), but why not just let compiler handle the unaligned
access for us? The compiler can optimize across the whole code, not only locally
over one access, therefore it might be able to punt some of the unaligned
accesses altogether if the code allows it. Besides, right now, the code is much
more readable. So I really don't like adding some strange macros to force crazy
aligned access if the compiler can do it for us and can do it better.
$ ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- ./MAKEALL m28evk
Configuring for m28evk board...
text data bss dec hex filename
415964 7688 288740 712392 adec8 ./u-boot
11754 788 12 12554 310a ./spl/u-boot-spl
--------------------- SUMMARY ----------------------------
Boards compiled: 1
----------------------------------------------------------
$ patch -Np1 -i /tmp/\[PATCH\ v2\]\ usb_do\ explicit\ unaligned\ accesses.mbox
patching file common/usb_hub.c
patching file drivers/usb/host/ehci-hcd.c
Hunk #2 succeeded at 867 with fuzz 1 (offset -10 lines).
$ ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- ./MAKEALL m28evk
Configuring for m28evk board...
text data bss dec hex filename
415968 7688 288736 712392 adec8 ./u-boot
11754 788 12 12554 310a ./spl/u-boot-spl
--------------------- SUMMARY ----------------------------
Boards compiled: 1
----------------------------------------------------------
Notice the text section grew a bit too, I dunno why, does anyone care enough to
clarify please?
> Hope this clarifies.> > Amicalement,
Best regards,
Marek Vasut

Hi Marek,
(Apologies for the private mail)
On Sat, 1 Sep 2012 00:16:43 +0200, Marek Vasut <marex@denx.de> wrote:
> Dear Albert ARIBAUD,
> > I think you are talking about lumping small-sized accesses together> > into a bigger access possibly aligned.> > This is exactly what I mean.> > > If I am correct, then I don't> > think this is related to misaligned accesses.> > Why won't it be? Such access can in the end turn out to be aligned,> therefore leveraging all the penalty.
I have not expressed myself clearly. Yes, access lumping is related to
access alignment. What I meant is: disallowing misaligned native
accesses will not prevent access lumping. Misalignment restrictions do
indeed restrict how such lumpings will happen, but it does not prevent
lumping per se.
One place where lumping and misalignement prevention did clash was
raised in the previous discussion: a 7+1 bytes function-local char array
was allocated on a non-aligned address (which is possible and normal
because it is a char) and was initialized with some content. The
compiler lumped the initialization as two misaligned 32-byte native
accesses, despite misaligned native accesses being forbidden by
compiler command line options. This was a compiler bug.
> > If I am not correct, can> > you please detail what you meant?> > > > > Besides, right now, the code is much more readable. So I really> > > don't like adding some strange macros to force crazy aligned> > > access if the compiler can do it for us and can do it better.> > > > I personally would let the compiler do it too, but I prefer it to be> > clearly indicated to the reader of the code when an access is> > known to be misaligned.> > I'd enable enable the Alignment trapping in the CPU and die on an> unaligned access at runtime -- to indicate the user that he should> fix his bloody compiler.
... or fix his bloody structure, or fix his bloody f...ixing pointer
arithmetic, or... but I do agree with the trapping, and that's my plan.
However other architectures may need, or choose, another stance on
alignments, and it is best if they don't have to discover intended
misaligned accesses the hard way. Thus my opinion that any misaligned
access which cannot be fixed should not be sliently left for the
compiler to handle, but should (also) be clearly marked as such, if only
for humans to notice.
> Best regards,> Marek Vasut
Amicalement,

Dear Albert ARIBAUD,
> Hi Marek,> > On Sat, 1 Sep 2012 00:16:43 +0200, Marek Vasut <marex@denx.de> wrote:> > Dear Albert ARIBAUD,> > > > > I think you are talking about lumping small-sized accesses together> > > into a bigger access possibly aligned.> > > > This is exactly what I mean.> > > > > If I am correct, then I don't> > > think this is related to misaligned accesses.> > > > Why won't it be? Such access can in the end turn out to be aligned,> > therefore leveraging all the penalty.> > I have not expressed myself clearly. Yes, access lumping is related to> access alignment. What I meant is: disallowing misaligned native> accesses will not prevent access lumping. Misalignment restrictions do> indeed restrict how such lumpings will happen, but it does not prevent> lumping per se.> > One place where lumping and misalignement prevention did clash was> raised in the previous discussion: a 7+1 bytes function-local char array> was allocated on a non-aligned address (which is possible and normal> because it is a char) and was initialized with some content. The> compiler lumped the initialization as two misaligned 32-byte native> accesses, despite misaligned native accesses being forbidden by> compiler command line options. This was a compiler bug.
But that'd mean that instead of fixing a compiler, we'd be doing a workaround in
our code?
> > > If I am not correct, can> > > you please detail what you meant?> > > > > > > Besides, right now, the code is much more readable. So I really> > > > don't like adding some strange macros to force crazy aligned> > > > access if the compiler can do it for us and can do it better.> > > > > > I personally would let the compiler do it too, but I prefer it to be> > > clearly indicated to the reader of the code when an access is> > > known to be misaligned.> > > > I'd enable enable the Alignment trapping in the CPU and die on an> > unaligned access at runtime -- to indicate the user that he should> > fix his bloody compiler.> > ... or fix his bloody structure, or fix his bloody f...ixing pointer> arithmetic, or... but I do agree with the trapping, and that's my plan.> > However other architectures may need, or choose, another stance on> alignments, and it is best if they don't have to discover intended> misaligned accesses the hard way.
Yet still, in such case, valid compiler has to generate valid workaround code.
> Thus my opinion that any misaligned> access which cannot be fixed should not be sliently left for the> compiler to handle, but should (also) be clearly marked as such, if only> for humans to notice.
I can't say I agree here ... since it's a really ad-hoc solution. I can't say I
see any real benefit other than that it's hiding possible compiler bugs :-(
> > Best regards,> > Marek Vasut> > Amicalement,
Best regards,
Marek Vasut

Dear Albert ARIBAUD,
> Hi Marek,> > On Sat, 1 Sep 2012 16:34:09 +0200, Marek Vasut <marex@denx.de> wrote:> > Dear Albert ARIBAUD,> > > > > Hi Marek,> > > > > > On Sat, 1 Sep 2012 00:16:43 +0200, Marek Vasut <marex@denx.de>> > > > > > wrote:> > > > Dear Albert ARIBAUD,> > > > > > > > > I think you are talking about lumping small-sized accesses> > > > > together into a bigger access possibly aligned.> > > > > > > > This is exactly what I mean.> > > > > > > > > If I am correct, then I don't> > > > > think this is related to misaligned accesses.> > > > > > > > Why won't it be? Such access can in the end turn out to be> > > > aligned, therefore leveraging all the penalty.> > > > > > I have not expressed myself clearly. Yes, access lumping is related> > > to access alignment. What I meant is: disallowing misaligned native> > > accesses will not prevent access lumping. Misalignment restrictions> > > do indeed restrict how such lumpings will happen, but it does not> > > prevent lumping per se.> > > > > > One place where lumping and misalignement prevention did clash was> > > raised in the previous discussion: a 7+1 bytes function-local char> > > array was allocated on a non-aligned address (which is possible and> > > normal because it is a char) and was initialized with some content.> > > The compiler lumped the initialization as two misaligned 32-byte> > > native accesses, despite misaligned native accesses being forbidden> > > by compiler command line options. This was a compiler bug.> > > > But that'd mean that instead of fixing a compiler, we'd be doing a> > workaround in our code?> > Not exactly.> > First, in this instance, a fix to the compiler has been at least> requested, if not already applied (I would need to check this). The fix> causes the compiler to still generate misaligned 32-bit accesses *if*> misaligned native accesses are allowed, and to use only allowed> accesses otherwise.
But then again, this is compiler bug we exposed, no need to hide it. I'm firm on
this one.
> Second, I do not ask U-Boot contributors to mark code as explicitly> unaligned when the misaligned access is caused by a compiler or> code error; I ask them to mark code as unaligned when the misaligned> access is *unavoidable* because the HW or some standard imposes it.
I see, I'm starting to see your point. Maybe because I've missed the previous
discussion.
> Here, the specification from which the USB struc is derived imposes a> misaligned field. This, rather than any compiler bug, makes the> misaligned access *unavoidable*. And because it is, I ask that it be> marked so, by the explicit use of unaligned accessors.
Still, this is unaligned only on ARM, not on maybe some other arches, right?
> > > > > If I am not correct, can> > > > > you please detail what you meant?> > > > > > > > > > > Besides, right now, the code is much more readable. So I> > > > > > really don't like adding some strange macros to force crazy> > > > > > aligned access if the compiler can do it for us and can do it> > > > > > better.> > > > > > > > > > I personally would let the compiler do it too, but I prefer it> > > > > to be clearly indicated to the reader of the code when an> > > > > access is known to be misaligned.> > > > > > > > I'd enable enable the Alignment trapping in the CPU and die on an> > > > unaligned access at runtime -- to indicate the user that he should> > > > fix his bloody compiler.> > > > > > ... or fix his bloody structure, or fix his bloody f...ixing pointer> > > arithmetic, or... but I do agree with the trapping, and that's my> > > plan.> > > > > > However other architectures may need, or choose, another stance on> > > alignments, and it is best if they don't have to discover intended> > > misaligned accesses the hard way.> > > > Yet still, in such case, valid compiler has to generate valid> > workaround code.> > Yes. However, letting the compiler generate workarounds may end up> letting it generate workarounds for misaligned accesses caused by errors> or bugs also. Marking the code explicitly helps telling which is which> too.
Does this work across architectures too? Like, on arm it's misaligned, on intel
it isnt.
> > > Thus my opinion that any misaligned> > > access which cannot be fixed should not be sliently left for the> > > compiler to handle, but should (also) be clearly marked as such, if> > > only for humans to notice.> > > > I can't say I agree here ... since it's a really ad-hoc solution. I> > can't say I see any real benefit other than that it's hiding possible> > compiler bugs :-(> > Here it is barely an ad hoc solution, as the alternative would be> fixing the hardware or worse, spec (can someone tell us where this> misaligned struct field originates from exactly, hw or USB spec?)
http://www.intel.com/technology/usb/download/ehci-r10.pdf I think you're looking
around 3.6 .
> > Best regards,> > Marek Vasut> > Amicalement,

Hi Marek,
On Sat, 1 Sep 2012 17:12:29 +0200, Marek Vasut <marex@denx.de> wrote:
> > > > One place where lumping and misalignement prevention did clash> > > > was raised in the previous discussion: a 7+1 bytes> > > > function-local char array was allocated on a non-aligned> > > > address (which is possible and normal because it is a char) and> > > > was initialized with some content. The compiler lumped the> > > > initialization as two misaligned 32-byte native accesses,> > > > despite misaligned native accesses being forbidden by compiler> > > > command line options. This was a compiler bug.> > > > > > But that'd mean that instead of fixing a compiler, we'd be doing a> > > workaround in our code?> > > > Not exactly.> > > > First, in this instance, a fix to the compiler has been at least> > requested, if not already applied (I would need to check this). The> > fix causes the compiler to still generate misaligned 32-bit> > accesses *if* misaligned native accesses are allowed, and to use> > only allowed accesses otherwise.> > But then again, this is compiler bug we exposed, no need to hide it.> I'm firm on this one.
I guess I was not clear: this issue with an 8-char local array was
*not* in U-Boot. So we exposed nothing there, and none of the
discussion led to any conclusion that we should hide anything under the
carpet. Actually, if/when we meet a compiler issue, my suggestion is
always to explicitly and lavishly comment the 'fix', whatever it is,
with warnings such as /* CAUTION! BRAINDEAD COMPILER! There is an issue
with compiler X versions Y and up where [...] */. And keep an eye on the
compiler.
> > Second, I do not ask U-Boot contributors to mark code as explicitly> > unaligned when the misaligned access is caused by a compiler or> > code error; I ask them to mark code as unaligned when the misaligned> > access is *unavoidable* because the HW or some standard imposes it.> > I see, I'm starting to see your point. Maybe because I've missed the> previous discussion.
I think so.
> > Here, the specification from which the USB struc is derived imposes> > a misaligned field. This, rather than any compiler bug, makes the> > misaligned access *unavoidable*. And because it is, I ask that it be> > marked so, by the explicit use of unaligned accessors.> > Still, this is unaligned only on ARM, not on maybe some other arches,> right?
The notion of 'misaligned' (rather than unaligned) is pretty much
architecture-independent: the fact that a 32-bit int is not on a
4-byte boundary makes it misaligned. Now, depending on arches, this
misalignment may be irrelevant because the arch can do native misaligned
accesses with little or no penalty, or may be a worry because the arch
can (be made to) do native misaligned accesses but at a performance
cost, or it may be a blocker because the arch cannot do native
misaligned accesses.
So maybe on some other arches misalignment is 'not a problem', or
maybe it is 'a serious issue'. Anyway, for ARM is ranges from one ed
to the other, and for any arch, on a given system the seriousness of the
issue may be set by the designers of the system.
> > Yes. However, letting the compiler generate workarounds may end up> > letting it generate workarounds for misaligned accesses caused by> > errors or bugs also. Marking the code explicitly helps telling> > which is which too.> > Does this work across architectures too? Like, on arm it's> misaligned, on intel it isnt.
Each architecture has its own capabilites regarding native misaligned
accesses... This is why I consider that as a general rule U-Boot should
always align its data properly, because (hopefully) all architectures
can do aligned native accesses; OTOH, if we accept misaligned code on
the grounds that 'it works on such and suh arches' or that 'any normal
arch should be able to handle misaligned accesses some way' or 'no one
in their right mind would physically forbit misaligned accesses', then
we're just giving Murphy a chance to kick us at some point.
Consider this an application of Postel's principle: we liberally
accept architectures that maybe allow misaligned accesses and maybe
handle them well; and we conservatively do not do such accesses unless
we have no other choice.
> > Here it is barely an ad hoc solution, as the alternative would be> > fixing the hardware or worse, spec (can someone tell us where this> > misaligned struct field originates from exactly, hw or USB spec?)> > http://www.intel.com/technology/usb/download/ehci-r10.pdf I think> you're looking around 3.6 .
I see section 3.6 (Queue Head) but I don't readily see the link
between the header and the patch we're discussiong (but I'm not an
USB expert either). Can you connect e few more dots for me? Thanks in
advance.
> > > Best regards,> > > Marek Vasut
Amicalement,

Dear Albert ARIBAUD,
In message <20120901182841.3ae8b35f@lilith> you wrote:
> > Each architecture has its own capabilites regarding native misaligned> accesses... This is why I consider that as a general rule U-Boot should> always align its data properly, because (hopefully) all architectures> can do aligned native accesses; OTOH, if we accept misaligned code on> the grounds that 'it works on such and suh arches' or that 'any normal> arch should be able to handle misaligned accesses some way' or 'no one> in their right mind would physically forbit misaligned accesses', then> we're just giving Murphy a chance to kick us at some point.> > Consider this an application of Postel's principle: we liberally> accept architectures that maybe allow misaligned accesses and maybe> handle them well; and we conservatively do not do such accesses unless> we have no other choice.
Full agreement.
Best regards,
Wolfgang Denk

Dear Albert ARIBAUD,
> Hi Marek,> > On Sat, 1 Sep 2012 17:12:29 +0200, Marek Vasut <marex@denx.de> wrote:> > > > > One place where lumping and misalignement prevention did clash> > > > > was raised in the previous discussion: a 7+1 bytes> > > > > function-local char array was allocated on a non-aligned> > > > > address (which is possible and normal because it is a char) and> > > > > was initialized with some content. The compiler lumped the> > > > > initialization as two misaligned 32-byte native accesses,> > > > > despite misaligned native accesses being forbidden by compiler> > > > > command line options. This was a compiler bug.> > > > > > > > But that'd mean that instead of fixing a compiler, we'd be doing a> > > > workaround in our code?> > > > > > Not exactly.> > > > > > First, in this instance, a fix to the compiler has been at least> > > requested, if not already applied (I would need to check this). The> > > fix causes the compiler to still generate misaligned 32-bit> > > accesses *if* misaligned native accesses are allowed, and to use> > > only allowed accesses otherwise.> > > > But then again, this is compiler bug we exposed, no need to hide it.> > I'm firm on this one.> > I guess I was not clear: this issue with an 8-char local array was> *not* in U-Boot. So we exposed nothing there, and none of the> discussion led to any conclusion that we should hide anything under the> carpet. Actually, if/when we meet a compiler issue, my suggestion is> always to explicitly and lavishly comment the 'fix', whatever it is,> with warnings such as /* CAUTION! BRAINDEAD COMPILER! There is an issue> with compiler X versions Y and up where [...] */. And keep an eye on the> compiler.
Agreed
> > > Second, I do not ask U-Boot contributors to mark code as explicitly> > > unaligned when the misaligned access is caused by a compiler or> > > code error; I ask them to mark code as unaligned when the misaligned> > > access is *unavoidable* because the HW or some standard imposes it.> > > > I see, I'm starting to see your point. Maybe because I've missed the> > previous discussion.> > I think so.> > > > Here, the specification from which the USB struc is derived imposes> > > a misaligned field. This, rather than any compiler bug, makes the> > > misaligned access *unavoidable*. And because it is, I ask that it be> > > marked so, by the explicit use of unaligned accessors.> > > > Still, this is unaligned only on ARM, not on maybe some other arches,> > right?> > The notion of 'misaligned' (rather than unaligned) is pretty much> architecture-independent: the fact that a 32-bit int is not on a> 4-byte boundary makes it misaligned. Now, depending on arches, this> misalignment may be irrelevant because the arch can do native misaligned> accesses with little or no penalty, or may be a worry because the arch> can (be made to) do native misaligned accesses but at a performance> cost, or it may be a blocker because the arch cannot do native> misaligned accesses.> > So maybe on some other arches misalignment is 'not a problem', or> maybe it is 'a serious issue'. Anyway, for ARM is ranges from one ed> to the other, and for any arch, on a given system the seriousness of the> issue may be set by the designers of the system.> > > > Yes. However, letting the compiler generate workarounds may end up> > > letting it generate workarounds for misaligned accesses caused by> > > errors or bugs also. Marking the code explicitly helps telling> > > which is which too.> > > > Does this work across architectures too? Like, on arm it's> > misaligned, on intel it isnt.> > Each architecture has its own capabilites regarding native misaligned> accesses... This is why I consider that as a general rule U-Boot should> always align its data properly, because (hopefully) all architectures> can do aligned native accesses; OTOH, if we accept misaligned code on> the grounds that 'it works on such and suh arches' or that 'any normal> arch should be able to handle misaligned accesses some way' or 'no one> in their right mind would physically forbit misaligned accesses', then> we're just giving Murphy a chance to kick us at some point.> > Consider this an application of Postel's principle: we liberally> accept architectures that maybe allow misaligned accesses and maybe> handle them well; and we conservatively do not do such accesses unless> we have no other choice.> > > > Here it is barely an ad hoc solution, as the alternative would be> > > fixing the hardware or worse, spec (can someone tell us where this> > > misaligned struct field originates from exactly, hw or USB spec?)> > > > http://www.intel.com/technology/usb/download/ehci-r10.pdf I think> > you're looking around 3.6 .> > I see section 3.6 (Queue Head) but I don't readily see the link> between the header and the patch we're discussiong (but I'm not an> USB expert either). Can you connect e few more dots for me? Thanks in> advance.
Nevermind, I took a second look and noticed how the structure looks.
Anyway, I see USB is pioneering this new rule, nice :-) I now basically see the
issue at hand, sorry for being a blind ass. Remind me next time to look first
and talk afterwards.
361 struct usb_hub_descriptor {
362 unsigned char bLength;
363 unsigned char bDescriptorType;
364 unsigned char bNbrPorts;
365 unsigned short wHubCharacteristics;
Let's apply this patch once (if?) WD pulls my USB tree.
> > > > Best regards,> > > > Marek Vasut> > Amicalement,