And of course, you can ask questions here. In that case it helps if
you can reduce the source to a small piece of code that triggers the
problem and lets others reproduce it (i.e. no #include in the code,
no ... except for varargs, and so on).

Snippets of the generated .s file may help pinpoint the problem if you
compile with -dp -fverbose-asm.

And there are lots of places where avr-gcc produces suboptimal or even
bad code, so feedback is welcome.

But note that just a few guys are working on the AVR part of gcc.
I would do more if I had the time (and the support of some gurus to ask
questions on internals now and then...)

OK, I only spent a few minutes looking at old code and I found some
obviously sub-optimal results. It distills down to this:

The problem, it seems, is that the compiler doesn't realize that the
right hand side of the expression can only have non-zero values in
the bottom 8 bits, since it's an unsigned char which is being
implicitly expanded to 32 bits for the OR operation. In fact, it's only
the bottom bit that's ever non-zero. As a result it's spending a number
of cycles and registers doing useless things. I'll copy a report to the
locations you mention in your e-mail.
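
The distilled snippet itself is not reproduced above, so purely as an
illustration, here is a hypothetical reduction of the kind of statement
being described. The names shift_in_bit, packet and port_value are
assumptions, and a plain function argument stands in for the I/O
register read so that the code also builds with a host compiler.

```c
#include <stdint.h>

/* Hypothetical reduction, not the original code: shift a 32-bit
 * accumulator left by one and OR in a single bit taken from bit 1
 * of an 8-bit port value. */
uint32_t shift_in_bit(uint32_t packet, uint8_t port_value)
{
    /* Only bit 0 of 'bit' can ever be set. */
    uint8_t bit = (port_value >> 1) & 1;

    /* 'bit' is widened to 32 bits for the OR, although its upper
     * 24 bits are known to be zero. */
    return (packet << 1) | bit;
}
```

Compiling something like this with -dp -fverbose-asm should show whether
the upper three bytes of the widened operand are still being handled.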

There are probably ways to work around this, such as making "packet" a
union of an unsigned char and a long, then shifting the long and only
ORing in the unsigned char. I'll note that there's also an optimization
to be had with the right hand side of the expression. I would write the
assembly something like this:

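lsl r18            ; shift the 32-bit value in r21:r20:r19:r18
rol r19            ; left by one bit, carrying through each byte
rol r20
rol r21
in r24,38-0x20     ; read the I/O register at data address 38
bst r24, 1         ; copy bit 1 of that value into the T flag
bld r18, 0         ; load T into bit 0 of the low byte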
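
The union workaround mentioned above could look roughly like the sketch
below. This is only an assumption about how it might be written: the
names union packet, whole, low and shift_in_bit_union are made up for
illustration, uint32_t stands in for the 32-bit long on AVR, and the
byte overlay relies on a little-endian target (which AVR is).

```c
#include <stdint.h>

/* Hypothetical layout: on a little-endian target such as AVR,
 * 'low' overlays the least significant byte of 'whole'. */
union packet {
    uint32_t whole;
    uint8_t  low;
};

uint32_t shift_in_bit_union(uint32_t value, uint8_t port_value)
{
    union packet p;

    p.whole = value << 1;            /* bit 0 is now zero */
    p.low |= (port_value >> 1) & 1;  /* OR touches the low byte only */
    return p.whole;
}
```

Whether this actually produces tighter code depends on the compiler
version, so the generated assembly would still need to be checked.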

I'm sure I can find other examples of poor code generation in this
particular file, since I remember coming across many cases where I
replaced the generated code with inline assembly when I was originally
working on it, but that will have to wait for later.

Thanks for your help, I appreciate it. As I said, avr-gcc is pretty
good, but I would love it if it could get even better :)