These clean-ups and minor speed-ups complete some TODOs and half-finished
changes I had accumulated in the ELF backend. In a nutshell:

Fixed comment style; used INT_BITS_STRLEN_BOUND from gnulib to be
future-proof about the string length of integer representations; replaced
the long arguments of the fast printing functions with HOST_WIDE_INT, which
is always at least as wide (and asserted that); converted some
never-negative ints to unsigned. Guarded the
output.h:default_elf_asm_output_* declarations, mimicking varasm.c (I'm not
exactly sure why this is guarded in the first place). Changed
default_elf_asm_output_* to be clearer and faster: they now fwrite() line
by line instead of calling putc() char by char. Implemented fast octal
output in default_elf_asm_output_*; this should give a good boost to -flto,
but I haven't measured a big testcase for this one.
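
To illustrate the line-by-line idea, here is a minimal sketch, not the code
from the patch. The names append_octal_escape, format_ascii_line and
emit_ascii_line are hypothetical, and the 64-byte chunk limit is an
assumption made only for the example. It formats one .ascii directive into
a local buffer, writing octal escapes by hand, and hands the whole line to
a single fwrite():

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: append the three-digit octal escape for byte C
   (for example "\012" for a newline) at DST and return the new end.  */
char *
append_octal_escape (char *dst, unsigned char c)
{
  *dst++ = '\\';
  *dst++ = (char) ('0' + ((c >> 6) & 7));
  *dst++ = (char) ('0' + ((c >> 3) & 7));
  *dst++ = (char) ('0' + (c & 7));
  return dst;
}

/* Format LEN bytes of S as one .ascii directive into BUF (capacity CAP).
   Quotes and backslashes get a backslash, non-printable bytes an octal
   escape.  Returns the number of characters written, excluding the NUL.  */
size_t
format_ascii_line (char *buf, size_t cap, const unsigned char *s, size_t len)
{
  static const char prefix[] = "\t.ascii\t\"";
  char *p = buf;
  size_t i;

  /* Worst case: 4 chars per byte, 9 for the prefix, 2 for the closing
     quote and newline, 1 for the NUL.  */
  assert (cap >= 4 * len + 12);
  memcpy (p, prefix, sizeof prefix - 1);
  p += sizeof prefix - 1;
  for (i = 0; i < len; i++)
    {
      unsigned char c = s[i];
      if (c == '"' || c == '\\')
        {
          *p++ = '\\';
          *p++ = (char) c;
        }
      else if (c >= 0x20 && c < 0x7f)
        *p++ = (char) c;
      else
        p = append_octal_escape (p, c);
    }
  *p++ = '"';
  *p++ = '\n';
  *p = '\0';
  return (size_t) (p - buf);
}

/* Emit at most 64 bytes of S as a single line with one fwrite() call,
   instead of one putc() per output character.  */
void
emit_ascii_line (FILE *f, const unsigned char *s, size_t len)
{
  char buf[4 * 64 + 12];
  size_t n;

  assert (len <= 64);
  n = format_ascii_line (buf, sizeof buf, s, len);
  fwrite (buf, 1, n, f);
}
```

The point of the split is that all per-character work happens on a plain
char pointer, and stdio is entered once per line.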

All in all I get a speed-up of ~30 M instructions out of ~2 G for a -g3
compilation of reload.c. Saving all the putc() calls actually gives a
larger gain, but I lost a little of it by converting the [sf]print_*
functions from long to HOST_WIDE_INT, for PR 51094. On i586, where
HOST_WIDE_INT is 8 bytes wide, I can see slow calls to __u{div,mod}di3
taking place. I don't know whether there is any point in writing LEB128
values greater than 2^31, but I could change all that to
HOST_WIDEST_FAST_INT if you think so.
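
For illustration only, one way to keep most of the work out of the 64-bit
division helpers is to drop to 32-bit arithmetic as soon as the value fits.
The sketch below is not what the patch does. The name sprint_u64 is
hypothetical, uint64_t stands in for an unsigned HOST_WIDE_INT, and
DEC_DIGITS_BOUND is a local macro written in the spirit of gnulib's
INT_BITS_STRLEN_BOUND rather than a copy of it:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Upper bound on the decimal digits needed for an unsigned value of BITS
   bits; 146/485 is slightly larger than log10(2).  */
#define DEC_DIGITS_BOUND(bits) (((bits) * 146 + 484) / 485)

/* Hypothetical helper: write VALUE in decimal to DST, NUL-terminated, and
   return the number of digits.  DST needs DEC_DIGITS_BOUND (64) + 1
   bytes.  */
size_t
sprint_u64 (char *dst, uint64_t value)
{
  char tmp[DEC_DIGITS_BOUND (64)];
  char *p = tmp + sizeof tmp;
  uint32_t v;
  size_t n;

  /* Wide path: only while the value does not fit in 32 bits.  On a
     32-bit host these are the operations that become library calls.  */
  while (value > UINT32_MAX)
    {
      *--p = (char) ('0' + value % 10);
      value /= 10;
    }

  /* Narrow path: plain 32-bit division for the remaining digits.  */
  v = (uint32_t) value;
  do
    {
      *--p = (char) ('0' + v % 10);
      v /= 10;
    }
  while (v != 0);

  assert (p >= tmp);
  n = (size_t) (tmp + sizeof tmp - p);
  memcpy (dst, p, n);
  dst[n] = '\0';
  return n;
}
```

Values below 2^32 never touch the wide path, so the common case costs the
same as it did with a 32-bit long.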

The time savings are minor too, about 10 ms out of 0.85 s. Memory usage is
unchanged. Bootstrapped on x86, no regressions in the C and C++ testsuites.

Thanks Andreas, hp and Mike for your comments. Mike, I'd appreciate it if
you could elaborate on how to speed up sprint_uw_rev(); I don't think I
understood what you have in mind.