One more important note: the current implementation has a potential
buffer overrun issue, because it writes first, and only then checks
whether that may have overrun the buffer. And the check itself is off
by one, too:
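
To illustrate the pattern being described, here is a minimal sketch. It is
not the actual implementation: `Buf`, `CAP`, `put_buggy` and `put_checked`
are hypothetical names, and the extra guard byte exists only so that the
overrun can be observed without undefined behaviour.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t CAP = 4;  // logical capacity (illustrative value)
constexpr char GUARD = '#';     // sentinel stored just past the logical buffer

struct Buf {
  // CAP usable bytes plus one guard byte at index CAP.
  char storage[CAP + 1] = {};
  std::size_t ix = 0;

  Buf() { storage[CAP] = GUARD; }

  // Write first, check afterwards. The check is also off by one:
  // ix == CAP is accepted, so the next call writes storage[CAP].
  void put_buggy(char c) {
    storage[ix++] = c;
    if (ix > CAP)
      ix--;
  }

  // Check first, then write. Returns false when the buffer is full.
  bool put_checked(char c) {
    if (ix >= CAP)
      return false;
    storage[ix++] = c;
    return true;
  }

  bool guard_intact() const { return storage[CAP] == GUARD; }
};
```

Calling `put_buggy` CAP + 1 times overwrites the guard byte, while
`put_checked` rejects the extra byte and leaves the guard untouched.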

I agree that your solution is smarter and more idiomatic C++.
However, from a performance point of view, plain inline
static functions are better. The attached code measures the
access speed for wpbuf.
I compiled it with g++ 7.4.0 and the -O2 option.

The result is as follows.

Total1: 2.315627 second
Total2: 1.588511 second
Total3: 1.571572 second

> However, from the view point of performance, just inline
> static function is better.

I don't see how that could be the case. Inline methods of a static C++
object should not suffer any performance penalty compared to inline
functions operating on static variables.

> Attached code measures the
> performance of access speed for wpbuf.
> I compiled it by g++ 7.4.0 with -O2 option.
>
> The result is as follows.
>
> Total1: 2.315627 second
> Total2: 1.588511 second
> Total3: 1.571572 second

And on inspection, all three bench*() functions do appear to have
exactly the same machine code, too. They may be inlined and mixed into
main() somewhat differently, though. That might explain the difference
more readily than any actual difference in speed between the three
implementations.
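
One way to take inlining into main() out of the picture is to keep each
benchmark loop in its own non-inlined function. The following is only a
sketch under assumptions: it is not the attached benchmark, all names
(`put_a`, `WpBuf`, `bench_a`, `bench_b`, `LEN`) are made up for
illustration, and `__attribute__((noinline))` is a GCC/Clang extension.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t LEN = 256;  // illustrative buffer size

// Variant A: static variables plus an inline static function.
static unsigned char buf_a[LEN];
static std::size_t ix_a = 0;
static inline void put_a(unsigned char c) {
  if (ix_a < LEN)
    buf_a[ix_a++] = c;
}

// Variant B: a static object with inline member functions.
class WpBuf {
  unsigned char buf[LEN];
  std::size_t ix = 0;
public:
  inline void put(unsigned char c) {
    if (ix < LEN)
      buf[ix++] = c;
  }
  inline void reset() { ix = 0; }
  inline std::size_t size() const { return ix; }
};
static WpBuf buf_b;

// noinline keeps each loop a separate function, so the compiler cannot
// merge it into the caller and lay the variants out differently there.
__attribute__((noinline)) std::size_t bench_a(std::size_t rounds) {
  for (std::size_t r = 0; r < rounds; r++) {
    ix_a = 0;
    for (std::size_t i = 0; i < LEN; i++)
      put_a(static_cast<unsigned char>(i));
  }
  return ix_a;
}

__attribute__((noinline)) std::size_t bench_b(std::size_t rounds) {
  for (std::size_t r = 0; r < rounds; r++) {
    buf_b.reset();
    for (std::size_t i = 0; i < LEN; i++)
      buf_b.put(static_cast<unsigned char>(i));
  }
  return buf_b.size();
}
```

Each function can then be timed from main(), for example with
std::chrono::steady_clock, and the timings compared alongside the
disassembly of bench_a and bench_b.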

> On 01.03.2020 at 07:33, Takashi Yano wrote:
>
> > However, from the view point of performance, just inline
> > static function is better.
>
> I don't see how that could be the case. Inline methods of a static C++
> object should not suffer any performance penalty compared to inline
> functions operating on static variables.
>
> > Attached code measures the
> > performance of access speed for wpbuf.
> > I compiled it by g++ 7.4.0 with -O2 option.
> >
> > The result is as follows.
> >
> > Total1: 2.315627 second
> > Total2: 1.588511 second
> > Total3: 1.571572 second
>
> Strange. The result here (with GCC 9.2) is rather different:
>
> $ g++ -O2 -o tt wpbuf-bench.cc && ./tt
> Total1: 0.753815 second
> Total2: 0.757444 second
> Total3: 1.217352 second
>
> And on inspection, all three bench*() functions do appear to have
> exactly the same machine code, too. They may be inlined and mixed into
> main() somewhat differently, though. That might explain the difference
> more readily than any actual difference in speed between the three
> implementations.

I looked at the code generated by g++ 7.4.0 with -O2. The generated
code does differ.