Since non-Haar wavelets need to look into pixels outside the frame, we
need to pad the buffer. The old factor of two seemed to be a workaround
that fact and only padded to the left and bottom. This correctly pads
by the slice size and as such reduces memory usage and potential
exploits.
Reported by Liu Bingchang.

Ideally, there should be no temporary buffer but the encoder is designed
to deinterleave the coefficients into the classical wavelet structure
with the lower frequency values in the top left corner.