Also slightly move around code not allocate a new frame if we won't
decode it. This prevents us from putting undecoded frames in frame
pointers, which (in mt decoding) other threads will use and wait on
as references, causing a deadlock (if we skipped decoding) or a crash
(if we didn't initialized next_framep[] at all).

next_dts is used for estimating the dts of the next packet if it's
missing. Therefore, it makes no sense to set it from the pts of the last
decoded frame. Also it should be estimated from the current packet
duration/ticks_per_frame always, not only when a frame was successfully
decoded.

apedec: allow the user to set the maximum number of output samples per call

It makes sense in some cases to split up the output packet to save on memory
usage (ape frames can be very large), but the current/default size is
arbitrary. Allowing the user to configure this gives more flexibility and
requires minimal additional code.

Calculates based on total file size and wavetaillength from the header.
Falls back to multiplying finalframeblocks by 8 instead of 4 so that it will
at least be overestimating for 24-bit. Currently it can underestimate the
final packet size, leading to decoding errors.

Before, drawtext filter deliberately altered given text coordinates if
text didn't fully fit on the picture. This breaks the use case of
scrolling large text, e.g. movie closing credits.
Add 'fix_bounds', to make it usable in such cases (by setting its value to 0).
Default behavior is not changed, and non-fitting text coords are fixed.

Do not use AVStream's duration for dts generation since it contains in
some cases the duration of the whole file instead of duration of the
samples in the moov. This happens if the mdhd holds the duration of the
whole file but has no entries or a zero duration in its stts.

Return the correct number of consumed bytes and set *data_size = 0.
Returned size is 1 too small, leading to that 1 byte being read as the next
frame, which results in an extra blank frame at the beginning of the stream.

get_ue_golomb_long() is only tested for values up to 2^15 - 2 since
we can not write larger values.
Silence the test on success and return a non-zero value on error.
Use an heap scratch buffer instead of large stack buffer.
Remove unneeded includes.