In libswcale/tests/swcale.c, the function fileTest() calls sscanf in
an argument of "%12s" on character srcStr[] and dstStr[], which are
only 12 bytes. So, if the input string is 12 characters, a
terminating null byte can be written past the end of these arrays.

The implementation is pretty straight-forward. Most of the existing
NV12 codepaths work regardless of subsampling and are re-used as is.
Where necessary I wrote the slightly different NV24 versions.

Finally, the one thing that confused me for a long time was the
asm specific x86 path that did an explicit exclusion check for NV12.
I replaced that with a semi-planar check and also updated the
equivalent PPC code, which Lauri kindly checked.

These formats are not used much, so we've never had a reason to add
them until now. VDPAU recently added support HEVC 4:4:4 content
and when you use the OpenGL interop, the returned surfaces are in
NV24 format, so we need the pixel format for media players, even
if there's no direct use within ffmpeg.

Separately, there are apparently webcams that use NV24, but I've
never seen one.

perfer avctx->framerate first than use avctx->time_base when setting
the frame rate to encoder. 1/time_base is not the average frame rate
if the frame rate is not constant, so use avctx->framerate if the
value is not zero.

perfer avctx->framerate first than use avctx->time_base when setting
the frame rate to encoder. 1/time_base is not the average frame rate
if the frame rate is not constant. In this case, we need to setting
avctx->framerate and avctx->time_base both, but avctx->framerate not
equal to 1/(avctx->time_base).

After the last few commits, the functions for writing master elements
with CRC-32 elements didn't really make use of the ebml_master
structure any more, so remove these parameters from the functions.

The only things that still need to be kept are the positions of the
level 1 elements that are written preliminarily and updated later.
These positions are stored in the MatroskaMuxContext and
replace the corresponding ebml_master structures.

Up until now, a block's relative offset has been reported as the offset
in the log messages output when writing blocks; given that it is
impossible to know the real offset from the beginning of the file at
this point due to the fact that it is not yet known how many bytes will
be used for the containing cluster's length field both the relative
offset in the cluster as well as the offset of the containing cluster
will be reported from now on.

Furthermore, the TrackNumber of the written block has been added to the
log output.

Also, the log message for writing vtt blocks has been brought in line
with the message for normal blocks.

Up until now, the length field of most level 1 elements has been written
using eight bytes, although it is known in advance how much space the
content of said elements will take up so that it would be possible to
determine the minimal amount of bytes for the length field. This
commit changes this.

Given that in both the seekable as well as the non-seekable mode dynamic
buffers are used to write level 1 elements and that now no seeks are
used in the seekable case any more, the two modes can be combined; as a
consequence, the non-seekable mode automatically inherits the ability to
write CRC-32 elements.

There are no differences in case the output is seekable; when it is not
and writing CRC-32 elements is disabled, there can still be minor
differences because before this commit, the EBML ID and length field
were counted towards the cluster size limit; now they no longer are.

Up until now, the writing process for level 1 elements (those elements
for which CRC-32 elements are written by default) was this in case the
output was seekable: Write the EBML ID, write an "unkown length" EBML
number of the desired length, then write the element into a dynamic
buffer, then write the dynamic buffer (after possible calculation and
writing of the CRC-element), then seek back to the size element and
overwrite the unknown-size element with the real size. The seeking and
overwriting part has been eliminated by not writing the size initially.

Up until now, the check for whether to write CRC32 elements was always
mkv->write_crc && mkv->mode != MODE_WEBM. This is equivalent to simply
set write_crc to zero in WebM-mode. And this is what this commit does.

Up until e7ddafd515dc9826915b739d0b977a63c21e96af the Matroska muxer
wrote a secondary seek head referencing all the clusters. When this
was changed, a (now completely wrong) comment remained and the unique
remaining seek head was still called main_seekhead. This has been
changed.

Up until now the EBML Header length field has been written with eight
bytes, although the EBML Header is always so small that only one byte
is needed for it. This patch saves seven bytes for every Matroska/Webm
file.

The upper bounds currently used for determining the size of a CuePoint's
length field can be improved somewhat; as a result, a CuePoint
containing three CueTrackPositions will now only need a size field
with one byte length.

The earlier code included the size of the BlockGroup's length field and
the EBML ID in the calculation of the size for the payload and ignored
the size of the duration's length field. This meant that Blockgroups
corresponding to packets with size 2^(7n) - 17 - n - i, i = 0,..., n - 1,
n = 1,..., 8 (i.e. 110, 16364, 16365, 2097130..2097132, ...) were written
with length fields that are unnecessarily long.

the clean up code in this patch is a little complex, it is because
that set_input_output_tf could be called for many times together
with ff_dnn_execute_model_tf, we have to clean resources for the
case that the two interfaces are called interleaved.

Currently, within interface set_input_output, the dims/memory of the tensorflow
dnn model output is determined by executing the model with zero input,
actually, the output dims might vary with different input data for networks
such as object detection models faster-rcnn, ssd and yolo.

This patch moves the logic from set_input_output to execute_model which
is suitable for all the cases. Since interface changed, and so dnn_backend_native
also changes.

In vf_sr.c, it knows it's srcnn or espcn by executing the model with zero input,
so execute_model has to be called in function config_props

Cuvid supports clips with a limit on maximum number of macroblocks.
This check was missing after cuvidGetDecoderCaps API call allowing
unsupported clips to proceed.
Added the missing check, same as the one in hwaccel nvdec implementation.

10 bytes (id3v2 header amount of bytes) were being read before any checks
were made on the bitstream. The result was that we were overreading into
the next frame if the current one was 8 or 9 bytes long.

New VdpYCbCr Formats VDP_YCBCR_FORMAT_Y_U_V_444 and,
VDP_YCBCR_FORMAT_Y_UV_444 have been added in VDPAU with libvdpau-1.2
to be used in get/putbits for YUV 4:4:4 surfaces. Earlier mapping of
AV_PIX_FMT_YUV444P to VDP_YCBCR_FORMAT_YV12 is not valid.

This commit was merged in a couple years ago as a no-op because we
had already switched from GetProcAddress to dlsym some time before
that. However, not applying the actual cast causes warnings about
FARPROC and when attempting to build FFmpeg in MSVC with AviSynth-GCC
32-bit compatibility, those FARPROC warnings turn into FARPROC errors.