I don't know whether it is possible for an encoder to do such an analysis of the source audio, but it would be great if so.

It's only a matter of finding the right formula/algorithm.

The x264 video encoder has an encoding mode called Constant Rate Factor. In this mode a number (16, 17, etc.) defines the desired quality (lower means better quality and higher bitrate), and the encoder does not care about bitrate, only about keeping the rate factor constant. So the question is: why has nobody invented something similar for audio encoding (except lossyWAV, which needs too much bitrate for acceptable quality)?

----------------

Opus 1.1 Alpha has some bugs, which can be found using samples from the thread "High Frequency Listening Test Samples". For example, at 16-24 kbps Opus gives this: [image] and for 32-40 kbps it gives this: [image]

For samples 1_12kHz, 1_20kHz, 2_8kHz, 2_12kHz and 2_20kHz, Opus sounds wrong even at 512 kbps.

The full set of files is here (problematic samples are marked with an exclamation mark). I hope the developers will use these samples in their work.

The x264 video encoder has an encoding mode called Constant Rate Factor. In this mode a number (16, 17, etc.) defines the desired quality (lower means better quality and higher bitrate), and the encoder does not care about bitrate, only about keeping the rate factor constant. So the question is: why has nobody invented something similar for audio encoding (except lossyWAV, which needs too much bitrate for acceptable quality)?

I think every encoder with real VBR (not ABR) does that? LAME has -V (0-9), QT AAC has --tvbr (0-127), Vorbis has -q (-2 to 10). The bitrate may vary a lot with these settings between different songs/genres.

Mods, could we get a threadsplit for these quality level posts? I really think this deserves its own thread.

If you have a mixed-content file, then for an encoder to do a good job of targeting a bitrate for the whole file while providing "constant quality" it would have to do a two-pass type deal. It has no other way of knowing, when you ask for 64kbps, if e.g. half is effectively-mono speech (both channels identical) and half is stereo music and thus it can target 32kbps for the speech, 96kbps for the music, and give you basically the bitrate you asked for.
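To make the two-pass idea concrete, here is a rough sketch in Python. It is not how any real encoder does it. The content labels, the function name and the 1:3 cost ratio are my own assumptions, with the ratio simply taken from the 32 vs. 96 kbps figures above.

```python
# Relative bitrate cost per content type at equal perceived quality.
# Assumption: the 1:3 ratio is taken from the 32 vs. 96 kbps example figures.
RELATIVE_COST = {"speech_mono": 1.0, "music_stereo": 3.0}


def two_pass_allocation(segments, target_kbps):
    """segments: list of (content_type, seconds). Returns kbps per content type."""
    total_seconds = sum(seconds for _, seconds in segments)
    # Pass 1: measure the duration-weighted mean cost of the whole file.
    mean_cost = sum(RELATIVE_COST[kind] * seconds
                    for kind, seconds in segments) / total_seconds
    # Pass 2: pick the one scale factor (the "quality") whose average
    # lands on the requested bitrate, and apply it to every segment.
    scale = target_kbps / mean_cost
    return {kind: RELATIVE_COST[kind] * scale for kind, _ in segments}


# Half effectively-mono speech, half stereo music, 64 kbps requested.
rates = two_pass_allocation([("speech_mono", 300), ("music_stereo", 300)], 64)
print(rates)
```

The point is that the scale factor cannot be known until the whole file has been looked at, which is why a single pass cannot hit the target and keep quality constant.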

For almost all purposes, it'd be better to let the user specify the quality level. With a single user-specified quality level, a file that was all speech could come out as ~32kbps, a file that was all stereo music could come out as ~96kbps, and the mixed-content file above could come out as ~64kbps, with constant quality and without having to do two passes.
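The arithmetic behind those three numbers, as a small sketch. The table of per-content bitrates is hypothetical (it just reuses the figures from this post), and `average_kbps` is a made-up helper, not part of any encoder.

```python
# Assumption: one fixed quality level maps to these bitrates (kbps).
# The numbers are the illustrative ones from the post, not measurements.
QUALITY_RATES = {"speech_mono": 32, "music_stereo": 96}


def average_kbps(segments):
    """segments: list of (content_type, seconds). Single pass, no target."""
    total_seconds = sum(seconds for _, seconds in segments)
    total_kbit = sum(QUALITY_RATES[kind] * seconds for kind, seconds in segments)
    return total_kbit / total_seconds


all_speech = average_kbps([("speech_mono", 600)])
all_music = average_kbps([("music_stereo", 600)])
mixed = average_kbps([("speech_mono", 300), ("music_stereo", 300)])
print(all_speech, all_music, mixed)
```

Each segment is encoded at its own rate as it arrives, and the file's average bitrate is just whatever falls out at the end.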

The "VBR bitrate setting = quality level" idea we've heard so many times says an ideal VBR encoder is supposed to encode things at a constant quality level which averages out to the target bitrate across some generic ideal reference collection. But it really makes no sense to try to say how much of an ideal reference collection is mono speech. In the opus-tools suggestion thread, NullC mentioned the "bitrate setting is for fullband stereo equivalent quality" idea, i.e. considering the ideal reference collection to consist entirely of FB stereo music. As he said there, the downside is that someone encoding just mono speech ends up with their files encoded at ~1/3 of their target rate. If you shift the balance of the ideal collection you ameliorate that, but give those encoding music some of the opposite problem. Multichannel users have to guess at how their bitrate translates to a stereo-equivalent bitrate too.
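A quick sketch of what shifting the balance does to the numbers. Everything here is assumed for illustration: the 1:3 speech-to-music cost ratio comes from the ~1/3 figure above, and `resulting_rates` is a hypothetical helper.

```python
# Assumption: mono speech costs 1 unit and FB stereo music 3 units of
# bitrate at the same quality (from the "~1/3 of target rate" figure).
SPEECH_COST, MUSIC_COST = 1.0, 3.0


def resulting_rates(target_kbps, speech_share):
    """Bitrates of an all-speech and an all-music file when the quality
    level is calibrated so that a reference collection containing
    `speech_share` speech (0..1) averages out to `target_kbps`."""
    reference_cost = speech_share * SPEECH_COST + (1 - speech_share) * MUSIC_COST
    scale = target_kbps / reference_cost
    return scale * SPEECH_COST, scale * MUSIC_COST


for share in (0.0, 0.5):
    speech, music = resulting_rates(96, share)
    print(f"speech share {share}: speech {speech} kbps, music {music} kbps")
```

With an all-music reference, a 96 kbps target gives speech 32 kbps. With a half-and-half reference, speech gets 48 kbps but music overshoots to 144 kbps, which is the opposite problem.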

The user specified target bitrate thus becomes sufficiently unhinged from the end result's bitrate and quality that it would no longer make sense to tell people it's a target bitrate; instead you just call it a quality mode and provide some kind of table of what range of result bitrates to expect given channel count, bandwidth, and speech vs. music.

Even if your content is not mixed but is in separate files, having such a quality setting would let people encode mixed collections of files, whether just tracks of the same CD (changing quality settings for different tracks when ripping = ugh) or their entire audio collection, with a single setting, without worrying that they're either bloating the speech files or starving the music.

QUOTE (saratoga @ Feb 14 2013, 13:22)

VBR gives you constant quality. Having a file with two different quality levels is not what VBR is meant to do.

But if you have a file with 64kbps effectively-mono speech and 64kbps stereo music, those are two vastly different quality levels. Being adaptive and constant-quality rather than constant-bitrate most definitely is, as you admit, what VBR is meant to do.

QUOTE (saratoga @ Feb 14 2013, 13:22)

I think to have that work you'd have to have some kind of filter or processing that attempted to classify the signal as audio or music and then adjusted the encoder's parameters from frame to frame. It's probably not too hard to do, but it's also a very strange thing to want to implement so maybe no one has done so.

The Opus encoder already tries, from frame to frame, to classify the audio as speech or music, determine its bandwidth, and determine channel separation. Wanting this analysis to show up in giving lower bitrates for speech and higher bitrates for music is not very strange or even slightly strange.