I'm looking for suggested features to get into opus-tools prior to the libopus 1.1 release.

Before I get 1001 suggestions for it: one frequently requested feature, FLAC input in opusenc, was recently added. I'm contemplating changing opusdec to use the new opusfile library, which would give it seeking and integrated http(s) streaming support. Also already on my todo list are default comment packet padding, so that updating metadata doesn't require rewriting the files, and a replaygain tool.

Some people would really like it if opusenc/opusdec accepted multiple input files (e.g. opusenc *.flac), but the implicit output file naming is pretty un-unixy and would break the interface, and I got flamed all to heck the last time I changed the opusenc interface, so I'm not sure if or how I want to accommodate that usage.
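Until something like that exists, the batch case can be handled outside opusenc by a small wrapper that names each output explicitly. Here is a minimal sketch, assuming the usual `opusenc [options] input output` invocation and its `--bitrate` option; the helper name `build_opusenc_commands` and the "same name with an .opus extension" rule are illustrative choices, not anything opusenc does itself.

```python
from pathlib import Path


def build_opusenc_commands(inputs, bitrate_kbps=None):
    """Return one opusenc argument list per input file.

    Each output name is made explicit (input name with the extension
    replaced by .opus), so nothing relies on implicit naming.
    """
    commands = []
    for name in inputs:
        src = Path(name)
        dst = src.with_suffix(".opus")  # foo.flac -> foo.opus
        cmd = ["opusenc"]
        if bitrate_kbps is not None:
            cmd += ["--bitrate", str(bitrate_kbps)]
        cmd += [str(src), str(dst)]
        commands.append(cmd)
    return commands
```

Each returned list can be handed to `subprocess.run(cmd, check=True)`, for example over `sorted(glob.glob("*.flac"))`. Building the commands separately from running them makes it easy to print and check them first.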

There are a few things that I'd like to see. In opusenc, an option to force SILK or CELT. Goldwave already does this, and when I encode a 24kbps file with SILK in Goldwave, it sounds much better than the opusenc file. This changes when you get down to 16kbps.

Going along with the voice/music detection in the 1.1-alpha, I don't know if it would be feasible, but it would be nice to dynamically scale the bitrate between detected music and voice portions of a clip with user specified bitrates for each, where the user might choose to set a bitrate of, say, 24kbps for voice, and a higher bitrate of 128 kbps for music.
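To get a feel for what such a two-rate scheme would cost overall, the resulting average bitrate is just a duration-weighted mean. A rough sketch, assuming the clip has already been split into labelled segments; the function name and the segment list are hypothetical and do not correspond to any existing opusenc feature.

```python
def average_bitrate_kbps(segments, rates_kbps):
    """Duration-weighted average bitrate for a clip.

    segments:   list of (label, duration_seconds) pairs, e.g. ("voice", 90.0)
    rates_kbps: mapping from label to the bitrate chosen for that label
    """
    total_seconds = sum(duration for _, duration in segments)
    if total_seconds <= 0:
        raise ValueError("segments must cover a positive duration")
    total_kbits = sum(rates_kbps[label] * duration for label, duration in segments)
    return total_kbits / total_seconds


# Hypothetical clip: 90 s of speech at 24 kbps and 30 s of music at 128 kbps.
example = average_bitrate_kbps(
    [("voice", 90.0), ("music", 30.0)],
    {"voice": 24, "music": 128},
)
```

For that example the average comes out at 50 kbps, since (24 × 90 + 128 × 30) / 120 = 50.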

And in opusinfo, a printout or graph of the bitrate vs time in the file, or maybe just a printout of the bitrate each frame, or the average bitrate at a user specified interval, so you could get a listing of the average bitrate every second, or whatever the user might specify. This might be useful for analyzing how your audio is being encoded.

Wrote much of this almost a week ago; realized I still had the unfinished reply sitting around, and though it's late it still seemed worth saying.

QUOTE (moosehunter @ Feb 9 2013, 21:43)

There are a few things that I'd like to see. In opusenc, an option to force SILK or CELT. Goldwave already does this, and when I encode a 24kbps file with SILK in Goldwave, it sounds much better than the opusenc file. This changes when you get down to 16kbps.

I find it somewhat odd that you're getting better results by forcing SILK-only at those rates; it'd normally be using hybrid. Once you get down to 16kbps it'll usually be doing SILK-only even if you don't force it. Have you looked at the output bandwidth and verified that it's only wideband? ABC/HR type blind test data confirming your preference for forced SILK would be interesting.

The mode decisions in the present encoder are fairly reasonable and will continue to be refined; thus I don't think forcing a mode is usually going to be beneficial. Also, the developers are understandably reluctant to give people options that either give them a false sense of control or allow them to shoot themselves in the foot. Nevertheless, I do want to echo the request for ways to constrain the encoder's mode and bandwidth decisions with opusenc, because I do think there are legitimate use cases. For instance, for playback on very constrained devices (e.g. many Rockbox targets), it'd be nice to be able to restrict the encoder for performance and/or battery life reasons.

With the 1.0.x encoder you could just choose a bitrate which would reliably give you the mode and bandwidth you wanted, but the adaptability of the master branch eliminates that predictability.

Another use case: I have speech recordings that the master branch encoder encodes as fullband, or for which it keeps switching between FB and SWB, but the content above 12kHz seems unhelpful. Some consonants, applause, and some noises (like dropping stuff on the podium) have appreciable HF content, but I think they're adequately represented by their energy below 12kHz. There are some noises in a few recordings (like background silverware) that are audibly affected by a 12kHz cutoff, but that's actually an improvement, since the noise is made slightly less distracting. Outside of that I don't know that I can even ABX 12kHz lowpassed versions of this material. I'd rather have those bits go to lower frequencies, and though the FB<->SWB transitions often aren't audible, I think there's more danger of such transitions being audible than there is of the >12kHz content being missed. I think this situation is pretty common for speech, and while I could of course pass the encoder versions that I've lowpassed or resampled myself, it would be convenient to be able to just tell the encoder to not do FB.

QUOTE

Going along with the voice/music detection in the 1.1-alpha, I don't know if it would be feasible, but it would be nice to dynamically scale the bitrate between detected music and voice portions of a clip with user specified bitrates for each, where the user might choose to set a bitrate of, say, 24kbps for voice, and a higher bitrate of 128 kbps for music.

This seems to come up frequently; one discussion has been moved from the old, huge "ready for testing" thread to its own thread.

QUOTE

And in opusinfo, a printout or graph of the bitrate vs time in the file, or maybe just a printout of the bitrate each frame, or the average bitrate at a user specified interval, so you could get a listing of the average bitrate every second, or whatever the user might specify. This might be useful for analyzing how your audio is being encoded.

opusenc's --save-range already dumps each frame's framelength and size, mode and bandwidth info, etc. Not the most handy format for visualization etc, and the only documentation that tells you what's what is reading the code, but the info is there for the parsing. Obviously that doesn't help if you're trying to analyze files you encoded with some other program or files you received from a third party.
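Once the per-frame durations and sizes have been pulled out of such a dump, averaging over a user-chosen interval is straightforward. A minimal sketch, assuming the frames have already been parsed into (duration in ms, size in bytes) pairs; the parsing itself is left out because the dump layout isn't documented, and `bitrate_per_interval` is just an illustrative name.

```python
def bitrate_per_interval(frames, interval_s=1.0):
    """Average bitrate in kbit/s for each consecutive interval.

    frames: iterable of (duration_ms, size_bytes) pairs in stream order.
    A frame is counted in the interval in which it starts; a final
    partial interval is averaged over its actual length.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    interval_ms = interval_s * 1000.0
    bits = []      # bits accumulated per interval
    t_ms = 0.0     # start time of the current frame, in milliseconds
    for duration_ms, size_bytes in frames:
        index = int(t_ms // interval_ms)
        while len(bits) <= index:
            bits.append(0)
        bits[index] += size_bytes * 8
        t_ms += duration_ms
    rates = []
    for i, b in enumerate(bits):
        span_ms = min(interval_ms, t_ms - i * interval_ms)
        rates.append(b / span_ms)  # bits per millisecond equals kbit/s
    return rates
```

For example, fifty 20 ms frames of 60 bytes followed by fifty of 120 bytes give one value per second. The resulting list can be printed or pasted into a spreadsheet for plotting.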

QUOTE

I find it somewhat odd that you're getting better results by forcing SILK-only at those rates; it'd normally be using hybrid. Once you get down to 16kbps it'll usually be doing SILK-only even if you don't force it. Have you looked at the output bandwidth and verified that it's only wideband? ABC/HR type blind test data confirming your preference for forced SILK would be interesting.

Ahh, that was the difference. Goldwave was encoding it as super-wideband. The difference was usually night and day, depending on the voice of the person. One of the samples with the least difference between the two was the speech sample on the opus example page. I attached the two different encodes.

Given that discovery, I'd agree with the option to let the user choose the bandwidth of the output file.

QUOTE (jensend @ Feb 15 2013, 13:22)

opusenc's --save-range already dumps each frame's framelength and size, mode and bandwidth info, etc. Not the most handy format for visualization etc, and the only documentation that tells you what's what is reading the code, but the info is there for the parsing. Obviously that doesn't help if you're trying to analyze files you encoded with some other program or files you received from a third party.

Well, just feed the data into Excel, and it displays it just fine. And never mind my question about documentation; I didn't read this before I posted that.