Hi all,

I've been playing with Opus for a while and really like it, but some signals trip it up at bitrates as high as 128 kbps, and occasionally even higher. I even created a very simple sample that tripped it up at 160 kbps and above, though I'll need to do more testing to make sure the codec is to blame. If it is, I'll post the samples here along with ABX results so the problem can hopefully be fixed. I'm probably bending a rule by claiming I can hear these artifacts without posting evidence, but again, I have to test further to make sure this is legitimate.

First, though, I wanted to run these samples past the experimental builds that seem to address these issues, but I don't know where to obtain them. I found one once, but I can't remember the version or where I got it. I'm looking for an opusenc utility like the one in the command-line opus-tools. I really hope these builds are being made somewhat regularly, because I have no clue how to build from source.

Also, if these builds really do improve quality, and I think they will, will the changes be folded back into the main release, the way aoTuV eventually was into the official Vorbis years ago?

One more interesting thought: could the research that went into developing Opus be used to refine a storage-only codec like Vorbis? Perhaps new coding strategies based on CELT, but ones that work well in high-delay situations. I know Opus is great for storage as is, and was intended to be, but if low-delay codecs are at such a disadvantage, it makes sense to wonder what something Opus-like could do without that restriction. It's probably impractical, at least for now, but it's an interesting thought.

Thanks for your answers, and have a good day!

The quality-versus-latency tradeoff at 64 kbps suggests that, relative to 20 ms frames, 10 ms frames require about 10% higher bitrate, 5 ms frames about 32.5% higher, and 2.5 ms frames about 75% higher for the same PEAQ score. This limited data set makes it look as though we're approaching an asymptote of marginal gains, and the bitrate reduction from moving to 40 ms frames would be pretty small for typical signals. However, it's plausible that long-frame efficiency would be greater for highly tonal signals, or signals with multiple simultaneous tones. Is the advantage worth it, given the typical nature of music?
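As a back-of-envelope check, here's a small sketch that just turns those quoted penalties into absolute bitrates (taking the percentages above at face value; these are rough figures from a limited data set, not new measurements):

```python
# Bitrate needed for the same PEAQ score as 64 kbps at 20 ms frames,
# using the (rough) frame-size penalties quoted above.
BASE_KBPS = 64.0
PENALTY = {20.0: 0.0, 10.0: 0.10, 5.0: 0.325, 2.5: 0.75}  # fractional increase

required = {ms: BASE_KBPS * (1.0 + p) for ms, p in PENALTY.items()}
for ms in sorted(required, reverse=True):
    print(f"{ms:>5} ms frames -> {required[ms]:.1f} kbps")
# 20.0 ms -> 64.0, 10.0 ms -> 70.4, 5.0 ms -> 84.8, 2.5 ms -> 112.0 kbps
```

Note that each halving of the frame size roughly doubles the incremental penalty (10% → 32.5% → 75%), which is why extrapolating the other way, toward 40 ms, looks like diminishing returns.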

I expect some of the people like Jean-Marc and NullC can add experience to confirm or contradict this.

A lot of the big gains over Vorbis come in other areas (e.g. the way Opus encodes explicit band energy is more efficient and produces better results than the noise-normalization approach discovered late in Vorbis development).

Right now, there's scope for an offline mode in Opus, where music detection can behave differently when not live-streaming. For example, it could check more carefully the few seconds before the existing detector activates, perhaps with a longer FFT to catch low-frequency tones that the short lookahead of the current live mode would miss. Similarly, there's scope to make bandwidth detection more consistent offline, with greater lookahead and the option to look backwards.

This could offer big improvements for mixed material at lowish bitrates (around 24-48 kbps), such as audiobooks and podcasts with incidental music. It would also be feasible to set two different bitrates, such as 24 kbps mono SILK for speech and 64 kbps VBR stereo CELT when music is detected, to give highly bitrate-efficient, high-quality podcasts, with commensurate savings in server bandwidth costs for the low-budget, often amateur podcast creators who have to pay the likes of Libsyn for their many gigabytes of monthly use and beg for donations from listeners.

Some of this could be built into an audio editor/DAW too, where the person compiling the podcast could label certain tracks as speech or music, allowing the render to signal the required mode to the encoder. But automated, accurate music detection would let people use the editor/DAW of their choice and just send the result to the encoder without worrying.
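To make the two-bitrate podcast idea concrete, here's a minimal sketch of just the planning step: collapsing per-second speech/music decisions into segments tagged with the encoder settings described above. Everything here is hypothetical illustration; the detector itself is stubbed out as a plain list of booleans, where a real offline tool would use the encoder's music detector with extra lookahead.

```python
# Hypothetical encoder settings from the idea above: low-rate mono SILK
# for speech, higher-rate stereo CELT for music.
SPEECH = {"bitrate_kbps": 24, "channels": 1, "mode": "SILK"}
MUSIC = {"bitrate_kbps": 64, "channels": 2, "mode": "CELT"}

def plan_segments(is_music, hop_sec=1.0):
    """Collapse per-hop music/speech labels into (start_sec, end_sec,
    settings) segments that an offline encoding pass could act on."""
    segments = []
    start = 0
    for i in range(1, len(is_music) + 1):
        # Close the current segment at the end, or when the label flips.
        if i == len(is_music) or is_music[i] != is_music[start]:
            settings = MUSIC if is_music[start] else SPEECH
            segments.append((start * hop_sec, i * hop_sec, settings))
            start = i
    return segments

# Example: 3 s speech intro, 2 s music sting, 2 s speech.
labels = [False, False, False, True, True, False, False]
for start, end, cfg in plan_segments(labels):
    print(f"{start:4.1f}-{end:4.1f} s: {cfg['mode']} @ {cfg['bitrate_kbps']} kbps")
```

A real implementation would also want hysteresis so the mode doesn't flap on short music stings, but the segment list is the part an editor/DAW could just as easily produce from manual track labels.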

There are other ideas being modelled for a few years down the line. Some of the Xiph Ghost ideas (Monty's sinusoidal modelling), like separating the tonal and transient parts of the signal and encoding each in its most efficient manner, require a lot of processing from today's CPUs, but they might be viable in a few years if they can be made to work well enough, and might be a better use of developers' time, as NullC said.

I'm sure it's the sort of thing the Opus collaborators consider from time to time. I think they've done an admirable job of ensuring that Opus came to fruition first of all, and that it continues to be tuned for optimum performance, so I'd be reluctant to ask them to shift their priorities, which seem very well placed to me. Opus has a real chance of making a wider impact than Vorbis because it also addresses the unmet needs of interactive uses of all kinds and covers the bulk of the useful internet-packetized bitrate range about as well as or better than any other codec to date. It can gain a foothold in numerous potential killer applications and bring compatibility to many devices through them.

QUOTE (Dynamic @ May 7 2013, 14:44)

The quality versus latency tradeoff at 64 kbps for 20ms frames seems to indicate that 10ms frames require about 10% higher bitrate, 5ms frames need about 32.5% higher bitrate and 2.5ms frames need 75% higher bitrate for same PEAQ score. This limited data set makes it look as though we're approaching an asymptote of marginal gains and the bitrate reduction is going to be pretty small for typical signals if we move to 40ms. However, it's plausible that the long-frame efficiency would be greater for highly tonal signals or signals with multiple tones at the same time. Is the advantage worth it given the typical nature of music?

I expect some of the people like Jean-Marc and NullC can add experience to confirm or contradict this.

Using 40 ms CELT frames would not be useful except at rates below 48 kb/s -- they just add too many temporal artefacts (Vorbis still uses them only below 40 kb/s or so). What we *could* have done to improve transient performance is simply add support for full-overlap windows on 20 ms frames. The problem is that it would have made the code more complicated and would have been bad for latency (remember that Opus is designed for interactive applications).

QUOTE (Dynamic @ May 7 2013, 14:44)

Right now, there's scope for an offline mode in Opus, where music detection when not live streaming can be modified, for example, to check more carefully the few seconds before the existing detector activates, perhaps with a longer FFT to detect lower frequency tones that would get missed by the short lookahead of the current, live mode.

This is already implemented in git, though opus-tools does not support it out-of-the-box yet (it's an experimental branch).

QUOTE (Dynamic @ May 7 2013, 14:44)

There are other ideas for a few years down the line that are being modelled. Some of the Xiph Ghost ideas, like separating tonal from transient parts of the signal and encoding them separately in their most efficient manner, require a lot of processing from today's CPUs, but might be viable in a few years if they can be made to work well enough, and might be a better use of developers time, as NullC said (Monty's sinusoidal modelling).

The problems with Ghost go far beyond "not enough CPU". Separating tones from noise and transients is *really* hard -- sometimes you're not even sure what a tone really means! And you have to do a really good job for such an algorithm to work at all. That's very different from an algorithm like CELT, where you can write an incredibly dumb encoder that still gives pretty good quality (and easily outperforms MP3).
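A toy illustration of why "what is a tone?" is a genuinely hard question: a steady sine concentrates all its energy in a single DFT bin, but a gently gliding pitch (still one "tone" to the ear) smears across many bins, so a naive spectral peak-picker can no longer cleanly pull it out. This stdlib-only sketch is just that illustration; it is not how Ghost actually worked.

```python
# Compare spectral concentration of a steady tone vs a gliding one.
import cmath
import math

def dft_mag(x):
    """Magnitude spectrum (first half) of a real signal, naive O(n^2) DFT."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2)]

n = 256
# Steady sine at exactly bin 40: all energy lands in one bin.
steady = [math.sin(2 * math.pi * 40 * t / n) for t in range(n)]
# Linear glide from bin 35 to bin 45 over the window: energy smears.
glide = [math.sin(2 * math.pi * (35 * t + 5 * t * t / n) / n) for t in range(n)]

spread = {}
for name, sig in (("steady", steady), ("glide", glide)):
    mags = dft_mag(sig)
    peak = max(mags)
    # How many bins carry at least half the peak magnitude?
    spread[name] = sum(1 for m in mags if m > peak / 2)
    print(f"{name}: {spread[name]} bin(s) above half the peak magnitude")
```

Even this clean, noiseless glide defeats the single-bin notion of a tone; real music adds vibrato, noise, and overlapping partials on top, which is part of why the separation has to be done really well before it helps at all.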

QUOTE (Dynamic @ May 7 2013, 14:44)

Right now, there's scope for an offline mode in Opus, where music detection when not live streaming can be modified, for example, to check more carefully the few seconds before the existing detector activates, perhaps with a longer FFT to detect lower frequency tones that would get missed by the short lookahead of the current, live mode.

QUOTE (Jean-Marc)

This is already implemented in git, though opus-tools does not support it out-of-the-box yet (it's an experimental branch).

Is it already implemented in the recently posted "new test build"? Thanks for all the answers on the frame sizes, btw. I didn't really get the point about lower frequencies being detected with more lookahead, though, or what that's supposed to do when there's no full overlap between the frames anyway. Can someone explain (or give a link)?