Opus combines the speech-oriented linear predictive coding SILK algorithm and the lower-latency, MDCT-based CELT algorithm, switching between or combining them as needed for maximal efficiency.[4] Bitrate, audio bandwidth, complexity, and algorithm can all be adjusted seamlessly in each frame. Opus has the low algorithmic delay (26.5 ms by default)[8] necessary for use as part of a real-time communication link, permitting natural conversation, networked music performances, and live lip sync; by trading off quality or bitrate, the delay can be reduced to 5 ms. Its delay is exceptionally low compared to competing codecs, which require well over 100 ms, yet Opus performs very competitively with these formats in terms of quality per bitrate.[9]

Opus has very short latency (26.5 ms by default), which makes it suitable for real-time applications such as telephony, Voice over IP and videoconferencing; research by Xiph led to the CELT codec, which allows the highest quality while maintaining low delay. In any Opus stream, the bitrate, bandwidth, and delay can be continually varied without introducing any distortion or discontinuity; even mixing packets from different streams will cause a smooth change, rather than the distortion common in other codecs. Unlike Vorbis, Opus does not require large codebooks for each individual file, making it more efficient for short clips of audio and more resilient.

As an open standard, the algorithms are openly documented, and a reference implementation (including the source code) is published. Broadcom and the Xiph.Org Foundation own software patents on some of the CELT algorithms, and Skype Technologies/Microsoft own some on the SILK algorithms; each offers a royalty-free perpetual license for use with Opus, reserving only the right to make use of their patents to defend against infringement suits of third parties. Qualcomm, Huawei, France Telecom, and Ericsson have claimed that their patents may apply, which Xiph's legal counsel denies, and none have pursued any legal action.[10][11] The Opus license automatically and retroactively terminates for any entity that attempts to file a patent suit.

The Opus format is based on a combination of the full-bandwidth CELT format and the speech-oriented SILK format, both heavily modified: CELT is based on the MDCT that most music codecs use, using CELP techniques in the frequency domain for better prediction, while SILK uses linear predictive coding (LPC) and an optional Long-Term Prediction filter to model speech. In Opus, both were modified to support more frame sizes, as well as further algorithmic improvements and integration, such as using CELT's range encoder for both types. To minimize packet overhead at low bitrates, if latency is not as pressing, SILK has support for packing multiple 20 ms frames together, sharing context and headers; SILK also allows Low Bit-Rate Redundancy (LBRR) frames, allowing low-quality packet loss recovery. CELT includes both spectral replication and noise generation, similar to AAC's SBR and PNS, and can further save bits by filtering out all harmonics of tonal sounds entirely, then replicating them in the decoder.[12] Better tone detection is an ongoing project to improve quality.

The format has three different modes: speech, hybrid, and CELT. The basic speech mode is pure SILK, up to 8 kHz, while the hybrid speech mode combines SILK for the speech and uses CELT for the frequency range above 8 kHz, allowing an easy fallback to pure SILK at very low bitrates. The third mode is pure-CELT, designed for general audio. SILK is inherently VBR and cannot hit a bitrate target, while CELT can always be encoded to any specific number of bytes, enabling hybrid and CELT mode when CBR is required.
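The constraints in this paragraph can be read as a small decision rule. The sketch below is only a toy restatement of that reading, not the mode-selection logic of libopus; the function name `candidate_modes` and its parameters are hypothetical, and the 8 kHz threshold is taken directly from the text.

```python
SILK_MAX_BANDWIDTH_HZ = 8000  # SILK covers audio up to 8 kHz (from the text)

def candidate_modes(bandwidth_hz, needs_cbr):
    """Return the modes that the rules in the paragraph would permit.

    Hypothetical helper: a literal reading of the text, not libopus behaviour.
    """
    modes = []
    # Pure SILK only covers up to 8 kHz and, per the text, cannot hit an
    # exact bitrate target, so it is excluded when CBR is required.
    if bandwidth_hz <= SILK_MAX_BANDWIDTH_HZ and not needs_cbr:
        modes.append("silk")
    # Hybrid adds CELT for the range above 8 kHz, so it only applies there.
    if bandwidth_hz > SILK_MAX_BANDWIDTH_HZ:
        modes.append("hybrid")
    # CELT can always be encoded to a specific number of bytes.
    modes.append("celt")
    return modes

print(candidate_modes(8000, needs_cbr=False))
print(candidate_modes(20000, needs_cbr=True))
```

Under this reading, a CBR stream limited to 8 kHz of audio bandwidth is left with CELT as the only candidate.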

The reference implementation is written in C and compiles on hardware architectures with or without a floating-point unit, although floating-point is currently required for audio bandwidth detection (dynamic switching between SILK, CELT, and hybrid encoding) and most speed optimizations.

Opus was originally specified for encapsulation in Ogg containers, specified as audio/ogg; codecs=opus, and for Ogg Opus files the .opus filename extension is recommended.[2] Matroska,[13] WebM,[14] and MPEG-TS[15] all officially support Opus streams.

Opus allows the following bandwidths during encoding. Opus does not require the input sampling rate for encoding or the output sampling rate in decoding to correspond to the bandwidth chosen. For example, audio can be input at 16 kHz for encoding yet be specified to generate narrowband audio.[16]
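As an illustration of how the coded bandwidth is decoupled from the sampling rate, the sketch below lists which bandwidths a given input rate can actually fill. The bandwidth names and cutoff values are taken from RFC 6716 rather than from the text above, and the helper `usable_bandwidths` is a hypothetical name; the Nyquist check is an illustrative assumption, not a restriction imposed by the encoder.

```python
# Audio bandwidths defined for Opus (upper edge in Hz, values from RFC 6716).
OPUS_BANDWIDTHS_HZ = {
    "narrowband": 4000,
    "mediumband": 6000,
    "wideband": 8000,
    "super-wideband": 12000,
    "fullband": 20000,
}

def usable_bandwidths(input_rate_hz):
    """Bandwidths whose upper edge fits under the Nyquist limit of the input.

    Hypothetical helper: the encoder may be asked for any bandwidth, but the
    input cannot contain content above half its sampling rate.
    """
    nyquist_hz = input_rate_hz / 2
    return [name for name, edge_hz in OPUS_BANDWIDTHS_HZ.items()
            if edge_hz <= nyquist_hz]

# 16 kHz input can carry up to wideband, yet narrowband may still be chosen.
print(usable_bandwidths(16000))
```

This matches the example in the text: 16 kHz input has content up to 8 kHz, and the encoder can still be told to produce only narrowband audio.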

Opus was proposed for the standardization of a new audio format at the IETF, which was eventually accepted and granted by the codec working group. It is based on two initially separate standard proposals from the Xiph.Org Foundation and Skype Technologies S.A. (now Microsoft). Its main developers are Jean-Marc Valin (Xiph.Org, Octasic, Mozilla Corporation), Koen Vos (Skype), and Timothy B. Terriberry (Xiph.Org, Mozilla Corporation). Among others, Juin-Hwey (Raymond) Chen (Broadcom), Gregory Maxwell (Xiph.Org, Wikimedia), and Christopher Montgomery (Xiph.Org) were also involved.

The development of the CELT part of the format goes back to thoughts on a successor for Vorbis under the working name Ghost. As a newer speech codec from the Xiph.Org Foundation, Opus replaces Xiph's older speech codec Speex, an earlier project of Jean-Marc Valin. CELT has been worked on since November 2007.

The SILK part has been under development at Skype since January 2007 as the successor of their SVOPC, an internal project to make the company independent from third-party codecs like iSAC and iLBC and respective license payments.

In March 2009, Skype suggested the development and standardization of a wideband audio format within the IETF. Nearly a year passed with much debate on the formation of an appropriate working group.[17] Representatives of several companies which were taking part in the standardization of patent-encumbered competing formats stated objections against the start of the standardization process for a royalty-free format: representatives of Polycom and Ericsson—the creators and licensors of G.719—as well as France Télécom, Huawei and the Orange Labs (department of France Télécom), which were involved in the creation of G.718. The working group finally formed in February 2010, and even the corresponding Study Group 16 from the ITU-T pledged to support its work.

In July 2010, a prototype of a hybrid format was presented that combined the two proposed format candidates SILK and CELT. In September 2010, Opus was submitted to the IETF as a proposal for standardization. For a short time the format went under the name of Harmony before it got its present name in October 2010.[18] At the beginning of February 2011, the bitstream format was tentatively frozen, subject to last changes.[19] Near the end of July 2011, Jean-Marc Valin was hired by the Mozilla Corporation to continue working on Opus.[20] In November 2011, the working group issued the last call for changes on the bitstream format. The bitstream has been frozen since January 8, 2012.[21] On July 2, 2012, Opus was approved by the IETF for standardization.[22] The reference software entered release candidate state on August 8.[23] The final specification was released as RFC 6716 on September 10, 2012,[24][25] and versions 1.0 and 1.0.1 of the reference implementation libopus were released the day after.

On July 11, 2013, libopus 1.0.3 brought bug fixes and a new Surround sound API that improves channel allocation and quality, especially for LFE.[26]

libopus 1.1.1 was released on November 26, 2015, and 1.1.2 on January 12, 2016, both adding speed optimizations and bug fixes. Version 1.1.3 followed on July 15, 2016, with bug fixes, optimizations, documentation updates and experimental Ambisonics work. libopus 1.2 Beta was released on May 24, 2017.

Comparison of coding efficiency between Opus and other popular audio formats

Opus has been shown to have excellent quality,[9] and at higher bit rates, it turns out to be competitive with audio formats with much higher delay, such as HE-AAC and Vorbis.[31]

In listening tests around 64 kbit/s, Opus shows superior quality compared to HE-AAC codecs, which were previously dominant due to their use of the patented spectral band replication (SBR) technology.[32][6] In listening tests around 96 kbit/s, Opus shows slightly superior quality compared to AAC and significantly better quality compared to Vorbis and MP3.[7]

Opus has very low algorithmic delay,[4] a necessity for use as part of a low-audio-latency communication link, which can permit natural conversation, networked music performances, or lip sync at live events. Total algorithmic delay for an audio format is the sum of delays that must be incurred in the encoder and the decoder of a live audio stream regardless of processing speed and transmission speed, such as buffering audio samples into blocks or frames, allowing for window overlap and possibly allowing for noise-shaping look-ahead in a decoder and any other forms of look-ahead, or for an MP3 encoder, the use of bit reservoir.[33]

Total one-way latency below 150 ms is the preferred target of most VoIP systems,[34] to enable natural conversation with turn-taking little affected by delay. Musicians typically feel in-time with up to around 30 ms audio latency,[35] roughly in accord with the fusion time of the Haas effect, though matching playback delay of each user's own instrument to the round-trip latency can also help.[36] It is suggested for lip sync that around 45–100 ms audio latency may be acceptable.[37]
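To put these targets next to the codec's own delay, the sketch below subtracts the Opus algorithmic delay (26.5 ms by default, 5.0 ms at minimum, as stated elsewhere in this article) from each target to show how much is left for network transit, buffering and audio hardware. The names `TARGETS_MS`, `OPUS_DELAYS_MS`, `remaining_budget_ms` and `budget_table` are hypothetical, and splitting the lip-sync range into a strict and a lenient bound is an assumption made for illustration.

```python
# Latency targets in milliseconds, taken from the paragraph above.
TARGETS_MS = {
    "VoIP one-way": 150.0,
    "ensemble music": 30.0,
    "lip sync (strict)": 45.0,
    "lip sync (lenient)": 100.0,
}

# Opus algorithmic delay in milliseconds, as quoted in this article.
OPUS_DELAYS_MS = {"default": 26.5, "restricted low delay": 5.0}

def remaining_budget_ms(target_ms, codec_delay_ms):
    """Latency left for everything other than the codec itself."""
    return target_ms - codec_delay_ms

def budget_table():
    """Remaining budget for every (use case, codec configuration) pair."""
    return {
        (use, config): remaining_budget_ms(target, delay)
        for use, target in TARGETS_MS.items()
        for config, delay in OPUS_DELAYS_MS.items()
    }

for (use, config), left in budget_table().items():
    print(f"{use:20s} {config:22s} {left:6.1f} ms left")
```

With the default configuration only 3.5 ms of the 30 ms music target remains for the rest of the chain, which is why the lower-delay settings matter for networked performance.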

Opus permits trading-off reduced quality or increased bitrate to achieve an even smaller algorithmic delay (5.0 ms minimum).[38] While the reference implementation's default Opus frame is 20.0 ms long, the SILK layer requires a further 5.0 ms lookahead plus 1.5 ms for resampling, giving a default delay of 26.5 ms. When the CELT layer is active, it requires 2.5 ms lookahead for window overlap to which a matching delay of 4.0 ms is added by default to synchronize with the SILK layer. If the encoder is instantiated in the special restricted low delay mode, the 4.0 ms matching delay is removed and the SILK layer is disabled, permitting the minimal algorithmic delay of 5.0 ms.[8]
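The arithmetic behind these figures can be checked with a short sketch. The component values are the ones quoted above; the 2.5 ms frame used for the minimum case is the shortest Opus frame size, which is an assumption not stated in the paragraph, and the function name `algorithmic_delay_ms` is hypothetical.

```python
# Delay components in milliseconds, as quoted in the paragraph above.
SILK_LOOKAHEAD_MS = 5.0
RESAMPLER_MS = 1.5
CELT_OVERLAP_MS = 2.5
CELT_MATCHING_DELAY_MS = 4.0

def algorithmic_delay_ms(frame_ms=20.0, restricted_low_delay=False):
    """Frame duration plus encoder look-ahead, in milliseconds."""
    if restricted_low_delay:
        # SILK is disabled and the matching delay is removed,
        # so only the CELT window overlap remains.
        lookahead_ms = CELT_OVERLAP_MS
    else:
        lookahead_ms = SILK_LOOKAHEAD_MS + RESAMPLER_MS
    return frame_ms + lookahead_ms

# The CELT overlap plus its matching delay equals the SILK-side look-ahead,
# which is what keeps the two layers synchronized in the default mode.
layers_aligned = (CELT_OVERLAP_MS + CELT_MATCHING_DELAY_MS
                  == SILK_LOOKAHEAD_MS + RESAMPLER_MS)

print(algorithmic_delay_ms())                                     # 26.5
print(algorithmic_delay_ms(2.5, restricted_low_delay=True))       # 5.0
print(layers_aligned)                                             # True
```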

The format and algorithms are openly documented and the reference implementation is published as free software. Xiph's reference implementation is called libopus and a package called opus-tools provides command-line encoder and decoder utilities. It is published under the terms of a BSD-like license. It is written in C and can be compiled for hardware architectures with or without a floating-point unit. The accompanying diagnostic tool opusinfo reports detailed technical information about Opus files, including information on the standard compliance of the bitstream format. It is based on ogginfo from the vorbis-tools and therefore — unlike the encoder and decoder — is available under the terms of version 2 of the GPL.

RFC 6716 contains the complete source code of the reference implementation, written in C.

The FFmpeg project[39] and the GStreamer project[40] have encoder and decoder implementations not derived from the reference library.

The libopus reference library has been ported to both C# and Java as part of a project called Concentus. These ports sacrifice performance for the sake of being easily integrated into cross-platform applications.[41]

Since version 3.13, Rockbox enables Opus playback on supported portable media players, including some products from the iPod series by Apple, devices made by iriver, Archos and Sandisk, and on Android devices using "Rockbox as an Application".[92][93] All recent Grandstream IP phones support Opus audio both for encoding and decoding. Obihai OBi1062, OBi1032 and OBi1022 IP phones all support Opus. Recent BlueSound wireless speakers support Opus playback.[94] Devices running Hiby OS, like the Hiby R3, are capable of decoding Opus files natively.