Just curious if there is a feeling that lossy encoding is near its limit of compression-to-quality efficiency, or is there a lot of life yet?

Is lossy research even a worthy pursuit with lossless compression codecs gaining support in large amount of CE devices?

To be clear, I'm a fan of lossy. It takes <30 seconds for me to purchase and download a very good quality MP3 file from Amazon. What I'm wondering is: are there any breakthroughs in sight that would increase compression efficiency significantly while maintaining the same level of quality?

Has the failure of SACD/DVD-Audio put us into a situation where lossless compression + ever-increasing internet download speeds = widespread lossless usage in the next 5 years? Because the CD format is the de facto standard, digital file sizes have not grown and will continue not to grow, but our ability to compress losslessly and then download quickly has grown.

Well, lossless downloads still take up quite a bit of space. There are still many devices out there with 1-4GB of storage, and lossless would fill such devices rather quickly. I don't know if you have been paying attention to the portable market, but hard drives were the standard while flash memory (or SSD) based players were mainly meant for physical activities. Hard drives were fast approaching 100+GB capacities, but now the market is switching towards flash memory/SSD, so capacities have gone down quite a bit. There is a 160GB hard drive based iPod or a 32GB SSD based iPod/Zen.

It is true that capacities are increasing overall, but we have taken a step back for now as the market is just beginning to transition to SSD/flash memory.

I don't feel that we are at the limit of lossy encoding, though. LAME keeps adding improvements with every new release, and the AAC encoders out there are still relatively new, especially when compared against LAME. I doubt we have even come close to the maturity of the iTunes, CoreAudio (pretty much the same as iTunes), or Nero AAC encoders. I don't know about LAME MP3, though. I remember reading not too long ago, when 3.97 went final, that some aspects of it could not be improved due to the limits of the MPEG-1 standard (or something along those lines).

Either way, lossy encoders still have lots of room for improvement. Don't forget about HE-AAC, which can be used for much lower bitrate encoding; I am sure there is still plenty of room for improvement there. So no, in my opinion we have yet to see the limits of lossy encoding, and I don't know if we ever will.

I think a more appropriate question would be whether we have reached the limits of lossless encoding. Lossless encoders seem to improve over time, and file sizes are cut down by 0.25-1MB per song. That is great and all, but the results are still a lot larger than lossy files. Most lossless files still take up about 2-2.5 times more space than a 320kbps AAC/MP3 file. I would love to see a lossless encoder where a 4-minute song didn't take up ~40MB but rather 20MB.
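The ballpark figures above follow from simple bitrate arithmetic. A rough sketch (the 1411.2 kbps figure is CD PCM; the 40% compression ratio is an illustrative assumption, not a measurement):

```python
def file_size_mb(bitrate_kbps, seconds=240):
    """Size in MB (1 MB = 1024*1024 bytes) of a 4-minute track at a given average bitrate."""
    return bitrate_kbps * 1000 * seconds / 8 / (1024 * 1024)

cd_pcm   = file_size_mb(1411.2)  # 44.1 kHz * 16 bit * 2 ch uncompressed PCM
lossless = cd_pcm * 0.60         # assuming a typical ~40% lossless compression ratio
mp3_320  = file_size_mb(320)

print(f"CD PCM:    {cd_pcm:.1f} MB")    # ~40 MB
print(f"lossless:  {lossless:.1f} MB")  # ~24 MB
print(f"320 kbps:  {mp3_320:.1f} MB")   # ~9 MB
```

Which lands right in the 2-2.5x range quoted above: a typical lossless rip is roughly 2.6 times the size of a 320kbps file for the same 4 minutes of audio.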

Our technology is making lossless a more approachable standard when it comes to portable listening, but lossy will be used for quite a bit of time. Lossless will become more of a standard once affordable SSD-based players hit 120+GB capacities. Right now you can purchase a 32GB iPod touch (which uses SSD storage) for $499 or a 32GB Zen for $299. So SSD/flash memory prices are still rather high, and we probably won't see 64GB units until sometime this fall.

Then again, my perspective is mainly from the portable audio side of things. I am not a lossy encoder developer, so I don't know if the limits of lossy encoding have been reached. I am just looking at LAME here and comparing it to other AAC encoders. LAME 1.0 came out in mid-1998, with version 3.0 coming out in May 1999. So LAME is about 10 years old yet they are still adding improvements. Apple's AAC encoder was first made public in 2003, and I think Nero's AAC encoder first came out in 2005 (someone please correct me on this, as I know a free public version came out in May 2006, but I don't know when Nero started shipping their AAC encoder with their software). So theoretically, these AAC encoders still have a lot of room for improvement.

Once you evolve into the 5.1 192kHz 24-bit area, the order of magnitude difference between lossy and lossless gets bigger again.

The next big evolution will be 5.1 or 7.1 at 32 or 48kbps. The technology exists and is standardized. Alternatively, we will see scalable codecs which combine lossy with lossless.

The push to improve is IMHO not so big, if you just look at MP3 popularity versus other formats.

Lossy makes some sense from a battery perspective, although I'm not sure how that plays out in practice (I'm sure people on this site have tried!).

QUOTE (kornchild2002 @ Apr 19 2008, 09:24)

I don't know about LAME MP3, though. I remember reading not too long ago, when 3.97 went final, that some aspects of it could not be improved due to the limits of the MPEG-1 standard (or something along those lines).

Every codec developer is always constrained by what the standard allows. This doesn't mean the encoder can't be improved!

QUOTE

Then again, my perspective is mainly from the portable audio side of things. I am not a lossy encoder developer, so I don't know if the limits of lossy encoding have been reached. I am just looking at LAME here and comparing it to other AAC encoders. LAME 1.0 came out in mid-1998, with version 3.0 coming out in May 1999. So LAME is about 10 years old yet they are still adding improvements. Apple's AAC encoder was first made public in 2003, and I think Nero's AAC encoder first came out in 2005 (someone please correct me on this, as I know a free public version came out in May 2006, but I don't know when Nero started shipping their AAC encoder with their software). So theoretically, these AAC encoders still have a lot of room for improvement.

Nero AAC has its roots in Psytel AAC, which is much older. Apple is based on Dolby AAC, I think.

Every codec developer is always constrained by what the standard allows. This doesn't mean the encoder can't be improved!

I guess that was kind of my point, but I focused more on how old an audio encoder is. LAME is still providing increases in audio quality, yet the MPEG-1 Layer III standard is rather old and so is the LAME encoder. I don't know if we will ever reach full efficiency with lossy encoders.

QUOTE (Garf @ Apr 19 2008, 02:01)

Nero AAC has its roots in Psytel AAC, which is much older. Apple is based on Dolby AAC, I think.

Interesting, that is some good information to know. I consulted the wiki before posting, but there wasn't really anything about either iTunes AAC or Nero AAC.

I don't think we're near the end for either. Even though MP3 is still the most common (maybe .m4a with the iPod's popularity), there are better compression codecs. You can also see it in video - .avi is slowly being replaced by .mkv files and H.264 encoding. On the lossless side, FLAC and WavPack are still improving, though as Kornchild2002 stated (great post BTW), somewhat slowly. You can compress a .wav by 40-45%. However, that's still about 3x the size of a 320kbps MP3 file, so there's a significant gap between the two.

Kornchild2002 really hit the real issue IMO. Flash memory size is increasing rapidly while costs are still going down. I've seen 16GB USB flash drives for $25-$30 with rebate. 2GB is now fairly small for a portable flash player - 4GB & 8GB are more common. While that may not be enough for most people to use lossless, in 5 years I wouldn't be surprised if 128GB was common. That should be more than enough storage for lossless files.

Even with bigger storage, I think it all comes down to widespread adoption. SACD & DVD-Audio failed (IMO) because most people just didn't care enough about very-high-quality or 5.1 music to justify the higher costs of the players and discs. I know a lot of people who think a $500 pre-packaged surround sound system sounds just fine and that I'm crazy for spending much more on a set of Paradigm speakers. It's the same reason that almost all music downloads are 128kbps, or 256kbps at best (if you can find it). Frankly, I absolutely HATE the fact that I still can't buy & DL lossless music online. If I have to pay $1.50 or even $2 instead of $1, fine, but don't give me anything but true CD quality. The point is that the vast majority of people are satisfied with current music compression technology, so there's not much to drive further development.

My guess is that over the next few years, both HDD & portable storage will increase more rapidly than any improvements in lossless or lossy compression. As that happens, lossless becomes an attractive option in more areas. Personally, I hope lossy (and lossless) compression continues to advance - at least until the point where file size isn't an issue.

I don't know if we will ever reach full efficiency with lossy encoders.

There is no such thing as "full efficiency" aka "perfection heaven". The gains just get smaller and smaller but never hit a brick wall. A technology isn't at its end of life when nothing can be improved anymore, but when the cost of improvement vastly outweighs the gains. Without any disruptive innovation - a wonder, if you want to call it that - stereo lossy encoding has already reached this point: the current releases only provide very minor improvements or fix exotic use cases. If you disagree, then I propose getting your reference points right: compare the current improvements with the improvements made 6 years ago, and you get a better picture of where we are now.

So, without a wonder, stereo lossy encoding has already reached its optimum - significant progress may only happen outside of that scenario (i.e. surround encoding). HOWEVER, if we change the criteria from sound quality/file size to something else - like, for example, platform support and ways to use lossy compression - then there definitely is a lot of room left for improvement.

- Lyx

P.S.: With this, I do not mean that every developer of lossy codecs should drop their project. What I propose, however, is this: how many lossy compression codecs do we need? One? Two? Three? Okay, let's assume three. How many devs do we need per codec to care about maintaining dead-end stereo lossy codecs? Probably no more than 5. That's 15 devs worldwide developing the compression efficiency of stereo lossy codecs. Not much. Now add more devs who look into improving usability, features and support of codecs. That's what I would call an efficient distribution of dev resources. The primary target should no longer be improving conventional stereo lossy codec compression - this does not mean abandoning improvements, but making them second or even third priority.

My opinion is that HE and the various spectral-filling techniques (SBR, etc.) will start to lag as people start to hear what they sound like and notice. It is much like how we went from people originally showing astonishment (unwarranted, yes, that is my point) at how GOOD 64 kb/s MP3s were (shudder), to the present, when people complain about 192 kb/s or 256 kb/s MP3s - of course, how good their encoder is, and how accurate their complaints are, is likewise something that could be researched further. I also wonder what happens to parametric stereo and parametric surround when signals with natural time delays are used as inputs. (I do have some idea, but not enough to say anything concrete, and such signals are, in the day of panpots, rather annoyingly rare.)

Having said that, I think that until somebody comes up with a "whole new way" to do this, we won't see a lot of advance in transparent codec design.

But does that mean that research is pointless? No. First, somebody needs to figure out this "whole new way", if it exists. Although the existing data on perceptual entropy, etc., suggests that it might not, such data does not account for sophisticated nonlinear modelling and the like, so it's not ruled out.

Second, when I started doing compression in 1976, people were scoffing that we'd need 32 kb/s speech coding (4 bit/sample Goodman/Gersho ADPCM). In 1979, they were scoffing that we would need 16 kb/s speech coding, and likewise that we might need 56 kb/s AM-radio-quality coding. In 1986, they were scoffing that we might need 128 kb/s/channel audio coding, but in fact the ADPCM, SBC, and G.722 coders were all out there and doing their job, and CELP was looming on the horizon for speech. In 1992, the idea that we would need 64 kb/s/channel audio coding was likewise scoffed at. In 1997, the idea that someone would be able to send/sell music over the internet was ridiculed (look up "a2b music" on Google for that). And so on.

And you know what, in each case, the idea was "transmission capacity is growing so fast we'll never need it."

I think, in 2008, we know better about each of those coding algorithms. So while I've said publicly that "audio coding is mostly done until somebody comes up with a really new idea", that means that we need the new idea, not that no further work is necessary. I've heard "that's not necessary" or "that will never sell" more than one too many times.

So, those are my own impressions. They are based on my experience, my inventions, and my understanding of the field. No ABX test is involved in this article.

The thing is that at around 80-128kbps, we seem to have reached some kind of "sweet spot". New tech improved sound aesthetics in that area, but "transparency for uncritical untrained listening" didn't shift much. So, ignoring special cases like speech, and this "whole new way", would it be THAT bad if that couldn't be improved much more, considering today's storage capacities? I'm not of the "let's waste resources, capacities are increasing anyway" camp - but I ask the question: "is what we currently have possibly enough?"

My feeling is that CD audio is likely the last widespread physical medium for audio. Personally I don't have a problem with that, because even with the "loudness war" I have purchased some new exquisite-fidelity CDs that sound light-years better than CDs I've had for 15 years, and a few with noticeably better fidelity than CDs only 5 years old. So it seems that we haven't reached the resolution limit of 44/16 PCM, let alone gotten playback equipment to that limit. Thus, to me it seems the 50MB wav file (or ~30MB lossless compressed equivalent) is nearing a time when transmission & storage won't be an issue (though I'm sure the record companies can come up with a host of other excuses).

So I wonder if in 5 years or so, MP3 (or the iTunes equivalent) will have given way to wav/flac/wma etc. And if it does, who would care to do the research to find "that new idea" in perceptual encoding?

@Woodinville: "but I ask the question: 'is what we currently have possibly enough?'"

Well said - I don't hear very many folks looking for lower bit rates, and once solid state storage reaches 128GB+ at an affordable level I think it won't matter much for music. We'll run another cycle for video, but it will end in the same place.

The effort (R&D and marketing $$) being invested in increasing storage and bandwidth seems to vastly outstrip the effort to reduce bit rates, so I guess the market is speaking with their dollars and answering this question - "yes, what we have is good enough and we'll crank up the storage and bandwidth to make sure you get enough music/video, etc."

The effort (R&D and marketing $$) being invested in increasing storage and bandwidth seems to vastly outstrip the effort to reduce bit rates, so I guess the market is speaking with their dollars and answering this question - "yes, what we have is good enough and we'll crank up the storage and bandwidth to make sure you get enough music/video, etc."

I'd go even further and say that even today's storage capacities are already "satisfying" - there definitely is room for improvement, but there also isn't really a "shortage". Compare the amount of space and bandwidth with the age of cassette tapes - it's definitely enough for "on the go", even with flash RAM - and if you need more because you are on a longer trip, well, why isn't there an easy and affordable way to USB-plug a 2.5" hard drive into a flash-based player, including the ability to manage transfers? The tech is there - but the implementation isn't. Sure, it may not be perfect, but it's manageable, considering the small size of those gadgets.

Give it 5-7 years, and flash RAM will be at the size of today's hard drives. That's tens of thousands of tracks on a small memory card. And the relevant tech to make that jump in flash capacity happen is already known, cheap and just needs to mature. Does anyone think we can make a 90% increase in the performance of lossy compression happen in 5 years?

Lossy stereo music compression isn't an issue anymore. Surround, and especially video, is the next big hurdle - and internet ISPs need to ******* understand that this is no longer 1995 and bury ADSL for good - this is no strict server/client network anymore.

With the current level of compression you can get with LAME in VBR whilst still maintaining perceptual transparency, you can already comfortably fit over 100 hours' worth of music onto one of the latest-generation 8GB flash MP3/MP4 players. These devices already retail in the UK for under £60, and some (if not most) of them allow the use of an external micro-SD card with a capacity of up to 2GB, so you can carry a few of those to plug in additional 25-hour chunks of your favourite music if you really want to.
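The "100 hours" figure checks out with a quick calculation. A sketch, assuming an average VBR bitrate of ~170 kbps (a ballpark for LAME VBR; actual averages vary with the music):

```python
def hours_of_music(capacity_gb, avg_kbps=170):
    """Playback hours that fit in a given capacity at an average bitrate."""
    capacity_bits = capacity_gb * 1e9 * 8  # storage vendors use decimal GB
    return capacity_bits / (avg_kbps * 1000) / 3600

print(f"8 GB player:   {hours_of_music(8):.0f} hours")  # ~105 hours
print(f"2 GB micro-SD: {hours_of_music(2):.0f} hours")  # ~26 hours
```

The same arithmetic shows why lossless is still a squeeze on flash: at a typical ~900 kbps for lossless, the same 8 GB player holds under 20 hours.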

With the price of flash memory dropping as rapidly as it is, do we need better compression from lossy formats? After all, it's mostly the need for something that drives the efforts behind achieving it.

I for one am p*ssed off that you cannot buy lossless. Of course, "download this and re-encode for your portable device" would imply *yikes!!!!!!* copying, and for religious reasons the record industry would not encourage anything like that.

Lossy makes some sense from a battery perspective, although I'm not sure how that plays out in practice (I'm sure people on this site have tried!).

"Me too" on the last paragraph, but - without any empirical knowledge - I'd suppose that there is a decoding/reading trade-off, where lossless should need more energy to read the larger file, but not necessarily more energy to decode if the codec is optimized for decoding speed (possibly at the cost of compression ratio). I'd guess that this trade-off could change from device to device too - flash memory vs. hard drive.

Doesn't the whole lossless vs lossy argument hinge around what you personally define as 'lossless' though?

Anything that doesn't reproduce an input with 100% accuracy is, technically speaking, lossy. This includes all microphones, cables, mixing desks, recording equipment, CD players, amplifiers, headphones, and loudspeakers. Vinyl is horribly lossy in almost every respect imaginable, and all forms of digital recording, even the 'lossless' ones, are only defined as lossless because they reproduce accurately within the confines of their specified resolution in order to provide perceptual transparency.

The audio CD standard is most certainly not truly lossless when compared to the original sound which effectively has an infinite resolution and an infinite bit-depth. Stretching a point of pedantry, divide infinity by any finite number and, on paper at least, the audio CD standard is infinitely lossy.

Once the threshold of perceptual transparency has been reached with any recording/encoding method, be it a compressed format or not, it's 'lossless'.

Doesn't the whole lossless vs lossy argument hinge around what you personally define as 'lossless' though?

We're not having a lossy/lossless argument. Let's stay on topic.

QUOTE (Slipstreem @ Apr 21 2008, 14:28)

Once the threshold of perceptual transparency has been reached with any recording/encoding method, be it a compressed format or not, it's 'lossless'.

No. You're just confusing matters here. Lossless has nothing to do with perceived transparency; lossless means that there is no loss, whether it is imperceptible or not. You can't redefine a word like that; it does not come down to interpretation. Yes, the recording process has lossy stages, but a lossless codec encodes a source without loss, plain and simple.

All I'm saying is that, like it or not, we live in an analogue world and it's completely impossible for any digital system to capture everything. A lossless transform guarantees that we don't lose any more information than we've lost already, but comparing the enormous amount of real-world information already lost throughout the entire recording process to the minuscule amount that you lose in a well-configured lossy codec in perceptual terms doesn't seem to make much sense to me on an analogue level.

Have you ever heard a CD-quality recording that comes even vaguely close to replicating the sound of actually being at a live rock concert, for example? I certainly haven't.

Lossless encoding obviously makes sense for archival purposes, but I don't understand why people complain about not being able to download music in a lossless format when the chances are incredibly high that they won't hear any difference whatsoever between, say, a LAME MP3 encoding at VBR -V2 and the original CD recording anyway.

Ignore the fact that we live in an analogue world with analogue sound sources and listen to them with analogue ears if you like. It doesn't alter the undeniable fact that we do.

Cheers, Slipstreem.

EDIT: @Synthetic Soul: Apologies. I didn't see your last post until I'd already posted this. I thought that the OP was asking if we'd reached the limit with lossy. I was just expressing the opinion that the 'lossiness' of lossy is a very small part of the overall picture with current lossy codecs. Please delete this post if you think it's of no relevance here.

Many people use lossy codecs to achieve transparency. With the world of lossless codecs now being developed, it would seem that this isn't really as important anymore - if you want it to sound identical, the easiest way is to just go lossless. Because of this, I don't think the sole reason for lossy codecs anymore is 'transparency or bust'; I think it is instead trying to reach that sweet spot where it makes the compromise between sounding 'good enough' and keeping a small file size. If this is the only thing you are concerned with in regards to lossy, anyway, there are still many wonderful technologies being developed, such as HE-AAC, which sets out to achieve that 'sweet spot' at about half the bitrate of MP3. My personal tests have proven it more than convincing enough for me, even at ridiculously low bitrates. However, these codecs don't seem to be achieving widespread popularity... they aren't even being implemented in widespread devices like iPods... why is this the case? Well, there doesn't seem to be such a demand these days, because of the reasons mentioned - drive space is getting cheaper, internet connections have bigger bandwidth, etc.

Say I was a budding codec developer and created a stunning new lossy codec. It would achieve transparency at half the bitrate of MP3, and encode and decode way faster than MP3. How many years would it take before my codec was even a blip on the radar of the masses? How long before I drew attention from sources such as iTunes, if ever? Clearly I'd have several hurdles to overcome - the first one being the proliferation of MP3. The masses have already been educated towards MP3s, know what they are, and know that they are everywhere and will play everywhere. It would take a long time for a new codec to achieve this widespread appeal. Another hurdle might be the bitrate: even though my codec was geared for bitrates such as ~64 kbps for transparency, how many people out there would still be saying: oh, I need to encode at ~128 or even ~192, otherwise it won't be good enough! What if a new music site started selling music in this codec - how many people would inspect the file and say 'oh, 64 kbps, this must be junk, no way I'm paying money for this!' As long as the masses keep these belief systems reinforced within themselves, it would seem that advancements in lossy codecs won't matter much anyway.

A lossless transform guarantees that we don't lose any more information than we've lost already, but comparing the enormous amount of real-world information already lost throughout the entire recording process to the minuscule amount that you lose in a well-configured lossy codec in perceptual terms doesn't seem to make much sense to me on an analogue level.

Ok, then how about "96/24 digital audio is transparent to the human ear on all physically realizable reproduction systems, and all samples, under all circumstances"? What this means is that *no perceptible information whatsoever* is lost during digitization. Nobody sensible is arguing that about any lossy codec - they all have killer samples and other problems.

Saying "analogue has infinite resolution" is also not true. Analog systems have a defined passband and a defined SNR - therefore limited resolution.
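The passband/SNR point can be made concrete with the standard quantization-noise formula, SNR ≈ 6.02·N + 1.76 dB for N-bit linear PCM and a full-scale sine. Inverting it turns any analog medium's SNR into an equivalent bit depth; the analog SNR figures below are illustrative assumptions, not measurements:

```python
def snr_db(bits):
    """Theoretical SNR of N-bit linear PCM for a full-scale sine wave."""
    return 6.02 * bits + 1.76

def effective_bits(snr):
    """Invert the formula: the bit depth a given analog SNR corresponds to."""
    return (snr - 1.76) / 6.02

print(f"16-bit PCM: {snr_db(16):.0f} dB")  # ~98 dB
print(f"24-bit PCM: {snr_db(24):.0f} dB")  # ~146 dB
# Illustrative analog SNR figures (ballpark assumptions, not measurements):
print(f"~70 dB medium: {effective_bits(70):.1f} bits")  # roughly 11 bits
print(f"~60 dB medium: {effective_bits(60):.1f} bits")  # roughly 10 bits
```

So an analog chain with a 60-70 dB SNR resolves fewer effective bits than plain CD audio, which is exactly why "analogue has infinite resolution" doesn't hold.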

QUOTE (Slipstreem @ Apr 21 2008, 07:53)

Have you ever heard a CD-quality recording that comes even vaguely close to replicating the sound of actually being at a live rock concert, for example? I certainly haven't.

If you had a large arena and a rock concert sound system, and the original concert was well miked, then yes - you could certainly get that sound. The limit isn't anything to do with digital - it's your small speakers, with a small number of channels, in a small room.

QUOTE (Slipstreem @ Apr 21 2008, 07:53)

Lossless encoding obviously makes sense for archival purposes, but I don't understand why people complain about not being able to download music in a lossless format when the chances are incredibly high that they won't hear any difference whatsoever between, say, a LAME MP3 encoding at VBR -V2 and the original CD recording anyway.

Because they want the freedom to transcode? The freedom not to worry about that 1 in 1000 that does have artifacts? Sure, -V2 is transparent (to me) on almost all music samples - but not all - and that's why I would prefer to have lossless downloads. I'm totally happy with MP3 on my portable, on my party/living room speakers or in my car. For discerning listening, however, I don't want to worry that the music I am downloading has been audibly degraded by the compression process. It won't happen often, but it will happen.

QUOTE (Slipstreem @ Apr 21 2008, 07:53)

Ignore the fact that we live in an analogue world with analogue sound sources and listen to them with analogue ears if you like. It doesn't alter the undeniable fact that we do.

Again, analogue does not mean infinite bandwidth or infinite SNR. If you like, you can call the record->A/D->store->D/A->playback chain an analogue process with a certain passband and SNR - the fact that at some point the signal is stored digitally is utterly, utterly irrelevant. Analogue in, analogue out. All that happens in the middle is the addition of noise and bandpass filtering. Show me an analogue recording format that doesn't restrict bandwidth and add noise and I'll send you a case of beer.

QUOTE (Slipstreem @ Apr 21 2008, 07:53)

EDIT: @Synthetic Soul: Apologies. I didn't see your last post until I'd already posted this. I thought that the OP was asking if we'd reached the limit with lossy. I was just expressing the opinion that the 'lossiness' of lossy is a very small part of the overall picture with current lossy codecs. Please delete this post if you think it's of no relevance here.

I think lossy was great at the time, but it is becoming obsolete. Sure, as Garf pointed out, it's still a good choice for surround sound, portables, VoIP, streaming, etc. These remain good arguments for lossy. The applications which will be pushing storage requirements beyond 2010, the way sound did before 2000, will probably be dominated by HD video.

However, I do respect your point that lossy codecs are nearly perfect anyway - so we have reached the limit. The development of lossy codecs for the storage of hi-fi music has probably neared its end, killed off by diminishing returns and diminishing demand. The development of lossy codecs for transferring voice and music over error-prone, high-latency and low-bandwidth data links is, in my opinion, still interesting.

I take all of your points on board and can't fault your logic. From my point of view, I guess it comes down to where an individual's limits lie with regards to what sounds transparent and what doesn't. With the cost of storage media constantly dropping as it is, I'm sure we'll all look back on threads like this in 10 or 20 years' time and laugh about the debates that rattled back and forth regarding saving the odd gigabyte of space here and there. Heh.

Show me an analogue recording format that doesn't restrict bandwidth and add noise and I'll send you a case of beer.

He cannot do that, because it's logically impossible without violating causality. One cannot interact without interacting. No cause without effect, no effect without cause. The mere fact that you access information already means that you interact with it and therefore change/distort it. You can reduce that effect by adding error tolerance via symbolic abstraction (digitization), but that process itself in turn causes distortion (as you already pointed out).