Just a little typo in the last line: "Sorenson is good, but it's price is prohibitive." It should be "its" which is possessive.

Interesting test. I guess we shouldn't be too surprised to see how well QuickTime's AAC did. I was a little surprised to see FAAC getting beaten so badly, though. Would you say it's still a better alternative to 128kbps MP3? (I assume we're using this bitrate for portable use.)

QUOTE

If you don't mind about illegality, I suggest using AACenc, since it has good quality and is free.

This may have been addressed a million times elsewhere, but how is AACenc illegal? Is this the encoder used with PsytelDrop? I have it but don't recall where I downloaded it. Didn't realize it was illegal. (Warez?) Or is it just a licensing issue?

Very interesting, especially considering how badly Dolby did in the 64kbps test. Overall I was quite surprised by the high quality of all codecs except for FAAC, but there definitely is still room for massive improvements. None of the codecs was transparent to a degree where I'd consider using it for archival purposes, but for portable/casual use, which is the main field of usage at 128kbps anyway, AAC looks very interesting.

Thanks to Roberto and everyone else, who helped and participated, for investing time and work on this test. I can't wait for the next test to see how AAC compares to Vorbis, Musepack and MP3...

dev0

This post has been edited by dev0: Jun 16 2003, 08:56

--------------------

"To understand me, you'll have to swallow a world." Or maybe your words.

I am also interested to see how aacenc -streaming compares to QT. I was very surprised when I was informed which samples were which encoder after I submitted my test results. Psytel -streaming in my own tests has been quite impressive for the bitrate, and having it fixed as CBR in this test did not really let it work at its best. However, I also understand that this testing decision was made for the sake of fairness.

As for FAAC vs mp3 128 kbit, I haven't done a direct back-to-back comparison on identical samples, but some of the FAAC samples in this test were really bad, as can be seen from the charts. I'm relatively new to codec listening tests, but in some FAAC samples, the voices/instruments actually sounded different in their tonality from the others, let alone pre-echo or other more typical codec-introduced artifacts.

There is an ongoing improvement to the Ahead Nero "streaming" preset as well. Some changes have already been made (still not out), along with some grouping improvements and bug fixes. This will probably be available in the next web version of Nero.

Interesting results. Does this test say anything about how the decoders are doing with VBR files in that bit rate area? (i.e. the encoders that are able to produce VBR files). I'm thinking if 128 CBR isn't good enough for portable use, maybe ~128 VBR is?

I did some preliminary tests, blindly comparing QT 6.3 to PsyTEL and Nero -streaming on 6 samples from this test. The difference is impressive. The most audible one is the higher lowpass, which provides a richer sound than QT (appreciable on Atrain and 41_30), and fewer artifacts in some cases. Now, PsyTEL and Ahead Nero AAC encodings are close to QT (for my taste), and sometimes better than QT (41_30, for example). Unfortunately, I didn't have much time last week to test it further, and I broke my headphones. The new ones are here now, and I have some time over the next two days.

It should be "You've Got the Love" by The Source feat. Candi Staton (Although a popular dance tune and available on many compilations I don't know if the track was ever featured on an actual album by The Source)

What I find strange is the difference between the Sorenson (FhG Pro codec) and the QT (Dolby codec) results. I have the impression that the Dolby codec is based on the FhG codec with optimisations for speed rather than quality. Still, according to my experience, QuickTime Pro 6.3 and Sorenson Squeeze 3 take almost the same amount of time to encode an audio file in AAC, when the quality setting for QT is set to the highest setting. Could it be that this setting disables all options which reduce the quality of the encoding in favour of speed?

Still, according to my experience, QuickTime Pro 6.3 and Sorenson Squeeze 3 take almost the same amount of time to encode an audio file in AAC, when the quality setting for QT is set to the highest setting.

We didn't have the same experience. According to my timings, QT is one of the fastest AAC codecs, and Sorenson is the slowest, along with PsyTEL:

Hmm... I am a MacOS X user and on that platform QT6.3 (with highest quality setting) and Sorenson take almost the same amount of time to encode at 128Kbps.

Interesting. Therefore, for 95% of the users in this world, QT is three times faster than Sorenson (and a lot cheaper). What is interesting to note is that QT is the absolute winner of the test and, at the same time, one of the fastest codecs (FAAC is faster on my Duron, but quality isn't as good). Unfortunately, the interface is not the prettiest I've seen... I hope this will change soon, with the same codec being implemented in iTunes.

2. It should be noted somewhere, probably in the recommendations section, that this was a CBR test only, and that Nero and Psytel also have VBR modes, which perform better, according to Guruboolez. You might link to his listening results.

3. The crack about people advertising for FAAC is unnecessary, and doesn't help you win over a certain enthusiast to participate in your next test.

4. You mention that you used an ANOVA analysis, but maybe you should also mention that this is different from what the 64 kbit/s test used. The similar presentation format might make people think that all the analysis was identical. The difference is mainly one about risk. The ANOVA / Fisher LSD method is more at risk for falsely identifying differences between codecs. On the other hand, it's more sensitive than the Tukey HSD.

5. I'm still uncomfortable with the squishy way that a summary graph is constructed. But since I can't think of a better way, and people have a need to see things in one, concise picture, I suppose it must be that way.

6. In the more detailed pages to follow, I'd like to see some mention about how a time misalignment of only 25 msec spoiled at least one result. Also, I'd like to see some mention of the results you threw out for rating the original less than 5.
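
On point 4, the extra risk of unadjusted pairwise comparisons can be illustrated with a rough back-of-the-envelope sketch. It assumes the pairwise comparisons are independent and each is run at the same alpha, which is a simplification: a "protected" Fisher LSD is only applied after a significant overall F-test, so the real risk is lower than this figure. The count of five codecs is an assumption for illustration only.

```python
def familywise_error(num_codecs, alpha=0.05):
    """Rough chance of at least one false 'difference' across all pairs.

    Assumes independent, unadjusted pairwise tests (a simplification).
    """
    num_pairs = num_codecs * (num_codecs - 1) // 2
    return 1.0 - (1.0 - alpha) ** num_pairs


if __name__ == "__main__":
    # Hypothetical: five codecs give ten pairwise comparisons.
    print(round(familywise_error(5), 3))
```

Tukey's HSD is designed to hold the familywise rate near alpha across all pairs, which is why it is the more conservative and less sensitive choice.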

Some ideas for a future test:

1. Perhaps another call for samples -- classical and jazz samples -- would be profitable.

2. You might think about adding at least one anchor sample -- a lowpassed version of the original, a la MUSHRA. This can be done with a small filesize penalty using Sox. That would help to keep the ratings in perspective.

3. Verifying VBR average bitrates: I think that this task could be split up among several people, each encoding whole albums with all codecs.
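
For point 3, a minimal sketch of how each volunteer could compute the average bitrate over a whole album from file sizes and durations. The function names and the example numbers are hypothetical, and container overhead (MP4 headers, tags) is ignored, so the result slightly overstates the pure audio bitrate.

```python
def average_bitrate_kbps(total_bytes, total_seconds):
    """Average bitrate in kbit/s (1 kbit = 1000 bits)."""
    return total_bytes * 8 / total_seconds / 1000


def album_average_kbps(tracks):
    """tracks: list of (size_in_bytes, duration_in_seconds) per file.

    Summing bytes and seconds first weights each track by its length.
    """
    total_bytes = sum(size for size, _ in tracks)
    total_seconds = sum(duration for _, duration in tracks)
    return average_bitrate_kbps(total_bytes, total_seconds)


if __name__ == "__main__":
    # Hypothetical album: two one-minute tracks.
    print(album_average_kbps([(960000, 60), (1920000, 60)]))
```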

ff123

Edit: Oh, and if iTunes doesn't use the same codec that you used for this test, I would make some mention of that fact too.

Edit2: For the next test, you'll probably want to be sure to check for level (volume) differences too.

This is true. The QuickTime interface does not allow the batch processing of files and therefore it is not easy to use it for mass encoding of CDs. Also note that on MacOS X the iTunes application does not allow the use of the highest quality setting of the AAC codec and therefore the only option left for someone is to use the Ovolab AAChoo front-end for batch encoding at any quality setting. This is what I use for my iPod encodings and it works well. Still, I hope that the next version of iTunes will allow the setting of the encoding quality. I also hope that the next QuickTime revision will support VBR AAC encodings...

What do you think about 'eyeballing' the results to get the ranks? Wouldn't a straight ranking be more solid and not necessarily less powerful? I see the eyeballing was done in the 64kbps test also.

What about a bootstrap analysis of the results? Possible?

The problem I was having with just using the data directly to get an overall summary is the idea that one sample might have a greater influence on the overall results than another. For example if one sample had a clear winner with a mean of 4.5, but another sample had a different clear winner with a mean of 2.5, I was thinking that the first sample would be given more weight in the overall results.

However, perhaps that's not really an issue when using a "blocked" analysis, which is supposed to take care of such things. Different listeners have varying internal quality scales. So I suppose that's analogous to the different samples having varying difficulty levels.

In that case, I would just try the exact same ANOVA / Fisher LSD on the data directly.
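
A minimal sketch of what a blocked ANOVA followed by a Fisher LSD looks like, assuming a complete table with one rating per block (listener or sample) and codec. The ratings below are made up, and `t_crit` is an assumption supplied by the caller: the two-sided 95% t value for the error degrees of freedom, looked up from a table (2.776 for 4 df). This is not the code used for the test.

```python
import math


def blocked_anova_lsd(ratings, t_crit):
    """ratings[i][j] = score in block i (listener or sample) for codec j."""
    b = len(ratings)      # number of blocks
    k = len(ratings[0])   # number of codecs
    grand = sum(sum(row) for row in ratings) / (b * k)
    codec_means = [sum(row[j] for row in ratings) / b for j in range(k)]
    block_means = [sum(row) / k for row in ratings]

    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_codec = b * sum((m - grand) ** 2 for m in codec_means)
    ss_block = k * sum((m - grand) ** 2 for m in block_means)
    # Removing the block term keeps block-to-block offsets out of the error.
    ss_error = ss_total - ss_codec - ss_block

    df_error = (b - 1) * (k - 1)
    mse = ss_error / df_error  # must be non-zero for the F ratio below
    f_codec = (ss_codec / (k - 1)) / mse
    # Two codec means differ at the chosen level if they are > lsd apart.
    lsd = t_crit * math.sqrt(2 * mse / b)
    return codec_means, mse, f_codec, lsd


if __name__ == "__main__":
    # Hypothetical ratings: 3 blocks x 3 codecs.
    table = [[4, 3, 2], [5, 4, 2], [3, 2, 2]]
    print(blocked_anova_lsd(table, t_crit=2.776))
```

Because the block sum of squares is subtracted before the error term is formed, a block that rates everything low (a harsh listener or a hard sample) shifts its own row but does not inflate the error used to compare codecs.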

The problem with the bootstrap is that it doesn't easily give you the nice 95% confidence intervals for the graphs. Also, it's similar to the Tukey HSD in being more conservative with risk (and less sensitive).
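
For reference, a minimal sketch of one common bootstrap variant, the percentile bootstrap, applied to the paired difference between two codecs. It resamples blocks (listeners or samples) with replacement; the ratings are made up and the function name is hypothetical. Note that it yields an interval for one pairwise difference, not the per-codec error bars used in the graphs.

```python
import random


def bootstrap_mean_diff_ci(a, b, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile interval for the mean of paired differences a[i] - b[i]."""
    diffs = [x - y for x, y in zip(a, b)]
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(diffs)
    means = []
    for _ in range(n_resamples):
        # Draw n blocks with replacement and record the mean difference.
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi


if __name__ == "__main__":
    # Hypothetical ratings of the same blocks for two codecs.
    codec_a = [4.5, 4.0, 3.8, 4.2, 4.6, 3.9]
    codec_b = [4.0, 3.9, 3.1, 4.0, 4.1, 3.5]
    print(bootstrap_mean_diff_ci(codec_a, codec_b))
```

If the interval excludes zero, the two codecs would be called different at roughly the chosen level.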

ff123

Edit: Roberto, don't forget to include the ATrain and Layla numbers when you do this analysis.

Yeah, I'd never have expected QT to come out on top, because I thought their audio support was crap. The whole domination over movie trailers is pretty good though (QT trailers dominate WMP or REAL). Do you think the Apple Store rips their albums using QuickTime? And which version and up supports AAC ripping (obviously Pro)?