Colloms Sound Quality Ratings: subjective sound quality scaling

These comparative sound quality ratings are mainly used for audio electronics: power amplifiers, preamplifiers, CD players and integrated amplifiers.

How it was done



Some 25 years ago, I adopted the IEC-based scale running from 0 to 10. Zero represented nil fidelity, while 10 was assigned to essentially perfect reproduction. Given a sufficiently large and representative group of products, a score of 5 naturally became the group average for sound quality. Such results were self-limiting and self-sufficient for each group test.

Maintaining a historic standard based on such results has proved difficult as equipment continued to improve and the sound quality scores edged inexorably higher towards 8 and 9. The need to differentiate degrees of excellence demanded ever finer resolution, as the scale was being logarithmically crushed towards that pre-chosen limit of perfection, namely 10.

I solved this problem by removing the barrier represented by 'scale 10' and decided that my numeric scores should rise in proportion to the gauged percentage improvement over previous references, many of which are maintained in my own equipment store. These references help immensely in achieving a measure of long term consistency.

Low grade, marginal quality integrated amplifiers score as they always did, at the 5 to 6 point level, while worthy hi fi alternatives attain 12 to 18. Improved power amplifiers lie in the 18 to 28 range, while top of the line references have now reached 40 points and more. Weighing all the subjective parameters, including 'involvement' (the ability to hold the listener's attention), a product scoring 40 is considered to be 'twice as good' as one scoring 20. Appreciating sound quality at these higher levels requires a commensurate performance from all other elements in the chain that is used to evaluate the products.

A further decade of commercial endeavour since the sequence began has taken the scoring to a present maximum over 150. Nevertheless, even a superb score such as this cannot fully account for individual taste, nor for how good a match might be achieved with a given partnering system. Thus the intending purchaser's own discretion, judgment and experience will still be required to make sense of these sound quality scores. A number of historic and interesting products are included in the listing for reference purposes.

Method and ratings







Auditioning begins with preliminary trials: an extended running-in and warm-up period, plus informal trials with the partnering system. These explore the more obvious aspects and seek complementary combinations of audio components, to find how best to understand the quality of the DUT, the device under test. Different shelves, platforms or feet can assist, and the choice of cables, including mains cables, is also important. An amplifier may sound bright with a given speaker and cable set. This is useful information, but if other aspects of its potential are to be fairly assessed it is worth trying to fine-tune the system first. Preliminary descriptions of sound quality are developed in this phase.

The second stage involves powering up reference amplifiers, checking levels and absolute phase, and then comparing and contrasting the many dimensions of sound quality. With reference scores pre-established for the comparison products, an idea of the special and overall merits of the DUT can be established. At this stage, given some practice, it is fairly easy to determine a difference score for the test products.

The mental framework is one of percentages. Suppose that, when compared with a familiar reference which has a long-established sound quality score of 30, the DUT is subjectively about 50% better. That judgement might reflect, for example, a deeper soundstage, better reverberant field, sharper focus, more rhythm and listener involvement, crisper bass, sweeter treble and higher resolution, with these and other aspects all weighed in the balance. On this basis the new product is rated 15 points better than the chosen reference (50% of 30) and thus gets 45 marks. This will be a provisional score because one further step needs to be taken.
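As a rough illustration of the arithmetic in the example above, the percentage step can be written out as a short Python sketch. The function names `provisional_score` and `relative_merit` are hypothetical, introduced here only for illustration; the subjective judgement of "percent better" is the listener's and cannot be computed.

```python
def provisional_score(reference_score: float, percent_better: float) -> float:
    """Scale an established reference score by a judged percentage improvement.

    reference_score: long-established score of the comparison product.
    percent_better:  subjective improvement of the DUT over that reference,
                     in percent (negative if the DUT is judged worse).
    """
    if reference_score <= 0:
        raise ValueError("reference_score must be positive")
    return reference_score * (1.0 + percent_better / 100.0)


def relative_merit(score_a: float, score_b: float) -> float:
    """Ratio of two scores on the open-ended scale (2.0 reads as 'twice as good')."""
    if score_b <= 0:
        raise ValueError("score_b must be positive")
    return score_a / score_b


# Worked example from the text: reference rated 30, DUT judged about 50% better.
print(provisional_score(30, 50))   # 45.0
# A product scoring 40 against one scoring 20.
print(relative_merit(40, 20))      # 2.0
```

The result is only the provisional score; the final figure is confirmed after the further listening stage described later.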

Such subjective ratings inevitably have embedded in them the musical sensibility of the listeners. Here the test bias leans towards the involvement and entertainment aspects of sound reproduction: a beautiful sound and a big, focused soundstage alone will not score at the highest level if the sense of drama, dynamics and rhythm is unduly diluted.

It is unfortunate that there are many nice sounding products on the market where the designers have unduly focused on beauty and detail and seem to have quite missed the point about musical performance; these designs are good at reproducing the notes but fail to show the musicians working as a team.

If Hi Fi reproduction is rendered elegant but boring then we all might as well pack up now, as we inexorably head towards Muzak performances, hastened by the advancing dissemination of MP3 lossy coding and its ilk.

Consider that the final published score could be regarded as artificial, an illusion, since you are unlikely to hear this full potential at a dealer, and will only hear it at home if you take the product to its practical limit. So there is a third stage in the judgement process, which involves powering down, disconnecting and ideally physically removing all the comparison and redundant source components from the test system. The DUT is now operated alone, free of unnecessary supply interference and digital noise from other components, physically well supported and located, in as optimal a system match as can be practically achieved. At this point a final sound quality score is confirmed, following extended listening with a variety of program and program sources; it represents the best that could be achieved in practice.