It turns out that this change had some unexpected consequences, and so it was not as simple as it might seem. I suspect that this issue has not been appreciated by the wine public, or probably even by the people at Decanter; so I will point out some of the consequences here.

We do expect that a 20-point scale and a 100-point scale should be interchangeable in some simple way when assessing wine quality. However, there is actually no intrinsic reason why this should be so. Indeed, Wendy Parr, James Green and Geoffrey White (Revue Européenne de Psychologie Appliquée 56:231-238. 2006) tested this idea, by asking wine assessors to use both a 20-point scale and a 100-point scale to evaluate the same set of wines. Fortunately, they found no large differences between the two schemes, for the wines they tested.
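The "simple way" that most people have in mind is a linear rescaling. As a minimal sketch (the anchor values here are my own illustrative assumptions, not anything Decanter publishes), one might map the usable bottom of the 20-point range onto 50 on the 100-point scale, and 20 onto 100:

```python
def rescale_20_to_100(score_20, low_20=8.0, low_100=50.0):
    """Map a 20-point score onto a 100-point score by linear interpolation.

    Assumes (hypothetically) that low_20 on the 20-point scale corresponds
    to low_100 on the 100-point scale, and that 20 maps to 100.  Real
    critics need not behave this linearly, which is the point at issue.
    """
    fraction = (score_20 - low_20) / (20.0 - low_20)
    return low_100 + fraction * (100.0 - low_100)

print(rescale_20_to_100(20.0))  # 100.0
print(rescale_20_to_100(14.0))  # 75.0
```

If critics really did use the two scales interchangeably, a fixed mapping like this would translate one set of scores into the other; the studies discussed below ask whether that actually happens.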

It is therefore quite interesting that, when Decanter swapped between its two scoring systems, it did seem to change the way it evaluated wines. This was discovered by Jean-Marie Cardebat and Emmanuel Paroissien (American Association of Wine Economists Working Paper No. 180) in 2015, when they looked at the scores for the red wines of Bordeaux.

Cardebat & Paroissien looked at how similar the quality scores were for a wide range of critics, comparing each pair of critics using correlation analysis. If the scores of a given pair of critics agree perfectly then their correlation value is 1, and if the scores are unrelated then the value is 0; otherwise, the values lie somewhere between these two extremes. Cardebat & Paroissien provide their results in Table 3 of their publication.
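For readers who have not met it, the correlation in question is the standard Pearson coefficient, computed over the wines that both critics scored. A minimal sketch, using invented scores for five hypothetical wines (not data from the paper):

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Invented scores from two hypothetical critics for the same five wines:
critic_a = [88, 90, 85, 92, 95]
critic_b = [87, 91, 84, 93, 96]
print(round(pearson(critic_a, critic_b), 3))  # 0.992
```

Two critics who rank the wines in nearly the same order, as here, produce a correlation near 1, even if their absolute scores differ; that insensitivity to the scale itself is what makes correlation a sensible way to compare a 20-point critic with a 100-point one.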

Of interest to us here, Cardebat & Paroissien treated the Decanter scores in two groups, one for the scores before June 2012, which used the old 20-point system, and one for the scores after that date, which used the new 100-point system. We can thus directly compare the Decanter scores to those of the other critics both before and after the change.

I have plotted the correlation values in the graph below. Each point represents the correlation between Decanter and a particular critic — four of the critics have their point labeled in the graph. The correlation before June 2012 is plotted horizontally, and the correlation after June 2012 is plotted vertically. If there was no change in the correlations at that date, then the points would all lie along the pink line.

For two of the critics (Jeff Leve and Jean-Marc Quarin), there was indeed no change at all, exactly as we would expect if the 20-point and 100-point systems are directly interchangeable. For seven other critics the points are near the line rather than on it (Tim Atkin, Bettane & Desseauve, Jacques Dupont, René Gabriel, Neal Martin, La Revue du Vin de France, Wine Spectator), a difference small enough that we might attribute it to chance (depending, for example, on which wines were included in the dataset).

For the next two critics (Robert Parker, James Suckling), the points seem to be getting rather too far from the line. It is also worth noting that the majority of the points lie to the right of the line. This indicates that the correlations between Decanter and the other critics were greater before June 2012 than afterwards. That is, Decanter started disagreeing with the other critics to a greater extent after adopting the 100-point system than before; and it started disagreeing with Parker and Suckling even more than with the others.

However, what happens with the remaining two critics is quite striking. In the case of Jancis Robinson, before June 2012 Decanter agreed quite well with her wine-quality evaluations (correlation = 0.63), although slightly less than with the other critics (range 0.63-0.75). But afterwards, the agreement between Robinson and Decanter plummeted (correlation = 0.36). The situation for Antonio Galloni is the reverse: the correlation value went up, instead (from 0.32 to 0.56). In the latter case, this may be an artifact of the data, because only 13 of Galloni's wine evaluations before June 2012 could be compared to those of Decanter (so the estimate of 0.32 may be subject to great variation).

For the Cardebat & Paroissien analyses, both Jancis Robinson and Antonio Galloni have the lowest average correlations with all of the other critics, with 0.46 and 0.45, respectively, compared to a range of 0.58-0.68 for the others. So, in this dataset there is a general disagreement between these two people and the other critics, and also a strong disagreement with each other (correlation = 0.17). It is thus not something that is unique to Decanter, but it is interesting that the situation changed so dramatically when Decanter swapped scoring schemes.

7 comments:

"We do expect that a 20-point scale and a 100-point scale should be inter-changeable in some simple way, when assessing wine quality. However, there is actually no intrinsic reason why this should be so. ..."

Those who use the UC Davis 20-point scale are obliged to address how a wine scores against a checklist of "components."

"The Davis system is quite straightforward. It assigns a certain number of points to each of ten categories [components] which are then totaled to obtain the overall rating score for a given wine:

In Wine Spectator, wines are always rated on a scale of 100. I assume you assign values to certain properties [components] of the wines (aftertaste, tannins for reds, acidity for whites, etc), and combined they form a total score of 100. An article in Wine Spectator describing your tasting and scoring procedure would be helpful to all of us.

(Signed)

Thierry Marc Carriou, Morgantown, N.Y.

Editor’s note: In brief, our editors do not assign specific values to certain properties [components] of a wine when we score it. We grade it for overall quality as a professor grades an essay test. We look, smell and taste for many different attributes and flaws, then we assign a score based on how much we like the wine overall."

I would be much happier in my professional life if I were never required to assign a score to a wine.

. . .

Even I have to admit, however, that scores have their uses. . . . however much we professionals may feel our beloved liquid is too subtle to be reduced to a single number.

I find myself using all sorts of different scoring systems depending on the circumstances. . . .

In most of my tasting and writing I don't really need scores. . . .

I like the five-star system used by Michael Broadbent and Decanter magazine. Wines that taste wonderful now get five stars. Those that will be great may be given three stars with two in brackets for their potential. . . .

I know that Americans are used to points out of 100 from their school system so that now they, and an increasing number of wine drinkers around the world, use points out of 100 to assess wines. Like many Brits, I find this system difficult to cope with, having no cultural reference for it.

[Note her discomfort with the 100-point scale. ~~ Bob]

So, I limp along with points and half-points out of 20, which means that the great majority of wines (though by no means all) are scored somewhere between 15 and 18.5, which admittedly gives me only eight possible scores for non-exceptional wines -- an improvement on the five star system but not much of one. (I try when tasting young wines to give a likely period when the wine will be drinking best, so I do cover the aspect of its potential for development.)

One would "think" that finding a reproducible version of the UC Davis 20-point scoring scale would be easy on the Web.

Not so . . .

Found here at Wines.com (February 18, 2011):

“UC Davis scoring system”

Link: http://www.wines.com/wiki/uc-davis-scoring-system/

"The Davis system was developed by Dr. Maynard A. Amerine, Professor of Enology at the University of California at Davis, and his staff in 1959 as a method of rating the large number of experimental wines that were being produced at the university.

The Davis system is quite straightforward. It assigns a certain number of points to each of ten categories which are then totaled to obtain the overall rating score for a given wine."

“A Better Wine Scorecard?” Napa Valley College's new wine scoring system objectively analyzes wine while also allowing for relevant notes on wine style, character, aging, cost and where the wine can be purchased.

David has apprised me via e-mail that the 1976 revision of the original 1959 UC Davis scale drops Volatile Acidity, adds 2 points to Aroma & Bouquet, and splits the 2 points for Bitterness into 1 point each for Bitterness and Astringency.

So reworking the numbers to arrive at the "modified" UC Davis scale:

APPEARANCE (2 points)
COLOR (2 points)
AROMA and BOUQUET (6 points)
VOLATILE ACIDITY (0 points -- deleted)
TOTAL ACIDITY (2 points)
SWEETNESS (1 point)
BODY (1 point)
FLAVOR (2 points)
BITTERNESS (1 point)
ASTRINGENCY (1 point)
GENERAL QUALITY (2 points)
TOTAL RANKING (20 points)
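As a quick sanity check on the reworked numbers, the modified components do still total 20 points (the point values below are transcribed from the comment above, not from an official UC Davis source):

```python
# The "modified" (1976) UC Davis scale, as reworked in this comment;
# Volatile Acidity has been dropped entirely.
modified_davis = {
    "Appearance": 2,
    "Color": 2,
    "Aroma and Bouquet": 6,
    "Total Acidity": 2,
    "Sweetness": 1,
    "Body": 1,
    "Flavor": 2,
    "Bitterness": 1,
    "Astringency": 1,
    "General Quality": 2,
}
print(sum(modified_davis.values()))  # 20
```

The changes are self-cancelling: removing Volatile Acidity loses 2 points, the extra 2 points on Aroma and Bouquet restore them, and splitting Bitterness into two 1-point categories leaves the total unchanged.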

("Opine on wine": it shouldn't be this difficult to find the current UC Davis scale -- somewhere -- at the university's department of viticulture and enology program website, or on the Web.)
