It would be very rare that some judges would have skater A somewhat higher on all five components, and the other judges would have skater B higher on all components.

Originally Posted by Mathman

Thanks to randomization of judges' scores, we do not know whether this is rare or common. My intuition is that it is not rare at all.

Here are all the protocols for US Nationals. Scores are not anonymous or randomized -- judge #1 on the officials list is always judge #1 for all skaters, etc.

Last I heard that was also true for the JGP, if you think international events are a better example.

For any two skaters (not necessarily near the top or even adjacent in the standings) in any event can we find examples in which
1) a majority of judges thought that skater A was better than (or equal to) B on all 5 components
and
2) all the remaining judges thought that skater B was better than (or equal to) A on all 5 components.

I.e., not even one judge had one component reversed from their overall opinions of the two skaters' relative PCS quality. I'll allow ties on some of the components.
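For anyone who would rather not do this by hand, the test above could be sketched in Python roughly like this. The function name, the dictionary layout (judge number mapped to that judge's five component marks), and the marks themselves are all made up for illustration; they are not taken from any real protocol. A judge who has the two skaters tied on every component is counted toward whichever side is larger, which is my own reading of "I'll allow ties."

```python
from typing import Dict, List

# Hypothetical layout: judge id -> that judge's five component marks for one skater.
Marks = Dict[str, List[float]]


def is_clean_split(a: Marks, b: Marks) -> bool:
    """True if every judge prefers one skater (or ties) on all five components,
    both skaters have at least one judge in their camp, and one camp
    (plus any all-tied judges) forms a majority of the panel."""
    a_camp, b_camp, tied, mixed = [], [], [], []
    for judge in a:
        diffs = [x - y for x, y in zip(a[judge], b[judge])]
        if all(d == 0 for d in diffs):
            tied.append(judge)        # identical marks on every component
        elif all(d >= 0 for d in diffs):
            a_camp.append(judge)      # A higher or equal everywhere
        elif all(d <= 0 for d in diffs):
            b_camp.append(judge)      # B higher or equal everywhere
        else:
            mixed.append(judge)       # at least one component reversed
    majority = max(len(a_camp), len(b_camp)) + len(tied)
    return bool(not mixed and a_camp and b_camp and majority > len(a) / 2)


# Made-up three-judge panel: J1 prefers B throughout, J2 and J3 prefer A (J3 with ties).
skater_a = {"J1": [7.50, 7.25, 7.50, 7.50, 7.50],
            "J2": [8.00, 7.75, 8.00, 8.00, 8.00],
            "J3": [8.25, 8.00, 8.25, 8.00, 8.25]}
skater_b = {"J1": [7.75, 7.50, 7.75, 7.75, 7.75],
            "J2": [7.50, 7.50, 7.75, 7.75, 7.50],
            "J3": [8.00, 8.00, 8.00, 8.00, 8.00]}
print(is_clean_split(skater_a, skater_b))
```

Running it over every pair of skaters in an event would just be a loop over combinations of the protocol rows.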

It will be tedious to look for them. I'll take a quick look at the senior medalists to see if there are any examples there and report back if I find any.

ETA: In 24 head-to-head matchups among senior medalists in short and free programs for all disciplines, I found one example:
In the ladies' SP, 8 judges marked Gold higher than or equal to Edmunds in all components. Judge #1 marked Edmunds higher in all.

But anyway, now I am sorry that in illustrating the question I presented sample scores for only one component. This sent the discussion off on a tangent.

I agree this discussion has nothing to do with Hersh and little to do with anonymity. I was wondering a post or two ago whether to take it to a new thread -- it does interest me.

Fine with me if mods want to split off the last page or so of this thread.

I guess that is what the whole controversy comes down to. What is the purpose of a sports competition? Is it to see which competitor outperformed the other, or is it to decide which competitor did a better job of conforming to an objective standard?

The point of the competition is to see who outperformed.
However, the task of the judges under IJS is not to rank the skaters, vote for which skater they thought performed best, or choose who they think should finish higher. Unlike under 6.0, they're just supposed to score each skater independently.

With IJS it's possible to score skaters who have no one to compete against. This won't happen in international competition, but it does happen at some club competitions or even at some national championships of smaller federations: one skater (usually male) or team enters an event, no one else enters, or one or two others enter and then withdraw. The remaining skater has invested money to travel to the event, paid an entry fee for a club competition, or needs to skate and be scored to make a national title official.

With 6.0, the judges can write down whatever scores they like before the skater performs, and all ordinals will be 1s. The accountants could even print out the result sheets in advance. Or the judges could all sleep through the program then wake up and input random marks. All the judges need to do is rank the skaters, and with a field of 1 skater the result is literally a no-brainer.

With IJS, the tech panel calls the elements and the judges assign GOEs and PCS, based on what the skater actually does.
Their scores are not about ranking, but about evaluating the performance.

With IJS, even in large fields the tech panels' and judges' process is supposed to be about evaluating each performance independently. Then the numbers are added up and the skater with the highest total wins. But unlike with 6.0, none of the officials is tasked with deciding who should have the highest total.

Staying in the corridor has nothing to do with bunching the PCS tightly for each skater. It has to do with being not too far off from the other judges for each component. If the other judges spread out their marks, you had better do so, too, or you risk being outside the corridor on some of them.

For each of the five (5) Program Components, the Judge's corridor will be based on 1.50 Deviation Points (15,0% of the maximum 10.0 points per Component) between the score of a Judge and the calculated Judges' average score for the same Component, i.e. in total 7.50 Deviation Points for the 5 Program Components. Plus and minus Deviation Points are subtracted.

The example they give has Judge A giving a skater component scores of 4.00 4.00 6.25 7.25 7.00 on a panel with averages of 5.75 5.85 5.45 6.00 5.55. Judge A has a pretty extreme spread here but is "well within the allowed corridor" because the very low marks balance out the very high ones.

Because the ISU lets plus and minus deviation points cancel each other out for components (as opposed to GOEs, for which the plus and minus deviation points are added), spreading marks across the various components of the same skater can actually help a judge stay within the corridor. It works better than bunching the marks closely while marking in a different range than the rest of the panel.
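To make the arithmetic concrete, here is a rough sketch in Python using the Judge A example quoted above. It assumes, as I read the document, that the 7.50 limit applies to the total across the five components and that plus and minus deviations offset each other. The function names and the "bunched" judge who gives 7.50 across the board are my own inventions for comparison, not anything from the ISU.

```python
# Marks and panel averages from the ISU's Judge A example quoted above.
judge_a = [4.00, 4.00, 6.25, 7.25, 7.00]
panel_avg = [5.75, 5.85, 5.45, 6.00, 5.55]

# Hypothetical judge (my assumption): same mark on every component,
# but in a higher range than the rest of the panel.
bunched_judge = [7.50, 7.50, 7.50, 7.50, 7.50]

LIMIT = 7.50  # total Deviation Points allowed across the 5 components


def net_deviation(marks, averages):
    """Signed deviations summed, so highs and lows cancel (my reading of the PCS rule)."""
    return sum(m - a for m, a in zip(marks, averages))


def absolute_deviation(marks, averages):
    """Deviations summed without cancelling, shown only for comparison."""
    return sum(abs(m - a) for m, a in zip(marks, averages))


for name, marks in [("Judge A", judge_a), ("Bunched judge", bunched_judge)]:
    net = net_deviation(marks, panel_avg)
    total = absolute_deviation(marks, panel_avg)
    inside = abs(net) <= LIMIT
    print(f"{name}: net {net:+.2f}, absolute {total:.2f}, within corridor: {inside}")
```

Judge A's net deviation comes out to about -0.10 even though the individual deviations add up to 7.10 in absolute terms, while the made-up bunched judge nets +8.90 and lands outside the corridor.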

I'm not sure how many judges actually realize that though.

Is this true? Do you mean that the ISU officially encourages judges to do this?

As far as the ISU is concerned, all I know is what's in this document about judge evaluation. And the e-mail several years ago from a member of the assessment commission reminding other judges to evaluate Transitions independently.

I have heard US judges discuss the concept of spreading marks, as a good thing.

The scoring scale has to accommodate all skaters from beginners to world champions. There cannot be too much of a spread between the best skater in the world and the second best.

True.
As I understand it, the recommendation to spread marks between skaters means to use the whole range of marks as appropriate, regardless of the type of competition.

Just because a skater is entered in an ISU championship -- let's say Euros or 4Cs -- doesn't mean that they automatically deserve championship-level scores. Or that just because the vast majority of senior level skaters deserve scores in the 5s, 6s, maybe 7s, that judges should be limited to that range. At Euros or 4Cs you might well see the first place skater earning 9s and the last place skater earning 3s or even 2s for some components.

At Junior Worlds, 2s at the bottom of the field are more common but should only be given if warranted, if the skater is clearly below typical junior quality for that component. Very high scores are even rarer among juniors than seniors, but judges shouldn't go into the event thinking that they should cap their scores in the 6s just because this is a junior event -- if a great junior performance is just as good in some components as a senior performance that deserves 8s, then it should get 8s.

Sometimes the second-best skater in an event (in each judge's opinion) is pretty close to the best skater and should receive similar scores. And maybe the third, fourth, and fifth best as well. Sometimes the best skater in an event is in a class by him/herself and deserves much higher scores than the next best skater(s) in the field. Depends on how they skate -- their overall skill level, and how well they actually deliver on that day.

Spreading marks within the scores for a single skater means that just because the skater deserves a high score for Skating Skills doesn't mean they automatically deserve a high score for Transitions or Performance/Execution or Interpretation . . . or vice versa.

I think some fans want judges always to give large gaps between a skater's highest and lowest component.

As far as I can tell the ISU wants judges to give large gaps when the skater's skills are unbalanced from one component to another, and small gaps when the skater is at close to the same level in all component areas.

I don't think so. If the contest is close, the scores should be close together. If one skater is much better than the other then the scores should be farther apart.

Absolutely.

If a judge honestly believes that the best skater was significantly better than the next best (in their opinion), on one or all components, they should reflect that significant difference with scores more than 0.25 apart. If they think the skater was significantly better on all components, the larger gaps will add up to several points across 5 components.

If a judge honestly believes the two skaters were about equal on a component, she can give the same mark. Or give 0.25 more to the skater she thinks was slightly better. If she thinks skater A was slightly better than skater B on all components, that will add up to a full point or two on PCS as a whole.
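As a quick check on how a 0.25 gap per component adds up, here is a small Python sketch. The segment factors are the ones I remember being applied to PCS at the time (0.8 and 1.6 for ladies, 1.0 and 2.0 for men); treat them as my assumption rather than something stated in this thread, and the function name is just for illustration.

```python
COMPONENTS = 5

# PCS factors as I recall them (an assumption, not quoted from the thread).
FACTORS = {"ladies SP": 0.8, "ladies FS": 1.6, "men SP": 1.0, "men FS": 2.0}


def pcs_gap(per_component_gap, factor):
    """Gap in one judge's factored PCS when every component differs by the same amount."""
    return per_component_gap * COMPONENTS * factor


for segment, factor in FACTORS.items():
    print(f"{segment}: {pcs_gap(0.25, factor):.2f}")
```

With those factors, a 0.25 edge on every component works out to between 1.00 and 2.50 points on that judge's marks, depending on the segment.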

But really the judges shouldn't be comparing the skaters directly; they should be comparing each to their own mental standards. Ideally they should have a good internal sense of the difference between Good (7) and Very Good (8), and halfway between Good and Very Good (7.5) or between but closer to one or the other (7.25 or 7.75), and then match each performance to that mental image.

But ultimately, components are just numbers. They are not--in fact, cannot be--objective standards. What happens in the end is ranking skaters, because that determines the medals/placements everyone cares about. I don't think most judges are capable of keeping an objective scale in their heads. At some point they'll have to go, "Oh, I gave Edmunds 7.50, that means I need to give Gold 8.25 because she's better." If they don't do that, they'll run into fatigue from looking at so many competitors, and likely end up giving scores they don't truly believe in. (I think this might be a factor in why skaters who don't make the final group are low-balled. They're superior to the group they're in, but judges aren't comfortable giving out sudden 9s when the best they've given so far is a 7.50. They don't "need" 9s to place the skater ahead. But by the end of the night, judges are comfortable giving out 9s, thus potentially "screwing over" the earlier skater.)