This is likely to be a minority interest. It relates to the SF-36 questionnaire, which is used a lot in ME/CFS research, so I think a few people (e.g. those who have been interested in discussing various PACE Trial papers) may find it of interest.

However, I don't intend to write a big summary.

Comparison of methods for the scoring and statistical analysis of SF-36 health profile and summary measures: summary of results from the Medical Outcomes Study.

Med Care. 1995 Apr;33(4 Suppl):AS264-79.

Ware JE Jr, Kosinski M, Bayliss MS, McHorney CA, Rogers WH, Raczek A.

Source
Health Institute, New England Medical Center, Boston, MA, USA.

Abstract

Physical component summary (PCS) and mental component summary (MCS) measures make it possible to reduce the number of statistical comparisons and thereby the role of chance in testing hypotheses about health outcomes.

To test their usefulness relative to a profile of eight scores, results were compared across 16 tests involving patients (N = 1,440) participating in the Medical Outcomes Study.

Comparisons were made between groups known to differ at a point in time or to change over time in terms of age, diagnosis, severity of disease, comorbid conditions, acute symptoms, self-reported changes in health, and recovery from clinical depression.

The relative validity (RV) of each measure was estimated by a comparison of statistical results with those for the best scales in the same tests.

Differences in RV among scales from the Medical Outcomes Study 36-Item Short-Form Health Survey (SF-36) were consistent with those in previous studies.

One or both of the summary measures were significant for 14 of 15 differences detected in multivariate analyses of profiles and detected differences missed by the profile in one test.

Most of these scores are relative validity scores (all except the MANOVA F row). If there is a gap, it means the result wasn't statistically significant. The higher the relative validity coefficient, the better the measure is.

So what this shows is that the subscale which is most valid varies from one condition/symptom cluster to another.
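
For anyone who wants to see the arithmetic: as far as I understand it, relative validity in this kind of analysis is usually worked out by dividing each scale's F statistic by the F statistic of the best scale in the same test, so the best scale scores 1.0 and the rest fall below it. The abstract itself only says RV comes from "a comparison of statistical results with those for the best scales", so treat the ratio as my assumption. The F values below are made up purely for illustration and are not from the paper.

```python
def relative_validity(f_by_scale):
    """Divide each scale's F statistic by the largest F in the same test.

    Assumption: RV = F(scale) / F(best scale). The abstract does not
    spell out the formula, so this is a sketch, not the paper's method.
    """
    best = max(f_by_scale.values())
    return {scale: f / best for scale, f in f_by_scale.items()}


# Hypothetical F statistics for one test (NOT taken from the paper).
example_f = {"PF": 40.0, "MH": 10.0, "PCS": 30.0}
rv = relative_validity(example_f)
print(rv)
```

On these invented numbers PF would be the "best" scale for that test, and PCS would have picked up three quarters as much of the group difference.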

An F score checks for variation between groups, relative to the variation within them. The way I recall understanding it is this: if one had a certain number of trees (or sectors) and things grew around them, one could check whether certain sectors were better or worse for growth (to take one measurement). Some sectors might have better soil, light, water, etc. So it checks for variations.
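
To make the sectors idea concrete, here is a minimal sketch of a one-way ANOVA F statistic in plain Python. It is only meant to illustrate the idea; the paper's own analyses are more involved (e.g. the multivariate MANOVA F mentioned above). The function name and the growth figures are hypothetical.

```python
def one_way_f(groups):
    """One-way ANOVA F: between-group variance over within-group variance."""
    k = len(groups)                                  # number of groups (sectors)
    n_total = sum(len(g) for g in groups)            # total number of measurements
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # How far each group's mean sits from the overall mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # How much the measurements scatter inside their own group.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Hypothetical growth measurements in three sectors.
sectors = [[10, 12, 11], [20, 22, 21], [30, 32, 31]]
print(one_way_f(sectors))
```

A large F means the sectors differ from each other by much more than plants differ within a sector; an F near zero means the sectors look much the same.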

There are other tables in the paper that are also interesting, but I don't have time at the moment to do a proper summary.