SAMPLE HOMOGENEITY, RESPONSE SKEWNESS AND ACQUIESCENCE: A REPLY TO FEATHER

JOHN RAY

University of New South Wales

Feather (1980) claims that the low correlation between the positive and negative halves of the C-scale reported in an earlier paper may have been due to the homogeneity of the samples used. It is pointed out that (a) even highly skewed C-scale items do still, on at least some occasions, correlate well with other items; (b) in the C-scale, low positive-negative correlations are sometimes associated with generally high levels of inter-item correlation; (c) the C-scale does not collapse when applied to other homogeneous samples; and (d) the demographic homogeneity of the Ray and Pratt (1979) "homogeneous" samples was not associated with response homogeneity. It is concluded that Feather's suggested explanation for the occasional poor functioning of the C-scale is contra-indicated by such considerations.

Ray and Pratt (1979) reported or referred to several instances wherein the correlation between the positive and negative halves (rPN) of the Wilson (1973) Conservatism scale was found to be much lower than claimed by Wilson. Wilson (1974) claims levels of around -.7 (before reverse-scoring) whereas levels of around -.2 were reported. It was concluded that, on some occasions, the operation of acquiescent response set may seriously flaw the validity of the Wilson instrument.

In his follow-up to that paper, Feather (1980) hypothesizes that the low rPNs may have been due to the unusual homogeneity of the Ray and Pratt samples. He gives no detailed reasoning for this hypothesis, but simply reports two samples of his own wherein the rPNs were more satisfactory (levels of -.42 and -.54 for general population and student-based samples respectively). He relates these more satisfactory rPNs to the higher standard deviations of C-scale total score observed in his samples -- apparently regarding this as confirmation that the responses in the Ray and Pratt samples were unusually "homogeneous".

There are two basic problems with Feather's paper: (a) Feather appears to be mistaken in concluding that responses in the Ray and Pratt samples were unusually homogeneous, and (b) even if the responses in the Ray and Pratt samples had been homogeneous, it is not clear what effect this would have had on rPN.

HOMOGENEITY IN RAY AND PRATT SAMPLES

If anything, it would appear that it was the standard deviations in Feather's samples which were unusually large rather than the standard deviations in the Ray and Pratt samples being unusually low. Of the 31 administrations of the C-scale tabulated by Wilson (1973, p. 54) and referred to by Feather, 18 in fact have standard deviations lower than that of the Ray and Pratt conscript sample. In the context chosen by Feather himself then, the Ray and Pratt C-scale SDs were on the high side rather than on the low side. If SDs such as those reported by Feather are needed before the C-scale rPN becomes satisfactory, we must conclude that most of the C-scale administrations reported by Wilson himself were also as flawed as those reported by Ray and Pratt (1979). Yet we have Wilson's (1974, p. 27) general assurance that the rPNs observed in the administrations of the scale known to him were eminently satisfactory.

Feather points to two important reasons why the responses in the Ray and Pratt samples may have been homogeneous. The first is a limited range in the age of respondents. Yet it is not at all obvious that the Ray and Pratt samples were unusually limited in this respect. In fact the Ray and Pratt officer sample (age mean 30.6 years with SD of 6.5) would seem to be less homogeneous than the usual student sample to which the C-scale has been applied. Yet student samples such as the one studied by Cloud and Vaughan (1970), consisting in fact of 183 Introductory Psychology students, gave a C-scale rPN of .51 [1]. Regrettably, Cloud and Vaughan do not give the mean and SD of age in their sample, so the matter cannot be finally tested. One can only say that if age homogeneity is such a problem in administrations of the C-scale, then it must be a scale singularly unsuited to the (student) samples to which it is usually applied. It might also be noted that Ray (1980) tabulates rPNs for samples of New Zealand University students, Australian University students, South African University students and English Occupational Therapy students. All showed levels of rPN comparable to those reported by Feather. Are we to assume that these student samples were as heterogeneous in age as Feather's community samples? Although precise evidence is not available, it would seem distinctly unlikely.

The second reason Feather suggests for the Ray and Pratt responses being unusually homogeneous is that the "authoritarian/conservative" military environment might have skewed responses in a generally conservative direction. The fact is that neither of the Ray and Pratt Army samples did give particularly conservative responses. The means for both the conscript and officer samples given in Ray and Pratt (1979) were in fact very close to the theoretical midpoint for the scale. Both groups of respondents were, in other words, almost equally likely to endorse radical as conservative choices. Radical responses were overall almost exactly as common as conservative responses. This is very similar to what Wilson and Patterson (1968) and Ray and Wilson (1976) found for the general population. Australian Army personnel are, then, in no way particularly conservative on general social issues. The "homogeneity" of an Army sample is more apparent than real.

Feather also mentions the all-male character of the Ray and Pratt Army samples as a possible reason for the low rPN. Note, however, that in this Feather is assuming that sample homogeneity leads to response homogeneity. He is assuming that the responses of an all-male sample will be less varied than those of a mixed male/female sample. Yet his own data belie this. In both his samples, the SD of the C-scale total score for the mixed group is in fact less than that for the males alone (Feather, 1980, Table 1).

The argument concerning homogeneity has so far centred on total scores. Feather however also mentions homogeneity of item scores as an important consideration. Could it be that the homogeneity of the total scores does not adequately reflect homogeneity in responses to individual items, and that homogeneity at this level explains the effects observed in the Ray and Pratt samples? In considering this, it should be noted as an initial clarification, that what is at issue (the levels of rPN) is in fact the correlation between two whole halves of the scale when each is treated as a scale in its own right. It is in fact the variability of these scores that should properly be under examination. To date no-one (not even Feather) has provided any information on this question. It would seem likely, however, that the variability most closely allied to the variability of the two sub-scales would be the variability of their total.

Be that as it may, what we find when we re-analyse the Ray and Pratt data, item by item, is precisely the opposite of what Feather predicts.

Both the Ray and Pratt samples were in fact found to have several items highly skewed in one direction or the other and were, as such, items with reduced variability. They were items with "homogeneous" responses. When these items were deleted, however, the rPN, far from rising, actually dropped. For both samples, items showing means less than 1.3 or greater than 2.7 were deleted and all totals recomputed. In the officer sample, 13 items were deleted while 6 items were deleted in the conscript sample. The rPN for the officer sample dropped only marginally to -.224, but the rPN for the conscript sample dropped to -.159. The original values were -.226 and -.199. Although the drop on the officer sample is clearly not significant, the strong rise predictable from Feather's hypothesis is certainly nowhere to be found. It might also be of interest to note that the reliability (alpha) on both occasions dropped to .72 (from .75 and .74). The skewed items had in fact been of some benefit to the scale.
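The re-analysis just described can be sketched in a few lines of Python. This is only an illustrative sketch and not the program originally used: the function names, the 0-based item indices and the assumption that items are scored 1, 2 or 3 (with "negative" items reverse-scored as 4 minus the raw score) are all supplied here for illustration. Items with means outside 1.3 to 2.7 are dropped, rPN is computed on the raw (pre-reversal) half totals, and alpha is computed on the reverse-scored items.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def variance(values):
    """Sample variance (n - 1 in the denominator)."""
    n = len(values)
    m = sum(values) / n
    return sum((v - m) ** 2 for v in values) / (n - 1)


def cronbach_alpha(rows):
    """Coefficient alpha for a respondents-by-items table of scored items."""
    k = len(rows[0])
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def trim_and_rescore(rows, negative_items, lo=1.3, hi=2.7):
    """Drop items whose mean falls outside [lo, hi]; return kept items, rPN, alpha.

    rows           -- one list of raw item scores (assumed 1..3) per respondent
    negative_items -- set of 0-based indices of the reverse-keyed items
    """
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    kept = [j for j in range(k) if lo <= means[j] <= hi]

    # rPN: correlation of the two half totals BEFORE reverse-scoring.
    pos = [sum(r[j] for j in kept if j not in negative_items) for r in rows]
    neg = [sum(r[j] for j in kept if j in negative_items) for r in rows]
    rpn = pearson(pos, neg)

    # Alpha: computed on the items AFTER reverse-scoring the negative ones.
    scored = [[4 - r[j] if j in negative_items else r[j] for j in kept]
              for r in rows]
    return kept, rpn, cronbach_alpha(scored)
```

Because the cut-offs 1.3 and 2.7 are symmetric about the midpoint of 2, the same items are dropped whether the means are inspected before or after reverse-scoring.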

THE EFFECT OF HOMOGENEITY

As mentioned above, Feather gives little in the way of detailed reasoning to explain why homogeneous samples should have low rPNs. If the Ray and Pratt samples had in fact been homogeneous, Feather's paper would therefore have constituted little more than a counter-example showing that the C-scale does on some occasions function satisfactorily. Why sample homogeneity was picked out as the critical difference between the two sets of samples is not immediately clear.

It is nonetheless probably safe to assume that Feather had in mind the fact that any correlation will be restricted in range if there is little variability in the values that one or both of the variables concerned actually take. In the extreme case, no variability means no correlation. As well as restricted variability in response because of "homogeneous" ages in the Ray and Pratt samples, Feather also proposes in his first footnote that the respondents might have varied relatively little in their responses because of their military environment.

This line of reasoning, then, as already noted, relates response homogeneity to respondent homogeneity. It uses homogeneity in the sample to predict that responses by the sample will be homogeneous. This may not of course be true and in fact even in Feather's (1980) own samples, it is the more homogeneous (student based) sample that has the more heterogeneous C-scale scores (as shown in the C-scale total score SDs).

Furthermore, Feather's assumptions about skewness are not empirically correct. He reasons that because responses are skewed in a conservative direction, they must have low correlations with other items. Kirton (1978) appears to have reasoned similarly when, in producing his shortened C-scale, he eliminated all items showing highly skewed mean scores. It should be noted of course that a skewed set of scores is not always a set of scores with reduced variability. In the case of C-scale item scores however, there are only three possible scores and a highly skewed mean will therefore necessarily reflect an item with reduced variability. The only way a highly skewed mean can be attained is for most respondents to give the one answer. Thus, if we depict Feather's reasoning as follows: Sample homogeneity ---> skewed responses ---> responses with little variability ---> low inter-item correlations ---> low rPNs (where "--->" may be read as "leads to"), then the second arrow represents a correct inference and the first arrow represents a false inference (because the Ray and Pratt samples did not give unusually skewed responses overall even where they were homogeneous as to age and sex). The third and fourth arrows will be examined hereunder.
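The point that an extreme mean on a three-point item forces low variability can be made concrete. For any score bounded between a lowest value lo and a highest value hi, the (population) variance cannot exceed (hi - mean) x (mean - lo), a standard result known as the Bhatia-Davis inequality. The short sketch below assumes items scored 1 to 3; the function name is purely illustrative.

```python
def max_item_variance(mean, lo=1.0, hi=3.0):
    """Largest population variance possible for a bounded item with this mean.

    Uses the Bhatia-Davis bound: var <= (hi - mean) * (mean - lo).
    The 1..3 scoring range is an assumption made for illustration.
    """
    if not lo <= mean <= hi:
        raise ValueError("mean must lie within the scoring range")
    return (hi - mean) * (mean - lo)


# An item with a mean at the midpoint can vary far more than a skewed one.
for m in (2.0, 2.6, 1.3, 1.16):
    print(m, round(max_item_variance(m), 4))
```

An item with a mean at the midpoint of 2 can have a variance as high as 1, whereas the ceiling shrinks steadily as the mean approaches either end of the range.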

It has already, in fact, been shown above that skewed items in the Ray and Pratt samples did not lead to reduced inter-item correlations. The only remaining question in this connection would appear to be whether the Ray and Pratt samples were peculiar in this respect. Would the same thing be true where the C-scale was functioning more adequately (i.e. where the rPNs were higher)? The same thing is in fact true of all the samples tabulated in Ray (1980), all of which have satisfactory rPNs. Take, for instance, the sample of 55 students gathered by Barling (see Table 1). The rPN was -.55, but in his administration items 7 and 13 both showed quite skewed response distributions (the post-reversal means of 2.6 and 2.43 indicating that most respondents answered "No"; in the C-scale all odd-numbered items are "Radical" and are reverse-scored), yet the same items showed corrected item-total correlations of .43 and .46. They were, in other words, items which showed unusually high correlations with other items. Skewness does not automatically mean that an item is not working well. Even quite extraordinarily skewed items (such as no. 34 with a mean of 1.16 indicating that practically everyone answered "No") showed item to total correlations (.35 for item 34) which were at least average. Skewness is nothing like as big a problem as it has been assumed to be.

Feather's fourth implied "arrow" is also incorrect. Low inter-item correlations and low rPNs are not necessarily intimately associated. Note that if all correlations are depressed then the coefficient alpha should also be depressed. As Lord and Novick (1968) show, alpha can be conceived as a weighting of average inter-item correlation by the number of items. If inter-item correlations are generally low, then alpha should also be low (the number of items being constant). While this is true in some of the cases under discussion (e.g. both cases reported at length in Ray & Pratt, 1979), it is not true of all. The alpha for the student sample mentioned in Ray (1971) was .83, which is virtually identical to Feather's (1980) metropolitan sample finding. If, then, we have samples with similar alphas but vastly different positive-negative correlations (rPNs), we cannot assume that low inter-item correlations and low rPNs go together. One can even have high inter-item correlations with low rPNs.
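The dependence of alpha on the average inter-item correlation can be illustrated with the standardized form of alpha, k x r / (1 + (k - 1) x r), where k is the number of items and r the mean inter-item correlation. This form treats all items as having equal variance, so it only approximates the raw alpha quoted in the text. The function names are illustrative, and the value of 50 items for the full C-scale is an assumption that is not stated in the present paper.

```python
def standardized_alpha(k, mean_r):
    """Standardized alpha from the number of items and mean inter-item r."""
    return k * mean_r / (1 + (k - 1) * mean_r)


def mean_r_from_alpha(k, alpha):
    """Invert standardized alpha to recover the implied mean inter-item r."""
    return alpha / (k - alpha * (k - 1))


K_ITEMS = 50  # assumed length of the full C-scale (not given in this paper)

# Mean inter-item correlation implied by an alpha of .83 on K_ITEMS items.
implied_r = mean_r_from_alpha(K_ITEMS, 0.83)
print(round(implied_r, 3))
```

With the number of items held constant, two samples showing the same alpha must have much the same average inter-item correlation under this approximation, which is why similar alphas alongside very different rPNs count against the fourth arrow.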

VARIABILITY IN SUB-SCALE SCORES

As mentioned earlier, the place we should be looking for special "homogeneity" in scores is surely in the two vectors giving rise to rPN. If a low rPN is to be explained by low variability, then the positive and negative sub-scale scores that give rise to rPN should surely show this low variability. In Table 1 therefore, the SDs are given for both sub-scales in both the Ray and Pratt (1979) Army samples and the Ray (1980) samples.

TABLE 1

"Positive" and "Negative" sub-scale SDs from the Ray and Pratt (1979) Army samples and from the samples tabulated in Ray (1980)

The most notable thing in this table is the lack of any systematic connection between sub-scale SDs and rPN. In the two Barling samples, for instance, we see that the sample with the higher rPN in fact had a "Negative" sub-scale SD of less than half the "Negative" subscale SD of the other sample -- emphatically the opposite of what we would expect on Feather's hypothesis. A higher rPN went with less variability -- much less. On the other hand, it is true that the lowest SD was generated by one of the Ray and Pratt samples.

Perhaps the most constructive comparison possible from Table 1 is that between the SDs of the Ray and Pratt conscript sample and those of Pearson's sample. The SDs could hardly be more similar, yet the rPN in the Pearson sample is more than twice as big as that in the Ray and Pratt sample. Clearly, on this criterion also, "homogeneity" of the scores has nothing to do with rPN.

CONCLUSION

We must conclude that the responses of the Ray and Pratt (1979) samples of Army personnel were not especially homogeneous, and that even if they had been this would not have constituted an explanation of the low rPNs there observed. We are then once again thrown back on acquiescence as an explanation for the phenomenon.

---------------------------------

FOOTNOTE:

1. Cloud and Vaughan (1970) in fact give the rPN for their administration of the C-scale only after it has been "corrected" by the Spearman-Brown formula. We can however apply the said formula in reverse to get back to the original rPN. Thus: .68/(2 - .68) = .51.
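The arithmetic in this footnote can be checked with a minimal sketch. The Spearman-Brown correction for a scale of doubled length is 2r/(1 + r), and solving for r gives R/(2 - R), where R is the corrected value. The function names are illustrative only.

```python
def spearman_brown(r_half):
    """Step a half-scale correlation up to the full-length estimate."""
    return 2 * r_half / (1 + r_half)


def spearman_brown_inverse(r_full):
    """Recover the half-scale correlation from a corrected value."""
    return r_full / (2 - r_full)


# The corrected figure of .68 quoted above, taken back to the raw rPN.
print(round(spearman_brown_inverse(0.68), 3))
```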

REFERENCES
