Validity and reliability of MOS short form health survey (SF-36) for use in India

Richa Sinha1, Wim J A van den Heuvel1, Perianayagam Arokiasamy2
1 Department of Community and Occupational Health, Research Institute SHARE, University Medical Centre Groningen, University of Groningen, Groningen, Netherlands
2 Department of Development Studies, International Institute for Population Sciences (IIPS), Mumbai, India

Background: Health is defined as a state of complete physical, mental and social well-being and not merely the absence of disease or infirmity. In order to measure health in the community, a reliable and validated instrument is required.

Objectives: To adapt and translate the Medical Outcomes Study Short-Form Health Survey (SF-36) for use in India, to study its validity and reliability and to explore its higher order factor structure.

Materials and Methods: Face-to-face interviews were conducted with 184 adult subjects by two trained interviewers. Statistical analyses for establishing item-level validity, scale-level validity and reliability and tests of known group comparison were performed. The higher order factor structure was investigated using principal component analysis with varimax rotation.

Results: The questionnaire was well understood by the respondents. Item-level validity was established using tests of item internal consistency, equality of item-scale correlations and item-discriminant validity. Tests of scale-level validity and reliability performed well, as all the scales met the required internal consistency criteria. Tests of known group comparison discriminated well across groups differing in socio-demographic and clinical variables. The higher order factor structure was found to comprise two factors, with factor loadings similar to those observed in other Asian countries.

Conclusion: The item- and scale-level statistical analyses supported the validity and reliability of SF-36 for use in India.

How to cite this article: Sinha R, van den Heuvel WJ, Arokiasamy P. Validity and reliability of MOS short form health survey (SF-36) for use in India. Indian J Community Med 2013;38:22-6

How to cite this URL: Sinha R, van den Heuvel WJ, Arokiasamy P. Validity and reliability of MOS short form health survey (SF-36) for use in India. Indian J Community Med [serial online] 2013 [cited 2019 May 25];38:22-6. Available from: http://www.ijcm.org.in/text.asp?2013/38/1/22/106623

Introduction

Health is defined as a state of complete physical, mental and social well-being and not merely the absence of disease or infirmity. [1] Therefore, the patient's perspective on his/her state of health is recognized as an important parameter for assessing health outcomes and the efficacy of an intervention. Validated and reliable self-reported questionnaires [2],[3] are needed to enable this assessment.

The Medical Outcomes Study Short-Form Health Survey (SF-36) [4] is a widely used generic health-related quality of life (QoL) instrument consisting of 36 questions and measuring health in eight dimensions: physical functioning (PF), role limitations due to physical health problems (RP), bodily pain (BP), social functioning (SF), general mental health covering psychological distress and well-being (MH), role limitations due to emotional problems (RE), vitality, energy and fatigue (VT) and general health perceptions (GH). SF-36 has been adapted and translated into several languages, [5],[6] and its validity and reliability established in several countries. [7]

SF-36 has been used in India to assess health outcomes in several diseased populations. [8],[9],[10],[11],[12],[13],[14],[15] However, no studies of the validity and reliability of SF-36 in the general Indian population were found in electronic scientific databases. The primary objectives of this study were to adapt and translate SF-36 for use in India and to study its validity and reliability. Additionally, the study aimed to explore the higher order factor structure of the eight SF-36 scales.

Materials and Methods

Adaptation and translation

Conceptually equivalent cultural adaptations [5] were made to the questionnaire to make it suitable for the Indian context. In PF01, the vigorous activity of lifting heavy objects was exemplified by lifting a bucket of water to make the question easier to understand. In PF02, moderate activities such as moving a table, pushing a vacuum cleaner, bowling or playing golf were replaced by sweeping with a broom, shopping for groceries and cooking, as these activities can be considered equivalently moderate.

Climbing several flights of stairs was explained as climbing five to six floors (PF04), and one flight of stairs was expressed as one floor (PF05). One mile was expressed as 1 km (PF07) in accordance with the metric standards in India. A block was represented by a distance of 100 m (PF08 and PF09). Words like pep, dumps and blue in VT01, MH02 and MH04, respectively, were translated appropriately to capture their essence.

The questionnaire items were translated from English into Hindi by a professional experienced in translating health survey questionnaires. The translated version was back-translated into English by an independent person followed by a review to control for possible discrepancies. The final translated version evolved following an iterative process, [6] which included pilot-testing and incorporating appropriate changes.

Data collection

The study sample was selected from Mumbai, its suburbs and a village in its vicinity using purposive sampling. Face-to-face interviews were conducted by two trained interviewers in residential areas, industrial estates, business centers, offices and marketplaces by selecting the units systematically and interviewing the people present at the time of the visit.

The Institutional Review Board of the institute of one of the co-authors in India approved the study. Subjects aged 18 years and above who could communicate in Hindi were included. The study purpose was explained to the subjects, and interviews were performed after obtaining written informed consent. Two people did not participate in the study, one because of lack of time and the other because of psychological distress following the death of a family member.

Statistical analyses

The statistical analyses were performed using Statistical Package for Social Sciences (SPSS), version 15. [16]

Tests of item internal consistency, equality of item-scale correlations and item-discriminant validity [17] were carried out to establish item-level validity. Item internal consistency is considered satisfactory if an item correlates 0.40 or more with its hypothesized scale after correction for item-scale overlap. Equality of item-scale correlations is considered satisfied when items have similar item-scale correlations. Item-discriminant validity is supported if the correlation between an item and its hypothesized scale is significantly higher than the correlations between that item and the other scales.
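The 0.40 criterion can be illustrated with a minimal Python sketch. The item scores below are hypothetical, not study data, and the function names are illustrative. The correction for overlap is done here by correlating each item with the sum of the remaining items in its scale, which is one common approach and an assumption about how the correction was applied in the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def corrected_item_scale_correlations(items):
    """items: one list of scores per item, all for the same respondents.

    Returns, for each item, its correlation with the sum of the OTHER
    items in the scale (the item itself is left out of the total).
    """
    n_resp = len(items[0])
    result = []
    for i, item in enumerate(items):
        rest = [
            sum(col[r] for j, col in enumerate(items) if j != i)
            for r in range(n_resp)
        ]
        result.append(pearson(item, rest))
    return result


# Hypothetical responses: three items of one scale, six respondents.
scale_items = [
    [1, 2, 3, 4, 5, 3],
    [2, 2, 3, 5, 5, 4],
    [1, 3, 3, 4, 4, 3],
]
corrs = corrected_item_scale_correlations(scale_items)
meets_criterion = [r >= 0.40 for r in corrs]
```

Item-discriminant validity could be checked in the same way by also correlating each item with the totals of the other scales and comparing the two sets of coefficients.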

Scale-level validity and reliability were assessed with three checks. First, the SF-36 scale scores had to show substantial variability. Second, reliability estimated using the internal consistency method (Cronbach alpha coefficient) had to be acceptable, namely 0.70 or higher for group comparisons. [18] Third, the internal consistency of each scale had to be higher than the correlation between that scale and the other scales, which tests whether each scale measured a distinct health concept. [17]
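As a rough illustration of the internal consistency method, the sketch below computes the Cronbach alpha coefficient from its standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of the total score). The data and the function name are hypothetical; the study itself used SPSS.

```python
from statistics import variance


def cronbach_alpha(items):
    """items: one list of scores per item, all for the same respondents."""
    k = len(items)
    # Total scale score per respondent (sum across items).
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical responses: three items of one scale, six respondents.
scale_items = [
    [1, 2, 3, 4, 5, 3],
    [2, 2, 3, 5, 5, 4],
    [1, 3, 3, 4, 4, 3],
]
alpha = cronbach_alpha(scale_items)
acceptable_for_group_comparison = alpha >= 0.70
```

For these made-up scores alpha is about 0.95, which would satisfy the 0.70 criterion for group comparisons.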

Tests of known group comparison were performed to assess the ability of SF-36 to distinguish between groups differing in factors known to affect QoL. Gender, age, residential area and comorbidities were used as grouping variables. Non-parametric tests (Mann-Whitney U or Kruskal-Wallis one-way ANOVA) were performed to test the statistical significance of observed differences between the groups.
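For a two-group comparison such as gender, the Mann-Whitney U statistic can be sketched in plain Python as below. The scores and function names are hypothetical, tied values receive average ranks, and the p-value calculation is omitted; this is an illustration of the statistic, not the SPSS procedure the authors ran.

```python
def midranks(values):
    """Rank values 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b); U_a counts pairs where a > b, ties count 0.5."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = midranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Hypothetical scale scores (0-100) for two groups of five respondents.
group_1 = [100, 95, 90, 85, 100]
group_2 = [80, 85, 70, 60, 75]
u_1, u_2 = mann_whitney_u(group_1, group_2)
```

The smaller of the two U values is the one usually compared against critical values or converted to a p-value.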

Exploratory factor analyses were performed on the eight SF-36 scales using principal component analysis with varimax rotation [19] to examine the existence of higher order factor structure in the Indian general population.
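For reference, varimax rotation seeks the orthogonal rotation of the retained components that maximizes the variance of the squared loadings within each component. The raw form of the criterion is commonly written as below; the symbols are notation introduced here for illustration (p is the number of scales, eight in this study, m is the number of retained components, and lambda_ij is the loading of scale i on rotated component j), and whether the normalized variant was used is not stated in the article.

```latex
V = \sum_{j=1}^{m} \left[ \frac{1}{p} \sum_{i=1}^{p} \lambda_{ij}^{4}
    - \left( \frac{1}{p} \sum_{i=1}^{p} \lambda_{ij}^{2} \right)^{2} \right]
```

Because the rotation is orthogonal, the communality of each scale (the sum of its squared loadings across components) and the total variance explained by the retained components are unchanged by it.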

Results

Sample characteristics

[Table 1] summarizes the socio-demographic and clinical attributes of the study population. The population ranged in age from 18 to 76 years, with a mean age of 37.44 years.

Table 1: Socio-demographic and clinical characteristics of the study population

[Table 2] summarizes the descriptive statistics for the eight SF-36 scales. Within each scale, the correlations between items and their hypothesized scale were of similar magnitude. Item internal consistency and item-discriminant validity criteria were met for most of the scales. The floor effect was less than 10% for all the scales except RP and RE. A ceiling effect of more than 50% was observed for all the scales except GH, VT and MH. For all the scales, the internal consistency (Cronbach alpha) equaled or exceeded 0.70 [Table 3]. The internal consistency of each scale exceeded the correlations between that scale and the other scales.
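Floor and ceiling effects of the kind reported above can be computed as in the following sketch. It assumes the usual definition, namely the percentage of respondents at the lowest and highest possible scale score, and the standard 0-100 SF-36 scale range; the scores and the function name are hypothetical.

```python
def floor_ceiling_percent(scores, minimum=0, maximum=100):
    """Percentage of respondents at the lowest and highest possible score."""
    n = len(scores)
    floor = 100 * sum(1 for s in scores if s == minimum) / n
    ceiling = 100 * sum(1 for s in scores if s == maximum) / n
    return floor, ceiling


# Hypothetical transformed scale scores for ten respondents.
scores = [0, 100, 100, 50, 75, 100, 25, 100, 100, 100]
floor_pct, ceiling_pct = floor_ceiling_percent(scores)
```

With these made-up scores, one respondent in ten is at the floor and six in ten are at the ceiling.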

[Table 4] summarizes the tests for known group comparisons. Females scored lower on all eight scales, with the differences being statistically significant for PF, RP, VT, MH and GH. Across age groups, the range of scores on the physical health scales (PF, RP, BP and GH) was wider than that on the mental health scales (RE, VT, MH and SF). Scales associated with mental health showed significantly different scores across residential areas, whereas only one physical health scale (RP) differed significantly. Scores on all the scales were lower in the sub-group with comorbidities, and these differences were significant for all the scales except RE.

Exploratory factor analysis of the eight SF-36 scales yielded a two-factor solution [Table 5]. PF, RP and BP correlated more with factor 1, whereas SF, RE and MH correlated more with factor 2. VT and GH scales correlated moderately with both the factors. The two principal components explained 63.42% of the total variance, which was within the prescribed range. [20],[21] The obtained two-factor structure was similar to that obtained in other studies (factor 1 as "physical" and factor 2 as "mental" domains of health underlying the SF-36). [22]
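The explained-variance figure can be read as follows, assuming the principal component analysis was run on the correlation matrix of the eight scales (an assumption, since the article does not state it). Each standardized scale then contributes one unit of variance, so the total variance is 8 and the eigenvalues of the two retained components, written here as lambda_1 and lambda_2, satisfy:

```latex
\frac{\lambda_1 + \lambda_2}{8} = 0.6342
\quad\Longrightarrow\quad
\lambda_1 + \lambda_2 = 8 \times 0.6342 \approx 5.07
```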

Discussion

The study sample can be considered adequate because a sample of 100 or more has been recommended for reliability and validity studies [23],[24] and for second-order analysis of the SF-36 scales. [6],[25] SF-36 translation and validation studies in several languages [26],[27],[28],[29] have been carried out using similar sample sizes. The overrepresentation of males (72%) in the study population is not an issue, as the study did not aim to generate normative QoL data.

SF-36 was adapted to make it suitable for use in India. Similar adaptations have been made when translating SF-36 for use in other countries, in order to preserve the conceptual meaning of the original question and make it relevant within each country and language. In Iran, [30] in the items regarding activities, bowling and playing golf were changed to light sport activities, mile was changed to kilometer, and walking one block or several blocks was changed to walking one alley or several alleys to refer to a similar distance. In the United Kingdom, [31] walking one block was changed to walking 100 yards, and downhearted and blue was changed to downhearted and low. In Sweden, [31] walking in the forest or gardening was used as an example of moderately strenuous activity instead of bowling or playing golf.

The adapted and translated version was well understood by the respondents, as missing data were low (1%) at both item and scale levels and all the response choices were used. The suggested Cronbach alpha value of 0.70 for group-level comparison [7] was met for all the scales, thus supporting item homogeneity and internal consistency across scales. Tests of known group comparison discriminated well among groups differing in socio-demographic and clinical variables. These findings were similar to those reported in other studies. [30],[32],[33],[34],[35] In Greece, [32] women reported worse health than men. Moreover, age was an important health status factor, affecting physical health relatively more than mental health. Also, in Iran, [30] SF-36 discriminated well between sub-groups of people differing in gender and age. The findings demonstrated that women and older people had poorer health than men and younger people.

Exploratory factor analyses were performed because the higher order factor structure of SF-36 has not been established in the general Indian population. [25] The higher order factor structure was found to be similar to that in other countries. [22] The two factors explained 63.42% of the total variance of the SF-36 scale scores and 68-97% of the reliable variance of each scale. SF and MH showed, respectively, lower and higher association with the physical domain, in congruence with the findings from other Asian countries. [26],[27],[30],[36]

To conclude, the item- and scale-level analyses supported the validity and reliability of the translated and adapted version of SF-36 for use in India. Studies with larger samples that are representative of the general Indian population are encouraged in order to generate normative QoL data.

de Vet HC, Adèr HJ, Terwee CB, Pouwer F. Are factor analytical techniques used appropriately in the validation of health status questionnaires? A systematic review on the quality of factor analysis of the SF-36. Qual Life Res 2005;14:1203-18.