Abstract

Background:
An absolute prerequisite to the effective management of dementia is its early diagnosis. Successful dementia screening requires precise and sensitive instruments that can be completed even by illiterate elderly people.

Objectives:
The aim of this study was to evaluate the psychometric properties of the Persian version of the cognitive state test (COST).

Materials and Methods:
This methodologic study was conducted in Kashan, Iran, during 2013 - 2014. A purposeful sample of 150 healthy elderly people and 50 elderly patients with dementia was recruited. After translating the instrument by using the standard forward-backward technique, we assessed its qualitative and quantitative face and content validity. The validity of the test was assessed by using the concurrent validity and the exploratory factor analysis. We also calculated Cronbach’s alpha and employed the test-retest method for evaluating the internal consistency and the stability of the test, respectively. Study data were analyzed by using the SPSS v16.0, the Spearman-Brown, and the intraclass correlation coefficient tests and the principal components factor analysis with varimax rotation.

Results:
The Persian COST consists of nineteen items. The impact scores, the content validity ratios and the content validity indices of all test items were greater than 4.5, 0.69, and 0.84, respectively. The COST had a significant correlation with the clinical dementia rating (rS = -0.76, P value < 0.001), indicating an acceptable concurrent validity for the test. The exploratory factor analysis revealed a five-factor structure that explained 60.59% of the total variance of the total cognitive state score. The Cronbach’s alpha, Spearman-Brown, and intraclass correlation coefficients were 0.82, 0.95, and 0.88, respectively (P value < 0.001).

Conclusions:
The Persian version of the COST can be used as a valid and reliable instrument for assessing cognitive state and screening dementia in literate and illiterate elderly people.

1. Background

Given the recent changes in the world population pyramid, aging is currently considered as one of the most critical health issues worldwide (1). It is estimated that the elderly population will grow from 600 million in 2000 to more than two billion in 2050 (2, 3). The Iranian elderly population is also increasing. It was five million in 2006 and has been estimated to double by 2019 (4).

Along with aging and age-related physiologic changes, people face a greater risk of developing other health conditions (5). Dementia is one of the most important age-related problems and affects many elderly people. The incidence of dementia doubles every five years between the ages of 65 and 90 (6-8). Dementia is a chronic and progressive brain disorder that is manifested mainly by the impairment of cognitive functions (9, 10).

Dementia is the fourth leading cause of death in developed countries (11). According to the United Nations, the prevalence of dementia in 2010 was 35.6 million people worldwide and it is estimated to double every 20 years (12, 13). Currently, there are no reliable statistics on the prevalence of dementia in Iran. In a local study conducted by Mohammadi et al. (2010), however, the prevalence of mild, moderate, and severe dementia was 7.8%, 5.7%, and 4.8%, respectively (14).

Dementia is associated with impaired quality of life, malnutrition, increased likelihood of being admitted to a nursing home, and elevated risk of developing additional health conditions. In addition to the afflicted individuals, dementia also significantly affects family members and caregivers. It compromises patients’ functional independence and hence heightens patients’ need for care. Accordingly, it requires family members and caregivers to spend more time and a bigger budget for providing care to their patients. Moreover, family members and caregivers of patients with dementia may experience various health problems such as anxiety and depression, burnout, social isolation, loneliness, physical problems, and sense of intense guilt (15, 16).

More than half of early-stage dementias remain undiagnosed (17, 18). Accordingly, early diagnosis is among the key issues relating to dementia (15, 19). Early diagnosis of dementia helps slow its progression, prevent or postpone functional disability, lower dementia-related medical costs, relieve the burden of the disease on family members, and postpone patients’ admission to nursing homes (20).

There are many instruments for the early diagnosis and screening of dementia (21, 22). The psychometric properties of some of these have been evaluated in Iran (23-25). These instruments, however, have several limitations. One limitation is that the scores of the instruments are significantly correlated with patients’ educational status. In other words, poorly educated people obtain inaccurate lower scores in these tests, while well-educated people who really have dementia or impaired judgment may be inaccurately ranked by these instruments as healthy. This limitation reduces the effectiveness of these instruments in diagnosing early-stage dementias. Another limitation of these instruments is that people who are going to use them need to receive specific training (18, 23-31). On the other hand, the clinical dementia rating (CDR) which is among the most valid dementia tests and is applicable to even illiterate people has numerous items and hence requires a great deal of time to complete (11).

One of the most recent dementia screening tests is the cognitive state test (COST). This test was developed by Babacan et al. (2013). The COST consists of eight domains: orientation, memory, attention, executive functions, language, agnosia, apraxia, and visuospatial function. It is a brief and simple test (32). Moreover, people with different levels of literacy can use and complete it. Contrary to other dementia screening tests that assess only cognitive function, the COST examines all aspects of dementia. Babacan et al. (2013) reported a Cronbach’s alpha of 0.86 for the COST. Moreover, the sensitivity, specificity, and positive and negative predictive values of the test at the cut-off point of 24.25 have been reported as 81%, 87%, 86%, and 83%, respectively (32). Given that more than 60% of Iranian elderly people are illiterate (33), the importance of early diagnosis of dementia, and the limitations of other dementia screening tests, the COST seems to be an appropriate instrument for dementia screening.

2. Objectives

This study aimed at evaluating the psychometric properties of the Persian version of the COST (P-COST).

3. Materials and Methods

This was a methodologic study. The study was conducted in Kashan, Iran, during 2013 - 2014. A demographic questionnaire, the geriatric depression scale (GDS), the CDR, and the COST were used for data collection. The GDS-15 was used for diagnosing false dementias. We also used the CDR for assessing the concurrent validity of the COST. This study was conducted in three phases: the preliminary phase (instrument translation), phase 1 (face and content validity assessment) and phase 2 (concurrent and construct validity and reliability assessment). These phases are explained below.

3.1. Preliminary Phase: Instrument Translation

In this phase, we first obtained permission to use the COST from its developers. Then we went through the standard forward-backward translation process (34) to translate the COST from English into Persian. Accordingly, three healthcare professionals who were skilled in English-Persian translation were invited to translate the COST independently. Then we assessed the three Persian translations of the COST, amended them (Table 1 and Figure 1), and unified them to generate a single P-COST. Thereafter, two specialists were asked to assess the congruence and the similarity of the P-COST with the original English COST. They confirmed that the English and the Persian versions of the COST were similar. Then the generated P-COST was edited by a Persian language editor. Finally, a medical specialist who had mastery of Persian-English translation was invited to back-translate the P-COST into English. This translator was independent of the three aforementioned ones. The generated English version of the COST was sent to the developers of the test and they were asked to assess the congruence between the original COST and the P-COST. They confirmed the similarity of the two versions.

Table 1. The COST Changes Made by the Researchers in the Translation Phase

| Original Domain and Item | Description | Changes and Their Causes |
| --- | --- | --- |
| Registration and recall | Measured by testing three words. These three words are: Melon, Anchovy, Green | The Persian equivalent of the word “Anchovy” is a compound word made up of two separate words and therefore cannot be used in this test, so it was replaced with the word “fish.” |
| Abstraction/judgment | Abstract thinking is measured by interpreting the following proverb: “A rolling stone gathers no moss” | For cultural adaptation, the mentioned proverb was replaced with an Iranian proverb: “Chick was counted at the end of autumn.” |
| Retrograde memory/general information | 1, what is the name of the founder and first president of the Republic of Turkey?; 2, what is the name of the current prime minister? | The two questions were specific to Turkey. For this reason, they were replaced with Iranian equivalents: 1, what was the name of the founder/the first leader of the Islamic Republic of Iran?; 2, what is the name of the current president of Iran? |
| Language/repetition | Ask the subject to repeat “above, beyond, and below.” | Because “beyond” is not a common word in the Persian language, this word was replaced with the word “behind.” |

Figure 1. The COST Changes Made by the Researchers in the Translation Phase in the Language/Naming Area

3.2. Phase 1: Face and Content Validity Assessment

After preparing the P-COST, we assessed its qualitative face validity by inviting 20 elderly people to complete the test and identify any problematic items. Elderly people from both genders and of diverse educational status were recruited. The questionnaires were completed by using the interview method. The problematic items were identified and revised. Then a panel of 20 experts in the medical sciences was invited to assess the quantitative face validity and the qualitative and quantitative content validity of the P-COST. They were asked to comment on the importance, relevance, clarity, simplicity, adequacy, scoring, and wording of the COST items. Accordingly, the impact score, the content validity ratio (CVR), and the content validity index (CVI) were calculated for each item. The total CVI of the P-COST was determined by calculating the means of the CVIs of all the items (35, 36). The calculated CVRs and CVIs were compared with the Lawshe’s table and Waltz and Bausell’s indices, respectively. Accordingly, the final version of the P-COST was prepared for reliability, construct, and concurrent validity assessments. The final P-COST consisted of nineteen items and it was completed by using the interview method. Scoring of the items in the questionnaire is based on their weight over a 0–4 range. Answering all items is mandatory. Overall, this tool determines a respondent’s cognitive state on a 0–30 scale. Higher scores reflect better cognitive states (32).
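The three item-level indices named above can be illustrated with a minimal sketch. The source does not spell out the formulas, so the conventions used here are assumptions: Lawshe's CVR as (n_e - N/2) / (N/2), the item CVI as the share of experts rating an item 3 or 4 on a 4-point scale, and the impact score as the proportion of raters scoring an item 4 or 5 on a 5-point importance scale multiplied by the mean importance score. All function names and the example ratings are hypothetical.

```python
def content_validity_ratio(n_essential, n_experts):
    """Lawshe's CVR: (n_e - N/2) / (N/2), ranging from -1 to 1."""
    half = n_experts / 2
    return (n_essential - half) / half


def item_cvi(ratings, threshold=3):
    """Item CVI: share of experts rating the item at or above the threshold.

    Assumes a 4-point scale where 3 and 4 count as favourable ratings.
    """
    return sum(1 for r in ratings if r >= threshold) / len(ratings)


def impact_score(importance_ratings, threshold=4):
    """Impact score: proportion of raters scoring >= threshold times mean score.

    Assumes a 5-point importance scale where 4 and 5 count as "important".
    """
    n = len(importance_ratings)
    proportion = sum(1 for r in importance_ratings if r >= threshold) / n
    mean_importance = sum(importance_ratings) / n
    return proportion * mean_importance


# Hypothetical ratings from a panel of 20 experts for a single item.
cvr = content_validity_ratio(n_essential=17, n_experts=20)
cvi = item_cvi([4] * 12 + [3] * 6 + [2] * 2)
impact = impact_score([5] * 19 + [4])
print(cvr, cvi, impact)
```

Under these assumed conventions, an item would be retained when its CVR exceeds the Lawshe critical value for the panel size and its impact score exceeds 1.5.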

The study sample size was determined by multiplying the number of COST items by ten (35). The convenience sampling method was used; accordingly, a purposeful sample of 200 elderly people was recruited. Study participants were recruited from university-affiliated public health centers and private neurology care clinics located in Kashan, Iran. We referred to the Medical Records Unit of the study setting and recruited 50 elderly people who had been diagnosed with early- or intermediate-stage dementias. The diagnosis of dementia had been established during the Elderly People’s Health Monitoring Program by neurologists and was based on computerized tomography and clinical interview findings. Moreover, 150 healthy elderly people who had been found to have no cognitive disorders were also recruited. The inclusion criteria were Iranian nationality, an age of 60 years or more, being able to understand and speak Persian, being able to respond to the COST items, and having no history of known psychosis, depression (a GDS score of below 5), or mental disability. Elderly people who chose to withdraw from the study or were reluctant to answer the COST items were excluded from the study.

In this study, we adhered to the ethical principles of the Declaration of Helsinki. The institutional review board and the ethics committee of Kashan University of Medical Sciences, Kashan, Iran, approved the study (approval code P/29/5/1/4536, date 04.02.2014). We provided detailed information about the aim and the flow of the study to the participants and assured them that they could withdraw from the study voluntarily. Moreover, we assured them of the confidentiality of their information. Written informed consent was obtained from either the participants or their family members.

The validity of the P-COST was evaluated by using the concurrent and construct validity assessment methods. The concurrent validity of the COST was evaluated by calculating the Spearman correlation coefficient between the COST and the CDR. The CDR contains 75 items in six domains: memory, orientation, judgment and problem solving, community affairs, home and hobbies, and personal care. The reliability and validity of the Persian CDR have been assessed and confirmed by Sadeghi et al. (2011). They reported a Cronbach’s alpha of 0.73 for the scale. Moreover, they assessed the reliability of the scale by using the test-retest method and reported an intra-class correlation coefficient (ICC) of 0.69 (11).
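As an illustration of the concurrent validity calculation, the following is a minimal sketch of a rank correlation between two sets of scores. It assumes that the coefficient intended is Spearman's rank correlation (the abstract reports it with the symbol rS) and that tied scores receive average ranks. The function names and the example scores are hypothetical and are not study data.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical screening-test scores and dementia ratings for five people.
test_scores = [30, 28, 25, 22, 18]
dementia_ratings = [0, 0.5, 1, 2, 3]
print(spearman(test_scores, dementia_ratings))
```

A negative coefficient is expected in this setting, because a higher dementia rating indicates greater impairment while a higher screening score indicates better cognition.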

We used the factor analysis method for evaluating the construct validity of the COST. The appropriateness of the factor analysis model and the sampling adequacy were assessed by employing Bartlett’s test of sphericity and the Kaiser-Meyer-Olkin (KMO) test, respectively. The principal components analysis with varimax rotation was then used for determining the factor structure of the COST. We used eigenvalues of greater than 1 and the scree plot for determining the number of factors. The minimum factor loading was considered to be 0.35. If an item loaded on more than one factor, the item was allocated to the factor on which it had the greatest loading, to prevent secondary factor loading.
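The item allocation rule described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure run in SPSS: loadings are compared by absolute value, an item whose largest loading is below 0.35 is left unassigned, and the item names and loading values are hypothetical.

```python
def assign_items(loadings, min_loading=0.35):
    """Allocate each item to the factor on which it loads most strongly.

    loadings maps an item name to its list of rotated loadings, one per
    factor. Returns a dict mapping each item to a 1-based factor number,
    or to None when no loading reaches min_loading in absolute value.
    """
    assignment = {}
    for item, row in loadings.items():
        best = max(range(len(row)), key=lambda f: abs(row[f]))
        assignment[item] = best + 1 if abs(row[best]) >= min_loading else None
    return assignment


# Hypothetical rotated loadings for three items on three factors.
example = {
    "item_a": [0.72, 0.40, 0.10],   # cross-loads; goes to factor 1
    "item_b": [0.20, 0.30, 0.10],   # below the cut-off on every factor
    "item_c": [0.05, -0.61, 0.33],  # strongest loading is negative
}
print(assign_items(example))
```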

The internal consistency of the COST was evaluated by calculating its Cronbach’s alpha. Moreover, the stability of the COST was assessed by using the test-retest method. Accordingly, 50 participants were randomly recruited from the study sample and asked to complete the test twice with a one-week interval in between. The test-retest Spearman and ICC were then calculated.
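For readers unfamiliar with the internal consistency index, the following is a minimal sketch of Cronbach's alpha using the standard formula alpha = k / (k - 1) × (1 - sum of item variances / variance of total scores). The function name and the example data are hypothetical. The sketch uses sample variances and is not a reproduction of the SPSS output.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items table of scores.

    scores is a list of rows, one per respondent, each holding the same
    number k of item scores (k must be at least 2).
    """
    k = len(scores[0])
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses of four people to a two-item scale.
responses = [[2, 3], [4, 4], [1, 2], [3, 5]]
print(round(cronbach_alpha(responses), 3))
```

Values closer to 1 indicate that the items vary together. When every item ranks respondents identically with equal spread, the index reaches 1.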

Study data were collected by a master’s student in geriatric nursing (the first author) and were analyzed by using SPSS v16.0. We used the Kolmogorov-Smirnov test for assessing the normality of the study variables. The P-COST scores of the two groups were compared by the Mann-Whitney U test with a significance level set at less than 0.05.

4. Results

4.1. Findings of the First Phase (Face and Content Validity Assessment)

In assessing the qualitative face validity, the invited 20 participants recommended some revisions to improve the simplicity and the readability of the items in five domains of the COST. These included registration memory, recall memory, retrograde memory and general information, language and understanding, and agnosia. Items were revised based on the participants’ comments. For instance, in the registration and recall memory domains, we substituted the word “green” with “yellow” because some participants associated the word “green” with the green color of a watermelon peel. A prerequisite to the soundness of the test is that there should be no relationship between the words in the registration and the recall memory domains. Moreover, some elderly participants had perceived the Persian equivalent of the word “green” (i.e. sabz) as the Persian equivalent of the word “vegetable” (i.e. sabzi) because the two words look alike.

The impact scores and the CVRs of all items were higher than 4.5 and 0.69, respectively. The CVIs of each item in terms of the relevance, simplicity, and clarity criteria were between 0.84 and 1. The total CVIs of the COST for the relevance, simplicity, and clarity criteria were 0.97, 0.98, and 0.98, respectively.

In the qualitative content validity assessment sub-phase, the invited experts recommended that we provide respondents with information about how to complete the orientation domain. The recommended information was included in the COST. Moreover, as the Persian equivalent of the word “watermelon” (i.e. hendevaneh) has four syllables, we substituted it with the Persian equivalent of the word “cucumber” (i.e. khiyar), which has only two syllables. Some minor revisions were also made in the wordings of the abstract thinking and judgment, the retrograde memory and general information, the agnosia, and the apraxia domains to simplify them and enhance their readability.

4.2. Findings of the Second Phase (Reliability, Construct, and Concurrent Validity Assessment)

Of the participants, 52.5% were female, 77% were married, and 25% had primary education. The means of participants’ age and GDS scores were 69.25 ± 7.27 and 1.41 ± 1.44, respectively (Table 2). The results of the concurrent validity assessment showed that the mean P-COST and CDR scores were 26.41 ± 3.70 and 0.50 ± 0.61, respectively. The Spearman correlation coefficient between the P-COST and CDR scores was -0.76 (P value < 0.001). The P-COST scores for the no dementia and dementia groups were 27.98 ± 2.05 and 21.68 ± 3.52, respectively (P value < 0.001).

The KMO test showed that the study sample was adequate (KMO = 0.692), and Bartlett’s test of sphericity showed that the correlation matrix was suitable for factor analysis (χ2 = 171, P value < 0.001). In factor analysis, five factors were extracted that accounted for 60.59% of the total variance of the P-COST score (Table 3). The scree plot also confirmed the five-factor structure of the P-COST (Figure 2). These five factors were named “orientation and memory”, “registry”, “agnosia and purposeful actions”, “language skills”, and “attention and visuospatial function”, respectively. Table 4 shows factor parameters of the five extracted factors.

Table 3. Eigenvalues and Amount of the Explained Variance by Each of the Extracted Factors in the Factor Analysis Stage

| Factor | Eigenvalue | %Variance |
| --- | --- | --- |
| 1 (6 items) | 3.54 | 18.659 |
| 2 (4 items) | 2.57 | 13.571 |
| 3 (4 items) | 1.99 | 10.514 |
| 4 (2 items) | 1.76 | 9.282 |
| 5 (3 items) | 1.62 | 8.569 |
| Total factors | NA | 60.595 |

Abbreviation: NA, not available.
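The relationship between the eigenvalues and the percentages in Table 3 can be checked with a short sketch. It assumes that each percentage equals the eigenvalue divided by the number of items (19), which holds when the analysis is run on a correlation matrix. Because the eigenvalues are printed with only two decimals, the recomputed figures differ slightly from the table (about 60.4% in total against the reported 60.595%). The variable names are hypothetical.

```python
# Eigenvalues as printed in Table 3 (two decimals only).
eigenvalues = [3.54, 2.57, 1.99, 1.76, 1.62]
n_items = 19  # number of P-COST items, i.e. the total standardized variance

# Kaiser criterion: keep only factors with an eigenvalue greater than 1.
retained = [ev for ev in eigenvalues if ev > 1.0]

# Share of total variance per retained factor, in percent.
percent_variance = [100 * ev / n_items for ev in retained]
total_explained = sum(percent_variance)

for number, pct in enumerate(percent_variance, start=1):
    print(f"Factor {number}: {pct:.2f}%")
print(f"Total: {total_explained:.2f}%")
```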

Figure 2. Scree Plot of P-COST

Table 4. Factor Loading of the Items in the Factor Extraction Stage by Factor Analysis

5. Discussion

This study evaluated the psychometric properties of the P-COST. Study findings revealed that the P-COST has acceptable validity and reliability.

In the translation phase, some changes were made to a number of domains. Changes were related mainly to the wording and cross-cultural adaptation of the items in such a way that the items would better suit the characteristics of Iranian elderly people. One of the key objectives of forward-backward translation is to generate an instrument that is understandable and culturally appropriate to the target population (34).

Regarding face validity, some revisions were made to the registry, recalling, retrograde memory and general information, language, and agnosia domains. Seif (2012) noted that the face of questions and items should be acceptable and reasonable to the target population; otherwise, respondents may complete the test either reluctantly or imprecisely (37). Polit et al. (2012) also referred to face validity as a key factor in persuading the target population to complete a test accurately. They also noted that face validity significantly affects the results of the intended test (35). The results of the quantitative face validity assessment also showed that all items of the P-COST had an impact score of higher than 4.5. According to Vakili et al. (2012), impact scores of more than 1.5 are considered as acceptable. Accordingly, all the items of the P-COST were maintained in the test (38).

In the process of qualitative content validity assessment, we made some changes to the orientation, registry and recalling, abstract thinking and judgment, retrograde memory and general information, agnosia, and apraxia domains to enhance the objectivity of the test and to minimize potential biases. Polit et al. (2012) mentioned that content validity assessment is essential for developing a valid instrument. They also added that expert review is the most common method for content validity assessment (35). The CVRs of all the items in the P-COST were higher than 0.69. According to Lawshe (1978), the acceptable CVR value for 20 experts is 0.42 (39). Accordingly, all the items of the P-COST were considered acceptable. The CVIs of all the items in the relevance, clarity, and simplicity areas were 0.84 - 1. Items that have a CVI of greater than 0.79 are considered suitable. When the CVI is 0.7 - 0.79, however, the corresponding item needs revision. Moreover, if the CVI of an item falls below 0.7, that item should be deleted. The total CVIs of the test in the three aforementioned areas were also higher than 0.9, indicating an acceptable content validity of the P-COST (35, 36).
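The CVI decision rule stated above can be written as a small sketch. The cut-offs come from the text, while the function name and the returned labels are hypothetical.

```python
def classify_cvi(cvi):
    """Apply the CVI cut-offs: above 0.79 keep, 0.70 to 0.79 revise, else delete."""
    if cvi > 0.79:
        return "suitable"
    if cvi >= 0.70:
        return "needs revision"
    return "delete"


# The lowest and highest item CVIs reported for the P-COST, plus two
# hypothetical values that fall in the other bands.
for value in (0.84, 1.0, 0.75, 0.60):
    print(value, classify_cvi(value))
```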

The correlation coefficient between the scores of the COST and the CDR was -0.76. The coefficient is negative because higher CDR scores indicate more severe dementia, whereas higher COST scores reflect better cognitive states. This is in line with the findings of Babacan et al. (2013), who reported that the correlation coefficient between the two tests was -0.77 (32). According to Polit et al. (2012), the minimum acceptable correlation coefficient for concurrent validity is 0.7 (35). Peers (2006) and Houser (2013), however, noted that a correlation coefficient higher than 0.5 indicates excellent concurrent validity (40, 41). The absolute value of the observed coefficient exceeds both thresholds. Accordingly, the concurrent validity of the P-COST is confirmed.

Our exploratory factor analysis revealed a five-factor structure for the P-COST, which explained about 61% of the total variance. Factors 1 and 2 (“orientation and memory” and “registry”) contributed the most to the total variance (18.659% and 13.571%, respectively). The most common signs of early dementia are impaired short-term memory and disorientation to time, place, and person (42). The items that loaded on factors 1 and 2 were also related to short-term memory and orientation. Accordingly, the large contribution of these two factors to the total variance is justifiable. The proportion of variance explained by each of the other factors was also greater than 5% (35). In exploratory factor analysis, one of the key factors contributing to construct validity is the total variance explained by the extracted factors. Accordingly, if the total variance explained by all the identified factors and the proportion of variance explained by each factor are greater than 60% and 5%, respectively, the instrument is considered to have construct validity (35). Other studies considered a total variance of 50% as acceptable (43, 44). Consequently, the five-factor P-COST has acceptable construct validity.

Salari et al. (2014) evaluated the psychometric properties of the Rowland universal dementia assessment scale and identified only one factor with an eigenvalue of greater than 1, which accounted for 44.88% of the total variance (45). Moreover, Noroozian et al. (2013) evaluated the Cleveland scale for activities of daily living for its psychometric properties and determined a three-factor structure that accounted for 47.76% of the total variance (46). These findings show that the total variance explained by the five extracted factors of the P-COST is greater than the total variance explained by the factor structure of these other cognitive assessment instruments. This suggests that the P-COST may be more effective in dementia screening than these instruments. The original version of the COST has an eight-factor structure (32), while the P-COST has a five-factor structure. Babacan et al. (2013) did not use factor analysis for extracting the factor structure of the COST; rather, they developed the test and identified its domains based on predetermined concepts (32). According to Cooper and Greene (2005), cognitive assessment instruments should encompass six aspects of cognitive function, including orientation, registry, three-dimensional skills and executive functions, language, attention, and memory (47). The five-factor structure of the P-COST encompassed all six of these aspects.

The total Cronbach’s alpha of the P-COST was 0.82. Houser (2013) noted that the minimum acceptable value for Cronbach’s alpha is 0.7. Moreover, Cronbach’s alpha values of 0.7 - 0.9 and greater than 0.9 are considered as indicative of moderate and strong internal consistency, respectively (41). Tappen (2010) noted, however, that a Cronbach’s alpha of 0.7 is acceptable for newly developed instruments (48). Babacan et al. (2013) reported an alpha of 0.86 for the original version of the COST (32). The slight difference between the alpha values of the original version of the COST and the P-COST is probably due to the difference in the samples of the two studies. Babacan et al. (2013) recruited a sample of patients who had end-stage dementia, while our participants were at either the early or the intermediate stages of the disease.

The test-retest ICC of the P-COST was 0.883. According to Houser (2013), stability values of greater than 0.7 are considered as satisfactory (41). Test-retest is one of the common reliability assessment methods that assess the stability and the repeatability of an instrument. This method has been widely used for assessing the reliability of dementia-screening instruments (9, 41, 45). Polit et al. (2012) considered stability values of greater than 0.7, 0.8, and 0.9 as satisfactory, very good, and ideal, respectively (35). Accordingly, the P-COST has acceptable stability, repeatability, and reliability.

The P-COST can be used as a valid and reliable instrument for assessing cognitive state and screening dementia in literate and illiterate elderly people. Accordingly, it can be included in elderly people’s health monitoring programs. Our awareness of the study participants’ cognitive state was the major limitation of the study. We strived to remove this limitation through developing a cover letter for the test and enhancing the objectivity of the items. Conducting further studies to determine the diagnostic sensitivity of the P-COST for diagnosing reversible and irreversible dementias is recommended. Moreover, confirmatory factor analysis and comparing the diagnostic sensitivity of the P-COST with other dementia screening instruments are also recommended.

Acknowledgements

The researchers thank the research deputy of Kashan University of Medical Sciences, Dr. Babacan Yildiz, and all the other people who cooperated in this project.

Footnotes

Authors’ Contribution:
Mohammad-Sajjad Lotfi performed the data collection, literature review, and prepared the first draft of the manuscript. Zahra Tagharrobi supervised the study, performed data analysis, made critical revisions to the paper, and prepared the last revision of the manuscript. Khadijeh Sharifi supervised the study. Javad Abolhasani helped in the process of sampling.

Funding/Support:
This study was part of an MS thesis in gerontological nursing that was funded by the Deputy of Research, Kashan University of Medical Sciences (KAUMS), Grant No: 92132.