OBJECTIVE:
In Brazil, not many studies have investigated the validation of cognitive tests
in the ageing population and none of them has analyzed the psychometric properties
of Tuokko's Clock Test. The objective of the study was to translate and adapt
the test to the Brazilian context, and to assess its construct validity.
METHODS:
This was a cross-sectional, population-based study involving
353 elderly patients from Juiz de Fora (Southeastern Brazil), conducted in 2004-2005.
To assess convergent and divergent validity, Pearson's correlation
was used. To evaluate convergent validity, the Clock Test subtests were correlated
with these reference instruments: Mini-Mental State Examination, Digits, and Block Design.
Divergent validity was assessed by comparing the subtests to
the Center for Epidemiologic Studies Depression Scale.
RESULTS:
In the sample, 74.1% were women, ages ranged from 63 to 107 years
(73.8±8.5), and average schooling was 7.4 years (SD=4.7). In regard to convergent
validity, significant correlations were found between all CT subtests and MMSE,
Digits, and Block Design (p<0.01). As for the divergent validity,
the only subtest that had significant association with the reference scale was
the "clock setting" (p<0.05).
CONCLUSIONS:
The Clock Test, as translated and validated in a community
sample of elderly individuals, proved to be a brief screening instrument with good construct
validity when compared to other studies. Future research should investigate other
psychometric properties, such as content and criterion validity.

In the early days,
population ageing was mainly observed in developed countries, but nowadays,
it is in developing countries that the elderly population is soaring.24
This demographic transition is in tandem with a transition in the morbidity
and mortality profile, reflected in the increasing prevalence of chronic-degenerative
diseases, such as Alzheimer's disease.

Dementia processes
are among the main health problems in the elderly population. An early diagnosis
is crucial to obtain better results during therapy. Some interventions can
delay functional loss and mitigate unnecessary suffering
of patients and their families, in addition to improving the quality of life of the
people involved. These interventions include pharmacological measures (galantamine,
rivastigmine and donepezil)15 and non-pharmacological measures, such
as cognitive rehabilitation.3

Diagnosing dementia
syndromes is frequently a problem, especially in Alzheimer's disease, where certainty
can only be achieved in post mortem studies. This being the case, the
adopted standards for clinical diagnosis are based on diagnostic criteria established
by international agencies (DSM-IV,2 ICD-10,13 and National
Institute of Neurological and Communicative Disorders and Stroke and Alzheimer's
Disease and Related Disorders Association).

Neuropsychological
tests are tools to aid in early diagnoses of dementia processes. Tests used
for cognitive monitoring are known to be easy to apply and can be carried out
in a short period of time. They are important since they enable most dementia
cases to be identified in places where there is shortage of time and/or qualified
human resources, such as in primary healthcare services.

Despite the absence
of a standardized way of applying and checking the Clock Test, it is widely
accepted as a cognitive monitoring tool. The most commonly used methods in medical
practice, according to Braunberger,a
are the ones proposed by: Shulman et al,17 Sunderland et al,21
Wolf-Klein et al,26 Mendez et al,12 Tuokko et al,23
Manos & Wu,11 and Shua-Haim et al.16 The differences
in applying them are in the instructions given for the task, the time patients
have to write down the answers, in addition to the scoring system adopted by
each test. From the point of view of cognitive impairment scores, it is assumed
that the clock test assesses visuospatial abilities, constructive abilities,
and executive functions.20

In a review study,
Shulman18 found that most clock test studies had sensitivity and
specificity rates of approximately 85%, and high acceptability rates on the
part of patients. However, despite the test's importance, in Brazil, only a
very limited number of studies have been devoted to investigating clock test
methods, scores, and psychometric characteristics, especially in regard to the
validity of the tool.5,10,b

Construct validity,
since it assesses the legitimacy of behavioral representation of latent features,14
is probably the most fundamental way of studying neuropsychological tools and
it has been employed by many authors in clock test assessment.

Hadjistavropoulos
et al,6 in order to assess clock test construct validity, investigated
convergent and discriminant validities. It was found that there was a significant
correlation between clock test score and Digits and Block Design WAIS-III scale
subtests, and there was no correlation with the Kozma & Stones8
happiness scale.

A more comprehensive
way of using the clock test in cognitive assessment was developed by Tuokko
et al.23 The authors took into consideration that the concept of
time is abstract and that the abilities to draw, set and read certain times
seem to be particularly sensitive in identifying cognitive impairment.

This new test,
herein called the Tuokko Clock Test (CT), is a clinical and research tool developed
to detect cognitive impairment, with emphasis on investigating visuospatial,
construction, visuoperceptive and abstract-conceptual abilities. The authors
included two further tasks, setting and reading the time, believing that
these additional tasks would enable a better understanding of cognitive impairment
during its initial stages, when compared to clock drawing in isolation.

According to Tuokko
et al,23 the CT is a low-cost, reliable, valid and very useful tool
in diagnosing dementia, especially when used in tandem with other cognitive
assessment techniques and interviews with the patient and his/her family members.

Since it is a broader
tool and uses tasks involving drawing, setting and reading the time, the CT
can be useful in early diagnosis of suspected dementia also in individuals with
little or no schooling. This is because performance in cognitive assessment
tools is affected by schooling, a very significant variable in this kind of
population study, such as the one addressed in this paper. Therefore, it is
important to develop Brazilian studies to further investigate the CT.

The objective of
the present study was to translate and operationally adapt the Tuokko clock
test model to the Brazilian context and to assess its construct validity.

METHODS

The CT was applied
in a population of elderly individuals living in the city of Juiz de Fora (Southeastern
Brazil), between June 2004 and February 2005. This study is part of the Estudo
dos Processos de Envelhecimento Saudável - PENSA [Healthy Ageing
Process Study], carried out in the city of Juiz de Fora.

CT translation,
adaptation and validation were authorized by the author of the original tool.8
One of the authors (KCAS) translated the tool, which was submitted to a second
translator (a Brazilian proficient in English).

The way the CT
is applied and checked entails a high cost in the Brazilian context, because each
kit is made up of seven detachable cards with carbon paper. In the original
version the subject could not see what he/she had just drawn, because
the cards were detached as soon as the subject finished the
drawing.

In operationally
adapting the CT, the number of cards and the material used in making the test
kit were reduced to fit the tool to the study's operational constraints.

Pre-testing involved
a sample of 20 elderly patients at the outpatient clinic of the university hospital,
and resulted in making changes to the frame in the clock drawing subtest.

The frame consists
of windows that open and close. When using the frame, the examiner must lift
one window at a time (from biggest to smallest) and close them as soon as the
patient has finished drawing, so the patient does not see what he/she has just
finished doing or what will be the next drawing (Figure
1). After the frame was adjusted, the CT was administered to 20 other patients at a
center for the elderly connected to research activities of a local university.
After the test, the individuals were asked about the instructions
for the tasks in the CT. No additional operational change was needed, since
no patient reported difficulties in understanding or performing the tasks.

Between June and
July 2004, the field researchers - four interviewers and one recruiter - were
trained to apply the instrument. In the last stage, two assessors checked the
CT. Standardization of checking procedures among assessors yielded inter- and
intra-assessor agreement of 0.95 (kappa) in a sample of 20 patients. Test-retest
agreement based on two assessors and 20 tests, over a two-month period, had
kappa values of 1.0 and 0.94 per assessor.
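
As a minimal sketch of how an agreement coefficient of this kind can be computed, the Python function below implements Cohen's kappa for two sets of categorical ratings. The article does not state which kappa variant was used or how CT scores were categorized, so the function name `cohen_kappa` and the example ratings are illustrative assumptions, not the study's procedure or data.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equal-length sequences of categorical ratings."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed proportion of cases on which the two assessors agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each assessor's marginal frequencies.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both assessors used a single identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings (1 = protocol judged impaired, 0 = not impaired).
assessor_1 = [1, 1, 0, 0]
assessor_2 = [1, 0, 0, 0]
print(cohen_kappa(assessor_1, assessor_2))  # 0.5
```

Kappa corrects raw percentage agreement for the agreement expected by chance, which is why it is usually preferred over simple percentage agreement when reporting assessor reliability.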

The sample of the
larger study (PENSA) was made up of elderly individuals in the community, residing
in Juiz de Fora districts that accounted for 15% or more of the total city population.
By using the door-to-door method of the Instituto Brasileiro de Geografia
e Estatística (Brazilian Institute for Geography and Statistics),
956 elderly individuals were included in this study.4

From this total,
a randomized sample of 40% of the cases was selected to take part in this study.
The inclusion criteria were: being able to hear and understand well enough to
be able to take part in an interview and signing the informed consent form.

The total number
of subjects selected to take part in this study was 382, of which 353 agreed
to participate (92.4%). The reasons for not participating were: 27.7% death;
24.1% incapacity; 34.5% refusal and loss to follow up; and 13.7% moved.

The CT involves
three empirical tasks: clock drawing, clock setting, and clock reading. The
clock drawing subtest consists of a printed circle where the examinee must place
the numbers and the hands of the clock showing 11.10 (Figure 2).
Checking this subtest consists of a detailed qualitative and quantitative system
of addressing the mistakes made by the examinee. The score system classifies
the mistakes in seven broad categories: omission, absence, perseveration, distortion,
substitution, addition and rotation. To classify and obtain the score, mistakes
in drawing the face of the clock, the numbers, the hands and the spaces between
the numbers are analyzed. One point is attributed to each mistake and there
is no maximum score, although one rarely sees scores above 31 points.

The clock setting
subtest consists of five circles on which the examinee is asked to draw the
hands showing 3 o'clock, 9.15, 1 o'clock, 11.10, and 7.30 (Figure
1). Checking this subtest consists of giving one point for every hand placed
correctly, and an additional point when the hour hand is smaller than the minute
hand (total=15 points).

The clock reading
subtest (Figure 3) consists of a notebook where clocks (set
to the same time as in the clock setting subtest) are shown to the examinee
who must express the time shown on each clock. Checking consists of attributing
one point to each hand read correctly and one additional point when both hands
are read correctly (total=15 points).
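
The checking rules for the clock setting and clock reading subtests described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the official scoring procedure: the class and field names are hypothetical, and the sketch assumes that the extra point for hand length in clock setting is awarded independently of whether the hands are placed correctly, which the text does not specify.

```python
from dataclasses import dataclass


@dataclass
class SettingItem:
    """One of the five clock-setting circles (hypothetical field names)."""
    hour_hand_correct: bool
    minute_hand_correct: bool
    hour_hand_shorter: bool  # hour hand drawn smaller than the minute hand


@dataclass
class ReadingItem:
    """One of the five clocks shown in the reading notebook."""
    hour_read_correctly: bool
    minute_read_correctly: bool


def score_clock_setting(items):
    """One point per correctly placed hand, plus one for relative hand length."""
    return sum(
        int(i.hour_hand_correct) + int(i.minute_hand_correct) + int(i.hour_hand_shorter)
        for i in items
    )


def score_clock_reading(items):
    """One point per hand read correctly, plus one when both are correct."""
    total = 0
    for i in items:
        total += int(i.hour_read_correctly) + int(i.minute_read_correctly)
        if i.hour_read_correctly and i.minute_read_correctly:
            total += 1
    return total


# Five flawless items reach the stated maximum of 15 points on each subtest.
print(score_clock_setting([SettingItem(True, True, True)] * 5))  # 15
print(score_clock_reading([ReadingItem(True, True)] * 5))        # 15
```

With three points available per clock and five clocks per subtest, both functions reproduce the 15-point totals given in the text.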

To assess CT construct
validation, the tools we used were: the Mini-Mental State Examination (MMSE),
Digits, Block Design and the Center for Epidemiologic Studies Depression Scale
(CES-D). The MMSE is made up of 30 items, with subtests that assess spatial-temporal
orientation, immediate memory, recall, procedural memory, and language. The
score ranges from 0 to 30 points. The cut-off points adopted in this study were
the ones in Almeida:1 19/20 for individuals with no schooling and
23/24 for individuals with some formal schooling.
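
A schooling-adjusted cut-off of this kind can be applied as in the short Python sketch below. It assumes the conventional reading of the notation, in which "19/20" means that scores of 19 or below are classified as suggestive of impairment and scores of 20 or above are not; the function name and parameters are hypothetical.

```python
def mmse_below_cutoff(score, years_of_schooling):
    """Return True when an MMSE score falls at or below the adopted cut-off.

    Assumes 19/20 for individuals with no schooling and 23/24 for
    individuals with any formal schooling, as described in the text.
    """
    if not 0 <= score <= 30:
        raise ValueError("MMSE scores range from 0 to 30")
    if years_of_schooling < 0:
        raise ValueError("years of schooling cannot be negative")
    threshold = 19 if years_of_schooling == 0 else 23
    return score <= threshold


print(mmse_below_cutoff(19, 0))  # True
print(mmse_below_cutoff(20, 0))  # False
print(mmse_below_cutoff(23, 4))  # True
print(mmse_below_cutoff(24, 4))  # False
```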

The Digits and
Block Design subtests are part of the Wechsler Adult Intelligence Scale - Third
Edition (WAIS-III). The first aims at assessing memory, immediate repetition,
attention and working memory. The score is obtained by adding together the totals
achieved forwards and backwards, and the maximum score is 30.
The Block Design subtest assesses perceptual and visual organization, ability
of abstract thinking, non-verbal concept formation and spatial visualization.
The maximum score for this subtest is 68.

The CES-D consists
of items that assess symptoms of depression in the week prior to the application
of the tool. The score ranges from zero to 60. It was validated for Brazil by
Silveira & Jorge.19 The cutoff point adopted in this study was
16.

The clock-drawing
subtest score was converted into a total of correct answers to match the
scale of the other subtests. To compare the studied population to the original
PENSA population, we used the paired t-test for the continuous variable "schooling",
and McNemar's test for the dichotomous variable "gender".

Convergent and
discriminant validities were statistically analyzed through Pearson's correlation.
CT subtests were correlated to Block Design and Digits on the WAIS-III and MMSE
to assess convergent validity. Discriminant validity was obtained through the
correlation between CT subtests with the CES-D.
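
For readers less familiar with the statistic, the following Python sketch computes Pearson's correlation coefficient from two lists of scores using only the standard library. The study itself used SPSS; the function name `pearson_r` and the example scores are hypothetical and serve only to illustrate the calculation.

```python
import math


def pearson_r(x, y):
    """Pearson's product-moment correlation between two score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of the deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores for five examinees on a CT subtest and on the MMSE.
clock_setting = [15, 12, 9, 14, 6]
mmse = [29, 27, 22, 28, 20]
print(round(pearson_r(clock_setting, mmse), 3))
```

In a convergent validity analysis a positive coefficient is expected between the subtest and each cognitive reference instrument, whereas in a discriminant analysis the coefficient with the mood scale is expected to be close to zero.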

Data were analyzed
through SPSS version 10.0 statistics software.

The study received
approval from the Research Ethics Committee at the University Hospital of Juiz
de Fora (#463.148.2004).

RESULTS

All individuals
completed the test, and the respective data were analyzed for the entire sample,
of which 74.1% were women. Ages ranged from 63 to 107 years (mean 73.8 years;
standard deviation 8.5; median 73; and mode 63). Older ages prevailed (63.1%
were older than 70 years of age). Approximately 7.9% of the elderly individuals
reported having a university degree (Table 1).

When compared to
the original PENSA population, there were no differences in the gender and schooling
variables. The mean and the standard deviation for schooling in the PENSA sample
were 7.1±4.4, and in the studied sample they were 7.4±4.6 (t=0.907;
p<0.36). Women accounted for 71.8% of the PENSA sample, whereas in this study
they represented 74.1% of the population (agreement rate=84%; p<0.38).

In the clock-setting
subtest there was a concentration of higher scale-values (mean 10.3 and SD=3.9).
The same was noted in the clock-drawing subtest (mean 3.9 and SD=5.7). The
clock-reading subtest presented the most homogeneous results among the elderly
(mean 12.5 and SD=3.4). In regard to time required, the average CT time was
4.9 minutes.

The sample's cognitive
performance when assessed by the MMSE averaged 25.3 (SD=3.3), median 26, and
mode 27. At the 23/24 cut-off point, 21.2% of the sample was cognitively impaired.

In the Block Design
subtest, the mean was 16.9 (SD=8.4), median 16, and mode 12. In the Digits subtest
the mean was 11.4 (SD=3.5), median 11, and mode 10.

Performance according
to the CES-D averaged 15.8 (SD=8.5), median 12, and mode 10. At the cut-off
point of 16, which was suggested by the Brazilian study on the validation of
the scale, 31.5% of the sample would be suspected of depression.

Correlations among
the CT subtests, and between these subtests and the MMSE, Block Design, Digits,
and CES-D, are shown in Table 2.

Internal consistency
among CT subtests, according to Cronbach's alpha, was 0.773.
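
As an illustration of how such a coefficient is obtained, the Python sketch below computes Cronbach's alpha from per-subtest score lists using the usual formula based on item variances and the variance of the total score. The article does not state whether raw or standardized subtest scores were used, so this sketch works on raw scores; the function name and the example data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha; `items` holds one list of scores per subtest,
    with participants in the same order in every list."""
    k = len(items)
    if k < 2:
        raise ValueError("at least two subtests are required")
    n = len(items[0])
    if n < 2 or any(len(scores) != n for scores in items):
        raise ValueError("each subtest needs the same number (>= 2) of scores")
    # Total score per participant across all subtests.
    totals = [sum(values) for values in zip(*items)]
    total_variance = float(variance(totals))
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    item_variance_sum = sum(float(variance(scores)) for scores in items)
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical scores of four participants on two subtests.
subtest_a = [1, 2, 3, 4]
subtest_b = [2, 1, 4, 3]
print(round(cronbach_alpha([subtest_a, subtest_b]), 2))  # 0.75
```

Alpha increases as the subtests covary more strongly relative to their individual variances, so it summarizes how consistently the subtests rank the same participants.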

DISCUSSION

This is the first
Brazilian study which applied the CT to a sample of the elderly population,
by translating, operationally adapting, and validating the construct.

Herdman et al7
suggest that in a universalist approach to transcultural adaptation it is necessary
to investigate the conceptual, item, semantic, operational, measurement and
functional equivalences of the tool.

In the present
study we did not consider it necessary to carry out all the stages proposed by these
authors, for the following reasons: applying the CT is simple, there
is no semantic diversity, and it is not a scale of written items. The CT
consists of straightforward instructions for tasks which do not
require subjective interpretation on the part of the examiner or the examinee.
On the other hand, we assumed the conceptual equivalence of the CT in the Canadian
and Brazilian cultures, where the concept of hours and minutes as a measurement
of time and the use of analog clocks as a means of measuring time are basic
premises.

For the same reason,
our initial impression was that it would not be necessary to have one group
of independent translators carry out the translation and another group for back
translation. During pre-testing, examinees said that the test instructions were
easy to understand, and there were no problems in performing the task after
instructions were given for the first time. The lack of difficulties seems to
confirm our initial hypothesis concerning the kind of translation adopted. However,
it is possible that the lack of difficulties in understanding the instructions
resulted from the fact that the individuals in the sample had a higher
level of schooling than that found in other Brazilian population studies.10
Thus, conducting focus groups with elderly individuals with different levels of
schooling would probably yield more objective elements to support our hypothesis.

However, the possibility
of adopting similar forms of the instrument, concerning format, instructions,
applying and checking - operational equivalence -, was an essential task. Due
to the high costs of the original CT version, adapting the test operationally
was essential because of the limited resources available for this study. In
any case, operationally adapting cognitive tests in general is also necessary
to ensure the high cost of these tests does not prevent them from being acquired
and employed in Brazil's public health services. The adapted version used in this
study does not seem to have affected the results, because the construct validity
found was similar to that of the Canadian study.

The average CT
test time (4.9 minutes) positions the tool as a "short" monitoring instrument.9
Another positive aspect is that, due to its simplicity and speedy
application, the CT can aid cognitive impairment diagnoses in places where there
is shortage of time and of qualified professionals in cognitive assessment.
However, no other studies measuring the time taken by the CT were found; thus,
this study is the first to investigate this variable.

As our first approach
to the CT measurement equivalence, we examined its construct validity. Seeing
that this is the first Brazilian study on the tool, we believed that investigating
a population-based sample would be more interesting because it would enable
us to describe its construct validity and performance profile in a sample of
individuals who were supposedly not cognitively impaired. Based on the data
gathered, the normative data will be addressed in future papers. In the future,
as soon as clinical data on the cognitive performance of the individuals in
the sample are available, criterion validity will be addressed.

The correlation
with other cognitive assessment scales that have already been validated in Brazil
(MMSE and WAIS-III subtests) suggests that the CT is able to measure what it
aims at measuring. We found positive and significant correlations; however, they
presented weak to moderate association strength. This leads us to believe that
the constructs assessed by the CT, MMSE, Digits and Block Design are similar,
but not identical. The Digits subtest is the most specific for assessing working
memory, and Block Design for assessing visuospatial planning. This is a commonly
found problem in psychometrics, seeing that there are no specific tests to assess
identical constructs. Other studies22,23 assessing the correlation
between these subtests also found weak to moderate association strength. This
may indicate that the association among these variables truly behaves in the
fashion found in this study, since there really are differences among the constructs
individually assessed by each instrument.

On the other hand,
Kurzmanc states that, in
general, convergent validity correlations are high when they are calculated
based on clinical samples, contrary to what is found in population samples.
This may happen because the score variation in normal individuals is limited.
The Canadian CT study also found moderate associations for its correlations.23

Regarding discriminant
validity, the clock drawing and clock reading subtests showed no correlation
with the measure of symptoms of depression. In the Canadian CT study, none
of the subtests were correlated to the mood scale.23 However, in
this study, the clock setting subtest presented a negative and significant correlation
with the CES-D (r=-0.118; p<0.05), meaning that the more symptoms
of depression individuals have, the worse their performance in this subtest.
This may be explained by the fact that this subtest requires more attention
capacity, which is frequently altered in individuals suffering from depression.
However, the correlation between the CES-D and the clock setting
subtest was very weak, which suggests a spurious association.
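
The contrast between statistical significance and practical strength in this result can be made concrete with a short Python sketch. It takes the reported coefficient (r=-0.118) and assumes that all 353 participants entered this correlation, which the article implies but does not state for this specific analysis. The p-value uses a normal approximation to the t distribution, which is reasonable only for large samples, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def correlation_summary(r, n):
    """Shared variance, t statistic and approximate two-sided p-value
    for a Pearson correlation r observed in a sample of size n."""
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    if n < 3:
        raise ValueError("n must be at least 3")
    shared_variance = r * r  # proportion of variance the two measures share
    t_stat = r * math.sqrt((n - 2) / (1 - r * r))
    # Normal approximation to the t distribution (assumes a large sample).
    p_approx = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    return shared_variance, t_stat, p_approx


shared, t_stat, p_approx = correlation_summary(-0.118, 353)
print(f"shared variance = {shared:.3f}, t = {t_stat:.2f}, p = {p_approx:.3f}")
```

Under these assumptions the two measures share only about 1.4% of their variance, even though the approximate two-sided p-value is below 0.05. With a sample of this size, very small correlations can reach statistical significance, which is consistent with interpreting the association cautiously.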

In conclusion,
the translated and adapted CT applied to a non-clinical population sample of
Brazilian elderly individuals proved to be a short cognitive monitoring tool,
which presented good construct validity when compared to other data in the literature.
Its psychometric features concerning content validity and criterion validity
deserve to be addressed in future studies, as do its sensitivity and specificity
values.