This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Background

The Foot Function Index (FFI) is a self-report, foot-specific instrument measuring
pain and disability and has been widely used to measure foot health for over twenty
years. A revised FFI (FFI-R) was developed in response to criticism of the FFI. The
purpose of this review was to assess the uses of FFI and FFI-R as were reported in
medical and surgical literature and address the suggestions found in the literature
to improve the metrics of FFI-R.

Methods

A systematic literature search of PubMed/Medline and Embase databases from October
1991 through December 2010 comprised the main sources of literature. To enrich the
bibliography, the search was extended to BioMedLib and Scopus search engines and manual
search methods. Search terms included FFI, FFI scores, FFI-R. Requirements included
abstracts/full length articles, English-language publications, and articles containing
the term "foot complaints/problems." Articles selected were scrutinized; EBM abstracted
data from the literature and collected them into tables designed for this review. EBM
analyzed the tables; KJC, JM, and RMS reviewed and confirmed table contents. KJC and JM
reanalyzed the original database of FFI-R to improve metrics.

Results

Seventy-eight articles qualified for this review; abstracts were compiled into 12
tables. FFI and FFI-R were used in studies of foot and ankle disorders in 4700 people
worldwide. FFI Full scale or the Subscales and FFI-R were used as outcome measures
in various studies; new instruments were developed based on FFI subscales. FFI Full
scale was adapted/translated into other cultures. FFI and FFI-R psychometric properties
are reported in this review. Reanalysis of the FFI-R subscales confirmed unidimensionality,
and the response categories of the FFI-R questionnaires were edited into four responses
for ease of use.

Conclusion

This review was limited to articles published in English in the past twenty years.
FFI is used extensively worldwide; this instrument pioneered a quantifiable measure
of foot health, and thus has shifted the paradigm of outcome measure to subjective,
patient-centered, valid, reliable and responsive hard data endpoints. Editing the FFI-R
into four response categories will enhance its user friendliness for measuring foot
health.

Background

Foot problems commonly arise during our daily living activities
[1,2]. The prevalence of foot problems in general ranges between 10% and 24%
[3]. Their prevalence is higher among older individuals and in those with chronic rheumatoid
arthritis (RA), gout, or diabetes mellitus with peripheral neuropathy
[4]. Foot pain and disability can affect workers’ productivity, work absenteeism, and
other issues
[5,6]. Because pain and disability are subjective complaints, they are difficult to quantify
without a valid patient report of the degree to which an individual is experiencing
foot pain. Without a valid measure, problems arise in documenting foot health status,
tracking the progression of diseases, and establishing the efficacy of treatment,
including assessment of treatment satisfaction and of health related quality of life
from a personal perspective.

In 1991, the Foot Function Index (FFI) was developed as a self-reporting measure that
assesses multiple dimensions of foot function on the basis of patient-centered values.
The FFI consists of 23 items divided into 3 subscales that quantify the impact of
foot pathology on pain, disability, and activity limitation in patients with RA
[7]. The FFI was developed using the classical test theory (CTT)
[8] method. It has been found to have good reliability and validity and has had wide
appeal to clinicians and research scientists alike
[3,9,10]. In the past 20 years, the FFI has been widely used by clinicians and investigators
to measure pain and disability in various foot and ankle disorders and its use has
expanded to involve children, adults, and older individuals. Furthermore, the FFI
has been widely used in the study of various pathologies and treatments pertaining
to foot and ankle problems such as congenital, acute and chronic diseases, injuries,
and surgical corrections.

In 2006, the FFI was revised (the FFI-R) on the basis of criticisms from researchers
and clinicians; items were added, including a scale to measure psychosocial activities
and quality of life related to foot health
[11].

A literature review was conducted to develop a theoretical model of foot functioning
[12], based on the World Health Organization International Classification of Functioning
(ICF) model. The FFI-R items were developed from the original 23 FFI items, and more
items were added as a result of the literature review. With input from clinicians and
patients, the final draft of the FFI-R, which consisted of 4 subscales and
68 items, was completed. The results were the FFI-R long form (FFI-R L; 4 subscales
and 68 items) and the FFI-R short form (FFI-R S; 34 items) as total foot function
assessment instruments. Both the 68-item and 34-item measures demonstrated good psychometric
properties.

The FFI-R in its current form is one of the most comprehensive instruments available.
However, in a review article
[13], questions were raised about the unidimensionality and independence of FFI-R subscales,
and we did not include such reports in our previous article about the FFI-R
[11]. We carefully reviewed the comments about the FFI-R and assessed the unidimensionality
of the subscales by use of the Rasch model. On the basis of these critiques, the FFI-R
required a periodic revision of its metrics to ensure it represented patient-centered
health values and state-of-the-art methodology.

Our aim is to assess the contribution of the FFI and FFI-R to the measurement of foot
health in the fields of rheumatology, podiatry, and orthopedic medicine. This assessment
should enable us to reflect on and improve the quality of the measure. Therefore,
we conducted a systematic review of literature pertaining to the FFI and FFI-R that
has been published in the English language from October 1991 through December 2010.
The objectives were to: (i) assess the prevalence of uses of the FFI and FFI-R in
clinical studies of foot and ankle disorders; (ii) describe the utility and clinimetric
properties of the FFI and FFI-R as they have been applied in various clinical and
research settings; (iii) enumerate the strengths and weaknesses of the FFI and FFI-R
as reported in the literature; and (iv) address the suggestions found in the literature
for improving the FFI-R metrics.

Methods for systematic search of the literature

This study was a systematic review of articles in which the FFI and/or FFI-R
were used as measures of a variety of foot and ankle problems. Relevant studies were
identified by searching English-language publications in the electronic bibliographic
databases PubMed/MEDLINE, EMBASE, BioMedLib and Scopus from October 1991 through
December 2010.

Search terms and eligibility criteria

The key words foot function index, FFI scores, foot function index scores, and foot
function index revised (FFI-R) were used as search terms and were applied to all
databases. FFI instruments/measure and/or FFI-R instruments/measure had to be mentioned
in the abstracts and in the full articles to be collected for in-depth scrutiny. Articles
fulfilling the inclusion criteria were selected for the review. The inclusion criteria
were: (i) the words foot function index/FFI or revised foot function index/FFI-R appeared
in the article's reports/measures; (ii) the article was full length; (iii) it was written
in English and published from October 1991 through December 2010; (iv) the study
population described had foot complaint(s)/problems; and (v) regardless of the country
conducting the study, the full-length article was published in English or in a foreign
language with the abstract in English.

Objectives with method of data collection and organization of tables

Selected articles that fulfilled the criteria were independently reviewed and collected
by the authors to address the objectives and organize collected data into several
tables.

Objective 1. Uses of the FFI and FFI-R

We created four tables to address the first objective of describing the measurement’s
uses (Tables 1, 2, 3, and 4).

Objective 2. Utility and clinimetric properties

We designed a data-collection form to address the second objective. This form was
assessed in a pilot study by collecting data from ten articles out of the collection
of qualified articles; it was revised before being used in its current format. The
variables used in this data-collection form were: (i) the instrument and year the
article was published; (ii) the first author’s name; (iii) the objectives of the study;
(iv) the population characteristics, sample size, and diagnosis; (v) psychometric
analysis (reliability and validity, etc.); (vi) items/domains/subscales of the FFI
or FFI-R used in the study; (vii) response type; and, (viii) a short summary evaluation
of each study. Therefore, this data form recorded the analytic statements extracted
from each article, and 6 tables were created (Tables 5, 6, 7, 8, 9, and 10). Data were
arranged in each table in chronological order.

Objective 3. Enumerate the strengths and weaknesses of the FFI and FFI-R as reported
in the literature

This was a qualitative summary of the results as found in Table 5 and Table 6.

Objective 4. Improving the FFI-R metrics

Table 11 summarizes results of the Rasch analysis. This was a reanalysis of the FFI-R
database collected in 2002 with the aim of improving FFI-R metrics.

Table 11. Reliability and unidimensionality of the full scale, short form, and subscales

Descriptive analysis methods

Quantitative data were reported using simple statistics expressed as the sum, means,
and standard deviations for continuous variables and as frequencies for categorical
data (Tables 1, 2, 3, and 4). Analytic statements and evaluations/comments for each
article collected are summarized in Table 12, which depicts the summary of FFI and FFI-R
uses as illustrated in Objective 2 and in six tables (Tables 5, 6, 7, 8, 9, and 10).

Table 12. Summary of FFI and FFI-R uses as provided in detail in Tables 5-10

Rasch analysis method

To address specific critiques of the FFI-R found in the literature, the unidimensionality
of the FFI-R and its subscales was evaluated against the Rasch model. The statistical
package Winsteps version 3.72.3
[14] was used to conduct a principal components analysis (PCA) of the standardized residuals
to determine whether substantial subdimensions existed within the items
[15-17] and whether the FFI-R L, the FFI-R S, and the 5 subscales were unidimensional. The
criterion used to define unidimensionality was a large variance (> 40%) explained
by the measurement dimension
[18]. Unexplained variance in the first contrast of the data should be small and fall
under the criterion of 15% for a rival factor. We also required the variance explained
by the measurement dimension to be at least 3 times the variance explained by the first
contrast of residuals
[19].
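
The three criteria above can be combined into a single check. The following is a minimal sketch, not the Winsteps procedure itself; the class name, function name, and example percentages are hypothetical and chosen only for illustration, and the percentages would in practice be read from the residual PCA output.

```python
from dataclasses import dataclass


@dataclass
class ResidualPCA:
    """Hypothetical container for two values read from a residual PCA report."""
    measure_variance_pct: float   # % of variance explained by the measurement dimension
    first_contrast_pct: float     # % of variance in the first contrast of residuals


def is_unidimensional(pca, min_measure_pct=40.0, max_contrast_pct=15.0, min_ratio=3.0):
    """Apply the three criteria described in the text (thresholds from the text)."""
    ratio = pca.measure_variance_pct / pca.first_contrast_pct
    return (
        pca.measure_variance_pct > min_measure_pct      # large variance explained
        and pca.first_contrast_pct < max_contrast_pct   # small rival factor
        and ratio >= min_ratio                          # at least 3 to 1
    )


# Illustrative values only, not taken from the FFI-R database.
example = ResidualPCA(measure_variance_pct=50.0, first_contrast_pct=10.0)
print(is_unidimensional(example))
```

A scale that passes the first two thresholds can still fail the ratio rule, which is why the three conditions are tested together.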

Rasch reliability statistics

Reliability was estimated with Cronbach’s Alpha and Rasch person reliability statistics.
Both indices reflect the proportion of variance of the person scores or measures to
total variance (i.e., including measurement error). Unlike Cronbach’s Alpha, Rasch
person reliability is based on the estimated locations of persons along the measurement
continuum, excluding those with measures reflecting extreme (zero or perfect) scores
and including cases with missing data. For both indices, our criterion for acceptability
was 0.80.
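
To make the two indices concrete, the sketch below computes Cronbach's Alpha from complete item scores and a person reliability from person measures and their standard errors. It is an illustration under stated assumptions: the function names are hypothetical, the person reliability uses one common formulation (observed variance minus mean error variance, divided by observed variance), and the exact variance conventions used by Winsteps may differ.

```python
from statistics import mean, pvariance, variance


def cronbach_alpha(scores):
    """Cronbach's Alpha for a list of respondents, each a list of k item scores.

    Assumes complete data (no missing responses).
    """
    k = len(scores[0])
    item_columns = list(zip(*scores))
    item_var_sum = sum(variance(col) for col in item_columns)
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var_sum / total_var)


def person_reliability(measures, standard_errors):
    """Share of observed person variance that is not measurement error.

    Assumption: population variance of the measures and the mean squared
    standard error are used; persons with extreme scores are excluded upstream.
    """
    observed_var = pvariance(measures)
    error_var = mean(se ** 2 for se in standard_errors)
    return (observed_var - error_var) / observed_var


def acceptable(reliability, criterion=0.80):
    """Criterion of 0.80 as stated in the text."""
    return reliability >= criterion
```

For example, `acceptable(cronbach_alpha(item_scores))` would apply the stated criterion to a hypothetical `item_scores` matrix.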

Response category analysis

One requirement of the Rasch model is monotonicity: the requirement that, as person
ability increases, the item step response function increases monotonically
[20]. This means that the probability of choosing one categorical response over the prior
one (for example, moving from selecting “2 = A little of the time” to selecting “3 = Most
of the time”) increases with person ability. The proper functioning of the rating scale
is examined using fit statistics, where: (i) outfit mean squares should be less than 2.0,
(ii) average measures advance monotonically with each category, and (iii) step
calibrations increase monotonically
[21,22].
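
The three rating-scale checks can be expressed as a short function. This is a sketch only; the function names and the numeric inputs are hypothetical, and in practice the per-category statistics would come from the Rasch software's category table.

```python
def strictly_increasing(values):
    """True when each value is larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


def rating_scale_ok(outfit_mnsq, average_measures, step_calibrations, max_outfit=2.0):
    """Apply the three criteria listed in the text to one rating scale.

    outfit_mnsq       -- outfit mean square per response category
    average_measures  -- average person measure per response category
    step_calibrations -- thresholds between adjacent categories
    """
    return (
        all(m < max_outfit for m in outfit_mnsq)
        and strictly_increasing(average_measures)
        and strictly_increasing(step_calibrations)
    )


# Illustrative values only: a four-category scale that behaves as required.
print(rating_scale_ok([0.9, 1.1, 1.0, 1.3], [-1.2, -0.3, 0.6, 1.5], [-1.0, 0.1, 1.2]))
```

Disordered step calibrations, such as two adjacent thresholds in reversed order, are the kind of result that motivates collapsing neighboring categories.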

Results

Review of the literature

Articles were obtained by using the search method defined in the Methods section;
the search results included 752 articles from PubMed/MEDLINE and 640 articles from
Embase. Further screening and selection procedures, as detailed in Figure 1, yielded
182 full-text articles. Of these, 53 articles qualified for review.
Twenty-five more articles were obtained from the search engine BioMedLib and from
manual searches. A total of 78 articles qualified for this review and were summarized
and categorized into several tables.

Objective 1: Assessment of the prevalence of the FFI or FFI-R usage, population characteristics,
and study locations

Among the 78 studies, we identified 4714 study participants for whom the FFI or FFI-R
instrument had been used to measure foot health. This sample consisted of 1914 (41%)
male participants and 2688 (57%) female participants, with a mean age of 48.58 years
(SD, 4.9 years). The remaining 2% of participants could not be classified, because
gender was not reported in three studies (Table 1). Most of the participants were
individuals and young adults, and a few studies involved
juvenile participants. The types of studies included measurement practice studies
(n=17), surgery studies (n=30), studies of orthotics (n=19) or other clinical interventions
(n=4), and observational studies (n=8). We identified 20 different diagnoses of foot
and ankle pathology that were measured by FFI and FFI-R (Table 2). Among them, RA and
plantar fasciitis were the two most common diagnoses and were
also noted to be the most painful and disabling foot conditions. These studies were
conducted by investigators in 17 countries; the United States, the Netherlands, and
the United Kingdom were the three most frequent users of the FFI and FFI-R in studies
involving foot and ankle problems (Table 3).
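
The percentages and study counts in this paragraph can be reproduced from the reported totals. The sketch below uses only figures stated in the text; the function and variable names are hypothetical.

```python
def participant_summary(total, male, female):
    """Return whole-number percentages and the count with unreported gender."""
    unreported = total - male - female

    def pct(n):
        return round(100 * n / total)

    return {
        "male_pct": pct(male),
        "female_pct": pct(female),
        "unreported": unreported,
        "unreported_pct": pct(unreported),
    }


summary = participant_summary(total=4714, male=1914, female=2688)

# Study types as listed in the text; they should add up to the 78 included articles.
study_types = {
    "measurement practice": 17,
    "surgery": 30,
    "orthotics": 19,
    "other clinical interventions": 4,
    "observational": 8,
}
total_studies = sum(study_types.values())

print(summary, total_studies)
```

The 112 participants without a reported gender account for the gap between the male and female percentages and 100%.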

Table 4 displays the versatility of the FFI with all 3 domains and FFI Subscales and FFI-R
uses across the studies. This shows that clinicians and researchers were choosing
the FFI scales depending on the nature of their studies. Among the various scales
of the FFI, we found the FFI with all 3 domains (full scale), the FFI pain subscale
only, and a combination of the pain and disability subscales to be the most frequently
used, whereas the FFI-R was the least frequently used. The Dutch adaptation of the
FFI, the FFI-5pts, was mostly used in the Netherlands as an outcome measure in studies
of many surgical interventions.

In summary, the FFI with all 3 domains, or as subscales, was frequently chosen as
a measurement instrument across various studies and countries and among various age
groups and sexes, for the assessment of acute and chronic foot and ankle conditions.

Objective 2: Uses of the FFI and FFI-R in the field of foot health research

The uses of the FFI and FFI-R are provided in detail in Tables 5, 6, 7, 8, 9, and 10.
Table 12 describes the study types, the name of the instruments, and the first author’s
name and the reference number. The studies are grouped by how the instruments were used
and ordered chronologically within group.

Measurement, validation and cultural adaptation

Table 12 describes the utility of the FFI and FFI-R in studies of foot function measures
and includes 17 articles.

Category A New Instruments. This category includes four articles in which foot health
measures are described, including the original FFI [7] and the FFI-R [11]. The FFI Side
to Side was derived from the pain and disability subscales of the FFI [23]. The Ankle
Osteoarthritis Scale (AOS) [24] measured foot problems related to foot and ankle
osteoarthritis. Agel et al. [25] modified the rating scale of the FFI pain and function
subscales from the visual analog rating scale (VAS) to the Likert categorical scale; this
modification was tested in a sample of individuals with non-traumatic foot complaints,
and the metric of the Likert scale was valid.

Category B FFI as Criterion Validity. Articles in this category describe several health
measures and use the FFI full scale or subscales to validate these measures. Bal et al.
[26] found a strong correlation between FFI scores and scores of RA functional measures:
the Health Assessment Questionnaire (HAQ) and Steinbrocker Functional Class (SFC). SooHoo
et al. [27] found that the Rand 36-Item Short Form Health Survey (SF-36) scores of a
sample of individuals with foot and ankle disorders were moderately correlated with FFI
scores and concluded that FFI scores can be used to monitor the quality of life of these
patients. Shrader et al. [28] measured the stability of navicular joint alignment and
found that this measure correlated well with the FFI scores of the sample. Helliwell et
al. [29] developed a new measure, the Foot Impact Scale (FIS), to measure the impact of
foot problems on foot health in a sample of individuals with RA; the metric of FIS was
validated with the FFI and HAQ. In an RA study, van der Leeden et al. [30] reported that
Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) and Disease
Activity Scores in 44 joints (DAS 44) were correlated with FFI scores; furthermore, these
authors found that the FFI pain subscale scores correlated with forefoot pain, whereas
the FFI function subscale scores correlated with hindfoot problems. The FFI scores were
also used as validation measures of the American Orthopedic Foot and Ankle Society
(AOFAS) clinical rating scales, an instrument that was widely used by foot and ankle
surgeons [31]. These validation studies were reported by Baumhauer et al. [32] for the
AOFAS hallux clinical rating scale and by Ibrahim et al. [33] for the AOFAS clinical
rating scale, which was well to moderately correlated with FFI scores. The latter finding
was based on a study with a 41% response rate in a sample consisting of 45 individuals.

Category C Cultural Adaptation or Translation. The first translation of the FFI was the
Dutch-language instrument known as Dutch FFI-5pts [3]. The German-language translation of
the instrument is the FFI-G [34]; the FFI was also translated into Brazilian Portuguese
[35], Taiwan Chinese [36], Turkish [26], and Czech [37]. There was also a Spanish
translation conducted by the MAPI Institute in Lyon, France [38]. These translations
complied with rigorous language translation procedures; occasionally, some item
adjustments of the scales were needed.

In summary, the FFI was developed with good reliability and validity; it also inspired
and served as criterion validity for newer foot health measures and attracted the
attention of researchers around the world, who conducted translations and adaptations of
the tool into their native languages and cultures.

Table 6 is a supplement to Table 5 and displays the clinimetrics of the instruments listed
in Table 5; measures were metrically good, with reliability and validity values greater
than 0.7, with one exception where the pain subscale had a reliability of 0.64
[3].

Surgical intervention

The FFI is one of the outcome measures most frequently used by AOFAS members
[31]. It was first used to measure surgical outcomes. The surgical interventions and outcomes
are summarized in Table 7. There are 30 articles, categorized generally according to type
and location of surgical
procedure. Five distinct procedural categories were identified as follows: (a) arthrodeses
within the foot or ankle
[39-47], (b) arthroplasty within the foot or ankle
[48-51], (c) fracture care of the foot or ankle
[52-55], (d) deformity reconstruction surgery of the foot or ankle
[56-60], and (e) various surgical interventions for chronic conditions
[61-64]. The FFI was also used to assess outcomes of less invasive procedures, such as calcaneal
spur treatment by arthroscopy
[37], distal tibia repair using fixation with cannulation osteosyntheses
[65], arthroscopic chondrocyte implant of the tibia and fibula
[66], and surgical interventions for complex ankle injuries
[67]. In summary, the FFI and the Dutch FFI-5pts appeared to be useful in measuring outcomes
of various surgical procedures in children, adults, and individuals with acute, chronic,
and congenital foot and ankle problems.

Orthotic interventions

Table 8 lists studies using foot function outcome measures in orthotic interventions in the
foot and ankle. The studies assessed the impact of orthotic treatment on forefoot,
midfoot, and hindfoot/ankle pathology. Orthotic treatment on the forefoot in patients
with RA improved the scores for pain, disability and activities
[68,69]; however, the scores were unchanged in the study by Conrad et al.
[70]. Other studies using special shoes and shoe inserts showed relief of symptoms in
hallux valgus pain
[71] and in hindfoot and forefoot problems
[72,73], as well as slowing of the progression of hallux valgus in early RA
[74]. Midfoot studies assessing the effect of full-length orthoses on pain relief
[75] and mobility were performed using the FFI-R as an outcome measure
[76]. For hindfoot conditions, studies of treatment with orthoses addressed heel pain
[77], plantar fasciitis
[35,78,79], stabilizing hindfoot valgus
[80], correction of posterior tibialis tendon dysfunction
[81], destructive hemophilic arthropathy of the foot and ankle
[82] and juvenile idiopathic arthritis of the foot and ankle
[83]. Shoes/shoe inserts have also been found to relieve foot and ankle pain from arthritides
[84,85]. In summary, the FFI and FFI-R provided useful outcome measures for orthotic
management of a wide range of foot and ankle disorders.

Medical intervention

The FFI also was used to measure foot health outcomes associated with medical interventions
(Table 9), such as cortisone injection for adhesive capsulitis of the ankle
[86]; the injection resulted in improved FFI pain and disability subscale scores. Di Giovanni
et al.
[87] measured the outcome of stretching exercises for plantar fasciitis versus Achilles
tendonitis; both groups showed improvement in FFI pain subscale scores. Kulig et al.
[88] used the FFI pain and disability subscales to measure the outcomes of exercise intervention
in posterior tibial tendon dysfunction. Rompe et al.
[89] reported the FFI pain score improved in the stretching treatment group of a randomized
clinical trial using stretching and shockwave therapy to treat patients with plantar
fasciopathy. Overall, the FFI was useful in measuring the outcomes of conservative
interventions in chronic foot and ankle conditions.

Observational studies

Investigators chose the FFI scores or the subscale scores to determine the prevalence
and disease burden of foot and ankle conditions in the general population (Table 10).
Novak et al.
[4] used FFI scores to evaluate type 2 diabetes with and without neuropathy and found
that the group with neuropathy had worse FFI scores. Williams and Bowden
[90] correlated high FFI scores with foot morbidity in rheumatic diseases, and estimated
the cost of care/staffing concerns for that patient subset. Williams
[91] also used the FFI scores in patients with Paget’s disease and noted the impacts on
plantar foot pressures, gaits, and ambulation abilities. Kamanli et al.
[92] correlated the scores of the FFI and foot bone mineral density, then extrapolated
these scores to that individual’s skeletal bone density. Kavlak and Demitras
[93] reported a strong correlation of FFI scores with the scores of VAS pain scale, foot
pain scale (FPS), and hindfoot function scale (HFS) in patients with foot problems.
Goldstein et al.
[94] noted that FFI scores of individuals with previous foot injuries had a high correlation
with 6 other foot function instruments. Rosenbaum et al.
[95] found that plantar sensory impairment of the foot in patients with RA was correlated
with poor FFI scores. Schmiegel et al.
[96] found that pedobarograph scores of patients with RA with foot pain were correlated
with poor FFI and HAQ scores. In summary, FFI scores were useful in detecting the
prevalence of foot and ankle problems and as a measure of concurrent validity for
other foot health measures in various chronic foot conditions.

In all, we found the FFI instrument was frequently chosen as an outcome measure of
surgical, orthotic, and medical treatments, but its application was wider than we
originally imagined. It was not limited to outcome measures; FFI scores were also
applied in the promotion of foot health as a common public health issue and in increasing
the awareness of health system administrators. The FFI was also used in the validation
of newly developed foot health measures.

Objective 3: The strengths and weaknesses of the FFI and FFI-R as reported in the
literature

FFI: The FFI questionnaire had good psychometric properties
[97-100], and the pain subscale was sensitive to change during instrument development
[13]. In a study about treatment of plantar fasciitis in individuals with chronic foot
pain, SooHoo et al.
[64] reported that the pain subscale of the FFI had a high standardized response mean (SRM)
and high effect size (ES) as outcome measures of surgery in chronic foot and ankle
problems, while Landorf and Radford measured the clinical ability to detect a change
as minimal important difference (MID) in plantar fasciitis
[101]. All these clinical measures add to the credibility of the FFI as a self-reporting
measure. The FFI reflects patients’ assessment of their symptoms/health status, which
guides providers in proper care planning and progress toward treatment goals.
FFI is one of the most cited measures of its kind
[102].

The FFI also has weaknesses. During the development of the index, clinicians generated
the questionnaire items without patient participation
[13,97]; therefore, items might not fully reflect patients’ needs, might be sex biased
[7], and might not be applicable to high-functioning individuals. A theoretical model
was not part of the design, nor were the items related to footwear
[13,103], which are essential to support the construct of this instrument. It is also lacking
items for measuring quality of health and satisfaction with care; however, these items
can be appended as a global statement in the questionnaire. In all, the FFI has been
the most studied and widely used foot-specific self-reporting measure; however, further
testing by gender, age, race, language, etc. would provide assurance of its generalizability.

FFI-R: The FFI-R was developed in response to criticism of the FFI and to address issues
of contemporary interest. Most original items from the FFI were selected in the development
of FFI-R, and new items about footwear and psychosocial factors were added, which
improved its construct coverage. Patients and clinicians were involved in the generation
of items. Its design closely followed the ICF theoretical model
[13]; its psychometric properties are strong and are based on the IRT 1-parameter or the
Rasch measurement model. It was designed to be a comprehensive measure of foot health–related
quality of life, with both long and short forms
[99], allowing clinicians and researchers to choose the measures they need for the intended
study. Although the FFI-R did not include information on clinical ability to measure
change in its development, Rao et al.
[75,76] did measure the minimal detectable change (MDC) and the effect size in individuals
with midfoot arthritis, which also added to the credibility of its metrics.

The full scale and short form

For the FFI-R L (68 items)
[11], person reliability was high (0.96). In the PCA, 56.8% of the variance
was explained by the measure, with only 10.6% of the variance explained by the first
factor of residuals. These findings support that the full FFI-R meets the unidimensionality
requirement of the Rasch model. Further, the ratio of the variance explained by the measure
to the variance in the first contrast of residuals was 5.4 (i.e., greater
than the criterion of 3). For the FFI-R S (34 items)
[11], person reliability was 0.95, similar to the reliability estimates of the FFI-R L.
The PCA of the FFI-R S revealed that unidimensionality criteria were also satisfied.
This supports the use of a short form of the measure, because the item response burden
on patients is lower, at 34 questions. Because this measure is as reliable as the
full measure, its use is supported for clinical settings.
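
As a worked check of the ratio criterion for the FFI-R L, the two percentages reported above give the stated value of 5.4. The labels in the expression are descriptive only and are not Winsteps output names.

```latex
\[
\frac{\text{variance explained by the measure}}
     {\text{variance in the first contrast of residuals}}
  = \frac{56.8\%}{10.6\%} \approx 5.4 > 3
\]
```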

Subscales

All subscales of the FFI-R had strong person reliability estimates (Table 11), ranging
from 0.78 to 0.94. The PCA indicated that unidimensionality
held for each subscale, with the exception of the stiffness subscale. Further inspection
of the data revealed that the two-factor solution reflected groups of the low-severity
and high-severity items and was not the result of a competing factor. Unidimensionality
for the limitation subscale was met after dropping item 41 (ASSISTO), an item listed
in the FFI-R database. Overall, the subscales of the FFI-R satisfied unidimensionality
criteria and were reliable measures of the latent traits (Table 11).

Response category analysis

The response category analyses for each of the subscales (done after collapsing Categories
5 and 6) revealed that, for the first three subscales (pain, stiffness, and difficulty),
the response categories behaved as required by the Rasch model. However, for the subscales
of limitation and social issues (both of which are time scales), there was some indication
that respondents had difficulty distinguishing between “2 = A little of the time”
and “3 = Some of the time.” We therefore considered collapsing these categories and
giving all FFI-R subscales four possible response categories, which would ensure
uniformity of the measure and decrease the response burden on patients. Accordingly, for
the first three subscales, which measure severity, the categories “3 = Severe pain,”
“4 = Very severe pain,” and “5 = Worst pain imaginable” were collapsed. This was justified
because all three captured the notion of severe pain. Overall, the analyses showed that the
responses to each item functioned well with four response categories.
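
Collapsing response categories amounts to recoding raw responses through a lookup table before reanalysis. The sketch below illustrates this for a time scale in which Categories 2 and 3 are merged; the mapping, the assumption that responses are coded 1 to 5 after the earlier collapse of Categories 5 and 6, and all names are hypothetical and do not reproduce the scoring key of the FFI-R.

```python
# Hypothetical recoding for a time scale: categories 2 and 3 share one code,
# leaving four ordered response categories (1-4).
TIME_SCALE_COLLAPSE = {1: 1, 2: 2, 3: 2, 4: 3, 5: 4}


def collapse_categories(responses, mapping):
    """Recode raw responses; missing responses (None) stay missing."""
    return [mapping[r] if r is not None else None for r in responses]


raw = [1, 2, 3, 4, 5, None]
recoded = collapse_categories(raw, TIME_SCALE_COLLAPSE)
print(recoded)
```

After recoding, the category diagnostics would be rerun to confirm that the merged scale orders as the Rasch model requires.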

Discussion

This review evaluated 78 eligible articles (Figure
1). In the past 20 years, it appears that the FFI and FFI-R were widely used across
national and international clinical and research communities. The instruments were
administered to over 4700 study participants of males and females worldwide, across
age groups, with 20 different diagnoses consisting of congenital, inflammatory/degenerative,
acute and chronic foot and ankle problems. The FFI was also incorporated into other
newer foot health measures
[23,24], and also underwent changes in the measurement scale from VAS to Likert scale such
as the one conducted by Agel et al.
[25]. The scale changes also occurred in FFI adaptation to the Dutch
[3], German
[34], and Taiwanese Chinese
[36] including our revised FFI-R
[11] to give a few examples. The strong metrics of FFI subscales and full scale (Table
12, Category A), facilitated the investigator’s choice to use its subscale(s) or full
scale in clinical or research applications as appropriate. The FFI was also frequently
used as validation criterion for other foot health measures (Table
12, Category B); this use has strengthened the credibility of the FFI as an
outcome measure for foot and ankle problems. Because the FFI was developed using CTT
procedures, it is sample and content dependent; its metrics were therefore tested
in many different samples and proved consistently strong.
The exception was in the study of Baumhauer et al.
[32] where high foot functioning was evident in the sample; therefore, investigators should
exercise caution in the interpretation of this result. Although the FFI was initially
developed as a disease-specific measure for early RA, in later years it was used in many non-RA
foot and ankle problems and proved to be a valid measure there as well. The FFI and
FFI-R were frequently used as outcome measures in surgical and clinical interventions
with positive results (Tables
7,
8,
9, and
10). The FFI scores were also used in many observational studies (Table
10), and those reports might help researchers and health system administrators
establish health policy. Although the FFI was extensively studied and generally
received positive ratings
[23,29,102], we recognized the need to improve the FFI and FFI-R measures and have discussed
these issues comprehensively under Objective 3 in this paper. We conducted a re-analysis
and made improvements to the metrics and scales of FFI-R as presented in Table
11 and questionnaires FFI-R Long Form (See Additional file
1), and Short Form (See Additional file
2).

In recent articles that used the FFI as an outcome measure, the authors reported
clinical measures such as the effect size and standardized response mean
[64] and the minimal important difference
[101], while Rao et al. reported the minimal detectable change and effect size of the FFI-R
[75]. These reports have increased the credibility of the clinical use of the FFI and help
in power analysis and sample size estimation for future studies.
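For readers less familiar with these responsiveness indices, the sketch below shows how they are commonly computed. It uses the conventional textbook definitions (effect size as mean change over baseline SD, standardized response mean as mean change over SD of change, and minimal detectable change derived from the standard error of measurement); the cited studies may have used other variants, and all function names and numbers here are hypothetical.

```python
import math
from statistics import mean, stdev
from typing import List


def effect_size(baseline: List[float], follow_up: List[float]) -> float:
    """Mean change divided by the standard deviation of baseline scores."""
    change = [f - b for b, f in zip(baseline, follow_up)]
    return mean(change) / stdev(baseline)


def standardized_response_mean(baseline: List[float], follow_up: List[float]) -> float:
    """Mean change divided by the standard deviation of the change scores."""
    change = [f - b for b, f in zip(baseline, follow_up)]
    return mean(change) / stdev(change)


def minimal_detectable_change(sd: float, icc: float, z: float = 1.96) -> float:
    """MDC = z * sqrt(2) * SEM, where SEM = sd * sqrt(1 - icc)."""
    sem = sd * math.sqrt(1.0 - icc)
    return z * math.sqrt(2.0) * sem


# Hypothetical FFI scores (0-100) for five participants, not study data.
baseline = [60.0, 50.0, 70.0, 40.0, 80.0]
follow_up = [40.0, 45.0, 50.0, 30.0, 55.0]

print(effect_size(baseline, follow_up))
print(standardized_response_mean(baseline, follow_up))
print(minimal_detectable_change(sd=10.0, icc=0.75))
```

Because higher FFI scores indicate poorer foot health, improvement appears as a negative change, so the effect size and standardized response mean are negative in this example.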

Limitations of this review

Our literature search was limited to publications written in the English language
and covered only publications through 2010; therefore, it may have missed FFI- and
FFI-R–related articles not written in English, as well as more recent
articles published in English.

Conclusions

The FFI pioneered the measurement of outcomes in foot health. The instrument has been tested
over time and its measures adapted as it was frequently used, as a full scale
or as subscales, to measure outcomes in clinical practice and research studies.
The FFI has also had a role in shifting the paradigm from a reliance on physical and
biochemical findings as outcomes to the use of outcomes that are relevant to patients.
Thus, the measure established patient-centered, valid, reliable, and responsive hard
data endpoints. The rating scales also underwent changes for practicality and user-friendliness
in clinical and research settings. The FFI was recognized as a valid instrument and
used as a validation criterion of other measures. It was adapted and translated into
multiple languages. It was applied to all age groups, across genders, and was useful
in measuring varied medical and surgical conditions.

In realizing the scope of FFI applications, we acknowledge the contributions of friends
and colleagues around the world who not only used the FFI in their studies but also
made adaptations and translations to make the FFI a versatile instrument in promoting
and maintaining foot health. The FFI-R has good psychometric properties and is available
in long and short forms for ease of clinical use. In response to findings in this
review, we conducted a rigorous analysis to strengthen the metrics of the FFI-R and
changed the rating scales to be more user-friendly and practical.

Both the FFI and FFI-R are in the public domain and permission to use them is free
of charge. They are available from the developers of these instruments and from the
AOFAS web site. These instruments are self-administered and are written at an eighth-grade
reading level. The FFI scores are interpreted as 0%-100% for each subscale and the
overall score. Higher FFI and FFI-R scores indicate poor foot health and poor foot
health-related quality of life. The FFI and FFI-R put minimal burden on respondents
and the questionnaires are not emotionally sensitive. The administrative burden is
also minimal, and no formal training is required to score or interpret the instruments
[104]. Translations and adaptations are available in Dutch
[3], Taiwanese Chinese
[36], German
[34], Turkish
[26], Brazilian Portuguese
[35], and Spanish
[38].
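The 0%-100% interpretation mentioned above can be illustrated with a minimal sketch that rescales summed item responses to a percentage. The function name, the linear rescaling rule, and the response range are assumptions chosen for illustration; users should follow the scoring instructions supplied with the instruments.

```python
from typing import Sequence


def percent_score(items: Sequence[int], min_code: int, max_code: int) -> float:
    """Rescale summed item responses to 0-100; higher means worse foot health."""
    if not items:
        raise ValueError("at least one item response is required")
    if max_code <= min_code:
        raise ValueError("max_code must be greater than min_code")
    if any(not (min_code <= x <= max_code) for x in items):
        raise ValueError("response outside the allowed range")
    lowest = min_code * len(items)    # score if every item had the lowest code
    highest = max_code * len(items)   # score if every item had the highest code
    return 100.0 * (sum(items) - lowest) / (highest - lowest)


# Hypothetical subscale with four items answered on an assumed 1-4 scale.
print(percent_score([1, 2, 4, 3], min_code=1, max_code=4))  # 50.0
```

Applying the same function to all items of a questionnaire would give an overall score on the same 0-100 metric under these assumptions.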

This review attests to the widespread use of foot health measures, and we have noticed
the advancement of foot health in general across diagnoses. It has been a privilege
for us to serve patients, clinicians, and researchers in fulfilling the mission of improving
foot health through the use of the FFI and FFI-R. These instruments are available
to users and can be downloaded as electronic files.

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

EBM and KJC contributed to the concept and design of this paper. EBM initiated
the literature search, reviewed and scrutinized the articles, collected the abstracts, and
organized them into tables. KJC, RMS, and JM reviewed the tables, and all authors participated
in drafting the manuscript. KJC and JM also reanalyzed the original FFI-R data and
revised the subscales and FFI-R response categories. All authors participated in revising
the manuscript and have given final approval of the version to be published.

Acknowledgements

The authors gratefully acknowledge the support from the Center for Management of Complex
Chronic Care, Hines VA Hospital, Hines, IL, USA. The paper presents the findings and
conclusions of the authors; it does not necessarily represent the views of the Department of Veterans
Affairs or Health Services Research and Development Service. We are also grateful
to Cindi Fiandaca and the Hines VA medical library staff for assisting in the literature
search, Madeline Thornton for assisting in designing the tables, and Leahanne Sarlo and
Mary Reidy for editing the manuscript.