Abstract

Objectives Comparison of PennHIP and a novel method to diagnose hip laxity, called the Vezzoni modified Badertscher distension device technique.

Methods In a total of 10 dogs, it was first assessed whether the distraction index (DI) from the PennHIP evaluation center could be reproduced by two individual observers. In the next two steps, the DI measurements made by the individual observers and the PennHIP evaluation center were compared with the laxity index (LI) measured on the Vezzoni modified Badertscher distension device view. Finally, the interobserver agreement of the DI, LI and Norberg angle was assessed and compared with classification criteria.

Results The results were similar for the first three comparisons: there was no evidence for bias, the relation between DI and LI was linear and the variability was small. A comparison of the interobserver agreement showed that the measurement variability for the NA was substantial, while the reproducibility for the DI and LI was equal.

Clinical Significance While the standard ventrodorsal hip extended radiograph is most commonly used for diagnosis and screening of canine hip dysplasia, it lacks sensitivity to diagnose laxity. To improve the identification of hip joint laxity, distraction-based radiographic techniques are helpful. The Vezzoni modified Badertscher distension device technique allows for a reliable in-house evaluation of canine hip joint laxity.

Introduction

Canine hip dysplasia is a multifactorial disorder with prevalence estimates being influenced by a combination of factors.[1][2] Although the aetiology is not completely understood, increased laxity of the hip joint is the most frequent cause reported and usually results in secondary osteoarthritis.[3] While hip dysplasia can be suspected based on clinical signs, the actual diagnosis is confirmed radiographically. The most popular radiographic technique is the standard ventrodorsal hip extended radiographic view (VD view).[4] Aside from its clinical use, this VD view is also used as a screening tool against hip dysplasia by the Fédération Cynologique Internationale (FCI), the Orthopedic Foundation for Animals (OFA) and the British Veterinary Association/Kennel Club (BVA/KC).[5] Although the scoring systems and assessment protocols are not identical, the evaluation by all three organizations is based on one radiograph per animal and breeding advice is based on this evaluation. As such, scoring hip dysplasia currently combines assessment of both the degree of hip joint laxity [determination of the amount of subluxation, commonly reflected by the Norberg angle (NA)] and the severity of secondary degenerative changes.

Alternative radiographic evaluation methods often divide the assessment into an evaluation of the hip joint laxity on the one hand and degenerative changes on the other hand. Examples of these so-called laxity-based diagnostic techniques are the half-axial position and its improved version, subsequently called the Vezzoni modified Badertscher distension device (VMBDD), the dorsolateral subluxation index, the subluxation index and PennHIP.[6][7][8][9][10] PennHIP is based on three radiographs, using the VD view to evaluate degenerative changes, a compression view to evaluate congruency and to determine landmarks for measurements and a distraction view to evaluate hip joint laxity.[10][11] In contrast to the ordinal grading systems of the FCI and OFA, PennHIP reports a distraction index (DI), measured on the distraction view, that is on a continuous scale (between 0 and >1) and relates the DI of the assessed animal to the laxity scores of that breed.[5][12]

While PennHIP is rather popular in the United States, it has not gained general acceptance in the rest of the world. The reasons may be manifold: a costly mandatory training and certification process, evaluation fees imposed by PennHIP, the obligation to use digital radiography and the fact that a veterinarian always has to wait for the official PennHIP report. Alternative techniques that allow a complete and correct in-house evaluation of the hip joint by trained clinicians might increase the popularity of laxity-based radiographic techniques.

The purpose of this study is to compare the VMBDD with PennHIP in the three steps listed below:

Assessment of the agreement of DI measurements between a veterinarian and the PennHIP evaluation center (comparison 1).

Assessment of the agreement of the measurements made on the distraction views obtained with PennHIP and the distension views obtained with the VMBDD (comparisons 2 and 3).

Comparison of the interobserver agreement of the DI, the laxity index (LI) obtained with the VMBDD and the NA by comparing the results of two veterinarians (comparison 4).

Materials and Methods

Animals

This prospective method-comparison study was approved by the local ethical (Faculty of Veterinary Medicine, Ghent University, Ghent, Belgium) and deontological (Federal Public Service Health, Food Chain Safety and Environment, Brussels, Belgium) committees (EC2013_53, 23rd of May, 2013). A total of 10 consecutive assistance and rescue dogs presented for obligatory orthopaedic screening at the Department of Orthopaedics and Medical Imaging at the Faculty of Veterinary Medicine (Ghent University, Belgium) were evaluated. All animals were premedicated with dexmedetomidine 5 µg/kg and butorphanol 0.2 mg/kg intravenously (IV), followed 10 minutes later by midazolam 0.2 mg/kg IV. Anaesthesia was induced with propofol 1 to 4 mg/kg IV to effect and further maintained with isoflurane vaporized in oxygen using a circle rebreathing system.

Radiographic Procedure

Radiographs were obtained in the same sequence: a VD, a compression view, a distraction view (PennHIP) and a distension view with the VMBDD. All laxity radiographs were taken by the same PennHIP-certified veterinarian.

Vezzoni Modified Badertscher Distension Device Technique

As previously described, the dog was positioned in dorsal recumbency and the distension device was placed between both hindlimbs.[7][13] Both femurs were adducted against the distension device and slightly extended (±10° extension, compared with the neutral position) to expose the acetabulum. The tibiae were kept parallel and a medially directed pressure, subjectively similar to the amount of pressure used during the PennHIP distraction procedure, was applied. As such, the distension device acted as a lever that allowed demonstration of the laxity present in the hip joints. An example of the distension device and the positioning are shown in [Fig. 1]. An example of a radiograph taken applying the VMBDD is given in [Fig. 2].

Measurement

The PennHIP radiographs of each patient were submitted to the PennHIP evaluation center. In addition, two observers [one ECVS (European College of Veterinary Surgeons) diplomate and one experienced and PennHIP-certified veterinarian] were asked to measure the DI on the PennHIP view, the LI on the VMBDD view and the NA on the VD view in three separate sessions. Obtaining the LI starts by delineating the femoral head and the acetabulum, each with a circle. The distance between the centres of both circles is then divided by the radius of the circle around the femoral head to yield the LI. Throughout the study, both observers were unaware of the reports of the PennHIP evaluation center, each other's measurements and the animal to which the radiographs belonged. Each observer performed the measurements individually and according to the measurement guidelines previously reported.[10][14] To have a truly independent confirmation and result, no prior meeting was held to harmonize a measurement protocol and each observer used their own preferred measurement software (Observer 1 used Digimizer, MedCalc Software, Ostend, Belgium; Observer 2 used Keynote, Apple, Cupertino, California, United States).
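The LI calculation described above can be illustrated with a minimal Python sketch. The function name, the coordinate convention and the example values are hypothetical and serve only to show the arithmetic; they do not come from the measurement software used by the observers.

```python
import math


def laxity_index(head_center, head_radius, acetabulum_center):
    """Return the distance between the two circle centres divided by
    the radius of the circle fitted around the femoral head.

    All inputs must be in the same unit (for example pixels); the
    result is unitless.
    """
    if head_radius <= 0:
        raise ValueError("femoral head radius must be positive")
    dx = head_center[0] - acetabulum_center[0]
    dy = head_center[1] - acetabulum_center[1]
    return math.hypot(dx, dy) / head_radius


# Hypothetical circle fits (pixel coordinates), for illustration only.
example_li = laxity_index(
    head_center=(103.0, 204.0),
    head_radius=10.0,
    acetabulum_center=(100.0, 200.0),
)
print(f"LI = {example_li:.2f}")
```

Because the index is a ratio of two lengths measured on the same radiograph, it does not depend on image magnification or on the pixel size of the software used.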

Statistical Analysis

The statistical analysis was conducted in R (version 3.3.1, “Bug in Your Hair”). As detailed in the landmark paper of Bland and Altman in 1986, a comparison of methods entails an evaluation of the bias, which is defined as a consistent tendency for one method to exceed the other, and the variability, which is defined as the random variation.[15] To evaluate the bias, mixed models were used with patient and side (left or right) within patient as random effects and the observer (comparison 1) or technique (comparisons 2 and 3) as fixed effect. A significant fixed effect in this model implies a consistent bias. To evaluate the variability, random effects models were used with patient and side (left or right) within patient as random effects. The residual standard deviation (SD) of this model provides a direct value for the variability as defined earlier. All comparisons up to this point were done for each observer separately to evaluate whether the observed relations would hold independently. Finally, the interobserver agreement for the DI, LI and NA of the two observers was calculated (comparison 4). The bias was calculated with a mixed model with patient and side within patient as random effects and observer as fixed effect. The variability was calculated with a random effects model with patient and side within patient as random effects. Additionally, the obtained variability was divided by the distance between two classification categories; this relates the measurement variability to classification and allows a direct comparison of all three measurements. Throughout all analyses, the significance level (α) was set at 0.05. In all mixed models, significance of the fixed effects was evaluated with a likelihood ratio test.
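The two quantities evaluated here, bias and variability, can be illustrated with a simplified Python sketch. It is not the analysis used in this study: the mixed models, the nesting of side within patient and the likelihood ratio test are all omitted. Instead, it assumes one pair of measurements per hip joint, takes the mean paired difference as the bias and uses the within-joint SD as a stand-in for the residual SD. The function name and the example values are hypothetical.

```python
import math
from statistics import mean


def bias_and_variability(method_a, method_b):
    """Simplified method comparison for paired measurements.

    bias        : mean of the paired differences (a - b)
    variability : within-joint SD, sqrt(sum(d^2) / (2 * n)), which is
                  the residual SD of a one-way random effects model
                  with exactly two measurements per joint
    """
    if len(method_a) != len(method_b) or not method_a:
        raise ValueError("need two non-empty sequences of equal length")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    within_variance = sum(d * d for d in diffs) / (2 * len(diffs))
    return bias, math.sqrt(within_variance)


# Hypothetical DI and LI values for four hip joints (not study data).
di_values = [0.40, 0.52, 0.61, 0.35]
li_values = [0.42, 0.50, 0.63, 0.33]

bias, sd = bias_and_variability(di_values, li_values)
print(f"bias = {bias:.3f}, variability (SD) = {sd:.3f}")
```

A bias close to zero combined with a small SD is the pattern that indicates good agreement between two methods.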

Results

Of the 10 dogs evaluated, five were Labrador retrievers, two Golden retrievers, one Border collie, one English springer spaniel and one crossbreed dog. The body weights ranged from 17.3 to 35.9 kg (median: 26.8 kg) and the age varied between 11 and 16 months (median: 13.5 months). The median DI measured by the PennHIP evaluation center was 0.49 (range: 0.34–0.80). The median LI was 0.50 (range: 0.30–0.73) for observer 1 and 0.50 (range: 0.29–0.72) for observer 2.

Comparison 1 evaluated whether two observers were able to reproduce the PennHIP evaluation center DI results independently. As detailed in [Table 1], there was no evidence for a bias (p = 0.37 and p = 1), and the variability, expressed as a standard deviation, was 0.03 or 0.04.

Table 1 Comparison 1: agreement of the distraction index (DI) measured by the observers with the DI values from the PennHIP evaluation center

In comparisons 2 and 3, the measurements of the two different laxity-based techniques were compared directly and with the original results of the PennHIP evaluation center ([Tables 2] [3], [Fig. 3]) for each individual observer. In both cases and for both observers, no significant bias was found. A comparison of the variability of the different comparisons showed that the variability tended to increase from comparison 1 to 2 and remained stable thereafter.

Fig. 3 A direct comparison of the distraction index (DI) and laxity index (LI) measurements made by observer 1 (A) and observer 2 (B) on the radiographs obtained with the PennHIP distraction device (y-axis) and the Vezzoni modified Badertscher distension device (x-axis). The full line represents the line of equivalence (intercept = 0, coefficient = 1).

Table 2 Comparison 2: agreement of the DI measurements on the PennHIP distraction view (DI) with the VMBDD laxity radiograph (LI) as measured by the observers

When comparing the results of the two observers directly (comparison 4), a significant bias was found for the NA (4.76°, p < 0.001), but not for the DI (0.01, p = 0.48) and LI (< 0.01, p = 0.61). The variability was 4.15°, 0.04 and 0.03 for the NA, DI and LI, respectively. Based on the FCI classification, the difference in NA between ‘A’ (NA ≥ 105°) and ‘C’ hips (NA ≈ 100°) was at least 5° and the difference between ‘A’ and ‘E’ (NA < 90°) hips was at least 15°.[5][16] For the DI and LI, two cut-off points were used: the difference between minimal passive hip joint laxity (DI < 0.3), associated with a low probability of osteoarthritis development, and extreme passive hip joint laxity (DI > 0.7), associated with a high probability of osteoarthritis development, was at least 0.4.[17][18] The ratio of the variability to the difference was 83% (FCI: A vs. C) or 28% (FCI: A vs. E) for the NA. For the DI and LI, the ratios were 10% and 8%, respectively.
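The variability ratios can be retraced with a short Python sketch. The function and dictionary names are hypothetical. The SD of the NA and the category distances are taken from the text above, whereas for the DI and LI the unrounded SD values quoted in the Discussion (0.039 and 0.034) are assumed as inputs, so the printed percentages may differ slightly from the reported ones because of rounding.

```python
def variability_ratio(sd, category_distance):
    """Measurement variability relative to the distance between two
    classification categories (both in the same unit)."""
    if category_distance <= 0:
        raise ValueError("category distance must be positive")
    return sd / category_distance


ratios = {
    # Norberg angle, in degrees
    "NA (FCI: A vs. C)": variability_ratio(4.15, 5.0),
    "NA (FCI: A vs. E)": variability_ratio(4.15, 15.0),
    # Unitless indices; distance between the cut-offs 0.3 and 0.7
    "DI": variability_ratio(0.039, 0.4),
    "LI": variability_ratio(0.034, 0.4),
}

for label, ratio in ratios.items():
    print(f"{label}: {ratio:.1%}")
```

Expressing the variability as a fraction of a clinically meaningful distance is what makes a measurement in degrees comparable with the two unitless indices.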

Discussion

While the VD radiographic view is the common denominator in screening programmes, it has been demonstrated that this technique lacks sensitivity to diagnose hip laxity.[11][19][20] This lack of sensitivity has been attributed to the positioning of the dog for the VD view, resulting in spiral tensioning of the non-elastic joint capsule.[19] Inadequate muscle relaxation when taking the VD view further conceals maximum laxity.[4]

While diagnosing laxity on the VD view is unreliable, secondary degenerative changes are readily identified, but the severity of osteoarthritis depends on the age and activity of the dog. In a longitudinal follow-up study, it has been shown that out of all dogs that developed osteoarthritis by the end of life, 78% developed it after 2 years and 63% only after 5 years.[21] Screening, however, can already be performed at a minimum age of 24 months (OFA), 12 to 18 months (FCI, breed dependent) and 12 months (BVA/KC). Taken together, these findings make clear that the current screening programmes have several limitations: osteoarthritis has often not yet developed and laxity is underdiagnosed.

While the VMBDD is often used in Italy for early assessment of the degree of laxity in puppies, none of the laxity-based diagnostic techniques is currently incorporated in the screening programme of any cynological federation, even though they already date from the 1990s and the half-axial position even dates from the 1970s.[6][7] This might be because some of these techniques are relatively unknown among general practitioners and because of poor acceptance of laxity techniques by breeders. PennHIP, however, was actively promoted with training courses worldwide, but has not gained that much popularity outside the United States. In countries like the United Kingdom, where manual restraint while taking a radiograph is only allowed in exceptional cases, it is difficult to implement these techniques, even though a hands-free method has recently been published.[22][23] However, these arguments probably only partially explain the unpopularity of laxity-based techniques. Other reasons might be the ones already addressed earlier in this paper: the obligation to use digital radiography, the fact that, even after certification, a veterinarian has to await the official PennHIP report and is not allowed to do the measurements, and the higher cost.

In this study, a novel technique to quantify laxity of the canine hip was compared with PennHIP. Overall, when comparing methods, a linear relation between the two techniques and a small variability are critical, whereas a significant bias is of less importance as it can be corrected easily. A complicating factor in this study is that a direct comparison of both techniques is difficult as the PennHIP evaluation center only evaluates radiographs obtained with the official PennHIP distractor. This was solved by applying a stepwise approach.

Critical for our laxity technique is that distraction indices can be measured reliably. In a direct comparison (comparison 1) of the PennHIP evaluation center DI and the observers' DI, no evidence for bias was found and the variability was comparable to that of a previous study.[12] The variability obtained in this comparison reflects the effect of the evaluator and provides a baseline value against which the other results can be compared. Based on these results, we concluded that the DI can be measured with sufficient confidence.

The next comparison (comparison 2) evaluated the bias and variability associated with a different technique while the evaluator remained identical. For both observers, the variability increased slightly. It can be expected that measurements on the same radiograph tend to be more alike compared with measurements on two different radiographs. The theoretically worst-case scenario was reflected in the penultimate comparison (comparison 3): it combines the effect of a different technique and a different evaluator as two potential sources for increased variability. The results were however unexpectedly good in every comparison: there was no evidence for bias and the variability tended to remain stable. The close and stable relationship between DI and LI is confirmed by the independent results of the two observers. The acceptable range of variability is open for discussion. However, based on the consistent results [identical SD for both observers and smaller SD (0.037) when compared with a previous publication (0.050)], we conclude that the LI approximates the PennHIP evaluation center DI closely.[12]

Ideally, a diagnostic criterion should always be unambiguous: there should be no disagreement in the measurements of two persons and the results should be sufficiently close to each other. This was assessed in the final comparison (comparison 4). All three laxity indices were measured independently by two observers using the published guidelines for measurement only without prior accord among the examiners about how to measure them. While this again reflects a worst-case scenario, it is however realistic: in everyday practice, it is unlikely that veterinarians from different practices will discuss the measurement method or use the same measurement software. For the NA, a significant bias was found, indicating that the measurements of both observers consistently differed. As this is a consistent error, it can be corrected for by adding or subtracting this error. For the DI and the LI, the bias was not significant. When the variability is considered, it is clear that the values obtained for the DI (unrounded SD = 0.039) and LI (unrounded SD = 0.034) are close to each other and that both are similar to the reported intraobserver variability and smaller than the reported interobserver variability.[12] For the NA, the variability was 4.15° (95% confidence interval: 3.13–5.86°), which is significantly different from a 1.7° intraobserver variability previously cited.[24] Important however is that the latter study was a study on repeatability, while here, two different observers measured the NA, without any prior discussion on how to perform the measurements. As such, a higher variability can be expected.

As the DI and LI are both unitless quantities, their variability can be compared directly. This is not the case for the NA which is measured in degrees. To solve this, variability ratios were calculated. These ratios have two benefits. First, they allow a direct comparison of the NA, DI and LI variability. In addition, they tell how variable the measurements are relative to what is clinically important to classify hips. Especially this second benefit is important as it reflects the usability of the measurement. Ideally, this ratio is small, indicating that the measurement variability is far smaller than what is clinically used in decision making. The results for the DI and LI are again (close to) identical, which is no surprise given the previous results. This ratio is far larger for the NA than for the DI and LI. Our results imply that the variability of the NA measurement is substantial. An unambiguous NA measurement protocol should be implemented a priori, while the DI and LI are both reliably reproducible and quite intuitive to measure.

A limiting aspect of this study is the fact that no relation between the LI and later severity of osteoarthritis was established.[25][26][27] In addition, future studies should include both large and small sized dogs.

Despite these limitations, we consider the results obtained with the VMBDD technique to be promising. The technique might resolve some of the obstacles that may have decreased the popularity of the laxity-based techniques. For PennHIP, three radiographs are necessary and the main function of the compression view is to obtain the landmarks necessary for accurate measurements.[10] In this study, however, it is demonstrated that this radiograph is not necessary: both the DI measured by the two observers and the VMBDD LI were close to the PennHIP evaluation center DI. The VMBDD is thus less expensive as two instead of three radiographs are made and it is more flexible in use as there are no restrictions in terms of doing the measurements or performing the technique. In addition, for this technique, only one skilled person is necessary, while for PennHIP two people are required. A final remark is that laxity-based diagnostic techniques always require at least deep sedation. While the choice of chemical restraint for the procedure might still influence the result, it is conceivable that the influence will be far less compared with the present situation: when performing screening for the OFA, one can choose from the entire spectrum between awake and fully anaesthetized. For the FCI and BVA/KC, the minimum requirement is deep sedation, although it is not implemented in every country (e.g., the Netherlands).[28][29][30][31]

To reduce the prevalence of hip dysplasia, we advocate the use of laxity-based radiographs for screening with the following recommendations. Both the acquisition and the interpretation of the radiographs have to be standardized and scoring ought to be done by experienced scrutineers. Selection should always be performed carefully, with respect for the genetic characteristics of the target population. To accurately identify appropriate breeding stock, the phenotypical distribution in the population has to be determined, that is, submission bias is to be avoided at all costs. In addition, a dog is more than its hips alone: hip dysplasia should be taken into account only when it is relevant, and when selection is performed, all other phenotypes relevant for that specific population have to be considered. As these aspects might be difficult for individual breeders to implement, we suggest that breeding recommendations be developed by a centralized committee that at least consults geneticists. Finally, breeding recommendations should be followed to get results.

In conclusion, the following were demonstrated:

The LI obtained with the VMBDD technique yields results similar to the PennHIP-based DI measured by the PennHIP evaluation center.

The interobserver agreements of the PennHIP DI and the LI are similar and they both outperform that of the NA.