Translation of Hedges in Medical Databases to Other Platforms’ Syntax May Cause Significantly Different Search Results

Heather Ganshorn

Abstract

Objective – To determine whether the methodological search filters in OvidSP MEDLINE and OvidSP EMBASE, also known as Clinical Queries hedges, had been modified from the originals written by the McMaster University Health Information Research Unit Hedges Group (the Haynes Group), and whether the translations of these hedges by the National Library of Medicine (NLM), used in PubMed and EBSCO MEDLINE, were reliable. The hedges examined cover the clinical categories of diagnosis, therapy, etiology, prognosis, clinical prediction guides, and reviews. The author also examined the translated NLM Systematic Reviews hedges in OvidSP MEDLINE and EBSCO MEDLINE.

Design – Validity of hedges used in various databases.

Subjects – The Clinical Queries hedges in the above-mentioned databases, which are designed to improve retrieval of particular types of studies.

Methods – The author ran the Clinical Queries hedges in OvidSP MEDLINE, OvidSP EMBASE and PubMed. Next, she manually entered the original Haynes Group published hedge search strings for each clinical query in these databases, and compared the results with those from the built-in Clinical Queries. The author also compared the results obtained from the Ovid MEDLINE Clinical Queries with those from the hedges in PubMed and EBSCO MEDLINE. The percentage difference in number of hits between the Ovid platform and each other platform was calculated. Where the difference was greater than 10%, the author modified the search string and re-tested it. There was no gold standard for comparison, so it was not possible to calculate measures such as sensitivity, specificity, precision, or accuracy.
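
The hit-count comparison described above can be illustrated with a short Python sketch. The summary does not state the exact formula the author used, so this sketch assumes the difference is expressed relative to the Ovid hit count; the function names and the hit counts in the example are hypothetical.

```python
def percent_difference(ovid_hits: int, other_hits: int) -> float:
    """Percentage difference in hit counts.

    Assumption: the Ovid count is the baseline. The source does not
    specify which count serves as the denominator.
    """
    if ovid_hits <= 0:
        raise ValueError("Ovid hit count must be positive")
    return 100 * abs(other_hits - ovid_hits) / ovid_hits


def needs_retest(ovid_hits: int, other_hits: int, threshold: float = 10.0) -> bool:
    """True when the difference exceeds the 10% threshold used in the study."""
    return percent_difference(ovid_hits, other_hits) > threshold


if __name__ == "__main__":
    # Hypothetical hit counts, for illustration only (not data from the study).
    for ovid, other in [(1000, 1050), (1000, 800)]:
        flag = "re-test" if needs_retest(ovid, other) else "similar"
        print(f"Ovid={ovid} other={other} "
              f"diff={percent_difference(ovid, other):.1f}% -> {flag}")
```

A comparison of this kind only shows that two platforms return different numbers of records; without a gold standard it cannot show which result set is more accurate.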

For the testing of the Review hedges, the author used the Cochrane Database of Systematic Reviews as a gold standard to compare search results. She also compared the results in OvidSP MEDLINE to the results in EBSCO MEDLINE and PubMed.

Main Results – Comparing the 27 OvidSP Clinical Queries limits to the equivalent Haynes search strings, the author found identical results, suggesting that the OvidSP hedges have not been changed from Haynes’ original search strings. However, when the OvidSP MEDLINE hedges were compared to PubMed and EBSCO, there were discrepancies. If the hedges had been translated exactly, one would expect the result sets to be nearly identical, with the exception of records that had not yet been uploaded to OvidSP and EBSCO (PubMed contains records that are not yet fully indexed).

Other problems also became evident. While the majority of searches yielded similar numbers of records, there were discrepancies of >10% in the number of hits for five of the Clinical Queries. Some of the hedges involved truncated search terms that, in PubMed, generated a message indicating that only the first 600 variations of the word root would be used. The author modified these hedges in order to obtain potentially more accurate results, though because she did not have a gold standard set for comparison, the modified hedges could not be thoroughly evaluated. Three of the EBSCO MEDLINE Clinical Queries hedges also generated significantly different results from OvidSP MEDLINE. The author was able to modify these hedges to generate results similar to those found in PubMed.

The author’s examination of the various systematic review hedges identified other problems. For these hedges, it was possible to use the Cochrane Database of Systematic Reviews as a simple gold standard to assess the reliability of these filters. The Haynes Clinical Queries Review hedge is used in OvidSP EMBASE. The author found that this hedge’s sensitive filter retrieved 100% of the Cochrane Reviews, while the optimized filter retrieved all reviews but one. However, the specific filter retrieved only 16% of the Cochrane reviews. The author notes that the Haynes hedges were developed using a subset of journals that did not include the Cochrane Database of Systematic Reviews.

The Clinical Queries Review hedge in MEDLINE appeared to have better results. In OvidSP, the sensitive and optimized hedges found all but one record, while the specific hedge found 83% of the records, a result that was mirrored in EBSCO MEDLINE and PubMed.
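
The retrieval percentages reported above amount to the share of gold-standard records that a filter returns. A minimal Python sketch of that calculation follows, assuming each record can be represented by a unique identifier; the function names and the identifiers in the example are hypothetical and are not taken from the study.

```python
def percent_retrieved(gold_standard: set, retrieved: set) -> float:
    """Percentage of gold-standard records found in a filter's result set."""
    if not gold_standard:
        raise ValueError("gold standard set must not be empty")
    return 100 * len(gold_standard & retrieved) / len(gold_standard)


def missed_records(gold_standard: set, retrieved: set) -> set:
    """Gold-standard records that the filter failed to retrieve."""
    return gold_standard - retrieved


if __name__ == "__main__":
    # Hypothetical record identifiers, for illustration only.
    gold = {"rev1", "rev2", "rev3", "rev4", "rev5"}
    filter_results = {"rev1", "rev2", "rev3", "rev4", "other9"}
    print(f"{percent_retrieved(gold, filter_results):.0f}% retrieved")
    print("missed:", sorted(missed_records(gold, filter_results)))
```

Records returned by the filter that are not in the gold standard (such as `other9` here) do not affect this figure, which is why it says nothing about a filter's precision.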

Conclusion – Users of OvidSP MEDLINE can be confident that the Clinical Queries limits are true translations of the hedges published by Haynes et al., as they were found to give identical results to manual entry of these hedges. However, users cannot be confident that these queries will give the same results in PubMed, due to differences in syntax between the two interfaces. Users of EBSCO MEDLINE can be less confident that the Clinical Queries have been perfectly translated from the original Haynes queries, as three of these queries were found to yield significantly different results from the OvidSP MEDLINE search. The author recommends that OvidSP be the search interface of choice when using these hedges in MEDLINE.

The National Library of Medicine’s (NLM) Systematic Reviews hedge has been translated into OvidSP and EBSCO, but has never been validated. The author found significant errors in the OvidSP version of this hedge, which were rectified after she contacted Ovid. However, Ovid was reluctant to share its translation of the hedge, as this is proprietary information. For this reason, the author recommends using PubMed to search for systematic reviews, as the search string for its hedge is publicly available. The author also notes that proprietary restrictions of this kind are very problematic for librarians, as they make it impossible to assess the hedges supplied by vendors or to identify the source of the problem when a search gives unusual results.