
This study examines the overlap between author-assigned keywords and cataloger-assigned Library of Congress Subject Headings (LCSH) for a set of electronic theses and dissertations in Ohio State University's online catalog. The project is intended to contribute to the literature on the issue of keywords versus controlled vocabularies in the use of online catalogs and databases. Findings support previous studies' conclusions that keywords and controlled vocabularies complement one another. Further, even in the presence of bibliographic record enhancements, such as abstracts or summaries, keywords and subject headings provided a significant number of unique terms that could affect the success of keyword searches. Implications for the maintenance of controlled vocabularies such as LCSH are also discussed in light of the patterns of matches and nonmatches found between the keywords and their corresponding subject headings.

**********

The usefulness of controlled vocabulary has been debated for a number of years. The question has come even more to the forefront with the popularity of online tools such as Google and the use of keywords as users' primary search strategy. For libraries, the debate also centers on whether controlled vocabularies, such as Library of Congress Subject Headings (LCSH), are worth the time (and associated expense) of assigning and adding to bibliographic records in catalogs and databases. Studies on the issue focus primarily on users as seekers of information and examine keyword terms as used in searches. Few studies exist that examine the use of keywords assigned by authors of online documents. The present study is intended to contribute to the literature on this issue of keywords versus controlled vocabularies in online catalogs and databases.

Literature Review

Several studies have addressed the uses of controlled vocabulary versus keywords in users' catalog searches. A representative selection is reviewed here to provide context for the current project. Carlyle conducted a study matching catalog users' search terms with LCSH in which 47 percent of the search terms matched exactly. (1) When partial matches, word order variations, and spelling variations were included, the figure rose to 74 percent. Only 5 percent of users' search terms could not be matched at all. The remaining 21 percent were matches that required two or more LCSH terms to cover the search term. In this study, users' searches were done through subject search fields, not general keyword searches, which were not available at the time of the study. Carlyle concluded that a maximum 74 percent match rate was not an acceptable performance for LCSH and that further analysis of LCSH vis-a-vis user language was needed. The study is important because it defined levels of matching and called both for better matching against cross-references and for making LCSH semantically more flexible.

Frost investigated the utility of keywords taken from titles as "entry vocabulary" to subject searches by examining the degree of match between title keywords and controlled vocabulary. (2) Matches could be exact over the entire heading in direct order (11 percent of Frost's sample), in any order (30 percent), exact main heading only (12 percent), exact in subdivision (5 percent), truncated variant in main heading (14 percent) or subdivision (1 percent), or no match at all (27 percent). Thus matches of some type occurred in 73 percent of the titles in her sample, leaving the remaining 27 percent with no matches at all. Frost concluded that keywords and subject headings are complementary.
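
The match levels Frost describes can be illustrated with a small sketch. The Python below is a hypothetical approximation, not Frost's actual procedure: the function names (`classify_match`, `is_truncated_variant`), the four-letter stem rule used to detect a "truncated variant," and the sample headings are all assumptions introduced here for illustration. It relies only on the LCSH convention of separating a main heading from its subdivisions with a double hyphen.

```python
import re


def tokens(text):
    """Lowercase a phrase and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def is_truncated_variant(a, b, stem_len=4):
    """Assumed rule: two words are variants if they are identical or
    share the same leading stem of stem_len letters."""
    if a == b:
        return True
    return len(a) >= stem_len and len(b) >= stem_len and a[:stem_len] == b[:stem_len]


def phrases_are_variants(kw, part):
    """Compare two token lists position by position using the stem rule."""
    return len(kw) == len(part) and all(
        is_truncated_variant(a, b) for a, b in zip(kw, part)
    )


def classify_match(keyword, heading):
    """Assign a keyword/heading pair to one of the match levels listed above.

    Levels are tested from strictest to loosest, so each pair receives
    exactly one label.
    """
    kw = tokens(keyword)
    parts = [tokens(p) for p in heading.split("--")]
    main, subdivisions = parts[0], parts[1:]
    whole = tokens(heading)  # main heading plus all subdivisions

    if kw == whole:
        return "exact_entire_direct_order"
    if sorted(kw) == sorted(whole):
        return "exact_entire_any_order"
    if kw == main:
        return "exact_main_heading"
    if any(kw == sub for sub in subdivisions):
        return "exact_subdivision"
    if phrases_are_variants(kw, main):
        return "truncated_main_heading"
    if any(phrases_are_variants(kw, sub) for sub in subdivisions):
        return "truncated_subdivision"
    return "no_match"


# Illustrative pairs only; these are not drawn from Frost's sample.
examples = [
    ("Women employment", "Women--Employment"),
    ("Employment women", "Women--Employment"),
    ("Women", "Women--Employment"),
    ("Employment", "Women--Employment"),
    ("Cataloger", "Cataloging--Standards"),
    ("Standardization", "Cataloging--Standards"),
    ("Robots", "Cataloging--Standards"),
]
for keyword, heading in examples:
    print(f"{keyword!r} vs {heading!r}: {classify_match(keyword, heading)}")
```

Testing the levels from strictest to loosest mirrors the way the categories partition a sample into shares that sum to 100 percent. A real study would need a more careful definition of truncation than the fixed-stem shortcut assumed here.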

Ansari replicated Frost's study using medical dissertations written in Farsi. (3) Her findings were very close to Frost's; 70.3 percent of Ansari's terms were matches of some type and 29.7 percent did not match at all, compared to Frost's 73 percent and 27 percent, respectively. Ansari also concluded that keywords and descriptors are complementary and that keywords for which there is no matching descriptor should be considered for addition to indexing lists. …