Searching ECCO

Eleanor called my attention to the fact that ECCO provides a list of the most common search terms by quarter. When I looked this up, I found that the most frequent search term last quarter was “Gold,” with 5981 searches. The next most popular searches were

Sleep (5829 searches)

America (3110 searches)

Woman (2520 searches)

Our ongoing discussion of searching methods in Burney (made possible by free access to the Burney Collection of Newspapers through October 30 via http://access.gale.com/emob) has been productive and will, I hope, continue. As we discuss Burney, I am also curious how best to approach searching ECCO. Do these search terms—“Gold,” “Sleep,” “America,” “Woman”—tell us anything about how scholars search ECCO? Are there particular methods that work?

14 Responses to “Searching ECCO”

I’ve examined the stats for searches at my institution alone, and the most frequent terms for a given week or month are often driven in large part by undergraduate searches tied to course topics and assignments. (We typically offer 18th-century grad courses once every three semesters, which is why I single out undergrads.) Thus, popular terms could point as much to pedagogical trends as to research trends.

The high occurrence of searches for “America” might point to the increasing interest in transatlantic work. In the past I have seen “Indian” or “native American” on the list. Similarly, the popularity of searches for “woman” suggests the continued interest in feminist and gender studies. The popularity of “Gold” is harder to explain; perhaps it speaks to work involving material culture, economic study, or studies of greed.

In some ways it is hard to determine effective keywords (or by extension words for searches by author, title, full-text, etc.) because the terms are so tied to individual (or course) interests and purposes. What could be helpful is determining what spellings, synonyms, and proximity terms work best for seeking certain kinds of information through keyword searches. A common example would be the need to use “cafe” if one is interested in case and cases as the terms pertain to the law especially beyond the subset of “law” in ECCO.
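To make the spelling point concrete, here is a minimal Python sketch of how one might generate the “cafe”-style variants of a search term. It rests on two assumptions that are not stated above: that the variant arises because OCR tends to read the long s as an f, and that only a non-final s is affected. The function names are hypothetical, and joining the variants with OR assumes the search interface accepts Boolean operators.

```python
from itertools import product


def long_s_variants(term: str) -> list[str]:
    """Return the term plus every spelling in which a non-final
    lowercase 's' is replaced by 'f' (assumed long-s OCR misreading)."""
    last = len(term) - 1
    options = []
    for i, ch in enumerate(term):
        if ch == "s" and i != last:
            options.append(("s", "f"))  # this position may be misread
        else:
            options.append((ch,))       # this position stays as typed
    return ["".join(combo) for combo in product(*options)]


def or_query(terms: list[str]) -> str:
    """Join all variants of all terms into one OR query (hypothetical syntax)."""
    seen = []
    for term in terms:
        for variant in long_s_variants(term):
            if variant not in seen:
                seen.append(variant)
    return " OR ".join(seen)


print(or_query(["case", "cases"]))  # case OR cafe OR cases OR cafes
```

Words with several internal s’s produce many variants, so in practice one would probably keep only the likeliest few.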

I am the Gale product manager responsible for ECCO and Burney, and I wanted to clarify one item regarding our terminology so that everyone is clear about how you can search in either database.

For ECCO, we allow either a keyword search or an entire document (full-text) search. A keyword search looks at the words/terms captured in the following fields: author (and sub-author), title, and all of the words that have been keyed in for the eTable of Contents. When Library of Congress subject headings are added later this year, that information will also be part of a keyword search.

An entire document search looks at all of the fields in the keyword search (the metadata) as well as all of the words from all of the pages as captured by the OCR process. The exception is the running chapter headings on each page, which we exclude from the OCR so you don’t get meaningless hits on page after page.

For Burney, because of the nature of the data and the way it was captured, we do not allow the equivalent of a keyword search, just an entire document (full-text) search. For a newspaper, the keyword search would typically be on the headline only.

I hope that this helps everyone understand what one can do (and can’t do) with each of the databases.
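The distinction described above can be sketched as a toy model in Python. Everything here is illustrative: the `Record` class, its field names, the function names, and the four sample records are invented for the example and are not Gale’s data model or code. The sketch simply treats a keyword search as a match on author, title, and eTable of Contents, and an entire document search as that same match plus the OCR text of the pages.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """A toy catalogue record (hypothetical structure)."""
    author: str
    title: str
    toc: list[str] = field(default_factory=list)    # eTable of Contents entries
    pages: list[str] = field(default_factory=list)  # OCR text, one string per page


def _contains(term: str, texts: list[str]) -> bool:
    term = term.lower()
    return any(term in text.lower() for text in texts)


def title_search(term: str, rec: Record) -> bool:
    return _contains(term, [rec.title])


def keyword_search(term: str, rec: Record) -> bool:
    # Metadata only: author, title, eTable of Contents.
    return _contains(term, [rec.author, rec.title, *rec.toc])


def entire_document_search(term: str, rec: Record) -> bool:
    # Metadata plus every OCR'd page.
    return keyword_search(term, rec) or _contains(term, rec.pages)


def count_hits(term: str, records: list[Record]) -> dict[str, int]:
    return {
        "title": sum(title_search(term, r) for r in records),
        "keyword": sum(keyword_search(term, r) for r in records),
        "entire document": sum(entire_document_search(term, r) for r in records),
    }


# Invented sample records, for illustration only.
records = [
    Record("Author A", "Pamela, a toy title"),
    Record("Author B", "Miscellaneous Letters", toc=["Remarks on Pamela"]),
    Record("Author C", "Essays", pages=["the reader of Pamela will recall"]),
    Record("Author D", "Sermons", pages=["nothing relevant here"]),
]

print(count_hits("Pamela", records))
# {'title': 1, 'keyword': 2, 'entire document': 3}
```

Because each search type looks at a superset of the fields of the one before it, the hit counts can only grow as one moves from title to keyword to entire document. The sketch does not model the exclusion of running chapter headings from the OCR.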

Thanks, Scott, for your helpful distinction between “keyword search” and “full-text word search.” If I understand you correctly, you would like a firmer distinction between “keyword” searching and “full-text” searching. Keyword searching searches metadata captured by cataloguers: author, sub-author, title, words keyed in for the eTable of Contents, and eventually LOC subject headings. By contrast, a full-text word search allows for a fuller degree of searching, though its results depend on the accuracy of the OCR process.

Yes, many thanks for the reminder about the terminology. In Burney I appreciate the way that the default advanced search is set to search the entire document. In Gale’s Times Digital Archive, I believe keyword is the default, and I often forget to switch to the full-text search that will allow me to search the entire document.

To illustrate the ways different types of searches will yield different hits, here’s what happens if you search “Pamela” (without any other limiters such as date, author, etc):

Title: 124 hits
Keyword: 209 hits
Entire Document: 1,729 hits

As one can see, the returns for an entire document search can be unwieldy, depending on the search term. At the same time, it is important to remember that a keyword or title search will not return all occurrences of a term.

I do use the full-text search fairly frequently, but I do so with other limiters, most frequently within a particular title or a particular author’s body of work.

In light of Scott’s comments, I looked over some postings to see if my comments about searching had helped create confusion. (I often use authors and titles as “keywords” in the sense that Scott clarifies above, and I sensed that others were using “keyword” to refer primarily to generic nouns, as well as other parts of speech.)

In my first reply to this posting, I wrote:

What could be helpful is determining what spellings, synonyms, and proximity terms work best for seeking certain kinds of information through keyword searches. A common example would be the need to use “cafe” if one is interested in case and cases as the terms pertain to the law especially beyond the subset of “law” in ECCO.

Issues of “spellings, synonyms, and proximity terms” (as well as the example) are applicable to both keyword and full-text. These remarks, however, may have caused people to think that “keyword” is the same as “full-text” searching.

There was nothing at all confusing about your use of the term “keyword,” Eleanor. It is easy, however, for scholars to grab that term “keyword” and equate it with “term search” or “full-text search.” I have had to scour my posts to rectify the habit of using those terms interchangeably. Now that Scott has reminded us that keywords are exclusively the metadata captured by cataloguers, we can proceed with greater accuracy.

Yes– I realized the potential for confusion when I saw Scott’s very helpful clarification.

As for keyword searching in Burney, if we had Jim Tierney’s index in an electronic form (or another similar tool–his index does not cover all the titles in Burney) aligned with Burney, then yes, we probably would want that capability.

But the newspapers in Burney, for example, really don’t have the type of searchable headlines that contemporary newspapers do. Even if Burney had keyword search capability, a number of my real finds about my publishers would probably have been found only via full-text searches.

To return to searching with ECCO: with ECCO II, portions of EEBO can also be searched, at least by those whose institutions subscribe to both ECCO II and EEBO. At this point, the portion of EEBO that can be searched is relatively small. Rich as full-text searching is in ECCO alone, the advantages of searching a larger database, including EEBO, seem clear, and might, eventually, assist the kinds of century-spanning projects that Matthew Wilkens describes elsewhere both on this blog and on his blog, “Work Product.”

Are there other improvements to searching ECCO that would be desirable?

Those institutions that have ECCO but have not yet purchased ECCO II can also now search EEBO simultaneously if they have had the ECCO interface updated to the one developed for ECCO II. I would assume that most institutions did convert to the new interface.

I don’t know if this is significant, but under ECCO’s “Frequently Asked Questions,” which can be found under the “Research Tools” tab, question #15 indicates that for mutual subscribers to ECCO and EEBO,

Your institution needs to indicate to Gale that you are interested in having this functionality [searching both EEBO and ECCO] enabled, at which point the ability to cross search will be an option from within ECCO. Works from EEBO will be clearly designated as such within the results list. A user may view that work’s Full Citation from within ECCO, but to see the entire work they will be transferred over to EEBO through a separate, pop-up window.

Yes, the libraries do need to request it, but I would think that most of those in charge of electronic resources would do so. I know that I contacted my library after participating in a Gale webinar (very helpful, by the way; the September offerings include ones for ECCO). The director of electronic resources was already on top of this option.