Discussions on the art and craft of research

Month: January 2018

Despite the fact that Wikipedia was born almost two decades ago, despite the fact that many libraries (mine included) have cancelled all other print and digital general encyclopedias and use it by preference, despite the fact that an increasing number of academics have actually found interesting uses for it within their classrooms – Wikipedia remains controversial. There are of course questions about bias and accuracy in any crowd-sourced site. But a short look into the history of encyclopedic works should alleviate some fears.

Wikipedia first came into being in 2001. The Internet itself had already grown beyond what Paul Evans Peters called a “primordial swamp” in 1990 (Discussion at Institute on Collection Development for the Electronic Library, April 29-May 2, 1990), but it was still a place that held a wild mix of legitimate, questionable, and not-so-legitimate sources. Graphical user interfaces were relatively new, search engines were unsophisticated, and there was little consistency in who was making digital materials available or in what they were offering the public.

To complicate things, the wiki platform confused many people in the academic world. Wikipedia was created by what seemed to be a world-wide group of interested readers, readers who might or might not have any recognized authority on what they wrote. This made Wikipedia seem amateurish and intellectually suspect.

To put it very simply, Wikipedia seemed to have little claim to any intellectual authority. The term “crowdsourcing” had not yet been coined; to the serious eye, Wikipedia was based on unvetted volunteerism. It was a kind of “stone soup,” where people were adding, trading off, editing each other, reporting inappropriate posts, always creating something with no obvious recipe.

Wikipedia’s main competition, of course, was the venerable Encyclopedia Britannica.

Between 1768 and 1771, the first edition of the Encyclopedia Britannica was compiled in Edinburgh and published in three volumes. As one of the earliest English-language encyclopedias, it quickly became an important title in the ever-increasing number of published reference works. It was heavily edited, and articles came to be written and signed by well-known scholars. As the scope of scholarship expanded rapidly, so did the Britannica’s size. When the 11th edition was published in 1910, it had increased to a whopping twenty-nine volumes. With that edition, its publication passed to the United States.

Society had come to look at encyclopedias in two ways. First, they were a convenient way of holding large amounts of information, paper cans to put facts and knowledge in. But an equally significant characteristic was that they were also a way of talking about that knowledge in an authoritative way.
So our crowd-sourced, stone-soup encyclopedia, Wikipedia, was born into a world that, on the surface, already had a hugely historic and effective title dominating the encyclopedic landscape.

But did it really?

The value of a reference work lies in its timeliness, its accuracy, and its authority. By 2001, even Britannica’s conservative editorship had allowed digital publication. But they maintained tight control over authorship and editing, leading, of course, to an issue with timeliness. Wikipedia, although its sourcing and authorship were distributed, was able to add, update, and correct entries very quickly, literally on an hourly basis.

And that leads to the second important aspect of the value of a reference work: accuracy. The founders and serious participants of Wikipedia quickly developed mechanisms by which entered articles could be flagged, corrected, and objected to. Pieces of missing information could be added, explanations could be expanded, and articles could be removed. And although all of that remained the basis for the greatest objections to Wikipedia, the organization and its world-wide community soldiered on. Finally, in 2005, the highly respected journal Nature published an article in which the two titles were put head to head on the question of accuracy. And although Wikipedia was found to have a few more errors in the selected articles, it was determined that both Britannica and Wikipedia had errors (Nature 438, 900–901, 15 December 2005).

There is also the ever-important argument about “authority.” For although Britannica’s reputation had been diminished somewhat when its editorship moved to the United States, that could be regarded as an issue of intellectual snobbery. The editors remained committed to finding the best possible authors for articles. Wikipedia, of course, was dependent on the intellectual efforts of unvetted volunteers.

But, against our belief in authority, we must place cultural and temporal bias. In the 11th edition of the Britannica, in the article on “The Negro,” the scholar Thomas Joyce writes “Mentally the negro is inferior to the white.” Clearly such a statement would never appear in the current edition of any decent encyclopedia. I quote it here to suggest that, at the time anything is published, an author and a few editors may lack the cultural distance needed to see their own bias.

So what can be our conclusion on Wikipedia?

Crowdsourcing clearly has its dangers, and therefore its detractors. But faith in unseen authority in edited reference works also has its dangers. Both types of sources inevitably reflect cultural biases and, frequently, have factual errors.

How do we teach students to use Wikipedia? We teach it the way we teach them to use any kind of reference work: read entries carefully and critically, and examine them for bias. Use their bibliographies and added links to other materials and collections. Use them as jumping-off points to more scholarly works. Use them (carefully) for a general orientation to a subject. And, of course, never use them as a citable source.

In short, as we all know, thoughtful, analytic reading of any source, at any time, is central to a researcher’s successful process. And don’t forget: the stone soup of fable turned out to be really tasty.

The first column represents the number of items distributed by the Government Publishing Office (GPO) to Federal Depository Library Program (FDLP) libraries in 2011 (approx. 10,200 items). The second column represents the total number of items distributed by GPO to FDLP libraries over the program's entire 200-year history (approx. 2-3 million items). The third column is the number of URLs harvested by the 2008 End of Term crawl (approx. 160 million URLs).

Clearly, the scope of government information produced outside of the GPO and FDLP is very large. So large, in fact, that what is produced online each year makes the entire 200-year history of the Depository Library Program look like a drop in the bucket. This vast array of online government information can be called fugitive. No one knows how much born-digital government information has been created or where it all is.
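To make the scale of that comparison concrete, here is a minimal Python sketch that turns the three approximate figures above into ratios. The variable and function names are my own, and the numbers are the rough estimates quoted above rather than exact counts. Keep in mind that a harvested URL is not the same thing as a distributed item, so this is only an order-of-magnitude comparison.

```python
# Approximate figures quoted in the text above; all are rough estimates.
FDLP_ITEMS_2011 = 10_200                       # items GPO distributed to FDLP libraries in 2011
FDLP_ITEMS_ALL_TIME = (2_000_000, 3_000_000)   # low and high estimates for the whole program
EOT_2008_URLS = 160_000_000                    # URLs harvested by the 2008 End of Term crawl


def times_larger(big: int, small: int) -> float:
    """Return how many times larger `big` is than `small`."""
    return big / small


# Crawl size compared with a single year of depository distribution.
vs_2011 = times_larger(EOT_2008_URLS, FDLP_ITEMS_2011)

# Crawl size compared with the whole history of the program.
# Dividing by the high estimate gives the smaller ratio, and vice versa.
vs_history_low = times_larger(EOT_2008_URLS, FDLP_ITEMS_ALL_TIME[1])
vs_history_high = times_larger(EOT_2008_URLS, FDLP_ITEMS_ALL_TIME[0])

print(f"2008 crawl vs. 2011 distribution: about {vs_2011:,.0f}x")
print(f"2008 crawl vs. all-time distribution: about {vs_history_low:.0f}x to {vs_history_high:.0f}x")
```

On these figures, a single crawl captured somewhere between fifty and eighty times as many URLs as the program has distributed items in its entire history, which is what the "drop in the bucket" image is getting at.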

Add rare, hard-to-find, and/or local government documents to your library catalog, digitize those that are not already available online, and upload them to Internet Archive, ideally with as much catalog metadata as possible.

Advocate for the long-term value of seemingly obscure government information, and help spread the word that short-term ease of accessibility actually masks the major problems associated with long-term preservation, access, and usability.

Some of the documents we harvested in this capacity (see a few examples below) are local government publications that may not be easy to find online and may not be accessible through any other library catalog anywhere. By finding them, adding them to Internet Archive, downloading them, physically adding them to our collection, and adding records to OCLC/WorldCat, we are actively supporting preservation and discovery.

This is a very small way of responding to the very large problem of web preservation in general. However, for a small institution with a selective collection of government publications, it is a practical strategy for contributing to the efforts of larger institutions that are tackling these fascinating and complex problems through projects like the End of Term (EOT) Web Archive.