23 March 2010

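Hey, I finally figured out that thing about citation styles that annoys me. Basically, it’s FRBR (Functional Requirements for Bibliographic Records, the model that separates a work from its expressions, its manifestations, and its individual items).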

Let me back up a moment. Back when electronic content was starting to explode, lots of citation styles were getting all persnickety about how to cite the electronic vs. the paper version of different things, and which database it came from, and all this crud. And I was thinking, why? Do I care? Does it really matter where I found an article? In what possible way does its provenance matter to my argument?

In other words, I’m really not interested in item- or manifestation-level citations. The kind of arguments I make — the kind of arguments people in most disciplines, I think, make — are expression-level, caring only about the content in question and not the particular form in which it’s realized.

It reminds me of some of the discussion at The Past’s Digital Presence about the Google Books digitization, which went off in the opposite direction — that, by treating books solely in terms of their intellectual content and treating physical distinctions among items as irrelevant or uninteresting, Google Books was stripping out a vital part of the historical record. And that’s true, too — there are kinds of scholarship for which you need to see how history has nicked and scratched a particular object. There are kinds of scholarship where subtle differences among versions are important. And for those kinds of scholarship, we need both access (one of those distinct advantages of libraries, by the way) and citation with fine levels of granularity. Even in everyday but monograph-heavy scholarship, where we’re going to be citing page numbers, we need enough edition-specific description to contextualize that (except where there are discipline-specific conventions for avoiding that — yay, classics!).

But most of the papers I’ve written? I’m reading journal articles, and it really does not matter where I accessed them. So, dear citation formats of the world, thank you for noticing, and chilling the heck out a bit about this.

(Why, yes, my entire life has been eaten lately by putting together a paper for the LITA/Ex Libris student writing contest…it’s a good thing I didn’t realize in advance that “3000-5000 words” meant I would be writing a 20-page paper in the scraps of time during the 2 weeks when my daughter, presently on spring break, was asleep! Because, I mean, that’s impossible, and if I’d known it was impossible…)

20 February 2010

Here I am at the Past’s Digital Presence conference at Yale (gorgeous campus, wonderfully multidisciplinary crowd). One of this morning’s sessions apparently had a lively debate on Google Books. I say “apparently” because I was not at it, but ah, the magic of Twitter. I had a thought on it at the time but it really didn’t compress well into 140 characters (oft-revised, failed attempt).

So here’s the story as I got it from the fragments: a flap covering up a scandalous part of a circa-1900 book is not in the Google-digitized version; the original text is revealed. (Why? Flap treated by Google as unimportant, or as an impediment? After a century of use the flap fell off? Who knows?)

Which got me to thinking about the revealed (implicit? explicit?) choice here: that the text of the book — in contrast to its history or context — is the important thing to be preserved, possibly the whole of its value.

Which is, I think, an easy assumption to have as we swim in streams of digital texts, streams where “platform-independent” is a good thing, streams where content and presentation layers can be separate (and that’s a good thing too). Indeed, digital texts can seem not to have a history, in that they do not tend to accumulate visually apparent marks of their use, and often the marks they do accumulate require special technical skills to see. (And when they do accumulate obvious history it can badly break paradigms — what was that controversy the other year when revision-history information in a Word document revealed classified information? People’s surprise indicates that they had a paradigm which was broken. Similarly, things like CommentPress or Copia or Sidewiki or Wikipedia, which make histories and marginalia obvious, are striking in part because of that feature.)

All of which is to say: we have a blind spot for the non-textual content of books; we tend to think that the textual content is the content. And I wonder if Google thought about this when doing Books. And I wonder if their blind spot is more severe than the norm, as I expect Googlers’ text consumption patterns are even more geared toward the digital than everybody else’s these days.

My cryptic notes tell me that somehow this issue came up in a talk on digital curation afterward, but alas, they are too cryptic for my current caffeine level…