The Reports of Our Professional Deaths Have Been Greatly Exaggerated, Part 2

We can’t blame the internet for it all. Whilst it’s undoubtedly exacerbating the issue, information overload has been around for decades. It’s just that today it’s instantaneous. With transmission of data from one person to another so effortless, we’re oblivious to the potential anxiety of the person who may not need (or care about) the information we’re conveying.

The opinion piece refers to a survey done by LexisNexis in which white collar workers detailed their frustration in dealing with the massive volume of information (such as papers, emails [especially cc’d emails], faxes, social media, and other sources) coming their way. The summary of the survey hits on three issues that the workers identified:

“a surfeit of information” (people passing on too much information due to fear of passing too little);

“a lack of relevance” (employees left to sort all the information that they get);

inadequate systems for storing and retrieving information easily.

Compare the results of that survey to this article from The Economist from earlier this year, which describes the new superabundant information world that advances in computing and communication have afforded us. (I wrote about it when it came out.) Now, if the survey results are to be trusted and combined with the outlook for ever-increasing data in the Economist article, I daresay the library profession could capitalize on these trends to assert its value in the private sector (as ‘embedded’ librarians, consultants, or information service providers). I’m sure some savvy librarian entrepreneurs out there could make the case to companies that, since “time is money”, as the oft-quoted motto goes, the loss of productivity could be offset with training in information vetting and in effective, efficient searching, or even with someone on staff who is trained in information management.

I can see such a niche industry popping up within the next five years, if not sooner. What do you think? Is this a moment the profession would rise to? Or has that train left the station?


2 thoughts on “The Reports of Our Professional Deaths Have Been Greatly Exaggerated, Part 2”

I would like to offer some technical perspective on this issue and make a case for the usefulness of librarians in solving it. I’ve dealt with information classification at my job (although only in passing, since we decided not to pursue further research into it at the moment). I would group this problem of information overload / classification / retrieval under applied computer science. In particular, the relevant areas of research are machine learning, natural language processing, and text (document) classification.

I’m going to limit what I discuss to textual (document) information that is available in an electronic form. The question is how to automatically classify documents into a taxonomy (and how to define such a taxonomy). Since there is such an abundant quantity of information, it will require farms of computers to do the work of document association and classification. Since we would like a computer to do this, we need to develop algorithms for the job; however, these algorithms must be based on some mathematical framework.

Some of the issues that must be addressed in this mathematical framework are which features of a document are necessary for its classification, how to quantify the similarity between documents, and how to cluster the resulting data points. Depending on how you want to spin this, this is a supervised or unsupervised machine learning problem. On top of this there is the more subtle (and difficult) problem of training a computer to learn and detect contextual (semantic) information in documents.
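To make the ideas of “features” and “similarity” a bit more concrete, here is a minimal sketch in Python. It assumes the simplest possible feature set (raw word counts, often called a bag of words) and one common similarity measure (cosine similarity). The function names and the three toy documents are made up for illustration; real systems use far richer features than this.

```python
import math
from collections import Counter


def term_counts(text):
    """Feature extraction: count each lowercase word (a bag-of-words vector)."""
    words = [w.strip(".,;:!?()\"'").lower() for w in text.split()]
    return Counter(w for w in words if w)


def cosine_similarity(a, b):
    """Cosine of the angle between two count vectors; 0.0 if either is empty."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy documents, invented for this example.
docs = {
    "memo": "quarterly sales report and sales forecast",
    "email": "sales forecast for the quarterly report",
    "poem": "the moon rises over a silent sea",
}
vectors = {name: term_counts(text) for name, text in docs.items()}

print(cosine_similarity(vectors["memo"], vectors["email"]))
print(cosine_similarity(vectors["memo"], vectors["poem"]))
```

The memo and the email share most of their vocabulary, so they score much closer to each other than the memo and the poem, which share no words at all. A clustering step would then group documents whose pairwise scores are high.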

Ultimately this is a nontrivial problem and the focus of current research. Of course, practical progress has been made on classification of information, and some have managed to make enormous sums of money from their efforts (see Google…).

Now to provide my small plug for librarians… Librarians could definitely contribute (among other potential contributions) by providing researchers with area expertise to develop better features (measurements defined on documents) and better metrics of similarity. Basically, that means deciding what should be measured in a document to help it be classified correctly (for example, the percentage of words in a document that are “literary terms” vs. other preset classes of words) and how to successfully compare two documents for similarity using those measurements.
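As a rough illustration of that last example, the sketch below (in Python) measures the fraction of a document’s words that fall into each preset class and then compares two documents by the distance between those fractions. The word lists here are hypothetical placeholders; curating good ones is exactly the kind of expertise a librarian could supply. Euclidean distance is just one assumed choice of comparison.

```python
import math

# Hypothetical word classes; in practice a domain expert would curate these.
WORD_CLASSES = {
    "literary": {"metaphor", "narrator", "stanza", "protagonist"},
    "financial": {"revenue", "invoice", "budget", "forecast"},
}


def class_fractions(text):
    """Return, for each class, the fraction of the document's words in it."""
    words = [w.strip(".,;:!?()\"'").lower() for w in text.split()]
    words = [w for w in words if w]
    if not words:
        return {name: 0.0 for name in WORD_CLASSES}
    return {
        name: sum(w in vocab for w in words) / len(words)
        for name, vocab in WORD_CLASSES.items()
    }


def distance(a, b):
    """Euclidean distance between two feature dicts (smaller = more alike)."""
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in WORD_CLASSES))


# Toy documents, invented for this example.
essay = class_fractions("The narrator opens each stanza with a metaphor")
review = class_fractions("The protagonist and the narrator share one metaphor")
report = class_fractions("The budget forecast shows revenue growth")

print(essay, report)
print(distance(essay, review), distance(essay, report))
```

With these features the essay and the review look alike and the report looks different, even though the essay and the review share few actual words. That is the point of well-chosen features: they capture what kind of document something is, not just which words it happens to contain.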