Information Retrieval

The New York Times Bits blog reports that programmer Aaron Swartz was indicted for allegedly stealing 4 million documents from MIT and JSTOR. According to documents posted to Scribd, the arrest warrant cites alleged violations of 18 USC 1343, 18 USC 1030(a)(4), 18 USC 1030(a)(2), 18 USC 1030(a)(5)(B), and 18 USC 2.
The Boston Globe summed up the charges:

Aaron Swartz, 24, was charged with wire fraud, computer fraud, unlawfully obtaining information from a protected computer, and recklessly damaging a protected computer. He faces up to 35 years in prison and a $1 million fine.

In 2004, we spoke with law professor Cass Sunstein about the echo chamber effect, the phenomenon by which the explosion of information streams allows us to cherry-pick our media diet so we encounter only news that reinforces our worldview (while evading facts and opinions that contradict it). And so, seven years later are we on a path to ever more intellectual isolation? Eli Pariser, Lee Rainie, Clay Shirky, Joseph Turow and Ethan Zuckerman weigh in.
If you do not want to listen to the piece you can read the transcript.

Panizzi, Lubetzky, and Google: How the Modern Web Environment is Reinventing the Theory of Cataloguing
This paper uses cataloguing theory to interpret the partial results of an exploratory study of university students using Web search engines and Web-based OPACs. The participants expressed frustration with the OPAC; while they sensed that it was "organized," they were unable to exploit that organization and attributed their failure to the inadequacy of their own skills. In the Google searches, on the other hand, students were getting the support traditionally advocated in catalogue design. Google gave them starting points: resources that broadly addressed their requirements, enabling them to get a greater sense of the knowledge structure that would help them to increase their precision in subsequent searches. While current OPACs apparently fail to provide these starting points, the effectiveness of Google is consistent with the aims of cataloguing as expressed in the theories of Anthony Panizzi and Seymour Lubetzky.

'Scrapers' Dig Deep for Data on Web
The market for personal data about Internet users is booming, and in the vanguard is the practice of "scraping." Firms offer to harvest online conversations and collect personal details from social-networking sites, résumé sites and online forums where people might discuss their lives.

At the time of the December announcement, the handful of engineers who were developing the Delicious system are understood to have been either sacked or redeployed inside Yahoo, leaving only support staff.

Pinboard and Opera Link are among the services available online as potential replacements.

The developers of Mendeley, a research-management tool that has more than a million users, want to put more than 70 million academic papers, reader recommendations, and social-networking tags to new and innovative uses. The company announced Tuesday its “Binary Battle,” a contest for outside developers to build applications drawing from Mendeley’s collected information, with a $10,001 grand prize for the best new application.

What if, instead of relying on search engines to get our information, we relied on each other - friends, experts, journalists - to deliver us information by way of carefully curated websites? Steven Rosenbaum, CEO of Magnify.net and author of Curation Nation: How to Win in a World Where Consumers are Creators, tells Bob that our curated content future may have already arrived.
If the player does not show above, or you want to download the MP3 or read the transcript, those are available here.